In [1]:
project: getenv `project_id
csbucketname: getenv `csbucketname

# From BigQuery to kdb

In [2]:
// extract from BigQuery to Cloud Storage
system "bq extract --destination_format NEWLINE_DELIMITED_JSON ", project, ":bqkdb.allBQSimpleTypes gs://", csbucketname, "/allBQSimpleTypes.json"

Waiting on bqjob_r449c22e13fbc4e13_0000016e4638ce7d_1 ... (0s) Current status: DONE   

""


In [3]:
csfilename: "gs://", csbucketname, "/allBQSimpleTypes.json"

In [4]:
// Copy from Cloud Storage to local box
system "gsutil cp ", csfilename, " /tmp/"

Copying gs://storagebodon/allBQSimpleTypes.json...
- [1 files][  387.0 B/  387.0 B]                                                
Operation completed over 1 objects/387.0 B.                                      




## Manual type conversion
* string, boolean and float types are handled properly
* other types need to be cast manually
* timestamps need a minor conversion: the time zone information is chopped off
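
A minimal sketch of the manual casts listed above, applied to literal strings. The sample values are made up for illustration and are only assumed to resemble the strings found in the exported JSON; they are not read from the actual file.

```q
// Hypothetical sample values; assumed to resemble what .j.k returns for these columns
i: "I"$"42"                              // int from its string form
d: "D"$"2019-11-11"                      // date
tm: "T"$"16:32:04.291299"                // time
ts: "P"$-4 _ "2019-11-06 01:45:00 UTC"   // drop the 4-character " UTC" suffix, then parse
```

The timestamp line evaluates right to left: the suffix is dropped first, and only then is the remaining string parsed.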


In [5]:
// decode one JSON line j, applying the per-column formatters in fM to the parsed values
decode: {[fM; j]
  k: .j.k j;
  key[k]!fM[key k]@'value k}

In [6]:
formatterMap: (enlist `)!enlist (::)   // default formatter: leave unchanged
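
The default formatter relies on how q looks up a missing key: for a dictionary whose first value is `(::)`, an unknown column name should resolve to the identity function. A small sketch of that behaviour, using the hypothetical names `fm` and `` `nosuchCol `` so that it does not touch `formatterMap`:

```q
// fm is a hypothetical stand-in for formatterMap
fm: (enlist `)!enlist (::)
fm[`intCol]: "I"$

unknown: fm[`nosuchCol]        // no entry for this column, so the default is returned
kept: unknown @ "GOOG"         // the default leaves the value unchanged
cast: fm[`intCol] @ "42"       // a registered formatter converts the value
```

This is why only the columns that need a cast have to be registered.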

In [7]:
formatterMap[`intCol]: "I"$
formatterMap[`tsCol]: {"P"$-4 _ x}   // drop the trailing 4 characters (the time zone suffix)
formatterMap[`dateCol]: "D"$
formatterMap[`timeCol]: "T"$
formatterMap[`dtCol]: "P"$

In [8]:
t: formatterMap decode/: read0 hsym `$"/tmp/allBQSimpleTypes.json"

In [9]:
t

stringCol intCol floatCol boolCol tsCol                         dateCol    ti..
-----------------------------------------------------------------------------..
"GOOG"    42     100.3    1       2019.11.06D01:45:00.000000000 2019.11.11 16..
"AAPL"    200    104.9    0       2019.11.11D16:32:04.291299000 2019.11.11 16..


In [10]:
meta t

c        | t f a
---------| -----
stringCol| C    
intCol   | i    
floatCol | f    
boolCol  | b    
tsCol    | p    
dateCol  | d    
timeCol  | t    
dtCol    | p    


# From kdb to BigQuery

Let us save the converted kdb table as JSON.

In [11]:
save `:/tmp/t.json

`:/tmp/t.json
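
BigQuery's `NEWLINE_DELIMITED_JSON` format expects one JSON object per line. As a rough way to preview that shape without touching the file system, each row of a table can be serialised with `.j.j`. The table `demo` below is a hypothetical two-row stand-in, not the table `t` from this notebook.

```q
// demo is a made-up table used only for illustration
demo: ([] a: 1 2; b: ("GOOG"; "AAPL"))
lines: .j.j each demo          // one JSON string per row
back: .j.k first lines         // parse the first line back into a dictionary
```

Parsing a line back with `.j.k` is a quick check that every line is a self-contained JSON object.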


In [12]:
system "bq load --autodetect --source_format NEWLINE_DELIMITED_JSON bqkdb.allBQSimpleTypes_json /tmp/t.json"

Waiting on bqjob_r1e8487c369c5982a_0000016e5c3673e8_1 ... (3s) Current status: DONE   

""
""


We can see that the **column order does not match** the kdb table: the schema lists the columns in reverse.

In [13]:
system "bq show bqkdb.allBQSimpleTypes_json"

"Table ferenc-world:bqkdb.allBQSimpleTypes_json"
""
"   Last modified           Schema          Total Rows   Total Bytes   Expira..
" ----------------- ---------------------- ------------ ------------- -------..
"  11 Nov 21:45:18   |- dtCol: timestamp    2            110                 ..
"                    |- timeCol: time                                        ..
"                    |- dateCol: date                                        ..
"                    |- tsCol: timestamp                                     ..
"                    |- floatCol: float                                      ..
"                    |- boolCol: boolean                                     ..
"                    |- intCol: integer                                      ..
"                    |- stringCol: string                                    ..
""
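
One possible way to control the column order is to pass an explicit schema instead of relying on `--autodetect`. The sketch below only builds the inline `name:TYPE` schema string and the command text in q and does not run anything. The type names are an assumption that mirrors the `bq show` output above, `colNames`, `bqTypes`, `schema` and `cmd` are hypothetical names, and the exact `bq load` invocation should be checked against the bq documentation before use.

```q
// Column names in the desired order; BigQuery type names are assumptions
colNames: `stringCol`intCol`floatCol`boolCol`tsCol`dateCol`timeCol`dtCol
bqTypes: `STRING`INTEGER`FLOAT`BOOLEAN`TIMESTAMP`DATE`TIME`TIMESTAMP

// Join each name with its type, then join the pairs with commas
schema: "," sv string[colNames],'":",/:string bqTypes

// Command text only; pass it to system once the invocation has been verified
cmd: "bq load --source_format NEWLINE_DELIMITED_JSON bqkdb.allBQSimpleTypes_json /tmp/t.json ", schema
```

Building the schema from the column list keeps the order in one place, so reordering the columns means editing `colNames` and `bqTypes` together.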


## Cleanup

In [14]:
// Cloud Storage
system "gsutil rm ", csfilename

Removing gs://storagebodon/allBQSimpleTypes.json...
/ [1 objects]                                                                   
Operation completed over 1 objects.                                              




In [15]:
// local files
system "rm /tmp/t.json"



In [16]:
// BigQuery table
system "bq rm -f bqkdb.allBQSimpleTypes_json"

