Support BigQuery for tensorflow-io #29

Closed
yongtang opened this issue Dec 16, 2018 · 6 comments · Fixed by #328
Labels
enhancement Enhancement request

Comments

@yongtang
Member

BigQuery is Google's serverless cloud data analytics platform. During the last SIG I/O call, support for BigQuery (BigQueryDataset) was mentioned.

It would be good to add BigQuery (BigQueryDataset) to tensorflow-io so that users could use TensorFlow for more big data analytics in the cloud.

Note that Google's BigQuery seems to have no C++ client library. It does have a Python client library and a RESTful API. The implementation will likely need to use BigQuery's Python client library, or HTTP?

@yongtang added the enhancement (Enhancement request) label on Mar 3, 2019
@yongtang
Member Author

yongtang commented Mar 3, 2019

One challenge is that BigQuery essentially returns a JSON response, so a JSON library with good performance in C/C++ is needed.
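To make the JSON concern concrete, here is a minimal Python sketch of the per-cell decoding work involved. It is not the proposed C/C++ implementation, only an illustration. The payload is a hand-written sample shaped the way I understand BigQuery's REST query responses to look (a `schema.fields` list plus `rows` of `{"f": [{"v": ...}]}` cells, with values sent as strings). The names `SAMPLE_RESPONSE`, `CONVERTERS`, and `decode_rows` are hypothetical and do not come from any BigQuery or tensorflow-io API.

```python
import json

# Hand-written sample shaped like a BigQuery REST query response
# (assumed layout: schema.fields plus rows of {"f": [{"v": ...}]}). Not real data.
SAMPLE_RESPONSE = """
{
  "schema": {"fields": [
    {"name": "word", "type": "STRING"},
    {"name": "word_count", "type": "INTEGER"},
    {"name": "score", "type": "FLOAT"}
  ]},
  "rows": [
    {"f": [{"v": "hamlet"}, {"v": "42"}, {"v": "0.5"}]},
    {"f": [{"v": "ophelia"}, {"v": "7"}, {"v": null}]}
  ]
}
"""

# Converters for a few scalar types; any other type keeps its raw value.
CONVERTERS = {"STRING": str, "INTEGER": int, "FLOAT": float}


def decode_rows(payload):
    """Hypothetical helper: turn a response body into a list of typed dicts."""
    doc = json.loads(payload)
    fields = doc["schema"]["fields"]
    decoded = []
    for row in doc.get("rows", []):
        record = {}
        for field, cell in zip(fields, row["f"]):
            value = cell["v"]
            convert = CONVERTERS.get(field["type"])
            # NULL cells arrive as JSON null and stay None.
            record[field["name"]] = (
                convert(value) if value is not None and convert else value
            )
        decoded.append(record)
    return decoded


if __name__ == "__main__":
    for record in decode_rows(SAMPLE_RESPONSE):
        print(record)
```

Every cell goes through a parse plus a string-to-type conversion, which is why the speed of the JSON library matters once result sets get large.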

@suphoff
Contributor

suphoff commented Mar 4, 2019

@yongtang: I used jsoncpp last year (https://github.com/open-source-parsers/jsoncpp) for a RESTful API. It was nowhere near a performance-critical path, so I can't speak to its performance. However, it was trivial to use for encoding and decoding, and it was super easy to build as an external dependency with cmake.

@suphoff
Contributor

suphoff commented Mar 4, 2019

@yongtang: Just checked, and it is also used in TF, so all the Bazel build infrastructure should already be there.

@yongtang
Member Author

yongtang commented Mar 4, 2019

@suphoff Thanks! Will take a look at jsoncpp.

@vlasenkoalexey
Contributor

I've recently joined the Google CloudAI team and am going to work on this.
Btw, BigQuery has recently released the https://cloud.google.com/bigquery/docs/reference/storage/rpc/google.cloud.bigquery.storage.v1beta1#google.cloud.bigquery.storage.v1beta1.ReadRowsRequest API, which uses gRPC and is specifically designed for high-performance scenarios like this one.

@yongtang
Member Author

yongtang commented Jun 6, 2019

Thanks @vlasenkoalexey 👍 ! Let me know if you have any questions or if there is anything I can help with.

3 participants