[Task]: Cloud Bigtable Change Stream Java Connector Release #27183

Closed
1 of 15 tasks
tonytanger opened this issue Jun 20, 2023 · 4 comments

@tonytanger
Contributor

What needs to happen?

Cloud Bigtable Change Stream Java Connector is ready for general use.

Issue Priority

Priority: 2 (default / most normal work should be filed as P2)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner
@tonytanger
Contributor Author

.take-issue

@JrSchild

JrSchild commented Oct 20, 2023

Hi @tonytanger,

Apologies for digging up this issue. Thank you for your work on the Bigtable change streams! This is really helpful. Would you be able to give an update on the status of the Python SDK for Bigtable change streams? If this is not yet available, is there a way for our data engineering team to leverage their Python experience while using the Java implementation to stream data from Bigtable?
Our goal is to consume the change stream (preferably in Python), apply some transformations, and write the result to BigQuery from a Python environment.

Edit: My current strategy is to create a single JAR file as a standalone PTransform, which accepts a single message with the BigTable from the Python side, and then starts streaming the changes back out into my Python environment. Similar to the example. Would that be a reasonable approach?

Much appreciated!

@tonytanger
Contributor Author

We currently don't have plans to implement the Python SDK.

I can't comment much on the multi-language aspect as I'm not familiar with it. But if you're able to set up the Java side and have it output the ChangeStreamMutation messages in a form that Python can consume, then it sounds like a possible solution.

Another possible solution is to set up a Java Beam pipeline that outputs to another system, such as Cloud Pub/Sub, which supports Python directly.
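
For the Pub/Sub route, the Python side would mostly need to decode whatever the Java pipeline publishes and reshape it for BigQuery. Below is a minimal sketch of that step only, using the standard library. The JSON layout (`row_key`, `commit_timestamp`, `cells`, and base64-encoded `value`) is purely an assumption for illustration; the real layout is whatever the Java pipeline chooses when it serializes each ChangeStreamMutation, and the function name `change_record_to_rows` is hypothetical.

```python
import base64
import json
from typing import Any, Dict, List


def change_record_to_rows(payload: bytes) -> List[Dict[str, Any]]:
    """Flatten one JSON-encoded change record into one dict per changed cell.

    ASSUMPTION: the Java pipeline publishes each change as a JSON object
    shaped like
        {"row_key": ..., "commit_timestamp": ..., "cells": [
            {"family": ..., "qualifier": ..., "value": <base64 string>}]}
    This layout is hypothetical and is not defined by the connector.
    """
    record = json.loads(payload.decode("utf-8"))
    rows = []
    for cell in record.get("cells", []):
        rows.append({
            "row_key": record["row_key"],
            "commit_timestamp": record["commit_timestamp"],
            "column_family": cell["family"],
            "column_qualifier": cell["qualifier"],
            # Cell values are raw bytes in Bigtable; this sketch assumes they
            # were base64-encoded for JSON transport and hold UTF-8 text.
            "value": base64.b64decode(cell["value"]).decode("utf-8"),
        })
    return rows


if __name__ == "__main__":
    # Build a sample message in the assumed layout and flatten it.
    sample = json.dumps({
        "row_key": "user#123",
        "commit_timestamp": "2023-10-20T12:00:00Z",
        "cells": [{
            "family": "profile",
            "qualifier": "name",
            "value": base64.b64encode(b"Ada").decode("ascii"),
        }],
    }).encode("utf-8")
    for row in change_record_to_rows(sample):
        print(row)
```

In a Beam Python pipeline, a function like this could plausibly sit between `beam.io.ReadFromPubSub` and `beam.io.WriteToBigQuery` via `beam.FlatMap`, since it returns a list of row dicts per message.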

If you would like to discuss your use case more, please file a support ticket with Cloud Bigtable, or feel free to make a feature request.

@JrSchild

Thank you for the reply. We will POC the multi-language pipeline first and keep Pub/Sub as a possible backup solution. I'll file a support ticket if we need more information.
