[BEAM-3342] CBT Read IO Connector Wrapper #11295

Closed. mf2199 wants to merge 29 commits.

@mf2199 mf2199 commented Apr 2, 2020

[CBT connector: Expanding bigtableio with read functionality.]

@mf2199 mf2199 changed the title CBT connector: Expanding bigtableio with read functionality. [testing only] [WIP-draft] CBT connector: Expanding bigtableio with read functionality. Apr 2, 2020
@mf2199 mf2199 changed the title [testing only] [WIP-draft] CBT connector: Expanding bigtableio with read functionality. [testing only] [WIP-draft] Apr 2, 2020
@mf2199 mf2199 changed the title [testing only] [WIP-draft] [testing only - NOT READY FOR REVIEW] [WIP-draft] Apr 2, 2020
@mf2199 mf2199 changed the title [testing only - NOT READY FOR REVIEW] [WIP-draft] [WIP-draft, testing only - NOT READY FOR REVIEW] Apr 2, 2020
@mf2199 mf2199 marked this pull request as ready for review April 2, 2020 18:02
@mf2199 mf2199 closed this Apr 20, 2020
@mf2199 mf2199 reopened this May 8, 2020
@chamikaramj
Contributor

Retest this please

@chamikaramj
Contributor

Run Python PostCommit

@mf2199 mf2199 closed this Jun 6, 2020
@chamikaramj chamikaramj reopened this Jun 7, 2020
@chamikaramj
Contributor

Run Python PreCommit

@chamikaramj
Contributor

Retest this please

@chamikaramj
Contributor

Run Python PostCommit

@aaltay
Member

aaltay commented Jun 19, 2020

retest this please

@mf2199
Author

mf2199 commented Jun 21, 2020

@chamikaramj @aaltay The build errors point to 'missing' arguments that were made so by design, e.g.:

15:41:25 E       ValueError: Pipeline has validations errors: 
15:41:25 E       Missing required option: project.
15:41:25 E       Missing required option: region.
15:41:25 E       Missing GCS path option: temp_location.
15:41:25 E       Missing GCS path option: staging_location.

[from "Run Python PreCommit" log]

These are meant to be supplied via command line, as they may contain non-public information and are subject to change.

Any ideas?
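
For context, the four options named in the log can be collected from the command line and checked before the pipeline is built. The following is a minimal stdlib-only sketch, not Beam's own option validator: the option names come from the log above, while `parse_pipeline_args` and `missing_options` are hypothetical helper names used only for illustration.

```python
import argparse

# Option names taken from the PreCommit log quoted above.
# The helper names below are hypothetical, not part of Beam.
REQUIRED_OPTIONS = ("project", "region", "temp_location", "staging_location")


def parse_pipeline_args(argv):
    """Parse the known options; return (options, leftover_args)."""
    parser = argparse.ArgumentParser()
    for name in REQUIRED_OPTIONS:
        parser.add_argument("--" + name, default=None)
    # parse_known_args leaves unrecognized flags untouched so they can be
    # forwarded to whatever consumes the remaining pipeline arguments.
    return parser.parse_known_args(argv)


def missing_options(options):
    """Return the names of required options that were not supplied."""
    return [name for name in REQUIRED_OPTIONS if getattr(options, name) is None]


# Example: two of the four options are supplied, one flag is unknown.
opts, rest = parse_pipeline_args(
    ["--project", "my-project", "--region", "us-central1",
     "--runner", "DirectRunner"])
print(missing_options(opts))
print(rest)
```

Keeping the values out of the source and passing them at invocation time matches the intent stated above; the check simply makes the failure explicit when they are absent.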

@chamikaramj
Contributor

Retest this please

@chamikaramj
Contributor

chamikaramj commented Jul 22, 2020

Logs were not available anymore. Re-triggered.


"""BigTable connector

This module implements writing to BigTable tables.
Contributor

Unfortunately we forgot to mark this as experimental, so we'll have to leave the sink here and add the new source to "experimental/bigtableio.py" for backwards compatibility. When we are confident about the performance of the source, we can move it here. For example, I think we need to add support for dynamic work rebalancing similar to the Java BigTable source (can be in a future PR).
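
To illustrate the idea behind dynamic work rebalancing: a reader claims positions in its range one at a time, and the runner may ask it to give up the unprocessed tail so another worker can take it. The following is a minimal sketch under stated assumptions. Integer positions stand in for Bigtable row keys, and `SketchRangeTracker` with its methods is a hypothetical class written for this example; it is not Beam's `RangeTracker` API and not how the Java source is implemented.

```python
class SketchRangeTracker:
    """Tracks progress through the half-open integer range [start, stop).

    Hypothetical illustration only; integers stand in for row keys.
    """

    def __init__(self, start, stop):
        self.start = start
        self.stop = stop
        self.last_claimed = None

    def try_claim(self, position):
        """Claim a position; return False once it is outside the range."""
        if position >= self.stop:
            return False
        self.last_claimed = position
        return True

    def try_split(self, fraction_of_remainder):
        """Hand off the tail of the unclaimed range.

        Returns the residual range (split, old_stop) for another worker,
        or None if nothing is left to give away.
        """
        if self.last_claimed is None:
            first_unclaimed = self.start
        else:
            first_unclaimed = self.last_claimed + 1
        remaining = self.stop - first_unclaimed
        split = first_unclaimed + int(remaining * fraction_of_remainder)
        if split >= self.stop:
            return None
        residual = (split, self.stop)
        self.stop = split  # this reader now stops earlier
        return residual


tracker = SketchRangeTracker(0, 100)
for pos in range(10):
    tracker.try_claim(pos)
# Give away half of what has not been claimed yet.
print(tracker.try_split(0.5))
```

After the split, the original reader keeps claiming positions up to its new, smaller `stop`, while the returned residual range can be scheduled elsewhere.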

Author

If I understand correctly, we are to restore this file in the original version while keeping the altered one in the experimental folder. If so, then it's done.


def parse_commane_line_arguments():
  parser = argparse.ArgumentParser()

Contributor

You do not need to set these up manually. Please see the way our other integration tests are set up. You can either configure a new Jenkins test suite or add integration tests to the existing Python post-commit tests. The latter might be easier.

@aaltay
Member

aaltay commented Aug 13, 2020

@mf2199 - What is the next step on this PR?

@stale

stale bot commented Dec 13, 2020

This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@beam.apache.org list. Thank you for your contributions.

@stale stale bot added the stale label Dec 13, 2020
@stale

stale bot commented Dec 25, 2020

This pull request has been closed due to lack of activity. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time.

@stale stale bot closed this Dec 25, 2020
@param17

param17 commented Mar 23, 2021

Can we re-open this? What's the next step?
cc: @mf2199

@chamikaramj
Contributor

IIRC we could not get the BT integration tests working with this PR. Also, I think some folks on the Google side were going to improve or look into this, but I'm not sure what work happened. I would like to get this PR to a good working state (including working ITs) before submitting. I think getting a connector submitted that is not fully fleshed out will do more harm than good.

Another option would be to add a multi-language wrapper for the already established Java BT connector, which should work for portable runners and Dataflow Runner v2.

@sachinag
Contributor

sachinag commented Jul 8, 2022

@chamikaramj can you see if BT team can get this over the line please?

@aaltay
Member

aaltay commented Jul 14, 2022

/cc @johnjcasey

@chamikaramj
Contributor

I prefer implementing an x-lang-based connector instead of proceeding with this one.
