Update replication_key state for descending streams #152

Closed
wants to merge 7 commits into from

ericboucher
Contributor

A few limitations in the GitHub API force us to paginate a few streams in descending order. Since we cannot leverage the automated signpost updates from the SDK, this PR makes sure we properly update the state at the end of the run.

In addition, this PR switches the commits stream to use use_fake_since_parameter since this endpoint paginates in descending order.

@ericboucher
Contributor Author

@edgarrmondragon could you help confirm that this will do the following:

  • update the state with the time of the start of the run, in the event that the run completes successfully
  • the next time we run, the signpost will be used as the "get_starting_timestamp" if none is passed in the config
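The two expectations above can be sketched with plain dictionaries. This is only an illustration, not the SDK's implementation: `promote_signpost` and `resolve_start` are hypothetical helpers, the `replication_key_value` and `start_date` keys are assumed names, and the precedence in `resolve_start` simply follows the wording of the second bullet rather than describing how the SDK resolves it.

```python
from typing import Dict, Optional


def promote_signpost(state: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of `state` whose bookmark is set to the signpost, if one exists.

    Hypothetical helper: the key names are assumptions made for this sketch.
    """
    new_state = dict(state)
    if {"replication_key", "replication_key_signpost"}.issubset(state):
        new_state["replication_key_value"] = state["replication_key_signpost"]
    return new_state


def resolve_start(state: Dict[str, str], config: Dict[str, str]) -> Optional[str]:
    """Use the configured start date if present, otherwise the stored bookmark."""
    return config.get("start_date") or state.get("replication_key_value")


# State as it might look at the end of a successful run (values are made up).
signpost = "2022-07-01T14:00:00+00:00"
state = {"replication_key": "commit_timestamp", "replication_key_signpost": signpost}

after_run = promote_signpost(state)
next_start = resolve_start(after_run, config={})
```

With no start date in the config, `next_start` is the signpost from the previous run, which is the behaviour the second bullet asks about.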

Comment on lines +68 to +80
yield from super().get_records(context)

# Important - Update state for streams in descending order
if self.use_fake_since_parameter:
    state = self.get_context_state(context)
    if set(["replication_key_signpost", "replication_key"]).issubset(
        state.keys()
    ):
        record: Dict = {}
        record[state["replication_key"]] = state["replication_key_signpost"]
        self._increment_stream_state(
            latest_record=record,
            context=context,
Member

@ericboucher this will not run until get_records is actually iterated over, given the lazy nature of generators.
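The laziness mentioned here can be shown with a small standalone generator. This is a minimal sketch that does not use the SDK; `records_then_finalize` and the `log` list are hypothetical stand-ins for `get_records` and the state update.

```python
from typing import Dict, Iterator, List


def records_then_finalize(log: List[str]) -> Iterator[Dict[str, int]]:
    """Yield two records, then run a finalization step."""
    yield from ({"id": i} for i in range(2))
    # Reached only once the consumer has drained every record above.
    log.append("state updated")


log: List[str] = []
gen = records_then_finalize(log)  # calling the function runs none of its body

after_call = list(log)            # snapshot: nothing has run yet
first = next(gen)                 # pulls the first record only
after_first = list(log)           # finalization still has not run
rest = list(gen)                  # drains the generator
after_exhaustion = list(log)      # finalization has now run
```

If the caller stops iterating early, for example because of an exception or a record limit, the code after `yield from` never executes, so a state update placed there is skipped.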

Contributor Author

Do you mean it only runs at the end of super().get_records()? If so, that is the goal. Or am I missing something? Does get_records() need to be called in a special way, or should this live somewhere else? @edgarrmondragon

ericboucher changed the title from "Properly update state signpost for desc streams" to "Update replication_key state for descending streams" on Jun 28, 2022
@ericboucher
Contributor Author

I am not entirely sure it is working as expected and don't really have a good way to test. Open to ideas @laurentS @edgarrmondragon :)

@laurentS
Contributor

> I am not entirely sure it is working as expected and don't really have a good way to test. Open to ideas @laurentS @edgarrmondragon :)

@ericboucher I've added a test for your code. I think it does what you want. It's a bit clunky as I had to fake both the system time and the response from github, but might be a useful example for other corner cases we need to verify in the future. Feel free to adjust if I've missed anything!
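For readers who have not seen the test, the general idea of faking both the clock and the API response can be sketched as follows. This is not the test added to the PR: `Clock`, `sync_commits`, the `commit_timestamp` key and the fake payload are all hypothetical, and `unittest.mock.patch.object` is used here only as one possible way to freeze time.

```python
from datetime import datetime, timezone
from typing import Callable, Dict, List
from unittest.mock import patch


class Clock:
    """Hypothetical wrapper around the system time so tests can replace it."""

    @staticmethod
    def utc_now() -> datetime:
        return datetime.now(timezone.utc)


def sync_commits(fetch_page: Callable[[], List[Dict[str, str]]]) -> Dict[str, object]:
    """Read one page of records and bookmark the time at which the run started."""
    signpost = Clock.utc_now()
    records = fetch_page()
    return {
        "replication_key": "commit_timestamp",  # assumed key name
        "replication_key_value": signpost.isoformat(),
        "record_count": len(records),
    }


FAKE_NOW = datetime(2022, 7, 1, 14, 0, 0, tzinfo=timezone.utc)
fake_response = [{"sha": "abc", "commit_timestamp": "2022-06-30T10:00:00+00:00"}]

# Freeze the clock and substitute the HTTP call with a canned response.
with patch.object(Clock, "utc_now", return_value=FAKE_NOW):
    state = sync_commits(lambda: fake_response)
```

Because both inputs are fixed, the resulting bookmark is deterministic and can be compared against an exact timestamp.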

@sonarcloud

sonarcloud bot commented Jul 19, 2022

Kudos, SonarCloud Quality Gate passed!

0 Bugs (rating A)
0 Vulnerabilities (rating A)
0 Security Hotspots (rating A)
1 Code Smell (rating A)

No coverage information
0.0% duplication

for name, stream in tap2.streams.items():
    if name == "commits":
        s = stream.stream_state
        # the bookmark should be the timestamp of the latest commit
Contributor Author

ericboucher commented Jul 21, 2022

I don't think this is enough to test completely.

And actually, I was expecting the bookmark to be 2022-07-01 14:00:00 = utc_now since we use state["replication_key_signpost"]

Am I missing something @edgarrmondragon?

@laurentS
Contributor

laurentS commented Nov 9, 2022

This PR is probably not needed anymore with meltano/sdk#1164

@ericboucher
Contributor Author

Solved by #164
