Skip to content

Fix landings stream to be incremental#58

Merged
luandy64 merged 3 commits intomasterfrom
fix-landings-stream-to-be-incremental
May 17, 2022
Merged

Fix landings stream to be incremental#58
luandy64 merged 3 commits intomasterfrom
fix-landings-stream-to-be-incremental

Conversation

@loeakaodas
Copy link
Copy Markdown
Contributor

@loeakaodas loeakaodas commented Feb 8, 2022

Description of change

Manual QA steps

  • ran tap locally
  • ran against singer-check-tap and target-stitch --dry-run
  • ran tap-tester locally
  • tests passing in circle

Risks

  • minimal, the stream will now be incremental

Rollback steps

  • revert this branch

@loeakaodas loeakaodas requested a review from luandy64 February 8, 2022 22:22
Comment thread tap_typeform/streams.py

page_count = 2
sort = 'landed_at,asc'
sort = 'submitted_at,asc' # sorting isn't supported with `after` and defaults to submitted_at
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we know that submitted_at and landed_at are always the same? If not, then shouldn't the replication key be submitted_at?

Copy link
Copy Markdown
Contributor Author

@loeakaodas loeakaodas Feb 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are definitely not the same, at least in the test data submitted_at is always greater than landed_at. I mimicked the setup of the answers stream for making the landings stream incremental and used landed_at as the replication key. This was because both streams use get_form_responses method from http.py to call the API.

IDS.LANDINGS: {"replication_method": "INCREMENTAL", "replication_keys": ["landed_at"]},
IDS.ANSWERS: {"replication_method": "INCREMENTAL", "replication_keys": ["landed_at"]},

I'm not sure why landed_at was chosen for the answers stream, but maybe there was a case where it would be null if a form isn't submitted, possibly? The answers schema doesn't even have submitted_at as one of it's field.

To keep it consistent with the tap's behavior on the answers stream I think we should leave it as is, but I did test it with tap-tester locally and all tests were passing if I switched the landings stream to use submitted_at as the replication key.

A more robust solution would be to switch both the landings and answers stream to use submitted_at as the replication key given the API limitations when using the after query param. This would require a schema change for the answers stream and might run into data issues if submitted_at can be null.

@MarcosPala
Copy link
Copy Markdown

Any status on this?

@loeakaodas
Copy link
Copy Markdown
Contributor Author

Any status on this?

@KrisPersonal @luandy64 could this be merged? it fixes an issues for clients in Stitch and is passing tests.

@luandy64 luandy64 merged commit 5dd3da1 into master May 17, 2022
@luandy64 luandy64 deleted the fix-landings-stream-to-be-incremental branch May 17, 2022 14:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Stitch/Typeform connexion turn into full-synchronization

4 participants