TDL-17000: fix breaking changes and TDL-17017: Fix bookmark for conversation_parts stream. #45
Conversation
tap_intercom/streams.py
Outdated
"""
# Get bookmark of parent stream `conversations` from tap_state
parent_bookmark = singer.get_bookmark(tap_state,
self.parent.tap_stream_id,
This approach isn't in line with our best practices. All streams should bookmark independently. State dependencies like this cause code that is very hard to read and maintain (as seen with the duplicate concept of "tap_state" versus "state" here).
So, the conversation_parts stream will need a custom sync method regardless of its approach, so it can make its own way through the parent stream and store its own bookmarks as it needs.
Additionally, the parent should be rather low volume (I'd expect below 100k), so any optimization gained from this approach is lost in code complexity.
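A minimal sketch of what "bookmark independently" could look like, assuming the usual Singer state shape of `{"bookmarks": {stream_id: {key: value}}}`. The two helpers are plain-dict stand-ins written for illustration, not the tap's actual code, and the epoch values are made up.

```python
from typing import Any, Dict

State = Dict[str, Any]


def get_bookmark(state: State, stream_id: str, key: str, default: Any = None) -> Any:
    """Read one stream's bookmark without touching any other stream's entry."""
    return state.get("bookmarks", {}).get(stream_id, {}).get(key, default)


def write_bookmark(state: State, stream_id: str, key: str, value: Any) -> State:
    """Store a bookmark under the stream's own id."""
    state.setdefault("bookmarks", {}).setdefault(stream_id, {})[key] = value
    return state


# The parent already has a bookmark; the child starts from its own (absent) one.
state: State = {"bookmarks": {"conversations": {"updated_at": 200}}}
parts_start = get_bookmark(state, "conversation_parts", "updated_at", default=0)

# Writing the child's bookmark leaves the parent's entry untouched.
write_bookmark(state, "conversation_parts", "updated_at", 150)
```

With this layout each stream reads and writes only its own entry, so there is no need for a second "tap_state" concept alongside "state".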
tap_intercom/streams.py
Outdated
@@ -373,7 +377,7 @@ def get_records(self, bookmark_datetime=None, is_parent=False) -> Iterator[list]
yield from records


-class ConversationParts(Conversations):
+class ConversationParts(FullTableStream):
As a design consideration, this stream is a sort of custom concept, and I'd expect that it can probably inherit directly from the BaseStream class to do its custom bookmarking independent of the other more straightforward sync logic.
I would also still call this an incremental stream, as it bookmarks on its parent's replication key. This approach is just a more coarse-grained incremental extraction, but it's closer to incremental than full table. It was mislabeled as full table in the past, which caused this confusion in the first place.
We updated the conversation_parts sync as suggested. ConversationParts now inherits from BaseStream and stays incremental through a custom sync that bookmarks on its parent's replication key.
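A rough sketch of such a custom sync, assuming `updated_at` is the parent's replication key and is stored as epoch seconds. The function name, the `fetch_parts` callback, and the record shapes are hypothetical and only illustrate the idea; the real implementation lives in tap_intercom/streams.py.

```python
from typing import Any, Callable, Dict, Iterable, List

STREAM_ID = "conversation_parts"
REPLICATION_KEY = "updated_at"  # assumed: the parent's replication key


def sync_conversation_parts(
    state: Dict[str, Any],
    conversations: Iterable[Dict[str, Any]],
    fetch_parts: Callable[[str], List[Dict[str, Any]]],
) -> List[Dict[str, Any]]:
    """Walk the parent records and bookmark under the child's own stream id."""
    bookmarks = state.setdefault("bookmarks", {}).setdefault(STREAM_ID, {})
    start = bookmarks.get(REPLICATION_KEY, 0)
    max_seen = start
    records: List[Dict[str, Any]] = []

    for conversation in conversations:
        # Skip parents older than the child's bookmark; equal values are kept.
        if conversation[REPLICATION_KEY] < start:
            continue
        for part in fetch_parts(conversation["id"]):
            records.append({**part, "conversation_id": conversation["id"]})
        max_seen = max(max_seen, conversation[REPLICATION_KEY])

    # The bookmark granularity is the parent record, hence "coarse-grained".
    bookmarks[REPLICATION_KEY] = max_seen
    return records
```

The function never reads or writes a `conversations` bookmark, so the parent stream can be selected, deselected, or re-synced without affecting the child's progress.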
'operator': '=',
'value': self.dt_to_epoch_seconds(bookmark_datetime)
}]
},
Updated the search query for the conversations stream to fetch records greater than or equal to the bookmark, the same as v1.1.3.
Just a few minor changes and cleanup, I think overall this looks good.
Looks good to me!
* TDL-17000: fix breaking changes and TDL-17017: Fix bookmark for conversation_parts stream. (#45)
* Reverted conversation_parts to full_table for parent as V1.1.3
* Updated bookmark test to handle conversation_parts
* Updated comments
* Added code to select parent_stream if only child stream is selected same as v1.1.3
* Added code comments
* Resolved review comment
* Resolved review comment
* Added code comments
* Added bookmark mechanism for conversation_parts same as before but independent of conversations
* Added code comments
* Added query parameter to get conversations greater and equal to bookmark
* Adding changes of PR42 for better unit tests
* Added logger messages
* Added unit test
* Updated unit test
* Updated unit test
* Updated variables in unit test
* Added time_extracted for conversation_parts
* Resolved review comments
* Added companies to untestable for bookmark test
* Reverted changes of last commit
* Tdl 17003 Added back missing field for contact streams (#43)
* Reverted the logic as per old version
* Removed leads as well as it is not supported by tap
* Tdl 17006 revert back companies to incremental (#44)
* Reverted companies to incremental stream
* Debugging integration test
* Removed print statement
* Added companies to untestable for bookmark test
* TDL-17002: Make the logger messages more descriptive (#46)
* added logger messages
* removed unnecessary loggers
* removed unnecessary loggers
* updated the code to use stream name instead of parent stream name

Co-authored-by: savan-chovatiya <80703490+savan-chovatiya@users.noreply.github.com>
Co-authored-by: Umang Agrawal <80704207+umangagrawal-crest@users.noreply.github.com>
Description of change
TDL-17000: fix breaking changes
TDL-17017: Fix bookmark for conversation_parts stream.
Inherited ConversationParts from BaseStream and created custom sync and get_records which will work in the following manner.
Updated the search query for the conversations stream to consider records greater than or equal to the bookmark returned from the API, because the last stable version, 1.1.3, also included records whose replication key equals the bookmark value.
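One way to express "greater than or equal" is sketched below, assuming the query combines a `>` clause and an `=` clause under an `OR`, and that the filtered field is `updated_at`. Only the `=` clause and the `dt_to_epoch_seconds` name appear in the diff; the surrounding structure, the field name, and the function names here are assumptions made for illustration.

```python
from datetime import datetime, timezone
from typing import Any, Dict


def dt_to_epoch_seconds(dt: datetime) -> int:
    """Convert a timezone-aware datetime to whole epoch seconds."""
    return int(dt.timestamp())


def build_search_query(bookmark_datetime: datetime) -> Dict[str, Any]:
    """Build a filter matching records at or after the bookmark."""
    epoch = dt_to_epoch_seconds(bookmark_datetime)
    return {
        "query": {
            "operator": "OR",
            "value": [
                # Strictly newer than the bookmark...
                {"field": "updated_at", "operator": ">", "value": epoch},
                # ...or exactly equal to it, so boundary records are not skipped.
                {"field": "updated_at", "operator": "=", "value": epoch},
            ],
        }
    }


query = build_search_query(datetime(1970, 1, 2, tzinfo=timezone.utc))
```

Including the equality clause means a record sharing the bookmark's timestamp is fetched again on the next run, which trades a few duplicate rows for not missing records written in the same second.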
Manual QA steps
Risks
Rollback steps