-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
🎉 Google sheets bugfix: handle duplicate sheet headers #2905
🎉 Google sheets bugfix: handle duplicate sheet headers #2905
Conversation
@makalaaneesh thanks for your PR! I will review this very soon, just a little crunched on time yesterday :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@makalaaneesh looks great! just a few small tweaks but I think we can merge this very soon.
if duplilcate_headers: | ||
duplicate_headers_in_sheet[sheet_name] = duplilcate_headers | ||
except Exception as err: | ||
logger.error(str(err)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if there is an error we should also return a fail message
for sheet_name in sheet_names: | ||
try: | ||
header_row_data = Helpers.get_first_row(client, spreadsheet_id, sheet_name) | ||
_, duplilcate_headers = Helpers.get_vaild_headers_and_duplicates(header_row_data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: duplilcate
should be duplicate
@@ -63,6 +63,27 @@ def check(self, logger: AirbyteLogger, config: json) -> AirbyteConnectionStatus: | |||
status=Status.FAILED, message=f"Unable to connect with the provided credentials to spreadsheet. Error: {reason}" | |||
) | |||
|
|||
# Check for duplicates |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# Check for duplicates | |
# Check for duplicate headers |
return AirbyteStream(name=sheet_name, json_schema=sheet_json_schema) | ||
|
||
@staticmethod | ||
def get_vaild_headers_and_duplicates(header_row_values: List[str]) -> (List[str], List[str]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: get_valid...
self.assertEqual(expected_stream, actual_stream) | ||
|
||
def test_duplicate_headers_to_ab_stream_fails(self): | ||
def test_duplicate_headers_retrived(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
awesome! 👍🏼
So many typos, my bad! |
I wish all PRs had typos as their main issue , I'll say that 😜 |
thanks @makalaaneesh ! |
What
This solves #2153
How
check
fails if there are duplicate headers with a message.discover
ignores duplicate headers from the AirbyteStream. It logs a warningPre-merge Checklist
Recommended reading order