fix(ingest/glue): Profiling breaks for non-partitioned tables due to absent Table.PartitionKeys #9591

Merged
merged 8 commits into from Jan 26, 2024

Contributor

@KulykDmytro KulykDmytro commented Jan 9, 2024

Issue

Profiling a Glue table can fail with the following error:

  File "/home/airflow/.local/lib/python3.11/site-packages/datahub/ingestion/source/aws/glue.py", line 836, in get_profile_if_enabled
    partition_keys = response["Table"]["PartitionKeys"]
                     ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
KeyError: 'PartitionKeys'

This happens because the response for a non-partitioned table does not contain the Table.PartitionKeys key.
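
The failure mode described above can be reproduced with plain dictionaries. This is a minimal sketch, not the DataHub code: the two response dicts are hypothetical, trimmed-down stand-ins for what a Glue table lookup returns, and the helper names `partition_keys_strict` and `partition_keys_safe` are made up for illustration.

```python
# Hypothetical, trimmed-down responses; real Glue responses carry many more fields.
partitioned = {
    "Table": {"Name": "events", "PartitionKeys": [{"Name": "dt", "Type": "string"}]}
}
non_partitioned = {"Table": {"Name": "lookup"}}  # no "PartitionKeys" key at all


def partition_keys_strict(response):
    # Subscript access, as in the traceback: raises KeyError when the key is absent.
    return response["Table"]["PartitionKeys"]


def partition_keys_safe(response):
    # dict.get returns None instead of raising when the key is absent.
    return response["Table"].get("PartitionKeys")


if __name__ == "__main__":
    print(partition_keys_safe(partitioned))
    print(partition_keys_safe(non_partitioned))
    try:
        partition_keys_strict(non_partitioned)
    except KeyError as exc:
        print("KeyError:", exc)
```

Because both `None` and an empty list are falsy, a subsequent `if partition_keys:` check treats a missing key the same way as an empty key list.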

Checklist

  • The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
  • Links to related issues (if applicable)
  • Tests for the changes have been added/updated (if applicable)
  • Docs related to the changes have been added/updated (if applicable). If a new feature has been added a Usage Guide has been added for the same.
  • For any breaking change/potential downtime/deprecation/big changes an entry has been made in Updating DataHub

@KulykDmytro KulykDmytro changed the title (ingest/glue) Table.PartitionKeys can be absent for non-partitioned tables fix (ingest/glue) Table.PartitionKeys can be absent for non-partitioned tables Jan 9, 2024
@KulykDmytro KulykDmytro changed the title fix (ingest/glue) Table.PartitionKeys can be absent for non-partitioned tables fix(ingest/glue) Table.PartitionKeys can be absent for non-partitioned tables Jan 9, 2024
@KulykDmytro KulykDmytro changed the title fix(ingest/glue) Table.PartitionKeys can be absent for non-partitioned tables fix(ingest/glue) Profiling breaks for non-partitioned tables due to absent Table.PartitionKeys Jan 12, 2024
  # check if this table is partitioned
- if partition_keys:
+ if partition_keys := response["Table"].get("PartitionKeys"):
Collaborator

We can't use the walrus operator (:=) since we need to maintain compatibility with Python 3.7.

Contributor Author

Got it, will change.
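
For reference, assignment expressions (:=) were added in Python 3.8, so a 3.7-compatible form splits the assignment and the test into two statements. The sketch below shows that shape under stated assumptions: `profiling_path` and its return strings are hypothetical and only illustrate the branching. It is not necessarily the exact code that was merged.

```python
def profiling_path(response):
    """Report which branch a table would take (hypothetical helper for illustration)."""
    # Python 3.7-compatible: plain assignment first, then the truthiness check.
    partition_keys = response["Table"].get("PartitionKeys")

    # check if this table is partitioned
    if partition_keys:
        return "per-partition"
    # A missing key (None) and an empty list both land here.
    return "whole-table"


if __name__ == "__main__":
    print(profiling_path({"Table": {"PartitionKeys": [{"Name": "dt"}]}}))
    print(profiling_path({"Table": {}}))
```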

@hsheth2 hsheth2 added community-contribution PR or Issue raised by member(s) of DataHub Community merge-pending-ci A PR that has passed review and should be merged once CI is green. labels Jan 25, 2024
@hsheth2 hsheth2 changed the title fix(ingest/glue) Profiling breaks for non-partitioned tables due to absent Table.PartitionKeys fix(ingest/glue): Profiling breaks for non-partitioned tables due to absent Table.PartitionKeys Jan 26, 2024
@hsheth2 hsheth2 merged commit fc27ab2 into datahub-project:master Jan 26, 2024
53 checks passed