Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(ingest): respect max_threads for ingestion reporter #8521

Merged
merged 1 commit into from
Jul 28, 2023

Conversation

hsheth2
Copy link
Collaborator

@hsheth2 hsheth2 commented Jul 27, 2023

Slack ref: https://datahubspace.slack.com/archives/CUMUWQU66/p1690491969370859

Checklist

  • The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
  • Links to related issues (if applicable)
  • Tests for the changes have been added/updated (if applicable)
  • Docs related to the changes have been added/updated (if applicable). If a new feature has been added a Usage Guide has been added for the same.
  • For any breaking change/potential downtime/deprecation/big changes an entry has been made in Updating DataHub

@github-actions github-actions bot added the ingestion PR or Issue related to the ingestion of metadata label Jul 27, 2023
@@ -42,7 +42,9 @@ def create(
elif config_dict is None:
raise ConfigurationError("Missing provider configuration.")
else:
provider_config = DatahubIngestionStateProviderConfig.parse_obj(config_dict)
provider_config = (
DatahubIngestionStateProviderConfig.parse_obj_allow_extras(config_dict)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hsheth2, Thank you for the fix! I'm wondering since max_thread is also used in DatahubRestSinkConfig, which I think the one that handles the rest sink, should parse_obj_allow_extras also be called when creating DatahubRestSinkConfig?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That one actually should be fine as-is, since that class has a max threads option

The main fix here is the removal of that pydantic field removed validator, which was deleting the user provided max threads config before it got parsed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@hsheth2 hsheth2 requested a review from asikowitz July 28, 2023 19:09
@hsheth2
Copy link
Collaborator Author

hsheth2 commented Jul 28, 2023

Merging through unrelated CI failures.

@hsheth2 hsheth2 merged commit 9718505 into datahub-project:master Jul 28, 2023
41 of 44 checks passed
@hsheth2 hsheth2 deleted the fix-max-threads branch July 28, 2023 20:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ingestion PR or Issue related to the ingestion of metadata
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants