Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(ingest): unbundle airflow plugin emitter dependencies #7493

Merged
merged 2 commits into from
Mar 7, 2023

Conversation

cburroughs
Copy link
Contributor

As described at length in
https://github.com/datahub-project/datahub/blob/v0.10.0/metadata-ingestion/setup.py#L62 the current state of platform compatibility for the confluent_kafka is complicated. It can be a source of installation incompatibility and general friction. To make the airflow plugin easier to include in resolvable dependencies sets across a variety of platforms, unbundle the datahub-kafka emitter into extras_require.

Checklist

  • [X ] The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
  • Links to related issues (if applicable)
  • Tests for the changes have been added/updated (if applicable)
  • [ X] Docs related to the changes have been added/updated (if applicable). If a new feature has been added a Usage Guide has been added for the same.
  • [X ] For any breaking change/potential downtime/deprecation/big changes an entry has been made in Updating DataHub

As described at length in
<https://github.com/datahub-project/datahub/blob/v0.10.0/metadata-ingestion/setup.py#L62>
the current state of platform compatibility for the confluent_kafka is
complicated.  It can be a source of installation incompatibility and
general friction.  To make the airflow plugin easier to include in
resolvable dependencies sets across a variety of platforms, unbundle the
datahub-kafka emitter into extras_require.
@github-actions github-actions bot added docs Issues and Improvements to docs ingestion PR or Issue related to the ingestion of metadata labels Mar 3, 2023
@cburroughs
Copy link
Contributor Author

(I think the slight asymmetry where the rest emitter is included by default instead of both being extras is the right tradeoff: For new people trying it out the rest emitter is almost certainly what they first want, and the use cases for excluding requests from an airflow install seem unlikely. I don't have a strong opinion however.)

Copy link
Collaborator

@hsheth2 hsheth2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

I pushed a small tweak to the docs

@cburroughs out of curiosity, what issues were you running into with confluent_kafka? Was it the same build issues on M1 macs or something else?

@cburroughs
Copy link
Contributor Author

@cburroughs out of curiosity, what issues were you running into with confluent_kafka? Was it the same build issues on M1 macs or something else?

In short, yeah the lack of sufficient wheels for the versions within the datahub constraints.

@hsheth2 hsheth2 merged commit cc0772f into datahub-project:master Mar 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs Issues and Improvements to docs ingestion PR or Issue related to the ingestion of metadata
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants