Pip packages are not installed for Iceberg source plugin with Hive type #10058

Open
usmanovbf opened this issue Mar 14, 2024 · 3 comments
Labels
bug Bug report

usmanovbf commented Mar 14, 2024

Describe the bug
The pip packages required by the Iceberg source plugin are not installed when the catalog type is hive.

To Reproduce
Steps to reproduce the behavior:

  1. Create an Iceberg ingestion source from the example recipe below, but without type: hive
  2. Click Save & Run
  3. Get an error saying the type field is required
  4. Update the recipe to include type: hive
  5. Click Save & Run again
  6. See the errors (logs are below):
    1. ModuleNotFoundError: No module named 'thrift'
    2. pyiceberg.exceptions.NotInstalledError: Apache Hive support not installed: pip install 'pyiceberg[hive]'
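
A quick way to confirm which modules are actually missing is to run a small check with the same Python interpreter that executes the recipe. This is only a minimal sketch: the module list is inferred from the two errors above, and the helper name `missing_modules` is made up for illustration.

```python
import importlib.util

# Top-level modules the Hive catalog path appears to need, judging only by
# the two errors reported above (an assumption, not an official list).
REQUIRED_MODULES = ["pyiceberg", "thrift"]


def missing_modules(modules):
    """Return the names in `modules` that this interpreter cannot locate."""
    missing = []
    for name in modules:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # An empty list means every listed module is importable here.
    print(missing_modules(REQUIRED_MODULES))
```

If the list is non-empty inside the executor's environment, the packages were not installed there, regardless of what is installed elsewhere on the machine.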

Expected behavior
All required PyPI packages, including 'pyiceberg[hive]' and thrift, should be installed properly

Solution

Run pip install each time before the recipe is executed
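
A rough sketch of what such a pre-run install step could look like, assuming the step runs in the same interpreter as the ingestion. The function names `pip_install_command` and `ensure_installed` are hypothetical and are not part of DataHub.

```python
import subprocess
import sys


def pip_install_command(requirements):
    """Build a pip command bound to the interpreter that will run the recipe."""
    # Using sys.executable keeps the install in the current environment
    # instead of whichever `pip` happens to be first on PATH.
    return [sys.executable, "-m", "pip", "install", *requirements]


def ensure_installed(requirements, dry_run=False):
    """Install `requirements` before a run; with dry_run, only return the command."""
    cmd = pip_install_command(requirements)
    if not dry_run:
        # check=True raises CalledProcessError if pip exits with an error.
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # The requirement string is the one suggested by the pyiceberg error message.
    print(ensure_installed(["pyiceberg[hive]"], dry_run=True))
```

Installing on every run is slower than installing once, but pip skips requirements that are already satisfied, so the repeated cost is mostly the dependency check.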

Screenshots
image

Desktop (please complete the following information):

  • OS: macOS Sonoma arm64
  • Browser: Chrome
  • Version: 122.0.6261.112

Additional context

  1. Recipe:
source:
    type: iceberg
    config:
        env: PROD
        catalog:
            name: iceberg-catalog
            type: hive
            config:
                uri: 'https://hostname1:9083'
                s3.endpoint: 'https://hostname2'
                s3.access-key-id: '${secret1}'
                s3.secret-access-key: '${secret2}'
        table_pattern:
            allow:
                - 'test.*'
        profiling:
            enabled: false
  2. Error logs: exec-urn_li_dataHubExecutionRequest_1d2b870e-81e9-477a-8869-39505a9f2b3d.log
  3. Even adding Extra Pip Libraries does not help
     [image: Extra Pip Libraries]
  4. DataHub version 0.12.1
  5. As far as I can see, this was not fixed between 0.12.1 and 0.13.0: https://github.com/datahub-project/datahub/commits/v0.12.1/metadata-ingestion/src/datahub/ingestion/source/iceberg
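
To illustrate the expected behavior, here is a minimal sketch of how the extra requirement could be derived from the catalog type in the recipe above. The mapping and the function name `required_pip_packages` are hypothetical. Only the hive entry comes from the error message in this issue, and the recipe is written as a plain dict to avoid depending on a YAML parser.

```python
# Hypothetical mapping from catalog type to the pip requirement it needs.
# Only the "hive" entry is taken from the error message in this issue.
CATALOG_REQUIREMENTS = {"hive": "pyiceberg[hive]"}


def required_pip_packages(recipe):
    """Return the extra pip requirements implied by the recipe's catalog type."""
    catalog = recipe.get("source", {}).get("config", {}).get("catalog", {})
    requirement = CATALOG_REQUIREMENTS.get(catalog.get("type"))
    # Unknown or missing catalog types yield no extra requirements.
    return [requirement] if requirement else []


# The relevant part of the recipe from this issue, as a dict.
recipe = {
    "source": {
        "type": "iceberg",
        "config": {"catalog": {"name": "iceberg-catalog", "type": "hive"}},
    }
}

if __name__ == "__main__":
    print(required_pip_packages(recipe))
```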
usmanovbf added the bug (Bug report) label Mar 14, 2024
hsheth2 (Collaborator) commented Mar 19, 2024

@usmanovbf would you be open to sending a PR for this?

usmanovbf (Author) commented

@hsheth2 sorry, I don't have time right now. I hope you or a teammate can find some time to fix it.

igorvoltaic commented

Might be related to #10289.
