Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Dremio extractor #377

Merged
merged 1 commit into from Oct 8, 2020
Merged

feat: Dremio extractor #377

merged 1 commit into from Oct 8, 2020

Conversation

joshthoward
Copy link
Contributor

Summary of Changes

The PR adds a Dremio extractor object along with sample usage and tests. The primary file added is:

databuilder/extractor/dremio_metadata_extractor.py

No new dependency was added. This appears to be in line with the de facto standard of not adding extractor dependencies explicitly to the library's requirements.txt file. (e.g. the Big Query extractor raises a ModuleNotFoundError unless the user explicitly add the dependencies required by the extractor) Please let me know if this is an issue.

Tests

A new test was created for the above file:

tests/unit/extractor/test_dremio_metadata_extractor.py

No other test modifications are present.

Documentation

Doc strings are present, but the following example was also provided:

example/scripts/sample_dremio_data_loader.py

CheckList

Make sure you have checked all steps below to ensure a timely review.

  • [-] PR title addresses the issue accurately and concisely. Example: "Updates the version of Flask to v1.0.2"
  • [-] PR includes a summary of changes.
  • [-] PR adds unit tests, updates existing unit tests, OR documents why no test additions or modifications are needed.
  • [-] In case of new functionality, my PR adds documentation that describes how to use it.
    • All the public functions and the classes in the PR contain docstrings that explain what it does
  • [-] PR passes make test

Please let me know if any additional information is needed :)

@feng-tao
Copy link
Member

feng-tao commented Oct 2, 2020

CI fails

@feng-tao
Copy link
Member

feng-tao commented Oct 2, 2020

@joshthoward joshthoward changed the title Added Dremio extractor feature feat: Dremio extractor Oct 3, 2020
Copy link
Member

@feng-tao feng-tao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, could you also update the doc (https://github.com/amundsen-io/amundsendatabuilder#list-of-extractors) with brief information? Would be good to update https://github.com/amundsen-io/amundsen#table-connectors in a later pr as well. thanks

last_row = row
columns.append(ColumnMetadata(
row['col_name'],
unidecode(row['col_description']) if row['col_description'] else None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need this cast?

last_row['cluster'],
last_row['schema'],
last_row['name'],
unidecode(last_row['description']) if last_row['description'] else None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I saw this in another extractor, but Neo4j doesn't have issues handling unicode. This is removed.

Signed-off-by: Josh Howard <josh.t.howard@ey.com>
@joshthoward
Copy link
Contributor Author

@jinhyukchang, @allisonsuarez, @dikshathakur3119

Can I get your feedback when convenient?

@feng-tao
Copy link
Member

feng-tao commented Oct 8, 2020

cool, thanks, could you have another pr to update the doc for extractor (https://github.com/amundsen-io/amundsendatabuilder#list-of-extractors) ? thanks

@feng-tao feng-tao merged commit 63f239f into amundsen-io:master Oct 8, 2020
Wonong pushed a commit to Wonong/amundsendatabuilder that referenced this pull request Oct 16, 2020
Signed-off-by: Josh Howard <josh.t.howard@ey.com>

Co-authored-by: Josh Howard <josh.t.howard@ey.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants