feat(cli): support datahub ingest mcps #7871

Merged
merged 4 commits into datahub-project:master from ingest-file May 23, 2023

Conversation

hsheth2
Collaborator

@hsheth2 hsheth2 commented Apr 20, 2023

Checklist

  • The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
  • Links to related issues (if applicable)
  • Tests for the changes have been added/updated (if applicable)
  • Docs related to the changes have been added/updated (if applicable). If a new feature has been added a Usage Guide has been added for the same.
  • For any breaking change/potential downtime/deprecation/big changes an entry has been made in Updating DataHub

@github-actions github-actions bot added the ingestion label (PR or Issue related to the ingestion of metadata) Apr 20, 2023
@@ -233,6 +233,31 @@ def parse_restli_response(response):
    return rows


@ingest.command()
@click.argument("path", type=click.Path(exists=True))
def metadata_file(path: str) -> None:
Contributor

This works even if you point the source at a directory that contains one or more metadata event files, so the docs should reflect that.

Collaborator Author

Updated.
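
For readers following the thread, here is a minimal sketch of what "a file or a directory of files" could mean in practice. It is not the code from this PR: `list_metadata_files` and the `.json` suffix filter are hypothetical, and the actual file source may select files differently.

```python
from pathlib import Path
from typing import List


def list_metadata_files(path: str, suffix: str = ".json") -> List[Path]:
    """Return the metadata files that `path` refers to.

    Hypothetical helper: a file path is returned as-is, while a directory is
    searched recursively for files ending in `suffix`.
    """
    root = Path(path)
    if root.is_file():
        return [root]
    if root.is_dir():
        # Sort so the order is deterministic across platforms.
        return sorted(p for p in root.rglob(f"*{suffix}") if p.is_file())
    raise FileNotFoundError(f"{path} is neither a file nor a directory")
```

With a helper of this shape, the command can accept a single PATH argument and treat both cases uniformly.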

    """
    Ingest from a metadata json file.

    This requires that you've run `datahub init` to set up your config.
Contributor

It shouldn't require that, since the sink is auto-inferred.

Collaborator Author

The sink is inferred from your env variables or your ~/.datahubenv config, so users do need to run `datahub init` first.
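
To make the inference described above concrete, here is a rough sketch of the lookup order. Everything specific in it is an assumption rather than something stated in this thread: the variable name `DATAHUB_GMS_URL`, the `{"gms": {"server": ...}}` layout of the parsed config, the env-before-file precedence, and the `resolve_server` helper itself.

```python
from typing import Mapping, Optional


def resolve_server(
    env: Mapping[str, str],
    file_config: Optional[Mapping[str, Mapping[str, str]]],
) -> str:
    """Pick the server URL: env variable first, then the parsed config file."""
    # Assumed variable name; check the CLI docs for the real one.
    url = env.get("DATAHUB_GMS_URL")
    if url:
        return url
    # Assumed layout of the parsed ~/.datahubenv contents.
    if file_config is not None:
        server = file_config.get("gms", {}).get("server")
        if server:
            return server
    raise RuntimeError(
        "no server configured: set the env variable or run `datahub init`"
    )
```

Passing `os.environ` and the parsed config file (or `None` if the file is missing) would cover both cases mentioned in the reply: with neither present, there is nothing to infer the sink from.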

@@ -233,6 +233,31 @@ def parse_restli_response(response):
    return rows


@ingest.command()
Contributor

I'm a bit nervous about taking the top-level default option on this one even though I like the convenience of it.

Why should

datahub ingest <something>

default to ingesting an MCP file or directory with MCP files?

versus:

datahub ingest --source file <path>

Collaborator

My reading here is that we would have

datahub ingest --path <location>

Is that right? If so, this feels fairly natural.

Collaborator Author

You use this as `datahub ingest metadata-file my_file.json`:

$ datahub ingest metadata-file --help
Usage: datahub ingest metadata-file [OPTIONS] PATH

  Ingest from a metadata json file or directory of files.

  This requires that you've run `datahub init` to set up your config.

Options:
  --help  Show this message and exit.
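
The command shape in the help output above (a named subcommand under `ingest` with one positional PATH) can be sketched as follows. This uses the standard library's `argparse` rather than the `click` decorators in the diff, purely to keep the example dependency-free; `build_parser` is a hypothetical name, and the `exists=True` path check from the diff is not reproduced.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for `datahub ingest metadata-file PATH` (sketch only)."""
    parser = argparse.ArgumentParser(prog="datahub")
    commands = parser.add_subparsers(dest="command", required=True)

    ingest = commands.add_parser("ingest")
    ingest_commands = ingest.add_subparsers(dest="ingest_command", required=True)

    # A named subcommand keeps the top-level `datahub ingest <something>`
    # slot free, which was the concern raised earlier in the thread.
    metadata_file = ingest_commands.add_parser(
        "metadata-file",
        help="Ingest from a metadata json file or directory of files.",
    )
    metadata_file.add_argument("path")
    return parser
```

Parsing `["ingest", "metadata-file", "my_file.json"]` with this parser yields a namespace whose `path` is `my_file.json`, while a bare `datahub ingest my_file.json` is rejected as an unknown subcommand.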

@hsheth2 hsheth2 changed the title from "feat(cli): support datahub ingest metadata-file" to "feat(cli): support datahub ingest mcps" May 18, 2023
Contributor

@shirshanka shirshanka left a comment

LGTM

@shirshanka shirshanka merged commit fb9a35b into datahub-project:master May 23, 2023
45 of 46 checks passed
svdimchenko pushed a commit to svdimchenko/datahub that referenced this pull request May 24, 2023
@hsheth2 hsheth2 deleted the ingest-file branch February 14, 2024 05:58

3 participants