-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: initial import of the extension sdk #1
Conversation
@aaronsteers @pnadolny13 it might be easier to look at this on the branch (https://github.com/meltano/ext-airflow/tree/feat-init-extension). Full disclosure,I haven't had a chance to retest the cookiecutter since making some changes this morning. But should be pretty close too. Curious what yall's first thoughts are. |
@pandemicsyn this is awesome! I'll review the code also and share any thoughts/feedback but I had an initial question to start. Could you re-explain the situation with invoke and why we need the second |
Yes, and we might not even have too. I think @aaronsteers had an idea for reusing the existing command/yaml structure. But we could definitely hide it on the meltano side and inject it automatically, but on the extension side it's needed because without it you get into command and option shadowing situations. IMHO generally speaking having to use the extension this way should be rare. Like, a proper well-integrated airflow extension would make user creation/management easier, and allow it to be meltano config driven. -- As for why, our extension spec has that extensions must support three commands right now:
Down the road we might have others (e.g cleanup, service, state, whatever) though. Those are commands for the extension, NOT for what it wraps behind the scenes. You need the For example, if If your extension has a global flag like There are many ways we can skin this cat, e.g. we could mitigate this some with creative flag use to indicate that we're about to ask an extension to run a meltano command. This could even be a per extension/author choice. Don't want a clean entry point where you pass on everything and want to explicit define the commands you wrap? You can! Just drop to invoke, and define some specific subcommand's instead. ninja edit to say: I'm totally down for any alternatives. Not strongly tied to this pattern. |
If the
That does men that flags like airflow_extension --meltano-log-format=true or a future --meltano-context are probably a no-go and would have to be env vars (just to make it easier to extension authors). That might actually be pretty nice ? /cc @aaronsteers @pnadolny13 |
I like this exploration of options. The "meltano" command (or similar) with subcommands is a nice approach. I just thought of another approach, which is to have 2 CLI entrypoints, one that is the top-level with full capabilities and one is the wrapper that biases toward simplicity of execution and having swap-in capability. For example:
This can be accomplished by simply defining an extra line in the What's nice about this is that it still plays nicely with invocation patterns that want to basically swap one executable with another. @pandemicsyn - What do you think about this? |
cookiecutter/wrapper-template/{{cookiecutter.wrapper_id}}/pyproject.toml
Outdated
Show resolved
Hide resolved
@aaronsteers I'm not sure I grok what the benefit is to splitting this up into two distinct commands. I think from an extension development standpoint this does make it a bit harder to reason about - even with cookiecutter doing the bulk of the heavy lifting because things aren't quite as clear-cut and click/typer usage might get a bit complicated for them (what you can use where, where you can use a command group, etc). The one big hang up for me is mixing pass through and non-pass through on the wrapper commands:
If you allow that then I'm not sure what this gains us over just automatically injecting a required |
@aaronsteers I'm kinda blocked on this since I'm at the point where I'd like to clean up the Airflow and Superset extensions, and get reviews of those while I start on dbt. So if you wanted to roll with your solution I'd propose a minor tweak of the invoker command NOT allowing custom commands or shadowing:
I like how clean this is on the extension side, but feels like it might be more messy on the meltano side (who invokes what variation and when)? |
@pandemicsyn - Yeah, I like this latest proposal iteration a lot. 👍 |
To your question here:
I can see these usage patterns emerging:
|
fyi - rebuilt a basic version of the superset extension on top of the output from the cookiecutter template: meltano/superset-ext#1 🥳 |
Not really relevant for this first PR, but in case anyone's curious about how we'll handle release of the extension initially, there's some details in the parent issue: meltano/meltano#6398 (comment) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should make meltano_extension_sdk
into a namespace package named meltano.edk
. It would still be versioned/installed separately from Meltano - making it a namespace package only affects the import name (i.e. meltano.edk
) and the distribution name meltano-edk
.
It's a nice way of organizing things imo. To make it a namespace package, you should only need to update pyproject.toml
in the EDK repository so that the name Poetry has for the package is meltano.edk
, and then update the directory structure to have be meltano/edk
instead of meltano_edk/
.
@pandemicsyn @aaronsteers Thoughts?
class ExtensionCommand(Command):
description = "The extension cli"
pass_through_cli: bool = False
commands: List[str] = [
"describe",
"invoke",
"pre_invoke",
"post_invoke",
"initialize",
] @pandemicsyn Since Also, because this list is hardcoded, the |
@pandemicsyn Using the EDK, should I have the extension I'm developing call Meltano directly with |
👍 @WillDaSilva I'm down for this. @aaronsteers any objections ?
@WillDaSilva Yea, @aaronsteers we've not really chatted about plain utility commands all that much but I know you've been thinking about this for awhile, any prefs or thoughts ?
@WillDaSilva yea have the extension call Meltano directly with subprocess.run for now. fyi - there should be a symlink back to meltano in |
1. ditch universal new lines but still default enabling text mode 2. allow passing in kwargs to run 3. set default env in process.py rather than depending on wrapper.py
Punted on this since I figure we can iterate on it when we go to break the EDK out into its own repo tomorrow. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A nitpick: some of the files lack newlines at the end of them. It would probably be worth enabling precommit.ci
for this repository, and adding a similar pre-commit config as what we've got in our other repositories.
I've got #2 logged to setup all the ci/linter things. Knowing that the cookiecutter templates and the edk would need special handling and that we'd immediately just turn around and modify the configs to remove those anyway, didn't seem worth the work for the initial PR. I'll steal the pre-commit config from the SDK repo real quick to cover the basics. |
Added a pre-commit config, and fixed up all the warnings - I might have missed back porting some to the cookiecutter templates themself's since I didn't completely regenerate the extension from scratch again. It was mostly missing doc strings - I'll keep an eye out for those when I regenerate the superset extension. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎉🚀🤘
Output and flags
Based on some early feedback from @pnadolny13 I configured the default logging profile to drop timestamps and info fields. That's mainly to keep output clean when its nested within a
meltano run
invocation.The fields can be re-enabled via cli options or env vars. In addition, there's also a
MELTANO_LOG_FORMAT=true/--meltano-log-format
option. That switches the extension to start writing output in json format. I wanted to have that present out of the gate so that down the road, meltano can start setting that flag and more intelligently parse an extension's output (and selectively present portions of it).Basic cookiecutter and ext sdk overview
main.py
andpass_through.py
are the two main things generated by the cookiecutter skaffolding. Those cover the CLI invocation, logging setup, disabling the rich & typer colorized styling, etc. The "only" thing a user will have to do is fill out wrapper.py and satisfy the interface defined by the ExtensionBase class (which only has a small handful of required methods).Expect ExtensionBase to evolve a lot when tackle Superset/dbt - especially as we need more explicit config handling and path setting's when we go and implement dbt.
Right now theres two methods attached to supplied Invoker class (a minimal invoker class users can use to interact with a subprocess they're wrapping) on the extension:
run
- Useful when you want to run a command, but don't care about its output and only care about its return code. Basically background things you want to do who's output you don't care about nor want to show users.run_and_log
- Best used when you want to run a command and stream the output to a user. This logs output using an opinionated logger that we (meltano) can control via env vars to do things like flip on json logs.Basic usage example
Given this yaml:
Calling any command would create airflow home and all the supporting files. But if we want the optional meltano dag generator, you can call the "initialize" command of the plugin.
Create a user and start the webserver:
Doesn't yet close meltano/meltano#6398 as well still need to chat about how to transition folks to the new extension and how we want to represent it on the hub.