Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Groundwork for exposing core API in CLI #6293

Merged
merged 6 commits into from
Jan 3, 2024
Merged

Conversation

sipsma
Copy link
Contributor

@sipsma sipsma commented Dec 19, 2023

This lays out of some of the low-level plumbing needed to support #6229 :

  1. Adds the ability for clients to inspect the entirety of the current schema as a set of TypeDefs.
    • Previously callers could only do this for a single module; now it can for everything loaded into the schema, including core APIs.
  2. Adds an internal flag to the CLI plumbing to optionally include the core type defs it now has access to thanks to the above
  3. Adds a flag to the existing dagger call, --experimental-include-core (-x for short), that if set enables the behavior of including core APIs
    • This is not the final UX at all, the intention is to just give us a way to more easily iterate on this overall effort across PRs (rather than one bloated "lift and shift" PR) without impacting/accidentally-breaking the current UX/DX

Implementation notes

Getting TypeDefs for core

I spent a while waffling around how best to approach exposing the core API as TypeDefs (or whether to at all), with these considerations in mind:

  1. We do eventually want core to be fully modeled as a loadable module, so any reasonable baby steps we can take towards that are nice all else being equal
  2. We don't yet have the ability to fully model core as a Module, so we don't want to necessarily go all out on pretending that it is yet
    • I.e. one option was to actually return a Module representation of core, but with almost all of the fields returning an error if called. This felt best to avoid for the time being; too much API weirdness and new sources of backwards-incompatible changes if we decide to approach this differently.

Given the above, I landed on this approach:

  1. currentTypeDefs is added under Query and returns a list of TypeDefs for everything loaded into the current schema.
    • Nice because it can include the core TypeDefs without (yet) actually modeling core as a module from clients' perspectives.
  2. An optional field sourceModuleName is added to ObjectTypeDef that is set with the name of the user module if the ObjectTypeDef is associated with one (empty otherwise).
    • This is a very easy to implement+support API that allows us to distinguish user module objects from core objects without needing to explicitly model core as full-blown Module (the field is just "" if it's not associated with a user module).

Alternative - introspection query

One alternative I tried but backed out of was to do this all entirely client-side w/ introspection queries.

So the CLI just relies purely on introspection queries and constructs it's modTypeDef/modObject/etc. representations from that.

This almost works nicely but there's a wrench when trying to determine whether an object is from core or not, which we may want for the rest of this effort.

The only non-hacky way I could think of doing that without corner cases was via directives, which while plausible starts getting much more involved and changes how we generate the actual core schemas we present to clients.

That being said, this is still a viable option that I'm open to (either now or in the future once we made more progress here), let me know if you have any thoughts.

dagger call -x

EDIT: I found a simple path to have core included by default in the existing commands, so there's no -x flag needed anymore: #6293 (comment)

Previous content

I wanted us to be able to try this all out right away without making complex changes to dagger run immediately or breaking the DX of dagger call immediately.

Adding the flag to dagger call --experimental-include-core (aka dagger call -x) was the fastest route to get this usable and testable I saw, but there's plenty of other options:

  1. A separate dagger modrun command or similar that's just like dagger call but with core types always included.
  2. A flag on dagger run that trigger this new behavior.
    • This does force us to immediately deal with the problems+questions around when dagger run should behave like before and when it should have the new "module-aware" behavior; including corner cases like when no -m flag is provided but you are in a directory containing a dagger.json

I'm not strongly opposed to either of those options if others feel strongly one way or another.

Example

This module:

package main

func New() *Test {
    return &Test{
        Ctr: dag.Container().From("alpine:3.18"),
    }
}

type Test struct {
    Ctr *Container
}

when invoked with dagger call now allows commands like this:

$ dagger call ctr file --path /etc/alpine-release contents
✔ dagger call ctr file contents [1.47s]
┃ 3.18.5                                                                                                                                                                                                                                                 
• Cloud URL: https://dagger.cloud/runs/16b53100-985b-46b4-b4a4-225cd8af1562
• Engine: 5ed77503679c (version devel ())
⧗ 4.88s ✔ 68 ∅ 6

and

$ dagger call ctr directory --path=/etc export --path ./outdir
✔ dagger call ctr directory export [1.57s]
┃ true                                                                                                                                                                                                                                                  
• Cloud URL: https://dagger.cloud/runs/cac2a4ac-d951-474b-b9d6-71d61f03159a
• Engine: 55e8d0b9b620 (version devel ())
⧗ 3.70s ✔ 104 ∅ 3
$ ls ./outdir/
alpine-release  busybox-paths.d  crontabs  group     hosts   inittab  logrotate.d  modules         motd  network        opt         passwd    profile    protocols   securetty  shadow  ssl     sysctl.conf  udhcpd.conf
apk             conf.d           fstab     hostname  init.d  issue    modprobe.d   modules-load.d  mtab  nsswitch.conf  os-release  periodic  profile.d  secfixes.d  services   shells  ssl1.1  sysctl.d

@sipsma sipsma requested review from a team as code owners December 19, 2023 03:04
@sipsma sipsma requested review from helderco and shykes and removed request for a team December 19, 2023 03:04
Copy link
Member

@jedevc jedevc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One alternative I tried but backed out of was to do this all entirely client-side w/ introspection queries.

Potentially, this would also allow us to use this new API in our module codegen instead of relying on introspection?

If so, that feels like it could clear up quite a lot of code, and since we have control over the TypeDefs, we can embed more contextual information. One use I quite like the idea of an SDK being able to put different types into sub-packages if it likes, depending on the origin module (or at least listing the source in the relevant doc-comments)

// Whether to load core API types along side the user module types.
// TODO: this should become the default behavior once the CLI simplification
// work is done, can be removed at that time.
IncludeCore bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential discussion point: would we ever want to allow including dependency types (and not just core)?

Maybe this is not as useful for end-users, but for devs, I can see this being super useful - I have all my dependencies listed in dagger.json, I want to play around with those on the CLI, as I write my code.

Or, do interfaces make this kind of irrelevant, since we won't allow accepting/returning foreign concrete types directly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could see that being potentially useful and provided it's only direct dependencies then it would be totally logically consistent since you'd essentially be operating on the same schema that the module itself has (modules have access to both their own APIs and the APIs of their direct deps).

Honestly the only real impediment is that the CLI currently only lets you load one user module at a time (#5866) and has some assumptions around that. E.g. because there's only one user module loaded you don't have to specify the "main module object" as a top-level command and can instead just start calling functions on it right away (or provide top-level flags for the constructor).

  • We can resolve that by just requiring that if multiple user modules are loaded then you have to specify a top-level object/constructor.

But that's all getting out of scope of this PR obviously 😄

core/schema/coremod.go Outdated Show resolved Hide resolved
@sipsma
Copy link
Contributor Author

sipsma commented Dec 19, 2023

Potentially, this would also allow us to use this new API in our module codegen instead of relying on introspection?

Possibly but there's some nuance to it (this was another consideration that caused me to waffle around in how to approach all this).

The potential weirdness there is that TypeDef is also part of the core API, so using it exclusively instead of graphql introspection json creates a sort of bootstrapping problem.

You could deal with this fairly easily by just requiring that SDKs predeclare the various TypeDef related types and then we just pass those as json that they can unpack into those types.

That sounds not great, but then if you consider that today the way it works is that the SDKs need predeclared types for all the graphql introspection types and unpack the introspection json into those, it's not quite as bad, feels more like a wash.

  • The main argument for using the graphql introspection types is that they are an external and stable definition, so there's probably less churn.
  • But that doesn't invalidate what you're saying around the fact that TypeDef potentially gives us more leeway in terms of easily attaching metadata to the types. With graphql introspection, we'd have to use directives for that. Those are fine, but probably more convoluted to deal with overall than something custom to what we're doing.

Overall, after wracking my brain about all that I just decided to go with what's in this PR since it's pretty straightforward, integrated w/ the existing CLI easiest and doesn't rule out our options in the future if we want to change approach (e.g. if we wanted to switch the CLI to just use graphql introspection instead, you can mostly just move the code that's currently in the server to the CLI and then just need to deal with adding directives for distinguishing the source module name)

@shykes
Copy link
Contributor

shykes commented Dec 19, 2023

I think -x might be overkill since dagger call is already hidden and documented as experimental.

@sipsma sipsma mentioned this pull request Dec 20, 2023
@sipsma sipsma force-pushed the core-typedefs branch 2 times, most recently from b937dd3 to 382fedf Compare December 20, 2023 21:40
@sipsma
Copy link
Contributor Author

sipsma commented Dec 20, 2023

I think -x might be overkill since dagger call is already hidden and documented as experimental.

The reason I went with that is the naive implementation actually breaks existing use of dagger call without any other changes. This is due to the fact that the current CLI requires your commands end in a return type with no functions, which used to include container/directory/file/etc. but doesn't once core types are actually treated the same as user mod types.

However, I took another look and realized it's actually fairly straightforward to both lift that requirement and retain the previous behavior (for the time being, pending more follow up improvements as part of this whole effort).

  • @helderco I made that change in this commit d1f5e0d, let me know if it makes sense to you too
  • I backfilled more test coverage of existing functionality in that commit also, so feel decent that nothing broke as part of it

So as a result of that, I could remove -x and now the only noticeable difference is that if you run --help on a command that results in a core type you get all the messy output of every API on the object from core. That's fine though, that's one of the things we knew we needed to clean up in follow ups anyways. And if you don't run --help then the behavior was the same as before, so no immediate breakage there.

@sipsma
Copy link
Contributor Author

sipsma commented Jan 2, 2024

ping @helderco to review

Copy link
Contributor

@helderco helderco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sync shows in --help but doesn't work:

Error: response from query: input:1: Field "sync" of type "ContainerID!" must not have a sub selection.

That's because the default stdout is being applied:

query{pythonSdkDev{test{default{sync{stdout}}}}}

@sipsma
Copy link
Contributor Author

sipsma commented Jan 2, 2024

@helderco good catch, problem was that we were converting *ID to the actual object typedef for both input args and return values, but sync is the one special case where we return an ID rather than the actual object. Updated to only do the ID->object conversion on args rather than return values: 7144cdd

Copy link
Contributor

@helderco helderco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🚀

Two things I'd like to see changed is to not output anything with sync and maybe allow id as well.

The reason for wanting no output in sync is to allow for a workflow of using with_focus for the output. Using stdout or sync is noisy but a silent sync would enable that.

As for allowing id, it could be used for composability in the CLI (e.g., Directory input accepting a DirectoryID value, from another command).

This enables callers to directly introspect a full loaded schema on the
level of TypeDefs rather than the introspection json. This includes the
core API, which is what the bulk of the new code here is supporting.

This is intended to be used by the CLI to support exposing the core API
the same way the way user module objects+functions currently are.

Signed-off-by: Erik Sipsma <erik@dagger.io>
We want this to eventually become the default behavior of a unified
dagger CLI command for invoking module functions, but in the mean time
this experimental flag allows us to iterate on the UX and features
needed without breaking or impacting existing functionality.

Signed-off-by: Erik Sipsma <erik@dagger.io>
We can actually do this in the short term without breaking the current
UX. The modification needed was to change the OnSelectObjectLeaf
callback to run once the last command is being executed (i.e. we're
about to construct and submit the actual API query).

This allows the existing behavior to be retained (for now) while also
supporting continued chaining off of core object types like
Container/Directory/File too.

As part of this I also split the existing `dagger call` integ tests to
their own file, backfilled some coverage around the current behavior
when Container/Directory/File is the final return type and also added
new coverage for the new core chaining abilities.

Signed-off-by: Erik Sipsma <erik@dagger.io>
Wanted to have this to ensure the CLI changes don't break the up
command.

The test currently only uses `--native` because I hit a pre-existing bug
with `--port` (linked to the GH issue opened for it in the test case
code).

Signed-off-by: Erik Sipsma <erik@dagger.io>
Signed-off-by: Erik Sipsma <erik@dagger.io>
@sipsma
Copy link
Contributor Author

sipsma commented Jan 3, 2024

The reason for wanting no output in sync is to allow for a workflow of using with_focus for the output. Using stdout or sync is noisy but a silent sync would enable that.

Yes agreed, at least if we end up keeping sync visible to the CLI at all. I'm going to be opening a few issues for various follow ups needed to cleanup everything unleashed here; I will include this in one of those.

As for allowing id, it could be used for composability in the CLI (e.g., Directory input accepting a DirectoryID value, from another command).

Yeah something like this is at least in the cards once #6297 lands, though there'd be stuff to figure out still (only want to enable re-use of "pure" IDs, etc.)

Signed-off-by: Erik Sipsma <erik@dagger.io>
@sipsma sipsma merged commit fdc61ca into dagger:main Jan 3, 2024
43 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants