Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Introduce kind provider #16

Closed
wants to merge 12 commits into from
Closed

WIP: Introduce kind provider #16

wants to merge 12 commits into from

Conversation

marefr
Copy link
Member

@marefr marefr commented May 16, 2023

The bare minimum are in place for LoadProvider with some basic tests (that should pass).

Opted for structs of coreKinds, composableKinds and customKinds to basically allow splitting schemas into multiple files for a provider more easily, i.e. coreKinds: Foo and coreKinds: Bar vs coreKinds: [ Foo, Bar ]. Maybe just a detail/nice to have for defining the test providers?

Notes:

  • Had to upgrade thema to get latest changes regarding lineage.#LatestVersion

@marefr marefr self-assigned this May 16, 2023
@sdboyer
Copy link
Contributor

sdboyer commented May 17, 2023

cc @AgnesToulet

"github.com/grafana/thema"
)

const providerPackageName = "provider"
Copy link
Member Author

@marefr marefr May 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Defining a provider currently requires it to be defined in a package named provider. Not sure if this is satisfactory/what we want? Basically, does a CUE package play a special role somehow for a provider/in a schema registry?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great question, I didn't give it much thought in the schema registry yet but I definitely should!

Having one generic forced package name will mean having to rename imports when importing from another provider, it seems a bit messy even if it is possible.

I don't know if it has other impact than imports, cc @sdboyer

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I did discuss this briefly with @sdboyer today and sounded like this isn't that important, but on the other hand loading CUE files might requires us to specify a specific package name which needs to be the same for all providers. This would be the working idea for now, but leave it as an open question/something we need to evaluate moving forward.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the one hand, yes, it's typically important for standardized loaders that we pick one package name and require that it be the CUE package in which folks write certain things.

On the other, if we're going to have a few different "implementations" of provider (e.g. plugins vs. apps) it seems likely those implementations may want to make different choices about what that expected package name is.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the other, if we're going to have a few different "implementations" of provider (e.g. plugins vs. apps) it seems likely those implementations may want to make different choices about what that expected package name is.

@sdboyer thanks. So this sounds like a consumer might need an option of specifying the package name, but how would it know before inspecting the CUE files, feels like a chicken/egg problem. I know we did talk briefly about loading CUE files without specifying a package name, but don't remember the details of it - do you see this as a way forward?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but how would it know before inspecting the CUE files

It can't, it has to know a priori what it's looking for. Which means it's part of the user-facing requirement: if the plugins loader decides to expect package grafanaplugin, that has to be publicly documented, enforced by tooling, etc. That's what we currently do in grafana/grafana's pfs. If app sdk is expecting something else, it has to document that where appropriate, separately.

So, this is a very low-level bit of architecture. It may be useful for us to not hardcode it at the level of kind provider, but only because we're planning on making that decision in the very next layer up. There's no end-user choice involved here - the package name to use must be firmly decided long before it makes its way to any app/plugin author.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This requires a cue.mod/module.cue though, but I guess a provider can enforce this perhaps?

Maaaaaybe. We have this helper func in pfs that dynamically injects a cue.mod dir if one doesn't exist.

That's almost certainly the approach to take for now while we're transitioning. The main nasty bit here is yet another naming thing - what name do we choose if we inject the cue.mod? The answer i fell back on in pfs - grafana.com/plugins/$PLUGIN_ID is...ok, but also implies how this name should probably be tied to package identity.

The main place users will see this name is filenames for line numbers in error messages from e.g. schema validation failures. (CUE errors produce a series of line numbers like this with all the lines that contributed to an error). So it's best to think about picking a name that an end-user can conceivably use to make sense of these errors: "Ahh, this error contains a line number from the file grafana.com/plugins/timeseries/panelcfg.cue, something violated the timeseries plugin's panel configuration schema."

(Note that i'm not saying this will be easy/all users can do it - just that it the information in these errors should be sufficiently clear that it's possible for people to learn general rules that let them quickly infer something about the source of the error.)

Once we can tell a clear story about exactly which users/use cases would be affected by requiring the cue.mod, and when they'd experience an error if they don't have one, it would make sense to think about forcing them to write one. There are probably some other options there, too, but i think that's too far out (a couple quarters at least) to bother with thinking about right now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know we did talk briefly about loading CUE files without specifying a package name, but don't remember the details of it - do you see this as a way forward?

There's three basic possibilities to consider:

  1. With a concrete, known package name, as we do with grafanaplugin today in pfs.
  2. Without giving a package name. This causes CUE to implicitly try to load the package with the same name as the parent directory.
  3. Providing _ as the package name. This results in loading the CUE files in the directory that do not have a package keyword.

Let's just eliminate item 2 as an option. If someone is really curious, i'm happy to explain my thinking, but i'll save myself some typing until that person speaks up.

Option 3 has the drawback - probably - that these files are not directly importable from other CUE files.

That led me to Option 1.

Note that all of this amounts to the string that is provided for load.Config.Package when passed to load.Instances.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can't, it has to know a priori what it's looking for. Which means it's part of the user-facing requirement: if the plugins loader decides to expect package grafanaplugin, that has to be publicly documented, enforced by tooling, etc. That's what we currently do in grafana/grafana's pfs. If app sdk is expecting something else, it has to document that where appropriate, separately.

So, this is a very low-level bit of architecture. It may be useful for us to not hardcode it at the level of kind provider, but only because we're planning on making that decision in the very next layer up. There's no end-user choice involved here - the package name to use must be firmly decided long before it makes its way to any app/plugin author.

Okay. But isn't one of the goals with the kind provider that we can write tooling (codegen) targeting kind provider rather than targeting the very next layer up for us to be able to share common tooling (codegen etc)? If not, the tooling (codegen) needs to know what package name to use based on what the provider is which seems not great from the perspective of schema registry where you can have a mix of schemas originating from different providers etc. I might have understood this completely wrong? :)

That's almost certainly the approach to take for now while we're transitioning. The main nasty bit here is yet another naming thing - what name do we choose if we inject the cue.mod? The answer i fell back on in pfs - grafana.com/plugins/$PLUGIN_ID is...ok, but also implies how this name should probably be tied to package identity.

Yes, I've noticed we do that in pfs. Sure, it could make sense. This is still dynamically happen at runtime/load time. But I guess one thing I haven't understood is whether or how this generated cue.mod file is becoming a part of a schema registry in that case?

That led me to Option 1.

Based on my above comments I wrote sounds like enforcing/hardcode the provider package name at the level of kind provider is the way forward and was what I originally did, e.g.

const providerPackageName = "provider"

This feels like the easiest way forward for now at least.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I guess one thing I haven't understood is whether or how this generated cue.mod file is becoming a part of a schema registry in that case?

This one i can answer definitively - it should not. If we reuse the same loader funcs that dynamically inject a cue.mod ~everywhere, there's no reason to ever persist the results of that dynamic injection.

Okay. But isn't one of the goals with the kind provider that we can write tooling (codegen) targeting kind provider rather than targeting the very next layer up for us to be able to share common tooling (codegen etc)? If not, the tooling (codegen) needs to know what package name to use based on what the provider is which seems not great from the perspective of schema registry where you can have a mix of schemas originating from different providers etc. I might have understood this completely wrong? :)

Mmm, interesting point...i don't think you've misunderstood, i think i just missed this :) It's certainly relevant to the LoadProvider func in this PR.

I do still think it's preferable to not enforce the package name at quite this level. Unfortunately, unpacking what "this level" means gets us into the weeds with CUE loading.

I'm not sure how easily i can suggest a path here, given the number of considerations - it may be that it's just easiest for me to hack on this a bit to arrange the pieces sanely. But i'll start at least by outlining some names and patterns, and we can see what makes sense as the next step.

CUE itself establishes that there are three representational states that the bytes representing a bunch of user-provided CUE expressions can be in. Let's refer to these as (my names):

  • Raw - the literal raw bytes from a user. Might not even be syntactically valid CUE.
  • Loaded - the bytes have been parsed by [load.Instances](https://pkg.go.dev/cuelang.org/go@v0.5.0/cue/load#Instances), and are now represented as a build.Instance.
  • Built - a build.Instance has been loaded into the CUE graph by a call like cue.Context.BuildInstance, resulting in a cue.Value. The cue.Value.Err() method on this result indicates if there was a semantic problem in the original inputs.

Or, really handwaving, we could express in a flow diagram that there are some common types, with common verbs that transition between the types:

[]byte -- "Load()" --> *build.Instance -- "Build()" --> cue.Value

Let's try to think of each of these (loosely, they are not perfect fits) in the spirit of "Parse, Don't Validate". With that in mind, let's note that even if we have a semantically correct cue.Value from CUE's perspective, it's not necessarily a valid instance of the thing we want - in this case a Provider. So we need a fourth step, for which we've generally been using the word "Bind":

[]byte -- "Load()" --> *build.Instance -- "Build()" --> cue.Value -- "Bind()" --> (our ecosystem type)

BindLineage solidly falls in this category. BindCore and its siblings sorta do (and i'd like to take them from sorta -> solid).

For now at least, let's call anything that fits into this general multi-step process, from bytes to some native Go type, a "kindsys compiler."

It's expected that we have more than one kindsys compiler. It's also expected that we should be able to make those compilers out of reusable parts, if we create those parts carefully. load.InstanceWithThema is the proof there - it wraps CUE's load.Instances, but the thema wrapper is all we ever need in kindsys, or grafana, or app sdk.

I don't have a totally fixed idea yet of what all these passes should be. But i'm decently sure that trying to follow a general rule of keeping these passes as logically separate functions has the best chance of letting us iterate towards something good.

i don't imagine this is clear enough to immediately unblock progress - apologies! - we can plan to circle back on it.

provider.go Outdated
val = val.Unify(providerVal)
p.V = val

nameVal := val.LookupPath(cue.MakePath(cue.Str("name")))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably use CUE decoding here instead of manually parsing of props. This might imply that provider properties/metadata lives on its own struct rather than root level to ease decoding of things.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this idea! ☝🏻

Copy link

@AgnesToulet AgnesToulet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good!

From there I have a few questions:

  • In which provider should live the schemas shared by the plugins defined in: https://github.com/grafana/grafana/tree/main/packages/grafana-schema/src/common? If the answer is grafana, this doesn't fit with the file structure expected in this PR.
  • What should happen if the providers have more metadata than the ones defined here? I think they should still be published in the kind/provider registry, I don't know if it would have an impact on this PR.

provider.go Outdated

coreKindsVal := val.LookupPath(cue.MakePath(cue.Str("coreKinds")))
if coreKindsVal.Exists() {
s, err := coreKindsVal.Struct()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Struct() is deprecated, you can do this directly:

		it, err := coreKindsVal.Fields()
		if err != nil {
			return nil, fmt.Errorf("coreKinds is not a struct: %w", err)
		}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, didn't now that. Feel free to make changes as you see fit. Otherwise I will eventually

provider.cue Outdated Show resolved Hide resolved
"github.com/grafana/thema"
)

const providerPackageName = "provider"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great question, I didn't give it much thought in the schema registry yet but I definitely should!

Having one generic forced package name will mean having to rename imports when importing from another provider, it seems a bit messy even if it is possible.

I don't know if it has other impact than imports, cc @sdboyer

Co-authored-by: Agnès Toulet <35176601+AgnesToulet@users.noreply.github.com>
@marefr
Copy link
Member Author

marefr commented May 24, 2023

@AgnesToulet

In which provider should live the schemas shared by the plugins defined in: https://github.com/grafana/grafana/tree/main/packages/grafana-schema/src/common? If the answer is grafana, this doesn't fit with the file structure expected in this PR.

Great question, not something I've considered or heard any discussions of. For reference, I did ask some related question to these types. I'll share the link in DM with you. Given https://github.com/grafana/grafonnet I imagine they already include generated jsonnet of these common types somehow so question is if they need to be in the registry?

What should happen if the providers have more metadata than the ones defined here? I think they should still be published in the kind/provider registry, I don't know if it would have an impact on this PR.
provider.go

Yes, that's the idea. That's why I've considered to move the provider metadata into a struct to allow some kind of "namespacing" of fields and to allow provider to have more metadata included without colliding with the existing provider fields/properties. In general this line

...
makes the Provider open for extension and allowing any other field/property (I think) 😄 Guess the provider could as well have some specific structure that allows additional metadata in a more structured manner, but open for discussion.

Copy link

@mckn mckn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Awesome work on this 🎉

provider.go Outdated
val = val.Unify(providerVal)
p.V = val

nameVal := val.LookupPath(cue.MakePath(cue.Str("name")))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this idea! ☝🏻

@marefr
Copy link
Member Author

marefr commented Jun 12, 2023

Another question regarding importing grafana-schema/common, see below, which is supported in core Grafana.

import (
	"github.com/grafana/grafana/packages/grafana-schema/src/common"
)

Don't have any idea how we're going to support this for providers. @sdboyer any existing ideas around this?

@sdboyer
Copy link
Contributor

sdboyer commented Jun 12, 2023

Don't have any idea how we're going to support this for providers.

Yeah, this is one of the nastiest bits we have to deal with. We have to cheat the otherwise normal Thema rules of not allowing references to external things.

This is particularly painful for registries and Thema itself, which are generally responsible for the compatibility guarantees of the system. But we can start the answer to it in kindsys, because it's OK for it to know about some Grafana patterns (whereas Thema does not and will not). So, kindsys can provide an allowlist on certain import paths. It's then on us to create other checks (e.g. in CI for grafana/grafana) that ensure those common schemas only change in acceptable ways. We've started outlining what those have to be here.

@marefr
Copy link
Member Author

marefr commented Jun 20, 2023

Summary of today's mob session with @AgnesToulet and @andresmgot:
Tried to pick up where we left off, i.e. copying grafanaplugin schema from grafana repo to kindsys for the sake of validating some of our ideas in a provider-test involving the grafanaplugin schema. Running a bit in circles at first, confused with certain errors etc.

Eventually @AgnesToulet explained the current state of the kind registry and that grafana kinds are moved/transferred to the kind registry in a slightly different shape (certain fields/properties are explicitly set) compared to how its defined within core Grafana. As an example, https://github.com/grafana/grafana/blob/main/public/app/plugins/panel/annolist/panelcfg.cue, are being converted to a kind registry schema using https://github.com/grafana/grafana/blob/main/.github/workflows/scripts/kinds/verify-kinds.go and the result is https://github.com/grafana/kind-registry/blob/main/grafana/next/composable/annotationslist/panelcfg.cue. To quote Agnes, the kind registry still needs to define (have) providers for each kind

Above is very much in line with Proposal 2 in the kind providers design doc where it was suggested to use the pfs package as an intermediate that take plugin schemas as input and outputs a kind provider. This seems like the way forward.

Given above and this PR there's a few identified things left to fix:

  • Cleanup/remove grafanaplugin test
  • Fix imports of common, e.g. failed to load instance: import failed: cannot find package "github.com/grafana/grafana/packages/grafana-schema/src/common"
  • Possibly, auto-generate a cue.mod/module.cue file when its missing

In regards to grafana/pfs package we want to modify that to make it able to output a kind provider schema (fs.FS filesystem) which can be either provided as input to kindsys.LoadProvider or either written to disk. This might be the next step when we've reach stability with this PR.

Co-authored-by: Andres Martinez Gotor <andres.martinez@grafana.com>
Co-authored-by: Agnès Toulet <AgnesToulet@users.noreply.github.com>
@marefr
Copy link
Member Author

marefr commented Jun 20, 2023

Don't have any idea how we're going to support this for providers.

Yeah, this is one of the nastiest bits we have to deal with. We have to cheat the otherwise normal Thema rules of not allowing references to external things.

This is particularly painful for registries and Thema itself, which are generally responsible for the compatibility guarantees of the system. But we can start the answer to it in kindsys, because it's OK for it to know about some Grafana patterns (whereas Thema does not and will not). So, kindsys can provide an allowlist on certain import paths. It's then on us to create other checks (e.g. in CI for grafana/grafana) that ensure those common schemas only change in acceptable ways. We've started outlining what those have to be here.

Fix imports of common, e.g. failed to load instance: import failed: cannot find package "github.com/grafana/grafana/packages/grafana-schema/src/common"

I guess the main question for me is how making github.com/grafana/grafana/packages/grafana-schema/src/common cue files available within kindsys since it seems like the actual files must be available for imports to succeed or can we just "allow" the package for now as Sam suggests and that will "just work"? The only possible way I see right now is to reference https://github.com/grafana/grafana/blob/39a18ca6ba8d46c5e10572d8ab44affb0785f516/embed.go#L10 to get a hold of the common schemas.

@AgnesToulet do you have any other idea?

@marefr marefr closed this Mar 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants