Introducing the declare functionality of the new modules #6253

wildum · 2024-01-26T14:07:40Z

PR Description

This PR introduces the declare functionality defined by the new modules design.

Commits:

(declare) introduces the new declare node and modifies the parser to allow users to create it
(moduleComponent): a new constructor was needed because the new modules will not use the Exports type. It uses ModuleComponentOptions as an argument instead of component.Options to remove the need to pass unnecessary information. The old constructor is marked as deprecated.
(loaderConfigOptions): create a new type to pass config options when loading a new config with the loader. This is used to pass dependencies to modules that are defined outside of their context.
(nodeCustomComponent): introduces the new custom component node. It implements the componentNode interface. ModuleIds were added to the component node interface so that information about components running inside of the custom component could be retrieved.
(componentNodeManager): a new struct used to create component nodes (built-in and custom). It manages the declare dependencies of the custom nodes (from declare/parent/import). A new import config node was also added as an empty shell for convenience.
(test): a new test file with a few tests to ensure that the declare functionality behaves as designed.

Note to the reviewer

The import functionality will come via another PR. To understand the design choices some parts already have some logic related to the import nodes (import config node skeleton and component node manager).

PR Checklist

Tests updated

mattdurham · 2024-01-26T14:40:11Z

pkg/flow/internal/controller/component_node_manager.go

+
+// getCustomComponentConfigFromParent retrieves the config of a custom component from the parent controller.
+func (m *ComponentNodeManager) getCustomComponentConfigFromParent(cc *CustomComponentNode) *CustomComponentConfig {
+	declareContent, exists := m.additionalDeclareContents[cc.componentName]


Can we elaborate more on this? I am not sure if I follow the use case here. Do children inherit the imports of their parents?

That's correct. For example:

declare "first" {} declare "second" { first "nested" {} } second "root" {}

In this code, the declare "first" is declared at the root but used inside of the declare "second".

When you evaluate "root", you will create a new custom component and load a new config with a new controller. But this controller only has whatever is in "declare "second"" as context. It cannot see what's declared in the parent.

That's why the parent needs to pass the definition of "first" to its child. That would be the same with import.

I will consider adding more comments to clarify these points but first I will pair up with Robert for a review

OK, that makes sense, for some reason I was thinking child and parent module relationships and not child and parent node relationships.

mattdurham · 2024-01-26T15:08:12Z

component/module/module.go

@@ -40,15 +65,29 @@ func NewModuleComponent(o component.Options) (*ModuleComponent, error) {
 	return c, err
 }

+// NewModuleComponentV2 initializes a new ModuleComponent.
+func NewModuleComponentV2(o ModuleComponentOptions) (*ModuleComponent, error) {


nit: Instead of V2 wonder if we can go with something more like NewDeclareModuleComponent

…loader when loading a new config

…(custom/builtin) and to handle dependencies for the custom nodes

rfratto

Thanks, I really did find this helpful for reviewing the changes in a more constrained scope.

My main concern right now is that I'm struggling to keep a mental model for how declare definitions are being passed around. I've left a comment below suggesting one way that we can improve it, but in general I'm looking for anything of the form "reduce the number of things required to reason about the code."

I stopped reviewing after getting to the ComponentNodeManager (since that's where I'm the most confused); once that's cleared up I'll do a second round of reviews with more detail.

rfratto · 2024-01-26T16:07:26Z

pkg/flow/internal/controller/loader.go

+	// Fill our graph with declare nodes
+	declareDiags := l.populateDeclareNodes(&g, declares)
+	diags = append(diags, declareDiags...)
+


The order here doesn't seem significant as of this commit, but my gut tells me that we do want to populate declares before components. Can we make sure we add that detail to a comment so we don't forget and move things around?

rfratto · 2024-01-26T16:09:20Z

pkg/flow/internal/controller/loader.go

+		node := NewDeclareNode(declare)
+		if g.GetByID(node.NodeID()) != nil {


In populateComponentNodes, we only create the node if it doesn't already exist in the graph; can we do the same thing here?

rfratto · 2024-01-26T16:15:50Z

component/module/file/file.go

+		//nolint:staticcheck
 		Exports: module.Exports{},


nit: I find the nolint comments visually distracting from the reading flow. Does it still work if we have the comment inline like this?

Exports: module.Exports{}, //nolint:staticcheck

ditto for anywhere else we added that comment.

rfratto · 2024-01-26T16:18:23Z

component/module/module.go

+	// ID of the custom component. Guaranteed to be globally unique across all running components.
+	ID string


Given that the options is passed by a user, I don't think the comment here can say that the ID is "guaranteed to be globally unique;" rather I think it should say that the caller is responsible for making sure that the ID is globally unique instead.

rfratto · 2024-01-26T16:20:09Z

pkg/flow/config/loader.go

nit: I would generally recommend that we not create too many files; new types and functions should be as organized as closely as possible to where they are first used.

The main problem here is that I had to search for this, breaking my reading flow.

rfratto · 2024-01-26T16:25:52Z

pkg/flow/internal/controller/node_custom_component.go

+
+// NewCustomComponentNode creates a new CustomComponentNode from an initial ast.BlockStmt.
+// The underlying managed custom component isn't created until Evaluate is called.
+func NewCustomComponentNode(globals ComponentGlobals, b *ast.BlockStmt, GetCustomComponentConfig func(*CustomComponentNode) (*CustomComponentConfig, error)) *CustomComponentNode {


nit: you might want to make a type for that function outside of this:

type customComponentConfigFunction func(*CustomComponentNode) (*CustomComponentConfig, error)

Also, I'm not sure "config" is the right terminology to use here; this is more looking up the template/definition for the declare component. Can we make the name more descriptive?

rfratto · 2024-01-26T16:36:56Z

pkg/flow/internal/controller/component_node_manager.go

+}
+
+// GetCustomComponentConfig returns the custom component config for a given custom component or an error if not found.
+func (m *ComponentNodeManager) GetCustomComponentConfig(cc *CustomComponentNode) (*CustomComponentConfig, error) {


It feels like we're introducing two different ways within the code to look up components now:

ComponentRegistry interface

The GetCustomComponentConfig function passed around to nodes.

Is there a way for us to roll up the functionality for looking up all types of components into ComponentRegistry?

I'm not very convinced about this idea. At this level Built-in components and Custom components are very different:

For built-in components the interface will give you a registration that describes the component (name/args/exports) and a Build function to create it. This is used when building the graph.

For custom components you want to retrieve a piece of river config that you will run in a nested controller and a way for your controller to access additional pieces of config. This is used at eval level.

You might find these two thought experiments useful:

What might it look like if builtin components and custom components were more aligned in implementation?

What might an API look which has a lookup method that might return different component types?

These are two distinct thought experiments that would lead to two different solutions for solving the problem I'd like to be investigated.

Sorry I'm not sure I'm following. Even if they would be more aligned in implementation, they are by nature still very different. One works as a container for the other one

To rephrase the first thought experiment: what might it look like for custom components to be represented by a component.Registration, allowing the caller to use them without knowledge of whether they're backed by a builtin or custom component?

That sounds like the old modules implementation except that instead of the content of the module being loaded by the module it would be available inside of the component registry. The Build function would look the content up the scope and return the custom component ready to run?

rfratto · 2024-01-26T16:40:41Z

pkg/flow/internal/controller/node_custom_component.go

+	importLabel       string
+	declareLabel      string


I find importLabel and declareLabel to be a little confusing; I understand that it's literally the label of the import block and the label of the declare block, but that's also one layer removed from the construct we're talking about.

I would rather this be named importNamespace and customComponentName which more accurately describes what they are doing from the perspective of CustomComponentNode.

rfratto · 2024-01-26T16:44:57Z

pkg/flow/internal/controller/component_node_manager.go

+		customComponentConfig, err = m.getCustomComponentConfigFromImportedDeclares(cc)
+		if err != nil {
+			return customComponentConfig, err
+		}
+		if customComponentConfig == nil {
+			customComponentConfig = m.getCustomComponentConfigFromParent(cc)
+			// Custom components that receive their config from imported declares in a parent controller can only access the imported declares coming from the same import.
+			customComponentConfig.additionalDeclareContents = filterAdditionalDeclareContents(cc.importLabel, customComponentConfig.additionalDeclareContents)
+		}


I'm finding this part of the code hard to keep all of the moving bits in my head, mainly around the three different ways you can get the definition for a custom component and how the "declare contents" is being passed around.

While reading this, it feels like this could use a tree structure to represent the scope:

A scope for components declared in the local module (i.e., not imported).

This scope will have parents if you are inside a deeply nested declare component.

A scope for components per import statement.

If we found a way to represent the declares as a tree, it might simplify the logic here into becoming:

Find out what scope the component is in; local scope or imported scope

Search that scope for the component.

I don't have the full answer here for how to design this, mainly because I'm still having a hard time following how the declare contents get passed around, but I think a tree structure would eliminate that need.

This also feeds into my comment about trying to reuse the ComponentRegistry interface; there might be a way to replace that with the tree scope idea and to have one general way of getting a component from a name given a current scope.

I understand that this part of the code can be a bit daunting. I tried to have a descriptive function name but the verbosity might bring more confusion.

The logic is basically: A custom component needs its definition and additional definitions for nested custom components.

If it's from a local declare, it checks the declare nodes, if not found then it checks the extra content passed by the parent.

If it's from an imported declare, it checks the import nodes, if not found then it checks the extra content passed by the parent.

Once it has its template, it must retrieve the additional declares needed for its nested custom components:

If it's from a local declare, it can get the additional content from other local declares / imports / parent

If it's from an imported declare, it can only get additional content from the corresponding import

The only trick is the prefix part for the import node to restrict the scope.

From the past discussions that we had on the topic I believe that we wanted to keep modules isolated (not leaking through parents or siblings). This approach follows this principle: you pass the config + any additional config needed and the module has everything it needs to run. I tried to follow what we wrote down during the last brainstorming session:

Loader changes Scope lookup table for what namespaces are imported - Constructed on ApplyConfig - Has a hierarchy; nested loaders may have parent lookup table.

That being said, I'm not against trying a new approach, especially if this one is not convincing enough or if it feels that there should be something better. The tree structure sounds like a good idea. Here is how I would see it:

type Scope struct { parent Scope declares map[string]string // customComponentName: template imports map[string]Scope // importNamespace: importScope }

Instead of passing the additionalDeclaresContent down we can pass this scope object.

We don't really need Declare Nodes, the content of the declare block can be directly set to the Scope

onContentChange, ImportConfigNodes will need to update the Scope and the import hierarchy would need to be correctly mapped (if import A has an import B, then root Scope is Scope{imports["A"] = Scope{imports["B"]}}

For every custom component:
-if it's an instance of a local declare, you search the declares from the scope and continue up the parents till you find it. Once you have it, you can pass the current scope to the custom component so that it creates a new scope and reference the passed scope as a parent

if it's an instance of an imported declare, you search the imports from the scope and continue up the parents till you find it. Once you have it, you can pass the scope corresponding to the import to the custom component so that it creates a new scope and reference the passed scope as a parent.

With this approach, the custom components are not isolated anymore. We may have situations where a child tries to access content from the parent scope while the parent is updating its scope. That will require extra care.

What do you think about this second approach?
Unless I'm missing something I should be able to implement it fairly quickly. I think that we should go for it

That tree structure you showed is what I was thinking of, even back in our old brainstorm (it is a set of lookup tables with a "hierarchy"). I'd like to see this approach be implemented.

With this approach, the custom components are not isolated anymore. We may have situations where a child tries to access content from the parent scope while the parent is updating its scope. That will require extra care.

I'm not convinced this is true; the scope can be passed down just as the map[string]string is being passed around now without the parent needing to mutate the scope; the parent can build a new scope and pass it down if modifications to the scope is required. Unless I'm missing something?

I see, my bad I misunderstood that you meant a tree in this case, I thought the hierarchy was that just parents passing down the map to children, not passing a tree of lookup tables.
You're right, I was thinking about using pointers but we can copy the tree every time and pass it entirely, that should avoid this problem

I see, my bad I misunderstood that you meant a tree in this case, I thought the hierarchy was that just parents passing down the map to children, not passing a tree of lookup tables.

No worries, I didn't mean anything bad by it, just pointing out that we're aligned with what you proposed :)

wildum added 2 commits January 26, 2024 13:54

(declare): new declare node

99dbe64

(moduleComponent): create new constructor and deprecate old one

7bb43ef

wildum requested a review from rfratto January 26, 2024 14:10

wildum force-pushed the declare-new-modules branch from 4b1fbb6 to ce58010 Compare January 26, 2024 14:28

mattdurham reviewed Jan 26, 2024

View reviewed changes

wildum added 4 commits January 26, 2024 16:13

(loaderConfigOptions): optional parameters that can be passed to the …

be24000

…loader when loading a new config

(nodeCustomComponent): introduces the custom component node

72c7acb

(componentNodeManager): a new manager used to create component nodes …

327a8ef

…(custom/builtin) and to handle dependencies for the custom nodes

(test): test the declare functionality

d1ece2b

wildum force-pushed the declare-new-modules branch from ce58010 to d1ece2b Compare January 26, 2024 15:14

rfratto requested changes Jan 26, 2024

View reviewed changes

rfratto self-assigned this Jan 26, 2024

wildum closed this Jan 30, 2024

rfratto mentioned this pull request Feb 1, 2024

Demonstrate one potential implementation for declare #6289

Closed

github-actions bot added the frozen-due-to-age Locked due to a period of inactivity. Please open new issues or PRs if more discussion is needed. label Mar 1, 2024

github-actions bot locked as resolved and limited conversation to collaborators Mar 1, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Introducing the declare functionality of the new modules #6253

Introducing the declare functionality of the new modules #6253

wildum commented Jan 26, 2024

mattdurham Jan 26, 2024

wildum Jan 26, 2024

mattdurham Jan 26, 2024

mattdurham Jan 26, 2024

rfratto left a comment

rfratto Jan 26, 2024

rfratto Jan 26, 2024

rfratto Jan 26, 2024

rfratto Jan 26, 2024

rfratto Jan 26, 2024

rfratto Jan 26, 2024

rfratto Jan 26, 2024

rfratto Jan 26, 2024

wildum Jan 29, 2024

rfratto Jan 29, 2024

wildum Jan 29, 2024

rfratto Jan 29, 2024

wildum Jan 29, 2024

rfratto Jan 26, 2024

rfratto Jan 26, 2024

wildum Jan 29, 2024

rfratto Jan 29, 2024

wildum Jan 29, 2024

rfratto Jan 29, 2024

		node := NewDeclareNode(declare)
		if g.GetByID(node.NodeID()) != nil {

		// ID of the custom component. Guaranteed to be globally unique across all running components.
		ID string

Introducing the declare functionality of the new modules #6253

Introducing the declare functionality of the new modules #6253

Conversation

wildum commented Jan 26, 2024

PR Description

Note to the reviewer

PR Checklist

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

rfratto left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment