Skip to content
This repository has been archived by the owner on Oct 9, 2023. It is now read-only.

Admin changes for Parent child node execution relationship #111

Merged
merged 4 commits into from
Aug 3, 2020

Conversation

anandswaminathan
Copy link
Contributor

This PR has the following

  • Ability to accept metadata in NodeExecution event - spec_node_id, parent_id, and retry group.
  • DB changes to store the extra information.
  • DB changes to support parent child node execution relationship
  • Changes to expose and query node executions at root or for a parent.
  • Unit and end2end integration tests.

Type

  • Bug Fix
  • Feature
  • Plugin

Are all requirements met?

  • Code completed
  • Smoke tested
  • Unit tests added
  • Code documentation added
  • Any pending items have an associated Issue

Tracking Issue

flyteorg/flyte#218

@anandswaminathan
Copy link
Contributor Author

PTAL @kumare3 @katrogan @wild-endeavor
Ketan merged with master - has your cache status change as well.

var nodeExecutionMetadata admin.NodeExecutionMetaData
err = proto.Unmarshal(nodeExecutionModel.NodeExecutionMetadata, &nodeExecutionMetadata)
if err != nil {
return nil, errors.NewFlyteAdminErrorf(codes.Internal, "failed to unmarshal nodeExecutionMetadata")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

log the error too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wild-endeavor Hey we do not log on the transformer layer. This error is logged in the manager layer already (caller of this method)

pkg/repositories/models/node_execution.go Show resolved Hide resolved
pkg/manager/impl/node_execution_manager.go Show resolved Hide resolved
// Parent that spawned this node execution - value is empty for executions at level 0
ParentID *uint `sql:"default:null"`
// List of child node executions - for cases like Dynamic task, sub workflow, etc
ChildNodeExecutions []NodeExecution `gorm:"foreignkey:ParentID"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because i'm dense, can you maybe explain to me what this means? usually in a has-many relationship, it means there are multiple rows of the thing that you have many of. in this case however, we're storing things as a list? or does it just mean that multiple children rows can have the same ParentID?

I think it'd be super helpful to see the db table before and after this change. Maybe I'll play around when I have some time.

Copy link
Contributor Author

@anandswaminathan anandswaminathan Jul 31, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Multiple children rows will have the same ParentID

---------------------------+--------------------------+-----------+----------+---------------------------------------------
 id                        | integer                  |           | not null | nextval('node_executions_id_seq'::regclass)
 created_at                | timestamp with time zone |           |          | 
 updated_at                | timestamp with time zone |           |          | 
 deleted_at                | timestamp with time zone |           |          | 
 execution_project         | text                     |           | not null | 
 execution_domain          | text                     |           | not null | 
 execution_name            | text                     |           | not null | 
 node_id                   | text                     |           | not null | 
 phase                     | text                     |           |          | 
 input_uri                 | text                     |           |          | 
 closure                   | bytea                    |           |          | 
 started_at                | timestamp with time zone |           |          | 
 node_execution_created_at | timestamp with time zone |           |          | 
 node_execution_updated_at | timestamp with time zone |           |          | 
 duration                  | bigint                   |           |          | 
 parent_task_execution_id  | integer                  |           |          | 
 node_execution_metadata   | bytea                    |           |          | 
 parent_id                 | integer                  |           |          | 
 error_kind                | text                     |           |          | 
 error_code                | text                     |           |          | 
 cache_status              | text                     |           |          | 
Indexes:
    "node_executions_pkey" PRIMARY KEY, btree (execution_project, execution_domain, execution_name, node_id)
    "idx_node_executions_deleted_at" btree (deleted_at)
    "idx_node_executions_error_kind" btree (error_kind)
    "idx_node_executions_id" btree (id)
    "idx_node_executions_node_id" btree (node_id)
    "idx_node_executions_parent_task_execution_id" btree (parent_task_execution_id)

@@ -77,9 +76,21 @@ func (m *NodeExecutionManager) createNodeExecutionWithEvent(
}
parentTaskExecutionID = taskExecutionModel.ID
}
var parentID *uint
if request.Event.ParentNodeMetadata != nil {
parentNodeExecutionModel, err := util.GetNodeExecutionModel(ctx, m.db, &core.NodeExecutionIdentifier{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we retrieve the entire model, just to link the ID? Can we not get just the "id"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can. The performance improvement is tiny, as node execution record size are small. In fact you would not see any difference.

@codecov-commenter
Copy link

Codecov Report

Merging #111 into master will decrease coverage by 0.09%.
The diff coverage is 54.54%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #111      +/-   ##
==========================================
- Coverage   62.51%   62.42%   -0.10%     
==========================================
  Files         104      104              
  Lines        7601     7649      +48     
==========================================
+ Hits         4752     4775      +23     
- Misses       2292     2312      +20     
- Partials      557      562       +5     
Flag Coverage Δ
#unittests 62.42% <54.54%> (-0.10%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pkg/repositories/config/migrations.go 0.00% <0.00%> (ø)
pkg/manager/impl/node_execution_manager.go 66.53% <54.83%> (-2.30%) ⬇️
pkg/repositories/transformers/node_execution.go 69.79% <68.75%> (-0.13%) ⬇️
pkg/repositories/gormimpl/node_execution_repo.go 62.37% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 464028f...6739c8c. Read the comment docs.

@anandswaminathan anandswaminathan merged commit efd98d4 into master Aug 3, 2020
eapolinario pushed a commit that referenced this pull request Sep 6, 2023
* Admin changes for Parent child node execution relationship
* Add migrations
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants