Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Add register_run_asset event #7098

Merged
merged 21 commits into from Apr 1, 2022

Conversation

clairelin135
Copy link
Contributor

@clairelin135 clairelin135 commented Mar 16, 2022

Adds a new Dagster event type REGISTER_RUN_ASSET.

This event is yielded after an asset job run begins for every single asset that is selected in the job. The goal of this feature is to enable querying for runs that intended to generate certain assets (e.g. job that intends to materialize 3 assets fails during materialization of second asset).

This will enable warnings in Dagit on the asset details page and the asset graph nodes. In the future, we can enable this for partitioned assets to generate views of all runs across a certain asset's partition. If this event becomes too noisy in run logs, we can consider ways to condense the logs.

@vercel
Copy link

vercel bot commented Mar 16, 2022

This pull request is being automatically deployed with Vercel (learn more).
To see the status of your deployments, click below or on the icon next to each commit.

dagster – ./docs/next

🔍 Inspect: https://vercel.com/elementl/dagster/2wk6U5ghVUNcnJ3pFNAPaVicBnNs
✅ Preview: https://dagster-git-claire-intent-to-materialize-event-elementl.vercel.app

[Deployment for 394c3c1 canceled]

dagit-storybook – ./js_modules/dagit/packages/ui

🔍 Inspect: https://vercel.com/elementl/dagit-storybook/6L7guFy7j9Uxq4qQZpaKR8mFb3aj
✅ Preview: Canceled

[Deployment for 394c3c1 canceled]

@vercel vercel bot temporarily deployed to Preview – dagit-storybook March 17, 2022 00:02 Inactive
@vercel vercel bot temporarily deployed to Preview – dagster March 17, 2022 00:02 Inactive
@clairelin135 clairelin135 marked this pull request as ready for review March 17, 2022 00:14
@vercel vercel bot temporarily deployed to Preview – dagster March 17, 2022 17:32 Inactive
@vercel vercel bot temporarily deployed to Preview – dagit-storybook March 17, 2022 17:32 Inactive
@prha
Copy link
Member

prha commented Mar 17, 2022

I think my preference is to not show these events at all in the event log (run view).

I also think we should reconsider putting this code in the executor. One advantage of inserting events upon run creation as opposed to run execution is that you have insight into pre-start runs (e.g. queued runs, run launcher failures, etc).

@sryza
Copy link
Contributor

sryza commented Mar 17, 2022

+1 to both things @prha said

@vercel vercel bot temporarily deployed to Preview – dagster March 17, 2022 22:49 Inactive
@vercel vercel bot temporarily deployed to Preview – dagit-storybook March 17, 2022 22:49 Inactive
@vercel vercel bot temporarily deployed to Preview – dagit-storybook March 17, 2022 22:53 Inactive
@vercel vercel bot temporarily deployed to Preview – dagster March 17, 2022 22:53 Inactive
@vercel vercel bot temporarily deployed to Preview – dagster March 24, 2022 18:51 Inactive
@vercel vercel bot temporarily deployed to Preview – dagit-storybook March 24, 2022 18:51 Inactive
@clairelin135
Copy link
Contributor Author

@alangenfeld I've changed the event name to asset_materialization_planned and updated the logs to display in Dagit when debug is selected. I think this is ready for you to take another look!

Comment on lines 192 to 193
ASSET_EVENTS = {
DagsterEventType.ASSET_MATERIALIZATION,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stale?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this is responsible for populating the assets tag on the job page. The list of assets is generated from the asset keys attached to these event types, and adding this event type enables fetching assets that were not successfully materialized.
image

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah I was referring to a code comment, i think i was commenting on a stale version of the PR

Comment on lines 946 to 956
event = DagsterEvent.asset_materialization_planned(pipeline_name, asset_key)
event_record = EventLogEntry(
user_message="",
level=logging.DEBUG,
pipeline_name=pipeline_name,
run_id=pipeline_run.run_id,
error_info=None,
timestamp=time.time(),
dagster_event=event,
)
self.handle_new_event(event_record)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thoughts on putting this in line with the other DagsterEvent static constructors and logging there? Its a weird pattern but I think there is value in consistency.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure--I've updated the code now to log the dagster event in a static constructor.

Comment on lines 236 to 238
)

# mirror the event in the cross-run index database
with self.index_connection() as conn:
conn.execute(insert_event_statement)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Re: lines 235 to 239]

why remove instead of update?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No strong reason here--I removed it earlier because I didn't feel that the check was necessary. I've updated it now though to contain the new event type

event = DagsterEvent.asset_materialization_planned(pipeline_name, asset_key)
event_record = EventLogEntry(
user_message="",
level=logging.DEBUG,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be good to leave reasoning next to the debug log level setting

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, added a comment next to the setting

@vercel vercel bot temporarily deployed to Preview – dagster March 24, 2022 21:39 Inactive
@vercel vercel bot temporarily deployed to Preview – dagit-storybook March 24, 2022 21:39 Inactive
@vercel vercel bot temporarily deployed to Preview – dagster March 24, 2022 21:40 Inactive
@vercel vercel bot temporarily deployed to Preview – dagit-storybook March 24, 2022 21:41 Inactive
Copy link
Member

@alangenfeld alangenfeld left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@prha what are your thoughts on this debug log level hide the event thing

Comment on lines 192 to 193
ASSET_EVENTS = {
DagsterEventType.ASSET_MATERIALIZATION,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah I was referring to a code comment, i think i was commenting on a stale version of the PR

Comment on lines 50 to 54
const l =
node.__typename === 'LogMessageEvent' ||
node.__typename === 'AssetMaterializationPlannedEvent'
? node.level
: 'EVENT';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eesh - lets definitely leave a comment here explaining whats going on. I forgot that we classified events differently.

Comment on lines 269 to 276
def log_asset_materialization_planned_event(
log_manager: DagsterLogManager, event: "DagsterEvent"
) -> None:
# asset_materialization_planned events have a log level "DEBUG" in order to hide these
# events by default in Dagit. Modifying filtering to select DEBUG events will show these events
# in Dagit run logs.
log_level = logging.DEBUG
log_manager.log_dagster_event(level=log_level, msg=event.message or "", dagster_event=event)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need to cargo-cult the pattern here and refactor when single use - this can just be inlined

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 991 to 994
if execution_plan_snapshot:
self._log_asset_materialization_planned_events(pipeline_run, execution_plan_snapshot)

return self._run_storage.add_run(pipeline_run)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be good to sequence the event writes after the run gets added to the db? incase the run add fails

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@prha
Copy link
Member

prha commented Mar 25, 2022

I'm more inclined to just not show the events, regardless of level. We're only putting this in the event log so that we can query runs more efficiently by asset key. If there were a better implementation option that didn't result in visible events, we would pick that instead.

I think we should log them in debug mode but just not display them at all in the frontend. And we could change this later without major repercussions.

@vercel vercel bot temporarily deployed to Preview – dagster March 25, 2022 17:47 Inactive
@vercel vercel bot temporarily deployed to Preview – dagit-storybook March 25, 2022 17:47 Inactive
@clairelin135
Copy link
Contributor Author

@alangenfeld I don't feel strongly either way about whether to make these events viewable in Dagit or not, though I don't think users will find them very useful (the run will already show which assets intend to be materialized, and these events are only responsible for populating asset run information in Dagit). Defer to you to make the final call though

Copy link
Member

@alangenfeld alangenfeld left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Defer to you to make the final call though

I don't think I'm the best person to make final call on asset product experience questions. I don't have broader context, so things like

the run will already show which assets intend to be materialized

I don't know how that currently is presented.

As prha pointed out, this last bit is very easy to change so I will accept the PR and you can land it with the behavior you feel best about.

@vercel vercel bot temporarily deployed to Preview – dagster March 31, 2022 22:40 Inactive
@vercel vercel bot temporarily deployed to Preview – dagit-storybook March 31, 2022 22:40 Inactive
@clairelin135 clairelin135 merged commit 60059f9 into master Apr 1, 2022
@clairelin135 clairelin135 deleted the claire/intent-to-materialize-event branch April 1, 2022 16:54
Copy link
Member

@alangenfeld alangenfeld left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[1] I believe filtering here is going to cause problems with the offset based pagination we currently have

events = [
event
for event in events
if event.dagster_event_type != DagsterEventType.ASSET_MATERIALIZATION_PLANNED
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[1]

events = [
event
for event in events
if event.dagster_event_type != DagsterEventType.ASSET_MATERIALIZATION_PLANNED
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[1]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot about the offset cursors... I guess easiest thing to do is to just filter these events client-side?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ya, that or fix pagination to not be offset based 🙂

@clairelin135
Copy link
Contributor Author

@alangenfeld got it. I can adjust the filtering to filter client-side instead of here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants