docs for graphs that depend on assets #12597

Merged
merged 1 commit into from
Mar 1, 2023

31 changes: 31 additions & 0 deletions docs/content/concepts/ops-jobs-graphs/graphs.mdx
@@ -223,6 +223,7 @@ Using dynamic outputs, you can duplicate portions of a graph at runtime. Refer t
## Defining and constructing dependencies

- [Defining nothing dependencies](#defining-nothing-dependencies)
- [Loading an asset as an input](#loading-an-asset-as-an-input)
- [Constructing dependencies](#constructing-dependencies)

### Defining nothing dependencies
@@ -262,6 +263,36 @@ Note that in most cases, it is usually possible to pass some data dependency. In

Dagster also provides more advanced abstractions to handle dependencies and IO. If you find it difficult to model data dependencies when using external storage, check out [IO managers](/concepts/io-management/io-managers).

### Loading an asset as an input

You can supply an asset as an input to one of the ops in a graph. Dagster can then use the [I/O manager](/concepts/io-management/io-managers) on the asset to load the input value for the op.

```python file=/guides/dagster/assets_ops_graphs/op_graph_asset_input.py
from dagster import asset, job, op


@asset
def emails_to_send():
    ...


@op
def send_emails(emails) -> None:
    ...


@job
def send_emails_job():
    send_emails(emails_to_send.to_source_asset())
```

We must call <PyObject object="AssetsDefinition" method="to_source_asset" /> because <PyObject object="SourceAsset" pluralize /> represent assets that other assets or jobs depend on in settings where they won't be materialized themselves.

If the asset is partitioned, then:

- If the job is partitioned, the corresponding partition of the asset will be loaded.
- If the job is not partitioned, then all partitions of the asset will be loaded. The type that they will be loaded into depends on the I/O manager implementation.

### Constructing dependencies

- [Using GraphDefinitions](#using-graphdefinitions)
@@ -153,3 +153,5 @@ def send_emails_job():
```

In this case, the asset - specifically, the table the job reads from - is only used as a data source for the job. It’s not materialized when the graph is run.

The [Graph documentation](/concepts/ops-jobs-graphs/graphs#loading-an-asset-as-an-input) contains more details on how this works.