Commit

[docs] - Getting Started/quick start updates [CON-39] (#8187)
* Formatting running job section

* Updating writing job section, mention assets

* Flesh out conclusion section

* Fix page title

* Add prereq for install

* More clean up

* GS experiment

* Revert "GS experiment"

This reverts commit aeb10d7.

* Make GS asset-based

* Update getting-started.mdx

* Review changes

* Update Python example

* Typos and clarity

* Fix build tests

* Remove unused `hello_world` examples

* Fix python example

* Move python example to its own file

* Remove screenshots from spec

* Fixing tests

* Consolidate examples

* Use multiple marker instead
erinkcochran87 committed Jun 11, 2022
1 parent d46e5d8 commit ffb3ddc
Showing 6 changed files with 119 additions and 99 deletions.
133 changes: 84 additions & 49 deletions docs/content/getting-started.mdx
@@ -1,98 +1,133 @@
# Getting Started with Dagster
# Welcome to Dagster!

<p className="text-2xl mt-0 text-gray-500 tracking-tight font-light">
Dagster is the data orchestration platform built for productivity.
</p>

## Installing Dagster
Get started with Dagster in just three quick steps:

1. [Install Dagster](#step-1-install-dagster)
2. [Define assets](#step-2-define-assets)
3. [Materialize the assets](#step-3-materialize-the-assets)

---

## Step 1: Install Dagster

<Note>
Dagster requires Python 3.6+. Refer to the{" "}
<a href="/getting-started/install">Installation documentation</a> for more
info.
</Note>

To install Dagster into an existing Python environment, run:

```bash
pip install dagster
```

This will install the latest stable version of the core Dagster packages in your current Python environment.
This installs the latest stable version of the core Dagster packages in your current Python environment.

## Writing a Job
---

Let's get your first job up and running.
## Step 2: Define assets

```python file=/getting_started/hello_world.py startafter=start_pipeline_marker endbefore=end_pipeline_marker
from dagster import job, op
To get started, we'll define two simple data [assets](/concepts/assets/software-defined-assets):

- A `cereals` asset that represents a CSV dataset about breakfast cereals, and
- A `nabisco_cereals` asset, which is [a downstream dependency](/concepts/assets/software-defined-assets#assets-with-dependencies) of `cereals` and only contains cereals manufactured by Nabisco

@op
def get_name():
    return "dagster"
In the directory where you installed Dagster, copy this code and save it in a file named `cereal.py`:

```python file=/guides/dagster/asset_tutorial/cereal.py startafter=start_multiple_assets endbefore=end_multiple_assets
import csv
import requests
from dagster import asset

@op
def hello(name: str):
    print(f"Hello, {name}!")

@asset
def cereals():
    response = requests.get("https://docs.dagster.io/assets/cereal.csv")
    lines = response.text.split("\n")
    return [row for row in csv.DictReader(lines)]

@job
def hello_dagster():
    hello(get_name())

@asset
def nabisco_cereals(cereals):
    """Cereals manufactured by Nabisco"""
    return [row for row in cereals if row["mfr"] == "N"]
```

Save the code above in a file named `hello_world.py`.
---

You can execute the job in three different ways: [Dagit](/concepts/dagit/dagit), [Dagster Python API](/concepts/ops-jobs-graphs/job-execution#python-apis), or [Dagster CLI](/\_apidocs/cli#dagster-pipeline-execute).
## Step 3: Materialize the assets

## Running the Job in Dagit
Next, you'll materialize the assets. Materialization computes an asset's contents and writes them to persistent storage. By default, this is a pickle file on the local system.

Dagit is a web-based interface for viewing and interacting with Dagster objects.
There are a few ways to materialize an asset:

```bash
pip install dagit
```
- [Dagit](#using-dagit)
- [Dagster Python API](#using-the-dagster-python-api)

To visualize your job in Dagit, run the following command:
### Using Dagit

```bash
dagit -f hello_world.py
```
[Dagit](/concepts/dagit/dagit) is a web-based interface for viewing and interacting with Dagster objects.

1. To install Dagit, run:

```bash
pip install dagit
```

2. To launch Dagit, run:

```bash
dagit -f cereal.py
```

Then navigate to <http://localhost:3000> to start using Dagit:
You should see output similar to:

<Image
alt="dagit-def"
src="/images/getting-started/dagit-def.png"
width={4032}
height={2454}
/>
```bash
Serving dagit on http://127.0.0.1:3000 in process 70635
```

Click on the "Launchpad" tab, then press the "Launch Run" button to launch the job.
3. Navigate to <http://localhost:3000> in your web browser to view your assets.

<Image
alt="dagit-run"
src="/images/getting-started/dagit-run.png"
width={4032}
height={2454}
/>
4. Click the **Materialize All** button to launch a run that materializes the assets:

## Running the Job Programmatically
<img src="/images/getting-started/materialize-asset-in-dagit.gif" />

You can also execute the job without the UI.
### Using the Dagster Python API

**Dagster Python API**
You can also use the [Dagster Python API](/concepts/ops-jobs-graphs/job-execution#python-apis) to materialize the assets as a script.

Add a few lines to `cereal.py`, which executes a run within the Python process:

```python file=/guides/dagster/asset_tutorial/cereal.py startafter=start_multiple_materialize_marker endbefore=end_multiple_materialize_marker
from dagster import materialize

```python file=/getting_started/hello_world.py startafter=start_execute_marker endbefore=end_execute_marker
if __name__ == "__main__":
    result = hello_dagster.execute_in_process()
    materialize([cereals, nabisco_cereals])
```

**Dagster CLI**
Now you can run:

```bash
dagster job execute -f hello_world.py
python cereal.py
```

---

To learn more about Dagster, head over to the [Tutorial](/tutorial). And if you get stuck or have any other questions, we'd love to hear from you on Slack:
## What's next?

Congrats - you just created and materialized your first Dagster assets! Now that you've done that, what's next?

- **Learn about Dagster** with hands-on examples using our [tutorials](/tutorial)
- **Get the most out of Dagster** by familiarizing yourself with its core [concepts](/concepts)
- **Accomplish common tasks** using our step-by-step [guides](/guides)
- **Deploy Dagster** to your platform of choice with our [deployment guides](/deployment)

If you get stuck or have any other questions, we'd love to hear from you on Slack:

<p align="center">
<a href="https://dagster-slackin.herokuapp.com/" target="_blank">
14 changes: 0 additions & 14 deletions docs/screenshot_capture/screenshots.yaml
@@ -1,17 +1,3 @@
#################
# Getting started
#################

- path: getting-started/dagit-run.png
defs_file: examples/docs_snippets/docs_snippets/getting_started/hello_world.py
url: http://127.0.0.1:3000/
steps:
- launch job

- path: getting-started/dagit-def.png
defs_file: examples/docs_snippets/docs_snippets/getting_started/hello_world.py
url: http://127.0.0.1:3000/

#################
# Tutorial
#################

This file was deleted.

@@ -1,5 +1,10 @@
"""isort:skip_file"""

import csv
import requests
from dagster import asset


# start_asset_marker
import csv
import requests
@@ -17,10 +22,40 @@ def cereals():

# end_asset_marker

# start_multiple_assets
import csv
import requests
from dagster import asset


@asset
def cereals():
    response = requests.get("https://docs.dagster.io/assets/cereal.csv")
    lines = response.text.split("\n")
    return [row for row in csv.DictReader(lines)]


@asset
def nabisco_cereals(cereals):
    """Cereals manufactured by Nabisco"""
    return [row for row in cereals if row["mfr"] == "N"]


# end_multiple_assets

# start_materialize_marker
from dagster import materialize

if __name__ == "__main__":
    materialize([cereals])

# end_materialize_marker


# start_multiple_materialize_marker
from dagster import materialize

if __name__ == "__main__":
    materialize([cereals, nabisco_cereals])

# end_multiple_materialize_marker

This file was deleted.
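
As an aside that is not part of this commit: the parsing and filtering that `cereals` and `nabisco_cereals` perform can be sketched on an inline sample, so it runs without Dagster or network access. The sample rows and the helper names `parse_cereals` and `filter_nabisco` are hypothetical, introduced only for illustration. Only the `csv.DictReader` call and the `mfr == "N"` filter come from the diff above.

```python
import csv

# Hypothetical sample standing in for cereal.csv; the real file has more columns.
SAMPLE_CSV = "name,mfr,calories\n100% Bran,N,70\nAll-Bran,K,70\nShredded Wheat,N,80\n"


def parse_cereals(text):
    """Mirror the body of `cereals`: split the text into lines and parse rows as dicts."""
    lines = text.split("\n")
    return [row for row in csv.DictReader(lines)]


def filter_nabisco(rows):
    """Mirror the body of `nabisco_cereals`: keep rows whose manufacturer code is "N"."""
    return [row for row in rows if row["mfr"] == "N"]


if __name__ == "__main__":
    rows = parse_cereals(SAMPLE_CSV)
    for row in filter_nabisco(rows):
        print(row["name"])
```

In the commit itself, Dagster wires these two steps together by matching the `cereals` parameter name of `nabisco_cereals` to the upstream asset, so no explicit call between them is written.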

1 comment on commit ffb3ddc


@vercel vercel bot commented on ffb3ddc Jun 11, 2022
