New "running dagster locally" deployment guide that walks through `dagster dev` usage (#11741)

Summary:
Right now if you want to answer 'how do i run dagster' you have to kind
of wade through the instance docs, and then the daemon docs if you want
to use any schedules or sensors. This tries to streamline the question
of 'how do i get dagster up and running locally' in a single guide using
the new 'dagster dev' command.

Not sure about the exact information architecture here / if this even
belongs under 'Deployment' per se - open to putting it elsewhere, but
there was some existing stuff in Deployment, particularly in the Daemon
section, that this somewhat replaces. The section labeled 'Open Source'
is kind of alternately called 'Deploying to your own infra' elsewhere
and this wasn't quite that, but then it's not under Open Source which is
weird... I feel pretty good about the Content but am not quite sure
where to fit it in.

### Summary & Motivation

### How I Tested These Changes
gibsondan authored and dpeng817 committed Jan 19, 2023
1 parent 8354f89 commit 7e3032c
Showing 9 changed files with 133 additions and 59 deletions.
Original file line number Diff line number Diff line change
@@ -149,7 +149,7 @@ height={1618}

When you materialize a partitioned asset, you choose which partitions to materialize, and Dagster will launch a run for each partition.

**Note**: If you choose more than one partition, the [Dagster daemon](/deployment/guides/service#running-dagster-daemon) needs to be running to queue the multiple runs.
**Note**: If you choose more than one partition, the [Dagster daemon](/deployment/dagster-daemon) needs to be running to queue the multiple runs.

<Image
src="/images/concepts/partitions-schedules-sensors/partitions/rematerialize-partition.png"
6 changes: 6 additions & 0 deletions docs/content/deployment.mdx
@@ -8,6 +8,12 @@ Explore your options for deploying Dagster to your infrastructure or using Dagst

---

## Running Dagster locally

Want to quickly get Dagster up and running on your local machine? Check out the [running Dagster locally](/deployment/guides/running-locally) guide to learn more.

---

## Deploying to your infrastructure

Ready to deploy Dagster to your infrastructure? Use these resources to learn more:
2 changes: 1 addition & 1 deletion docs/content/deployment/concepts.mdx
@@ -17,7 +17,7 @@ Learn about the concepts relevant to deploying Dagster. Refer to the [Core conce

## Dagster instance

The `DagsterInstance` defines all of the configuration that Dagster needs for a single deployment - for example, where to store the history of past runs and their associated logs, where to stream the raw logs from op compute functions, and how to launch new runs.
The Dagster instance defines all of the configuration that Dagster needs for a single deployment - for example, where to store the history of past runs and their associated logs, where to stream the raw logs from op compute functions, and how to launch new runs.

[Learn more about setting up your Dagster instance](/deployment/dagster-instance).

57 changes: 5 additions & 52 deletions docs/content/deployment/dagster-daemon.mdx
@@ -16,68 +16,21 @@ Several Dagster features, like [schedules](/concepts/partitions-schedules-sensor

### Running locally

The Dagster daemon can be started locally in a few ways, which are outlined in the following tabs. Once started, the process should be kept running.

<TabGroup>
<TabItem name="From a file">

The Dagster daemon can load a file directly as a code location. In the following example, we used the `-f` argument to supply the name of the file to `dagster-daemon`:

```shell
dagster-daemon run -f my_file.py
```

This command loads the definitions in `my_file.py` as a code location in the same Python environment where the daemon resides.

You can also include multiple files at a time:

```shell
dagster-daemon run -f my_file.py -f my_second_file.py
```

---

</TabItem>
<TabItem name="From a module">

The Dagster daemon can also load Python modules as code locations. When this approach is used, Dagster loads the definitions defined at the top-level of the module, in a variable containing the <PyObject object="Definitions" /> object of its root `__init__.py` file. As this style of development eliminates an entire class of Python import errors, we strongly recommend it for Dagster projects deployed to production.

In the following example, we used the `-m` argument to supply the name of the module to the daemon process:
The easiest way to run the Dagster daemon locally is to run the `dagster dev` command:

```shell
dagster-daemon run -m your_module_name
dagster dev
```

This command loads the definitions in the variable containing the <PyObject object="Definitions" /> object in the named module - defined as the root `__init__.py` file - in the same virtual environment as the daemon.

---

</TabItem>
<TabItem name="Without command line arguments">

To load definitions without supplying command line arguments, you can use the `pyproject.toml` file. This file, included in all Dagster example projects, contains a `tool.dagster` section with a `module_name` variable:

```shell
[tool.dagster]
module_name = "your_module_name" ## name of project's Python module
```
This command launches both [Dagit](/concepts/dagit/dagit) and the Dagster daemon, allowing you to start a full local deployment of Dagster from the command line. See the [Running Dagster Locally guide](/deployment/guides/running-locally) for more information about `dagster dev`.

When defined, you can run this in the same directory as the `pyproject.toml` file:
You can also run the Dagster daemon by itself by running:

```shell
dagster-daemon run
```

Instead of this:

```shell
dagster-daemon run -m your_module_name
```

---

</TabItem>
</TabGroup>
This command takes all the same arguments as `dagster dev` for specifying where to find your code.

### Deploying the daemon

8 changes: 4 additions & 4 deletions docs/content/deployment/dagster-instance.mdx
@@ -10,7 +10,7 @@ description: "Define configuration options for your Dagster instance."
Cloud, refer to the <a href="/dagster-cloud">Dagster Cloud documentation</a>.
</Note>

The <PyObject module="dagster" object="DagsterInstance" displayText="DagsterInstance" /> defines the configuration that Dagster needs for a single deployment - for example, where to store the history of past runs and their associated logs, where to stream the raw logs from op compute functions, and how to launch new runs.
The Dagster instance defines the configuration that Dagster needs for a single deployment - for example, where to store the history of past runs and their associated logs, where to stream the raw logs from op compute functions, and how to launch new runs.

All of the processes and services that make up your Dagster deployment should share a single instance config file, named `dagster.yaml`, so that they can effectively share information.

@@ -30,7 +30,9 @@ All of the processes and services that make up your Dagster deployment should sh

When a Dagster process, such as Dagit or a Dagster CLI command, is launched, Dagster tries to load your instance. If the environment variable `DAGSTER_HOME` is set, Dagster looks for an instance config file at `$DAGSTER_HOME/dagster.yaml`. This file contains the configuration settings that make up the instance.

By default - if `dagster.yaml` isn't present or the file exists but is empty - Dagster will store this information on the local filesystem, structured like the following:
If `DAGSTER_HOME` isn't set, Dagster tools will use a temporary directory for storage that is cleaned up when the process exits. This can be useful when using Dagster for temporary local development or testing, when you don't care about the results being persisted.

If `DAGSTER_HOME` is set but `dagster.yaml` isn't present or is empty, Dagster will persist data on the local filesystem, structured like the following:

$DAGSTER_HOME
├── dagster.yaml
@@ -83,8 +85,6 @@ Here's a breakdown of the files and directories that are generated:
</tbody>
</table>
If `DAGSTER_HOME` isn't set, Dagster tools will use an ephemeral instance for execution. In this case, the run and event log storages will be in-memory rather than persisted to disk. Additionally, filesystem storage will use a temporary directory that's cleaned up when the process exits. This is useful for tests and is the default when invoking Python APIs such as <PyObject module="dagster" object="JobDefinition" method="execute_in_process" /> directly.

---
## Configuration reference
4 changes: 4 additions & 0 deletions docs/content/deployment/guides.mdx
@@ -18,6 +18,10 @@ Learn how to deploy and execute Dagster with these hands-on guides.
Check out these guides to learn the basics of Dagster deployment, including setting up Dagster and running the Dagit server.

<ArticleList>
<ArticleListItem
title="Running Dagster locally"
href="/deployment/guides/running-locally"
></ArticleListItem>
<ArticleListItem
title="Running Dagster as a service"
href="/deployment/guides/service"
107 changes: 107 additions & 0 deletions docs/content/deployment/guides/running-locally.mdx
@@ -0,0 +1,107 @@
---
title: Running Dagster Locally | Dagster
description: How to run Dagster on your local machine.
---

# Running Dagster Locally

The easiest way to start up Dagster services during local development is to run:

```shell
dagster dev
```

from a Python environment that has both the `dagster` and `dagit` Python packages installed. See the [Dagster installation guide](/getting-started/install) for more information on how to install those packages.

The `dagster dev` command launches both [Dagit](/concepts/dagit/dagit) and the [Dagster daemon](/deployment/dagster-daemon) locally, allowing you to start a full deployment of Dagster from the command line. Once started, the process should be kept running.

## Locating your code

There are a few ways that you can tell Dagster how to find the Python code containing your assets and jobs, which are outlined in the following tabs. If you've used the [dagster command line to create a project](/getting-started/create-new-project#bootstrapping-a-new-project) or are using a Dagster example project, you can simply run the `dagster dev` command in the same folder as the project in order to load that code.

<TabGroup>
<TabItem name="From a file">

Dagster can load a file directly as a code location. In the following example, we used the `-f` argument to supply the name of the file:

```shell
dagster dev -f my_file.py
```

This command loads the definitions in `my_file.py` as a code location in the current Python environment.

You can also include multiple files at a time:

```shell
dagster dev -f my_file.py -f my_second_file.py
```

---

</TabItem>
<TabItem name="From a module">

Dagster can also load Python modules as code locations. When this approach is used, Dagster loads the definitions at the top level of the module, from a variable containing the <PyObject object="Definitions" /> object in its root `__init__.py` file. As this style of development eliminates an entire class of Python import errors, we strongly recommend it for Dagster projects deployed to production.

In the following example, we used the `-m` argument to supply the name of the module:

```shell
dagster dev -m your_module_name
```

This command loads the definitions in the variable containing the <PyObject object="Definitions" /> object in the named module - defined as the root `__init__.py` file - in the current Python environment.

---

</TabItem>
<TabItem name="Without command line arguments">

To load definitions without supplying command line arguments, you can use the `pyproject.toml` file. This file, included in all Dagster example projects, contains a `tool.dagster` section with a `module_name` variable:

```toml
[tool.dagster]
module_name = "your_module_name"  # name of project's Python module
```

When defined, you can run this in the same directory as the `pyproject.toml` file:

```shell
dagster dev
```

Instead of this:

```shell
dagster dev -m your_module_name
```

---

</TabItem>
</TabGroup>
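
The tabs above boil down to a few command-line shapes. As a rough illustration, the Python sketch below assembles the matching argument list. The helper name `build_dev_command` is hypothetical, and it relies only on the `-f` and `-m` flags shown above. It keeps the file and module styles separate to mirror the tabs, which is a choice made for this sketch rather than a statement about what the CLI accepts.

```python
from typing import List, Optional, Sequence


def build_dev_command(
    files: Sequence[str] = (),
    module: Optional[str] = None,
) -> List[str]:
    """Build an argument list for `dagster dev` (hypothetical helper)."""
    if files and module is not None:
        # Sketch-level restriction only: one loading style per call.
        raise ValueError("pass either files or a module in this sketch")

    command = ["dagster", "dev"]
    # Each file gets its own -f flag, as in the multi-file example.
    for path in files:
        command += ["-f", path]
    if module is not None:
        command += ["-m", module]
    # With no arguments, the bare command relies on pyproject.toml.
    return command


if __name__ == "__main__":
    print(" ".join(build_dev_command(files=["my_file.py", "my_second_file.py"])))
    print(" ".join(build_dev_command(module="your_module_name")))
    print(" ".join(build_dev_command()))
```

The returned list is in the form that `subprocess.run` expects, so it could be handed to a launcher script without any shell quoting.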

## Run and asset storage

When running `dagster dev`, you may see log output that looks like this:

```shell
Using temporary directory /Users/rhendricks/tmpqs_fk8_5 for storage.
```

This indicates that any runs or materialized assets that are created during your session will not be persisted once the session ends. This can be useful when using Dagster for temporary local development or testing, when you don't care about the results being persisted.

To designate a more permanent home for your runs and assets, you can set the `DAGSTER_HOME` environment variable to a folder on your filesystem. Dagster will then use that folder for storage on all subsequent runs of `dagster dev`.
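
As a minimal sketch of that setup, the Python helper below creates a storage folder and returns a copy of the environment with `DAGSTER_HOME` pointing at it. The function name and any path you pass in are hypothetical. Resolving the folder to an absolute path is an assumption made here to avoid ambiguity about where data is written.

```python
import os
from pathlib import Path
from typing import Dict, Mapping, Optional


def env_with_dagster_home(
    home: str, base_env: Optional[Mapping[str, str]] = None
) -> Dict[str, str]:
    """Create `home` if needed and return an environment with DAGSTER_HOME set.

    Hypothetical helper: it only prepares the folder and the variable, and
    does not start any Dagster process itself.
    """
    # Expand "~" and resolve to an absolute path (an assumption of this sketch).
    home_path = Path(home).expanduser().resolve()
    home_path.mkdir(parents=True, exist_ok=True)

    # Copy the base environment so the caller's mapping is left untouched.
    env = dict(os.environ if base_env is None else base_env)
    env["DAGSTER_HOME"] = str(home_path)
    return env
```

The resulting mapping could be passed as the `env` argument of `subprocess.run` when launching `dagster dev` from a script. Exporting the variable in your shell profile achieves the same effect for interactive use.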

## Configuring your local instance

You can optionally use a `dagster.yaml` file to configure your Dagster instance - for example, to configure [run concurrency limits](/deployment/run-coordinator#limiting-run-concurrency) or specify that runs should be stored in a [Postgres database](/deployment/dagster-instance#postgres-storage) instead of on the filesystem.

If you have the `DAGSTER_HOME` environment variable set, `dagster dev` will look for a `dagster.yaml` file in the `DAGSTER_HOME` folder. If `DAGSTER_HOME` is not set, `dagster dev` will look for that file in the folder where the command was run.
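
The lookup rule described above can be paraphrased in a few lines of Python. This is only a restatement of the rule in this guide, not Dagster's actual implementation, and the function name `expected_dagster_yaml` is hypothetical.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def expected_dagster_yaml(
    env: Optional[Mapping[str, str]] = None, cwd: Optional[str] = None
) -> Path:
    """Return where this guide says `dagster dev` looks for dagster.yaml."""
    env = os.environ if env is None else env
    home = env.get("DAGSTER_HOME")
    if home:
        # DAGSTER_HOME is set: the file is expected inside that folder.
        return Path(home) / "dagster.yaml"
    # Otherwise: the folder where the command was run.
    return Path(cwd if cwd is not None else os.getcwd()) / "dagster.yaml"
```

Calling `expected_dagster_yaml()` with no arguments uses the current process environment and working directory, which can help when checking why a config file is not being picked up.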

For the full set of options that can be set in the `dagster.yaml` file, see the [Dagster instance](/deployment/dagster-instance) section.

## Moving to Production

`dagster dev` is primarily useful for running Dagster for local development and testing, but is not suitable for the demands of most production deployments. For example, in a production deployment, you might want to run multiple Dagit replicas, have zero-downtime continuous deployment of your code, or set up your Dagster daemon to automatically restart if it crashes.

For information about deploying Dagster in production, see our other [Deploying Dagster guides](/deployment/open-source#deploying-dagster).
2 changes: 1 addition & 1 deletion docs/content/deployment/guides/service.mdx
@@ -23,7 +23,7 @@ DAGSTER_HOME=/opt/dagster/dagster_home dagit -h 0.0.0.0 -p 3000

In this configuration, Dagit will write execution logs to `$DAGSTER_HOME/logs` and listen on _0.0.0.0:3000_.

## Running dagster-daemon
## Running the Dagster daemon

If you're using [schedules](/concepts/partitions-schedules-sensors/schedules), [sensors](/concepts/partitions-schedules-sensors/sensors), or [backfills](/concepts/partitions-schedules-sensors/backfills), or want to set limits on the number of runs that can be executed at once, you'll want to also run a [dagster-daemon service](/deployment/dagster-daemon) as part of your deployment. To run this service locally, run the following command:

4 changes: 4 additions & 0 deletions docs/content/deployment/open-source.mdx
@@ -66,6 +66,10 @@ Learn about the concepts relevant to deploying Dagster.
Check out these guides to learn the basics of Dagster deployment, including setting up Dagster and running the Dagit server.

<ArticleList>
<ArticleListItem
title="Running Dagster locally"
href="/deployment/guides/running-locally"
></ArticleListItem>
<ArticleListItem
title="Running Dagster as a service"
href="/deployment/guides/service"