Add docs
edgarrmondragon committed Aug 5, 2023
1 parent 23dd8a6 commit 2db3f63
Showing 6 changed files with 64 additions and 39 deletions.
6 changes: 3 additions & 3 deletions docs/docs/concepts/project.md
Original file line number Diff line number Diff line change
@@ -356,9 +356,9 @@ In a newly initialized project, this directory will be included in [`.gitignore`
While you would usually not want to modify files in this directory directly, knowing what's in there can aid in debugging:

- `.meltano/meltano.db`: The default SQLite [system database](#system-database).
- `.meltano/logs/elt/<state_id>/<run_id>/elt.log`, e.g. `.meltano/logs/elt/gitlab-to-postgres/<UUID>/elt.log`: [`meltano elt`](/reference/command-line-interface#elt) and [`meltano run`](/reference/command-line-interface#run) output logs for the specified pipeline run.
- `.meltano/logs/elt/<state_id>/<run_id>/elt.log`, e.g. `.meltano/logs/elt/gitlab-to-postgres/<UUID>/elt.log`: [`meltano el`](/reference/command-line-interface#el), [`meltano elt`](/reference/command-line-interface#elt) and [`meltano run`](/reference/command-line-interface#run) output logs for the specified pipeline run.
- `.meltano/run/bin`: Symlink to the [`meltano` executable](/reference/command-line-interface) most recently used in this project.
- `.meltano/run/elt/<state_id>/<run_id>/`, e.g. `.meltano/run/elt/gitlab-to-postgres/<UUID>/`: Directory used by [`meltano elt`](/reference/command-line-interface#elt) and [`meltano run`](/reference/command-line-interface#run) to store pipeline-specific generated plugin config files, like an [extractor](/concepts/plugins#extractors)'s `tap.config.json`, `tap.properties.json`, and `state.json`.
- `.meltano/run/elt/<state_id>/<run_id>/`, e.g. `.meltano/run/elt/gitlab-to-postgres/<UUID>/`: Directory used by [`meltano el`](/reference/command-line-interface#el), [`meltano elt`](/reference/command-line-interface#elt) and [`meltano run`](/reference/command-line-interface#run) to store pipeline-specific generated plugin config files, like an [extractor](/concepts/plugins#extractors)'s `tap.config.json`, `tap.properties.json`, and `state.json`.
- `.meltano/run/<plugin name>/`, e.g. `.meltano/run/tap-gitlab/`: Directory used by [`meltano invoke`](/reference/command-line-interface#invoke) to store generated plugin config files.
- `.meltano/<plugin type>/<plugin name>/venv/`, e.g. `.meltano/extractors/tap-gitlab/venv/`: [Python virtual environment](https://docs.python.org/3/glossary.html#term-virtual-environment) directory that a plugin's [pip package](https://pip.pypa.io/en/stable/) was installed into by [`meltano add`](/reference/command-line-interface#add) or [`meltano install`](/reference/command-line-interface#install).
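
As a debugging aid, the log layout above can be navigated from the shell. The following is a minimal sketch and not part of Meltano: the helper name `latest_elt_log` is hypothetical, and it assumes the `.meltano/logs/elt/<state_id>/<run_id>/elt.log` layout described in the list.

```bash
# Hypothetical helper (not a Meltano command): print the most recently
# modified elt.log for a given state ID.
# Assumes the layout .meltano/logs/elt/<state_id>/<run_id>/elt.log.
latest_elt_log() {
  local project_dir="$1" state_id="$2"
  # ls -t sorts by modification time, newest first; head keeps the newest.
  # If nothing matches, the error is suppressed and nothing is printed.
  ls -t "$project_dir"/.meltano/logs/elt/"$state_id"/*/elt.log 2>/dev/null | head -n 1
}
```

From the project root, `latest_elt_log . gitlab-to-postgres` would print the path of the newest log for that pipeline, or nothing if no run has been logged yet.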

@@ -376,7 +376,7 @@ While you would usually not want to modify the system database directly, knowing

Meltano's CLI utilizes the following tables:

- `runs` table: One row for each [`meltano elt`](/reference/command-line-interface#elt) or [`meltano run`](/reference/command-line-interface#run) pipeline run, holding started/ended timestamps and [incremental replication state](/guide/integration#incremental-replication-state).
- `runs` table: One row for each [`meltano el`](/reference/command-line-interface#el), [`meltano elt`](/reference/command-line-interface#elt) or [`meltano run`](/reference/command-line-interface#run) pipeline run, holding started/ended timestamps and [incremental replication state](/guide/integration#incremental-replication-state).
- `plugin_settings` table: [Plugin configuration](/guide/configuration#configuration-layers) set using [`meltano config <plugin> set`](/reference/command-line-interface#config) or [the UI](/reference/ui) when the project is [deployed as read-only](/reference/settings#project-readonly).
- `user` table: Users for [the deprecated Meltano UI](/guide/troubleshooting#meltano-ui) created using [`meltano user add`](/reference/command-line-interface#user).

2 changes: 1 addition & 1 deletion docs/docs/contribute/plugins.md
@@ -67,7 +67,7 @@ Try to run the tap with the `--discover` switch, which should output a catalog o
###### State

1. Try to run the tap to connect and extract data first, watching for `STATE` messages.
1. Do two ELT run with `target-postgres`, then validate that:
1. Do two EL runs with `target-postgres`, then validate that:
1. All the tables in the schema created have a PRIMARY KEY constraint. (this is important for incremental updates)
1. There are no duplicates after multiple extractions

8 changes: 4 additions & 4 deletions docs/docs/contribute/tests.md
@@ -44,17 +44,17 @@ the below in `docs/example-library/transition-from-elt-to-run/index.md`:
````markdown
# Example of how to transition from `meltano elt` to `meltano run`

This example shows how to transition an `elt` task with a custom state-id to a `job` executed via `run`.
This example shows how to transition an `el` or `elt` task with a custom state-id to a `job` executed via `run`.
To follow along with this example, download link to meltano yml to a fresh project and run:

```
meltano install
```

Then assuming you had an `elt` job invoked like so:
Then assuming you had an `el` job invoked like so:

```shell
meltano elt --state-id=my-custom-id tap-gitlab target-postgres
meltano el --state-id=my-custom-id tap-gitlab target-postgres
```

You would first need to rename the id to match meltano's internal pattern:
@@ -76,7 +76,7 @@ Our integration framework will then parse this markdown, searching for code fenc

```shell
meltano install
meltano elt --state-id=my-custom-id tap-gitlab target-postgres
meltano el --state-id=my-custom-id tap-gitlab target-postgres
meltano state copy my-custom-id tap-gitlab-to-target-postgres
meltano job add my-new-job --task="tap-gitlab target-postgres"
meltano run my-new-job
2 changes: 1 addition & 1 deletion docs/docs/guide/complete_tutorial.md
@@ -851,7 +851,7 @@ meltano schedule add <pipeline name> --extractor <extractor> --loader <loader> -
meltano schedule add gitlab-to-postgres --extractor tap-gitlab --loader target-postgres --interval @daily
```
The `pipeline name` argument corresponds to the `--state-id` option on `meltano elt`, which identifies related EL(T) runs when storing and looking up [incremental replication state](/guide/integration#incremental-replication-state).
The `pipeline name` argument corresponds to the `--state-id` option on `meltano el`, which identifies related EL runs when storing and looking up [incremental replication state](/guide/integration#incremental-replication-state).
To have scheduled runs pick up where your [earlier manual run](#run-a-data-integration-el-pipeline) left off, ensure you use the same pipeline name.
83 changes: 54 additions & 29 deletions docs/docs/reference/command-line-interface.md
@@ -40,7 +40,7 @@ Specifically, it will:
(Some plugin types have slightly different or additional behavior; refer to the [plugin type documentation](/concepts/plugins#types) for more details.)

Once the plugin has been added to your project, you can configure it using [`meltano config`](#config),
invoke its executable using [`meltano invoke`](#invoke), and use it in a pipeline using [`meltano elt`](#elt).
invoke its executable using [`meltano invoke`](#invoke), and use it in a pipeline using [`meltano run`](#run).

To learn more about adding a plugin to your project, refer to the [Plugin Management guide](/guide/plugin-management#adding-a-plugin-to-your-project).

@@ -404,25 +404,24 @@ The `discover` command does not run relative to a [Meltano Environment](https://

Open the Meltano documentation site in the default browser.

## `elt`
## `el`

This allows you to run your ELT pipeline to Extract, Load, and Transform data using an [extractor](/concepts/plugins#extractors) and [loader](/concepts/plugins#loaders) of your choosing,
and optional [transformations](/concepts/plugins#transformers).
This allows you to run your EL pipeline to Extract and Load data using an [extractor](/concepts/plugins#extractors) and [loader](/concepts/plugins#loaders) of your choosing.

To allow subsequent pipeline runs with the same extractor/loader/transform combination to pick up right where the previous run left off,
each ELT run has a State ID that is used to store and look up the [incremental replication state](/guide/integration#incremental-replication-state) in the [system database](/guide/production#storing-metadata). If no stable identifier is provided using the `--state-id` flag or the `MELTANO_STATE_ID` environment variable, extraction will always start from scratch and a one-off State ID is automatically generated using the current date and time.
To allow subsequent pipeline runs with the same extractor/loader combination to pick up right where the previous run left off,
each EL run has a State ID that is used to store and look up the [incremental replication state](/guide/integration#incremental-replication-state) in the [system database](/guide/production#storing-metadata). If no stable identifier is provided using the `--state-id` flag or the `MELTANO_STATE_ID` environment variable, extraction will always start from scratch and a one-off State ID is automatically generated using the current date and time.

All the output generated by this command is also logged inside the [`.meltano` directory](/concepts/project#meltano-directory) at `.meltano/logs/elt/{state_id}/{run_id}/elt.log`. The `run_id` is a UUID autogenerated at each run.
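
Because a missing State ID causes extraction to start from scratch, a wrapper script may want to check for one before running. The following is a minimal sketch, assuming the State ID is supplied through the `MELTANO_STATE_ID` environment variable rather than the `--state-id` flag; the function name `require_state_id` is hypothetical and not part of Meltano.

```bash
# Hypothetical guard (not a Meltano command): fail early when no stable
# State ID is set in the environment. It does not inspect --state-id.
require_state_id() {
  if [ -z "${MELTANO_STATE_ID:-}" ]; then
    echo "MELTANO_STATE_ID is not set; extraction would start from scratch" >&2
    return 1
  fi
}
```

A script could then chain the check with the pipeline, for example `require_state_id && meltano el tap-gitlab target-postgres`, so that the run only starts when a stable identifier is present.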

:::info

<p>The command <a href="/reference/command-line-interface#run"><code>meltano run</code></a> is the recommended way to run cross-plugin workflows, including ELT, in a composable manner.</p>
<p>The command <a href="/reference/command-line-interface#run"><code>meltano run</code></a> is the recommended way to run cross-plugin workflows in a composable manner.</p>
:::

### How to use

```bash
meltano elt <extractor> <loader> [--transform={run,skip,only}] [--state-id TEXT]
meltano el <extractor> <loader> [--state-id TEXT]
```

#### Parameters
@@ -465,22 +464,22 @@ meltano elt <extractor> <loader> [--transform={run,skip,only}] [--state-id TEXT]
- `extractor-config`: Dump the extractor [config file](https://hub.meltano.com/singer/spec#config-files) that would be passed to the tap's executable using the `--config` option.
- `loader-config`: Dump the loader [config file](https://hub.meltano.com/singer/spec#config-files) that would be passed to the target's executable using the `--config` option.

Like any standard output, the dumped content can be [redirected](<https://en.wikipedia.org/wiki/Redirection_(computing)>) to a file using `>`, e.g. `meltano elt ... --dump=state > state.json`.
Like any standard output, the dumped content can be [redirected](<https://en.wikipedia.org/wiki/Redirection_(computing)>) to a file using `>`, e.g. `meltano el ... --dump=state > state.json`.

#### Examples

```bash
meltano elt tap-gitlab target-postgres --transform=run --state-id=gitlab-to-postgres
meltano el tap-gitlab target-postgres --state-id=gitlab-to-postgres

meltano elt tap-gitlab target-postgres --state-id=gitlab-to-postgres --full-refresh
meltano el tap-gitlab target-postgres --state-id=gitlab-to-postgres --full-refresh

meltano elt tap-gitlab target-postgres --catalog extract/tap-gitlab.catalog.json
meltano elt tap-gitlab target-postgres --state extract/tap-gitlab.state.json
meltano el tap-gitlab target-postgres --catalog extract/tap-gitlab.catalog.json
meltano el tap-gitlab target-postgres --state extract/tap-gitlab.state.json

meltano elt tap-gitlab target-postgres --select commits
meltano elt tap-gitlab target-postgres --exclude project_members
meltano el tap-gitlab target-postgres --select commits
meltano el tap-gitlab target-postgres --exclude project_members

meltano elt tap-gitlab target-postgres --state-id=gitlab-to-postgres --dump=state > extract/tap-gitlab.state.json
meltano el tap-gitlab target-postgres --state-id=gitlab-to-postgres --dump=state > extract/tap-gitlab.state.json
```

### Using `elt` with Environments
@@ -495,18 +494,18 @@ you can learn more about what's going on behind the scenes by setting Meltano's
using the `MELTANO_CLI_LOG_LEVEL` environment variable or the `--log-level` CLI option:

```bash
MELTANO_CLI_LOG_LEVEL=debug meltano elt ...
MELTANO_CLI_LOG_LEVEL=debug meltano el ...

meltano --log-level=debug elt ...
meltano --log-level=debug el ...
```

In debug mode, `meltano elt` will log the arguments and [environment](/guide/configuration#accessing-from-plugins) used to invoke the Singer tap and target executables (and `dbt`, when running transformations), including the paths to the generated
In debug mode, `meltano el` will log the arguments and [environment](/guide/configuration#accessing-from-plugins) used to invoke the Singer tap and target executables (and `dbt`, when running transformations), including the paths to the generated
[config](https://hub.meltano.com/singer/spec#config-files),
[catalog](https://hub.meltano.com/singer/spec#catalog-files), and
[state](https://hub.meltano.com/singer/spec#state-files) files, for you to review:

```bash
$ meltano --log-level=debug elt tap-gitlab target-jsonl --state-id=gitlab-to-jsonl
$ meltano --log-level=debug el tap-gitlab target-jsonl --state-id=gitlab-to-jsonl
meltano | INFO Running extract & load...
meltano | INFO Found state from 2020-08-05 21:30:20.487312.
meltano | DEBUG Invoking: ['demo-project/.meltano/extractors/tap-gitlab/venv/bin/tap-gitlab', '--config', 'demo-project/.meltano/run/tap-gitlab/tap.config.json', '--state', 'demo-project/.meltano/run/tap-gitlab/state.json']
@@ -531,6 +530,32 @@ meltano | DEBUG Incremental state: {'project_7603319': '2020-08-05T21
meltano | INFO Extract & load complete!
```
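
When the debug output is long, it can help to keep only the lines that show how each plugin was invoked. The following is a minimal sketch, assuming the output has been saved to a file or is piped in, and that invocation lines contain `DEBUG` followed by `Invoking:` as in the sample above; `show_invocations` is a hypothetical helper name.

```bash
# Hypothetical helper (not a Meltano command): keep only the log lines
# that record a plugin invocation. Reads the given files, or stdin when
# no file is given.
show_invocations() {
  # Allow one or more spaces between the level and the message.
  grep -E 'DEBUG +Invoking:' "$@"
}
```

For example, `show_invocations captured-output.log` prints just the invocation lines, where `captured-output.log` stands for whatever file holds the saved output.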

## `elt`

:::caution
This command is deprecated in favor of `el`.
:::

This is identical to the [`el`](#el) command, except that it also runs [transformations](/concepts/plugins#transformers).

### How to use

```bash
meltano elt <extractor> <loader> [--transform={run,skip,only}] [--state-id TEXT]
```

#### Parameters

All the same parameters as [`meltano el`](#el) are supported, with the following additions:

- The `--transform` option can be one of `run`, `skip`, or `only`.

#### Examples

```bash
meltano elt tap-gitlab target-postgres --transform=run --state-id=gitlab-to-postgres
```

## `environment`

Use the `environment` command to manage [Environments](/concepts/environments) in your Meltano project.
@@ -1052,11 +1077,11 @@ meltano job remove simple-demo
<p>An <code>orchestrator</code> plugin is required to use <code>meltano schedule</code>: refer to the <a href="/guide/orchestration">Orchestration</a> documentation to get started with Meltano orchestration.</p>
:::

Use the `schedule` command to define ELT or Job pipelines to be run by an orchestrator at regular intervals.
Use the `schedule` command to define EL or Job pipelines to be run by an orchestrator at regular intervals.
These scheduled pipelines will be added to your [`meltano.yml` project file](/concepts/project#meltano-yml-project-file).
You can schedule both [jobs](#job) or legacy [`meltano elt`](#elt) tasks.
You can schedule both [jobs](#job) and legacy [`meltano el`](#el) or [`meltano elt`](#elt) tasks.

You can run a specific scheduled pipeline's corresponding [`meltano run`](#run) or [`meltano elt`](#elt) command as a one-off using `meltano schedule run <schedule_name>`.
You can run a specific scheduled pipeline's corresponding [`meltano run`](#run), [`meltano el`](#el) or [`meltano elt`](#elt) command as a one-off using `meltano schedule run <schedule_name>`.
Any command line options (e.g. `--select=<entity>` or `--dry-run`) will be passed on to the underlying commands.

### How to use
@@ -1068,8 +1093,8 @@ The interval argument can be a [cron expression](https://en.wikipedia.org/wiki/C
# Add a schedule
# Schedule a job named "my_job" to run every day
meltano schedule add <schedule_name> --job my_job --interval "@daily"
# Schedule an ELT task to run hourly
meltano schedule add <schedule_name> --extractor <tap> --loader <target> --transform run --interval "@hourly"
# Schedule an EL task to run hourly
meltano schedule add <schedule_name> --extractor <tap> --loader <target> --interval "@hourly"

# List all schedules
meltano schedule list [--format=json]
@@ -1081,7 +1106,7 @@ meltano schedule remove <schedule_name>
meltano schedule set <schedule_name> --interval <new-interval>
# Update a named schedule changing the referenced job
meltano schedule set <schedule_name> --job <new-job>
# Update a named ELT scheduled changing the interval AND changing the extractor
# Update a named EL schedule, changing the interval AND the extractor
meltano schedule set <schedule_name> --extractor <new-tap> --interval <new-interval>

# Run a schedule
@@ -1107,9 +1132,9 @@ meltano schedule set gitlab-sync --job gitlab-to-postgres
# Update the schedule named "gitlab-sync" to run weekly instead of daily
meltano schedule set gitlab-sync --interval "@weekly"

# Add a legacy ELT based schedule named "gitlab-to-jsonl" to run every minute
# Add a legacy EL-based schedule named "gitlab-to-jsonl" to run every minute
# This specifies that the following command is to be run every minute:
# meltano elt tap-gitlab target-jsonl --state-id=gitlab-to-jsonl
# meltano el tap-gitlab target-jsonl --state-id=gitlab-to-jsonl
meltano schedule add gitlab-to-jsonl --extractor tap-gitlab --loader target-jsonl --interval="* * * * *"
# Update the schedule named "gitlab-to-jsonl" to use target-csv instead of target-jsonl
meltano schedule set gitlab-to-jsonl --loader target-csv
@@ -1319,7 +1344,7 @@ Merge new state onto existing state for a state ID.
<p>Merged state is computed at <em>execution</em> time.
The <samp>merge</samp> command merely
adds a new <samp>payload</samp> to the database which is merged together with
existing payloads the next time state is read via <samp>meltano elt</samp>, <samp>meltano run</samp>, or <samp>meltano state get</samp>.
existing payloads the next time state is read via <samp>meltano el</samp>, <samp>meltano elt</samp>, <samp>meltano run</samp>, or <samp>meltano state get</samp>.
</p>
:::
2 changes: 1 addition & 1 deletion docs/docs/reference/settings.md
@@ -418,7 +418,7 @@ root:

## `meltano elt`

These settings can be used to modify the behavior of [`meltano elt`](/reference/command-line-interface#elt).
These settings can be used to modify the behavior of [`meltano el`](/reference/command-line-interface#el) and [`meltano elt`](/reference/command-line-interface#elt).

### `elt.buffer_size`

