# Lesson Notes

This notebook collects and organizes all user questions related to this project.


## Lesson Summary

This lesson teaches an end-to-end ELT ingestion and orchestration workflow using Meltano, BigQuery, dbt, and Dagster. You'll create Meltano projects that extract data from GitHub and a Supabase Postgres database, test extraction against a local JSON target, and then load the data into BigQuery, which serves as the data warehouse. After ingestion, you'll scaffold a dbt project, declare the raw tables as sources, and build models that transform them into materialized analytics tables. Finally, you'll scaffold a Dagster project, implement two Software-Defined Assets (one that fetches pandas releases and one that computes summary statistics), register a job and a schedule, and configure an I/O manager so that assets are persisted and visible in the Dagster UI.

## Syntax Summary

| Syntax / Token | Where seen | Purpose / Usage |
|---|---|---|
| `meltano init <name>` | Meltano CLI | Create a new Meltano project scaffold |
| `meltano add extractor <tap-name>` | Meltano CLI | Add a data extractor (tap) to the project |
| `meltano add loader <target-name>` | Meltano CLI | Add a loader (target) to write extracted data |
| `meltano config <plugin> set --interactive` | Meltano CLI | Interactively set plugin configuration (settings are written to `meltano.yml`, secrets to `.env`) |
| `meltano select <tap> <entity> <attribute>` | Meltano CLI | Choose which entities and attributes the tap should extract |
| `meltano run <tap> <target>` | Meltano CLI | Run the extract-and-load pipeline from the tap to the target |
| `dbt init <project>` | dbt CLI | Scaffold a new dbt project |
| `dbt debug` / `dbt run` / `dbt clean` | dbt CLI | Validate the connection and configuration, run models, and delete build artifacts, respectively |
| `profiles.yml` (YAML) | dbt config | Configure dbt connection targets (e.g., BigQuery service account) |
| `.env` with `GITHUB_TOKEN` | env file | Store secrets for local runs (e.g., GitHub personal access token) |
| `@asset` decorator | Dagster (Python) | Define a Software-Defined Asset: a function whose return value is persisted by the I/O manager when the asset is materialized |
| `context.add_output_metadata` | Dagster (Python) | Attach metadata (previews, counts, images) to assets for the UI |
| `define_asset_job(...)` / `ScheduleDefinition(...)` | Dagster API | Create a job to materialize assets and a schedule to run the job periodically |



### ELT Flow (compact)

1. Extract: pull raw data from sources (APIs, databases, files).
2. Load: store the raw data, untransformed, in the data warehouse (e.g., BigQuery).
3. Transform: run SQL/dbt models inside the warehouse to produce analytics-ready tables.

```
Sources (APIs, DBs, Files)
        |
        v
     Extract (taps)
        |
        v
     Load (warehouse: BigQuery, Snowflake)
        |
        v
    Transform (dbt / SQL inside warehouse)
        |
        v
   Analytics-ready tables / dashboards
```
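
The flow above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the lesson's actual pipeline: an in-memory SQLite database stands in for the warehouse (BigQuery in the lesson), a hard-coded list stands in for what a tap would extract, and the table names `raw_releases` and `releases_by_author`, the record fields, and all function names are hypothetical. The point it shows is the ordering that distinguishes ELT: raw data is loaded unchanged, and the transformation runs afterwards as SQL inside the warehouse.

```python
import sqlite3

# Hypothetical raw records standing in for what a tap would extract.
RAW_RELEASES = [
    {"tag": "v2.1.0", "author": "alice", "assets": 3},
    {"tag": "v2.1.1", "author": "bob", "assets": 2},
    {"tag": "v2.2.0", "author": "alice", "assets": 4},
]


def extract():
    """Extract: return raw records as-is (a real tap would call an API or database)."""
    return list(RAW_RELEASES)


def load(conn, records):
    """Load: write the raw records into the warehouse without reshaping them."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_releases (tag TEXT, author TEXT, assets INTEGER)"
    )
    conn.executemany(
        "INSERT INTO raw_releases (tag, author, assets) VALUES (:tag, :author, :assets)",
        records,
    )
    conn.commit()


def transform(conn):
    """Transform: build an analytics table with SQL that runs inside the warehouse."""
    conn.execute("DROP TABLE IF EXISTS releases_by_author")
    conn.execute(
        """
        CREATE TABLE releases_by_author AS
        SELECT author, COUNT(*) AS release_count, SUM(assets) AS total_assets
        FROM raw_releases
        GROUP BY author
        """
    )
    conn.commit()


def run_pipeline():
    """Run extract -> load -> transform and return the analytics rows."""
    conn = sqlite3.connect(":memory:")  # stand-in for the warehouse
    load(conn, extract())
    transform(conn)
    rows = conn.execute(
        "SELECT author, release_count, total_assets "
        "FROM releases_by_author ORDER BY author"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(run_pipeline())  # [('alice', 2, 7), ('bob', 1, 2)]
```

In the lesson's stack, Meltano covers the roles of `extract` and `load`, while dbt models take the place of the SQL in `transform`.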