# Getting Started

## Usage
Curie always assumes it is operating from the project's root directory, even when the calling script lives in a deeply nested child directory. This is because Curie always looks for the `project.yml` file in the project's root directory. That file documents the project's metadata.

* **pathways.yaml** - Found in `<project>/config/`, this file is the master list of all the pathways in your project. Curie uses it to understand pipelines and their dependencies.
* **connections.yaml** - Referenced from `pathways.yaml`, this file holds the connection information for every data source in your project. Curie uses it to connect to your data sources.
* **project.yml** - This file contains the metadata for your project. Curie uses it to read the project's name, description, and other metadata.
* **/blueprints** - This directory contains the blueprints for your project. Each **blueprint** is a YAML file that describes a single pipeline.
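
The root-directory behaviour described above can be pictured with a small sketch. It is not Curie's actual lookup code, which this guide does not show. `find_project_root` is a hypothetical helper that walks upward from a starting directory until it finds a `project.yml` file, which is one plausible way a script in a child directory could locate the project root.

```python
from pathlib import Path
from typing import Optional


def find_project_root(start: Path, marker: str = "project.yml") -> Optional[Path]:
    """Return the nearest directory at or above `start` that contains `marker`.

    Hypothetical helper for illustration only; not part of Curie's API.
    """
    start = start.resolve()
    # Check the starting directory first, then each parent up to the filesystem root.
    for candidate in (start, *start.parents):
        if (candidate / marker).is_file():
            return candidate
    return None


if __name__ == "__main__":
    # Prints the detected root for the current directory, or None if no marker is found.
    print(find_project_root(Path.cwd()))
```

Returning `None` instead of raising leaves it to the caller to decide how to report a missing `project.yml`.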

<div class="alert alert-block alert-success">Please refer to the module's README for more information on the project's structure and best practices.</div>

## Initializing a Project

In [None]:
cd parentDirectory
curie -i CurieProjectName

## Compile a Pipeline

Compiled pipelines are stored in the `<project>/scripts/compiled/<pipe>/<mode>` directory.

**Note:** There is no way to compile only one table in a pipeline. You must compile the entire pipeline.
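
As a rough illustration of the directory convention above, the following sketch builds the expected output location for a pipeline and mode. `compiled_dir` is a hypothetical helper, not a Curie function. It assumes that `run` and `save` are the only valid values for `<mode>`.

```python
from pathlib import Path

# Assumption: the two modes named in this guide are the only valid ones.
VALID_MODES = ("run", "save")


def compiled_dir(project_root: str, pipe: str, mode: str) -> Path:
    """Build <project>/scripts/compiled/<pipe>/<mode> (hypothetical helper)."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    return Path(project_root) / "scripts" / "compiled" / pipe / mode


if __name__ == "__main__":
    # Shows where compiled output for 'my_pipe' in run mode would be expected.
    print(compiled_dir("CurieProjectName", "my_pipe", "run"))
```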

In [None]:
@REM Compile a pipeline in RUN mode
curie -r my_pipe -c
@REM OR
curie --run my_pipe --compile

In [None]:
from curie import Curie

c = Curie()
mode = 'run'
# Select the 'test-etl' pipeline and set its mode
c.path('test-etl').mode(mode)

## Executing a Pipeline
**Modes** - `run` and `save`.

1. **Run mode** - Will execute the pipeline against the database, affecting the associated tables.

2. **Save mode** - Will execute the pipeline against the database and save the results to a file.



**All Tables** - To target every table, pass a single period in place of the list of tables. (Example: `curie --run pipe --tables .` or `curie --save pipe --tables .`)

**Using a list of tables** - In Python, pass a list of table names to choose which tables the pipeline runs against. On the command line, pass a space-separated list of tables. (Example: `curie --run pipe --tables table1 table2 table3` or `curie --save pipe --tables table1 table2 table3`)
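
The two selection styles above can be sketched as a small normalisation step. This is only an illustration of the convention, not Curie's implementation. `resolve_tables` and the list of available tables are hypothetical, and rejecting unknown table names is an assumption made here for the example.

```python
from typing import Iterable, List, Union

ALL_TABLES = "."  # the single-period shorthand for "every table"


def resolve_tables(selector: Union[str, Iterable[str]], available: Iterable[str]) -> List[str]:
    """Turn a table selector into an explicit list of table names.

    Hypothetical helper for illustration; not part of Curie's API.
    """
    known = list(available)
    if selector == ALL_TABLES:
        return known
    # A single name is treated as a one-item list.
    requested = [selector] if isinstance(selector, str) else list(selector)
    # Assumption: names missing from the pipeline are reported as an error.
    unknown = [name for name in requested if name not in known]
    if unknown:
        raise ValueError(f"Unknown tables: {unknown}")
    return requested


if __name__ == "__main__":
    tables = ["test_table_1", "test_table_2", "test_table_3"]
    print(resolve_tables(".", tables))
    print(resolve_tables(["test_table_1", "test_table_2"], tables))
```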

In [None]:
@REM Run a pipeline in RUN mode for all tables
curie -r my_pipe -t .
@REM OR
curie --run my_pipe --tables .

@REM Run a pipeline in RUN mode for a specific table
curie -r my_pipe -t my_table
@REM OR
curie --run my_pipe --tables my_table

@REM Run a pipeline in SAVE mode for all tables
curie -s my_pipe -t .
@REM OR
curie --save my_pipe --tables .

@REM Run a pipeline in SAVE mode for a specific table
curie -s my_pipe -t my_table
@REM OR
curie --save my_pipe --tables my_table

In [None]:
from curie import Curie

c = Curie()

# Run all tables using the 'test-etl' pipeline
c.path('test-etl').mode('run').execute('.')
# Run specific tables using the 'test-etl' pipeline
c.path('test-etl').mode('run').execute(['test_table_1', 'test_table_2'])

# Save all tables using the 'test-etl' pipeline
c.path('test-etl').mode('save').execute('.')
# Save specific tables using the 'test-etl' pipeline
c.path('test-etl').mode('save').execute(['test_table_1', 'test_table_2'])

## Cleaning Up Your Project

**Facets** - Elements in your project that were generated by Curie. These include datasets and compiled pipelines. <div class="alert alert-block alert-warning">If you have seed datasets in your project, ensure they are not in `<project>/data/<pipe>/<run,save>/` before running the cleanup command.</div>
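
Before cleaning, it can help to list what currently sits in the directories named in the warning above. The sketch below is a hypothetical pre-flight check, not part of Curie. It assumes that anything under `<project>/data/<pipe>/run/` or `<project>/data/<pipe>/save/` may be removed by a cleanup, which is inferred from the warning and not from Curie's source.

```python
from pathlib import Path
from typing import List


def files_at_risk(project_root: str, pipe: str) -> List[Path]:
    """List files under <project>/data/<pipe>/run and <project>/data/<pipe>/save.

    Hypothetical helper for illustration; not part of Curie's API.
    """
    found: List[Path] = []
    for mode in ("run", "save"):
        mode_dir = Path(project_root) / "data" / pipe / mode
        # Skip mode directories that do not exist yet.
        if mode_dir.is_dir():
            found.extend(p for p in mode_dir.rglob("*") if p.is_file())
    return sorted(found)


if __name__ == "__main__":
    # Review this list and move any seed datasets elsewhere before cleaning.
    for path in files_at_risk(".", "my_pipe"):
        print(path)
```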

In [None]:
@REM Clean all facets in my pipeline
curie --clean my_pipe
@REM Clean datasets from my pipeline
curie --clean my_pipe --facet data
@REM Clean compiled scripts from my pipeline
curie --clean my_pipe --facet compiled
@REM Clean everything from all pipelines
curie --clean .

In [None]:
from curie import Curie

c = Curie()
# Clean all facets of the 'test-etl' pipeline
c.path('test-etl').clean('.')
# Clean specific facets of the 'test-etl' pipeline
c.path('test-etl').clean('compiled', 'data')
# Clean all facets of all pipelines
c.clean('.')
# Clean specific facets of all pipelines
c.clean('compiled', 'data')