# Directory as entry point

This example shows how you can build a pipeline from a directory with scripts
(without defining a `pipeline.yaml` file).

You can use this sample project structure as a starting template for new
script-based or notebook-based projects (for notebooks, replace the `*.py`
files with `*.ipynb` files).
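
Without a `pipeline.yaml`, each script has to describe itself. The sketch below shows what the top of a task script might look like. It assumes the convention of declaring `upstream` and `product` variables in a cell tagged `parameters` (written here in jupytext's percent format). The scripts in this example are not reproduced in this README, so check them for the exact layout. The data file name is hypothetical.

```python
# Hypothetical contents of a task script such as clean-actions.py.

# %% tags=["parameters"]
# Assumed convention: the tasks this script depends on, and the files it writes.
upstream = ["get-actions"]
product = {
    "nb": "output/clean-actions.ipynb",  # notebook product for this task
    "data": "output/clean-actions-example.csv",  # hypothetical data file
}

# %%
# Sanity check so that a typo does not send results outside output/.
assert all(path.startswith("output/") for path in product.values())
```

The `nb` and `data` keys mirror the ones visible in the `Product` column of the status output further down.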

## Setup environment

(**Note**: Only required if you are running this example on your computer; not
required if using Binder/Deepnote.)

~~~sh
conda env create --file environment.yaml
conda activate spec-api-directory
~~~

## Pipeline description

This pipeline contains 5 steps. The last task trains a model and outputs a
report and a trained model file. To get the pipeline description:


In [1]:
%%sh
ploomber status --entry-point '*.py'

name         Last run     Outdated?    Product      Doc (short)    Location
-----------  -----------  -----------  -----------  -------------  ----------
get-actions  a day ago    False        {'nb': File  Get actions    get-
             (Nov 20, 20               (/Users/Edu  data and       actions.py
             at 18:53)                 /dev/projec  make some
                                       ts-ploomber  charts
                                       /spec-api-d
                                       irectory/ou
                                       tput/get-ac
                                       tions.ipynb
                                       ), 'data':
                                       File(/Users
                                       /Edu/dev/pr
                                       ojects-ploo
                                       mber/spec-
                                       api-directo
                                       ry/output/a
                

100%|██████████| 5/5 [00:00<00:00, 9593.56it/s]


`--entry-point '*.py'` means "all files with a `.py` extension are tasks in the
pipeline". If all the files in the current directory are tasks, you can also
use the shortcut `--entry-point .`.
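
To preview which files a pattern such as `*.py` would select, the standard-library `glob` module gives a rough approximation. This is only a sketch: it assumes ploomber's matching behaves like ordinary glob expansion, which this example does not confirm. The script names other than `get-actions.py` are assumed from the task names, and `match_entry_point` is a hypothetical helper, not part of ploomber.

```python
import glob
import os
import tempfile

# Only get-actions.py appears in the status output; the other file names
# are assumed to mirror the task names in the build report.
scripts = ["get-actions.py", "get-users.py", "clean-actions.py",
           "clean-users.py", "train-model.py"]
# Non-task files that typically sit next to the scripts.
others = ["environment.yaml", "README.md"]


def match_entry_point(directory, pattern):
    """Return the sorted file names in `directory` that match `pattern`."""
    # Escape the directory so only `pattern` is treated as a glob.
    paths = glob.glob(os.path.join(glob.escape(directory), pattern))
    return sorted(os.path.basename(p) for p in paths)


with tempfile.TemporaryDirectory() as tmp:
    # Create empty placeholder files to match against.
    for name in scripts + others:
        open(os.path.join(tmp, name), "w").close()
    matched = match_entry_point(tmp, "*.py")

print(matched)
# ['clean-actions.py', 'clean-users.py', 'get-actions.py', 'get-users.py', 'train-model.py']
```

Only the five scripts are selected; `environment.yaml` and `README.md` are left out, which is why a pattern is the safer choice when the directory holds non-task files.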

## Build the pipeline from the command line


In [2]:
%%sh
mkdir output
ploomber build --entry-point '*.py'

name           Ran?      Elapsed (s)    Percentage
-------------  ------  -------------  ------------
clean-actions  True          3.09134       35.1007
train-model    True          5.71573       64.8993
get-actions    False         0              0
get-users      False         0              0
clean-users    False         0              0


mkdir: output: File exists
100%|██████████| 5/5 [00:00<00:00, 9541.18it/s]
Building task "clean-actions":   0%|          | 0/2 [00:00<?, ?it/s]
Executing:   0%|          | 0/4 [00:00<?, ?cell/s]
Executing:  25%|██▌       | 1/4 [00:01<00:04,  1.39s/cell]
Executing: 100%|██████████| 4/4 [00:03<00:00,  1.30cell/s]
Building task "train-model":  50%|█████     | 1/2 [00:03<00:03,  3.09s/it]
Executing:   0%|          | 0/14 [00:00<?, ?cell/s]
Executing:   7%|▋         | 1/14 [00:00<00:10,  1.29cell/s]
Executing:  14%|█▍        | 2/14 [00:04<00:20,  1.70s/cell]
Executing:  43%|████▎     | 6/14 [00:04<00:09,  1.20s/cell]
Executing:  71%|███████▏  | 10/14 [00:04<00:03,  1.17cell/s]
Executing: 100%|██████████| 14/14 [00:05<00:00,  2.46cell/s]
Building task "train-model": 100%|██████████| 2/2 [00:08<00:00,  4.41s/it]


Output is stored in the `output/` directory.

## Where to go from here

Building pipelines from a collection of scripts/notebooks without a
`pipeline.yaml` is quick and easy, but has limited features. See the
[`spec-api-python/`](../spec-api-python/README.ipynb) example to learn how you
can declare your tasks in a YAML file.