# Start a dlt pipeline from scratch
---

## Fundamental command line

- Basic command line on macOS:
    - `pwd`: Print working directory
        Example: `pwd`
    - `ls`: List directory contents
        Example: `ls`
    - `cd`: Change directory
        Example: `cd /path/to/directory`
    - `cd ..`: Navigate up one directory level from the current working directory

    - `mkdir`: Make directory
        Example: `mkdir new_folder`
    - `touch`: Create a New File
        Example: `touch filename.txt`
    - `cp`: Copy files or directories
        Example: `cp source.txt destination.txt`
    - `mv`: Move (or Rename) Files or Directories
        Example: `mv oldname.txt newname.txt`
    - `rm`: Remove Files or Directories
        Example: `rm file.txt`. For directories: `rm -r directory_name`

    - Show or hide hidden files and folders on macOS: `Shift+Command+.`

    - Comment out multiple lines in an editor such as VS Code: `Cmd+/` on macOS, `Ctrl+/` on Windows.

    - Install dlt via conda: `conda install -c conda-forge dlt`

    - About conda: [getting started guide](https://conda.io/projects/conda/en/latest/user-guide/getting-started.html)

- Basic command line on Windows:
    - Open command prompt: Press `Win + R`, type `cmd`, and press `Enter`.
    - `dir`: List directory contents
        Example: `dir`
    - `cd`: Change directory
        Example: `cd C:\Path\To\Directory`
    - `cls`: Clear the screen
        Example: `cls`
    - `exit`: Close the command prompt
        Example: `exit`
    - `cd`: Display the current directory (when run without arguments)
        Example: `cd`
    - `cd ..`: Move up one level

    - `cd \`: Move to the root directory of the current drive
    
    - `copy`: Copy files from one location to another
        Example: `copy file1.txt D:\Backup\file1.txt`
    - `move`: Move or rename files and directories
        Example: `move file1.txt D:\Backup\`
    - `del`: Delete files
        Example: `del file1.txt`
    - `mkdir`: Create a new directory
        Example: `mkdir NewFolder`
    - `rmdir`: Remove an empty directory
        Example: `rmdir NewFolder`. For a directory with contents: `rmdir /s NewFolder`

---

## Python virtual environment with `venv` via VS Code

- Create a project: go to `File` > `Open Folder...` and select the project folder.

- Create the virtual environment: open the `Command Palette`, select `Python: Create Environment`, select `Venv`, then select Python 3.11.

- Open a terminal: go to `View` and select `Terminal`.

- Activate the virtual environment on macOS: `source .venv/bin/activate`, and on Windows: `.venv\Scripts\activate`

- Deactivate the virtual environment: `deactivate`

- Usage: `pip <command> [options]`. More details: [pip CLI reference](https://pip.pypa.io/en/stable/cli/)

- General options for `pip`:
    - `-h` => Show help. For example, `pip -h`.

- Commands:
    - `install` => Install packages. For example, `pip install pandas`
    - `uninstall` => Uninstall packages. For example, `pip uninstall pandas`
    - `freeze` => Show installed packages in requirements format. For example, `pip freeze` and `pip freeze > requirements.txt`
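
To confirm from Python that the virtual environment is active and to see what is installed in it, a small check like the following can help. This is a sketch using only the standard library; the function names `in_virtualenv` and `installed_packages` are illustrative, and the listing only approximates what `pip freeze` prints.

```python
import sys
from importlib import metadata


def in_virtualenv() -> bool:
    """Return True when Python runs inside a venv-style virtual environment."""
    # Inside a virtual environment, sys.prefix points at the environment,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def installed_packages() -> list[str]:
    """List installed distributions as `name==version`, similar to `pip freeze`."""
    lines = []
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip entries whose metadata has no name
            lines.append(f"{name}=={dist.version}")
    return sorted(lines, key=str.lower)


print("Inside a virtual environment:", in_virtualenv())
print("\n".join(installed_packages()))
```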
---


## Create the first `dlt` pipeline

> A `pipeline` is a connection that moves data from your Python code to a destination. The pipeline accepts dlt `sources` or `resources` as well as `generators`, `async generators`, `lists` and any `iterables`. Once the pipeline runs, all resources are evaluated and the data is loaded at the destination.

To set up the pipeline:

```
dlt.pipeline(pipeline_name, destination, dataset_name)
```

- `pipeline_name`: the name of the pipeline, used to identify it in trace and monitoring events and to restore its state and data schemas on subsequent runs.

- `destination`: the name of the `destination` to which dlt will load the data.

- `dataset_name`: the name of the dataset to which the data will be loaded. It may be a `schema` in a relational database or a folder.
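
The three arguments above can be sketched as follows, assuming `dlt` is installed with the DuckDB extra (`pip install "dlt[duckdb]"`). The names `quick_start` and `mydata`, the `PIPELINE_SETTINGS` dictionary and the `create_pipeline` helper are illustrative choices, not part of dlt.

```python
# Illustrative settings; "quick_start" and "mydata" are made-up names.
PIPELINE_SETTINGS = {
    "pipeline_name": "quick_start",  # identifies the pipeline and its stored state
    "destination": "duckdb",         # where dlt loads the data
    "dataset_name": "mydata",        # schema (or folder) that receives the tables
}


def create_pipeline():
    """Create a dlt pipeline from the settings above."""
    # Imported here so the settings can be inspected without dlt installed.
    import dlt

    return dlt.pipeline(**PIPELINE_SETTINGS)
```

Calling `create_pipeline()` should return a pipeline object whose `pipeline_name` and `dataset_name` attributes echo these settings.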

To run the pipeline:

`pipeline.run(data, write_disposition, table_name)`

- `data` (the first argument) may be a dlt `source`, a `resource`, a `generator` function, or any Iterator / Iterable (e.g. a list or the result of the `map` function).

- `write_disposition` controls how to write data to a table. Defaults to "append".
    - `append` will always add new data at the end of the table.
    - `replace` will replace existing data with new data.
    - `skip` will prevent data from loading.
    - `merge` will deduplicate and merge data based on primary_key and merge_key hints.

- `table_name`: required when the table name cannot be inferred from the resource or from the name of the generator function, for example when `data` is a plain list.
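
Putting the `run` arguments together, a minimal sketch might look like this. It assumes `dlt` with the DuckDB extra is installed; the generator `sample_rows`, the helper `load_sample`, the table name `users` and the pipeline and dataset names are all hypothetical.

```python
def sample_rows():
    """Yield a few illustrative records; any iterable of dicts would do."""
    yield {"id": 1, "name": "Alice"}
    yield {"id": 2, "name": "Bob"}


def load_sample(write_disposition: str = "replace"):
    """Load the sample rows into a local DuckDB database with dlt."""
    # Imported here so the generator can be tried without dlt installed.
    import dlt

    pipeline = dlt.pipeline(
        pipeline_name="quick_start",
        destination="duckdb",
        dataset_name="mydata",
    )
    # table_name is passed explicitly so the target table does not
    # depend on name inference.
    load_info = pipeline.run(
        sample_rows(),
        table_name="users",
        write_disposition=write_disposition,
    )
    return load_info
```

`print(load_sample())` should print a summary of the load. With `"replace"`, repeated runs are expected to leave the same two rows in `users`, whereas `load_sample("append")` would add two more rows on each run.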


### Start creating the first dlt pipeline

- Install packages: `pip install dlt pandas "dlt[duckdb]" streamlit`

- Run the Python script from the activated virtual environment: `python file-name.py`



- Show the results in the Streamlit app: `dlt pipeline pipeline-name show`

- Quit the Streamlit app: `Ctrl+C`

