#### Analytics Engineering

# Intro to Analytics Engineering / Pipelines

------------------------

### 🏗️ Project "Weather vs Flights Data" <br>🎯 Goal: Construct an ELT data pipeline using Python, pandas, more advanced SQL, SQLAlchemy, and dbt




------------------------

This week, you will learn how to connect to your database with Python and how to design and automate a pipeline. Next week, we can use the collected data in a dashboard to visualize the results.

Over the next few days, you will work with a comprehensive real-world API from https://dev.meteostat.net/api/. It gives us access to historical weather records (daily and hourly) from almost anywhere in the world. We will aim to obtain weather data from the last year for 3 airport locations: New York (JFK), Los Angeles (LAX), and Miami (MIA).
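As a rough sketch of what "the last year for 3 airports" could look like in code, the snippet below prepares one set of query parameters per airport. The coordinates are approximate, and the parameter names (`lat`, `lon`, `start`, `end`) are assumptions for illustration; check the Meteostat documentation for the exact endpoint and parameters before sending any request.

```python
from datetime import date, timedelta

# Approximate airport coordinates (latitude, longitude), for illustration only.
AIRPORTS = {
    "JFK": (40.6413, -73.7781),
    "LAX": (33.9416, -118.4085),
    "MIA": (25.7959, -80.2870),
}


def build_daily_params(today: date) -> dict:
    """Return one query-parameter dict per airport covering the last 365 days."""
    start = today - timedelta(days=365)
    return {
        code: {
            "lat": lat,                  # assumed parameter name
            "lon": lon,                  # assumed parameter name
            "start": start.isoformat(),  # assumed parameter name, ISO date
            "end": today.isoformat(),    # assumed parameter name, ISO date
        }
        for code, (lat, lon) in AIRPORTS.items()
    }


# A fixed date keeps the example reproducible; use date.today() in practice.
params = build_daily_params(date(2023, 6, 30))
print(params["JFK"])
```

No request is sent here; the dictionaries would later be passed to whichever HTTP client you use.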


#### Milestones

1. Access data from a real-world API using a Python script

2. Import the raw data into a Postgres database using SQLAlchemy

3. Connect the database to dbt Cloud, which will be used to parse and prepare the data

4. Clean and prepare the data on dbt Cloud using CTEs and the power of dbt

5. Answer questions with the data and save the answers in a data mart for your stakeholders


----------

### What is the use of a `.env` file in projects?
### How do you store sensitive information like API keys or database credentials in a `.env` file?

### 🔑🔑🔑🔑`.env` Files  🔑🔑🔑🔑🔑


It's common for teams to maintain distinct "environments" for their codebase. These separate environments allow thorough testing before changes are deployed to the production environment, where they interact with end users. When there are multiple environments, developers often use multiple `.env` files to store credentials. For instance, they might have one `.env` file containing database keys for development and another for production.

This separation of code and credentials lowers the risk of unauthorized individuals gaining access to sensitive data in the cloud.

**.env** files are specifically designed to store credentials in a key-value format for the various services that the program utilizes. These files are intended to be stored locally and not shared in online code repositories, ensuring that sensitive information remains confidential. Each developer within a team typically manages one or more `.env` files, tailored for the specific environments they are working on.
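One way to keep one file per environment is to encode the environment name in the file name. The sketch below assumes a hypothetical naming scheme (`.env.development`, `.env.staging`, `.env.production`); neither the scheme nor the helper `env_file_for` comes from this course, so adapt them to your team's conventions.

```python
from pathlib import Path

# Hypothetical set of environments; adjust to whatever your team uses.
ALLOWED_ENVIRONMENTS = {"development", "staging", "production"}


def env_file_for(environment: str, root: Path = Path(".")) -> Path:
    """Return the path of the .env file belonging to the given environment."""
    if environment not in ALLOWED_ENVIRONMENTS:
        raise ValueError(f"Unknown environment: {environment!r}")
    return root / f".env.{environment}"


print(env_file_for("development"))
```

The resulting path could then be handed to whichever loader you use to read the file.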

## Key uses and benefits of using a .env file in projects:

- ***Centralized Configuration:*** The .env file stores all project configuration settings, like API keys and database connections, separately from the code, making management easier.

- ***Environment Variables:*** It allows you to set environment variables that your application can use, promoting flexibility and separation of configuration from code.

- ***Storing Sensitive Information:*** Sensitive data, such as passwords and access tokens, can be kept in the .env file instead of the codebase. This protects it from exposure as long as the file itself stays private.

- ***Ease of Deployment:*** Settings can be customized for different environments (development, staging, production) without needing code changes.

- ***Git Ignore:*** The .env file should be added to .gitignore to prevent it from being tracked by version control systems, protecting sensitive information.

- ***Readability and Maintainability:*** Improves code readability and maintainability by keeping configuration settings organized and easy to update.

- ***Security Best Practices:*** Keeps sensitive information out of version-controlled files, reducing the risk of accidental exposure and simplifying access control management.

- ***Library Usage:*** Libraries like python-dotenv in Python projects can read the .env file and set environment variables, making them accessible in the application.
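To make the key-value idea concrete, here is a deliberately simplified parser for `.env`-style text. It is only an illustration of the concept and not how python-dotenv is implemented: it ignores inline comments, multi-line values, and variable expansion, all of which a real library handles.

```python
def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines into a dict, skipping blanks and comment lines."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Remove one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        values[key.strip()] = value
    return values


sample = """
# database settings
POSTGRES_HOST = 'localhost'
POSTGRES_PORT=5432
"""
print(parse_env(sample))
```

Note that every value comes back as a string, so a port number would need an explicit `int(...)` conversion where a number is required.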

#### Usage

In this section, we’ll walk through how to use a `.env` file in a basic Python project.

1. To begin, head to the root of your **week folder** and create a `.env` file containing the credentials you’d like injected into your codebase. It may look something like this:

```python
POSTGRES_USER = 'saramaras'
POSTGRES_PASS = 'OmNU2guAJkp3KwDE' # never add your password to a jupyter notebook!
POSTGRES_HOST = 'data-analytics-course-2.c8g8r1deus2v.eu-central-1.rds.amazonaws.com'
POSTGRES_PORT = '5432'
POSTGRES_DB = 'postgres'
POSTGRES_SCHEMA = 'sara_dont_touch'
```

2. Keep in mind that the `.env` file should NOT be uploaded to **GitHub**. For that reason, `.env` should always be listed in your `.gitignore` file (it should be there already). A file called `.env_example` can be uploaded instead to show what the `.env` file should contain, without any real values.
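A `.env_example` file can be derived from the real file by keeping the keys and blanking the values. The helper below is a minimal sketch of that idea; `make_env_example` is a hypothetical name, and the sketch assumes simple single-line `KEY = value` entries.

```python
def make_env_example(env_text: str) -> str:
    """Return a copy of the .env text with every value replaced by ''."""
    out = []
    for line in env_text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#") and "=" in stripped:
            key = stripped.split("=", 1)[0].strip()
            out.append(f"{key} = ''")   # keep the key, drop the secret
        else:
            out.append(line)            # keep blank lines and comments as-is
    return "\n".join(out)


env_text = "POSTGRES_USER = 'my_user'\n# port of the database\nPOSTGRES_PORT = '5432'"
print(make_env_example(env_text))
```

Check the generated text before committing it, since values hidden in comments would be kept unchanged.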


3. Now, to inject the secrets into your project, you can use a popular module like python-dotenv. It parses the `.env` file and makes your secrets accessible within your code, either as a dictionary or as environment variables. Go ahead and install the module:

In [None]:
pip install python-dotenv

4. Import the module at the top of your notebook or start script, using one of the two options below:

**First option:** `dotenv_values()` only reads the `.env` file and returns its contents as a dictionary. It does not change your environment variables.

In [None]:
# getting API and DB credentials

from dotenv import dotenv_values

config = dotenv_values()
pg_user = config['POSTGRES_USER']  # align the key label with your .env file !
pg_host = config['POSTGRES_HOST']
pg_port = config['POSTGRES_PORT']
pg_db = config['POSTGRES_DB']
#pg_schema = config['POSTGRES_SCHEMA']
pg_pass = config['POSTGRES_PASS']


In [None]:
pg_host
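Credentials like these are typically combined into a connection URL for SQLAlchemy. The sketch below builds such a URL by hand using only the standard library; `build_postgres_url` is a hypothetical helper, and the example values are made up. Special characters in the user name or password are URL-encoded so that characters like `@` or `/` do not break the URL.

```python
from urllib.parse import quote_plus


def build_postgres_url(user: str, password: str, host: str, port: str, db: str) -> str:
    """Build a postgresql:// URL, URL-encoding the user name and password."""
    return (
        f"postgresql://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{db}"
    )


# Made-up values; in the notebook you would pass pg_user, pg_pass, and so on.
url = build_postgres_url("demo_user", "p@ss/word", "localhost", "5432", "postgres")
print(url)  # postgresql://demo_user:p%40ss%2Fword@localhost:5432/postgres
```

Avoid printing the real URL in a notebook, because it contains the password in readable form.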

**Second option:** `load_dotenv()` actually loads the key-value pairs into the environment variables of your running process, where they can be read with `os.getenv()`.

In [None]:
# getting API and DB credentials (uncomment to use this option)

# import os
# from dotenv import load_dotenv

# load_dotenv()
# pg_user = os.getenv('POSTGRES_USER')  # align the key label with your .env file !
# pg_host = os.getenv('POSTGRES_HOST')
# pg_port = os.getenv('POSTGRES_PORT')
# pg_db = os.getenv('POSTGRES_DB')
# pg_schema = os.getenv('POSTGRES_SCHEMA')
# pg_pass = os.getenv('POSTGRES_PASS')

In [None]:
pg_host
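`os.getenv()` returns `None` when a key is missing, which can lead to confusing errors later on. A small check like the one below fails early with a clear message instead. It is a sketch using a hypothetical helper `read_required`; the demo passes a plain dictionary in place of `os.environ` so it runs without a `.env` file.

```python
import os

REQUIRED_KEYS = ["POSTGRES_USER", "POSTGRES_PASS", "POSTGRES_HOST",
                 "POSTGRES_PORT", "POSTGRES_DB"]


def read_required(keys, environ=os.environ) -> dict:
    """Return the requested settings, raising if any is missing or empty."""
    missing = [key for key in keys if not environ.get(key)]
    if missing:
        raise KeyError(f"Missing settings: {', '.join(missing)}")
    return {key: environ[key] for key in keys}


# Demo with made-up values instead of the real environment.
fake_environ = {
    "POSTGRES_USER": "demo_user",
    "POSTGRES_PASS": "demo_pass",
    "POSTGRES_HOST": "localhost",
    "POSTGRES_PORT": "5432",
    "POSTGRES_DB": "postgres",
}
settings = read_required(REQUIRED_KEYS, fake_environ)
print(settings["POSTGRES_HOST"])
```

Calling `read_required(REQUIRED_KEYS)` without the second argument would check the real environment variables of the running process.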

Cool. We’ve successfully added a `.env` file to the project with some secrets and accessed those secrets in the code. Additionally, as long as `.env` is listed in `.gitignore`, your secrets will stay on your machine when you push your code via git.