# Section 0 — Introduction & Environment Setup

A step‑by‑step guide to setting up a reliable Python environment for data and analytics work.

## 1) Install Python
Choose one path:

**Option A — Official Python (recommended for most)**
- Download: https://www.python.org/downloads/
- Windows: check **"Add python.exe to PATH"** during installation (older installers label it **"Add Python to PATH"**).

**Option B — Miniconda / Anaconda (data‑friendly)**
- Miniconda (lighter): https://docs.conda.io/en/latest/miniconda.html  
- Anaconda (full): https://www.anaconda.com/download

> Tip: Managing multiple versions on macOS/Linux? Consider **pyenv** — https://github.com/pyenv/pyenv


### Verify your Python installation

```python
import sys
print("Python executable:", sys.executable)
print("Python version:", sys.version)
```
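
If a project needs a minimum Python version, the check above can be turned into a pass/fail test. This is a minimal sketch: `MIN_VERSION` and `meets_minimum` are hypothetical names, and the `(3, 9)` floor is an assumption for illustration, not a requirement of this guide.

```python
import sys

# Hypothetical minimum version; adjust to what your project actually needs.
MIN_VERSION = (3, 9)

def meets_minimum(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) of `version_info` is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)

if meets_minimum():
    print("OK: Python", ".".join(map(str, sys.version_info[:3])))
else:
    print("Python is older than", ".".join(map(str, MIN_VERSION)))
```

Comparing tuples avoids the pitfalls of comparing version strings, where `"3.10"` sorts before `"3.9"`.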


## 2) Create & Use Virtual Environments
Virtual environments isolate project dependencies, so packages installed for one project cannot break another.

### Option A — `venv` (built‑in)
```bash
# macOS/Linux
python3 -m venv .venv
source .venv/bin/activate

# Windows (PowerShell)
python -m venv .venv
.venv\Scripts\Activate.ps1

# Deactivate
deactivate
```

### Option B — `conda` environments
```bash
conda create -n myenv python=3.11 -y
conda activate myenv
```

> When to choose:
> - **venv**: simple projects, fast startup, standard tooling.  
> - **conda**: scientific stacks, mixing Python/R, or managing native libs.


### Check interpreter & pip in the current environment

```python
import shutil
import sys

# Both paths should point inside the active environment
print("Interpreter:", sys.executable)
print("pip path:", shutil.which("pip"))
```
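
To confirm that an environment is actually active, the paths above can be complemented with a direct check. The sketch below assumes the standard behavior that `sys.prefix` differs from `sys.base_prefix` inside a `venv`, and that conda sets the `CONDA_DEFAULT_ENV` variable on activation. `describe_environment` is a hypothetical helper name.

```python
import os
import sys

def describe_environment():
    """Report which kind of environment the current interpreter runs in."""
    # In a venv, sys.prefix points at the environment and
    # sys.base_prefix at the base installation it was created from.
    in_venv = sys.prefix != sys.base_prefix
    # conda exports CONDA_DEFAULT_ENV when an environment is activated.
    conda_env = os.environ.get("CONDA_DEFAULT_ENV")
    return {"in_venv": in_venv, "conda_env": conda_env, "prefix": sys.prefix}

info = describe_environment()
print(info)
```

If `in_venv` is `False` and `conda_env` is `None`, packages are most likely being installed into the system interpreter.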


## 3) Package Management

### `pip` (default)
- Docs: https://pip.pypa.io/
```bash
python -m pip install --upgrade pip
pip install numpy pandas matplotlib jupyterlab
pip freeze > requirements.txt
# Later
pip install -r requirements.txt
```
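
After `pip install -r requirements.txt`, it can help to confirm that the installed versions match the pins. The following is a rough sketch using the standard-library `importlib.metadata`. `check_pins` is a hypothetical helper that only understands plain `name==version` lines (the form `pip freeze` writes) and skips everything else.

```python
from importlib import metadata

def check_pins(lines):
    """Compare `name==version` pins against what is installed.

    Returns {name: (pinned, installed)} for every pin that does not match;
    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue  # skip blanks, options, and unpinned requirements
        name, pinned = (part.strip() for part in line.split("==", 1))
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches

# Usage, run from the project root:
# with open("requirements.txt") as fh:
#     print(check_pins(fh) or "All pins match.")
```

An empty result means every pinned package is installed at exactly the pinned version.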

### `pipx` (CLI tools installed globally but isolated)
- Docs: https://pipx.pypa.io/
```bash
# macOS/Linux
python3 -m pip install --user pipx
python3 -m pipx ensurepath

# Windows
python -m pip install --user pipx
python -m pipx ensurepath

# Examples
pipx install ruff
pipx install black
```

### `poetry` (optional: modern dependency & build tool)
- Docs: https://python-poetry.org/
```bash
pipx install poetry
poetry init
poetry add numpy pandas
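# Run a command inside the project's environment
# (`poetry shell` works on Poetry 1.x; Poetry 2.x moved it to a plugin)
poetry run python --version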
```


## 4) Editor & Extensions

### Visual Studio Code
- Download: https://code.visualstudio.com/
- Extensions:
  - **Python** (Microsoft)
  - **Jupyter** (Microsoft)
  - **Pylance**
  - **Black Formatter** or **Ruff**

> Command Palette → *Python: Select Interpreter* → choose your **.venv** or **conda** env.


## 5) Jupyter & Notebooks

Install if needed:
```bash
pip install jupyterlab
```
Start Jupyter:
```bash
jupyter lab
# or the classic Notebook interface (requires: pip install notebook)
jupyter notebook
```
Select the kernel that matches your virtual environment for reproducibility.


### Quick notebook test (run me)

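```python
import numpy as np
import pandas as pd

# Five rows: an integer column and a column of standard-normal samples
df = pd.DataFrame({"x": range(5), "y": np.random.randn(5)})
df  # as the last expression in a cell, this renders the table
```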


## 6) First Data Stack (Test Drive)
Install core data packages and verify versions.
```bash
pip install numpy pandas matplotlib seaborn
```


```python
import matplotlib
import numpy as np
import pandas as pd
import seaborn as sns

print("NumPy:", np.__version__)
print("Pandas:", pd.__version__)
print("Matplotlib:", matplotlib.__version__)
print("Seaborn:", sns.__version__)
```


## 7) Project Structure Template
Use a clean, reusable skeleton:

```
my-python-project/
│
├─ .gitignore
├─ README.md
├─ requirements.txt        # or pyproject.toml if using Poetry
├─ .env.example            # sample environment variables
│
├─ data/
│  ├─ raw/                 # immutable, untouched data
│  ├─ interim/             # cleaned/intermediate
│  └─ processed/           # ready for modeling/reporting
│
├─ notebooks/
│  └─ 01_exploration.ipynb
│
├─ src/
│  ├─ __init__.py
│  ├─ data/                # loaders, extractors
│  ├─ features/            # feature prep/transform
│  ├─ models/              # training/inference scripts
│  └─ viz/                 # charts/plots
│
└─ tests/
```
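
The skeleton above can be created by hand, or scripted. Here is a minimal sketch using `pathlib`. The `scaffold` function is a hypothetical helper, and it deliberately leaves out `01_exploration.ipynb`, since an empty file is not a valid notebook.

```python
from pathlib import Path

# Directory and file names are taken from the tree above.
DIRS = [
    "data/raw", "data/interim", "data/processed",
    "notebooks",
    "src/data", "src/features", "src/models", "src/viz",
    "tests",
]
FILES = [".gitignore", "README.md", "requirements.txt", ".env.example", "src/__init__.py"]

def scaffold(root):
    """Create the skeleton under `root`; safe to re-run on an existing project."""
    root = Path(root)
    for d in DIRS:
        (root / d).mkdir(parents=True, exist_ok=True)
    for f in FILES:
        (root / f).touch(exist_ok=True)  # creates empty files, keeps existing content
    return root

# Usage:
# scaffold("my-python-project")
```

Because both `mkdir` and `touch` are called with `exist_ok=True`, running the script twice does not raise and does not overwrite files you have already filled in.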

**`.gitignore` basics for Python**
```
__pycache__/
*.pyc
*.pyo
*.pyd
*.egg-info/
.venv/
.env
.ipynb_checkpoints/
```


## 8) Code Style & Quality
- **PEP 8** (style guide): https://peps.python.org/pep-0008/
- **Black** (formatter): https://black.readthedocs.io/
```bash
pip install black
black src/
```
- **Ruff** (linter+formatter): https://docs.astral.sh/ruff/
```bash
pipx install ruff
ruff check src/
ruff format src/
```
- **Pre‑commit** hooks (optional; hooks are read from a `.pre-commit-config.yaml` file in the repository root):
```bash
pip install pre-commit
pre-commit install
```


## 9) Configuration & Secrets
Use environment variables and avoid hardcoding secrets.

Install:
```bash
pip install python-dotenv
```
`.env` example:
```
DATABASE_URL=postgresql://user:pass@host:5432/db
OPENAI_API_KEY=sk-xxxx
```
Load in code:
```python
from dotenv import load_dotenv
import os

load_dotenv()
db_url = os.getenv("DATABASE_URL")
```
> Never commit real secrets — only `.env.example`.
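
`os.getenv` returns `None` when a variable is missing, which tends to surface later as a confusing error. One option is to fail fast at startup. The sketch below uses only the standard library, and `require_env` is a hypothetical helper name.

```python
import os

def require_env(name):
    """Return the value of environment variable `name`; raise if unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value

# Usage (after load_dotenv() has run):
# db_url = require_env("DATABASE_URL")
```

The error message names the variable but never prints its value, so it is safe to show in logs.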


## 10) Git & Version Control
- Install Git: https://git-scm.com/
- Initialize repo:
```bash
git init
git add .
git commit -m "Initial project structure"
```
- Create a new GitHub repo: https://github.com/new
- Link and push:
```bash
git remote add origin https://github.com/<user>/<repo>.git
git branch -M main
git push -u origin main
```


## 11) Quick Troubleshooting
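- **`python` vs `python3`**: on macOS/Linux the command is usually `python3`; on Windows it is usually `python` (or the `py` launcher). Inside an activated virtual environment, `python` points at that environment's interpreter.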
- **Activation fails (Windows PowerShell)**: allow locally created scripts for your user account (no Admin shell required):
```powershell
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
```
- **Package conflicts**: use a fresh virtual env; pin versions in `requirements.txt`.
- **Jupyter not seeing your env**: install the kernel explicitly:
```bash
pip install ipykernel
python -m ipykernel install --user --name myenv --display-name "Python (myenv)"
```


## 12) Reference Links (Official Docs)
- Python: https://docs.python.org/3/
- pip: https://pip.pypa.io/
- venv: https://docs.python.org/3/library/venv.html
- conda: https://docs.conda.io/
- Jupyter: https://jupyter.org/
- VS Code Python: https://code.visualstudio.com/docs/python/python-tutorial
- Black: https://black.readthedocs.io/
- Ruff: https://docs.astral.sh/ruff/
- python-dotenv: https://saurabh-kumar.com/python-dotenv/
