# Day 5: Git + GitHub + Ship Week 1

**Goal:** Publish a clean GitHub repo for your CSV Profiler with a clear README and reproducible setup.

## Learning Objectives

By the end of today, you can:
- Explain Git's mental model: **working tree → staging → commits**
- Use core commands: `status`, `add`, `commit`, `log`, `diff`
- Create a GitHub repo and push your code
- Write a README that lets anyone run your project

---
# Session 1: Git Essentials

## Why Git?

Git gives you:
- A **timeline** of your work (commits)
- A safe way to **experiment** (branches)
- A way to **collaborate** without overwriting (merges)
- A public portfolio of your work that employers can browse (GitHub)

## Git Mental Model

Three places for your files:

1. **Working Tree** - Files on your disk (what you see)
2. **Staging Area** - What will be in the next commit
3. **Repository** - Commit history (permanent snapshots)

```
edit files → git add → git commit → git push
```
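
To make the three areas concrete, here is a toy Python model. It is an illustration only, not how Git is implemented; the `ToyRepo` class and its method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRepo:
    """Toy model of Git's three areas (illustrative only, not real Git)."""

    working_tree: dict[str, str] = field(default_factory=dict)   # files on disk
    staging: dict[str, str] = field(default_factory=dict)        # next commit's contents
    commits: list[dict[str, str]] = field(default_factory=list)  # saved snapshots

    def edit(self, name: str, content: str) -> None:
        """Change a file on disk; nothing is staged or committed yet."""
        self.working_tree[name] = content

    def add(self, name: str) -> None:
        """Like `git add <file>`: copy the file's current content into staging."""
        self.staging[name] = self.working_tree[name]

    def commit(self) -> None:
        """Like `git commit`: store a snapshot of everything that is staged."""
        self.commits.append(dict(self.staging))


repo = ToyRepo()
repo.edit("notes.txt", "v1")
repo.add("notes.txt")
repo.commit()

repo.edit("notes.txt", "v2")  # edited again, but not staged
print(repo.working_tree["notes.txt"])  # v2
print(repo.staging["notes.txt"])       # v1
print(repo.commits[-1]["notes.txt"])   # v1
```

The second edit only touches the working tree, so the staged copy and the last commit still hold the old content until you run `add` and `commit` again.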

## Essential Git Commands

| Command | Description |
|---------|-------------|
| `git init` | Start tracking a project |
| `git status` | Show what's changed |
| `git add <file>` | Stage changes |
| `git add .` | Stage all changes |
| `git commit -m "message"` | Create a snapshot |
| `git log` | View history |
| `git diff` | See unstaged changes |
| `git diff --staged` | See staged changes |
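
The commands in the table can also be driven from Python, which is handy for trying them in a throwaway folder. This is a minimal sketch, assuming `git` is installed and on your `PATH`; the helper names `git` and `demo_first_commit`, the file `notes.txt`, and the demo identity are made up for illustration.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def git(*args: str, cwd: Path) -> str:
    """Run one git command in `cwd` and return its standard output."""
    result = subprocess.run(
        ["git", *args], cwd=cwd, capture_output=True, text=True, check=True
    )
    return result.stdout


def demo_first_commit(project: Path) -> str:
    """Init a repo, make one commit, and return the one-line history."""
    git("init", "-q", cwd=project)
    (project / "notes.txt").write_text("hello\n")
    git("add", "notes.txt", cwd=project)
    # The -c flags set a throwaway identity for this one command only
    # (assumption: the machine may have no global user.name/user.email).
    git(
        "-c", "user.name=Demo", "-c", "user.email=demo@example.com",
        "commit", "-q", "-m", "Add notes", cwd=project,
    )
    return git("log", "--oneline", cwd=project)


if __name__ == "__main__":
    if shutil.which("git") is None:
        print("git was not found on PATH")
    else:
        with tempfile.TemporaryDirectory() as tmp:
            print(demo_first_commit(Path(tmp)))
```

Running it prints a single history line: a short commit hash followed by the message `Add notes`.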

## Safe Undo

| Command | Description |
|---------|-------------|
| `git restore <file>` | Discard unstaged changes in a file (the discarded edits cannot be recovered) |
| `git restore --staged <file>` | Unstage a file |
| `git revert <hash>` | Undo a commit (creates new commit) |

---
## Exercise 1: Git Command Quiz

Fill in the blanks with the correct git commands.

```python
# Git Command Quiz - fill in the blanks (as strings)

# 1. Show current status of working directory
cmd_status = "git status"

# 2. Stage all changes for commit
cmd_stage_all = "git add ."

# 3. Create a commit with a message
cmd_commit = "git commit -m 'message'"

# 4. Show commit history
cmd_log = "git log"

# 5. Show unstaged changes (working tree vs. staging area)
cmd_diff = "git diff"

# 6. Discard unstaged changes in a file
cmd_restore = "git restore filename"

print("Your answers:")
print(f"1. Status: {cmd_status}")
print(f"2. Stage all: {cmd_stage_all}")
print(f"3. Commit: {cmd_commit}")
print(f"4. Log: {cmd_log}")
print(f"5. Diff: {cmd_diff}")
print(f"6. Restore: {cmd_restore}")
```

<details>
<summary>Click to reveal solution</summary>

```python
cmd_status = "git status"
cmd_stage_all = "git add ."
cmd_commit = "git commit -m 'message'"
cmd_log = "git log"
cmd_diff = "git diff"
cmd_restore = "git restore filename"
```
</details>

---
## Exercise 2: Create a .gitignore

Your `.gitignore` file tells Git which files **not** to track.

You should ignore:
- `.venv/` (virtual environment)
- `__pycache__/` (Python bytecode)
- `outputs/` (generated files)
- `.env` (secrets)
- OS files (`.DS_Store`, `Thumbs.db`)

```python
# Write your .gitignore content
gitignore_content = """
# Write your .gitignore rules here
# Each line is a pattern to ignore

# Python bytecode
__pycache__/
*.pyc

# Virtual environment
.venv/

# IDE settings
.vscode/
.idea/

# Local outputs
outputs/

# Environment files with secrets
.env

# OS files
.DS_Store
Thumbs.db
"""

print("Your .gitignore:")
print(gitignore_content)
```

<details>
<summary>Click to reveal solution</summary>

```
# Python bytecode
__pycache__/
*.py[cod]

# Virtual environment
.venv/
venv/
env/

# IDE settings
.vscode/
.idea/
*.swp

# Local outputs
outputs/
*.log

# Environment files with secrets
.env
.env.local

# OS files
.DS_Store
Thumbs.db
```
</details>
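
To see how these patterns apply to actual paths, the sketch below approximates the matching with Python's `fnmatch` module. It is a deliberate simplification: real `.gitignore` rules also support negation (`!`), anchoring with a leading `/`, and `**`, none of which are handled here. The function names and sample paths are hypothetical.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def load_patterns(text: str) -> list[str]:
    """Keep non-empty lines that are not comments."""
    lines = (line.strip() for line in text.splitlines())
    return [line for line in lines if line and not line.startswith("#")]


def is_ignored(path: str, patterns: list[str]) -> bool:
    """Simplified check: does any pattern match a component of `path`?"""
    parts = PurePosixPath(path).parts
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern: match any parent directory of the file.
            name = pattern.rstrip("/")
            if any(fnmatchcase(part, name) for part in parts[:-1]):
                return True
        elif any(fnmatchcase(part, pattern) for part in parts):
            # File pattern: match the file name or any directory name.
            return True
    return False


patterns = load_patterns("""
# Python bytecode
__pycache__/
*.pyc

.venv/
outputs/
.env
.DS_Store
""")

sample_paths = [
    "src/csv_profiler/cli.py",
    "outputs/report.json",
    ".env",
    "src/csv_profiler/__pycache__/cli.cpython-311.pyc",
]
for path in sample_paths:
    label = "ignored" if is_ignored(path, patterns) else "not ignored"
    print(f"{path}: {label}")
```

For an authoritative answer on a real repo, `git check-ignore -v <path>` reports which rule, if any, matches a path.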

---
# Session 2: GitHub

## Git vs GitHub

- **Git**: Version control tool on your machine
- **GitHub**: A hosted place to store Git repos + collaborate

## Connecting to GitHub

```bash
# Add remote (after creating repo on GitHub)
git remote add origin <YOUR_REPO_URL>

# Rename the current branch to main
git branch -M main

# Push (first time, with -u to set upstream)
git push -u origin main

# Push (subsequent times)
git push
```

## Clone vs Download

Prefer **cloning** a repo over downloading it as a ZIP. A clone:
- Keeps the full Git history
- Lets you commit and push
- Lets you pull updates

---
## README: Your Repo's Front Door

A good README answers:
- What is this?
- What can it do?
- How do I install it?
- How do I run it?
- What does output look like?

## README Template

```markdown
# CSV Profiler

Generate a profiling report for a CSV file.

## Features
- CLI: JSON + Markdown report
- Streamlit GUI: upload CSV + export reports

## Setup

    uv venv -p 3.11
    source .venv/bin/activate  # or .venv\Scripts\activate on Windows
    uv pip install -r requirements.txt

## Run CLI

    PYTHONPATH=src uv run python -m csv_profiler.cli profile data/sample.csv

## Run GUI

    PYTHONPATH=src uv run streamlit run app.py

## Output Files

The profiler generates:
- `report.json` - Machine-readable statistics
- `report.md` - Human-readable report
```

---
# Submission Checklist

## Your repo should have:

- [ ] `src/csv_profiler/` - Your Python package
  - [ ] `__init__.py`
  - [ ] `io.py` - CSV reading
  - [ ] `profiling.py` - Profile logic
  - [ ] `render.py` - JSON/Markdown output
  - [ ] `cli.py` - Typer CLI
- [ ] `app.py` - Streamlit app
- [ ] `data/sample.csv` - Sample data for testing
- [ ] `requirements.txt` - Dependencies
- [ ] `.gitignore` - Files to ignore
- [ ] `README.md` - Setup and usage instructions
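
Before you submit, a short script can confirm that the checklist paths exist. This is a sketch, assuming it is run from the repo root; the function name `missing_paths` is hypothetical, and the path list simply mirrors the checklist above.

```python
from pathlib import Path

# Paths from the submission checklist, relative to the repo root.
REQUIRED_PATHS = [
    "src/csv_profiler/__init__.py",
    "src/csv_profiler/io.py",
    "src/csv_profiler/profiling.py",
    "src/csv_profiler/render.py",
    "src/csv_profiler/cli.py",
    "app.py",
    "data/sample.csv",
    "requirements.txt",
    ".gitignore",
    "README.md",
]


def missing_paths(repo_root: Path) -> list[str]:
    """Return the checklist paths that do not exist under `repo_root`."""
    return [p for p in REQUIRED_PATHS if not (repo_root / p).exists()]


if __name__ == "__main__":
    missing = missing_paths(Path.cwd())
    if missing:
        print("Missing:")
        for path in missing:
            print(f"  - {path}")
    else:
        print("All checklist files are present.")
```

The script only checks that the files exist. It does not check that the CLI or the Streamlit app actually runs, so the "Done" criteria still need a manual pass.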

## "Done" means:

1. Fresh clone works
2. CLI generates `report.json` and `report.md`
3. Streamlit app runs and exports reports
4. README has clear instructions

---
# Week 1 Summary

## What You Built
- **CSV Profiler** that reads any CSV and generates:
  - JSON report (machine-readable)
  - Markdown report (human-readable)
- **CLI** using Typer
- **GUI** using Streamlit

## Skills You Learned
- Python fundamentals: types, containers, functions, classes
- File I/O: CSV reading, JSON/Markdown writing
- OOP basics: classes, properties, inheritance
- CLI development with Typer
- GUI development with Streamlit
- Git: commits, branches, pushing to GitHub

## Next Week: Data Work (ETL + EDA)

You'll learn:
- pandas for data manipulation
- Data cleaning and transformation
- Exploratory data analysis
- Data visualization

## Congratulations!

You've completed Week 1 of the AI Professionals Bootcamp!