# Level 1 — Week 7 Practice (Starter Notebook)

This notebook provides starter scaffolding for **Capstone engineering quality**:

- CLI skeleton (`argparse`)
- config loading via environment variables (`.env`)
- basic testing patterns (`pytest` and a `smoke_test.py` alternative)

## References (docs)
- Python `argparse` (official): https://docs.python.org/3/library/argparse.html
- Twelve-Factor App — config: https://12factor.net/config
- python-dotenv: https://github.com/theskumar/python-dotenv
- pytest docs: https://docs.pytest.org/
- Python errors/exceptions: https://docs.python.org/3/tutorial/errors.html


## Note

Jupyter isn’t ideal for building CLIs/tests, but this notebook shows starter code snippets you can copy into:
- `analyze.py` (CLI entrypoint)
- `src/` modules
- `tests/`

Use it as a reference while implementing your Capstone repository structure.


In [None]:
import os
import textwrap


## CLI skeleton (`argparse`)

Copy this into `analyze.py`.


In [None]:
cli_skeleton = textwrap.dedent('''
from __future__ import annotations

import argparse
from pathlib import Path

def build_parser() -> argparse.ArgumentParser:
    p = argparse.ArgumentParser(description='Capstone analyzer')
    p.add_argument('--input', required=True, help='Path to input CSV')
    p.add_argument('--out', default='output', help='Output directory')
    return p

def main() -> int:
    args = build_parser().parse_args()
    in_path = Path(args.input)
    out_dir = Path(args.out)
    # parents=True lets nested paths such as 'results/run1' be created in one call
    out_dir.mkdir(parents=True, exist_ok=True)

    # TODO: call your pipeline here
    # pipeline_run(in_path, out_dir)

    print('OK')
    return 0

if __name__ == '__main__':
    raise SystemExit(main())
''')
print(cli_skeleton)
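
Because `build_parser()` returns the parser instead of reading `sys.argv` itself, you can try it without a shell. The following is a minimal sketch that repeats the skeleton's two flags so it runs on its own; the file name `data.csv` is a made-up placeholder.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Same flags as the skeleton above, repeated so this block is self-contained.
    p = argparse.ArgumentParser(description='Capstone analyzer')
    p.add_argument('--input', required=True, help='Path to input CSV')
    p.add_argument('--out', default='output', help='Output directory')
    return p


# Passing a list to parse_args() bypasses sys.argv, which is handy in notebooks and tests.
args = build_parser().parse_args(['--input', 'data.csv'])  # 'data.csv' is hypothetical
print(args.input, args.out)

# A missing required flag makes argparse print a usage message and raise SystemExit.
try:
    build_parser().parse_args([])
except SystemExit as exc:
    exit_code = exc.code
    print('argparse exited with code', exit_code)
```

Keeping parser construction separate from `main()` is what makes this kind of check possible.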


## Config and secrets (`.env`)

In real projects, keep secrets out of code. You can load them from environment variables.

Copy this pattern into your project, and do not commit `.env` (add it to `.gitignore`).


In [None]:
dotenv_pattern = textwrap.dedent('''
import os

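# python-dotenv is optional here: if it is not installed,
# only variables already set in the environment are used.
try:
    from dotenv import load_dotenv
    load_dotenv()  # reads key=value pairs from a .env file, if one is found
except ModuleNotFoundError:
    pass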

API_KEY = os.getenv('API_KEY')
if not API_KEY:
    raise RuntimeError('Missing API_KEY. Put it in .env or environment variables.')
''')
print(dotenv_pattern)
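
One way to keep this check testable is to move it into a function that receives the environment as an argument. This is a sketch only: `Config` and `load_config` are hypothetical names for illustration, not part of python-dotenv or the standard library.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Config:
    # Hypothetical container for settings; add fields as your project needs them.
    api_key: str


def load_config(env: Optional[Mapping[str, str]] = None) -> Config:
    """Read settings from a mapping; defaults to the real process environment."""
    env = os.environ if env is None else env
    api_key = env.get('API_KEY', '')
    if not api_key:
        raise RuntimeError('Missing API_KEY. Put it in .env or environment variables.')
    return Config(api_key=api_key)


# Tests can pass a plain dict instead of modifying os.environ.
cfg = load_config({'API_KEY': 'dummy-value'})
print(cfg.api_key)
```

With this shape, importing the module no longer raises when `API_KEY` is absent; the error appears only when `load_config()` is called.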


## Testing: pytest starter

Copy this into `tests/test_pipeline.py`.

The idea is to cover three kinds of cases: normal (happy path), edge, and failure.


In [None]:
pytest_example = textwrap.dedent('''
from pathlib import Path

import pandas as pd
import pytest  # needed once the pytest.raises block below is uncommented

# from src.pipeline import pipeline_run

def test_happy_path(tmp_path: Path):
    df = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})
    in_path = tmp_path / 'in.csv'
    out_dir = tmp_path / 'out'
    df.to_csv(in_path, index=False)

    # pipeline_run(in_path, out_dir)
    # assert (out_dir / 'report.json').exists()
    assert in_path.exists()

def test_missing_file(tmp_path: Path):
    missing = tmp_path / 'missing.csv'
    # with pytest.raises(FileNotFoundError):
    #     pipeline_run(missing, tmp_path / 'out')
    assert not missing.exists()
''')
print(pytest_example)
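
To make the normal, edge, and failure cases concrete, here is a small sketch against a stand-in function. `count_rows` is a hypothetical helper invented for this example (it is not part of the Capstone pipeline); replace it with your own function when you adapt the tests.

```python
import csv
from pathlib import Path

import pytest


def count_rows(path: Path) -> int:
    """Hypothetical stand-in for a pipeline step: count data rows in a CSV."""
    with path.open(newline='') as f:
        rows = list(csv.reader(f))
    return max(len(rows) - 1, 0)  # subtract the header row, never go below zero


def test_normal(tmp_path: Path):
    p = tmp_path / 'in.csv'
    p.write_text('a,b\n1,3\n2,4\n')
    assert count_rows(p) == 2


def test_edge_header_only(tmp_path: Path):
    # Edge case: a valid file that contains a header but no data rows.
    p = tmp_path / 'empty.csv'
    p.write_text('a,b\n')
    assert count_rows(p) == 0


def test_failure_missing_file(tmp_path: Path):
    # Failure case: the function should surface the error, not hide it.
    with pytest.raises(FileNotFoundError):
        count_rows(tmp_path / 'missing.csv')
```

`tmp_path` is a built-in pytest fixture that gives each test its own empty directory, so the tests do not leave files in your repository.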


## Testing alternative: `smoke_test.py`

If you’re not ready for pytest yet, a smoke test script is acceptable in Level 1 (but pytest is preferred).

Copy into `smoke_test.py`.


In [None]:
smoke_test = textwrap.dedent('''
from pathlib import Path
import pandas as pd

def main() -> int:
    tmp = Path('output')
    tmp.mkdir(exist_ok=True)

    df = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})
    in_path = tmp / 'smoke.csv'
    df.to_csv(in_path, index=False)

    # TODO: run your pipeline
    # pipeline_run(in_path, tmp)

    print('SMOKE OK')
    return 0

if __name__ == '__main__':
    raise SystemExit(main())
''')
print(smoke_test)
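
If you would rather not write smoke-test files into `output/`, a temporary directory is one alternative. The sketch below uses only the standard library; `fake_pipeline` and `report.txt` are hypothetical placeholders for your real pipeline function and its output file.

```python
import tempfile
from pathlib import Path


def fake_pipeline(in_path: Path, out_dir: Path) -> None:
    """Hypothetical placeholder: copies the input so there is something to check."""
    out_dir.mkdir(parents=True, exist_ok=True)
    (out_dir / 'report.txt').write_text(in_path.read_text())


def main() -> int:
    # The directory and everything in it is deleted when the with-block exits.
    with tempfile.TemporaryDirectory() as tmp:
        tmp_dir = Path(tmp)
        in_path = tmp_dir / 'smoke.csv'
        in_path.write_text('a,b\n1,3\n2,4\n')

        fake_pipeline(in_path, tmp_dir / 'out')

        if not (tmp_dir / 'out' / 'report.txt').exists():
            print('SMOKE FAILED: report.txt was not written')
            return 1

    print('SMOKE OK')
    return 0


# In a standalone script, use `raise SystemExit(main())` under the usual
# `if __name__ == '__main__':` guard, as in the snippet above.
exit_code = main()
```

Returning a non-zero code on failure lets a shell or CI job detect that the smoke test did not pass.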
