# Module 3: Python Libraries

### The Scenario

Your Python script needs to find configuration files, read environment variables for database credentials, and handle command-line arguments. You've hardcoded paths like `/home/myuser/config.json`, so the script breaks on your colleague's machine.

### The Goal

By the end of this module, you will:
- Navigate the **file system** with `os` and `pathlib`
- Access **environment variables** safely
- Understand the **Python interpreter** via `sys`
- Build proper **command-line interfaces**

---

## Lesson 1: The `os` Module

### The Problem

You write `open('/home/user/data.txt')` and it works on your Linux machine. On Windows, though, paths use backslashes and the home directory lives somewhere else (typically under `C:\Users`), so the same call fails.

### The "Aha!" Moment

The `os` module provides **portable** functions that work across operating systems.

### Key Functions

| Function | Purpose | Result / Notes |
|----------|---------|--------|
| `os.getcwd()` | Current directory | `/home/user` |
| `os.listdir(path)` | List contents | `['file1.txt', ...]` |
| `os.path.exists(path)` | Check if exists | `True` / `False` |
| `os.path.join(a, b)` | Join paths | Uses the separator of the current OS |
| `os.makedirs(path)` | Create nested dirs | Like `mkdir -p` when called with `exist_ok=True` |

In [3]:
import os

# Current directory
print(f"Current directory: {os.getcwd()}")
print(f"OS name: {os.name}")  # 'posix' (Linux/Mac) or 'nt' (Windows)

# List directory
print("\nFiles in current directory:")
for item in os.listdir('.')[:5]:
    print(f"  {item}")

Current directory: /Users/aayushostwal/Desktop/aayush/courses/python
OS name: posix

Files in current directory:
  Module 6 Advanced Python Concepts.ipynb
  Module 5 Memory, GIL, & Internal Performance.ipynb
  Module 7 Modern Tooling and Packaging.ipynb
  Module 3 Python Libraries.ipynb
  Module 2 Masterclass in Python Data Structures.ipynb


In [6]:
# Path operations - cross-platform!
path = os.path.join('folder', 'subfolder', 'file.txt')
print(f"Joined path: {path}")

# Split path
dirname, filename = os.path.split('/home/user/docs/report.pdf')
print(f"Directory: {dirname}")
print(f"Filename: {filename}")

# Get extension
name, ext = os.path.splitext('report.pdf')
print(f"Name: {name}, Extension: {ext}")

Joined path: folder/subfolder/file.txt
Directory: /home/user/docs
Filename: report.pdf
Name: report, Extension: .pdf


### os.path Quick Reference

| Function | Purpose |
|----------|--------|
| `os.path.join(a, b)` | Join paths (cross-platform) |
| `os.path.split(path)` | Split into (dir, filename) |
| `os.path.splitext(path)` | Split into (name, extension) |
| `os.path.dirname(path)` | Get directory part |
| `os.path.basename(path)` | Get filename part |
| `os.path.exists(path)` | Check if exists |
| `os.path.isdir(path)` | Check if directory |
| `os.path.isfile(path)` | Check if file |
| `os.path.abspath(path)` | Get absolute path |
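
The earlier cells do not exercise `os.makedirs` or the `isdir` / `isfile` checks, so here is a minimal sketch that does. It works inside a temporary directory; the `data/raw` layout and the `report.txt` file name are made up for illustration.

```python
import os
import tempfile

# Work inside a throwaway directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as base:
    nested = os.path.join(base, "data", "raw")  # hypothetical layout

    # exist_ok=True makes the call safe to repeat, like `mkdir -p`.
    os.makedirs(nested, exist_ok=True)
    os.makedirs(nested, exist_ok=True)

    report = os.path.join(nested, "report.txt")  # hypothetical file name
    with open(report, "w", encoding="utf-8") as fh:
        fh.write("hello")

    # Collect the results while the directory still exists.
    checks = {
        "dir_exists": os.path.isdir(nested),
        "file_exists": os.path.isfile(report),
        "basename": os.path.basename(report),
        "dirname_matches": os.path.dirname(report) == nested,
        "is_absolute": os.path.isabs(os.path.abspath(report)),
    }

for name, value in checks.items():
    print(f"{name}: {value}")
```

Without `exist_ok=True`, the second `os.makedirs` call would raise `FileExistsError`.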

---

## Lesson 2: Environment Variables

### The Problem

You hardcoded the database password in your code. Now it's in your Git history, and everyone with access to the repository can see it.

### The "Aha!" Moment

Environment variables store configuration **outside your code**. They're the standard way to handle secrets, paths, and environment-specific settings.

### Best Practices

| Do | Don't |
|----|-------|
| Use `os.environ.get()` with a default for optional settings | Use `os.environ['NAME']` for optional settings (raises `KeyError` if unset) |
| Store secrets in env vars | Hardcode passwords in code |
| Use `.env` files locally | Commit `.env` to git |

In [3]:
# Reading environment variables
home = os.environ.get('HOME', 'Not set')
user = os.environ.get('USER', 'Not set')
shell = os.environ.get('SHELL', 'Not set')

print(f"HOME:  {home}")
print(f"USER:  {user}")
print(f"SHELL: {shell}")

# Safe access with default
debug = os.environ.get('DEBUG', 'false')
print(f"DEBUG: {debug}")

HOME:  /Users/aayushostwal
USER:  aayushostwal
SHELL: /bin/zsh
DEBUG: false
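
Environment values are always strings, so `DEBUG=false` is still a truthy string unless you convert it. The sketch below shows one way to handle that, along with a clear failure for settings that must be present. `env_flag`, `require_env`, and the variable names are hypothetical helpers for illustration, not part of the standard library.

```python
import os


def env_flag(name, default=False, environ=os.environ):
    """Interpret an environment variable as a boolean (hypothetical helper)."""
    raw = environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def require_env(name, environ=os.environ):
    """Return a required variable, or fail with a message naming what is missing."""
    try:
        return environ[name]
    except KeyError:
        raise RuntimeError(f"Missing required environment variable: {name}") from None


# A plain dict stands in for os.environ so the example is repeatable.
fake_env = {"DEBUG": "True", "DB_PASSWORD": "s3cret"}

print(env_flag("DEBUG", environ=fake_env))
print(env_flag("VERBOSE", environ=fake_env))
print(require_env("DB_PASSWORD", environ=fake_env))
```

Passing the mapping as a parameter keeps the helpers easy to test without touching the real environment.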


In [4]:
# Setting environment variables (current process only)
os.environ['MY_CONFIG'] = 'production'
print(f"MY_CONFIG: {os.environ['MY_CONFIG']}")

# Check existence
if 'MY_CONFIG' in os.environ:
    print("Config is set!")

# Delete
del os.environ['MY_CONFIG']
print(f"After delete: {os.environ.get('MY_CONFIG', 'Not found')}")

MY_CONFIG: production
Config is set!
After delete: Not found
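
Python does not read `.env` files on its own; projects usually rely on a third-party loader for that. To show the idea, here is a deliberately simplified sketch that handles only `KEY=VALUE` lines and `#` comments. The `load_env_file` helper and the `APP_*` names are assumptions for illustration, and real `.env` formats support more syntax than this.

```python
import os
import tempfile
from pathlib import Path


def load_env_file(path, environ=os.environ):
    """Load simple KEY=VALUE lines into `environ` without overwriting existing keys."""
    loaded = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip().strip("'\"")
        if key not in environ:  # values already set in the environment win
            environ[key] = value
            loaded[key] = value
    return loaded


with tempfile.TemporaryDirectory() as tmp:
    env_path = Path(tmp) / ".env"
    env_path.write_text(
        "# local settings\nAPP_MODE=development\nAPP_PORT='8000'\n",
        encoding="utf-8",
    )
    target = {"APP_MODE": "production"}  # pretend this came from the real environment
    loaded = load_env_file(env_path, environ=target)

print(loaded)
print(target)
```

Letting existing values win means a variable set in the deployment environment is never silently replaced by a local file.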


---

## Lesson 3: Modern Path Handling with `pathlib`

### The Problem

`os.path.join(os.path.dirname(path), 'subdir', 'file.txt')` is verbose and hard to read.

### The "Aha!" Moment

`pathlib` (Python 3.4+) provides an **object-oriented** interface to paths. The `/` operator makes path joining intuitive.

### pathlib vs os.path

| os.path | pathlib |
|---------|--------|
| `os.path.join(a, b)` | `path / 'subdir'` |
| `os.path.dirname(p)` | `path.parent` |
| `os.path.basename(p)` | `path.name` |
| `os.path.splitext(p)` | `path.stem`, `path.suffix` |
| `os.path.exists(p)` | `path.exists()` |

In [5]:
from pathlib import Path

# Special paths
home = Path.home()
cwd = Path.cwd()

print(f"Home: {home}")
print(f"CWD:  {cwd}")

# Join with / operator (readable!)
config = home / '.config' / 'myapp' / 'settings.json'
print(f"Config: {config}")

Home: /Users/aayushostwal
CWD:  /Users/aayushostwal/Desktop/aayush/courses/python
Config: /Users/aayushostwal/.config/myapp/settings.json


In [6]:
# Path properties
p = Path('/home/user/documents/report.pdf')

print(f"Path:   {p}")
print(f"Name:   {p.name}")     # report.pdf
print(f"Stem:   {p.stem}")     # report
print(f"Suffix: {p.suffix}")   # .pdf
print(f"Parent: {p.parent}")   # /home/user/documents
print(f"Parts:  {p.parts}")    # ('/', 'home', 'user', ...)

Path:   /home/user/documents/report.pdf
Name:   report.pdf
Stem:   report
Suffix: .pdf
Parent: /home/user/documents
Parts:  ('/', 'home', 'user', 'documents', 'report.pdf')
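
To check the "pathlib vs os.path" mapping for yourself, the sketch below compares both APIs on the same path. It uses `posixpath` (the POSIX implementation behind `os.path`) and `PurePosixPath` so the results are the same on any operating system; `notes.txt` is a made-up file name.

```python
import posixpath
from pathlib import PurePosixPath

raw = "/home/user/documents/report.pdf"
p = PurePosixPath(raw)

# Each entry pairs the os.path-style result with its pathlib counterpart.
pairs = {
    "dirname / parent": (posixpath.dirname(raw), str(p.parent)),
    "basename / name": (posixpath.basename(raw), p.name),
    "extension / suffix": (posixpath.splitext(raw)[1], p.suffix),
    "join / operator": (
        posixpath.join(posixpath.dirname(raw), "notes.txt"),
        str(p.parent / "notes.txt"),
    ),
}

for label, (old, new) in pairs.items():
    print(f"{label:20} {old!r} == {new!r}: {old == new}")
```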


In [7]:
# Glob - find files matching pattern
print("Notebook files:")
for notebook in Path('.').glob('*.ipynb'):
    print(f"  {notebook}")

# Recursive glob
print("\nAll Python files (recursive):")
for py_file in list(Path('.').glob('**/*.py'))[:5]:
    print(f"  {py_file}")

Notebook files:
  Module 5 Memory, GIL, & Internal Performance.ipynb
  Module 3 Python Libraries.ipynb
  Module 2 Masterclass in Python Data Structures.ipynb
  Module 4 File Systems & Module Architecture.ipynb

All Python files (recursive):
  module4_examples/utils_bad.py
  module4_examples/processor.py
  module4_examples/utils_good.py
  module4_examples/mypackage/__init__.py
  module4_examples/mypackage/core.py


### pathlib Quick Reference

| Property/Method | Purpose | Result |
|-----------------|---------|--------|
| `path.name` | Filename | `'report.pdf'` |
| `path.stem` | Name without extension | `'report'` |
| `path.suffix` | Extension | `'.pdf'` |
| `path.parent` | Parent directory | `Path('/home/user/documents')` |
| `path.parts` | Path components | `('/', 'home', ...)` |
| `path.parent / 'sub'` | Join paths | `Path('/home/user/documents/sub')` |
| `path.exists()` | Check if exists | `True` / `False` |
| `path.glob('*.py')` | Find files | Iterator of Paths |
| `path.read_text()` | Read file | String content |
| `path.write_text(s)` | Write file | Number of characters written |
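
The read, write, and glob entries are not run in the cells above, so here is a small round trip as a sketch. It writes into a temporary directory; the `config/settings.json` and `notes.txt` names are hypothetical.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    settings = root / "config" / "settings.json"  # hypothetical file name

    # Create the parent directory first; write_text does not create it.
    settings.parent.mkdir(parents=True, exist_ok=True)

    # write_text returns the number of characters written.
    written = settings.write_text('{"debug": true}', encoding="utf-8")
    content = settings.read_text(encoding="utf-8")

    # Add a non-matching file, then search recursively for JSON files.
    (root / "notes.txt").write_text("scratch", encoding="utf-8")
    json_files = sorted(
        f.relative_to(root).as_posix() for f in root.glob("**/*.json")
    )

print(written, content, json_files)
```

Passing `encoding="utf-8"` explicitly avoids depending on the platform's default encoding.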

---

## Lesson 4: The `sys` Module

### The Problem

Your script imports a module, but Python raises `ModuleNotFoundError`. To fix it, you need to understand how Python finds modules.

### The "Aha!" Moment

The `sys` module exposes Python interpreter internals:
- `sys.path`: Where Python looks for modules
- `sys.argv`: Command-line arguments
- `sys.version`: Python version info

### Key `sys` Variables

| Variable | Purpose |
|----------|--------|
| `sys.version` | Python version string |
| `sys.platform` | OS identifier (`darwin`, `linux`, `win32`) |
| `sys.path` | Module search paths |
| `sys.argv` | Command-line arguments |
| `sys.executable` | Path to Python interpreter |

In [8]:
import sys

# Version info
print(f"Python version: {sys.version}")
print(f"Major.Minor: {sys.version_info.major}.{sys.version_info.minor}")
print(f"Platform: {sys.platform}")
print(f"Executable: {sys.executable}")

Python version: 3.12.0 | packaged by Anaconda, Inc. | (main, Oct  2 2023, 12:22:05) [Clang 14.0.6 ]
Major.Minor: 3.12
Platform: darwin
Executable: /Users/aayushostwal/miniconda3/envs/plat3.12/bin/python


In [9]:
# Module search path
print("Python searches for modules in:")
for i, path in enumerate(sys.path[:5]):
    print(f"  {i}: {path or '(current directory)'}")
print(f"  ... ({len(sys.path)} total)")

# Add custom path
# sys.path.insert(0, '/my/modules')

Python searches for modules in:
  0: /Users/aayushostwal/miniconda3/envs/plat3.12/lib/python312.zip
  1: /Users/aayushostwal/miniconda3/envs/plat3.12/lib/python3.12
  2: /Users/aayushostwal/miniconda3/envs/plat3.12/lib/python3.12/lib-dynload
  3: (current directory)
  4: /Users/aayushostwal/miniconda3/envs/plat3.12/lib/python3.12/site-packages
  ... (47 total)
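
To see the search path in action, the sketch below writes a tiny module into a temporary directory, adds that directory to `sys.path`, and imports it. The module name `demo_settings_mod` is invented for this example, and the path change is undone afterwards.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical module name, chosen to avoid clashing with real packages.
    (Path(tmp) / "demo_settings_mod.py").write_text(
        "GREETING = 'hello'\n", encoding="utf-8"
    )

    sys.path.insert(0, tmp)  # make the directory importable
    try:
        importlib.invalidate_caches()  # pick up the file we just created
        module = importlib.import_module("demo_settings_mod")
        greeting = module.GREETING
        same_dir = Path(module.__file__).parent.resolve() == Path(tmp).resolve()
    finally:
        sys.path.remove(tmp)  # undo the change so later imports are unaffected
        sys.modules.pop("demo_settings_mod", None)

print(greeting, same_dir)
```

Editing `sys.path` at runtime is handy for experiments like this one; for real projects, installing the package into the environment is usually the more robust fix.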


In [10]:
# Object memory sizes (shallow: the container itself, not the objects it references)
print("Object sizes (bytes):")
print(f"  Empty list:   {sys.getsizeof([])}")
print(f"  [1,2,3]:      {sys.getsizeof([1,2,3])}")
print(f"  Empty dict:   {sys.getsizeof({})}")
print(f"  Empty string: {sys.getsizeof('')}")

# Recursion limit
print(f"\nRecursion limit: {sys.getrecursionlimit()}")

Object sizes (bytes):
  Empty list:   56
  [1,2,3]:      88
  Empty dict:   64
  Empty string: 41

Recursion limit: 3000
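
`sys.getsizeof` reports only the object itself, not the objects it refers to. The sketch below makes that visible with a one-element list holding a large string. `total_size` is a hypothetical helper that handles flat lists only, and the exact byte counts depend on your Python version and platform.

```python
import sys

payload = "x" * 10_000
wrapper = [payload]

shallow = sys.getsizeof(wrapper)   # the list object and its single pointer
element = sys.getsizeof(payload)   # the string the list points to
print(f"list: {shallow} bytes, string inside it: {element} bytes")


def total_size(items):
    """Rough size of a flat list: the container plus each element.

    Does not follow nested containers or account for shared objects.
    """
    return sys.getsizeof(items) + sum(sys.getsizeof(item) for item in items)


print(f"list plus contents: {total_size(wrapper)} bytes")
```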


---

## Lesson 5: Command-Line Arguments

### The Problem

Your script has hardcoded input/output paths. You want to run it like: `python process.py input.csv output.csv`

### The "Aha!" Moment

- `sys.argv` gives raw access to arguments
- `argparse` provides validation, help text, and type conversion

### argparse Quick Reference

| Feature | Code | Result |
|---------|------|--------|
| Positional arg | `add_argument('file')` | Required |
| Optional flag | `add_argument('-v', '--verbose')` | `-v` or `--verbose` |
| Boolean flag | `action='store_true'` | `-v` → `True` |
| With value | `type=int, default=1` | `-n 5` → `5` |
| Choices | `choices=['a', 'b']` | Only `a` or `b` |

In [11]:
# sys.argv - raw arguments
print(f"sys.argv: {sys.argv}")
print(f"Script: {sys.argv[0]}")

# In a real script run as: python script.py --verbose input.txt
# sys.argv = ['script.py', '--verbose', 'input.txt']

sys.argv: ['/Users/aayushostwal/miniconda3/envs/plat3.12/lib/python3.12/site-packages/ipykernel_launcher.py', '--f=/Users/aayushostwal/Library/Jupyter/runtime/kernel-v3582e69ec9aea41ef90de813922ed0f9d280d51b0.json']
Script: /Users/aayushostwal/miniconda3/envs/plat3.12/lib/python3.12/site-packages/ipykernel_launcher.py
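
Working with `sys.argv` directly means doing the parsing yourself. As a rough sketch of what that involves, the hypothetical `parse_raw` function below pulls a verbose flag and the positional arguments out of an argv-style list. It does no validation, type conversion, or help text, which is the gap `argparse` fills.

```python
def parse_raw(argv):
    """Split an argv-style list into a verbose flag and positional arguments."""
    args = argv[1:]  # argv[0] is the script name
    verbose = "--verbose" in args or "-v" in args
    positionals = [arg for arg in args if not arg.startswith("-")]
    return verbose, positionals


# Same shape as running: python script.py --verbose input.txt
verbose, files = parse_raw(["script.py", "--verbose", "input.txt"])
print(verbose, files)
```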


In [12]:
import argparse

# Create parser
parser = argparse.ArgumentParser(
    description='Process files',
    epilog='Example: python script.py -v input.txt output.txt'
)

# Add arguments
parser.add_argument('input', help='Input file')
parser.add_argument('output', help='Output file')
parser.add_argument('-v', '--verbose', action='store_true', help='Verbose output')
parser.add_argument('-n', '--count', type=int, default=1, help='Iterations')

# Parse (demo with a list instead of sys.argv)
args = parser.parse_args(['input.txt', 'output.txt', '-v', '-n', '5'])

print(f"Input:   {args.input}")
print(f"Output:  {args.output}")
print(f"Verbose: {args.verbose}")
print(f"Count:   {args.count}")

Input:   input.txt
Output:  output.txt
Verbose: True
Count:   5
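
The quick reference lists `choices`, which the cell above does not use. Here is a sketch of it, including what happens on invalid input: `argparse` prints a usage message to stderr and raises `SystemExit` with code 2. The program name `convert` and the `--format` option are made up for this example.

```python
import argparse
import contextlib
import io

parser = argparse.ArgumentParser(prog="convert")  # hypothetical program name
parser.add_argument("input", help="Input file")
parser.add_argument(
    "--format", choices=["csv", "json"], default="csv", help="Output format"
)

# A valid choice parses normally.
ok = parser.parse_args(["data.txt", "--format", "json"])
print(f"Format: {ok.format}")

# An invalid choice triggers a usage error; capture it instead of exiting.
stderr = io.StringIO()
try:
    with contextlib.redirect_stderr(stderr):
        parser.parse_args(["data.txt", "--format", "xml"])
except SystemExit as exc:
    exit_code = exc.code

print(f"Exit code: {exit_code}")
```

Catching `SystemExit` like this is mainly useful in notebooks and tests; in a normal script you let the parser exit.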


---

## Lesson 6: Standard I/O Streams

### The Problem

You want to separate normal output from error messages, and capture program output for testing.

### The "Aha!" Moment

Every program has three streams:

| Stream | Purpose | Default |
|--------|---------|--------|
| `sys.stdin` | Input | Keyboard |
| `sys.stdout` | Normal output | Console |
| `sys.stderr` | Error output | Console (separate stream; can be redirected independently) |

### Exit Codes

| Code | Meaning |
|------|--------|
| 0 | Success |
| 1 | General error |
| 2 | Command-line usage error (the code `argparse` exits with) |
| 130 | Interrupted by Ctrl+C (128 + SIGINT, a shell convention) |
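
One common way to produce these codes is to have a `main()` function return an integer and pass it to `sys.exit()` in one place. The sketch below assumes a hypothetical `demo.py` script that accepts a single CSV file; `SystemExit` is caught here only so the example can run inside a notebook.

```python
import sys


def main(argv):
    """Return an exit code instead of calling sys.exit() deep inside the program."""
    if len(argv) != 2:
        print("usage: demo.py FILE", file=sys.stderr)
        return 2  # command-line error
    if not argv[1].endswith(".csv"):
        print(f"error: {argv[1]} is not a CSV file", file=sys.stderr)
        return 1  # general error
    return 0  # success


codes = [
    main(["demo.py"]),
    main(["demo.py", "notes.txt"]),
    main(["demo.py", "data.csv"]),
]
print(codes)

# In a script, the usual last lines would be:
#     if __name__ == "__main__":
#         sys.exit(main(sys.argv))
try:
    sys.exit(main(["demo.py", "data.csv"]))
except SystemExit as exc:
    final_code = exc.code

print(final_code)
```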

In [14]:
# Write to different streams
print("This goes to stdout")
print("This goes to stderr", file=sys.stderr)

# Flush output immediately
print("Flushed immediately", flush=True)

This goes to stdout
Flushed immediately


This goes to stderr


In [15]:
# Capture stdout (useful for testing)
from io import StringIO
from contextlib import redirect_stdout

buffer = StringIO()
with redirect_stdout(buffer):
    print("This is captured!")
    print("So is this!")

captured = buffer.getvalue()
print(f"Captured:\n{captured}")

Captured:
This is captured!
So is this!
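
The same approach works for stderr through `contextlib.redirect_stderr`, which lets a test check that normal output and error messages go to the right streams. In this sketch, `report` is a hypothetical function under test.

```python
import sys
from contextlib import redirect_stderr, redirect_stdout
from io import StringIO


def report(value):
    """Hypothetical function: results go to stdout, warnings go to stderr."""
    print(f"result={value}")
    if value < 0:
        print("warning: negative value", file=sys.stderr)


out, err = StringIO(), StringIO()
with redirect_stdout(out), redirect_stderr(err):
    report(-1)

print(f"stdout captured: {out.getvalue()!r}")
print(f"stderr captured: {err.getvalue()!r}")
```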



---

## Summary

### `os` Module

| Function | Purpose |
|----------|--------|
| `os.getcwd()` | Get current directory |
| `os.listdir(path)` | List directory contents |
| `os.path.join(a, b)` | Join paths |
| `os.path.exists(path)` | Check if exists |
| `os.environ` | Environment variables |
| `os.makedirs(path)` | Create nested directories |

### `pathlib` Module

| Feature | Example |
|---------|---------|
| Join paths | `path / 'subdir' / 'file'` |
| Get filename | `path.name` |
| Get extension | `path.suffix` |
| Check exists | `path.exists()` |
| Glob files | `path.glob('*.py')` |
| Read file | `path.read_text()` |

### `sys` Module

| Variable / Function | Purpose |
|----------|--------|
| `sys.version` | Python version |
| `sys.path` | Module search paths |
| `sys.argv` | Command-line arguments |
| `sys.exit(code)` | Exit program with a status code |
| `sys.getsizeof(obj)` | Shallow object size in bytes |

---

**Next Module:** File Systems & Module Architecture