# 📝 Documenting Your Work with Scripts and Logs

One of the most common issues in research is poor documentation. Scripts, notebooks, and logs help ensure your work is **transparent, repeatable, and understandable**.


## 💡 Basic View

### Why does documentation matter?

- So you don’t forget what you did
- So someone else can understand your work
- So you can *re-run* your analysis reliably
- So you can trust your own results in 6 months!

### Example: Undocumented vs Documented Code

❌ Undocumented:

```python
df = df[df.x > 0]
```

✅ Documented:

```python
# Keep only rows where 'x' is strictly positive (drops zeros and negatives)
df = df[df["x"] > 0]
```

A single comment line makes the purpose clear.


```python
# Example: A basic logging approach
import pandas as pd

print("Loading data...")
df = pd.read_csv("data.csv")

print("Filtering by BMI > 18.5")
df = df[df["BMI"] > 18.5]

print("Saving cleaned file.")
df.to_csv("cleaned.csv", index=False)
```

This is a **very basic form of logging**: it tells you and others what happens at each step.
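
A step message is more useful when it also records what the step changed. The sketch below reports how many rows survive the BMI filter. The small DataFrame is made-up stand-in data, and the names `rows_before` and `rows_after` are illustrative, not part of the original script.

```python
import pandas as pd

# Made-up stand-in for data.csv, only for illustration
df = pd.DataFrame({"BMI": [17.0, 22.5, 30.1, 18.5]})

rows_before = len(df)
df = df[df["BMI"] > 18.5]  # strict comparison: a BMI of exactly 18.5 is dropped
rows_after = len(df)

print(f"Filtering by BMI > 18.5: kept {rows_after} of {rows_before} rows")
```

Recording the counts makes it easy to notice later if a filter removed far more data than expected.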


## 🔍 Advanced View

<details><summary>Click to expand</summary>

### Real Logging with the `logging` Module

Python has a built-in `logging` module for more advanced workflows:

```python
import logging
# Show messages at INFO level and above (the default threshold is WARNING)
logging.basicConfig(level=logging.INFO)

logging.info("Loading data")
```

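Unlike `print`, `logging` attaches a severity level to each message and can send output to a file, which is useful in scripts and tools that run unattended.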
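
As a slightly fuller sketch, a log can also be written to a file with a timestamp on every line, so there is a record of each run. The logger name `cleaning`, the file name `cleaning.log`, and the messages are assumptions chosen for illustration.

```python
import logging

# Hypothetical logger name and log file, chosen for this example
logger = logging.getLogger("cleaning")
logger.setLevel(logging.INFO)
logger.handlers.clear()  # avoid duplicate lines when a notebook cell is re-run

handler = logging.FileHandler("cleaning.log", mode="w", encoding="utf-8")
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
logger.addHandler(handler)

logger.info("Loading data")
logger.warning("Example warning: check for missing values")
```

Each line in `cleaning.log` then starts with the time and the level (`INFO`, `WARNING`), followed by the message.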

### Good Practices

- Write comments that explain why a step is done, not ones that restate the code
- Keep a changelog or version history
- Use Git to track changes
- Write protocol files in Markdown or Quarto to describe:
  - What you did
  - What decisions were made
  - Where the data came from
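
A protocol file can be as simple as a few headed paragraphs. The following is a minimal sketch that writes one from Python. The file name `PROTOCOL.md` and every entry in `protocol` are hypothetical placeholders to be replaced with your own notes.

```python
from datetime import date
from pathlib import Path

# Hypothetical placeholder entries; replace with notes on your own analysis
protocol = {
    "What I did": "Filtered data.csv to rows with BMI > 18.5 and saved cleaned.csv.",
    "Decisions made": "Used a strict threshold, so a BMI of exactly 18.5 is excluded.",
    "Where the data came from": "(describe the source of data.csv here)",
}

lines = [f"# Analysis protocol ({date.today().isoformat()})", ""]
for heading, text in protocol.items():
    lines += [f"## {heading}", "", text, ""]

Path("PROTOCOL.md").write_text("\n".join(lines), encoding="utf-8")
```

Keeping this file next to the script, under Git, means the description and the code change together.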

</details>


## 🧪 Exercises

1. Create a short data-cleaning script and add print statements to explain each step.
2. Try rewriting your print statements using `logging`.
3. Optional: Create a Markdown file that documents the steps you took and links to your dataset.
