# 🚗 Python vs R vs Other Tools

Data analysis can be done with many tools. This notebook helps you understand where Python fits — and why it is widely used in food and nutrition science.


## 💡 Basic View

| Tool      | Strengths                          | Limitations                     |
|-----------|------------------------------------|----------------------------------|
| **Python** | General-purpose, readable, automation, machine learning | Slightly more setup |
| **R**      | Great for statistics, visualisation | Less general-purpose            |
| **Excel**  | Familiar, quick for small tables   | No version control, error-prone |
| **SPSS**   | Menu-driven stats                 | Less flexible, limited scripting |
| **Prism**  | Beautiful graphs, quick comparisons | Not programmable                |
| **XLSTAT** | Excel add-in, good for sensory analysis | Paid licence, closed environment |

Python is our tool of choice because:
- It's open-source and free
- It works for small scripts or massive data pipelines
- You can write readable code and share it


```python
# Python is readable
data = [2.3, 4.5, 1.2, 5.7]
mean = sum(data) / len(data)
print(f"The average is {mean:.2f}")
```

Compare with R:

```r
data <- c(2.3, 4.5, 1.2, 5.7)
mean(data)
```
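For a closer like-for-like comparison with R's `mean()`, Python's standard library includes a `statistics` module. The sketch below reuses the same four values; the variable name `average` is just an illustrative choice.

```python
import statistics

# Same values as in the examples above
data = [2.3, 4.5, 1.2, 5.7]

# statistics.mean plays the same role as R's mean()
average = statistics.mean(data)
print(f"The average is {average:.2f}")
```

No extra installation is needed, since `statistics` ships with Python.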

And Excel:

| A       |
|---------|
| 2.3     |
| 4.5     |
| 1.2     |
| 5.7     |

Use the formula `=AVERAGE(A1:A4)`. It gives the right answer, but the spreadsheet keeps no clear record of this step.
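By contrast, a script can act as its own record, because every step is written down and can be rerun. The following is a minimal sketch, assuming column A has been exported as plain CSV text; `csv_text` is a hypothetical stand-in for such a file, not a real dataset.

```python
import csv
import io
import statistics

# Hypothetical stand-in for column A exported from Excel (no header row)
csv_text = "2.3\n4.5\n1.2\n5.7\n"

# Step 1: read the values, skipping any empty rows
rows = csv.reader(io.StringIO(csv_text))
values = [float(row[0]) for row in rows if row]

# Step 2: compute the average
average = statistics.mean(values)

# Step 3: report what was done, so the result can be traced back
print(f"Read {len(values)} values; average is {average:.3f}")
```

With a real file, `io.StringIO(csv_text)` would be replaced by an opened file, and the script itself could be kept under version control.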


## 🔍 Advanced View

<details><summary>Click to expand deeper comparisons</summary>

- Python is often used in **pipelines**, web apps, dashboards, and modelling
- R has **ggplot2**, **dplyr**, and built-in stats
- Excel is best for one-off reports or small tables — not reproducible workflows

</details>
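To make the idea of a pipeline concrete, here is a small sketch in which each stage is a plain function and the stages are chained together. The function names (`clean`, `summarise`, `run_pipeline`) and the use of `None` for a missing measurement are assumptions made for illustration, not part of any particular library.

```python
import statistics

def clean(values):
    """Drop missing entries (represented here as None)."""
    return [v for v in values if v is not None]

def summarise(values):
    """Return a few basic summary numbers for a list of measurements."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "min": min(values),
        "max": max(values),
    }

def run_pipeline(raw):
    """Chain the stages: clean first, then summarise."""
    return summarise(clean(raw))

# The values used earlier, with one missing entry added for illustration
raw = [2.3, None, 4.5, 1.2, 5.7]
summary = run_pipeline(raw)
print(summary)
```

Because each stage is a separate function, one stage can be changed or tested without touching the others, which is one reason this style suits reproducible workflows.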


## 🧪 Exercises

1. Write a short Python script to calculate the median of `[1, 2, 3, 4, 100]`.
2. Try making the same plot in Prism and in Python (using seaborn or matplotlib).
3. Optional: Google “Python vs R for epidemiology” and summarise what you find.
