
# Lecture 25: Plotting (Hands-on Lab)

Welcome to the hands-on lab for **Plotting with Python**.  
This notebook is designed for a **2-hour session** where you'll practice data visualization with **matplotlib** using real datasets.

We will cover:
1. Basics of plotting with `matplotlib`
2. Line plots, scatter plots, and multiple plots
3. Customizing plots (titles, labels, legends, styles)
4. Subplots and comparisons
5. Working with real datasets (`USPopulation.txt`, `countryPops.txt`)
6. Using logarithmic scales
7. Benford's Law exploration with population data

---


In [None]:

# Install matplotlib if not available (uncomment below if running on a fresh environment)
# !pip install matplotlib

import matplotlib.pyplot as plt
import numpy as np

# Enable inline plotting for Jupyter
%matplotlib inline



## Part 1: Simple Plotting

Let's begin with a simple example of plotting mathematical functions.


In [None]:

# Create x values
x = list(range(0, 20))

# Different functions
linear = [n for n in x]
quadratic = [n**2 for n in x]
cubic = [n**3 for n in x]
exponential = [2**n for n in x]

# Plot them all on the same graph
plt.figure(figsize=(10,6))
plt.plot(x, linear, label="Linear (n)")
plt.plot(x, quadratic, label="Quadratic (n^2)")
plt.plot(x, cubic, label="Cubic (n^3)")
plt.plot(x, exponential, label="Exponential (2^n)")

plt.title("Simple Function Plots")
plt.xlabel("n")
plt.ylabel("Value")
plt.legend()
plt.grid(True)
plt.show()
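
The cell above relies on matplotlib's default colors and line styles. As a minimal sketch of how the same `plt.plot()` call can be customized, the example below sets the color, line style, marker, and line width through keyword arguments. The data and the specific style values are illustrative choices, not part of the lab's datasets.

```python
import matplotlib.pyplot as plt

# Illustrative data: the first ten squares
xs = list(range(10))
squares = [n**2 for n in xs]

plt.figure(figsize=(8, 5))

# Keyword arguments control appearance; plt.plot() returns a list of Line2D objects
(styled_line,) = plt.plot(
    xs, squares,
    color="tab:orange",   # named color from the default palette
    linestyle="--",       # dashed line
    marker="s",           # square marker at each data point
    linewidth=2,
    label="n^2 (dashed, square markers)",
)

plt.title("Customized Line Style")
plt.xlabel("n")
plt.ylabel("Value")
plt.legend()
plt.grid(True)
plt.show()
```

A short format string such as `'r-o'`, used in later parts of this notebook, is shorthand for the same idea: red color, solid line, circle markers.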



## Part 2: Scatter vs Line Plot

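`plt.plot()` connects points in the order they are given, so unsorted x-values produce a line that zigzags back and forth. `plt.scatter()` draws only the markers, so the order of the points has no visible effect. Let's compare the two.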


In [None]:

unsorted_x = [5, 2, 8, 1, 7]
unsorted_y = [v**2 for v in unsorted_x]  # v avoids reusing the name x from Part 1

plt.figure(figsize=(12,5))

plt.subplot(1,2,1)
plt.plot(unsorted_x, unsorted_y, 'r-o')
plt.title("Line Plot (Order Matters)")

plt.subplot(1,2,2)
plt.scatter(unsorted_x, unsorted_y, c='b')
plt.title("Scatter Plot (No Connecting Lines)")

plt.show()
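
If a clean left-to-right line is wanted from unsorted data, one option is to sort the points by their x-value before calling `plt.plot()`. The sketch below uses the same illustrative points as the cell above; the names `pairs`, `sorted_x`, and `sorted_y` are introduced here only for the example.

```python
# Same illustrative points as above, deliberately out of order
unsorted_x = [5, 2, 8, 1, 7]
unsorted_y = [v**2 for v in unsorted_x]

# Pair each x with its y, sort the pairs by x, then split them apart again
pairs = sorted(zip(unsorted_x, unsorted_y))
sorted_x = [px for px, _ in pairs]
sorted_y = [py for _, py in pairs]

print(sorted_x)
print(sorted_y)
```

Passing `sorted_x` and `sorted_y` to `plt.plot()` then draws a single curve that moves steadily from left to right.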



## Part 3: Multiple Figures and Subplots
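
You can plot in separate figures or place several subplots in one figure for side-by-side comparison. `plt.subplot(nrows, ncols, index)` selects one panel in a grid of `nrows` by `ncols`; the index counts from 1, left to right and top to bottom.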


In [None]:

# Example: Linear vs Quadratic
plt.figure(figsize=(12,5))

plt.subplot(1,2,1)
plt.plot(x, linear, 'g-')
plt.title("Linear Function")

plt.subplot(1,2,2)
plt.plot(x, quadratic, 'm-')
plt.title("Quadratic Function")

plt.tight_layout()
plt.show()
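
The cell above puts both plots into one figure. For the other option, separate figures, a minimal sketch could look like the following. The variable names `fig_linear` and `fig_quadratic` and the figure sizes are assumptions made for this example.

```python
import matplotlib.pyplot as plt

xs = list(range(20))

# Each plt.figure() call opens a new, independent figure;
# later plt.plot()/plt.title() calls apply to the most recently created one
fig_linear = plt.figure(figsize=(6, 4))
plt.plot(xs, xs, 'g-')
plt.title("Linear Function (own figure)")

fig_quadratic = plt.figure(figsize=(6, 4))
plt.plot(xs, [n**2 for n in xs], 'm-')
plt.title("Quadratic Function (own figure)")

plt.show()
```

Separate figures each get their own axis scaling and can be saved independently, while subplots make it easier to compare shapes at a glance.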



## Part 4: Real Example - US Population Growth

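We will use the file `USPopulation.txt` (opened in the code below as `lec25_USPopulation.txt`), which contains population data every 10 years from 1610 to 2010. The loading code expects each line to hold a year and a population count separated by whitespace, with commas allowed as thousands separators.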


In [None]:

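# Load US population data: each line holds a year and a population count
years, population = [], []

with open("lec25_USPopulation.txt") as f:
    for line in f:
        if not line.strip():
            continue  # skip blank lines instead of failing on them
        year, pop = line.split()
        years.append(int(year))
        population.append(int(pop.replace(",", "")))  # drop thousands separators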

plt.figure(figsize=(10,6))
plt.plot(years, population, 'b-o')
plt.title("US Population Over Time")
plt.xlabel("Year")
plt.ylabel("Population")
plt.grid(True)
plt.show()
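
The parsing step in the cell above can be tried without the data file. The sketch below wraps it in a hypothetical helper, `parse_population_line`, and runs it on made-up sample lines; the "year population" layout is inferred from the loading code, and the numbers are not taken from the real file.

```python
def parse_population_line(line):
    """Parse one 'year population' line; return None for blank lines.

    Hypothetical helper for illustration, not part of the lab files.
    """
    fields = line.split()
    if not fields:
        return None
    year_text, pop_text = fields
    # Remove thousands separators before converting to int
    return int(year_text), int(pop_text.replace(",", ""))

# Made-up sample lines, not taken from the real data file
sample_lines = ["1900 1,234,567\n", "\n", "1910 2,345,678\n"]

records = [parsed for parsed in map(parse_population_line, sample_lines)
           if parsed is not None]
years_demo = [year for year, _ in records]
population_demo = [pop for _, pop in records]

print(years_demo)
print(population_demo)
```

Testing the parser on a few in-memory lines first makes it easier to spot format problems before plotting the full dataset.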



## Part 5: Changing the Scale

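Sometimes a **logarithmic scale** reveals trends better. On a log-scaled y-axis, equal vertical distances represent equal ratios, so a quantity that grows by a constant percentage per step appears as a straight line.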


In [None]:

plt.figure(figsize=(10,6))

plt.plot(years, population, 'r-o')
plt.yscale("log")
plt.title("US Population Over Time (Log Scale)")
plt.xlabel("Year")
plt.ylabel("Population (log scale)")
plt.grid(True, which="both")
plt.show()
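
To see why constant-rate growth looks like a straight line on a log axis, it can help to check the numbers directly. The sketch below uses a made-up series that doubles at every step (not the population data) and shows that consecutive values are evenly spaced after taking `log10`.

```python
import math

# Made-up series that doubles every step (exponential growth)
values = [100 * 2**k for k in range(8)]

# On a log axis, vertical positions are proportional to log10(value)
log_values = [math.log10(v) for v in values]

# Differences between consecutive log values: constant for exponential growth
steps = [b - a for a, b in zip(log_values, log_values[1:])]

print(values)
print([round(s, 4) for s in steps])
```

Every step equals `log10(2)`, so the points are evenly spaced vertically. A curve that bends on the log-scale plot therefore indicates that the growth rate itself changed over time.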



## Part 6: Country Populations and Benford's Law

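We will explore the `countryPops.txt` dataset (opened in the code below as `lec25_countryPops.txt`) and check whether the population numbers follow **Benford's Law**. The law predicts that the leading digit `d` occurs with probability `log10(1 + 1/d)`: about 30.1% of values should start with 1, but only about 4.6% should start with 9.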


In [None]:

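# Load country population data
# (assumes tab-separated columns with the population in the third column)
populations = []
with open("lec25_countryPops.txt") as f:
    for line in f:
        parts = line.strip().split("\t")
        if len(parts) >= 3:
            pop = parts[2].replace(",", "")  # drop thousands separators
            if pop.isdigit():  # skips headers and non-numeric entries
                populations.append(int(pop))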

# Extract first digits
first_digits = [int(str(p)[0]) for p in populations if p > 0]

# Count frequency
digit_counts = [first_digits.count(d) for d in range(1,10)]
total = sum(digit_counts)
digit_freq = [c/total for c in digit_counts]

# Benford's Law prediction
benford = [np.log10(1 + 1/d) for d in range(1,10)]

# Plot
digits = list(range(1,10))
plt.figure(figsize=(10,6))
plt.bar(digits, digit_freq, alpha=0.6, label="Country Populations")
plt.plot(digits, benford, 'r-o', label="Benford's Law")
plt.xticks(digits)
plt.xlabel("First Digit")
plt.ylabel("Frequency")
plt.title("Benford's Law in Country Populations")
plt.legend()
plt.show()
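
The bar chart gives a visual comparison. As a rough numeric complement, the sketch below computes the Benford probabilities and the mean absolute deviation between observed and expected frequencies. It uses synthetic data, the first 1000 powers of 2 (a sequence known to follow Benford's Law closely), so it runs without the data file. The deviation measure is an illustrative choice, not a formal statistical test.

```python
import math

# Benford's Law: probability that the leading digit is d, for d = 1..9
benford_expected = [math.log10(1 + 1 / d) for d in range(1, 10)]

# Synthetic data (an assumption for illustration): the first 1000 powers of 2
powers = [2**n for n in range(1000)]
leading = [int(str(p)[0]) for p in powers]
observed = [leading.count(d) / len(leading) for d in range(1, 10)]

# Mean absolute deviation between observed and expected frequencies
mad = sum(abs(o - e) for o, e in zip(observed, benford_expected)) / 9

for d, (o, e) in enumerate(zip(observed, benford_expected), start=1):
    print(f"{d}: observed={o:.3f}  expected={e:.3f}")
print(f"mean absolute deviation = {mad:.4f}")
```

Replacing `powers` with the `populations` list loaded above would give the same comparison for the country data. With only a few hundred countries, somewhat larger deviations are to be expected.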



## Wrap-Up and Next Steps

In this lab you learned:
- Basics of plotting with matplotlib
- Differences between line and scatter plots
- Creating multiple plots and subplots
- Working with real datasets (`USPopulation.txt`, `countryPops.txt`)
- Using logarithmic scaling
- Exploring **Benford's Law**

For practice:
1. Try adding labels and customizing styles (colors, line styles, markers).
2. Create subplots comparing US population in linear vs log scale.
3. Explore other datasets and visualize them with histograms, bar plots, or scatter plots.
