# Python Matplotlib Basics - Data Visualization

**Class Duration**: 3 hours  
**Structure**: Lecture & Practice 2 hours + Quiz 1 hour  
**Level**: Intermediate

---

## Learning Objectives

After completing this lesson, students will be able to:

- Install and import the matplotlib library
- Create simple line graphs
- Make bar charts for data comparison
- Add titles and labels to graphs
- Visualize data distribution with histograms

---

## 1. Introduction to Matplotlib

**Matplotlib** is a Python library for creating graphs and charts.

Think of it as a digital brush for painting with numbers.

### Why Use Graphs?

**Easy to understand**: Pictures are easier to understand than raw numbers

**Pattern discovery**: Trends can be identified quickly

**Data comparison**: Values can be compared side by side

**Effective communication**: Complex data can be conveyed simply

### Importance of Data Visualization

Data visualization is an important skill in modern society:

```
Example: COVID-19 Spread Status
- Numbers only: "1,234 confirmed cases, 987 recovered"
- With graph: See increase/decrease trends over time at a glance
```

---

## 2. Installing Matplotlib

Before using matplotlib, install it with **pip**.

### Installation Command

In [None]:
%pip install matplotlib

### Importing Matplotlib

In [None]:
import matplotlib.pyplot as plt

This is the standard way to import matplotlib for creating graphs.

### Font Settings (Optional)

In [None]:
import matplotlib.pyplot as plt
import matplotlib.font_manager as fm

# Font settings for special characters
plt.rcParams['font.family'] = 'DejaVu Sans'

# Prevent minus sign display issues
plt.rcParams['axes.unicode_minus'] = False
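
Setting `font.family` only helps if the font is actually installed. As a minimal sketch, the check below looks the font up in matplotlib's font list before applying it. `preferred_font` is an illustrative variable name, and 'DejaVu Sans' is used because it ships with matplotlib; replace it with the font you want to test.

```python
import matplotlib.pyplot as plt
import matplotlib.font_manager as fm

# Names of every font matplotlib has registered on this machine
available_fonts = {font.name for font in fm.fontManager.ttflist}

# Illustrative choice: 'DejaVu Sans' is bundled with matplotlib
preferred_font = 'DejaVu Sans'

if preferred_font in available_fonts:
    plt.rcParams['font.family'] = preferred_font
    print(f"Using font: {preferred_font}")
else:
    print(f"'{preferred_font}' not found; keeping the default font")
```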

---

## 3. Line Graphs

Line graphs show how data changes over time.

It's like tracking your height as you grow.

### Simple Line Graph

In [None]:
import matplotlib.pyplot as plt

# Data
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

# Create line graph
plt.plot(x, y)
plt.show()

### Line Graph with Labels

In [None]:
import matplotlib.pyplot as plt

# Monthly sales data
months = ['Jan', 'Feb', 'Mar', 'Apr', 'May']
sales = [1000, 1500, 1200, 1800, 2000]

# Create graph
plt.plot(months, sales)
plt.title('Monthly Sales Performance')
plt.xlabel('Month')
plt.ylabel('Sales ($1000)')
plt.show()

### Multiple Line Graph

In [None]:
import matplotlib.pyplot as plt

# Monthly sales data for two products
months = ['Jan', 'Feb', 'Mar', 'Apr', 'May']
product_a = [100, 150, 120, 180, 200]
product_b = [80, 130, 140, 160, 190]

# Multiple line graph
plt.plot(months, product_a, label='Product A', marker='o')
plt.plot(months, product_b, label='Product B', marker='s')
plt.title('Monthly Sales by Product')
plt.xlabel('Month')
plt.ylabel('Units Sold')
plt.legend()  # Show legend
plt.grid(True)  # Show grid
plt.show()

---

## 4. Bar Charts

Bar charts compare different categories.

It's like comparing the heights of different buildings.

### Simple Bar Chart

In [None]:
import matplotlib.pyplot as plt

# Subject scores
subjects = ['Math', 'English', 'Science']
scores = [85, 90, 78]

# Create bar chart
plt.bar(subjects, scores)
plt.title('Scores by Subject')
plt.ylabel('Score')
plt.show()

### Colorful Bar Chart

In [None]:
import matplotlib.pyplot as plt

# Fruit sales
fruits = ['Apple', 'Banana', 'Orange', 'Grape']
sales = [50, 30, 40, 25]
colors = ['red', 'yellow', 'orange', 'purple']

# Create colorful bar chart
plt.bar(fruits, sales, color=colors)
plt.title('Fruit Sales')
plt.xlabel('Fruit Type')
plt.ylabel('Units Sold')
plt.show()

### Horizontal Bar Chart

In [None]:
import matplotlib.pyplot as plt

# City population data
cities = ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix']
population = [8.4, 3.9, 2.7, 2.3, 1.7]

# Horizontal bar chart
plt.barh(cities, population, color='skyblue')
plt.title('City Population')
plt.xlabel('Population (millions)')
plt.ylabel('City')

# Add values to each bar
for i, value in enumerate(population):
    plt.text(value + 0.1, i, f'{value}M', va='center')

plt.show()
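
The loop above places each label by hand. A possible alternative, assuming Matplotlib 3.4 or newer, is `plt.bar_label`, which reads the values from the bars themselves. This sketch reuses the same city data; the `padding` value and the x-axis limit of 10 are arbitrary choices that leave room for the labels.

```python
import matplotlib.pyplot as plt

# City population data (same values as the example above)
cities = ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix']
population = [8.4, 3.9, 2.7, 2.3, 1.7]

# Keep the bars that plt.barh returns so they can be labeled
bars = plt.barh(cities, population, color='skyblue')

# Label every bar with its value, formatted like "8.4M"
labels = plt.bar_label(bars, fmt='%.1fM', padding=3)

plt.title('City Population')
plt.xlabel('Population (millions)')
plt.xlim(0, 10)  # Extra space on the right for the labels
plt.show()
```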

---

## 5. Histograms

Histograms show how often values fall within each range.

It's like counting how many students received each grade.

### Simple Histogram

In [None]:
import matplotlib.pyplot as plt

# Student height data (inches)
height_data = [63, 65, 67, 69, 66, 68, 62, 71, 64, 67, 
               65, 68, 64, 69, 66, 63, 68, 63, 69, 65]

# Create histogram
plt.hist(height_data, bins=5, color='lightblue', edgecolor='black')
plt.title('Student Height Distribution')
plt.xlabel('Height (inches)')
plt.ylabel('Number of Students')
plt.show()
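
To see what `bins=5` actually does, it can help to look at the values `plt.hist` returns. The sketch below draws the same histogram and then prints each bin's range with its count. The names `counts` and `bin_edges` are illustrative.

```python
import matplotlib.pyplot as plt

# Student height data (inches), same values as the example above
height_data = [63, 65, 67, 69, 66, 68, 62, 71, 64, 67,
               65, 68, 64, 69, 66, 63, 68, 63, 69, 65]

# plt.hist returns the count in each bin, the bin edges, and the drawn bars
counts, bin_edges, _ = plt.hist(height_data, bins=5,
                                color='lightblue', edgecolor='black')
plt.title('Student Height Distribution')
plt.xlabel('Height (inches)')
plt.ylabel('Number of Students')
plt.show()

# Five bins need six edges: bin i runs from bin_edges[i] to bin_edges[i + 1]
for i, count in enumerate(counts):
    print(f"{bin_edges[i]:.1f} to {bin_edges[i + 1]:.1f} inches: {int(count)} students")
```

The counts always add up to the number of data points, so changing `bins` changes how the 20 students are grouped, not how many are counted.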

### Detailed Histogram

In [None]:
import matplotlib.pyplot as plt

# Test score data
scores = [75, 82, 90, 88, 76, 95, 68, 84, 91, 77, 
          86, 79, 92, 85, 73, 89, 78, 83, 87, 94, 
          71, 80, 93, 88, 74, 96, 69, 81, 85, 90]

# Create histogram
plt.hist(scores, bins=8, color='lightgreen', alpha=0.7, edgecolor='black')
plt.title('Test Score Distribution')
plt.xlabel('Score')
plt.ylabel('Number of Students')
plt.grid(True, alpha=0.3)

# Show average line
average_score = sum(scores) / len(scores)
plt.axvline(average_score, color='red', linestyle='--', label=f'Average: {average_score:.1f}')
plt.legend()
plt.show()

print(f"Average score: {average_score:.1f}")
print(f"Highest score: {max(scores)}")
print(f"Lowest score: {min(scores)}")

---

## 6. Styling Graphs

This section shows how to make graphs look better and more informative.

### Essential Graph Elements

In [None]:
import matplotlib.pyplot as plt

# Weekly temperature data
days = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun']
temperature = [72, 77, 73, 81, 75, 79, 70]

# Create styled graph
plt.figure(figsize=(10, 6))  # Set graph size
plt.plot(days, temperature, color='blue', marker='o', linewidth=3, markersize=8)
plt.title('Weekly Temperature Changes', fontsize=16, fontweight='bold')
plt.xlabel('Day', fontsize=12)
plt.ylabel('Temperature (°F)', fontsize=12)
plt.grid(True, alpha=0.3)

# Add temperature values to each point
for i, temp in enumerate(temperature):
    plt.annotate(f'{temp}°F', (i, temp), textcoords="offset points", 
                xytext=(0,10), ha='center')

plt.tight_layout()
plt.show()

### Graph Style Options

**Colors**: 'red', 'blue', 'green', 'orange', '#FF5733' (hex code)

**Markers**: 'o' (circle), 's' (square), '^' (triangle), '*' (star)

**Line styles**: '-' (solid), '--' (dashed), ':' (dotted), '-.' (dash-dot)
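
As a quick way to compare these options, the sketch below draws four lines, each using one color, one marker, and one line style from the lists above. The y-values are placeholder numbers chosen only to keep the lines apart.

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]

# One (color, marker, line style, legend label) combination per line
styles = [
    ('red', 'o', '-', 'solid, circles'),
    ('blue', 's', '--', 'dashed, squares'),
    ('green', '^', ':', 'dotted, triangles'),
    ('#FF5733', '*', '-.', 'dash-dot, stars'),
]

for offset, (color, marker, linestyle, label) in enumerate(styles):
    # Shift each line upward so the four lines do not overlap
    y = [value + offset for value in x]
    plt.plot(x, y, color=color, marker=marker, linestyle=linestyle, label=label)

plt.title('Color, Marker, and Line Style Options')
plt.legend()
plt.show()
```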

### Multiple Graphs with Subplots

In [None]:
import matplotlib.pyplot as plt

# Data
months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun']
revenue = [1000, 1500, 1200, 1800, 2000, 1700]
expenses = [800, 1100, 900, 1300, 1400, 1200]

# Create 2x1 subplots
plt.figure(figsize=(12, 8))

# First subplot: Revenue
plt.subplot(2, 1, 1)
plt.plot(months, revenue, 'go-', linewidth=2)
plt.title('Monthly Revenue')
plt.ylabel('Revenue ($1000)')
plt.grid(True)

# Second subplot: Expenses
plt.subplot(2, 1, 2)
plt.plot(months, expenses, 'ro-', linewidth=2)
plt.title('Monthly Expenses')
plt.xlabel('Month')
plt.ylabel('Expenses ($1000)')
plt.grid(True)

plt.tight_layout()
plt.show()
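
The same layout can also be written with `plt.subplots`, which returns the figure and one axes object per graph. This is a sketch of that alternative style, not a replacement for the version above. The names `ax_revenue` and `ax_expenses` are illustrative, and `sharex=True` is an optional extra that makes both graphs use the same month axis.

```python
import matplotlib.pyplot as plt

# Data (same values as the example above)
months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun']
revenue = [1000, 1500, 1200, 1800, 2000, 1700]
expenses = [800, 1100, 900, 1300, 1400, 1200]

# One figure with two stacked axes that share the x-axis
fig, (ax_revenue, ax_expenses) = plt.subplots(2, 1, figsize=(12, 8), sharex=True)

# First axes: Revenue
ax_revenue.plot(months, revenue, 'go-', linewidth=2)
ax_revenue.set_title('Monthly Revenue')
ax_revenue.set_ylabel('Revenue ($1000)')
ax_revenue.grid(True)

# Second axes: Expenses
ax_expenses.plot(months, expenses, 'ro-', linewidth=2)
ax_expenses.set_title('Monthly Expenses')
ax_expenses.set_xlabel('Month')
ax_expenses.set_ylabel('Expenses ($1000)')
ax_expenses.grid(True)

fig.tight_layout()
plt.show()
```

Each call is made on a specific axes object, so it is always clear which graph a title or label belongs to.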

---

## Practice Problems

### Practice 1: CSV Data Visualization

**Problem**: Read CSV data and create different types of graphs.

**First, create a 'monthly_data.csv' file as follows:**

```
month,temperature,rainfall
Jan,59,1.8
Feb,64,1.5
Mar,72,2.1
Apr,77,1.6
May,82,1.4
Jun,90,0.8
```
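
If you would rather create the file from Python than by hand, a minimal sketch using the standard `csv` module might look like this. It writes the same six rows shown above into the current working directory. The variable name `rows` is illustrative.

```python
import csv

# The same data as the table above: (month, temperature, rainfall)
rows = [
    ('Jan', 59, 1.8),
    ('Feb', 64, 1.5),
    ('Mar', 72, 2.1),
    ('Apr', 77, 1.6),
    ('May', 82, 1.4),
    ('Jun', 90, 0.8),
]

# newline='' lets the csv module control line endings itself
with open('monthly_data.csv', 'w', newline='', encoding='utf-8') as file:
    writer = csv.writer(file)
    writer.writerow(['month', 'temperature', 'rainfall'])  # Header row
    writer.writerows(rows)

print(f"Wrote {len(rows)} rows to monthly_data.csv")
```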

**Solution**:

In [None]:
import matplotlib.pyplot as plt
import csv

# Read CSV data
months = []
temperature = []
rainfall = []

with open('monthly_data.csv', 'r', encoding='utf-8') as file:
    csv_reader = csv.DictReader(file)
    for row in csv_reader:
        months.append(row['month'])
        temperature.append(int(row['temperature']))
        rainfall.append(float(row['rainfall']))

# Create line graph for temperature and bar chart for rainfall
plt.figure(figsize=(12, 5))

# First subplot: Temperature
plt.subplot(1, 2, 1)
plt.plot(months, temperature, color='red', marker='o', linewidth=2)
plt.title('Monthly Temperature Changes')
plt.xlabel('Month')
plt.ylabel('Temperature (°F)')
plt.grid(True, alpha=0.3)

# Second subplot: Rainfall
plt.subplot(1, 2, 2)
plt.bar(months, rainfall, color='blue', alpha=0.7)
plt.title('Monthly Rainfall')
plt.xlabel('Month')
plt.ylabel('Rainfall (inches)')

plt.tight_layout()
plt.show()

# Data analysis output
print(f"Hottest month: {months[temperature.index(max(temperature))]} ({max(temperature)}°F)")
print(f"Wettest month: {months[rainfall.index(max(rainfall))]} ({max(rainfall)} inches)")

### Practice 2: Stock Price Trend Graph

**Problem**: Create a stock price trend visualization.

**Data:**

In [None]:
# Stock price data
periods = ['Week 1', 'Week 2', 'Week 3', 'Week 4', 'Week 5', 'Week 6']
stock_prices = [100, 105, 98, 110, 115, 108]

**Solution**:

In [None]:
import matplotlib.pyplot as plt

# Stock price data
periods = ['Week 1', 'Week 2', 'Week 3', 'Week 4', 'Week 5', 'Week 6']
stock_prices = [100, 105, 98, 110, 115, 108]

# Create stock price graph
plt.figure(figsize=(12, 7))
plt.plot(periods, stock_prices, color='green', marker='o', linewidth=3, markersize=10)

# Add styling
plt.title('Stock Price Trend', fontsize=18, fontweight='bold')
plt.xlabel('Period', fontsize=14)
plt.ylabel('Price ($)', fontsize=14)
plt.grid(True, alpha=0.3)

# Add price labels to each point
for i, price in enumerate(stock_prices):
    plt.annotate(f'${price}', (i, price), textcoords="offset points", 
                xytext=(0,15), ha='center', fontsize=12, fontweight='bold')

# Color code gains/losses
for i in range(1, len(stock_prices)):
    if stock_prices[i] > stock_prices[i-1]:
        plt.plot([i-1, i], [stock_prices[i-1], stock_prices[i]], 'g-', linewidth=3)
    else:
        plt.plot([i-1, i], [stock_prices[i-1], stock_prices[i]], 'r-', linewidth=3)

plt.tight_layout()
plt.show()

# Calculate and display profit/loss
start_price = stock_prices[0]
end_price = stock_prices[-1]
change = end_price - start_price
return_rate = (change / start_price) * 100

print(f"Starting price: ${start_price}")
print(f"Ending price: ${end_price}")
print(f"Change: ${change:+}")
print(f"Return rate: {return_rate:+.1f}%")

### Practice 3: Survey Results Visualization

**Problem**: Visualize the survey data below using multiple graph types.

**Data:**

In [None]:
# Survey data
age_groups = ['10-19', '20-29', '30-39', '40-49', '50+']
respondents = [25, 45, 35, 28, 17]

satisfaction = ['Very\nUnsatisfied', 'Unsatisfied', 'Neutral', 'Satisfied', 'Very\nSatisfied']
satisfaction_counts = [5, 12, 30, 38, 15]

products = ['Product A', 'Product B', 'Product C', 'Product D']
preferences = [30, 25, 35, 10]

**Solution**:

In [None]:
import matplotlib.pyplot as plt

# Survey data
age_groups = ['10-19', '20-29', '30-39', '40-49', '50+']
respondents = [25, 45, 35, 28, 17]

satisfaction = ['Very\nUnsatisfied', 'Unsatisfied', 'Neutral', 'Satisfied', 'Very\nSatisfied']
satisfaction_counts = [5, 12, 30, 38, 15]

products = ['Product A', 'Product B', 'Product C', 'Product D']
preferences = [30, 25, 35, 10]

# Create 2x2 subplots
plt.figure(figsize=(15, 12))

# 1. Respondents by age group (bar chart)
plt.subplot(2, 2, 1)
colors1 = ['lightblue', 'lightgreen', 'lightcoral', 'lightyellow', 'lightpink']
bars1 = plt.bar(age_groups, respondents, color=colors1, edgecolor='black')
plt.title('Respondents by Age Group', fontsize=14, fontweight='bold')
plt.xlabel('Age Group')
plt.ylabel('Number of Respondents')

# Add numbers to each bar
for bar, count in zip(bars1, respondents):
    plt.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.5, 
             f'{count}', ha='center', va='bottom', fontweight='bold')

# 2. Satisfaction distribution (horizontal bar chart)
plt.subplot(2, 2, 2)
colors2 = ['red', 'orange', 'yellow', 'lightgreen', 'green']
bars2 = plt.barh(satisfaction, satisfaction_counts, color=colors2, edgecolor='black')
plt.title('Service Satisfaction', fontsize=14, fontweight='bold')
plt.xlabel('Number of Respondents')

# Add numbers to each bar
for bar, count in zip(bars2, satisfaction_counts):
    plt.text(bar.get_width() + 0.5, bar.get_y() + bar.get_height()/2, 
             f'{count}', ha='left', va='center', fontweight='bold')

# 3. Product preferences (pie chart)
plt.subplot(2, 2, 3)
colors3 = ['gold', 'lightcoral', 'lightskyblue', 'lightgreen']
wedges, texts, autotexts = plt.pie(preferences, labels=products, colors=colors3, 
                                   autopct='%1.1f%%', startangle=90)
plt.title('Product Preferences', fontsize=14, fontweight='bold')

# 4. Age groups and satisfaction comparison (line graph)
plt.subplot(2, 2, 4)
plt.plot(age_groups, respondents, 'bo-', label='Respondents', linewidth=2, markersize=8)
plt.plot(satisfaction[:3], satisfaction_counts[:3], 'ro-', label='Satisfaction (first 3 levels)', linewidth=2, markersize=8)
plt.title('Response Overview by Age Group', fontsize=14, fontweight='bold')
plt.xlabel('Category')
plt.ylabel('Count')
plt.legend()
plt.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

# Statistics summary
print("=== SURVEY RESULTS SUMMARY ===")
print(f"Total respondents: {sum(respondents)}")
print(f"Largest age group: {age_groups[respondents.index(max(respondents))]} ({max(respondents)} people)")
print(f"Satisfied+ respondents: {satisfaction_counts[3] + satisfaction_counts[4]} ({((satisfaction_counts[3] + satisfaction_counts[4])/sum(satisfaction_counts)*100):.1f}%)")
print(f"Most popular product: {products[preferences.index(max(preferences))]} ({max(preferences)}%)")

---

## Quiz

### Quiz 1: Basic Line Graph

**Problem**: Create a line graph with x = [1, 2, 3, 4, 5] and y = [2, 4, 6, 8, 10]. Add an appropriate title and axis labels.

**Write your answer here**:

In [None]:
# Write your code here





### Quiz 2: Subject Scores Bar Chart

**Problem**: Create a bar chart with subjects = ['Math', 'English', 'Science'] and scores = [85, 90, 78]. Set the title to "Scores by Subject" and add appropriate axis labels.

**Write your answer here**:

In [None]:
# Write your code here





### Quiz 3: Height Distribution Histogram

**Problem**: Create a histogram with heights = [63, 65, 67, 69, 66, 68, 62, 71, 64, 67] and choose an appropriate number of bins. Add a title and axis labels.

**Write your answer here**:

In [None]:
# Write your code here





---

## References

1. **Matplotlib Official Tutorial**: https://matplotlib.org/stable/tutorials/index.html
   - Official documentation and tutorials

2. **Matplotlib Basic Examples**: https://matplotlib.org/stable/gallery/index.html
   - Collection of various graph examples

3. **Python Graph Gallery**: https://python-graph-gallery.com/
   - Beautiful graph examples

4. **Real Python Matplotlib Guide**: https://realpython.com/python-matplotlib-guide/
   - Practical matplotlib guide

---

## Key Points

### Remember

**Always import first**: Start every script with `import matplotlib.pyplot as plt`

**Don't forget plt.show()**: Call it to display the graph

**Add labels**: A title, xlabel, and ylabel make graphs clear

**Choose the right graph type**: Line for trends, bar for comparisons

### Common Graph Types

**Line Graph**: Time series, trend analysis

**Bar Chart**: Category comparisons

**Histogram**: Distribution of values

**Pie Chart**: Proportions of a whole
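
Pie charts only appear inside Practice 3, so here is a minimal standalone sketch. It reuses the product preference data from that practice problem. `autopct` prints each slice's percentage, and `plt.axis('equal')` keeps the pie circular.

```python
import matplotlib.pyplot as plt

# Product preference data (same values as Practice 3)
products = ['Product A', 'Product B', 'Product C', 'Product D']
preferences = [30, 25, 35, 10]

# plt.pie returns the slices, the label texts, and the percentage texts
wedges, texts, autotexts = plt.pie(preferences, labels=products,
                                   autopct='%1.1f%%', startangle=90)
plt.title('Product Preferences')
plt.axis('equal')  # Equal aspect ratio so the pie is drawn as a circle
plt.show()
```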

### Graph Creation Checklist

**Choose an appropriate graph type**

**Add a title and axis labels**

**Add a legend when needed**

**Apply colors and styles**

**Decide whether to show a grid**

### Real-World Applications

**Business analytics**: Sales trends, performance metrics

**Scientific research**: Experimental data, statistical analysis

**Education**: Student performance, survey results

**Finance**: Stock prices, market trends

### Important Considerations

**Data quality**: Make sure the data is clean and accurate before visualizing it

**Appropriate scaling**: Choose proper axis ranges

**Color accessibility**: Consider colorblind-friendly palettes

**Clear labeling**: Make graphs self-explanatory
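
Two of these points can be tried directly in code. The sketch below reuses the product sales data from the line graph section. It applies matplotlib's built-in `tableau-colorblind10` style for this figure only and sets the y-axis range explicitly. The upper limit of 250 is an arbitrary choice for this data, and the different markers and line styles are one way to keep the lines distinguishable without relying on color.

```python
import matplotlib.pyplot as plt

# Monthly sales data for two products (same values as the line graph section)
months = ['Jan', 'Feb', 'Mar', 'Apr', 'May']
product_a = [100, 150, 120, 180, 200]
product_b = [80, 130, 140, 160, 190]

# Use a colorblind-friendly color cycle only inside this block
with plt.style.context('tableau-colorblind10'):
    # Different markers and line styles help when colors are hard to tell apart
    plt.plot(months, product_a, marker='o', linestyle='-', label='Product A')
    plt.plot(months, product_b, marker='s', linestyle='--', label='Product B')

    # Start the y-axis at 0 so differences are not exaggerated
    plt.ylim(0, 250)

    plt.title('Monthly Sales by Product')
    plt.xlabel('Month')
    plt.ylabel('Units Sold')
    plt.legend()
    plt.show()
```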

---

## Homework

1. **Practice**: Complete all 3 practice problems and try different styles
2. **Create**: Make different types of graphs with your own data
3. **Experiment**: Try different color and style options
4. **Apply**: Create graphs using CSV files from previous weeks
5. **Advanced**: Display multiple data series in one graph

**Graphs bring data to life - they tell stories that numbers alone cannot!**