
# Section 3.1: Visualizing Data with Scatter, Bar, and Line Charts

This section shows how to visualize data in Python with scatter, bar, and line charts, using pandas to prepare the data and matplotlib to draw it. All examples use the Iris dataset, starting with the basics and gradually moving on to more complex plots.

## Scatter Plots
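Scatter plots are used to observe the relationship between two variables. The example below loads the Iris dataset with pandas (it assumes `iris.data` is in the working directory) and draws one matplotlib scatter series per species.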


In [None]:

import pandas as pd
import matplotlib.pyplot as plt

# Load the dataset
iris_data_path = 'iris.data'
column_names = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species']
iris_df = pd.read_csv(iris_data_path, header=None, names=column_names)

# Create a scatter plot
plt.figure(figsize=(8, 6))
for species, group in iris_df.groupby('species'):
    plt.scatter(group['sepal_length'], group['sepal_width'], label=species)
plt.title('Sepal Length vs Sepal Width by Species')
plt.xlabel('Sepal Length (cm)')
plt.ylabel('Sepal Width (cm)')
plt.legend()
plt.show()
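
The same kind of plot can also be drawn through the pandas plotting interface. The following is a minimal sketch, assuming a small illustrative sample (`sample_df`, not the full dataset) in the same column layout. Colors are passed explicitly because `DataFrame.plot.scatter` does not advance the color cycle between calls on its own.

```python
import pandas as pd

# Illustrative sample only: a few rows in the same column layout as the Iris data
sample_df = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 4.7, 7.0, 6.4, 6.9],
    "sepal_width": [3.5, 3.0, 3.2, 3.2, 3.2, 3.1],
    "species": ["Iris-setosa"] * 3 + ["Iris-versicolor"] * 3,
})

# Draw every species onto one shared Axes, one scatter call per group.
# The first call creates the Axes (ax=None); later calls reuse it.
ax = None
for i, (species, group) in enumerate(sample_df.groupby("species")):
    ax = group.plot.scatter(
        x="sepal_length",
        y="sepal_width",
        color=f"C{i}",   # i-th color of the default matplotlib color cycle
        label=species,
        ax=ax,
    )

ax.set_title("Sepal Length vs Sepal Width by Species")
ax.set_xlabel("Sepal Length (cm)")
ax.set_ylabel("Sepal Width (cm)")
ax.legend()
```

In a notebook the figure renders when the cell finishes. In a script, import `matplotlib.pyplot as plt` and call `plt.show()` afterwards.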



## Bar Charts
Bar charts are useful for comparing a quantity across groups. The example below computes the mean sepal length of each species and plots it with the pandas `plot` method.


In [None]:

# Calculate the average sepal length for each species
average_sepal_length = iris_df.groupby('species')['sepal_length'].mean()

# Create a bar chart
average_sepal_length.plot(kind='bar', figsize=(8, 6))
plt.title('Average Sepal Length by Species')
plt.xlabel('Species')
plt.ylabel('Average Sepal Length (cm)')
plt.show()
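
A bar of averages hides how much the values vary within each group. As a hedged sketch of one way to show that, the snippet below aggregates both the mean and the standard deviation and passes the latter as error bars. The data is a small illustrative sample (`sample_df`), and the `capsize` and `rot` values are arbitrary styling choices.

```python
import pandas as pd

# Illustrative sample only: a few rows in the same column layout as the Iris data
sample_df = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 4.7, 7.0, 6.4, 6.9],
    "species": ["Iris-setosa"] * 3 + ["Iris-versicolor"] * 3,
})

# One row per species, with the mean and the sample standard deviation
summary = sample_df.groupby("species")["sepal_length"].agg(["mean", "std"])

# Bar height = mean, error bar = one standard deviation
ax = summary["mean"].plot(
    kind="bar",
    yerr=summary["std"],
    capsize=4,       # small caps on the error bars
    rot=0,           # keep species labels horizontal
    figsize=(8, 6),
)
ax.set_title("Average Sepal Length by Species")
ax.set_xlabel("Species")
ax.set_ylabel("Average Sepal Length (cm)")
```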



## Line Charts
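Line charts are ideal for showing trends over time or across any other ordered sequence. The Iris dataset has no time dimension, so the example below sorts the samples by petal length and plots each species in that order with matplotlib.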


In [None]:

# Sort the dataframe by petal length
sorted_iris_df = iris_df.sort_values('petal_length')

# Create a line chart
plt.figure(figsize=(10, 8))
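for species, group in sorted_iris_df.groupby('species'):
    # Plot positional values so that x is the rank within each species.
    # Passing the Series itself would use the original row labels as x,
    # which are out of order after sorting and produce a zigzag line.
    plt.plot(group['petal_length'].to_numpy(), label=species)
plt.title('Petal Length Variation by Species')
plt.xlabel('Sample rank within species (sorted by petal length)')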
plt.ylabel('Petal Length (cm)')
plt.legend()
plt.show()
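
To illustrate the "trend over time" case directly, here is a minimal sketch with a date-indexed series. The readings are made-up values for illustration only and are not part of the Iris dataset. The 3-day window is likewise an arbitrary choice.

```python
import pandas as pd

# Hypothetical measurements: one made-up reading per day, for illustration only
dates = pd.date_range("2024-01-01", periods=7, freq="D")
readings = pd.Series(
    [4.0, 4.2, 4.1, 4.5, 4.7, 4.6, 5.0], index=dates, name="reading"
)

# A 3-day rolling mean smooths day-to-day noise; the first two values are NaN
# because a full window is not yet available
smoothed = readings.rolling(window=3).mean()

# With a DatetimeIndex, pandas puts the dates on the x-axis automatically
ax = readings.plot(marker="o", label="Daily reading", figsize=(10, 6))
smoothed.plot(ax=ax, label="3-day rolling mean")
ax.set_title("Daily Readings with Rolling Mean")
ax.set_xlabel("Date")
ax.set_ylabel("Reading")
ax.legend()
```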



## Example Problems

Now that you have learned how to create scatter, bar, and line charts, try to solve the following problems using the Iris dataset:

1. Create a scatter plot showing the relationship between petal length and petal width for each species.
2. Generate a bar chart that shows the average petal width for each species.
3. Create a line chart that displays the variation in sepal width for each species across the dataset.
4. (Bonus) Combine a scatter and line chart in one figure to show the relationship between sepal length and sepal width, with a trend line for each species.

Remember to label your axes and add a title to each chart for clarity.
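
As a starting point for the bonus problem, the sketch below shows one possible approach rather than the only one: it draws a scatter series per group and overlays a straight-line least-squares fit from `numpy.polyfit`. The helper name `plot_scatter_with_trend` and the small `sample_df` are hypothetical and used only for illustration. With the full dataset, pass `iris_df` instead.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def plot_scatter_with_trend(df, x_col, y_col, group_col="species"):
    """Scatter y_col against x_col per group, with a linear trend line each.

    Hypothetical helper for illustration; returns the matplotlib Axes.
    """
    fig, ax = plt.subplots(figsize=(8, 6))
    for i, (name, group) in enumerate(df.groupby(group_col)):
        color = f"C{i}"  # same color for a group's points and its trend line
        ax.scatter(group[x_col], group[y_col], color=color, label=name)

        # Degree-1 least-squares fit: y is approximated by slope * x + intercept
        slope, intercept = np.polyfit(group[x_col], group[y_col], deg=1)
        xs = np.linspace(group[x_col].min(), group[x_col].max(), 50)
        ax.plot(xs, slope * xs + intercept, color=color, linestyle="--")

    ax.set_title(f"{y_col} vs {x_col} with linear trend per {group_col}")
    ax.set_xlabel(x_col)
    ax.set_ylabel(y_col)
    ax.legend()
    return ax


# Illustrative sample only: a few rows in the same column layout as the Iris data
sample_df = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 4.7, 7.0, 6.4, 6.9, 6.3, 5.8, 7.1],
    "sepal_width": [3.5, 3.0, 3.2, 3.2, 3.2, 3.1, 3.3, 2.7, 3.0],
    "species": ["Iris-setosa"] * 3 + ["Iris-versicolor"] * 3 + ["Iris-virginica"] * 3,
})

ax = plot_scatter_with_trend(sample_df, "sepal_length", "sepal_width")
```

A straight line is only a rough summary. With so few points per group in the sample, the fitted slopes are not meaningful, so rerun the helper on the full dataset before interpreting them.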
