# Interactive Python Tutor - Part 2: Your First Bar Chart

Welcome to Part 2! Now that we have our data loaded, let's create a visualization to answer a simple question: **What are the most common tree species in this dataset?**

A bar chart suits this question well because it compares counts across categories at a glance. Let's get started.

## Step 1: Load and Prepare Data (Recap)

First, let's run the code from Part 1 again to make sure our libraries and data are loaded.

In [None]:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Load the data again (sep=';' because the file separates columns with semicolons)
file_path = '2025_trees_steglitz.csv'
trees_df = pd.read_csv(file_path, sep=';')
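
To see why the `sep=';'` argument matters, here is a minimal sketch on a made-up two-row sample. The `pflanzjahr` column and the values are invented for illustration and are not taken from the real file.

```python
import io
import pandas as pd

# Made-up sample in a semicolon-separated layout (hypothetical values, not real data)
sample_csv = "baumart;pflanzjahr\nLinde;1990\nAhorn;2005\n"

# With sep=';' each field becomes its own column
with_sep = pd.read_csv(io.StringIO(sample_csv), sep=';')

# With the default comma separator, each row is read as one single field
without_sep = pd.read_csv(io.StringIO(sample_csv))

print(with_sep.columns.tolist())
print(without_sep.columns.tolist())
```

If the second list contains only one long column name, the separator is wrong for the file.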

## Step 2: Counting the Tree Species

To find the most common species, we need to count how many times each unique species appears in the `baumart` column.

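Pandas makes this easy with the `.value_counts()` method, which returns a Series with each species as the index and its count as the value, sorted from most to least frequent. Chaining `.nlargest(10)` then keeps only the 10 highest counts.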

In [None]:
# Count each species with value_counts(), then keep the 10 most frequent
top_10_species = trees_df['baumart'].value_counts().nlargest(10)

# Print the result to see what it looks like
print(top_10_species)
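
If the chained call feels abstract, the following sketch runs the same two steps on a tiny made-up Series, so you can check the counts by eye. The species values here are invented for illustration.

```python
import pandas as pd

# Tiny made-up column standing in for trees_df['baumart']
species = pd.Series(
    ["Linde", "Ahorn", "Linde", "Eiche", "Linde", "Ahorn"],
    name="baumart",
)

# Step 1: count how often each unique value appears (most frequent first)
counts = species.value_counts()

# Step 2: keep only the 2 largest counts
top_2 = counts.nlargest(2)

print(top_2)
```

The species names end up in the index and the counts in the values, which is exactly what the plotting code in Step 3 relies on.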

## Step 3: Creating the Plot

Now that we have the data for the top 10 species, we can create the bar chart. We'll use the `seaborn` library, which builds on matplotlib and produces polished statistical plots with very little code.

Here's a breakdown of the code:
1.  `plt.figure(figsize=(12, 7))`: This creates a blank canvas (a figure) for our plot and sets its size to 12 inches wide by 7 inches tall.
2.  `sns.barplot(x=top_10_species.values, y=top_10_species.index, palette='viridis')`: This is the main function that creates the bar chart.
    * `x=top_10_species.values`: The counts (the numbers) go on the x-axis (horizontal).
    * `y=top_10_species.index`: The species names go on the y-axis (vertical), so the bars run horizontally and the long names stay readable.
    * `palette='viridis'`: This colors the bars using the `viridis` color palette.
3.  `plt.title(...)`, `plt.xlabel(...)`, `plt.ylabel(...)`: These functions set the title and labels for our axes.
4.  `plt.show()`: This function displays the plot we've created.

In [None]:
# Set the figure size for the plot
plt.figure(figsize=(12, 7))

# Create the bar plot
sns.barplot(x=top_10_species.values, y=top_10_species.index, palette='viridis')

# Add a title and labels for clarity
plt.title('Top 10 Most Common Tree Species', fontsize=16)
plt.xlabel('Number of Trees', fontsize=12)
plt.ylabel('Tree Species', fontsize=12)

# Display the plot
plt.show()
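
To confirm how the `x` and `y` arguments map onto the chart, here is a small sketch on made-up counts. It assumes that `sns.barplot` draws one rectangle per category on the matplotlib axes, so each bar's width should equal its count. The figure size and the data are chosen only for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up counts standing in for top_10_species (not real data)
counts = pd.Series([3, 2, 1], index=["Linde", "Ahorn", "Eiche"])

# Create a figure and axes, then draw the bars onto those axes
fig, ax = plt.subplots(figsize=(6, 3))
sns.barplot(x=counts.values, y=counts.index, ax=ax)

# Numbers on x and names on y give horizontal bars,
# so each bar's width is the count for that species
bar_widths = [patch.get_width() for patch in ax.patches]
print(bar_widths)
```

Swapping the two arguments (`x=counts.index, y=counts.values`) would give vertical bars instead.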

### **Question 1:**
According to the bar chart, which tree species is the most common?

**Your Answer (Double-click to edit):**

* Most Common Species: 

### **Your Turn! Task 1:**

Let's practice. Can you modify the code to show the **Top 5** most common species instead of the top 10? The cell below is almost complete: replace the `...` placeholder with the expression that selects the top 5 species, using the code from Step 2 as a guide.

In [None]:
# YOUR CODE HERE
# 1. Get the top 5 species
top_5_species = ...

# 2. Create the plot
plt.figure(figsize=(12, 5))
sns.barplot(x=top_5_species.values, y=top_5_species.index, palette='plasma') # Using a different palette!

# 3. Add titles and labels
plt.title('Top 5 Most Common Tree Species', fontsize=16)
plt.xlabel('Number of Trees', fontsize=12)
plt.ylabel('Tree Species', fontsize=12)

# 4. Show the plot
plt.show()

## End of Part 2

Fantastic! You've now created your first data visualization in Python. You can see how quickly we can go from raw data to a clear, informative chart.

**In the next part, we'll learn about data cleaning by shortening those long tree names to make our plots even better!**