# Introduction to Plotly: Interactive Data Visualization

**Python for Data Science**  
**Learn and Help**

---

## What is Plotly?

Plotly is a powerful Python library for creating **interactive** data visualizations. Unlike static graphs (like those from Matplotlib), Plotly charts allow you to:

- 🔍 **Zoom in and out** of specific areas
- 👆 **Hover over data points** to see exact values
- 🖱️ **Pan across** the visualization
- 👁️ **Toggle data series** on/off by clicking legend items
- 💾 **Save** visualizations as PNG images
- 📊 **Rotate and explore** 3D plots

This makes Plotly perfect for exploring data and creating dashboards!

## Installation

First, let's make sure Plotly is installed:

In [1]:
# Install Plotly (run this if you haven't already)
!pip install plotly



## Import Libraries

We'll use two main Plotly modules:
- `plotly.express` - Simple, high-level interface (like Seaborn)
- `plotly.graph_objects` - More control and customization (like Matplotlib)

In [2]:
import plotly.express as px
import plotly.graph_objects as go
import pandas as pd
import numpy as np

---

# Part 1: Interactive Features - What Makes Plotly Special!

Let's explore the interactive features that make Plotly different from static graphs.

## 1.1 Hover Information

**Try this:** After running the cell below, move your mouse over the data points!

In [3]:
# Sample data: Student test scores
students = pd.DataFrame({
    'Student': ['Alice', 'Bob', 'Charlie', 'Diana', 'Eva', 'Frank', 'Grace', 'Henry'],
    'Math_Score': [85, 92, 78, 95, 88, 76, 91, 83],
    'Science_Score': [88, 85, 82, 93, 90, 79, 89, 86],
    'Study_Hours': [5, 7, 4, 8, 6, 3, 7, 5]
})

# Create scatter plot
fig = px.scatter(students,
                 x='Study_Hours',
                 y='Math_Score',
                 size='Science_Score',
                 color='Science_Score',
                 hover_data=['Student'],
                 title='Student Performance: Math Scores vs Study Hours',
                 labels={'Study_Hours': 'Hours Studied per Week', 'Math_Score': 'Math Score (%)'})

fig.show()

print("\n✨ INTERACTIVE FEATURES TO TRY:")
print("1. Hover your mouse over any point to see student details")
print("2. The hover box shows: Student name, study hours, math score, and science score")




## 1.2 Zoom and Pan

**Try this:** After running the cell below:
- Click and drag to select an area to zoom in
- Double-click to reset the zoom
- Use the pan tool to move around

In [4]:
# Generate more data for better zoom demonstration
np.random.seed(42)
n_points = 200

data = pd.DataFrame({
    'x': np.random.randn(n_points) * 10,
    'y': np.random.randn(n_points) * 10,
    'category': np.random.choice(['Group A', 'Group B', 'Group C'], n_points)
})

fig = px.scatter(data, x='x', y='y', color='category',
                title='Zoom and Pan Demo - Try clicking and dragging!',
                width=800, height=500)

fig.show()

print("\n✨ INTERACTIVE FEATURES TO TRY:")
print("1. Click and drag to select an area → Zoom in")
print("2. Double-click anywhere → Reset zoom")
print("3. Click the 'Pan' button in the toolbar → Click and drag to move around")
print("4. Click 'Box Select' or 'Lasso Select' → Select multiple points")




## 1.3 Toggle Legend Items

**Try this:** Click on the legend items to hide/show different data series!

In [5]:
# Monthly sales data for three products
months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
sales_data = pd.DataFrame({
    'Month': months * 3,
    'Sales': [120, 135, 150, 145, 160, 175, 180, 170, 165, 185, 190, 200,  # Product A
              80, 85, 95, 100, 110, 105, 115, 120, 125, 130, 135, 140,      # Product B
              60, 65, 70, 68, 75, 80, 85, 90, 88, 95, 100, 105],            # Product C
    'Product': ['Product A']*12 + ['Product B']*12 + ['Product C']*12
})

fig = px.line(sales_data, x='Month', y='Sales', color='Product',
             title='2024 Sales by Product - Click Legend to Toggle!',
             markers=True)

fig.show()

print("\n✨ INTERACTIVE FEATURES TO TRY:")
print("1. Single-click a legend item → Hide/show that product's line")
print("2. Double-click a legend item → Isolate just that product")
print("3. Double-click again → Show all products again")
print("4. This helps you focus on specific data without creating new graphs!")




## 1.4 Toolbar Actions

Look at the toolbar that appears in the top-right corner of each plot when you move your mouse over it. The exact buttons vary by chart type; here's what the common ones do:

| Icon | Name | What It Does |
|------|------|-------------|
| 📷 | Download plot as PNG | Save the current view as an image |
| 🔍 | Zoom | Click and drag to zoom into a region |
| ↔️ | Pan | Click and drag to move around |
| ☐ | Box Select | Select multiple points in a rectangle |
| ⚪ | Lasso Select | Select multiple points with a free-form shape |
| 📊 | Zoom In/Out | Zoom in or out by set increments |
| 🏠 | Reset axes | Return to the original view |
| ⚙️ | Toggle spike lines | Show/hide lines to axes when hovering |

**Try clicking these buttons and see what happens!**

---

# Part 2: Core Plotly Capabilities

Now let's explore the different types of visualizations Plotly can create!

## 2.1 Bar Charts

Great for comparing categories.

In [None]:
# Video game sales data
games = pd.DataFrame({
    'Game': ['Minecraft', 'GTA V', 'Tetris', 'Wii Sports', 'PUBG', 'Mario Kart 8'],
    'Sales_Millions': [238, 185, 100, 83, 75, 60],
    'Platform': ['Multi', 'Multi', 'Multi', 'Wii', 'Multi', 'Switch']
})

fig = px.bar(games,
             x='Game',
             y='Sales_Millions',
             color='Platform',
             title='Best-Selling Video Games of All Time',
             labels={'Sales_Millions': 'Sales (Millions)'})

fig.show()

print("\n✨ Try hovering over the bars to see exact sales numbers!")

## 2.2 Line Charts

Perfect for showing trends over time.

In [None]:
# Temperature data
dates = pd.date_range('2024-01-01', '2024-12-31', freq='MS')  # 12 month-start dates; the 'M' alias is deprecated in pandas 2.2+
temps = pd.DataFrame({
    'Date': dates,
    'Chicago': [25, 28, 40, 52, 63, 73, 76, 75, 68, 56, 42, 30],
    'Miami': [70, 72, 75, 79, 82, 85, 87, 87, 85, 81, 76, 72]
})

# Reshape for Plotly
temps_long = temps.melt(id_vars='Date', var_name='City', value_name='Temperature')

fig = px.line(temps_long,
              x='Date',
              y='Temperature',
              color='City',
              title='2024 Average Temperature Comparison',
              labels={'Temperature': 'Temperature (°F)'},
              markers=True)

fig.show()

print("\n✨ Try clicking on cities in the legend to compare them individually!")

## 2.3 Scatter Plots

Show relationships between two numeric variables.

In [None]:
# NBA player stats (fictional example data)
nba = pd.DataFrame({
    'Player': ['LeBron', 'Curry', 'Durant', 'Giannis', 'Jokic', 'Embiid', 'Tatum', 'Doncic'],
    'Points_Per_Game': [25.7, 29.4, 28.2, 31.1, 26.4, 33.1, 26.9, 32.4],
    'Assists_Per_Game': [7.3, 6.1, 5.0, 5.7, 9.0, 4.2, 4.6, 8.0],
    'Position': ['Forward', 'Guard', 'Forward', 'Forward', 'Center', 'Center', 'Forward', 'Guard']
})

fig = px.scatter(nba,
                 x='Assists_Per_Game',
                 y='Points_Per_Game',
                 size='Points_Per_Game',
                 color='Position',
                 hover_data=['Player'],
                 title='NBA Stars: Points vs Assists',
                 labels={'Points_Per_Game': 'Points per Game',
                        'Assists_Per_Game': 'Assists per Game'})

fig.show()

print("\n✨ Hover to see player names and their stats!")

## 2.4 Pie Charts

Display parts of a whole.

In [None]:
# Social media usage
social_media = pd.DataFrame({
    'Platform': ['YouTube', 'Instagram', 'TikTok', 'Snapchat', 'Facebook', 'Other'],
    'Time_Spent': [35, 25, 20, 12, 5, 3]
})

fig = px.pie(social_media,
             values='Time_Spent',
             names='Platform',
             title='Teen Social Media Usage (% of time)',
             hole=0.3)  # Makes it a donut chart!

fig.show()

print("\n✨ Hover over a slice to see its details! Click a legend item to hide or show that slice!")

## 2.5 Histograms

Show the distribution of data.

In [None]:
# Generate test scores
np.random.seed(42)
scores = pd.DataFrame({
    'Score': np.random.normal(75, 10, 100).clip(0, 100),
    'Subject': ['Math'] * 50 + ['Science'] * 50
})

fig = px.histogram(scores,
                   x='Score',
                   color='Subject',
                   nbins=20,
                   title='Test Score Distribution',
                   labels={'Score': 'Test Score (%)'},
                   barmode='overlay',
                   opacity=0.7)

fig.show()

print("\n✨ Toggle subjects in the legend to compare distributions!")

## 2.6 Box Plots

Show statistical distribution with quartiles.

In [None]:
# Movie ratings by genre
np.random.seed(42)
ratings = pd.DataFrame({
    'Rating': np.concatenate([np.random.normal(7.5, 1.2, 30),
                             np.random.normal(6.8, 1.5, 30),
                             np.random.normal(7.2, 1.0, 30)]),
    'Genre': ['Action']*30 + ['Comedy']*30 + ['Drama']*30
})

fig = px.box(ratings,
             x='Genre',
             y='Rating',
             title='Movie Ratings by Genre',
             labels={'Rating': 'IMDb Rating'},
             color='Genre',
             points='all')  # Show all points

fig.show()

print("\n✨ Hover over the box to see median, quartiles, and outliers!")
print("The line in the middle is the median.")
print("The box shows where 50% of the data falls.")
print("The whiskers show the range of most data.")
print("Dots outside are outliers!")
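
The numbers behind a box plot can be worked out by hand. Here is a sketch with NumPy on made-up ratings, using the common 1.5 × IQR rule for outliers. Plotly's default way of computing quartiles may give slightly different values than `np.percentile`, so treat this as an illustration of the idea rather than an exact match.

```python
import numpy as np

# Made-up ratings, for illustration only
rng = np.random.default_rng(42)
ratings = rng.normal(7.2, 1.0, 30)

# The box: first quartile, median, third quartile
q1, median, q3 = np.percentile(ratings, [25, 50, 75])
iqr = q3 - q1   # interquartile range: the height of the box

# A common convention: points more than 1.5 * IQR beyond the box are outliers
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = ratings[(ratings < lower_fence) | (ratings > upper_fence)]

print(f"Q1={q1:.2f}  median={median:.2f}  Q3={q3:.2f}  IQR={iqr:.2f}")
print(f"Fences: {lower_fence:.2f} to {upper_fence:.2f}, outliers: {len(outliers)}")
```

The whiskers are drawn to the most extreme data points that still fall inside the fences, which is why they rarely sit exactly on the fence values.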

## 2.7 Heatmaps

Visualize data in a matrix format with colors.

In [6]:
# Student performance across subjects and weeks
weeks = ['Week 1', 'Week 2', 'Week 3', 'Week 4']
subjects = ['Math', 'Science', 'English', 'History', 'Art']

# Create random scores
np.random.seed(42)
performance = np.random.randint(70, 100, size=(5, 4))

fig = go.Figure(data=go.Heatmap(
    z=performance,
    x=weeks,
    y=subjects,
    colorscale='RdYlGn',
    text=performance,
    texttemplate='%{text}',
    textfont={"size": 14},
    hoverongaps=False))

fig.update_layout(title='Student Performance Heatmap (Weekly Scores)',
                 xaxis_title='Week',
                 yaxis_title='Subject')

fig.show()

print("\n✨ Hover over each cell to see the exact score!")
print("Green = Higher scores, Red = Lower scores")




## 2.8 3D Scatter Plots

Explore three-dimensional relationships - you can rotate these!

In [None]:
# 3D data: Height, Weight, Age
np.random.seed(42)
n = 50

data_3d = pd.DataFrame({
    'Height': np.random.normal(165, 10, n),
    'Weight': np.random.normal(65, 12, n),
    'Age': np.random.randint(13, 18, n),
    'Gender': np.random.choice(['Male', 'Female'], n)
})

fig = px.scatter_3d(data_3d,
                    x='Height',
                    y='Weight',
                    z='Age',
                    color='Gender',
                    title='3D Plot: Height vs Weight vs Age',
                    labels={'Height': 'Height (cm)', 'Weight': 'Weight (kg)'})

fig.show()

print("\n✨ THIS IS SUPER COOL!")
print("1. Click and drag to ROTATE the 3D plot!")
print("2. Scroll to zoom in and out")
print("3. Hover to see data points")
print("4. You can view your data from any angle!")

---

# Part 3: Customization

Plotly gives you lots of control over how your graphs look!

## 3.1 Updating Layout and Styling

In [None]:
# Sample data
data = pd.DataFrame({
    'Category': ['A', 'B', 'C', 'D', 'E'],
    'Values': [23, 45, 56, 78, 32]
})

fig = px.bar(data, x='Category', y='Values')

# Customize the figure
fig.update_layout(
    title='Customized Bar Chart',
    title_font_size=24,
    title_font_color='darkblue',
    xaxis_title='Categories',
    yaxis_title='Values',
    font=dict(size=14),
    plot_bgcolor='lightgray',
    width=800,
    height=500
)

# Customize the bars
fig.update_traces(
    marker_color='coral',
    marker_line_color='darkred',
    marker_line_width=2
)

fig.show()

## 3.2 Adding Annotations and Shapes

In [None]:
# Create a line chart
months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun']
sales = [100, 120, 115, 140, 160, 155]

fig = go.Figure()
fig.add_trace(go.Scatter(x=months, y=sales, mode='lines+markers', name='Sales'))

# Add a horizontal line (target)
fig.add_hline(y=130, line_dash="dash", line_color="red",
              annotation_text="Sales Target", annotation_position="right")

# Add an annotation for peak sales
fig.add_annotation(x='May', y=160,
                  text="Peak Sales!",
                  showarrow=True,
                  arrowhead=2,
                  arrowcolor="green",
                  font=dict(size=14, color="green"))

fig.update_layout(title='Sales with Annotations',
                 xaxis_title='Month',
                 yaxis_title='Sales ($1000s)')

fig.show()

print("\n✨ Annotations help highlight important points in your data!")

---

# Part 4: Multiple Subplots

Create dashboards with multiple plots in one figure!

In [None]:
from plotly.subplots import make_subplots

# Create subplots: 2 rows, 2 columns
fig = make_subplots(
    rows=2, cols=2,
    subplot_titles=('Temperature', 'Rainfall', 'Humidity', 'Wind Speed')
)

# Sample weather data
days = list(range(1, 8))
temp = [72, 75, 73, 78, 80, 76, 74]
rain = [0.1, 0.3, 0.0, 0.5, 0.2, 0.0, 0.4]
humidity = [65, 70, 60, 75, 68, 62, 72]
wind = [5, 8, 6, 10, 7, 5, 9]

# Add traces
fig.add_trace(go.Scatter(x=days, y=temp, name='Temp', mode='lines+markers'), row=1, col=1)
fig.add_trace(go.Bar(x=days, y=rain, name='Rain'), row=1, col=2)
fig.add_trace(go.Scatter(x=days, y=humidity, name='Humidity', mode='lines'), row=2, col=1)
fig.add_trace(go.Bar(x=days, y=wind, name='Wind'), row=2, col=2)

# Update layout
fig.update_layout(height=600, width=900, title_text="Weekly Weather Dashboard")
fig.update_xaxes(title_text="Day", row=2, col=1)
fig.update_xaxes(title_text="Day", row=2, col=2)
fig.update_yaxes(title_text="°F", row=1, col=1)
fig.update_yaxes(title_text="Inches", row=1, col=2)
fig.update_yaxes(title_text="%", row=2, col=1)
fig.update_yaxes(title_text="mph", row=2, col=2)

fig.show()

print("\n✨ Each subplot is interactive! Try hovering and zooming in each one!")

---

# Part 5: Practice Exercises

Now it's your turn! Try these exercises to practice what you've learned.

## Exercise 1: Create Your Own Scatter Plot

Create a scatter plot using the data below showing the relationship between hours spent gaming and grades.

In [None]:
# Data provided
gaming_data = pd.DataFrame({
    'Student': ['Alex', 'Brian', 'Casey', 'Dana', 'Eli', 'Fiona', 'George', 'Hannah'],
    'Gaming_Hours': [2, 5, 1, 8, 3, 6, 4, 1.5],
    'GPA': [3.8, 3.2, 3.9, 2.5, 3.6, 2.9, 3.4, 3.7]
})

# YOUR CODE HERE:
# Create a scatter plot with Gaming_Hours on x-axis and GPA on y-axis
# Include student names in the hover data
# Add an appropriate title and labels


## Exercise 2: Create a Line Chart

Create a line chart showing how your sleep hours changed over a week.

In [None]:
# Data provided
sleep_data = pd.DataFrame({
    'Day': ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday'],
    'Sleep_Hours': [7, 6.5, 7.5, 6, 8, 9, 8.5]
})

# YOUR CODE HERE:
# Create a line chart with markers
# Add appropriate title and labels


## Exercise 3: Create a Bar Chart with Custom Colors

Create a bar chart showing favorite pizza toppings and customize the colors!

In [None]:
# Data provided
pizza_data = pd.DataFrame({
    'Topping': ['Pepperoni', 'Mushrooms', 'Olives', 'Sausage', 'Onions'],
    'Votes': [45, 23, 12, 38, 15]
})

# YOUR CODE HERE:
# Create a bar chart
# Try using fig.update_traces() to change the color
# Add title and labels


---

# Summary: Plotly vs Static Plots

## Why Choose Plotly?

| Feature | Static Plots (Matplotlib) | Interactive Plots (Plotly) |
|---------|--------------------------|---------------------------|
| Hover Information | ❌ No | ✅ Yes - see exact values |
| Zoom In/Out | ❌ No | ✅ Yes - explore details |
| Toggle Data Series | ❌ No | ✅ Yes - click legend |
| Pan Around | ❌ No | ✅ Yes - move the view |
| Save as Image | ✅ Yes | ✅ Yes - built-in button |
| 3D Rotation | ❌ Limited | ✅ Yes - full 360° |
| Data Exploration | ⚠️ Limited | ✅ Excellent |
| Dashboard Creation | ⚠️ Harder | ✅ Easy with subplots |

## When to Use Plotly:
- 📊 Creating dashboards
- 🔍 Exploring large datasets
- 🌐 Building web applications
- 🎓 Presenting data interactively
- 📈 Data analysis and investigation

## When Static Plots Might Be Better:
- 📄 Printing in reports (though Plotly can export too!)
- 🎨 Very specific custom artistic visualizations
- 💾 Minimal file size requirements

---

## Next Steps:

1. **Practice with Your Own Data**: Try creating Plotly visualizations with datasets from your other classes!

2. **Explore Plotly Express**: Check out the [Plotly Express documentation](https://plotly.com/python/plotly-express/) for more chart types

3. **Build a Dashboard**: Combine multiple plots to tell a story with your data

4. **Advanced Features**: Learn about:
   - Animated plots
   - Geographic maps
   - Statistical charts
   - Interactive widgets

---

**Remember**: The best way to learn is by doing! Experiment with different chart types and customize them to make them your own. Every plot in Plotly is interactive by default - that's the magic! ✨

