# Plotly Cheat Sheet for Data Scientists

Plotly is a library for creating interactive visualizations. It is well suited to web applications, dashboards, and any plot that readers need to zoom, hover over, or filter.

**Key Advantages**:
- Interactive by default (zoom, pan, hover, select)
- Publication-quality plots
- 3D plotting capabilities
- Easy web integration
- Both high-level (Plotly Express) and low-level (Graph Objects) APIs


## 1. Setup and Basic Concepts


In [1]:
import plotly.express as px  # High-level interface (recommended for most uses)
import plotly.graph_objects as go  # Low-level interface (more control)
import plotly.figure_factory as ff  # Specialized plots
import pandas as pd
import numpy as np

# For Jupyter notebooks
import plotly.io as pio
pio.renderers.default = "notebook"  # or "colab" for Google Colab

# Load sample data
df = px.data.tips()  # Same as seaborn tips dataset
iris = px.data.iris()
gapminder = px.data.gapminder()

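# Legacy notebook setup (not needed in Plotly 4+ when a renderer is set above)
import plotly.offline as pyo
pyo.init_notebook_mode(connected=True)  # connected=True loads plotly.js from a CDN; use False to embed it for offline use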


## 2. Plotly Express (High-Level API)

### The px approach - similar to seaborn but interactive


In [2]:
# Scatter plot
fig = px.scatter(df, x='total_bill', y='tip', title='Tips vs Total Bill')
fig.show()

# Scatter with color and size
fig = px.scatter(df, x='total_bill', y='tip', 
                 color='time', size='size',
                 hover_data=['day'],
                 title='Tips Analysis')
fig.show()

# Line plot
fig = px.line(gapminder.query("country=='Canada'"), 
              x='year', y='lifeExp',
              title='Life Expectancy in Canada')
fig.show()

# Bar plot
fig = px.bar(df.groupby('day')['total_bill'].mean().reset_index(),
             x='day', y='total_bill',
             title='Average Bill by Day')
fig.show()

# Histogram
fig = px.histogram(df, x='total_bill', nbins=20,
                   title='Distribution of Total Bills')
fig.show()

# Box plot
fig = px.box(df, x='day', y='total_bill', color='time',
             title='Bill Distribution by Day and Time')
fig.show()


## 3. Advanced Plotly Express


In [3]:
# Faceted plots (subplots)
fig = px.scatter(df, x='total_bill', y='tip',
                 facet_col='time', facet_row='sex',
                 color='day',
                 title='Tips by Time and Sex')
fig.show()

# 3D scatter plot
fig = px.scatter_3d(iris, x='sepal_length', y='sepal_width', z='petal_length',
                    color='species', size='petal_width',
                    title='3D Iris Dataset')
fig.show()

# Animated plots
fig = px.scatter(gapminder, x="gdpPercap", y="lifeExp", 
                 size="pop", color="continent",
                 hover_name="country", log_x=True, size_max=55,
                 animation_frame="year", animation_group="country",
                 title="Gapminder Animation")
fig.show()

# Choropleth map
fig = px.choropleth(gapminder.query("year==2007"), 
                    locations="iso_alpha", color="lifeExp",
                    hover_name="country", color_continuous_scale=px.colors.sequential.Plasma,
                    title="Life Expectancy by Country (2007)")
fig.show()

# Parallel coordinates
fig = px.parallel_coordinates(iris, color="species_id", 
                              labels={"species_id": "Species",
                                      "sepal_width": "Sepal Width",
                                      "sepal_length": "Sepal Length",  
                                      "petal_width": "Petal Width",
                                      "petal_length": "Petal Length"},
                              title="Iris Parallel Coordinates")
fig.show()


## 4. Graph Objects (Low-Level API)

### When you need more control


In [4]:
# Basic scatter plot with Graph Objects
fig = go.Figure()
fig.add_trace(go.Scatter(x=df['total_bill'], y=df['tip'],
                         mode='markers',
                         name='Tips',
                         marker=dict(size=8, color='blue')))
fig.update_layout(title='Tips vs Total Bill',
                  xaxis_title='Total Bill ($)',
                  yaxis_title='Tip ($)')
fig.show()

# Multiple traces
fig = go.Figure()

for time in df['time'].unique():
    subset = df[df['time'] == time]
    fig.add_trace(go.Scatter(x=subset['total_bill'], y=subset['tip'],
                             mode='markers',
                             name=time,
                             marker=dict(size=8)))

fig.update_layout(title='Tips by Time of Day',
                  xaxis_title='Total Bill ($)',
                  yaxis_title='Tip ($)')
fig.show()

# Bar chart with custom styling
fig = go.Figure()
day_avg = df.groupby('day')['total_bill'].mean()
fig.add_trace(go.Bar(x=day_avg.index, y=day_avg.values,
                     marker_color=['red', 'blue', 'green', 'orange']))
fig.update_layout(title='Average Bill by Day',
                  xaxis_title='Day',
                  yaxis_title='Average Bill ($)')
fig.show()


## 5. Subplots and Complex Layouts


In [5]:
from plotly.subplots import make_subplots

# Create subplots
fig = make_subplots(rows=2, cols=2,
                    subplot_titles=('Scatter', 'Box', 'Histogram', 'Bar'))

# Add traces to different subplots
fig.add_trace(go.Scatter(x=df['total_bill'], y=df['tip'], mode='markers'),
              row=1, col=1)

fig.add_trace(go.Box(y=df['total_bill'], name='Total Bill'),
              row=1, col=2)

fig.add_trace(go.Histogram(x=df['total_bill'], nbinsx=20),
              row=2, col=1)

day_counts = df['day'].value_counts()
fig.add_trace(go.Bar(x=day_counts.index, y=day_counts.values),
              row=2, col=2)

fig.update_layout(height=600, showlegend=False,
                  title_text="Multiple Plot Types")
fig.show()

# Mixed subplot types
fig = make_subplots(
    rows=2, cols=2,
    specs=[[{"type": "scatter"}, {"type": "scatter"}],
           [{"type": "scatter", "colspan": 2}, None]],
    subplot_titles=('Plot 1', 'Plot 2', 'Plot 3'))

fig.add_trace(go.Scatter(x=[1, 2, 3], y=[4, 5, 6]), row=1, col=1)
fig.add_trace(go.Scatter(x=[1, 2, 3], y=[2, 3, 4]), row=1, col=2)
fig.add_trace(go.Scatter(x=[1, 2, 3, 4], y=[2, 3, 4, 5]), row=2, col=1)

fig.update_layout(height=500, title_text="Custom Subplot Layout")
fig.show()


## 6. Heatmaps and Matrix Visualizations


In [6]:
# Correlation heatmap
corr_matrix = df.select_dtypes(include=[np.number]).corr()

fig = go.Figure(data=go.Heatmap(
    z=corr_matrix.values,
    x=corr_matrix.columns,
    y=corr_matrix.columns,
    colorscale='RdBu_r',  # reversed so positive correlations are red, matching the px.imshow example below
    zmid=0,
    text=corr_matrix.round(2).values,
    texttemplate="%{text}",
    textfont={"size": 10},
    hoverongaps=False))

fig.update_layout(title='Correlation Matrix',
                  xaxis_title='Variables',
                  yaxis_title='Variables')
fig.show()

# Annotated heatmap with Plotly Express
fig = px.imshow(corr_matrix, 
                text_auto=True, 
                color_continuous_scale='RdBu_r',
                title='Correlation Matrix (Plotly Express)')
fig.show()

# 2D histogram
fig = go.Figure(data=go.Histogram2d(x=df['total_bill'], y=df['tip'],
                                    colorscale='Blues'))
fig.update_layout(title='2D Histogram of Tips vs Total Bill')
fig.show()


## 7. Time Series and Financial Plots


In [7]:
# Create sample time series data
dates = pd.date_range('2023-01-01', periods=100, freq='D')
values = np.cumsum(np.random.randn(100)) + 100
ts_df = pd.DataFrame({'date': dates, 'value': values})

# Basic time series
fig = px.line(ts_df, x='date', y='value', title='Time Series Plot')
fig.show()

# Time series with range selector
fig = go.Figure()
fig.add_trace(go.Scatter(x=ts_df['date'], y=ts_df['value'],
                         mode='lines', name='Value'))

fig.update_layout(
    title='Time Series with Range Selector',
    xaxis=dict(
        rangeselector=dict(
            buttons=list([
                dict(count=7, label="7d", step="day", stepmode="backward"),
                dict(count=30, label="30d", step="day", stepmode="backward"),
                dict(step="all")
            ])
        ),
        rangeslider=dict(visible=True),
        type="date"
    )
)
fig.show()

# Candlestick chart (for financial data)
# Create sample OHLC data
np.random.seed(42)
dates = pd.date_range('2023-01-01', periods=30, freq='D')
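open_prices = 100 + np.cumsum(np.random.randn(30) * 0.5)
close_prices = open_prices + np.random.randn(30) * 0.5
# High/low must bracket both open and close for a valid candlestick
high_prices = np.maximum(open_prices, close_prices) + np.random.rand(30)
low_prices = np.minimum(open_prices, close_prices) - np.random.rand(30)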

fig = go.Figure(data=go.Candlestick(x=dates,
                                    open=open_prices,
                                    high=high_prices,
                                    low=low_prices,
                                    close=close_prices))
fig.update_layout(title='Candlestick Chart')
fig.show()


## 8. Customization and Styling


In [8]:
# Custom color schemes
fig = px.scatter(df, x='total_bill', y='tip', color='time',
                 color_discrete_map={'Lunch': 'red', 'Dinner': 'blue'})
fig.show()

# Custom layout
fig = px.scatter(df, x='total_bill', y='tip', color='day')
fig.update_layout(
    title=dict(text='Tips Analysis', x=0.5, font=dict(size=20)),
    xaxis=dict(title='Total Bill ($)', gridcolor='lightgray'),
    yaxis=dict(title='Tip ($)', gridcolor='lightgray'),
    plot_bgcolor='white',
    paper_bgcolor='lightgray',
    font=dict(family="Arial", size=12),
    showlegend=True,
    legend=dict(x=0.02, y=0.98, bgcolor='rgba(255,255,255,0.8)')
)
fig.show()

# Custom hover information
fig = px.scatter(df, x='total_bill', y='tip', 
                 custom_data=['time', 'size'],  # maps to customdata[0] and customdata[1] in the template
                 hover_name='day')
fig.update_traces(hovertemplate='<b>%{hovertext}</b><br>' +
                                'Total Bill: $%{x}<br>' +
                                'Tip: $%{y}<br>' +
                                'Time: %{customdata[0]}<br>' +
                                'Party Size: %{customdata[1]}')
fig.show()

# Annotations
fig = px.scatter(df, x='total_bill', y='tip')
fig.add_annotation(x=40, y=8,
                   text="High tip region",
                   showarrow=True,
                   arrowhead=2,
                   arrowsize=1,
                   arrowwidth=2,
                   arrowcolor="#636363")
fig.show()


## 9. Interactive Features


In [9]:
# Dropdown menus
fig = go.Figure()

# Add all data initially
fig.add_trace(go.Scatter(x=df['total_bill'], y=df['tip'],
                         mode='markers', name='All Data'))

# Create dropdown buttons
dropdown_buttons = []
for day in df['day'].unique():
    subset = df[df['day'] == day]
    dropdown_buttons.append(
        dict(label=day,
             method="restyle",
             args=[{"x": [subset['total_bill']], 
                    "y": [subset['tip']],
                    "name": [day]}])
    )

# Add "All" option
dropdown_buttons.insert(0, 
    dict(label="All",
         method="restyle", 
         args=[{"x": [df['total_bill']], 
                "y": [df['tip']],
                "name": ["All Data"]}])
)

fig.update_layout(
    updatemenus=[
        dict(
            buttons=dropdown_buttons,
            direction="down",
            showactive=True,
            x=0.01,
            xanchor="left",
            y=1.02,
            yanchor="top"
        ),
    ],
    title="Interactive Dropdown"
)
fig.show()

# Range slider
fig = px.scatter(df, x='total_bill', y='tip', color='day')
fig.update_layout(xaxis=dict(rangeslider=dict(visible=True)))
fig.show()

# Crossfilter-style selection
fig = px.scatter(df, x='total_bill', y='tip', color='day',
                 title='Click and drag to select points')
fig.update_layout(dragmode='select')
fig.show()


## 10. Exporting and Sharing


In [11]:
# Static image export (requires kaleido: pip install kaleido)
fig = px.scatter(df, x='total_bill', y='tip', color='day')

# Check if kaleido is available for image export
try:
    import kaleido
    KALEIDO_AVAILABLE = True
    print("✅ Kaleido available - static image export enabled")
except ImportError:
    KALEIDO_AVAILABLE = False
    print("⚠️  Kaleido not available - static image export disabled")
    print("   Install with: pip install kaleido")

# Export as various formats (only if kaleido available)
if KALEIDO_AVAILABLE:
    try:
        fig.write_image("plot.png", width=800, height=600)
        fig.write_image("plot.pdf", width=800, height=600)
        fig.write_image("plot.svg", width=800, height=600)
        print("✅ Static images exported successfully!")
    except Exception as e:
        print(f"⚠️  Static image export failed: {e}")
else:
    print("📝 Skipping static image export (kaleido not installed)")
    print("   Alternative: Use fig.show() and save manually from browser")

# HTML export (interactive) - always works
try:
    fig.write_html("plot.html")
    print("✅ Interactive HTML exported successfully!")
except Exception as e:
    print(f"⚠️  HTML export failed: {e}")

# Show as HTML string - always works
try:
    html_str = fig.to_html(include_plotlyjs='cdn')  # Uses CDN
    html_str_inline = fig.to_html(include_plotlyjs='inline')  # Inline JS
    print("✅ HTML string generation successful!")
except Exception as e:
    print(f"⚠️  HTML string generation failed: {e}")

# JSON export/import - always works
try:
    json_str = fig.to_json()
    fig_from_json = pio.from_json(json_str)  # rebuild the figure from the JSON string
    print("✅ JSON export/import successful!")
except Exception as e:
    print(f"⚠️  JSON export/import failed: {e}")

print("\nExport Summary:")
print("- Interactive HTML: Always available")
print("- JSON format: Always available") 
print(f"- Static images (PNG/PDF/SVG): {'Available' if KALEIDO_AVAILABLE else 'Requires kaleido package'}")
if not KALEIDO_AVAILABLE:
    print("\nTo enable static image export:")
    print("  conda install -c conda-forge python-kaleido")
    print("  # or")
    print("  pip install kaleido")


⚠️  Kaleido not available - static image export disabled
   Install with: pip install kaleido
📝 Skipping static image export (kaleido not installed)
   Alternative: Use fig.show() and save manually from browser
✅ Interactive HTML exported successfully!
✅ HTML string generation successful!
✅ JSON export/import successful!

Export Summary:
- Interactive HTML: Always available
- JSON format: Always available
- Static images (PNG/PDF/SVG): Requires kaleido package

To enable static image export:
  conda install -c conda-forge python-kaleido
  # or
  pip install kaleido


## 11. Performance Tips


In [12]:
# For large datasets, use WebGL rendering
fig = go.Figure()
fig.add_trace(go.Scattergl(  # Note: Scattergl instead of Scatter
    x=np.random.randn(10000),
    y=np.random.randn(10000),
    mode='markers',
    marker=dict(size=3)
))
fig.update_layout(title='Large Dataset with WebGL')
fig.show()

# Sampling for very large datasets
def sample_data(df, n_samples=1000):
    if len(df) <= n_samples:
        return df
    return df.sample(n=n_samples, random_state=42)

# Use with large datasets
large_sample = sample_data(df, 100)  # Sample for demo
fig = px.scatter(large_sample, x='total_bill', y='tip')
fig.show()

# Compact layout for web embedding (file size depends mainly on the data and on include_plotlyjs)
fig = px.scatter(df, x='total_bill', y='tip')
fig.update_layout(
    margin=dict(l=0, r=0, t=30, b=0),  # Reduce margins
    showlegend=False,  # Remove legend if not needed
    font=dict(size=10)  # Smaller font
)

print("Performance optimizations applied!")


Performance optimizations applied!


## 12. Quick Reference

### When to Use Plotly Express vs Graph Objects

| Use Case | Recommended API | Why |
|----------|----------------|-----|
| **Quick exploration** | Plotly Express | One-line plots, automatic styling |
| **Standard plots** | Plotly Express | Less code, built-in best practices |
| **Custom layouts** | Graph Objects | Full control over every element |
| **Multiple traces** | Graph Objects | Easier to manage complex plots |
| **Animations** | Plotly Express | Built-in animation support |
| **Dashboards** | Both | Plotly Express for speed, Graph Objects for customization |

### Most Common Functions

```python
# Plotly Express (covers most use cases)
px.scatter()        # Scatter plots
px.line()          # Line plots  
px.bar()           # Bar charts
px.histogram()     # Histograms
px.box()           # Box plots
px.imshow()        # Heatmaps (there is no px.heatmap)
px.choropleth()    # Maps

# Graph Objects (when you need control)
go.Scatter()       # Custom scatter
go.Bar()          # Custom bars
go.Heatmap()      # Custom heatmaps
go.Figure()       # Create empty figure
make_subplots()   # Multiple plots
```

### Key Advantages Over Other Libraries
- **Interactive by default** - No extra code needed
- **Web-ready** - Easy to embed in websites/dashboards  
- **3D plots** - Built-in 3D visualization
- **Animation** - Timeline-based animations
- **Geographic** - Built-in map projections
- **Professional** - Publication-quality output

### Remember
- Use `fig.show()` to display plots
- Plotly Express for speed, Graph Objects for control
- Always consider your audience - interactivity is powerful but not always needed
- Use WebGL (`Scattergl`) for large datasets (>10k points)
