# Streamlit Tutorial: Building Interactive Web Applications

Welcome to this tutorial on building web applications with Streamlit, a Python library for creating interactive web apps with very little code.

## Table of Contents
1. [Introduction to Streamlit](#introduction)
2. [Installation and Setup](#installation)
3. [Basic Streamlit Components](#basic-components)
4. [Data Visualization](#data-visualization)
5. [Interactive Elements](#interactive-elements)
6. [Working with Real Data](#real-data)
7. [Advanced Features](#advanced-features)
8. [Deployment](#deployment)
9. [Best Practices](#best-practices)

## 1. Introduction to Streamlit {#introduction}

Streamlit is an open-source Python library for building and sharing custom web applications, aimed mainly at machine learning and data science work. Here's why it is popular:

- **Simple**: No need to learn HTML/CSS/JavaScript
- **Fast**: Build apps in minutes, not hours
- **Interactive**: Built-in widgets for user interaction
- **Pythonic**: Uses pure Python syntax
- **Free**: Open source and free to use

### When to use Streamlit:
- Creating data dashboards
- Building ML model demos
- Sharing analysis results
- Creating internal tools
- Prototyping ideas quickly

## 2. Installation and Setup {#installation}

First, let's make sure Streamlit is installed:

In [None]:
# Install required packages
!pip install streamlit pandas numpy matplotlib seaborn plotly

Let's also import the libraries we'll need:

In [2]:
import streamlit as st
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go
from datetime import datetime, date
import os
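
Before going further, it can help to confirm which of these packages are actually installed. The following is a minimal sketch using the standard library's `importlib.metadata`; the helper name `installed_versions` is made up for this tutorial.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


required = ["streamlit", "pandas", "numpy", "matplotlib", "seaborn", "plotly"]
for name, ver in installed_versions(required).items():
    print(f"{name}: {ver or 'NOT INSTALLED'}")
```

Any package reported as `NOT INSTALLED` can be added with the `pip install` command shown earlier.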

## 3. Basic Streamlit Components {#basic-components}

Let's create our first Streamlit app! In Streamlit, you write regular Python scripts that use special Streamlit functions to display content.

### 3.1 Creating Your First App

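Here's a simple "Hello World" Streamlit app. Streamlit apps run as standalone scripts through the `streamlit run` command rather than inside a notebook cell, so this tutorial writes each example to a `.py` file that you can launch from a terminal: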

In [3]:
# Create a simple Streamlit app
app_code = '''
import streamlit as st

# Title of the app
st.title("My First Streamlit App! 🎉")

# Header
st.header("Welcome to Streamlit")

# Subheader
st.subheader("This is a subheader")

# Text
st.text("This is some text.")

# Markdown
st.markdown("**This is bold text** and *this is italic text*")

# Write (automatically detects data type)
st.write("Hello, World!")
st.write(1234)
st.write([1, 2, 3, 4])
'''

# Save the app code to a file (UTF-8 so the emoji is written correctly on any platform)
with open('basic_app.py', 'w', encoding='utf-8') as f:
    f.write(app_code)

print("Basic app created! To run it, use: streamlit run basic_app.py")

Basic app created! To run it, use: streamlit run basic_app.py
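
The write-to-file step above repeats throughout this tutorial. One way to make it a little safer is sketched below: a small helper (hypothetical, not part of Streamlit) that parses the code with `ast.parse` first, so a typo raises `SyntaxError` before anything is written, and that writes UTF-8 explicitly so emoji in titles are preserved.

```python
import ast
import tempfile
from pathlib import Path


def write_app(path, code):
    """Syntax-check `code`, then write it to `path` as UTF-8."""
    ast.parse(code)  # raises SyntaxError before the file is touched
    target = Path(path)
    target.write_text(code, encoding="utf-8")
    return target


# Demo in a temporary directory so no real file is overwritten
demo_dir = Path(tempfile.mkdtemp())
demo_app = write_app(
    demo_dir / "hello_app.py",
    'import streamlit as st\n\nst.title("Hello 🎉")\n',
)
print(f"To run it, use: streamlit run {demo_app}")
```

The syntax check only confirms that the script parses; it does not import Streamlit or execute the app.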


### 3.2 Text and Formatting Elements

Streamlit provides various ways to display text and format content:

In [5]:
# Raw string: keeps the LaTeX backslash below intact and avoids an invalid-escape warning
text_app_code = r'''
import streamlit as st

st.title("Text and Formatting Demo")

# Different text elements
st.title("This is a title")
st.header("This is a header")
st.subheader("This is a subheader")
st.text("This is plain text")

# Markdown formatting
st.markdown("""
### This is markdown
- **Bold text**
- *Italic text*
- `Code text`
- [Link to Streamlit](https://streamlit.io)
""")

# Code blocks
st.code("""
def hello():
    print("Hello, Streamlit!")
""", language='python')

# LaTeX
st.latex(r"""e^{i\pi} + 1 = 0""")

# Success, info, warning, error messages
st.success("This is a success message!")
st.info("This is an info message.")
st.warning("This is a warning message.")
st.error("This is an error message.")
'''

with open('text_demo.py', 'w') as f:
    f.write(text_app_code)

print("Text demo app created! To run it, use: streamlit run text_demo.py")

Text demo app created! To run it, use: streamlit run text_demo.py
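
One detail worth knowing when app code lives inside a Python string: backslashes in an ordinary (non-raw) string are processed as escape sequences before the text reaches the file. LaTeX is full of backslashes, so a raw string (`r'''...'''`) is the safer container. A small illustration with plain strings, independent of Streamlit:

```python
# In a normal string, "\t" is a tab, so "\theta" silently loses its backslash
normal = "\theta"
raw = r"\theta"

print(len(normal))  # 5: a tab character followed by "heta"
print(len(raw))     # 6: a backslash followed by "theta"

# Doubling the backslash is the alternative to a raw string
escaped = "\\theta"
print(escaped == raw)  # True
```

Sequences Python does not recognize, such as `\p` in `\pi`, are kept as written but trigger a warning in recent Python versions, which is another reason to prefer the raw form.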




## 4. Data Visualization {#data-visualization}

One of Streamlit's strongest features is its built-in support for data visualization. Let's explore different ways to display data.

### 4.1 Basic Charts

In [6]:
# Let's load our datasets first
penguins_df = pd.read_csv('data/penguins.csv')
chla_df = pd.read_csv('data/chla_subset.csv')

print("Penguins dataset shape:", penguins_df.shape)
print("Penguins columns:", penguins_df.columns.tolist())
print("\nChlorophyll-a dataset shape:", chla_df.shape)
print("Chlorophyll-a columns:", chla_df.columns.tolist())

Penguins dataset shape: (344, 7)
Penguins columns: ['species', 'island', 'bill_length_mm', 'bill_depth_mm', 'flipper_length_mm', 'body_mass_g', 'sex']

Chlorophyll-a dataset shape: (7177, 6)
Chlorophyll-a columns: ['gnis_name', 'comid', 'centroid_longitude', 'centroid_latitude', 'date_acquired', 'predictions']
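
Later examples call `dropna()` before plotting, so it is worth checking how many rows that removes. Here is a sketch with a few made-up rows (the column names match the output above; the values are illustrative and not taken from the real file):

```python
import numpy as np
import pandas as pd

# Made-up rows that mimic some penguins columns; NOT real measurements
toy = pd.DataFrame({
    "species": ["Adelie", "Adelie", "Gentoo", "Chinstrap"],
    "bill_length_mm": [39.1, np.nan, 46.1, 49.0],
    "sex": ["male", None, "female", None],
})

missing_per_column = toy.isna().sum()  # count of missing values per column
complete_rows = toy.dropna()           # keeps only rows with no missing value

print(missing_per_column)
print(f"{len(toy) - len(complete_rows)} of {len(toy)} rows contain a missing value")
```

Applying the same two calls to `penguins_df` shows which columns account for the rows that `dropna()` discards.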


In [7]:
charts_app_code = '''
import streamlit as st
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px

st.title("Data Visualization with Streamlit")

# Load the penguins dataset
penguins_df = pd.read_csv('data/penguins.csv')

st.header("Dataset Preview")
st.write("Here's our penguins dataset:")
st.dataframe(penguins_df.head())

# Remove rows with missing values for visualization
penguins_clean = penguins_df.dropna()

st.header("Built-in Streamlit Charts")

# Line chart
st.subheader("Line Chart")
chart_data = pd.DataFrame(
    np.random.randn(20, 3),
    columns=['a', 'b', 'c']
)
st.line_chart(chart_data)

# Area chart
st.subheader("Area Chart")
st.area_chart(chart_data)

# Bar chart
st.subheader("Bar Chart")
species_counts = penguins_clean['species'].value_counts()
st.bar_chart(species_counts)

# Scatter plot
st.subheader("Scatter Plot")
scatter_data = penguins_clean[['bill_length_mm', 'bill_depth_mm']]
st.scatter_chart(scatter_data, x='bill_length_mm', y='bill_depth_mm')

st.header("Matplotlib Integration")
fig, ax = plt.subplots()
ax.scatter(penguins_clean['bill_length_mm'], penguins_clean['flipper_length_mm'])
ax.set_xlabel('Bill Length (mm)')
ax.set_ylabel('Flipper Length (mm)')
ax.set_title('Bill Length vs Flipper Length')
st.pyplot(fig)

st.header("Plotly Integration")
fig = px.scatter(penguins_clean, 
                x='bill_length_mm', 
                y='bill_depth_mm',
                color='species',
                title='Bill Dimensions by Species')
st.plotly_chart(fig)
'''

with open('charts_demo.py', 'w') as f:
    f.write(charts_app_code)

print("Charts demo app created! To run it, use: streamlit run charts_demo.py")

Charts demo app created! To run it, use: streamlit run charts_demo.py
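
Streamlit reruns the whole script whenever the page reloads or a widget changes, so the `np.random.randn` call in the demo produces a different line chart on every rerun. If stable demo data is preferred, a seeded generator is one option. The helper name `make_chart_data` below is an assumption for illustration:

```python
import numpy as np
import pandas as pd


def make_chart_data(rows=20, seed=42):
    """Return reproducible random data for the line and area chart demos."""
    rng = np.random.default_rng(seed)
    return pd.DataFrame(rng.standard_normal((rows, 3)), columns=["a", "b", "c"])


first = make_chart_data()
second = make_chart_data()
print(first.equals(second))  # True: same seed, same data
```

In the app, `st.line_chart(make_chart_data())` would then draw the same chart on each rerun.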


## 5. Interactive Elements {#interactive-elements}

The real power of Streamlit comes from its interactive widgets. Let's explore the most commonly used ones.

### 5.1 Input Widgets

In [8]:
widgets_app_code = '''
import streamlit as st
import pandas as pd
import numpy as np
from datetime import datetime, date

st.title("Interactive Widgets Demo")

st.header("Input Widgets")

# Text input (the placeholder is a hint, not a pre-filled value)
name = st.text_input("What's your name?", placeholder="Enter your name here")
if name:
    st.write(f"Hello, {name}!")

# Number input
age = st.number_input("How old are you?", min_value=0, max_value=120, value=25)
st.write(f"You are {age} years old.")

# Slider
temperature = st.slider("Select temperature", min_value=-10, max_value=40, value=20)
st.write(f"Temperature: {temperature}°C")

# Select box
option = st.selectbox("Choose your favorite color", 
                     ["Red", "Green", "Blue", "Yellow"])
st.write(f"Your favorite color is {option}")

# Multi-select
options = st.multiselect("Choose multiple options",
                        ["Option 1", "Option 2", "Option 3", "Option 4"],
                        default=["Option 1", "Option 2"])
st.write(f"You selected: {options}")

# Radio buttons
genre = st.radio("What's your favorite movie genre?",
                ["Comedy", "Drama", "Documentary"])
st.write(f"You selected: {genre}")

# Checkbox
agree = st.checkbox("I agree to the terms and conditions")
if agree:
    st.success("Thanks for agreeing!")

# Date input
birthday = st.date_input("When is your birthday?", 
                        value=date(1990, 1, 1))
st.write(f"Your birthday is: {birthday}")

# Time input
selected_time = st.time_input("What time is it?")
st.write(f"Time: {selected_time}")

# File uploader
uploaded_file = st.file_uploader("Choose a CSV file", type="csv")
if uploaded_file is not None:
    df = pd.read_csv(uploaded_file)
    st.write("File uploaded successfully!")
    st.dataframe(df.head())

# Button
if st.button("Click me!"):
    st.balloons()
    st.success("Button clicked!")
'''

with open('widgets_demo.py', 'w', encoding='utf-8') as f:
    f.write(widgets_app_code)

print("Widgets demo app created! To run it, use: streamlit run widgets_demo.py")

Widgets demo app created! To run it, use: streamlit run widgets_demo.py
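
The file uploader hands back a file-like object, which is why it can be passed straight to `pd.read_csv`. In a real app it is usually worth checking the upload before using it. A hedged sketch follows: `validate_upload` and its `required_columns` parameter are hypothetical names, and `io.BytesIO` stands in for the uploaded file so the code runs outside Streamlit.

```python
import io

import pandas as pd


def validate_upload(file_like, required_columns):
    """Read a CSV upload and report any required columns that are absent."""
    df = pd.read_csv(file_like)
    missing = [col for col in required_columns if col not in df.columns]
    return df, missing


# io.BytesIO stands in for the object returned by st.file_uploader
fake_upload = io.BytesIO(b"species,island\nAdelie,Torgersen\n")
df, missing = validate_upload(fake_upload, ["species", "body_mass_g"])
print(missing)  # ['body_mass_g']
```

Inside the app, a non-empty `missing` list could be reported with `st.error` instead of displaying the DataFrame.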


## 6. Working with Real Data {#real-data}

Now let's build two more complete apps on the datasets loaded earlier: the penguins data and the chlorophyll-a data.

### 6.1 Palmer Penguins Explorer

In [9]:
penguins_app_code = '''
import streamlit as st
import pandas as pd
import numpy as np
import plotly.express as px
import seaborn as sns
import matplotlib.pyplot as plt

# Page config
st.set_page_config(page_title="Palmer Penguins Explorer", 
                   page_icon="🐧", 
                   layout="wide")

st.title("🐧 Palmer Penguins Explorer")
st.markdown("Explore the famous Palmer Penguins dataset interactively!")

# Load data
@st.cache_data
def load_data():
    df = pd.read_csv('data/penguins.csv')
    return df

df = load_data()

# Sidebar for filters
st.sidebar.header("Filter Options")

# Species filter
species_options = df['species'].dropna().unique()
selected_species = st.sidebar.multiselect(
    "Select Species", 
    species_options, 
    default=species_options
)

# Island filter
island_options = df['island'].dropna().unique()
selected_islands = st.sidebar.multiselect(
    "Select Islands", 
    island_options, 
    default=island_options
)

# Sex filter
sex_options = df['sex'].dropna().unique()
selected_sex = st.sidebar.multiselect(
    "Select Sex", 
    sex_options, 
    default=sex_options
)

# Filter data (rows with a missing 'sex' value never match isin() and are excluded)
filtered_df = df[
    (df['species'].isin(selected_species)) &
    (df['island'].isin(selected_islands)) &
    (df['sex'].isin(selected_sex))
]

# Main content
col1, col2, col3, col4 = st.columns(4)

with col1:
    st.metric("Total Penguins", len(filtered_df))
with col2:
    avg_bill_length = filtered_df['bill_length_mm'].mean()
    st.metric("Avg Bill Length", f"{avg_bill_length:.1f} mm")
with col3:
    avg_body_mass = filtered_df['body_mass_g'].mean()
    st.metric("Avg Body Mass", f"{avg_body_mass:.0f} g")
with col4:
    species_count = filtered_df['species'].nunique()
    st.metric("Species Count", species_count)

# Visualizations
st.subheader("Data Exploration")

tab1, tab2, tab3 = st.tabs(["Scatter Plots", "Distributions", "Raw Data"])

with tab1:
    col1, col2 = st.columns(2)
    
    with col1:
        # Bill dimensions scatter plot
        fig1 = px.scatter(
            filtered_df.dropna(), 
            x='bill_length_mm', 
            y='bill_depth_mm',
            color='species',
            title='Bill Length vs Bill Depth',
            labels={'bill_length_mm': 'Bill Length (mm)', 
                   'bill_depth_mm': 'Bill Depth (mm)'}
        )
        st.plotly_chart(fig1, use_container_width=True)
    
    with col2:
        # Body mass vs flipper length
        fig2 = px.scatter(
            filtered_df.dropna(), 
            x='flipper_length_mm', 
            y='body_mass_g',
            color='species',
            size='bill_length_mm',
            title='Body Mass vs Flipper Length',
            labels={'flipper_length_mm': 'Flipper Length (mm)', 
                   'body_mass_g': 'Body Mass (g)'}
        )
        st.plotly_chart(fig2, use_container_width=True)

with tab2:
    col1, col2 = st.columns(2)
    
    with col1:
        # Species distribution
        fig3 = px.histogram(
            filtered_df, 
            x='species',
            title='Species Distribution'
        )
        st.plotly_chart(fig3, use_container_width=True)
    
    with col2:
        # Body mass distribution
        fig4 = px.histogram(
            filtered_df.dropna(), 
            x='body_mass_g',
            color='species',
            title='Body Mass Distribution',
            labels={'body_mass_g': 'Body Mass (g)'}
        )
        st.plotly_chart(fig4, use_container_width=True)

with tab3:
    st.subheader("Filtered Dataset")
    st.dataframe(filtered_df, use_container_width=True)
    
    # Download button
    csv = filtered_df.to_csv(index=False)
    st.download_button(
        label="Download filtered data as CSV",
        data=csv,
        file_name='filtered_penguins.csv',
        mime='text/csv'
    )

# Summary statistics
st.subheader("Summary Statistics")
st.dataframe(filtered_df.describe(), use_container_width=True)
'''

with open('penguins_explorer.py', 'w', encoding='utf-8') as f:
    f.write(penguins_app_code)

print("Penguins explorer app created! To run it, use: streamlit run penguins_explorer.py")

Penguins explorer app created! To run it, use: streamlit run penguins_explorer.py
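
One subtlety in the explorer's filter: `isin` never matches a missing value, so penguins with no recorded sex are dropped even when every sex option is selected, and an empty selection leaves the metrics showing `nan`. The sketch below pulls the filtering into plain functions so both cases are explicit. The names `filter_penguins`, `keep_missing_sex`, and `format_mean` are assumptions for this example, and the rows are made up.

```python
import pandas as pd


def filter_penguins(df, species, islands, sexes, keep_missing_sex=True):
    """Apply the sidebar selections; optionally keep rows with unknown sex."""
    sex_mask = df["sex"].isin(sexes)
    if keep_missing_sex:
        sex_mask = sex_mask | df["sex"].isna()
    return df[df["species"].isin(species) & df["island"].isin(islands) & sex_mask]


def format_mean(series, unit, digits=1):
    """Format a column mean, falling back to 'n/a' when there is no data."""
    mean = series.mean()
    return "n/a" if pd.isna(mean) else f"{mean:.{digits}f} {unit}"


# Made-up rows for illustration only
toy = pd.DataFrame({
    "species": ["Adelie", "Adelie", "Gentoo"],
    "island": ["Torgersen", "Dream", "Biscoe"],
    "sex": ["male", None, "female"],
    "bill_length_mm": [39.0, 41.0, 47.0],
})

kept = filter_penguins(toy, ["Adelie"], ["Torgersen", "Dream"], ["male", "female"])
print(len(kept))                                           # 2
print(format_mean(kept["bill_length_mm"], "mm"))           # 40.0 mm
print(format_mean(toy.iloc[0:0]["bill_length_mm"], "mm"))  # n/a
```

Keeping the logic in functions like these also makes it possible to test the filtering without launching the app.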


### 6.2 Chlorophyll-a Monitoring Dashboard

In [10]:
chla_app_code = '''
import streamlit as st
import pandas as pd
import numpy as np
import plotly.express as px
import plotly.graph_objects as go
from datetime import datetime

# Page config
st.set_page_config(page_title="Chlorophyll-a Monitor", 
                   page_icon="🌊", 
                   layout="wide")

st.title("🌊 Chlorophyll-a Monitoring Dashboard")
st.markdown("Monitor chlorophyll-a levels in New York's inland waters")

# Load data
@st.cache_data
def load_chla_data():
    df = pd.read_csv('data/chla_subset.csv')
    df['date_acquired'] = pd.to_datetime(df['date_acquired'])
    return df

df = load_chla_data()

# Sidebar filters
st.sidebar.header("Filter Options")

# Water body filter
water_bodies = df['gnis_name'].unique()
selected_water_bodies = st.sidebar.multiselect(
    "Select Water Bodies",
    water_bodies,
    default=water_bodies[:3]  # Select first 3 by default
)

# Date range filter
min_date = df['date_acquired'].min().date()
max_date = df['date_acquired'].max().date()

date_range = st.sidebar.date_input(
    "Select Date Range",
    value=(min_date, max_date),
    min_value=min_date,
    max_value=max_date
)

# Chlorophyll-a threshold
chla_threshold = st.sidebar.slider(
    "Chlorophyll-a Alert Threshold",
    min_value=0.0,
    max_value=50.0,
    value=20.0,
    step=0.5
)

# Filter data
if len(date_range) == 2:
    start_date, end_date = date_range
    filtered_df = df[
        (df['gnis_name'].isin(selected_water_bodies)) &
        (df['date_acquired'].dt.date >= start_date) &
        (df['date_acquired'].dt.date <= end_date)
    ]
else:
    filtered_df = df[df['gnis_name'].isin(selected_water_bodies)]

# Key metrics
col1, col2, col3, col4 = st.columns(4)

with col1:
    st.metric("Total Measurements", len(filtered_df))

with col2:
    avg_chla = filtered_df['predictions'].mean()
    st.metric("Avg Chlorophyll-a", f"{avg_chla:.2f} μg/L")

with col3:
    max_chla = filtered_df['predictions'].max()
    st.metric("Max Chlorophyll-a", f"{max_chla:.2f} μg/L")

with col4:
    alerts = len(filtered_df[filtered_df['predictions'] > chla_threshold])
    st.metric("High Readings", alerts)

# Visualizations
st.subheader("Chlorophyll-a Analysis")

tab1, tab2, tab3 = st.tabs(["Time Series", "Spatial View", "Statistics"])

with tab1:
    # Time series plot
    fig1 = px.line(
        filtered_df,
        x='date_acquired',
        y='predictions',
        color='gnis_name',
        title='Chlorophyll-a Levels Over Time',
        labels={'date_acquired': 'Date', 
               'predictions': 'Chlorophyll-a (μg/L)',
               'gnis_name': 'Water Body'}
    )
    
    # Add threshold line
    fig1.add_hline(y=chla_threshold, 
                   line_dash="dash", 
                   line_color="red",
                   annotation_text=f"Alert Threshold: {chla_threshold} μg/L")
    
    st.plotly_chart(fig1, use_container_width=True)

with tab2:
    # Map view
    fig2 = px.scatter_mapbox(
        filtered_df,
        lat='centroid_latitude',
        lon='centroid_longitude',
        color='predictions',
        size='predictions',
        hover_name='gnis_name',
        hover_data=['date_acquired', 'predictions'],
        color_continuous_scale='Viridis',
        title='Chlorophyll-a Levels by Location',
        mapbox_style='open-street-map',
        zoom=6
    )
    
    fig2.update_layout(height=500)
    st.plotly_chart(fig2, use_container_width=True)

with tab3:
    col1, col2 = st.columns(2)
    
    with col1:
        # Distribution histogram
        fig3 = px.histogram(
            filtered_df,
            x='predictions',
            nbins=30,
            title='Chlorophyll-a Distribution',
            labels={'predictions': 'Chlorophyll-a (μg/L)'}
        )
        st.plotly_chart(fig3, use_container_width=True)
    
    with col2:
        # Box plot by water body
        fig4 = px.box(
            filtered_df,
            x='gnis_name',
            y='predictions',
            title='Chlorophyll-a by Water Body',
            labels={'gnis_name': 'Water Body', 
                   'predictions': 'Chlorophyll-a (μg/L)'}
        )
        fig4.update_xaxes(tickangle=45)
        st.plotly_chart(fig4, use_container_width=True)

# Summary table
st.subheader("Summary by Water Body")
summary = filtered_df.groupby('gnis_name').agg({
    'predictions': ['count', 'mean', 'min', 'max', 'std'],
    'date_acquired': ['min', 'max']
}).round(2)

summary.columns = ['Measurements', 'Mean', 'Min', 'Max', 'Std Dev', 'First Date', 'Last Date']
st.dataframe(summary, use_container_width=True)

# Alerts section
st.subheader("High Chlorophyll-a Alerts")
alerts_df = filtered_df[filtered_df['predictions'] > chla_threshold].sort_values('predictions', ascending=False)

if len(alerts_df) > 0:
    st.warning(f"Found {len(alerts_df)} readings above threshold ({chla_threshold} μg/L)")
    st.dataframe(alerts_df[['gnis_name', 'date_acquired', 'predictions']].head(10), 
                use_container_width=True)
else:
    st.success("No readings above the threshold - water quality looks good!")
'''

with open('chla_dashboard.py', 'w', encoding='utf-8') as f:
    f.write(chla_app_code)

print("Chlorophyll-a dashboard created! To run it, use: streamlit run chla_dashboard.py")

Chlorophyll-a dashboard created! To run it, use: streamlit run chla_dashboard.py
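
The `len(date_range) == 2` check in the dashboard exists because a range-style `st.date_input` can return fewer than two dates while the user is still picking the end of the range. One way to keep that logic testable is a small pure function. `normalize_date_range` is a hypothetical helper, not a Streamlit API, and the dates below are arbitrary:

```python
from datetime import date


def normalize_date_range(selection, default_start, default_end):
    """Return a (start, end) pair from a possibly incomplete date selection."""
    # A single date (not a tuple) is treated as a one-day range
    if isinstance(selection, date):
        return selection, selection
    picked = tuple(selection)
    if len(picked) == 2:
        start, end = picked
        return (start, end) if start <= end else (end, start)
    if len(picked) == 1:
        # Only the start has been chosen so far: run to the default end
        return picked[0], default_end
    return default_start, default_end


lo, hi = date(2023, 1, 1), date(2023, 12, 31)
print(normalize_date_range((date(2023, 3, 1),), lo, hi))
print(normalize_date_range((), lo, hi))
```

Running from the chosen start to the default end is a design choice that differs from the dashboard, which skips the date filter entirely until both ends are selected.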


## 7. Advanced Features {#advanced-features}

Let's explore some advanced Streamlit features that can make your apps more professional and user-friendly.

### 7.1 Session State and Caching

In [11]:
advanced_app_code = '''
import streamlit as st
import pandas as pd
import time
import numpy as np

st.title("Advanced Streamlit Features")

# Session State Example
st.header("Session State Example")
st.write("Session state allows you to store data across reruns")

# Initialize session state
if 'counter' not in st.session_state:
    st.session_state.counter = 0

if 'user_data' not in st.session_state:
    st.session_state.user_data = []

# Counter example
col1, col2, col3 = st.columns(3)

with col1:
    if st.button("Increment"):
        st.session_state.counter += 1

with col2:
    if st.button("Decrement"):
        st.session_state.counter -= 1

with col3:
    if st.button("Reset"):
        st.session_state.counter = 0

st.write(f"Counter value: {st.session_state.counter}")

# User data collection
st.subheader("Data Collection with Session State")

with st.form("data_form"):
    name = st.text_input("Name")
    age = st.number_input("Age", min_value=0, max_value=120)
    city = st.text_input("City")
    
    submitted = st.form_submit_button("Add Entry")
    
    if submitted and name:
        st.session_state.user_data.append({
            'Name': name,
            'Age': age,
            'City': city
        })
        st.success(f"Added {name} to the list!")

if st.session_state.user_data:
    st.write("Collected Data:")
    df = pd.DataFrame(st.session_state.user_data)
    st.dataframe(df)
    
    if st.button("Clear All Data"):
        st.session_state.user_data = []
        st.rerun()

# Caching Example
st.header("Caching Example")
st.write("Caching helps improve performance by storing expensive computations")

@st.cache_data
def expensive_computation(n):
    """Simulate an expensive computation"""
    time.sleep(2)  # Simulate processing time
    return np.random.randn(n, 3).cumsum(axis=0)

n_points = st.slider("Number of data points", 100, 10000, 1000)

with st.spinner("Computing... (this will be cached after first run)"):
    data = expensive_computation(n_points)

st.line_chart(data)
st.info("Try changing the slider - subsequent runs with the same value will be instant!")

# Progress bars
st.header("Progress Indicators")

if st.button("Start Long Process"):
    progress_bar = st.progress(0)
    status_text = st.empty()
    
    for i in range(100):
        progress_bar.progress(i + 1)
        status_text.text(f"Progress: {i+1}%")
        time.sleep(0.01)
    
    status_text.text("Done!")
    st.success("Process completed!")

# Columns and containers
st.header("Layout Examples")

# Columns
col1, col2, col3 = st.columns([2, 1, 1])

with col1:
    st.write("This is a wide column")
    st.bar_chart(np.random.randn(10))

with col2:
    st.write("Column 2")
    st.metric("Metric 1", "123")

with col3:
    st.write("Column 3")
    st.metric("Metric 2", "456")

# Expander
with st.expander("Click to expand"):
    st.write("This content is hidden by default")
    st.image("https://via.placeholder.com/300x200", caption="Placeholder image")

# Container
container = st.container()
container.write("This is inside a container")
st.write("This is outside the container")
container.write("This will appear above the previous line")
'''

with open('advanced_features.py', 'w') as f:
    f.write(advanced_app_code)

print("Advanced features app created! To run it, use: streamlit run advanced_features.py")

Advanced features app created! To run it, use: streamlit run advanced_features.py
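
To build intuition for what caching buys you, here is a rough standard-library analogy using `functools.lru_cache`. It is not how `st.cache_data` is implemented (Streamlit hashes arguments itself and returns a fresh copy of the cached value on each call); it only illustrates the core idea that a repeated call with the same argument skips the function body.

```python
from functools import lru_cache

call_log = []  # records each time the function body actually runs


@lru_cache(maxsize=None)
def slow_square(n):
    """Stand-in for an expensive computation."""
    call_log.append(n)
    return n * n


slow_square(4)   # body runs
slow_square(4)   # served from the cache, body skipped
slow_square(5)   # new argument, body runs again

print(call_log)  # [4, 5]
```

The slider example in the app behaves the same way: moving back to a value that was already computed avoids the two-second wait.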


### 7.2 Multi-page Applications

In [12]:
# Create a multi-page app structure
multipage_main = '''
import streamlit as st
import pandas as pd

st.set_page_config(page_title="Multi-page App", page_icon="📊", layout="wide")

st.title("📊 Multi-page Streamlit App")
st.markdown("""
Welcome to our multi-page application! Use the sidebar to navigate between pages.

This app demonstrates:
- Multi-page navigation
- Data exploration
- Interactive visualizations
- Machine learning demos
""")

# Sample data preview
st.header("Quick Data Preview")

try:
    penguins_df = pd.read_csv('data/penguins.csv')
    st.dataframe(penguins_df.head())
    st.info("💡 Navigate to the 'Data Explorer' page to interact with this data!")
except FileNotFoundError:
    st.warning("Data files not found. Please ensure the data folder is in the correct location.")

st.markdown("""
## App Features:
- **Home**: Overview and navigation
- **Data Explorer**: Interactive data analysis
- **Visualizations**: Advanced plotting
- **ML Demo**: Simple machine learning examples
""")
'''

# Create pages directory
import os
os.makedirs('pages', exist_ok=True)

with open('multipage_app.py', 'w', encoding='utf-8') as f:
    f.write(multipage_main)

# Data Explorer page
data_explorer_page = '''
import streamlit as st
import pandas as pd
import plotly.express as px

st.title("📈 Data Explorer")

# Load data
@st.cache_data
def load_data():
    try:
        return pd.read_csv('data/penguins.csv')
    except FileNotFoundError:
        return pd.DataFrame()  # Return empty DataFrame if file not found

df = load_data()

if df.empty:
    st.error("Could not load data. Please check if the data file exists.")
else:
    st.success(f"Loaded {len(df)} records")
    
    # Data overview
    col1, col2, col3 = st.columns(3)
    
    with col1:
        st.metric("Total Records", len(df))
    with col2:
        st.metric("Features", len(df.columns))
    with col3:
        st.metric("Species", df['species'].nunique() if 'species' in df.columns else 0)
    
    # Interactive filters
    st.subheader("Filter Data")
    
    if 'species' in df.columns:
        species = st.multiselect(
            "Select Species",
            df['species'].unique(),
            default=df['species'].unique()
        )
        df_filtered = df[df['species'].isin(species)]
    else:
        df_filtered = df
    
    # Display filtered data
    st.subheader("Filtered Dataset")
    st.dataframe(df_filtered)
    
    # Quick visualization
    if len(df_filtered) > 0 and 'species' in df_filtered.columns:
        st.subheader("Quick Visualization")
        
        numeric_columns = df_filtered.select_dtypes(include=['float64', 'int64']).columns
        
        if len(numeric_columns) >= 2:
            col1, col2 = st.columns(2)
            
            with col1:
                x_axis = st.selectbox("X-axis", numeric_columns)
            with col2:
                y_axis = st.selectbox("Y-axis", numeric_columns, index=1)
            
            if x_axis and y_axis:
                fig = px.scatter(
                    df_filtered.dropna(),
                    x=x_axis,
                    y=y_axis,
                    color='species' if 'species' in df_filtered.columns else None,
                    title=f"{x_axis} vs {y_axis}"
                )
                st.plotly_chart(fig, use_container_width=True)
'''

with open('pages/1_📈_Data_Explorer.py', 'w', encoding='utf-8') as f:
    f.write(data_explorer_page)

# Visualizations page
viz_page = '''
import streamlit as st
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
import numpy as np

st.title("📊 Advanced Visualizations")

# Load data
@st.cache_data
def load_data():
    try:
        return pd.read_csv('data/penguins.csv')
    except FileNotFoundError:
        return pd.DataFrame()

df = load_data()

if not df.empty:
    df_clean = df.dropna()
    
    # Chart type selection
    chart_type = st.selectbox(
        "Select Chart Type",
        ["Scatter Plot", "Box Plot", "Violin Plot", "Correlation Heatmap", "3D Scatter"]
    )
    
    if chart_type == "Scatter Plot":
        st.subheader("Interactive Scatter Plot")
        
        numeric_cols = df_clean.select_dtypes(include=[np.number]).columns
        
        col1, col2, col3 = st.columns(3)
        with col1:
            x_col = st.selectbox("X-axis", numeric_cols)
        with col2:
            y_col = st.selectbox("Y-axis", numeric_cols, index=1)
        with col3:
            size_col = st.selectbox("Size by", ["None"] + list(numeric_cols))
        
        fig = px.scatter(
            df_clean,
            x=x_col,
            y=y_col,
            color="species" if "species" in df_clean.columns else None,
            size=size_col if size_col != "None" else None,
            title=f"{x_col} vs {y_col}"
        )
        st.plotly_chart(fig, use_container_width=True)
    
    elif chart_type == "Box Plot":
        st.subheader("Box Plot Analysis")
        
        numeric_cols = df_clean.select_dtypes(include=[np.number]).columns
        selected_col = st.selectbox("Select Variable", numeric_cols)
        
        fig = px.box(
            df_clean,
            x="species" if "species" in df_clean.columns else None,
            y=selected_col,
            title=f"Distribution of {selected_col} by Species"
        )
        st.plotly_chart(fig, use_container_width=True)
    
    elif chart_type == "Violin Plot":
        st.subheader("Violin Plot Analysis")
        
        numeric_cols = df_clean.select_dtypes(include=[np.number]).columns
        selected_col = st.selectbox("Select Variable", numeric_cols)
        
        fig = px.violin(
            df_clean,
            x="species" if "species" in df_clean.columns else None,
            y=selected_col,
            title=f"Distribution of {selected_col} by Species"
        )
        st.plotly_chart(fig, use_container_width=True)
    
    elif chart_type == "Correlation Heatmap":
        st.subheader("Correlation Analysis")
        
        numeric_df = df_clean.select_dtypes(include=[np.number])
        corr_matrix = numeric_df.corr()
        
        fig = px.imshow(
            corr_matrix,
            text_auto=True,
            aspect="auto",
            title="Correlation Matrix"
        )
        st.plotly_chart(fig, use_container_width=True)
    
    elif chart_type == "3D Scatter":
        st.subheader("3D Scatter Plot")
        
        numeric_cols = df_clean.select_dtypes(include=[np.number]).columns
        
        if len(numeric_cols) >= 3:
            col1, col2, col3 = st.columns(3)
            
            with col1:
                x_col = st.selectbox("X-axis", numeric_cols, key="3d_x")
            with col2:
                y_col = st.selectbox("Y-axis", numeric_cols, index=1, key="3d_y")
            with col3:
                z_col = st.selectbox("Z-axis", numeric_cols, index=2, key="3d_z")
            
            fig = px.scatter_3d(
                df_clean,
                x=x_col,
                y=y_col,
                z=z_col,
                color="species" if "species" in df_clean.columns else None,
                title=f"3D Plot: {x_col} vs {y_col} vs {z_col}"
            )
            st.plotly_chart(fig, use_container_width=True)
        else:
            st.warning("Need at least 3 numeric columns for 3D plot")

else:
    st.error("Could not load data for visualization")
    
    # Show sample visualization with synthetic data
    st.info("Showing sample visualization with synthetic data")
    
    np.random.seed(42)
    sample_data = pd.DataFrame({
        'x': np.random.randn(100),
        'y': np.random.randn(100),
        'category': np.random.choice(['A', 'B', 'C'], 100)
    })
    
    fig = px.scatter(sample_data, x='x', y='y', color='category', title="Sample Scatter Plot")
    st.plotly_chart(fig, use_container_width=True)
'''

with open('pages/2_📊_Visualizations.py', 'w', encoding='utf-8') as f:
    f.write(viz_page)

print("Multi-page app created! To run it, use: streamlit run multipage_app.py")
print("\nFiles created:")
print("- multipage_app.py (main app)")
print("- pages/1_📈_Data_Explorer.py")
print("- pages/2_📊_Visualizations.py")

Multi-page app created! To run it, use: streamlit run multipage_app.py

Files created:
- multipage_app.py (main app)
- pages/1_📈_Data_Explorer.py
- pages/2_📊_Visualizations.py
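
Streamlit discovers the extra pages from the `pages/` directory next to the main script, and the filenames above follow a number, separator, label pattern. The sketch below lists page files in sorted order and derives a rough display label from each name. `page_label` and `list_pages` are hypothetical helpers that only approximate the naming convention; Streamlit's own sorting and labeling rules may differ in edge cases.

```python
import re
import tempfile
from pathlib import Path


def page_label(filename):
    """Approximate a sidebar label: drop the numeric prefix, '_' becomes ' '."""
    stem = Path(filename).stem
    stem = re.sub(r"^\d+[_\- ]*", "", stem)
    return stem.replace("_", " ").strip()


def list_pages(app_dir):
    """Return page filenames under app_dir/pages, sorted by name."""
    return sorted(p.name for p in (Path(app_dir) / "pages").glob("*.py"))


# Demo in a temporary directory with empty placeholder files
demo_root = Path(tempfile.mkdtemp())
(demo_root / "pages").mkdir()
for name in ["2_Visualizations.py", "1_Data_Explorer.py"]:
    (demo_root / "pages" / name).touch()

for name in list_pages(demo_root):
    print(name, "->", page_label(name))
```

Note that this uses plain name sorting, which matches numeric order only while the prefixes have the same number of digits.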


## 8. Deployment {#deployment}

Once you've built your Streamlit app, you'll want to share it with others. Here are the main deployment options:

### 8.1 Streamlit Community Cloud (Free)

The easiest way to deploy is using Streamlit Community Cloud:

1. **Push your code to GitHub**
2. **Go to [share.streamlit.io](https://share.streamlit.io)**
3. **Connect your GitHub repository**
4. **Select your main Python file**
5. **Deploy!**

### 8.2 Local Network Deployment

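To make your app reachable from other machines on your local network, bind it to `0.0.0.0`. The cell below writes a short guide covering this and other run options, a `requirements.txt`, and a Docker example: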

In [13]:
deployment_instructions = '''
# Deployment Guide for Streamlit Apps

## 1. Local Development
streamlit run your_app.py

## 2. Local Network Access
streamlit run your_app.py --server.address 0.0.0.0

## 3. Custom Port
streamlit run your_app.py --server.port 8080

## 4. Production Settings
streamlit run your_app.py --server.headless true --server.port $PORT

## 5. Requirements File
Create a requirements.txt file with your dependencies:
streamlit>=1.28.0
pandas>=1.5.0
plotly>=5.15.0
numpy>=1.24.0
matplotlib>=3.7.0
seaborn>=0.12.0

## 6. Docker Deployment
# Dockerfile example:
FROM python:3.9-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt

COPY . .

EXPOSE 8501

CMD ["streamlit", "run", "your_app.py", "--server.address", "0.0.0.0"]
'''

with open('deployment_guide.md', 'w') as f:
    f.write(deployment_instructions)

print("Deployment guide created: deployment_guide.md")

Deployment guide created: deployment_guide.md
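
The command-line flags in the guide can also be assembled programmatically, for example in a launcher script. This is a sketch under the assumption that only the three flags shown in the guide (`--server.address`, `--server.port`, `--server.headless`) are needed; `build_run_command` is a made-up helper name.

```python
import shlex


def build_run_command(script, port=8501, address=None, headless=False):
    """Build the argument list for `streamlit run` from a few options."""
    cmd = ["streamlit", "run", script, "--server.port", str(port)]
    if address is not None:
        cmd += ["--server.address", address]
    if headless:
        cmd += ["--server.headless", "true"]
    return cmd


cmd = build_run_command("your_app.py", port=8080, address="0.0.0.0", headless=True)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd)`, which avoids shell quoting problems with unusual script paths.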


## 9. Best Practices {#best-practices}

Here are some best practices for building Streamlit applications:

### 9.1 Performance Optimization

In [14]:
best_practices_code = '''
import streamlit as st
import pandas as pd
import numpy as np

st.title("Streamlit Best Practices")

st.header("1. Use Caching for Expensive Operations")

# ✅ Good: Cache data loading
@st.cache_data
def load_data(file_path):
    """Load data with caching to avoid repeated file reads"""
    return pd.read_csv(file_path)

# ✅ Good: Cache expensive computations
@st.cache_data
def compute_statistics(df):
    """Compute statistics with caching"""
    return df.describe()

st.code("""
@st.cache_data
def load_data(file_path):
    return pd.read_csv(file_path)

# Runs once per distinct file_path; repeat calls return the cached result
df = load_data('data.csv')
""", language='python')

st.header("2. Optimize Widget Usage")

st.subheader("Use Forms for Multiple Inputs")
st.code("""
# ✅ Good: Group related inputs in a form
with st.form("my_form"):
    name = st.text_input("Name")
    age = st.number_input("Age", min_value=0, step=1)
    submitted = st.form_submit_button("Submit")

    if submitted:
        # process_data is a placeholder for your own handler
        process_data(name, age)
""", language='python')

st.header("3. State Management")

st.code("""
# ✅ Good: Initialize session state properly
if 'counter' not in st.session_state:
    st.session_state.counter = 0

# ✅ Good: Use session state for user data
if 'user_selections' not in st.session_state:
    st.session_state.user_selections = []
""", language='python')

st.header("4. Layout and Design")

st.subheader("Use Columns for Better Layout")
col1, col2, col3 = st.columns(3)

with col1:
    st.metric("Metric 1", "123")
with col2:
    st.metric("Metric 2", "456")
with col3:
    st.metric("Metric 3", "789")

st.code("""
# ✅ Good: Use columns for metrics
col1, col2, col3 = st.columns(3)
with col1:
    st.metric("Sales", "$123K")
with col2:
    st.metric("Users", "456")
with col3:
    st.metric("Growth", "12%")
""", language='python')

st.header("5. Error Handling")

st.code("""
# ✅ Good: Handle errors gracefully
try:
    df = pd.read_csv('data.csv')
    st.success("Data loaded successfully!")
except FileNotFoundError:
    st.error("Data file not found. Please upload a file.")
except pd.errors.EmptyDataError:
    st.warning("The uploaded file is empty.")
except Exception as e:
    st.error(f"An error occurred: {e}")
""", language='python')

st.header("6. Code Organization")

st.markdown("""
**✅ Good practices:**
- Split large apps into multiple pages
- Use functions for repeated code
- Keep business logic separate from UI code
- Use meaningful variable names
- Add docstrings to functions

**❌ Avoid:**
- Putting all code in one large file
- Mixing data processing with UI code
- Not handling edge cases
- Ignoring performance implications
""")

st.header("7. Configuration")

st.code("""
# ✅ Good: Call set_page_config before any other Streamlit command
st.set_page_config(
    page_title="My App",
    page_icon="📊",
    layout="wide",
    initial_sidebar_state="expanded"
)
""", language='python')
'''

with open('best_practices.py', 'w', encoding='utf-8') as f:
    f.write(best_practices_code)

print("Best practices app created! To run it, use: streamlit run best_practices.py")

Best practices app created! To run it, use: streamlit run best_practices.py
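
The advice to keep business logic separate from UI code is easier to see with a small example. This is a sketch under assumptions, not a prescribed structure: `summarize_ages` and `render_summary` are hypothetical names, and the UI function receives the Streamlit module as a parameter so the logic can be tested without a running app.

```python
from statistics import mean


def summarize_ages(ages):
    """Business logic only: no Streamlit calls, so it can be unit-tested directly."""
    if not ages:
        return {"count": 0, "average": None, "oldest": None}
    return {
        "count": len(ages),
        "average": round(mean(ages), 1),
        "oldest": max(ages),
    }


def render_summary(st, ages):
    """UI layer: computes nothing itself, it only displays the summary."""
    summary = summarize_ages(ages)
    if summary["count"] == 0:
        st.warning("No ages entered yet.")
        return
    col1, col2, col3 = st.columns(3)
    col1.metric("Count", summary["count"])
    col2.metric("Average", summary["average"])
    col3.metric("Oldest", summary["oldest"])


# In an app script you would write:
#     import streamlit as st
#     render_summary(st, [31, 45, 27])
```

Because `summarize_ages` never touches Streamlit, it can be covered by ordinary unit tests, while `render_summary` stays thin enough to check by eye.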


## Summary and Next Steps

Congratulations! You've completed the Streamlit tutorial. Here's what we covered:

### What You Learned:
1. **Streamlit Basics**: Text, layouts, and basic components
2. **Data Visualization**: Built-in charts and integration with Matplotlib/Plotly
3. **Interactive Elements**: Widgets for user input and interaction
4. **Real Data Applications**: Working with actual datasets
5. **Advanced Features**: Session state, caching, and multi-page apps
6. **Deployment**: How to share your apps with the world
7. **Best Practices**: Performance optimization and code organization

### Files Created:
1. `basic_app.py` - Simple "Hello World" app
2. `text_demo.py` - Text formatting examples
3. `charts_demo.py` - Data visualization examples
4. `widgets_demo.py` - Interactive widgets
5. `penguins_explorer.py` - Comprehensive penguins data app
6. `chla_dashboard.py` - Chlorophyll-a monitoring dashboard
7. `advanced_features.py` - Session state and caching examples
8. `multipage_app.py` - Multi-page application
9. `best_practices.py` - Performance and coding best practices

### Next Steps:
1. **Practice**: Try building your own app with your data
2. **Explore**: Check out the [Streamlit Gallery](https://streamlit.io/gallery) for inspiration
3. **Learn More**: Read the [Streamlit Documentation](https://docs.streamlit.io)
4. **Deploy**: Share your first app using Streamlit Community Cloud
5. **Connect**: Join the [Streamlit Community](https://discuss.streamlit.io)

### Resources:
- [Streamlit Documentation](https://docs.streamlit.io)
- [Streamlit Gallery](https://streamlit.io/gallery)
- [Streamlit Community](https://discuss.streamlit.io)
- [Streamlit Blog](https://blog.streamlit.io)
- [GitHub Repository](https://github.com/streamlit/streamlit)

Happy coding with Streamlit! 🎉

## Running the Examples

To run any of the example apps created in this tutorial:

1. **Open your terminal/command prompt**
2. **Navigate to the directory containing the Python files**
3. **Run the command**: `streamlit run filename.py`

For example:
```bash
streamlit run basic_app.py
streamlit run penguins_explorer.py
streamlit run multipage_app.py
```

The app will open in your default web browser, typically at `http://localhost:8501`.

### Troubleshooting:
- Make sure all required packages are installed: `pip install streamlit pandas plotly matplotlib seaborn`
- Ensure your data files are in the correct `data/` directory
- Check that you're running the command from the correct directory
- If you get import errors, try installing missing packages with pip
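
To find out which packages are missing before launching an app, a quick check with the standard library can help. This is a minimal sketch: `missing_packages` is a hypothetical helper, and the list mirrors the packages named in the troubleshooting tips above.

```python
from importlib.util import find_spec

# Top-level import names of the packages used in this tutorial.
REQUIRED = ["streamlit", "pandas", "plotly", "matplotlib", "seaborn"]


def missing_packages(names):
    """Return the top-level packages that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages. Install with: pip install " + " ".join(missing))
    else:
        print("All required packages are importable.")
```

Run it with the same Python interpreter that you use for `streamlit run`, since packages installed in a different environment will not be visible.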