<div style="font-family: 'Comic Sans MS', sans-serif; font-size: 18px; color: #333; background-color: #d4d4d4; padding: 10px; border-radius: 5px; width: 900px;">
    <h1>Welcome to the Plotly Express tutorial ✨</h1>
    <p>Plotly Express is a high-level Python library, shipped as part of Plotly, for creating interactive visualizations with very little code.</p>
    <p>Let's get started!</p>
</div>


For this tutorial we will use the Space Missions Dataset from Kaggle.  
https://www.kaggle.com/datasets/sameerk2004/space-missions-dataset

<div style="font-family: 'Comic Sans MS', sans-serif; font-size: 18px; color: #333; background-color: #d4d4d4; padding: 10px; border-radius: 5px; width: 900px;">
    <h1>Why Plotly Express?</h1>
    <p>Because it is simple and easy to use: most charts take a single function call on a DataFrame.</p>
    <h2>When to use Plotly Express?</h2>
    <ul>
        <li>When we want quick and easy creation of standard charts.</li>
        <li>During exploratory data analysis (EDA) to get a quick overview of the data.</li>
        <li>When we want to create visualizations with minimal code.</li>
        <li>When we don't need to create custom or complex charts.</li>
    </ul>

</div>


In [1]:
# Importing libraries
import pandas as pd # for data manipulation
import plotly.express as px # for data visualization

<div style="font-family: 'Comic Sans MS', sans-serif; font-size: 18px; color: #333; background-color: #d4d4d4; padding: 10px; border-radius: 5px; width: 900px;">
    <h1>Data preparation 🛠</h1>
    <p>In this section we will prepare the dataset for visualization.</p>
    <p style="color: red;">Note that the data is synthetic and not real.</p>
</div>


In [2]:
df = pd.read_csv('Data/space_missions_dataset.csv') # loading the dataset using pandas

# Displaying a sample of the dataset
df.sample(5)

Unnamed: 0,Mission ID,Mission Name,Launch Date,Target Type,Target Name,Mission Type,Distance from Earth (light-years),Mission Duration (years),Mission Cost (billion USD),Scientific Yield (points),Crew Size,Mission Success (%),Fuel Consumption (tons),Payload Weight (tons),Launch Vehicle
229,MSN-0230,Mission-230,2029-05-23,Star,Titan,Mining,1.56,2.7,373.82,69.2,2,100.0,180.58,74.48,Ariane 6
481,MSN-0482,Mission-482,2034-03-22,Star,Titan,Colonization,5.68,6.0,224.89,10.1,40,91.4,525.78,35.53,Starship
139,MSN-0140,Mission-140,2027-09-01,Exoplanet,Ceres,Research,14.43,9.0,426.15,93.6,69,100.0,1431.78,84.37,Ariane 6
472,MSN-0473,Mission-473,2034-01-18,Moon,Titan,Mining,1.93,3.3,390.72,10.0,72,100.0,175.46,74.49,Starship
185,MSN-0186,Mission-186,2028-07-19,Star,Betelgeuse,Mining,29.05,16.4,430.94,19.2,76,100.0,2905.11,83.23,Starship


In [3]:
df.info() # showing each column's dtype and non-null count to make sure there are no missing values

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 500 entries, 0 to 499
Data columns (total 15 columns):
 #   Column                             Non-Null Count  Dtype  
---  ------                             --------------  -----  
 0   Mission ID                         500 non-null    object 
 1   Mission Name                       500 non-null    object 
 2   Launch Date                        500 non-null    object 
 3   Target Type                        500 non-null    object 
 4   Target Name                        500 non-null    object 
 5   Mission Type                       500 non-null    object 
 6   Distance from Earth (light-years)  500 non-null    float64
 7   Mission Duration (years)           500 non-null    float64
 8   Mission Cost (billion USD)         500 non-null    float64
 9   Scientific Yield (points)          500 non-null    float64
 10  Crew Size                          500 non-null    int64  
 11  Mission Success (%)                500 non-null    float64
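If you prefer an explicit number over reading the non-null counts, missing values can also be counted directly. This is a minimal sketch on a made-up three-row frame that deliberately contains one missing value; on the real dataset you would call the same methods on `df`.

```python
import pandas as pd

# Made-up frame with one deliberately missing Crew Size value
toy = pd.DataFrame({
    "Crew Size": [2, 40, None],
    "Launch Vehicle": ["Ariane 6", "Starship", "Starship"],
})

# Count missing values per column, then in total
missing_per_column = toy.isna().sum()
total_missing = int(missing_per_column.sum())

print(missing_per_column)
print("Total missing values:", total_missing)
```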

In [4]:
df.describe().T # displaying the statistical summary of the dataset and using .T to transpose the dataframe to make it more readable


Unnamed: 0,count,mean,std,min,25%,50%,75%,max
Distance from Earth (light-years),500.0,25.48306,14.942128,0.35,11.75,26.185,38.57,49.9
Mission Duration (years),500.0,15.7368,7.578316,1.4,8.9,16.4,22.2,29.5
Mission Cost (billion USD),500.0,277.30028,141.137422,13.32,149.96,282.17,399.995,538.32
Scientific Yield (points),500.0,55.2234,26.446232,10.0,33.775,54.4,79.025,99.8
Crew Size,500.0,50.118,27.660989,1.0,27.0,50.0,74.0,99.0
Mission Success (%),500.0,92.6166,9.391094,66.0,85.5,98.6,100.0,100.0
Fuel Consumption (tons),500.0,2543.52214,1492.964489,18.06,1177.315,2597.985,3859.355,5018.6
Payload Weight (tons),500.0,50.35562,28.227546,1.02,25.5675,50.995,74.4825,99.78


<div style="font-family: 'Comic Sans MS', sans-serif; font-size: 18px; color: #333; background-color: #d4d4d4; padding: 10px; border-radius: 5px; width: 900px;">
    <h2 style="color: #2c3e50;">Now it's time to visualize the data 📊</h2>
    <p>Let's create some interactive visualizations using Plotly to better understand our dataset.</p>
</div>

In [5]:
# Using a pie chart to visualize the mission type
fig = px.pie(df, names='Mission Type', title='Mission Type')  # names='Mission Type' is the column we want to visualize
fig.show()

# You can click on the legend to hide/show the data and check the percentage without the hidden data

In [6]:
# Using a histogram to show the total Fuel Consumption (tons) for each Mission Duration (years) bin
# When both x and y are given, px.histogram bins x and sums y within each bin (histfunc='sum' by default)
fig = px.histogram(df, x='Mission Duration (years)', y='Fuel Consumption (tons)', title='Total Fuel Consumption by Mission Duration')
fig.show()

# You can hover over the bars to see the exact values

In [7]:
# Group by 'Mission Type' and calculate the mean for 'Fuel Consumption'
average_fuel_consumption = df.groupby('Mission Type')['Fuel Consumption (tons)'].mean().reset_index()

# Create a bar plot for average fuel consumption by mission type
fig = px.bar(average_fuel_consumption, x='Mission Type', y='Fuel Consumption (tons)',
             title='Average Fuel Consumption by Mission Type',
             color='Mission Type',      # give each mission type its own bar color
             color_discrete_sequence=px.colors.qualitative.Set2)        # use the Set2 qualitative palette instead of the default colors


fig.show()

In [8]:
# Create a line plot for Mission Cost over time
# Sort by launch date first so the line is drawn in chronological order
fig = px.line(df.sort_values('Launch Date'), x='Launch Date', y='Mission Cost (billion USD)',
              title='Mission Cost Over Time')

fig.show()

# You can hover over the line to see the exact values

In [9]:
# Create a scatter plot for Mission Cost vs. Mission Success
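fig = px.scatter(df, x='Mission Cost (billion USD)', y='Mission Success (%)',
                 title='Mission Cost vs. Mission Success',
                 color='Launch Vehicle',        # color by launch vehicle; coloring by Mission ID would create 500 legend entries
                 symbol='Mission Type',         # we used symbol to show the mission type
                 hover_name='Mission ID')       # the mission id is shown when hovering over a point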

fig.show()

In [10]:
# Select 5 random Mission IDs
random_missions = df.sample(n=5, random_state=42).sort_values(by='Scientific Yield (points)')  # random_state for reproducibility

# Create a funnel chart
fig = px.funnel(random_missions, x='Scientific Yield (points)', y='Mission ID',
                title='Funnel Chart of Scientific Yield for Random Missions',
                color='Mission ID',
                color_discrete_sequence=px.colors.qualitative.Set3)


fig.show()

In [11]:
# Create a new DataFrame with fictional mission data
data = {
    'Mission ID': ['MSN-001', 'MSN-002', 'MSN-003', 'MSN-004', 'MSN-005'],
    'Launch Date': pd.to_datetime(['2025-01-05', '2025-01-15', '2025-02-10', '2025-02-20', '2025-03-10']),
    'End Date': pd.to_datetime(['2025-01-10', '2025-01-25', '2025-02-15', '2025-02-25', '2025-03-11'])
}

# Convert the dictionary into a DataFrame
timeline_df = pd.DataFrame(data)

# Create a timeline plot
fig = px.timeline(timeline_df, x_start='Launch Date', x_end='End Date', y='Mission ID',
                  title='Timeline of Fictional Missions in 2025',
                  color='Mission ID',
                  color_discrete_sequence=px.colors.qualitative.Set1)

# Update the layout to make it more readable
fig.update_yaxes(title='Mission ID')
fig.update_xaxes(title='Date')

# Show the plot
fig.show()


<div style="font-family: 'Comic Sans MS', sans-serif; font-size: 18px; color: #333; background-color: #d4d4d4; padding: 10px; border-radius: 5px; width: 1300px;">
    <h2 style="color: #2c3e50;">Explore the other Chart Types with Plotly Express 🔍</h2>
    <p>So far we have covered the basic chart types in Plotly Express.</p>
    <p>Plotly Express offers a wide range of chart types to help you visualize complex data in meaningful ways. Here's a quick guide to what you can explore:</p>
</div>



<div style="font-family: 'Comic Sans MS', sans-serif; font-size: 18px; color: #333; background-color: #d4d4d4; padding: 10px; border-radius: 5px; width: 900px; margin-top: 10px;">
    <h3 style="color: #2c3e50;">Part-of-Whole Charts</h3>
    <p><strong>Pie, Sunburst, Treemap, Icicle, Funnel Area</strong>: Visualize hierarchical or proportional data. Great for showing relationships and proportions within categories.</p>
</div>

<div style="font-family: 'Comic Sans MS', sans-serif; font-size: 18px; color: #333; background-color: #d4d4d4; padding: 10px; border-radius: 5px; width: 900px; margin-top: 10px;">
    <h3 style="color: #2c3e50;">1D Distributions</h3>
    <p><strong>Histogram, Box, Violin, Strip, ECDF</strong>: Explore the distribution of a single variable. Use these to analyze spread, outliers, and patterns in your data.</p>
</div>

<div style="font-family: 'Comic Sans MS', sans-serif; font-size: 18px; color: #333; background-color: #d4d4d4; padding: 10px; border-radius: 5px; width: 900px; margin-top: 10px;">
    <h3 style="color: #2c3e50;">2D Distributions</h3>
    <p><strong>Density Heatmap, Density Contour</strong>: Visualize the relationship between two variables. Ideal for spotting trends, clusters, or density patterns.</p>
</div>

<div style="font-family: 'Comic Sans MS', sans-serif; font-size: 18px; color: #333; background-color: #d4d4d4; padding: 10px; border-radius: 5px; width: 900px; margin-top: 10px;">
    <h3 style="color: #2c3e50;">Matrix or Image Input</h3>
    <p><strong>Imshow</strong>: Display matrix-like data or images. Perfect for heatmaps, correlation matrices, or image visualizations.</p>
</div>

<div style="font-family: 'Comic Sans MS', sans-serif; font-size: 18px; color: #333; background-color: #d4d4d4; padding: 10px; border-radius: 5px; width: 900px; margin-top: 10px;">
    <h3 style="color: #2c3e50;">3-Dimensional Charts</h3>
    <p><strong>Scatter 3D, Line 3D</strong>: Create interactive 3D plots to explore relationships across three variables. Great for spatial or multidimensional data.</p>
</div>

<div style="font-family: 'Comic Sans MS', sans-serif; font-size: 18px; color: #333; background-color: #d4d4d4; padding: 10px; border-radius: 5px; width: 900px; margin-top: 10px;">
    <h3 style="color: #2c3e50;">Multidimensional Charts</h3>
    <p><strong>Scatter Matrix, Parallel Coordinates, Parallel Categories</strong>: Visualize high-dimensional data. Use these to compare multiple variables and identify patterns or correlations.</p>
</div>

<div style="font-family: 'Comic Sans MS', sans-serif; font-size: 18px; color: #333; background-color: #d4d4d4; padding: 10px; border-radius: 5px; width: 900px; margin-top: 10px;">
    <h3 style="color: #2c3e50;">Tile Maps</h3>
    <p><strong>Scatter Map, Line Map, Choropleth Map, Density Map</strong>: Create geographic visualizations. Ideal for location-based data, routes, or regional comparisons.</p>
</div>

<div style="font-family: 'Comic Sans MS', sans-serif; font-size: 18px; color: #333; background-color: #d4d4d4; padding: 10px; border-radius: 5px; width: 900px; margin-top: 10px;">
    <h3 style="color: #2c3e50;">Outline Maps</h3>
    <p><strong>Scatter Geo, Line Geo, Choropleth</strong>: Similar to tile maps, but drawn on simple country and coastline outlines instead of map tiles. Use these when street-level detail is not needed.</p>
</div>

<div style="font-family: 'Comic Sans MS', sans-serif; font-size: 18px; color: #333; background-color: #d4d4d4; padding: 10px; border-radius: 5px; width: 900px; margin-top: 10px;">
    <h3 style="color: #2c3e50;">Polar Charts</h3>
    <p><strong>Scatter Polar, Line Polar, Bar Polar</strong>: Visualize data in polar coordinates. Great for cyclic or directional data (e.g., time of day, wind direction).</p>
</div>

<div style="font-family: 'Comic Sans MS', sans-serif; font-size: 18px; color: #333; background-color: #d4d4d4; padding: 10px; border-radius: 5px; width: 900px; margin-top: 10px;">
    <h3 style="color: #2c3e50;">Ternary Charts</h3>
    <p><strong>Scatter Ternary, Line Ternary</strong>: Plot data in ternary coordinates. Useful for compositional data (e.g., mixtures of three components).</p>
</div>

<div style="font-family: 'Comic Sans MS', sans-serif; font-size: 18px; color: #333; background-color: #d4d4d4; padding: 10px; border-radius: 5px; width: 900px; margin-top: 10px;">
    <h3 style="color: #2c3e50;">Thank you for reading!</h3>
    <p>I hope you enjoyed this tutorial 😊</p>
    <p>Feel free to contact me through my LinkedIn profile: <a href="https://www.linkedin.com/in/mohammad-alkhatim-9b1770266/" target="_blank">LinkedIn</a></p>
</div>