In [2]:
import numpy as np
import pandas as pd
import plotly.express as px

# Map Box

Datasets with latitude and longitude can be visualised on a map, where each point marks a location. You can adjust point size (continuous values), colour (categorical or numerical), and even animate over time to explore different behaviours in the data.

``px.scatter_mapbox()`` plots data on a map using latitude and longitude. You can adjust **color, size, max dot size, zoom,** and **map style**. Call `px.colors.sequential.swatches()` to preview the available sequential colour scales, then pass the one you choose to the `color_continuous_scale` argument. The `mapbox_style` argument sets the base map. Set it explicitly to a style such as `carto-positron` or `open-street-map`, because the default style needs a Mapbox access token and the map will not render without one.

In [3]:
df_practice = pd.read_csv("https://raw.githubusercontent.com/Code-Institute-Solutions/sample-datasets/main/2014_us_cities.csv")
df_practice = df_practice.sample(n=70, random_state=1)

print(df_practice.shape)
df_practice.head()

(70, 4)


| | name | pop | lat | lon |
|---|---|---|---|---|
| 1604 | Marshfield | 18997 | 44.668852 | -90.171799 |
| 959 | Hobbs | 34384 | 32.702612 | -103.13604 |
| 3080 | Osceola | 7656 | 28.044384 | -81.143754 |
| 2084 | Miami | 13629 | 25.774266 | -80.193659 |
| 1827 | Durant | 16232 | 33.993986 | -96.370824 |


In [13]:
fig = px.scatter_mapbox(df_practice, lat="lat", lon="lon",
                        color='pop', size='pop',
                        mapbox_style='carto-positron',
                        color_continuous_scale=px.colors.sequential.Aggrnyl,
                        size_max=30, zoom=2.5,
                        hover_name='name'
                        )
fig.show()   

# Sunburst

A Sunburst plot displays hierarchical data in concentric rings, showing part-to-whole relationships. It’s useful for visualising multiple categorical variables and their proportions or counts, with colour based on either numerical or categorical values.

``px.sunburst()`` creates a sunburst chart. Key arguments include:
* `data_frame` (the dataset)
* `path` (hierarchy of sectors)
* `values` (sector sizes)
* `color` (sector colouring)
* `hover_name` / `hover_data` (extra info on hover).

You can **drill down by clicking a sector** and **zoom back out** by clicking the centre of the chart. Hovering over a sector shows its **aggregate value** (e.g., the total for a continent).

In [14]:
df = px.data.gapminder().query("year == 2007")
print(df.shape)
df.head()

(142, 8)


| | country | continent | year | lifeExp | pop | gdpPercap | iso_alpha | iso_num |
|---|---|---|---|---|---|---|---|---|
| 11 | Afghanistan | Asia | 2007 | 43.828 | 31889923 | 974.580338 | AFG | 4 |
| 23 | Albania | Europe | 2007 | 76.423 | 3600523 | 5937.029526 | ALB | 8 |
| 35 | Algeria | Africa | 2007 | 72.301 | 33333216 | 6223.367465 | DZA | 12 |
| 47 | Angola | Africa | 2007 | 42.731 | 12420476 | 4797.231267 | AGO | 24 |
| 59 | Argentina | Americas | 2007 | 75.32 | 40301927 | 12779.37964 | ARG | 32 |


In [15]:
fig = px.sunburst(data_frame=df, path=['continent', 'country'], values='pop',
                  color='gdpPercap', hover_name='iso_alpha', hover_data=['lifeExp'])
fig.show()

# Tree Map

A **Tree map** is another way to show **part-to-whole relationships** and **hierarchical data**.
* Instead of using concentric rings like a **Sunburst**, it uses **nested rectangles**.
* Each rectangle represents a category (or subcategory), and its **size corresponds to its value** (e.g., proportion or count).
* Colours can also be used to show an additional variable.

So, a **Sunburst** spreads data in circles outward from the centre, while a **Tree map** arranges it as rectangles inside larger rectangles.

In [16]:
df = px.data.gapminder().query("year == 2007")
print(df.shape)
df.head()

(142, 8)


| | country | continent | year | lifeExp | pop | gdpPercap | iso_alpha | iso_num |
|---|---|---|---|---|---|---|---|---|
| 11 | Afghanistan | Asia | 2007 | 43.828 | 31889923 | 974.580338 | AFG | 4 |
| 23 | Albania | Europe | 2007 | 76.423 | 3600523 | 5937.029526 | ALB | 8 |
| 35 | Algeria | Africa | 2007 | 72.301 | 33333216 | 6223.367465 | DZA | 12 |
| 47 | Angola | Africa | 2007 | 42.731 | 12420476 | 4797.231267 | AGO | 24 |
| 59 | Argentina | Americas | 2007 | 75.32 | 40301927 | 12779.37964 | ARG | 32 |


In [17]:
fig = px.treemap(data_frame=df, path=['continent', 'country'], values='pop',
                 color='gdpPercap', hover_name='iso_alpha', hover_data=['lifeExp'])
fig.show()

In [18]:
# px.Constant('World') adds a single root node above the continents
fig = px.treemap(data_frame=df, path=[px.Constant('World'), 'continent', 'country'], values='pop',
                 color='gdpPercap', hover_name='iso_alpha', hover_data=['lifeExp'])
fig.show()

In [23]:
df_practice = pd.read_csv("https://raw.githubusercontent.com/Code-Institute-Solutions/sample-datasets/main/sales_success.csv")
df_practice = df_practice.sample(n=50, random_state=1)
df_practice = df_practice.drop(columns=['Unnamed: 0'])
print(df_practice.shape)
df_practice.head(3)

(50, 5)


| | region | county | salesperson | calls | sales |
|---|---|---|---|---|---|
| 22 | West | Brewster | BS | 33 | 14 |
| 2 | North | Dallam | IJ | 20 | 6 |
| 49 | East | Houston | AX | 42 | 9 |


In [None]:
# Add Tree Map

fig = px.treemap(df_practice,
                 path=['region', 'county', 'salesperson'],
                 values='sales',
                 color='sales',
                 hover_name='calls'
                )
fig.show()

In [None]:
# Add Sunburst

fig = px.sunburst(df_practice,
                  path=['region', 'county', 'salesperson'],
                  values='sales',
                  color='sales',
                  hover_name='calls'
                  )
fig.show()

# Waterfall

A **Waterfall chart** is a type of plot that shows how a starting value changes step by step due to a series of positive and negative contributions, eventually leading to an ending value. 
* It looks like a series of **columns that “float”**, with each one showing the effect of an increase or decrease.
* The **first bar** usually shows the starting point (e.g., initial revenue).
* **Intermediate bars** show changes (e.g., expenses, gains, losses).
* The **final bar** shows the ending point (e.g., net profit).

🔹**What it’s used for:**
- To explain how a value evolves over time or through different factors.
- Common in **finance** (e.g., revenue → costs → profit).
- Useful for attribution analysis (seeing which parts contributed most to growth or decline).

In [26]:
df = pd.DataFrame({"Revenue Product 1": [500],
                   "Revenue Product 2": [100],
                   "Revenue Product 3": [600],
                   "Revenue Product 4": [250],
                   "Net revenue": [1450], 
                   "Fixed Cost": [-200],
                   "Variable Cost": [-400],
                   "Trips Expenses": [-300],
                   "Other expenses": [-10],
                   "Operating profit": [540],
                   "Income tax": [-200],
                   "Net Profit": [340]},
              index=['Value'])

df = df.T  # .T transposes the DataFrame, swapping rows and columns
df

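| | Value |
|---|---|
| Revenue Product 1 | 500 |
| Revenue Product 2 | 100 |
| Revenue Product 3 | 600 |
| Revenue Product 4 | 250 |
| Net revenue | 1450 |
| Fixed Cost | -200 |
| Variable Cost | -400 |
| Trips Expenses | -300 |
| Other expenses | -10 |
| Operating profit | 540 |
| Income tax | -200 |
| Net Profit | 340 |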


Waterfall charts in Plotly are created using **Plotly Graph Objects** (not Plotly Express). Key arguments include:
* `orientation` – sets the chart direction (`"h"` for horizontal, `"v"` for vertical)
* `measure` – defines each step as `relative` (added to or subtracted from the running total), `total` (shows the value accumulated so far), or `absolute` (resets the running total to the given value).
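The relationship between `relative` and `total` steps can be checked by hand. The sketch below keeps only the `relative` rows of the table above and rebuilds the three totals as a running sum; the variable names are introduced just for this check.

```python
import pandas as pd

# Only the "relative" steps from the table above; the totals are derived, not supplied.
relative_steps = pd.Series({
    "Revenue Product 1": 500,
    "Revenue Product 2": 100,
    "Revenue Product 3": 600,
    "Revenue Product 4": 250,
    "Fixed Cost": -200,
    "Variable Cost": -400,
    "Trips Expenses": -300,
    "Other expenses": -10,
    "Income tax": -200,
})

# A "total" bar shows the running sum of every relative step before it.
running_total = relative_steps.cumsum()

net_revenue = running_total["Revenue Product 4"]     # after the four revenue steps
operating_profit = running_total["Other expenses"]   # after all operating costs
net_profit = running_total["Income tax"]             # after tax

print(net_revenue, operating_profit, net_profit)
```

The three results match the `Net revenue`, `Operating profit` and `Net Profit` rows of the table, so the `total` bars are consistent with the steps that precede them.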

In [27]:
import plotly.graph_objects as go
fig = go.Figure(go.Waterfall(
    orientation = "h",
    measure = ["relative", "relative", "relative", "relative","total",
               "relative", "relative", "relative", "relative","total",
                "relative","total"],
    y = df.index.to_list(),
    x = df['Value']
))


# go.Waterfall() has no arguments for width, height or title,
# so we set them on the figure with fig.update_layout()
fig.update_layout(title = "Waterfall Chart - Profit & Losses in 2020 in M$",
                  width=800, height=500)
fig.show()

#### Setting Templates
You can list the available templates (themes) with `plotly.io` and apply one by passing its name to the `template` argument.

In [29]:
import plotly.io as pio
pio.templates

Templates configuration
-----------------------
    Default template: 'plotly'
    Available templates:
        ['ggplot2', 'seaborn', 'simple_white', 'plotly',
         'plotly_white', 'plotly_dark', 'presentation', 'xgridoff',
         'ygridoff', 'gridon', 'none']

In [30]:
df = px.data.gapminder()
df = df.sample(n=50, random_state=1)
df.head(3)

| | country | continent | year | lifeExp | pop | gdpPercap | iso_alpha | iso_num |
|---|---|---|---|---|---|---|---|---|
| 708 | Indonesia | Asia | 1952 | 37.468 | 82052000 | 749.681655 | IDN | 360 |
| 1353 | Sierra Leone | Africa | 1997 | 39.897 | 4578212 | 574.648158 | SLE | 694 |
| 491 | Equatorial Guinea | Africa | 2007 | 51.579 | 551201 | 12154.08975 | GNQ | 226 |


In [31]:
fig = px.scatter(df, x="gdpPercap", y="lifeExp", 
                 animation_frame="year", 
                 size="pop",size_max=55, 
                 color="continent", hover_name="country",
                 log_x=True, range_x=[100,100000], range_y=[25,90],
                 width=600, height=350,
                 template='simple_white'  # set the template
                 )

fig.show()