<a href="https://colab.research.google.com/github/cagBRT/Intro-to-Programming-with-Python/blob/master/Data_Visualization_Plots.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This notebook covers five chart types, each suited to a different kind of data: Chord diagrams, Sunburst charts, Hexbin plots, Sankey diagrams, and Stream graphs.

## Chord Diagram

A Chord diagram shows the relationships between entities whose pairwise connections are given as a matrix.
<br>
The diagram consists of a circle whose circumference is divided into segments, one per entity. Arcs (or chords) connect the segments and represent the relationships between them.

The thickness of each arc is proportional to the value of the relationship it represents.

![picture](https://miro.medium.com/v2/resize:fit:1400/format:webp/0*a8FAIkubQVMsyFCf.png)

Use Chord diagrams to demonstrate: <br>
>Social networks<br>
Genomics data<br>
Traffic flow data<br>
Trade relationships data<br>
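
A Chord diagram is drawn from weighted source/target pairs, so a relationship matrix first has to be flattened into that form. The following is a minimal sketch of that step in plain Python. The 3×3 `matrix`, the `names`, and the helper `matrix_to_edges` are hypothetical and only illustrate the idea.

```python
# Hypothetical flow matrix: matrix[i][j] is the flow from names[i] to names[j]
names = ["A", "B", "C"]
matrix = [
    [0, 5, 2],
    [1, 0, 4],
    [3, 6, 0],
]

def matrix_to_edges(matrix, names):
    """Flatten a square matrix into (source, target, value) triples,
    skipping empty cells so that no zero-width chords are produced."""
    return [
        (names[i], names[j], value)
        for i, row in enumerate(matrix)
        for j, value in enumerate(row)
        if value > 0
    ]

edges = matrix_to_edges(matrix, names)
for source, target, value in edges:
    print(f"{source} -> {target}: {value}")
```

Each triple corresponds to one chord, and its `value` determines the chord's thickness.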

### Create a Chord Diagram

Use the HoloViews and Bokeh libraries to create a Chord diagram.



In [None]:
import holoviews as hv
from holoviews import opts
import pandas as pd
import numpy as np
hv.extension('bokeh')

# Sample matrix representing the export volumes between 5 countries
export_data = np.array([[0, 50, 30, 20, 10],
                        [10, 0, 40, 30, 20],
                        [20, 10, 0, 35, 25],
                        [30, 20, 10, 0, 40],
                        [25, 15, 30, 20, 0]])

labels = ['USA', 'China', 'Germany', 'Japan', 'India']

# Creating a pandas DataFrame
df = pd.DataFrame(export_data, index=labels, columns=labels)
df = df.stack().reset_index()

df.columns = ['source', 'target', 'value']

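# Creating a Chord object and dropping the zero-valued self-links on the diagonal.
# select() returns a new object, so its result has to be kept.
chord = hv.Chord(df).select(value=(5, None))

# Styling the Chord diagram
# 'index' is the node dimension that holds the country names
chord.opts(
    opts.Chord(
        cmap='Category20', edge_cmap='Category20',
        labels='index', label_text_font_size='10pt',
        edge_color='source', node_color='index',
        width=700, height=700
    )
)

# Display the plot
chord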

## Sunburst Chart
The Sunburst chart visualises hierarchical (tree-like) data in a circular layout.<br>

The chart consists of concentric rings, each representing one level of the hierarchy.<br>

The centre of the chart is the root (top level) of the hierarchy.<br>

Each segment (sector) of a ring represents a node at that level of the hierarchy.<br>

The size of each segment is proportional to its value relative to its siblings.

Use Sunburst charts to plot hierarchical data for:<br>

>File systems<br>
Website navigation paths<br>
Market segmentation<br>
Genomic data<br>
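
To make the "proportional to its siblings" rule concrete, here is a minimal sketch that computes the angular span of every sector from a nested dictionary. The `tree` (hypothetical file sizes) and the helpers `total` and `sector_spans` are assumptions made for illustration; plotting libraries do this layout internally.

```python
def total(node):
    """Value of a node: the number itself for a leaf, the sum of its children otherwise."""
    if isinstance(node, dict):
        return sum(total(child) for child in node.values())
    return node

def sector_spans(children, start=0.0, span=360.0, level=1):
    """Yield (level, name, start_angle, span) for every node below the root.
    Each child's span is its parent's span scaled by its share of the siblings' total."""
    siblings_total = sum(total(child) for child in children.values())
    angle = start
    for name, child in children.items():
        child_span = span * total(child) / siblings_total
        yield level, name, angle, child_span
        if isinstance(child, dict):
            # Children are laid out inside the parent's angular range, one ring further out
            yield from sector_spans(child, angle, child_span, level + 1)
        angle += child_span

# Hypothetical file system: folders are dicts, files are sizes
tree = {"docs": {"a.txt": 10, "b.txt": 30}, "src": {"main.py": 60}}
sectors = list(sector_spans(tree))
for level, name, start, span in sectors:
    print(f"ring {level}: {name:8s} starts at {start:6.1f} deg, spans {span:6.1f} deg")
```

The sectors of the first ring always add up to the full circle, and each child ring subdivides its parent's range.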

In [None]:
import plotly.express as px
import numpy as np

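# Gapminder sample data bundled with Plotly, restricted to the year 2007
df = px.data.gapminder().query("year == 2007")

# Inner ring: continents; outer ring: countries.
# Sector size encodes population; colour encodes life expectancy,
# with the colour scale centred on the population-weighted mean.
fig = px.sunburst(df, path=['continent', 'country'],
                  values='pop',
                  color='lifeExp',
                  hover_data=['iso_alpha'],
                  color_continuous_scale='RdBu',
                  color_continuous_midpoint=np.average(df['lifeExp'], weights=df['pop']))
fig.show()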

## Hexbin Plot
A Hexbin plot is a 2D histogram whose bins are hexagons; colour encodes the number of data points in each bin.<br>

It is used to analyse the relationship between two variables (bivariate data).<br>

It is a good alternative to a scatter plot for large datasets, where the points would otherwise overlap and obscure each other.<br>
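
The binning step can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the exact routine of any plotting library: the helper `hexbin_counts` and its `size` parameter are hypothetical. Hexagon centres are modelled as two staggered rectangular lattices, and each point is assigned to the nearer of its two candidate centres.

```python
import math
import random
from collections import Counter

def hexbin_counts(xs, ys, size=1.0):
    """Count points per hexagonal cell, keyed by the cell's centre (x, y)."""
    row_height = size * math.sqrt(3)
    counts = Counter()
    for x, y in zip(xs, ys):
        # Rescale so that both lattices have unit spacing
        u, v = x / size, y / row_height
        # Candidate on lattice A: centres at integer (i, j)
        ia, ja = math.floor(u + 0.5), math.floor(v + 0.5)
        # Candidate on lattice B: centres at (i + 0.5, j + 0.5)
        ib, jb = math.floor(u), math.floor(v)
        # Squared distances in data units, up to the common factor size**2
        da = (u - ia) ** 2 + 3 * (v - ja) ** 2
        db = (u - ib - 0.5) ** 2 + 3 * (v - jb - 0.5) ** 2
        if da <= db:
            centre = (ia * size, ja * row_height)
        else:
            centre = ((ib + 0.5) * size, (jb + 0.5) * row_height)
        counts[centre] += 1
    return counts

random.seed(0)  # fixed seed so the sketch is reproducible
xs = [random.gauss(0, 2) for _ in range(10_000)]
ys = [random.gauss(0, 2) for _ in range(10_000)]
counts = hexbin_counts(xs, ys, size=1.0)
print(len(counts), "non-empty hexagons hold", sum(counts.values()), "points")
```

Ten thousand overlapping points collapse into a much smaller number of cells, and the per-cell counts are what the colour scale of a Hexbin plot displays.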

In [None]:
import matplotlib.pyplot as plt
import numpy as np

# Simulating environmental data
aqi = np.random.uniform(0, 300, 10000)
hospital_visits = aqi * np.random.uniform(0.5, 1.5, 10000)

# Creating the hexbin plot
plt.hexbin(aqi, hospital_visits, gridsize=30, cmap='Purples')

# Adding a color bar on the right
cb = plt.colorbar(label='Count')

# Setting labels and title
plt.xlabel('Air Quality Index (AQI)')
plt.ylabel('Respiratory-related Hospital Visits')
plt.title('Environmental Health Impact Analysis')

# Show the plot
plt.show()

## Sankey Diagram
A Sankey diagram represents the flow of quantities between the stages or parts of a system.<br>

It consists of nodes and the links between them.<br>

The width of each link is proportional to the flow quantity.<br>

The diagram also shows the direction of flow.<br>

Use Sankey diagrams to visualise:<br>

>Supply chain/Logistics data<br>
Traffic flow data<br>
Data flow<br>
Energy flow data<br>
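
Because link width is proportional to value, the total width entering or leaving a node is the sum of the values of its incoming or outgoing links. The sketch below computes those totals using the same index-based source/target encoding as the Plotly example in this notebook. The node names and flow values here are hypothetical.

```python
from collections import defaultdict

# Hypothetical flows; source and target hold indices into labels
labels = ["Farm", "Factory", "Shop", "Export"]
source = [0, 1, 1]
target = [1, 2, 3]
value = [100, 70, 30]

outflow = defaultdict(int)
inflow = defaultdict(int)
for s, t, v in zip(source, target, value):
    outflow[labels[s]] += v  # total width leaving the source node
    inflow[labels[t]] += v   # total width arriving at the target node

for name in labels:
    print(f"{name:8s} in={inflow[name]:4d} out={outflow[name]:4d}")
```

For an intermediate node such as `Factory` in this sample, inflow and outflow match, which is what makes the links appear to pass through the node without gaps.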


In [None]:
import plotly.graph_objects as go

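# Node names; links refer to nodes by their position in this list
labels = ["Coal", "Solar", "Wind", "Nuclear", "Residential", "Industrial", "Commercial"]

# Each (source, target, value) triple is one link, e.g. Coal -> Residential: 25.
# "Commercial" (index 6) is listed as a node, but no link in this sample targets it.
source = [0, 1, 2, 3, 0, 1, 2, 3]
target = [4, 4, 4, 4, 5, 5, 5, 5]
value = [25, 10, 40, 20, 30, 15, 25, 35]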

# Create the Sankey diagram object
fig = go.Figure(data=[go.Sankey(
    node=dict(
        pad=15,
        thickness=20,
        line=dict(color="black", width=0.5),
        label=labels
    ),
    link=dict(
        source=source,
        target=target,
        value=value
    ))])

fig.update_layout(title_text="Energy Flow in Model City", font_size=12)
fig.show()

## Stream Graph
A Stream graph (also called a ThemeRiver) is a stacked area graph laid out around a central axis, which gives it a flowing shape.<br>

Each stream represents the time series of one category and has its own colour.<br>

The plot can be used to visualise time series data such as:<br>

>Popularity trends<br>
Financial data<br>
Website traffic data<br>
Sales/Marketing data<br>
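
The "central axis" layout can be sketched as a stacked area chart whose baseline is shifted so that every time step is centred on zero. The code below is an illustration of that idea only, not the implementation used by Altair or Vega-Lite. The `series` data and the helper `centred_stack` are hypothetical.

```python
# Hypothetical counts for three categories over four time steps
series = {
    "A": [2, 4, 6, 4],
    "B": [1, 3, 5, 3],
    "C": [3, 3, 1, 1],
}

def centred_stack(series):
    """Return {name: [(lower, upper), ...]} where, at each time step,
    the stack is centred on zero: the bottom layer starts at -total / 2."""
    n_steps = len(next(iter(series.values())))
    bands = {name: [] for name in series}
    for t in range(n_steps):
        total = sum(values[t] for values in series.values())
        lower = -total / 2
        for name, values in series.items():
            upper = lower + values[t]
            bands[name].append((lower, upper))
            lower = upper  # the next layer sits on top of this one
    return bands

bands = centred_stack(series)
for name, band in bands.items():
    print(name, band)
```

Each band keeps its true thickness, but the whole stack is symmetric about the horizontal axis, so changes in the total show up as the stream widening or narrowing.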

In [None]:
!pip install vega_datasets altair

In [None]:
import altair as alt
from vega_datasets import data

source = data.unemployment_across_industries.url

alt.Chart(source).mark_area().encode(
    alt.X('yearmonth(date):T',
        axis=alt.Axis(format='%Y', domain=False, tickSize=0)
    ),
    alt.Y('sum(count):Q', stack='center', axis=None),
    alt.Color('series:N',
        scale=alt.Scale(scheme='category20b')
    )
).interactive()