# Plotly Data Visualization (Created for Personal Study Purposes)

In [184]:
import pandas as pd      # Data tables
import numpy as np       # Numbers and arrays
import plotly.express as px  # Fast interactive charts
import plotly.graph_objects as go  # Advanced custom charts
print("NumPy:", np.__version__)
print("Pandas:", pd.__version__)

NumPy: 1.24.3
Pandas: 2.0.3


# Scatter Plot:
A scatter plot shows the relationship between 2 variables on the x and y-axis. 

In [185]:
# Example 1: Illustrate income vs. age in a scatter plot

# Step 1: Create age_array
# Ages for 60 individuals, from 25 up to 54 (randint excludes the upper bound 55)
age_array = np.random.randint(25, 55, size=60)

# Step 2: Create income_array
# Incomes for the same 60 individuals, from 300000 up to 699999
income_array = np.random.randint(300000, 700000, size=60)

# Step 3: Make the scatter plot
# Create the chart and store it in a variable named fig.
fig = px.scatter(
    x=age_array,
    y=income_array,
    title="Income vs Age (Sample Data), Economic Survey",
    labels={"x": "Age (Years)", "y": "Income"},
)
fig.show()


Inferences:
From the above plot we find that income is not correlated with age: as age increases, income may or may not increase. This is expected here, because both arrays are generated randomly and independently of each other.
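
The inference above can also be checked numerically. The following is a minimal sketch, assuming a seeded generator (`np.random.default_rng(0)`, which is not part of the original notebook) so that the result is reproducible. It computes the Pearson correlation coefficient between the two arrays; a value close to 0 would support the "not correlated" reading.

```python
import numpy as np

# Seeded generator so the sketch is reproducible (the seed 0 is an arbitrary assumption)
rng = np.random.default_rng(0)

# Same ranges as the example: ages 25 to 54, incomes 300000 to 699999
age = rng.integers(25, 55, size=60)
income = rng.integers(300000, 700000, size=60)

# Pearson correlation coefficient: the off-diagonal entry of the 2x2 correlation matrix
r = np.corrcoef(age, income)[0, 1]
print(f"Pearson r = {r:.3f}")
```

The coefficient always lies between -1 and 1; the closer it is to 0, the weaker the linear relationship.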

# Line Plot:

A line plot shows information that changes continuously over time.

In [186]:
# Example 2: Line Chart – Bicycle Sales Jan–Aug

# Step 1: Create an array for the number of bicycles sold.
numberOfBicyclesSold_array = [50, 100, 40, 150, 160, 70, 60, 45]

# Step 2: Create an array for the months.
months_array = ["Jan", "Feb", "Mar", "April", "May", "June", "July", "August"]

# Step 3: Create the line chart.
fig = px.line(
    x=months_array,
    y=numberOfBicyclesSold_array,
    title="Bicycle Sales from Jan to Aug (Last Year)",
    labels={"x": "Months", "y": "Number of Bicycles Sold"},
)
fig.show()


In [187]:
# Line Chart – Bicycle Sales Jan–Aug (adds a marker on each data point)

# Step 1: Create an array for the number of bicycles sold.
numberOfBicyclesSold_array = [50, 100, 40, 150, 160, 70, 60, 45]

# Step 2: Create an array for the months.
months_array = ["Jan", "Feb", "Mar", "April", "May", "June", "July", "August"]

# Step 3: Create the line chart; markers=True draws a dot on each data point.
fig = px.line(
    x=months_array,
    y=numberOfBicyclesSold_array,
    title="Bicycle Sales from Jan to Aug (Last Year)",
    labels={"x": "Months", "y": "Number of Bicycles Sold"},
    markers=True,
)
fig.show()


In [188]:
# Line Chart – Bicycle Sales Jan–Aug (Styling Upgrades)

# Step 1: Create an array for the number of bicycles sold.
numberOfBicyclesSold_array = [50, 100, 40, 150, 160, 70, 60, 45]

# Step 2: Create an array for the months.
months_array = ["Jan", "Feb", "Mar", "April", "May", "June", "July", "August"]

# Step 3: Create the line chart.
fig = px.line(
    x=months_array,
    y=numberOfBicyclesSold_array,
    title="Bicycle Sales from Jan to Aug (Last Year)",
    labels={"x": "Months", "y": "Number of Bicycles Sold"},
    markers=True,
)

# Step 4: Styling upgrades (green line, 3 px wide)
fig.update_traces(line_color="green", line_width=3)
fig.update_layout(template="plotly_white")
fig.show()

Inferences:
From the above plot we find that sales are highest in May (160 bicycles) and decline in every month after that.
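
The inference above can be read off the data directly. This is a small sketch using the same sales figures; the variable names `peak_month` and `monthly_change` are illustrative and not from the original notebook.

```python
import numpy as np

months = ["Jan", "Feb", "Mar", "April", "May", "June", "July", "August"]
sales = np.array([50, 100, 40, 150, 160, 70, 60, 45])

# Month with the highest sales
peak_month = months[int(np.argmax(sales))]

# Change in sales from each month to the next (negative means a decline)
monthly_change = np.diff(sales)

print(peak_month)               # May
print(monthly_change.tolist())  # [50, -60, 110, 10, -90, -10, -15]
```

Every change after May is negative, which matches the decline seen in the chart.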

# Bar Plot:
A bar plot represents categorical data in rectangular bars.

In [189]:
# Example 3: Illustrate the pass percentage of classes from Grade 6 to Grade 10
# Define an array containing the pass percentage of each grade
score_array = [80, 90, 56, 88, 95]
# Define an array containing the grade names
grade_array = ['Grade 6', 'Grade 7', 'Grade 8', 'Grade 9', 'Grade 10']

# Draw one bar per grade; the bar height is the pass percentage
fig = px.bar(
    x=grade_array,
    y=score_array,
    title='Pass Percentage of Classes',
    labels={"x": "Grade", "y": "Pass Percentage"},
)
fig.show()


Inferences:
From the above plot we find that Grade 8 has the lowest pass percentage (56) and Grade 10 has the highest (95).
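
The lowest and highest bars can also be found without reading the chart. This is a minimal sketch that puts the same numbers into a pandas Series; the names `scores`, `lowest`, `highest` and `ranked` are illustrative only.

```python
import pandas as pd

scores = pd.Series(
    [80, 90, 56, 88, 95],
    index=['Grade 6', 'Grade 7', 'Grade 8', 'Grade 9', 'Grade 10'],
    name="PassPercentage",
)

lowest = scores.idxmin()    # label of the smallest value
highest = scores.idxmax()   # label of the largest value

# Grades ordered from best to worst pass percentage
ranked = scores.sort_values(ascending=False)

print(lowest, highest)
print(ranked)
```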



# Histogram:
A histogram represents the distribution of continuous data in the form of bars. Each bar counts how many values fall into one range (bin).

In [190]:
# Example 4: Let us illustrate the distribution of heights of 200 people using a histogram

# Step 1:
# Draw 200 heights from a normal distribution with mean 160 and standard deviation 11
heights_array = np.random.normal(160, 11, 200)

# Step 2:
# Use the Plotly Express histogram function px.histogram and pass the data as x
fig = px.histogram(x=heights_array, title="Distribution of Heights")
fig.show()



Inferences: In this run, about 40 people have a height between approximately 160 cm and 165 cm. The data is random, so the exact count changes every time the cell is executed.

Note: Why is no y= given?

Because in a histogram:
x-axis shows the data values (here: heights)
y-axis shows the frequency / count of how many people fall into each height range
Plotly computes these counts automatically, so only x needs to be passed.
The y-axis is still there, and Plotly Express labels it "count" by default.
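
To see what the y-axis of a histogram holds, the counts can be computed by hand. The sketch below uses `np.histogram` with 10 equal-width bins and a seeded generator; both are assumptions for illustration, and Plotly picks its own bins, so the numbers will not match the chart exactly.

```python
import numpy as np

# Seeded generator so the sketch is reproducible (the seed 0 is an arbitrary assumption)
rng = np.random.default_rng(0)
heights = rng.normal(160, 11, 200)

# counts[i] is the number of heights between edges[i] and edges[i + 1]
counts, edges = np.histogram(heights, bins=10)

for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left:6.1f} to {right:6.1f} cm: {count}")
```

Whatever the bins are, the counts always add up to the number of data points (200 here).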


# Bubble Plot
A bubble plot is used to show the relationship between 3 or more variables. It is an extension of a scatter plot. 

In [191]:
# Example 5: Illustrate crime statistics of US cities with a bubble chart.
# Step 1: Create a dictionary with City, Numberofcrimes and Year as its 3 keys
crime_details = {
    'City': ['Chicago', 'Chicago', 'Austin', 'Austin', 'Seattle', 'Seattle'],
    'Numberofcrimes': [1000, 1200, 400, 700, 350, 1500],
    'Year': ['2007', '2008', '2007', '2008', '2007', '2008'],
}
# Step 2: Create a DataFrame object from the dictionary
df = pd.DataFrame(crime_details)

df


      City  Numberofcrimes  Year
0  Chicago            1000  2007
1  Chicago            1200  2008
2   Austin             400  2007
3   Austin             700  2008
4  Seattle             350  2007
5  Seattle            1500  2008


In [192]:
# Step 5: Group the rows by city and find the total number of crimes per city
bub_data = df.groupby('City')['Numberofcrimes'].sum()

# Display the grouped result (a Series indexed by City, not yet a DataFrame)
bub_data

City
Austin     1100
Chicago    2200
Seattle    1850
Name: Numberofcrimes, dtype: int64

In [193]:
# Step 6: Add reset_index() so that City becomes a regular column and bub_data is a DataFrame
bub_data = df.groupby('City')['Numberofcrimes'].sum().reset_index()
bub_data

      City  Numberofcrimes
0   Austin            1100
1  Chicago            2200
2  Seattle            1850
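
The difference between the two grouped results above is the return type. This is a minimal sketch of that difference. The `as_index=False` variant is an alternative that is not used in the original notebook; it should produce the same table in one step.

```python
import pandas as pd

df = pd.DataFrame({
    'City': ['Chicago', 'Chicago', 'Austin', 'Austin', 'Seattle', 'Seattle'],
    'Numberofcrimes': [1000, 1200, 400, 700, 350, 1500],
    'Year': ['2007', '2008', '2007', '2008', '2007', '2008'],
})

# Without reset_index(): a Series whose index holds the city names
as_series = df.groupby('City')['Numberofcrimes'].sum()

# With reset_index(): a DataFrame where City is an ordinary column
as_frame = as_series.reset_index()

# Alternative (assumption, not in the notebook): keep City as a column while grouping
alt_frame = df.groupby('City', as_index=False)['Numberofcrimes'].sum()

print(type(as_series).__name__, list(as_frame.columns))
```

Plotly Express refers to columns by name (`x='City'`), which is why the DataFrame form is the convenient one for the chart.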


In [194]:
# Step 7: Create Bubble chart

fig = px.scatter(
    bub_data,
    x='City',                # City on x-axis
    y='Numberofcrimes',        # Crimes on y-axis
    size='Numberofcrimes',     # Bubble size = number of crimes
    color='City',            # Different color per city (helps visually)
    size_max=100,             # Maximum bubble size
    title='Total Number of Crimes by City – Bubble Chart',
    labels={'City': 'City', 'Numberofcrimes': 'Total Number of Crimes'}  # keys must match column names
)

fig.show()


Inferences: The size of the bubbles in the bubble chart indicates that Chicago has the highest total number of crimes (2200) when compared with the other 2 cities.

# Pie Plot:

A pie plot is a circular chart mainly used to show what proportion each part of the data takes of the whole. Each slice represents one proportion, and all slices together add up to the whole.

In [195]:
# Example 6: Monthly expenditure of a family

# Step 1: Create sample data (percentages that add up to 100)
exp_percent = [20, 50, 10, 8, 12]
house_holdcategories = ['Grocery', 'Rent', 'School Fees', 'Transport', 'Savings']

# Step 2: Create Pie chart
fig = px.pie(
    names=house_holdcategories,
    values=exp_percent,
    title="Monthly Expenditure of a Family",
    hole=0.0  # 0.4 if you want a donut chart
)

fig.show()


Note:
* Why no x= and y= in pie charts?

Because pie charts do not have axes ❌ (no x-axis, no y-axis)

A pie chart shows:
labels (the slice names)
values (how big each slice is)
____________________________________
* Line          Meaning

names=          Text labels for each slice
values=         Size of each slice (here, percentages)
hole=           Makes a donut chart if > 0
fig.show()      Displays the chart
___________________________________
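
The values passed to a pie chart do not have to be percentages, because each slice is sized by its share of the total. The sketch below illustrates that arithmetic. The `amounts` array is hypothetical and was chosen so that it has the same proportions as the percentages in the example.

```python
import numpy as np

percentages = np.array([20, 50, 10, 8, 12])       # values from the example
amounts = np.array([400, 1000, 200, 160, 240])    # hypothetical amounts, same proportions

# Share of the whole = value / sum of all values
share_from_percentages = percentages / percentages.sum()
share_from_amounts = amounts / amounts.sum()

print(share_from_percentages.tolist())
print(share_from_amounts.tolist())
```

Both arrays give the same shares, so both would draw the same pie.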


# Sunburst Charts:
Sunburst charts represent hierarchical data in the form of concentric circles. The innermost circle is the root node, which defines the top-level parent, and the outer rings move down the hierarchy away from the centre. They are also called radial charts.

A sunburst chart shows family or hierarchical relationships.
It looks like a circle with layers:
Center = the root person
Next ring = children
Next ring = grandchildren
And so on…
It shows who belongs to whom and how big their value is.


In [196]:
# Create a dictionary with three arrays: the people (character), the parent of each person (parent),
# and the value associated with each person (value). The root, Eve, has an empty string as its parent.
data = dict(
    character=["Eve", "Cain", "Seth", "Enos", "Noam", "Abel", "Awan", "Enoch", "Azura"],
    parent=["", "Eve", "Eve", "Seth", "Seth", "Eve", "Eve", "Awan", "Eve" ],
    value=[10, 14, 12, 10, 2, 6, 6, 4, 4])

fig = px.sunburst(
    data,
    names='character',
    parents='parent',
    values='value',
    title="Family chart"
)
fig.show()

Note: A sunburst chart shows hierarchical relationships (here, a family).
names are the characters, parents define who each character belongs to,
and values define how big each portion is.
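
The names and parents arrays described above only form a valid hierarchy if every parent is itself one of the names (or the empty string for the root). This is a minimal sketch of that check in plain Python; the helper names `missing`, `children` and `roots` are illustrative only.

```python
names = ["Eve", "Cain", "Seth", "Enos", "Noam", "Abel", "Awan", "Enoch", "Azura"]
parents = ["", "Eve", "Eve", "Seth", "Seth", "Eve", "Eve", "Awan", "Eve"]

# Parents that do not appear among the names (should be empty for a valid hierarchy)
missing = [p for p in parents if p and p not in names]

# Map each parent to the list of its children, in the order they appear
children = {}
for name, parent in zip(names, parents):
    children.setdefault(parent, []).append(name)

# Entries whose parent is "" sit at the centre of the chart
roots = children.get("", [])

print(roots)            # ['Eve']
print(children["Eve"])  # ['Cain', 'Seth', 'Abel', 'Awan', 'Azura']
print(children["Seth"]) # ['Enos', 'Noam']
```

Each key of `children` corresponds to one node that has an outer ring around it in the chart.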