![Banner](https://github.com/Data-Dunkers/lessons/blob/main/images/top-banner.jpg?raw=true)

# Misleading Visualizations

We are going to use NBA player statistics from the 2024-2025 season to show some ways that data visualizations can be used to spread disinformation (intentional) or misinformation (unintentional).

Let's start by importing the required libraries and data.

In [None]:
import pandas as pd
import plotly.express as px
import numpy as np
df = pd.read_csv('https://raw.githubusercontent.com/Data-Dunkers/data/refs/heads/main/NBA/team/2024-2025/2024-2025_players.csv')
df = df[df['Name'] != 'Total'] # drop the rows that are team total values
df

## Adjusted Y-Axis

One classic way to mislead with data visualizations is by adjusting the y-axis to make differences appear larger (or smaller). Let's compare the points per game for three players with two different y-axis scales.

In [None]:
subset = df[df['Name'].isin(['Pascal Siakam', 'Scottie Barnes', 'Tyrese Haliburton'])]
px.bar(subset, x='Name', y='PTS', title='Points Per Game<br>(Truncated Axis)', range_y=[subset['PTS'].min()-0.1, subset['PTS'].max()]).show()
px.bar(subset, x='Name', y='PTS', title='Points Per Game<br>(0 to Maximum)', range_y=[0, subset['PTS'].max()]).show() # the maximum of the three players, matching the chart above
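
To see why the truncated axis is so effective, we can work out the arithmetic behind the bar heights. The sketch below uses two hypothetical points-per-game values (they are not taken from the dataset), and the helper `drawn_height_ratio` is a name made up for this illustration.

```python
def drawn_height_ratio(a, b, axis_min=0.0):
    """Ratio of the drawn bar heights for values a and b when the y-axis starts at axis_min."""
    if axis_min >= min(a, b):
        raise ValueError("axis_min must be below both values")
    return (a - axis_min) / (b - axis_min)

# Hypothetical points-per-game values, not taken from the dataset
player_a, player_b = 20.0, 18.0

true_ratio = drawn_height_ratio(player_a, player_b)              # axis starts at 0
truncated_ratio = drawn_height_ratio(player_a, player_b, 17.0)   # axis starts at 17

print(f"Axis from 0:  bar A is drawn {true_ratio:.2f}x as tall as bar B")
print(f"Axis from 17: bar A is drawn {truncated_ratio:.2f}x as tall as bar B")
```

The values themselves never change; only the baseline does. The closer the axis minimum gets to the smallest value, the more the drawn ratio grows.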

## Inverted Y-Axis

We can also mislead by inverting the y-axis to imply that an increase is actually a decrease (or vice versa).

As an example, we can create a visualization that seems to show that more field goals made results in fewer points.

In [None]:
px.scatter(df, x='FGM', y='PTS', trendline='ols', title='Misleading: Points vs. Field Goals Made').update_yaxes(autorange='reversed')
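
Reversing the axis changes how the chart is drawn, not the relationship in the data. One way to check is to compute the correlation coefficient directly. This is a minimal sketch with hypothetical numbers (not from the dataset), and `pearson_r` is a helper written only for this example.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical per-game values, not taken from the dataset
fgm = [2.0, 4.0, 6.0, 8.0, 10.0]
pts = [5.5, 10.0, 16.5, 21.0, 27.5]

r = pearson_r(fgm, pts)
print(f"r = {r:.3f}")  # a positive r means the two columns rise together
```

In the notebook itself, `df['FGM'].corr(df['PTS'])` gives the same kind of check on the real data, and the sign of the result does not depend on which way the y-axis is drawn.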

## Pie Pull

By pulling a slice away from the center, you can make the viewer perceive it as larger or more important.

In [None]:
fig = px.pie(df[df['Team'] == 'IND'].head(10), values='PTS', names='Name')
fig.update_layout(title='Top Ten Indiana Pacers Scorers')
fig.show()
fig.update_layout(title='Top Ten Indiana Pacers Scorers<br>Misleading Focus with Pull')
fig.update_traces(pull=[0, 0, 0.2])
fig.update_traces(textinfo='label') # use a label instead of a percentage
fig.show()
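
When a pie chart hides its percentages, a reader can still check a slice by computing its share of the total. The sketch below does this for three hypothetical players (the names and values are invented for illustration), using a small helper called `slice_shares`.

```python
def slice_shares(values):
    """Return each entry's share of the total as a fraction between 0 and 1."""
    total = sum(values.values())
    return {name: value / total for name, value in values.items()}

# Hypothetical scoring values, not taken from the dataset
points = {"Player A": 20.0, "Player B": 18.0, "Player C": 12.0}

shares = slice_shares(points)
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

Pulling "Player C" out of the pie would draw attention to it, but its share of the total stays the same.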

## Spurious Correlations

Another issue with data visualizations might be displaying two unrelated factors that seem to be correlated.

For this we will group the NBA player data by team. We can show that there is a fairly strong negative correlation between the sum of the lengths of the players' names and the average number of games played by each player.

In [None]:
df['Name Length'] = df['Name'].str.len()
dfg = df.groupby('Team').agg({'Name Length':'sum', 'GP':'mean'})
px.scatter(dfg, x='Name Length', y='GP', hover_data=[dfg.index], trendline='ols', title='Average Games Played Per Player vs. Name Length of All Players on the Team')
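
Part of the reason spurious correlations are easy to find is that small samples produce sizeable correlations by chance, and grouping by team leaves only 30 points. The simulation below is a rough sketch of that idea: it draws pairs of completely unrelated random series and counts how often their correlation looks noticeable. The threshold of 0.3, the number of trials, and the helper names are all assumptions chosen for illustration.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def share_of_strong_correlations(n_points=30, n_trials=2000, threshold=0.3, seed=0):
    """Fraction of unrelated random pairs whose |r| reaches the threshold."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    strong = 0
    for _ in range(n_trials):
        xs = [rng.random() for _ in range(n_points)]
        ys = [rng.random() for _ in range(n_points)]  # generated independently of xs
        if abs(pearson_r(xs, ys)) >= threshold:
            strong += 1
    return strong / n_trials

fraction = share_of_strong_correlations()
print(f"Share of unrelated pairs with |r| >= 0.3: {fraction:.1%}")
```

If we try enough pairs of columns, some of them will appear related even though nothing connects them.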

## Errors

Visualizations can also contain errors, whether intentional or unintentional, such as a plotted value that does not match the underlying data.

In [None]:
new_data = df[df['Team'] == 'IND'][['Name', 'PTS']].head(10).copy() # first ten rows, copied so the edit below does not touch df
new_data.iloc[0, 1] = 25
fig = px.bar(new_data, x='Name', y='PTS', title='Top Ten Indiana Pacers Scorers')
for i, (name, value) in enumerate(new_data[['Name', 'PTS']].values):
    fig.add_annotation(x=name, y=value, text=str(value))
fig.show()

In [None]:
new_data = df[df['Team'] == 'IND'][['Name', 'PTS']].head(10)
fig = px.bar(new_data, x='Name', y='PTS', title='Top Ten Indiana Pacers Scorers')
for i, (name, value) in enumerate(new_data[['Name', 'PTS']].values):
    fig.add_annotation(x=name, y=value, text=str(value), showarrow=False, yshift=10)
fig.show()

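new_data2 = new_data.copy()
new_data2.iloc[0, 1] = 25 # overwrite the first player's PTS with an incorrect value
fig = px.bar(new_data2, x='Name', y='PTS', title='Top Ten Indiana Pacers Scorers')
# the labels below are built from new_data (the unmodified values), so the first label will not match the height of its bar
for i, (name, value) in enumerate(new_data[['Name', 'PTS']].values):
    fig.add_annotation(x=name, y=value, text=str(value), showarrow=False, yshift=10)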
fig.show()
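
One way to catch this kind of error is to compare the values in a chart against the source data. The sketch below does that with plain dictionaries of hypothetical values; `find_mismatches` is a helper invented for this example and is not part of Plotly or pandas.

```python
def find_mismatches(plotted, source, tolerance=1e-9):
    """Return the names whose plotted value is missing from, or differs from, the source."""
    return [
        name
        for name, value in plotted.items()
        if name not in source or abs(value - source[name]) > tolerance
    ]

# Hypothetical values, not taken from the dataset
source = {"Player A": 20.0, "Player B": 18.0}
plotted = {"Player A": 25.0, "Player B": 18.0}  # Player A's value has been altered

mismatches = find_mismatches(plotted, source)
print(mismatches)
```

A check like this only works when the source data is available, which is one reason trustworthy visualizations cite where their data came from.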



## Questions

1. 
2. 
3. 

---

### Online Access
You can run this notebook online using the following links:

*   [**Google Colab**](https://colab.research.google.com/github/Data-Dunkers/student/blob/main/activities/misleading-visualizations.ipynb)
*   [**Callysto Hub**](https://hub.callysto.ca/jupyter/hub/user-redirect/git-pull?repo=https%3A%2F%2Fgithub.com%2FData-Dunkers%2Fstudent&branch=main&subPath=activities/misleading-visualizations.ipynb&depth=1)