![Banner](https://github.com/Data-Dunkers/lessons/blob/main/images/top-banner.jpg?raw=true)

# Misleading Visualizations

We are going to use NBA player statistics from the 2024-2025 season to show some ways that data visualizations can be used to spread disinformation (intentional) or misinformation (unintentional).

Let's start by importing the required libraries and data.

In [None]:
import pandas as pd
import plotly.express as px
import numpy as np
df = pd.read_csv('https://raw.githubusercontent.com/Data-Dunkers/data/refs/heads/main/NBA/player/nba_player_stats_2024-2025.csv')
df

## Adjusted Y-Axis

One classic way to mislead with data visualizations is by adjusting the y-axis to make differences appear larger (or smaller). Let's compare the points per game for three players with two different y-axis scales.

In [None]:
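# Keep only the three players we want to compare
subset = df[df['Name'].isin(['Pascal Siakam', 'Scottie Barnes', 'Tyrese Haliburton'])]
# Truncated axis: starting just below the smallest value exaggerates the differences
px.bar(subset, x='Name', y='PTS', title='Points Per Game<br>(Truncated Axis)', range_y=[subset['PTS'].min()-0.1, subset['PTS'].max()]).show()
# Full axis: starts at 0 and ends at the highest PTS value in the full dataset
px.bar(subset, x='Name', y='PTS', title='Points Per Game<br>(0 to Maximum)', range_y=[0, df['PTS'].max()]).show()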
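To put a number on the distortion, here is a minimal sketch that compares how tall two bars are drawn relative to each other when the axis starts at zero versus when it starts just below the smaller value. The helper `apparent_ratio` and the values `20.0`, `18.0`, and `17.5` are made up for illustration and are not taken from the dataset.

```python
def apparent_ratio(a, b, baseline=0.0):
    """Ratio of the drawn bar heights for values a and b when the axis starts at baseline."""
    return (a - baseline) / (b - baseline)

# Hypothetical points-per-game values (not real player statistics)
high, low = 20.0, 18.0

# With a zero baseline the taller bar is only about 11% taller
true_ratio = apparent_ratio(high, low)

# With the axis starting at 17.5 the same bars are drawn 2.5 and 0.5 units tall
truncated_ratio = apparent_ratio(high, low, baseline=17.5)

print(f"Zero baseline: {true_ratio:.2f} times as tall")
print(f"Truncated baseline: {truncated_ratio:.2f} times as tall")
```

The data has not changed between the two calls; only the starting point of the axis has, which is why a truncated axis should always make you check the tick labels.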

## Inverted Y-Axis

We can also mislead by inverting the y-axis, so that larger values are drawn lower on the chart and a quick glance suggests the opposite of what the data shows.

## Spurious Correlations

Another way to mislead with data visualizations is to find two factors that seem to be correlated even though neither plausibly causes the other. For this we will use data from all NBA players in the 2024-2025 season, grouped by team. We can show that there is a fairly strong negative correlation between the sum of the lengths of the players' names and the average number of games played by each player.

In [None]:
df_all = pd.read_csv('https://raw.githubusercontent.com/Data-Dunkers/data/refs/heads/main/NBA/team/2024-2025/2024-2025_players.csv')
# Count the characters in each player's name
df_all['Name Length'] = df_all['Name'].str.len()
# For each team: total name length and mean games played (GP)
df_grouped = df_all.groupby('Team').agg({'Name Length':'sum', 'GP':'mean'})
# trendline='ols' fits a straight line and requires the statsmodels package
px.scatter(df_grouped, x='Name Length', y='GP', hover_data=[df_grouped.index], trendline='ols', title='Average Games Played Per Player vs. Name Length of All Players on the Team')
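One reason correlations like this are easy to find is that, if you test enough unrelated factors, some will line up by chance. The sketch below illustrates that idea with purely random numbers rather than the NBA data: the counts (30 "teams", 200 candidate factors) and the random seed are arbitrary assumptions chosen for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the cell is repeatable

n_teams = 30        # assumed number of data points
n_candidates = 200  # assumed number of unrelated factors to try

# A random "outcome" and many random "factors", none of which are truly related
target = rng.normal(size=n_teams)
candidates = rng.normal(size=(n_candidates, n_teams))

# Pearson correlation between each candidate factor and the outcome
r_values = np.array([np.corrcoef(row, target)[0, 1] for row in candidates])

# Pick the factor that happens to correlate most strongly
best = int(np.argmax(np.abs(r_values)))
print(f"Typical |r|: {np.mean(np.abs(r_values)):.2f}")
print(f"Strongest |r| found by searching: {abs(r_values[best]):.2f}")
```

The strongest correlation found this way is much larger than the typical one, even though every column is noise. A chart that shows only the winning pair hides how many pairs were tried.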

## Errors

Occasionally a visualization contains outright errors, whether intentional or unintentional, such as mislabeled axes or percentages that do not add up to 100%.
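One kind of error that is easy to check for is a set of percentages that should total 100% but does not. The helper below is a hypothetical sketch, with made-up numbers and an assumed tolerance of 0.5 to allow for rounding; it is not part of pandas or Plotly.

```python
def percentages_look_valid(values, tolerance=0.5):
    """Return True if the values sum to 100 within the given tolerance."""
    return abs(sum(values) - 100) <= tolerance

# Made-up shares for illustration only
good_shares = [50.0, 30.0, 20.0]  # sums to 100
bad_shares = [55.0, 30.0, 25.0]   # sums to 110, so something is wrong

print(percentages_look_valid(good_shares))
print(percentages_look_valid(bad_shares))
```

A check like this will not catch every mistake, but it is a quick first test when a pie chart or stacked bar chart looks suspicious.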



## Questions

1. 
2. 
3. 

---

### Online Access
You can run this notebook online using the following links:

*   [**Google Colab**](https://colab.research.google.com/github/Data-Dunkers/student/blob/main/activities/misleading-visualizations.ipynb)
*   [**Callysto Hub**](https://hub.callysto.ca/jupyter/hub/user-redirect/git-pull?repo=https%3A%2F%2Fgithub.com%2FData-Dunkers%2Fstudent&branch=main&subPath=activities/misleading-visualizations.ipynb&depth=1)