<p style='text-align:center;font-size:30px'>Commonwealth Games 2022<br/>Participation EDA and Winners Visualization 🏆</p>

Underpinned by the core values of humanity, equality and destiny, the Games aim to unite the Commonwealth family through a glorious festival of sport. Often referred to as the ‘Friendly Games’, the event is renowned for inspiring athletes to compete in the spirit of friendship and fair play.

Some of the most memorable sporting moments in history took place at the Commonwealth Games:

At the 1954 Vancouver Games, Roger Bannister and John Landy both ran the mile in under four minutes, the first time two runners had done so in the same race, which became known as the ‘Miracle Mile’.

In 2002, Chantal Petitclerc became the first gold medal winner in a para-sport, an occasion that marked the first time an event for an athlete with a disability had been part of the official programme.

Women’s boxing joined the Commonwealth Games programme in 2014, with Team England’s Nicola Adams taking the first gold medal in the flyweight division.

The encouraging ethos of the Games has stirred athletes to sprint faster, leap higher and push themselves to the very limits of what the human body is capable of.

The 2022 Games will be the first time the West Midlands has played host to the event, and the third time England has done so, following London 1934 and Manchester 2002. As preparations for the Birmingham 2022 Commonwealth Games take shape, the West Midlands becomes part of a lasting legacy, one that displays world-class teamwork, athleticism and friendship.

In [1]:
import numpy as np
import pandas as pd
import plotly.express as px

In [2]:
players_data = pd.read_csv('../input/commonwealth-games-2022/commonwealth games 2022 - players participated.csv')
players_data

Unnamed: 0,ATHLETE NAME\t,SPORT,GENDER,AGE,TEAM
0,GregHire,3x3 Basketball,Male,34.0,Australia
1,Daniel Geoffrey CraigJohnson,3x3 Basketball,Male,34.0,Australia
2,LaurenMansfield,3x3 Basketball,Female,32.0,Australia
3,LaurenScherf,3x3 Basketball,Female,26.0,Australia
4,JesseWagstaff,3x3 Basketball,Male,36.0,Australia
...,...,...,...,...,...
4528,JohnVake,Wrestling,Male,31.0,Tonga
4529,VeronicaAyo,Wrestling,Female,28.0,Uganda
4530,JacobNtuyo,Wrestling,Male,28.0,Uganda
4531,CurtisDodge,Wrestling,Male,29.0,Wales


# Participation EDA

In [3]:
players_data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4533 entries, 0 to 4532
Data columns (total 5 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   ATHLETE NAME	  4533 non-null   object 
 1   SPORT          4533 non-null   object 
 2   GENDER         4533 non-null   object 
 3   AGE            4532 non-null   float64
 4   TEAM           4533 non-null   object 
dtypes: float64(1), object(4)
memory usage: 177.2+ KB
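
The `info()` output above shows two things worth handling before plotting: the first column name carries a trailing tab (`ATHLETE NAME\t`), and `AGE` has one missing value (4532 non-null out of 4533). A minimal sketch of one way to tidy this up, using a tiny stand-in frame with made-up rows (the names `sample`, `cleaned` and `missing_counts` are illustrative, not part of the original notebook):

```python
import pandas as pd

# Made-up stand-in for players_data; the real frame is loaded from the CSV.
sample = pd.DataFrame({
    "ATHLETE NAME\t": ["Athlete A", "Athlete B", "Athlete C"],
    "SPORT": ["3x3 Basketball", "3x3 Basketball", "Wrestling"],
    "GENDER": ["Male", "Female", "Male"],
    "AGE": [34.0, 32.0, None],
    "TEAM": ["Australia", "Australia", "Tonga"],
})

# Strip stray whitespace (including the trailing tab) from every column name.
cleaned = sample.rename(columns=str.strip)

# Count missing values per column and show only the columns that have gaps.
missing_counts = cleaned.isna().sum()
print(list(cleaned.columns))
print(missing_counts[missing_counts > 0])
```

Applied to the real data, the same two calls would allow `players_data['ATHLETE NAME']` to be used without the tab and would confirm which column holds the single missing entry.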


## Distribution of age of the players

In [4]:
fig = px.histogram(x = players_data['AGE'], color=players_data['GENDER'], barmode='group', color_discrete_sequence=['darkcyan', 'magenta'])
fig.show()

#### The 20-30 age range contains more players than any other age band.
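
The reading of the histogram can be checked numerically by binning ages and computing the share of players in each band. The sketch below uses made-up ages and assumed band edges (`under 20`, `20-29`, `30-39`, `40+`); on the real data, `ages` would be `players_data['AGE']`.

```python
import pandas as pd

# Made-up ages for illustration only; one value is missing, as in the real column.
ages = pd.Series([17.0, 22.0, 25.0, 29.0, 31.0, 38.0, None], name="AGE")

# Assumed band edges; right=False makes each bin include its lower edge.
bins = [0, 20, 30, 40, 120]
labels = ["under 20", "20-29", "30-39", "40+"]
age_band = pd.cut(ages, bins=bins, labels=labels, right=False)

# Share of players per band; missing ages are excluded from the denominator.
band_share = age_band.value_counts(normalize=True).sort_index()
print(band_share)
```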

## Participation in each Sport

In [5]:
fig = px.histogram(x = players_data['GENDER'], color=players_data['SPORT'], barmode="group", color_discrete_sequence=px.colors.qualitative.Dark24)
fig.show()

<ul style='font-size:22px;color:rgb(0, 175, 175)'>The top 5 Men's sports played are:<li style='font-size:16px'> Athletics and Para Athletics<li style='font-size:18px'> Aquatics - Swimming and Para Swimming <li style='font-size:16px'>Rugby Sevens <li style='font-size:18px'>Cycling - Road  <li style='font-size:18px'>Boxing / Hockey</ul>

<ul style=';font-size:22px;color:rgb(255,20,147)'>The top 5 Women's sports played are:<li style='font-size:16px'> Athletics and Para Athletics<li style='font-size:18px'> Aquatics - Swimming and Para Swimming <li style='font-size:16px'>Hockey<li style='font-size:18px'>Netball<li style='font-size:18px'>Cricket T20</ul>
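
Rankings like the two lists above can also be derived directly from the data rather than read off the chart. A minimal sketch, assuming the `SPORT` and `GENDER` columns shown earlier and using a small made-up sample (the names `sample`, `counts` and `top_n` are illustrative):

```python
import pandas as pd

# Made-up rows for illustration; the real input would be players_data.
sample = pd.DataFrame({
    "SPORT": ["Athletics", "Athletics", "Athletics", "Boxing", "Boxing", "Hockey",
              "Netball", "Netball", "Netball", "Hockey", "Hockey", "Athletics"],
    "GENDER": ["Male"] * 6 + ["Female"] * 6,
})

# Number of players per (gender, sport) pair.
counts = sample.groupby(["GENDER", "SPORT"]).size().rename("players").reset_index()

# Sort within each gender by player count, then keep the top rows per gender.
top_n = (
    counts.sort_values(["GENDER", "players"], ascending=[True, False])
          .groupby("GENDER")
          .head(2)  # use head(5) for a top-5 list on the full data
)
print(top_n)
```

Ties in player count (such as the "Boxing / Hockey" entry above) are not resolved by this sketch; `head` simply keeps whichever tied rows come first after sorting.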

## Participation of each Country

In [6]:
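# Pass the DataFrame and column names so TEAM and GENDER stay row-aligned; order teams on the axis instead of sorting one column.
fig = px.histogram(players_data, y='TEAM', color='GENDER', barmode='stack', color_discrete_sequence=['darkcyan', 'magenta'], height=1200)
fig.update_yaxes(categoryorder='category descending')
fig.show()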

## Participation of each Country in every Sport

In [7]:
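# Pass the DataFrame and column names so TEAM and SPORT stay row-aligned; order teams on the axis instead of sorting one column.
fig = px.histogram(players_data, y='TEAM', color='SPORT', barmode='stack', color_discrete_sequence=px.colors.qualitative.Dark24, height=1200)
fig.update_yaxes(categoryorder='category descending')
fig.show()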

# Winners Visualization

In [8]:
winners_data = pd.read_csv('../input/commonwealth-games-2022/commonwealth games 2022 - players won medals in cwg games 2022.csv')
winners_data

Unnamed: 0,ATHLETE NAME,TEAM,SPORT,EVENT,MEDAL,CONTINENT
0,Greg Hire,Australia,3x3 Basketball,Men,S,Australia& Oceania
1,Daniel Geoffrey Craig Johnson,Australia,3x3 Basketball,Men,S,Australia& Oceania
2,Lauren Mansfield,Australia,3x3 Basketball,Women,B,Australia& Oceania
3,Lauren Scherf,Australia,3x3 Basketball,Women,B,Australia& Oceania
4,Jesse Wagstaff,Australia,3x3 Basketball,Men,S,Australia& Oceania
...,...,...,...,...,...,...
1553,Olivia Mathias,Wales,Triathlon and Para Triathlon,Mixed Team Relay,S,Europe
1554,Non Stanford,Wales,Triathlon and Para Triathlon,Mixed Team Relay,S,Europe
1555,Muzala Samukonga,Zambia,Athletics and Para Athletics,Men's 400m,G,Africa
1556,Patrick Chinyemba,Zambia,Boxing,Men’s Over 48kg-51kg (Flyweight),B,Africa


In [9]:
winners_data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1558 entries, 0 to 1557
Data columns (total 6 columns):
 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   ATHLETE NAME  1558 non-null   object
 1   TEAM          1558 non-null   object
 2   SPORT         1558 non-null   object
 3   EVENT         1558 non-null   object
 4   MEDAL         1558 non-null   object
 5   CONTINENT     1558 non-null   object
dtypes: object(6)
memory usage: 73.2+ KB


## Medals Won by each Country

In [10]:
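# Pass the DataFrame and column names so TEAM and MEDAL stay row-aligned; map each medal code to its colour explicitly.
fig = px.histogram(winners_data, y='TEAM', color='MEDAL', barmode='stack', color_discrete_map={'S': '#C0C0C0', 'B': '#CD7F32', 'G': '#FFD700'})
fig.update_yaxes(categoryorder='category descending')
fig.show()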

In [11]:
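# dtype=int keeps the indicator columns as 0/1 integers (newer pandas versions default to booleans).
df = pd.get_dummies(winners_data, columns=['MEDAL'], dtype=int)
# Each row holds exactly one medal, so Total is always 1; it serves as the weight for the sunburst charts.
df["Total"] = df['MEDAL_G'] + df['MEDAL_S'] + df['MEDAL_B']
df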

Unnamed: 0,ATHLETE NAME,TEAM,SPORT,EVENT,CONTINENT,MEDAL_B,MEDAL_G,MEDAL_S,Total
0,Greg Hire,Australia,3x3 Basketball,Men,Australia& Oceania,0,0,1,1
1,Daniel Geoffrey Craig Johnson,Australia,3x3 Basketball,Men,Australia& Oceania,0,0,1,1
2,Lauren Mansfield,Australia,3x3 Basketball,Women,Australia& Oceania,1,0,0,1
3,Lauren Scherf,Australia,3x3 Basketball,Women,Australia& Oceania,1,0,0,1
4,Jesse Wagstaff,Australia,3x3 Basketball,Men,Australia& Oceania,0,0,1,1
...,...,...,...,...,...,...,...,...,...
1553,Olivia Mathias,Wales,Triathlon and Para Triathlon,Mixed Team Relay,Europe,0,0,1,1
1554,Non Stanford,Wales,Triathlon and Para Triathlon,Mixed Team Relay,Europe,0,0,1,1
1555,Muzala Samukonga,Zambia,Athletics and Para Athletics,Men's 400m,Africa,0,1,0,1
1556,Patrick Chinyemba,Zambia,Boxing,Men’s Over 48kg-51kg (Flyweight),Africa,1,0,0,1
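
Because each row is one medal-winning athlete, a per-team summary can be built by counting rows per medal code. The sketch below uses a made-up sample and `pd.crosstab`; the names `winners` and `medal_table` are illustrative. Note that team events contribute one row per team member, so these counts are medal-winning athletes, not an official medal tally.

```python
import pandas as pd

# Made-up rows for illustration; the real input would be winners_data.
winners = pd.DataFrame({
    "TEAM": ["Australia", "Australia", "Australia", "Wales", "Zambia", "Zambia"],
    "MEDAL": ["S", "S", "B", "S", "G", "B"],
})

# Count rows per team and medal code, forcing a G/S/B column order.
medal_table = pd.crosstab(winners["TEAM"], winners["MEDAL"]).reindex(
    columns=["G", "S", "B"], fill_value=0
)
medal_table["Total"] = medal_table.sum(axis=1)

# Rank teams by gold, then silver, then bronze.
medal_table = medal_table.sort_values(["G", "S", "B"], ascending=False)
print(medal_table)
```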


## Sunburst Charts

#### Hierarchy :   Continent -> Team -> Sport -> Event

In [12]:
px.sunburst(df, path=['CONTINENT', 'TEAM', 'SPORT', 'EVENT'], values='Total', height=1000, width=1000)

#### Hierarchy :   Continent -> Team -> Sport -> Athlete Name

In [13]:
px.sunburst(df, path=['CONTINENT', 'TEAM', 'SPORT', 'ATHLETE NAME'], values='Total', height=1000, width=1000)

#### Thank you for checking out the Notebook!