In [12]:
import pandas as pd

file_path = "Olympic 1976-2008.csv"

# Read the CSV file with a different encoding
data = pd.read_csv(file_path, encoding='latin-1')

# Display the first few rows of the dataframe
data.head()

Unnamed: 0,City,Year,Sport,Discipline,Event,Athlete,Gender,Country_Code,Country,Event_gender,Medal
0,Montreal,1976.0,Aquatics,Diving,3m springboard,"KÖHLER, Christa",Women,GDR,East Germany,W,Silver
1,Montreal,1976.0,Aquatics,Diving,3m springboard,"KOSENKOV, Aleksandr",Men,URS,Soviet Union,M,Bronze
2,Montreal,1976.0,Aquatics,Diving,3m springboard,"BOGGS, Philip George",Men,USA,United States,M,Gold
3,Montreal,1976.0,Aquatics,Diving,3m springboard,"CAGNOTTO, Giorgio Franco",Men,ITA,Italy,M,Silver
4,Montreal,1976.0,Aquatics,Diving,10m platform,"WILSON, Deborah Keplar",Women,USA,United States,W,Bronze


# Brief About Dataset

The dataset records Olympic medal winners from the 1976 Games through the 2008 Games. Each row describes one medal: the host city, the year, the sport and discipline (for example, Aquatics and Diving), the event within the discipline, the athlete's name and gender, the three-letter country code and country name, the gender category of the event, and the medal won (Gold, Silver, or Bronze). This supports analyses of medal distribution over time, the most successful athletes, and how countries perform in specific sports and disciplines.
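
A quick sanity check of these columns can be sketched as follows. This is a minimal example, not part of the original notebook: the `summarize` helper is hypothetical, and the `sample` frame holds only the five rows shown in the `data.head()` output, reduced to the columns used here. The notebook calls `dropna()` on `City` before building dropdowns, so the check also counts rows where `City`, `Year`, and `Medal` are all missing, on the assumption that the full file may contain such rows.

```python
import pandas as pd

# The five rows from the data.head() output, with a subset of the columns
columns = ["City", "Year", "Discipline", "Athlete", "Gender", "Country", "Medal"]
rows = [
    ("Montreal", 1976.0, "Diving", "KÖHLER, Christa", "Women", "East Germany", "Silver"),
    ("Montreal", 1976.0, "Diving", "KOSENKOV, Aleksandr", "Men", "Soviet Union", "Bronze"),
    ("Montreal", 1976.0, "Diving", "BOGGS, Philip George", "Men", "United States", "Gold"),
    ("Montreal", 1976.0, "Diving", "CAGNOTTO, Giorgio Franco", "Men", "Italy", "Silver"),
    ("Montreal", 1976.0, "Diving", "WILSON, Deborah Keplar", "Women", "United States", "Bronze"),
]
sample = pd.DataFrame(rows, columns=columns)


def summarize(df):
    """Return a few basic facts about a medal table (hypothetical helper)."""
    key_columns = ["City", "Year", "Medal"]
    return {
        "rows": len(df),
        "year_range": (int(df["Year"].min()), int(df["Year"].max())),
        "medal_values": sorted(df["Medal"].dropna().unique()),
        # Rows where city, year, and medal are all missing carry no usable information
        "rows_missing_key_fields": int(df[key_columns].isna().all(axis=1).sum()),
    }


summary = summarize(sample)
print(summary)
```

On the full file, the same call would be `summarize(data)` after the `pd.read_csv` step.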







In [19]:
pip install dash pandas

In [20]:
pip install openpyxl

In [21]:
pip install jupyter-dash

In [22]:
pip install ffmpeg-python

In [23]:
pip install pandas plotly bar_chart_race

In [None]:
pip install dash dash-bootstrap-components plotly

# Objective 

The objective of studying Olympic data from 1976 to 2008 is to understand how athletes, teams, and nations have performed over time. The analysis looks for patterns in medal results, which can point to success factors and challenges and can inform national and international sports policy, infrastructure investment, and talent programs. The trends can also be read against changes in training, technology, and socio-political conditions, although the dataset itself records only medal results. The findings are intended to support strategic planning for future Olympics, including where to expect growth and how to allocate resources, and to help refine talent identification and athlete development. Comparing countries highlights practices worth benchmarking against, and sports managers can use the same data to guide decisions on coaching, training, and sponsorship. The dataset is also a resource for academic research on sports trends and performance analytics, and it preserves a record of Olympic history for future generations.
 



In [1]:
import pandas as pd
import dash
from dash import dcc, html  # replaces the deprecated dash_core_components / dash_html_components packages
from dash.dependencies import Input, Output
import plotly.express as px

# Read the CSV file
file_path = "Olympic 1976-2008.csv"
data = pd.read_csv(file_path, encoding='latin-1')

# Preprocess the data to get the count of medals for each country
medal_counts = data.groupby(['Country', 'Medal']).size().unstack().fillna(0)

# Create a Dash web application
app = dash.Dash(__name__)

# Define the layout of the app
app.layout = html.Div([
    html.H1("Most Successful Country"),
    
    # Dropdown for selecting medal type
    dcc.Dropdown(
        id='medal-dropdown',
        options=[
            {'label': 'Gold', 'value': 'Gold'},
            {'label': 'Silver', 'value': 'Silver'},
            {'label': 'Bronze', 'value': 'Bronze'}
        ],
        value='Gold',
        style={'width': '50%'}
    ),
    
    # Line chart to display medal counts by country
    dcc.Graph(id='medal-line-chart'),
])

# Define callback to update line chart based on dropdown selection
@app.callback(
    Output('medal-line-chart', 'figure'),
    [Input('medal-dropdown', 'value')]
)
def update_line_chart(selected_medal):
    # Create a line chart using Plotly Express
    fig = px.line(
        medal_counts.reset_index(),
        x='Country',
        y=selected_medal,
        title=f'Top Countries for {selected_medal} Medals',
        labels={'Country': 'Country', selected_medal: f'{selected_medal} Medals'},
        template='plotly_dark'  # Set dark template
    )

    # Customize line chart colors
    color_dict = {'Gold': 'gold', 'Silver': 'silver', 'Bronze': 'darkorange'}
    fig.update_traces(line=dict(color=color_dict[selected_medal], width=2, dash='solid'))

    return fig

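# Run the app inline in the notebook on a different port to avoid conflicts.
# jupyter_mode is available in Dash 2.11 and later; the earlier mode="inline"
# argument belongs to the separate jupyter-dash package's JupyterDash class.
app.run(debug=True, port=8132, jupyter_mode="inline")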



# Most Successful Country

•	Line chart visualizing medal counts by country for Gold, Silver, or Bronze medals from 1976 to 2008.

•	Dropdown menu for selecting medal type, aiding performance evaluation and strategic planning.

•	Helps in identifying trends, resource allocation decisions, and benchmarking against top-performing nations.
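
The ranking behind this chart can be sketched without Dash. The example below is a minimal illustration, not the notebook's own code: `medal_table` is a hypothetical helper, and `sample` contains only the five rows from the `data.head()` output. It uses the same `groupby(['Country', 'Medal'])` counting as the cell above, then sorts by the selected medal so the leading countries come first. It assumes each row is one medal awarded to one athlete, so a team event may contribute one row per team member.

```python
import pandas as pd

# The five rows from the data.head() output, with a subset of the columns
columns = ["City", "Year", "Discipline", "Athlete", "Gender", "Country", "Medal"]
rows = [
    ("Montreal", 1976.0, "Diving", "KÖHLER, Christa", "Women", "East Germany", "Silver"),
    ("Montreal", 1976.0, "Diving", "KOSENKOV, Aleksandr", "Men", "Soviet Union", "Bronze"),
    ("Montreal", 1976.0, "Diving", "BOGGS, Philip George", "Men", "United States", "Gold"),
    ("Montreal", 1976.0, "Diving", "CAGNOTTO, Giorgio Franco", "Men", "Italy", "Silver"),
    ("Montreal", 1976.0, "Diving", "WILSON, Deborah Keplar", "Women", "United States", "Bronze"),
]
sample = pd.DataFrame(rows, columns=columns)


def medal_table(df, medal="Gold", top_n=10):
    """Count medals per country and sort by one medal type (hypothetical helper)."""
    counts = df.groupby(["Country", "Medal"]).size().unstack(fill_value=0)
    # Guarantee all three medal columns exist, even if one never occurs in df
    counts = counts.reindex(columns=["Gold", "Silver", "Bronze"], fill_value=0)
    return counts.sort_values(medal, ascending=False).head(top_n)


table = medal_table(sample, medal="Gold")
print(table)
```

Passing `table.reset_index()` to a Plotly Express chart would then plot the countries in ranked order rather than alphabetically.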



In [2]:
import pandas as pd
import dash
from dash import dcc, html  # replaces the deprecated dash_core_components / dash_html_components packages
from dash.dependencies import Input, Output
import plotly.express as px

# Read the CSV file with a different encoding
file_path = "Olympic 1976-2008.csv"
data = pd.read_csv(file_path, encoding='latin-1')

# Initialize the Dash app
app = dash.Dash(__name__)

# Get unique non-null cities from the data
unique_cities = data['City'].dropna().unique()

# Define the layout of the dashboard
app.layout = html.Div([
    html.H1("Discipline-wise Most Successful Athlete", style={'text-align': 'center', 'color': 'black'}),
    
    # Dropdown for selecting a city
    dcc.Dropdown(
        id='city-dropdown',
        options=[{'label': city, 'value': city} for city in unique_cities],
        value=unique_cities[0],
        multi=False,
        style={'width': '50%', 'margin': '20px auto'}
    ),
    
    # Bubble chart showing the most successful athlete in each discipline
    dcc.Graph(
        id='athlete-bubble-chart',
        style={'height': '600px'}
    ),
])

# Define callback to update the bubble chart based on selected city
@app.callback(
    Output('athlete-bubble-chart', 'figure'),
    [Input('city-dropdown', 'value')]
)
def update_bubble_chart(selected_city):
    # Filter data based on selected city
    city_data = data[data['City'] == selected_city]

    # Group by discipline and find the most successful athlete in each discipline
    most_successful_athletes = city_data.groupby('Discipline')['Athlete'].agg(lambda x: x.value_counts().idxmax()).reset_index()

    # Get the medal count for each athlete
    medal_counts = city_data.groupby(['Discipline', 'Athlete']).size().reset_index(name='Medal Count')

    # Merge the most successful athlete with medal counts
    most_successful_athletes = pd.merge(most_successful_athletes, medal_counts, on=['Discipline', 'Athlete'], how='left')

    # Create a bubble chart
    bubble_chart = px.scatter(
        most_successful_athletes,
        x='Discipline',
        y='Athlete',
        size='Medal Count',
        color='Medal Count',
        hover_data=['Medal Count'],
        title=f'Most Successful Athlete in Each Discipline ({selected_city})',
        labels={'Athlete': 'Athlete', 'Discipline': 'Discipline'},
        template='plotly_dark'
    )

    # Update layout for the bubble chart
    bubble_chart.update_layout(
        xaxis=dict(title='Discipline', showgrid=False, zeroline=False),
        yaxis=dict(title='Athlete', showgrid=False, zeroline=False),
    )

    return bubble_chart

# Run the app
if __name__ == '__main__':
    app.run_server(debug=True, port=8153)  # ports below 1024 usually require elevated privileges


# Discipline-wise Most Successful Athlete


•	Bubble chart dynamically displaying most successful athletes in each discipline for selected cities (1976-2008).

•	Assists in identifying key performers, guiding talent development, and optimizing resource allocation.

•	Supports strategic partnerships and sponsorships based on historical athlete success in specific disciplines.
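
The "most successful athlete per discipline" step can also be written as a single count followed by `idxmax`, which avoids the separate merge. This is a sketch under stated assumptions: `top_athlete_per_discipline` is a hypothetical helper, and `sample` holds only the five rows from the `data.head()` output, so every athlete has exactly one medal and the result is decided by a tie.

```python
import pandas as pd

# The five rows from the data.head() output, with a subset of the columns
columns = ["City", "Year", "Discipline", "Athlete", "Gender", "Country", "Medal"]
rows = [
    ("Montreal", 1976.0, "Diving", "KÖHLER, Christa", "Women", "East Germany", "Silver"),
    ("Montreal", 1976.0, "Diving", "KOSENKOV, Aleksandr", "Men", "Soviet Union", "Bronze"),
    ("Montreal", 1976.0, "Diving", "BOGGS, Philip George", "Men", "United States", "Gold"),
    ("Montreal", 1976.0, "Diving", "CAGNOTTO, Giorgio Franco", "Men", "Italy", "Silver"),
    ("Montreal", 1976.0, "Diving", "WILSON, Deborah Keplar", "Women", "United States", "Bronze"),
]
sample = pd.DataFrame(rows, columns=columns)


def top_athlete_per_discipline(df, city):
    """Return one row per discipline: the athlete with the most medals in that city."""
    in_city = df[df["City"] == city]
    counts = (
        in_city.groupby(["Discipline", "Athlete"])
        .size()
        .reset_index(name="Medal Count")
    )
    # idxmax gives the row label of the largest count within each discipline
    best_rows = counts.groupby("Discipline")["Medal Count"].idxmax()
    return counts.loc[best_rows].reset_index(drop=True)


best = top_athlete_per_discipline(sample, "Montreal")
print(best)
```

When several athletes share the top count, only one of them is kept, so a tie is not visible in the result.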


In [3]:
import pandas as pd
from dash import Dash, html, dcc
from dash.dependencies import Input, Output
import plotly.express as px
import dash_bootstrap_components as dbc

# Read the CSV file with a different encoding
file_path = "Olympic 1976-2008.csv"
data = pd.read_csv(file_path, encoding='latin-1')

# Get unique non-null cities from the data
unique_cities = data['City'].dropna().unique()

# Initialize the Dash app
app = Dash(__name__, external_stylesheets=[dbc.themes.BOOTSTRAP])

# Define the layout of the dashboard
app.layout = dbc.Container(
    [
        html.H1("Medal Count And Gender Distribution", className="display-4 text-center mt-4 mb-4", style={'color': 'black', 'font-weight': 'bold'}),
        
        # Dropdown for selecting a city
        dcc.Dropdown(
            id='city-dropdown',
            options=[{'label': city, 'value': city} for city in unique_cities],
            value=unique_cities[0],
            multi=False,
            style={'width': '50%', 'margin': '20px auto'}
        ),
        
        # Bar chart showing the number of medals by sport
        dcc.Graph(
            id='sport-bar-chart',
            style={'height': '400px'}
        ),

        # Dropdown for selecting a medal
        dcc.Dropdown(
            id='medal-dropdown',
            options=[
                {'label': medal, 'value': medal} for medal in data['Medal'].dropna().unique()
            ],
            value=data['Medal'].dropna().unique()[0],
            multi=False,
            style={'width': '50%', 'margin': '20px auto'}
        ),

        # Dropdown for selecting a country
        dcc.Dropdown(
            id='country-dropdown',
            options=[
                {'label': country, 'value': country} for country in data['Country'].dropna().unique()
            ],
            value=data['Country'].dropna().unique()[0],
            multi=False,
            style={'width': '50%', 'margin': '20px auto'}
        ),

        # Pie chart showing the distribution of medals by medal, country, and gender
        dcc.Graph(
            id='medal-country-pie-chart',
            style={'height': '400px'}
        ),
    ],
    fluid=True
)

# Define callback to update both charts based on the selected city, medal, or country
@app.callback(
    [Output('sport-bar-chart', 'figure'),
     Output('medal-country-pie-chart', 'figure')],
    [Input('city-dropdown', 'value'),
     Input('medal-dropdown', 'value'),
     Input('country-dropdown', 'value')]
)
def update_charts(selected_city, selected_medal, selected_country):
    # Filter data based on selected slicers
    filtered_data = data[(data['City'] == selected_city) & 
                         (data['Medal'] == selected_medal) & 
                         (data['Country'] == selected_country)]

    # Count medal occurrences
    medal_counts = filtered_data['Sport'].value_counts().reset_index()
    medal_counts.columns = ['Sport', 'Medal Count']

    # Update bar chart
    sport_bar_chart = px.bar(
        medal_counts,
        x='Sport',
        y='Medal Count',
        title=f'Medal Count by Sport in {selected_city}',
        labels={'Sport': 'Sport', 'Medal Count': 'Medal Count'},
        height=400,
        color='Medal Count',  # Set color scale to 'Medal Count' for gradient colors
        color_continuous_scale='Viridis',  # Use Viridis color scale for gradient colors
    )

    sport_bar_chart.update_layout(
        xaxis=dict(title='Sport', showgrid=False, zeroline=False),
        yaxis=dict(title='Medal Count', showgrid=False, zeroline=False),
        plot_bgcolor='rgba(0,0,0,0)',  # Set plot background color
        font=dict(color='#fff'),  # Set font color
        margin=dict(t=40, l=40, r=40, b=40),  # Add margins for better visibility
        paper_bgcolor='#1e1e1e',  # Set paper background color
    )

    # Update pie chart with gender information
    medal_country_pie_chart = px.pie(
        filtered_data,
        names='Gender',
        title=f'Distribution of {selected_medal} Medals in {selected_country} by Gender',
        height=400,
        color_discrete_sequence=px.colors.qualitative.Set3,  # Set color scheme for gender
    )

    medal_country_pie_chart.update_layout(
        plot_bgcolor='rgba(0,0,0,0)',  # Set plot background color
        font=dict(color='#fff'),  # Set font color
        margin=dict(t=40, l=40, r=40, b=40),  # Add margins for better visibility
        paper_bgcolor='#1e1e1e',  # Set paper background color
    )

    return sport_bar_chart, medal_country_pie_chart

# Run the app on a different port
if __name__ == '__main__':
    app.run_server(debug=True, port=8091)


# Medal Count And Gender Distribution


•	Bar chart shows medal count by sport for a selected city, aiding performance analysis.

•	Pie chart illustrates gender distribution of selected medal types within a country.

•	Supports targeted resource allocation, strategic planning, and gender equality initiatives in sports.
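
The pie chart's input is the gender split of the rows that survive all three filters. A minimal sketch of that calculation follows; `gender_share` is a hypothetical helper and `sample` contains only the five rows from the `data.head()` output. It also shows that a combination with no matching rows yields an empty result, which is the case in which the dashboard would draw blank charts.

```python
import pandas as pd

# The five rows from the data.head() output, with a subset of the columns
columns = ["City", "Year", "Discipline", "Athlete", "Gender", "Country", "Medal"]
rows = [
    ("Montreal", 1976.0, "Diving", "KÖHLER, Christa", "Women", "East Germany", "Silver"),
    ("Montreal", 1976.0, "Diving", "KOSENKOV, Aleksandr", "Men", "Soviet Union", "Bronze"),
    ("Montreal", 1976.0, "Diving", "BOGGS, Philip George", "Men", "United States", "Gold"),
    ("Montreal", 1976.0, "Diving", "CAGNOTTO, Giorgio Franco", "Men", "Italy", "Silver"),
    ("Montreal", 1976.0, "Diving", "WILSON, Deborah Keplar", "Women", "United States", "Bronze"),
]
sample = pd.DataFrame(rows, columns=columns)


def gender_share(df, city, medal, country):
    """Fraction of matching medals won by each gender (hypothetical helper)."""
    mask = (df["City"] == city) & (df["Medal"] == medal) & (df["Country"] == country)
    return df.loc[mask, "Gender"].value_counts(normalize=True)


share = gender_share(sample, "Montreal", "Bronze", "United States")
no_match = gender_share(sample, "Montreal", "Gold", "Italy")
print(share)
print("Empty selection:", no_match.empty)
```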


In [4]:
import pandas as pd
from dash import Dash, html, dcc
from dash.dependencies import Input, Output
import plotly.express as px

# Read the CSV file with a different encoding
file_path = "Olympic 1976-2008.csv"
data = pd.read_csv(file_path, encoding='latin-1')

# Get unique non-null cities from the data
unique_cities = data['City'].dropna().unique()

# Initialize a new Dash app
app = Dash(__name__)

# Define the layout of the dashboard
app.layout = html.Div([
    html.H1("Medal Count vs Sport Discipline", style={'text-align': 'center', 'color': 'black'}),
    
    html.Div([
        html.H3("Medal Count by Sport", style={'color': 'black'}),
        
        # Dropdown for selecting a city
        dcc.Dropdown(
            id='city-dropdown',
            options=[{'label': city, 'value': city} for city in unique_cities],
            value=unique_cities[0],
            multi=False,
            style={'width': '50%', 'margin': '20px auto'}
        ),
    ]),
    
    # Horizontal bar chart showing the number of medals by sport
    dcc.Graph(
        id='sport-bar-chart',
        style={'height': '400px'}
    ),
])

# Define callback to update the bar chart based on the selected city
@app.callback(
    Output('sport-bar-chart', 'figure'),
    [Input('city-dropdown', 'value')]
)
def update_bar_chart(selected_city):
    # Filter data based on the selected city
    filtered_data = data[data['City'] == selected_city]

    # Count medal occurrences
    medal_counts = filtered_data['Sport'].value_counts().reset_index()
    medal_counts.columns = ['Sport', 'Medal Count']

    # Create horizontal bar chart
    bar_chart = px.bar(
        medal_counts,
        x='Medal Count',
        y='Sport',
        title=f'Medal Count by Sport in {selected_city}',
        labels={'Sport': 'Sport', 'Medal Count': 'Medal Count'},
        height=400,
        color_discrete_sequence=px.colors.diverging.Portland,  # Use diverging color scheme
        orientation='h'  # Make it a horizontal bar chart
    )

    bar_chart.update_layout(
        xaxis=dict(title='Medal Count', showgrid=False, zeroline=False),
        yaxis=dict(title='Sport', showgrid=False, zeroline=False),
        plot_bgcolor='#2c2c2c',  # Set dark background color
        paper_bgcolor='#2c2c2c',  # Set dark paper background color
        font=dict(color='#fff'),  # Set font color
    )

    return bar_chart

# Run the app with a different port
if __name__ == '__main__':
    app.run_server(debug=True, port=8092)


# Medal Count vs Sport Discipline

•	Horizontal bar chart shows medal count by sport for a selected city, aiding performance analysis.

•	Makes it easy to compare sports by the number of medals awarded at each Games.

•	Supports targeted resource allocation and strategic planning across sports.


In [5]:
from dash import Dash, html, dcc
from dash.dependencies import Input, Output
import pandas as pd
import plotly.express as px

# Read the CSV file with a different encoding
file_path = "Olympic 1976-2008.csv"
data = pd.read_csv(file_path, encoding='latin-1')

# Get unique non-null cities from the data
unique_cities = data['City'].dropna().unique()

# Initialize the Dash app
app = Dash(__name__)

# Define the layout of the dashboard
app.layout = html.Div([
    html.H1("Most Successful Athletes", style={'text-align': 'center', 'color': 'black'}),
    
    # Dropdown for selecting a city
    dcc.Dropdown(
        id='city-dropdown',
        options=[{'label': city, 'value': city} for city in unique_cities],
        value=unique_cities[0],
        multi=False,
        style={'width': '50%', 'margin': '20px auto'}
    ),
    
    # Dropdown for selecting a medal
    dcc.Dropdown(
        id='medal-dropdown',
        options=[
            {'label': medal, 'value': medal} for medal in data['Medal'].dropna().unique()
        ],
        value=data['Medal'].dropna().unique()[0],
        multi=False,
        style={'width': '50%', 'margin': '20px auto'}
    ),
    
    # Slider for setting minimum medal count threshold
    dcc.Slider(
        id='medal-threshold-slider',
        min=1,
        max=20,
        step=1,
        value=5,
        marks={i: str(i) for i in range(1, 21)},
        tooltip={'placement': 'bottom', 'always_visible': True}
    ),

    # Histogram showing the distribution of medals won by athletes
    dcc.Graph(
        id='athlete-histogram',
        style={'height': '400px'}
    ),
])

# Define callback to update the histogram based on selected city, medal, and medal threshold
@app.callback(
    Output('athlete-histogram', 'figure'),
    [Input('city-dropdown', 'value'),
     Input('medal-dropdown', 'value'),
     Input('medal-threshold-slider', 'value')]
)
def update_histogram(selected_city, selected_medal, medal_threshold):
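    # Filter data based on selected slicers and medal threshold.
    # Note: the threshold is compared with each athlete's medal count across the
    # whole dataset (all cities and medal types), not only the current selection.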
    filtered_data = data[(data['City'] == selected_city) & 
                         (data['Medal'] == selected_medal) & 
                         (data.groupby('Athlete')['Medal'].transform('count') >= medal_threshold)]

    # Create histogram
    athlete_histogram = px.histogram(
        filtered_data,
        x='Athlete',
        title=f'Most Successful Athletes in {selected_city} with {selected_medal} Medals (Threshold: {medal_threshold})',
        labels={'Athlete': 'Athlete', 'count': 'Medal Count'},
        height=400,
        color_discrete_sequence=px.colors.sequential.Viridis,  # Use Viridis color scale
    )

    athlete_histogram.update_layout(
        xaxis=dict(title='Athlete', showgrid=False, zeroline=False),
        yaxis=dict(title='Medal Count', showgrid=False, zeroline=False),
        plot_bgcolor='#2c2c2c',  # Set dark plot background color
        paper_bgcolor='#2c2c2c',  # Set dark paper background color
        font=dict(color='#fff'),  # Set font color
    )

    return athlete_histogram

# Run the app
if __name__ == '__main__':
    app.run_server(debug=True, port=8253)



# Most Successful Athletes


•	Histogram dynamically displaying distribution of medals won by athletes based on city, medal type, and threshold.

•	Identifies elite athletes, aids talent development, and supports strategic planning for sports events.

•	Informs sponsorship decisions and provides a performance benchmark for athletes.
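
The threshold can be read in two ways. The cell above compares it with each athlete's medal count across the whole dataset; the sketch below shows the alternative, where the count is taken only within the selected city and medal type. It is an illustration under that stated assumption, not the notebook's method: `athletes_over_threshold` is a hypothetical helper and `sample` holds only the five rows from the `data.head()` output.

```python
import pandas as pd

# The five rows from the data.head() output, with a subset of the columns
columns = ["City", "Year", "Discipline", "Athlete", "Gender", "Country", "Medal"]
rows = [
    ("Montreal", 1976.0, "Diving", "KÖHLER, Christa", "Women", "East Germany", "Silver"),
    ("Montreal", 1976.0, "Diving", "KOSENKOV, Aleksandr", "Men", "Soviet Union", "Bronze"),
    ("Montreal", 1976.0, "Diving", "BOGGS, Philip George", "Men", "United States", "Gold"),
    ("Montreal", 1976.0, "Diving", "CAGNOTTO, Giorgio Franco", "Men", "Italy", "Silver"),
    ("Montreal", 1976.0, "Diving", "WILSON, Deborah Keplar", "Women", "United States", "Bronze"),
]
sample = pd.DataFrame(rows, columns=columns)


def athletes_over_threshold(df, city, medal, threshold):
    """Medals per athlete within one city and medal type, keeping counts >= threshold."""
    selection = df[(df["City"] == city) & (df["Medal"] == medal)]
    per_athlete = selection["Athlete"].value_counts()
    return per_athlete[per_athlete >= threshold]


at_least_one = athletes_over_threshold(sample, "Montreal", "Silver", threshold=1)
at_least_two = athletes_over_threshold(sample, "Montreal", "Silver", threshold=2)
print(at_least_one)
print("Athletes with two or more:", len(at_least_two))
```

With this reading, a high threshold empties the chart quickly, because few athletes win many medals of one type at a single Games.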


In [7]:
import pandas as pd
import plotly.express as px
import dash
from dash import dcc, html  # replaces the deprecated dash_core_components / dash_html_components packages
from dash.dependencies import Input, Output

# Read the CSV file with a different encoding
file_path = "Olympic 1976-2008.csv"
data = pd.read_csv(file_path, encoding='latin-1')

# Remove null values from the 'City' column
cities_without_null = data['City'].dropna().unique()

# Create a list of dictionaries for dropdown options
city_options = [{'label': city, 'value': city} for city in cities_without_null]

# Create a list of dictionaries for medal slicer options
medal_options = [
    {'label': 'Gold', 'value': 'Gold'},
    {'label': 'Silver', 'value': 'Silver'},
    {'label': 'Bronze', 'value': 'Bronze'}
]

# Initialize the Dash app
app = dash.Dash(__name__)

# Define the layout of the app
app.layout = html.Div([
    html.H1("Best Performing Country in Each Discipline", style={'text-align': 'center', 'color': 'black'}),
    
    # Dropdown for selecting Olympic City
    html.Label("Select Olympic City:"),
    dcc.Dropdown(
        id='city-dropdown',
        options=city_options,
        value=cities_without_null[0],
        multi=False,
        style={'width': '50%', 'margin': '10px auto'}
    ),

    # Dropdown for selecting type of medal
    html.Label("Select Medal Type:"),
    dcc.Dropdown(
        id='medal-dropdown',
        options=medal_options,
        value='Gold',
        multi=False,
        style={'width': '50%', 'margin': '10px auto'}
    ),

    # Graph component to display the bar chart
    dcc.Graph(id='medal-bar-chart'),

    # Text component to display additional information
    html.Div(id='additional-info', style={'color': 'black', 'margin-top': '20px'})
])

# Define callback to update the chart and additional info based on the selected city and medal
@app.callback(
    [Output('medal-bar-chart', 'figure'),
     Output('additional-info', 'children')],
    [Input('city-dropdown', 'value'),
     Input('medal-dropdown', 'value')]
)
def update_chart(selected_city, selected_medal):
    filtered_data = data[(data['City'] == selected_city) & (data['Medal'] == selected_medal)]

    # Create an empty list to store traces for each discipline
    traces = []

    # Create an empty list to store additional information
    additional_info = []

    # Loop through unique disciplines
    for discipline in filtered_data['Discipline'].unique():
        # Filter data for the current discipline
        discipline_data = filtered_data[filtered_data['Discipline'] == discipline]

        # Calculate the total medals for each country in the selected city, discipline, and medal type
        medals_by_country = discipline_data.groupby('Country')['Medal'].count().reset_index()
        medals_by_country.columns = ['Country', 'Total Medals']

        # Find the country with the most medals
        most_medals_country = medals_by_country.loc[medals_by_country['Total Medals'].idxmax()]['Country']

        # Create a bar trace for the current discipline; the bar height is the
        # number of countries that won this medal type in the discipline
        trace = dict(
            type='bar',  # without an explicit type, Plotly draws a scatter trace
            x=[discipline],
            y=[len(medals_by_country)],
            name=most_medals_country,
            hoverinfo='name+y',
            marker=dict(color=px.colors.qualitative.Set1)  # You can choose a different color scale here
        )
        traces.append(trace)

        # Add additional information, one paragraph per discipline
        info_text = f"In {selected_city}, {most_medals_country} won the most {selected_medal} medals in {discipline}."
        additional_info.append(html.P(info_text))

    # Create layout for the figure
    layout = dict(
        xaxis=dict(title='Discipline', showgrid=False, zeroline=False),
        yaxis=dict(title='Count', showgrid=False, zeroline=False),
        plot_bgcolor='#2c2c2c',
        paper_bgcolor='#2c2c2c',
        font=dict(color='#fff'),
    )

    # Create the figure with the traces and layout
    fig = dict(data=traces, layout=layout)

    return fig, additional_info

# Run the app
if __name__ == '__main__':
    app.run_server(debug=True, port=8160)


# Best Performing Country in Each Discipline

•	Bar chart displaying the number of countries winning medals in each discipline for a selected city and medal type.

•	Supports discipline-specific performance analysis, identification of successful countries, and strategic planning.

•	Facilitates performance benchmarking against the country with the most medals in each discipline.
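
The per-discipline loop in the cell above produces two numbers for each discipline: how many countries won the selected medal (the bar height) and which country won the most (the trace name). A loop-free sketch of the same summary is shown below. `leading_country_by_discipline` is a hypothetical helper, `sample` holds only the five rows from the `data.head()` output, and when several countries share the top count the one that sorts first is reported.

```python
import pandas as pd

# The five rows from the data.head() output, with a subset of the columns
columns = ["City", "Year", "Discipline", "Athlete", "Gender", "Country", "Medal"]
rows = [
    ("Montreal", 1976.0, "Diving", "KÖHLER, Christa", "Women", "East Germany", "Silver"),
    ("Montreal", 1976.0, "Diving", "KOSENKOV, Aleksandr", "Men", "Soviet Union", "Bronze"),
    ("Montreal", 1976.0, "Diving", "BOGGS, Philip George", "Men", "United States", "Gold"),
    ("Montreal", 1976.0, "Diving", "CAGNOTTO, Giorgio Franco", "Men", "Italy", "Silver"),
    ("Montreal", 1976.0, "Diving", "WILSON, Deborah Keplar", "Women", "United States", "Bronze"),
]
sample = pd.DataFrame(rows, columns=columns)


def leading_country_by_discipline(df, city, medal):
    """Per discipline: the leading country, its medal count, and the number of medal-winning countries."""
    selection = df[(df["City"] == city) & (df["Medal"] == medal)]
    counts = (
        selection.groupby(["Discipline", "Country"])
        .size()
        .reset_index(name="Medals")
    )
    # Row with the highest medal count in each discipline
    leaders = counts.loc[counts.groupby("Discipline")["Medals"].idxmax()]
    leaders = leaders.set_index("Discipline").rename(columns={"Country": "Leader"})
    # Number of distinct countries that won this medal type in each discipline
    n_countries = counts.groupby("Discipline")["Country"].nunique()
    return leaders.assign(Countries=n_countries)


result = leading_country_by_discipline(sample, "Montreal", "Silver")
print(result)
```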


In [8]:
from dash import Dash, html, dcc
from dash.dependencies import Input, Output
import pandas as pd
import plotly.express as px

# Read the CSV file with a different encoding
file_path = "Olympic 1976-2008.csv"
data = pd.read_csv(file_path, encoding='latin-1')

# Get unique non-null cities from the data
unique_cities = data['City'].dropna().unique()

# Initialize the Dash app
app = Dash(__name__)

# Define the layout of the dashboard
app.layout = html.Div([
    html.H1("Most Successful Female Athletes", style={'text-align': 'center', 'color': 'black'}),
    
    # Dropdown for selecting a city
    dcc.Dropdown(
        id='city-dropdown',
        options=[{'label': city, 'value': city} for city in unique_cities],
        value=unique_cities[0],
        multi=False,
        style={'width': '50%', 'margin': '20px auto'}
    ),
    
    # Dropdown for selecting a medal
    dcc.Dropdown(
        id='medal-dropdown',
        options=[
            {'label': medal, 'value': medal} for medal in data['Medal'].dropna().unique()
        ],
        value=data['Medal'].dropna().unique()[0],
        multi=False,
        style={'width': '50%', 'margin': '20px auto'}
    ),
    
    # Slider for setting minimum medal count threshold
    dcc.Slider(
        id='medal-threshold-slider',
        min=1,
        max=20,
        step=1,
        value=5,
        marks={i: str(i) for i in range(1, 21)},
        tooltip={'placement': 'bottom', 'always_visible': True}
    ),

    # Histogram showing the distribution of medals won by athletes
    dcc.Graph(
        id='athlete-histogram',
        style={'height': '400px'}
    ),
])

# Define callback to update the histogram based on selected city, medal, and medal threshold
@app.callback(
    Output('athlete-histogram', 'figure'),
    [Input('city-dropdown', 'value'),
     Input('medal-dropdown', 'value'),
     Input('medal-threshold-slider', 'value')]
)
def update_histogram(selected_city, selected_medal, medal_threshold):
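    # Filter data based on selected slicers, medal threshold, and gender (female).
    # Note: the threshold is compared with each athlete's medal count across the
    # whole dataset (all cities and medal types), not only the current selection.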
    filtered_data = data[(data['City'] == selected_city) & 
                         (data['Medal'] == selected_medal) & 
                         (data['Gender'] == 'Women') &  # Filter for female athletes
                         (data.groupby('Athlete')['Medal'].transform('count') >= medal_threshold)]

    # Create histogram
    athlete_histogram = px.histogram(
        filtered_data,
        x='Athlete',
        title=f'Most Successful Female Athletes in {selected_city} with {selected_medal} Medals (Threshold: {medal_threshold})',
        labels={'Athlete': 'Athlete', 'count': 'Medal Count'},
        height=400,
        color_discrete_sequence=px.colors.qualitative.Set1,  # Use a different color scheme
    )

    athlete_histogram.update_layout(
        xaxis=dict(title='Athlete', showgrid=False, zeroline=False),
        yaxis=dict(title='Medal Count', showgrid=False, zeroline=False),
        plot_bgcolor='#2c2c2c',  # Set dark plot background color
        paper_bgcolor='#2c2c2c',  # Set dark paper background color
        font=dict(color='#fff'),  # Set font color
    )

    return athlete_histogram

# Run the app
if __name__ == '__main__':
    app.run_server(debug=True, port=12153)



# Most Successful Female Athletes

•	Histogram displays the distribution of medals won by female athletes based on city, medal type, and threshold.

•	Identifies successful female athletes for talent recognition, strategic planning, and resource allocation.

•	Supports gender equality initiatives, promotes female athletes as role models, and aids in benchmarking performance.


In [10]:
import pandas as pd
import plotly.express as px
import dash
from dash import dcc, html  # replaces the deprecated dash_core_components / dash_html_components packages
from dash.dependencies import Input, Output

# Read the CSV file with a different encoding
file_path = "Olympic 1976-2008.csv"
data = pd.read_csv(file_path, encoding='latin-1')

# Remove null values from the 'City' column
cities_without_null = data['City'].dropna().unique()

# Create a list of dictionaries for dropdown options
city_options = [{'label': city, 'value': city} for city in cities_without_null]

# Create a list of dictionaries for country slicer options
country_options = [{'label': country, 'value': country} for country in data['Country'].unique() if pd.notnull(country)]

# Create a list of dictionaries for medal type options
medal_options = [{'label': medal, 'value': medal} for medal in data['Medal'].unique() if pd.notnull(medal)]

# Initialize the Dash app
app = dash.Dash(__name__)

# Define the layout of the app
app.layout = html.Div([
    html.H1("Medal Distribution", style={'text-align': 'center', 'color': 'black'}),
    
    # Dropdown for selecting Olympic City
    html.Label("Select Olympic City:"),
    dcc.Dropdown(
        id='city-dropdown',
        options=city_options,
        value=cities_without_null[0],
        multi=False,
        style={'width': '50%', 'margin': '10px auto'}
    ),

    # Dropdown for selecting Country
    html.Label("Select Country:"),
    dcc.Dropdown(
        id='country-dropdown',
        options=country_options,
        multi=False,
        style={'width': '50%', 'margin': '10px auto'}
    ),

    # Dropdown for selecting Medal Type
    html.Label("Select Medal Type:"),
    dcc.Dropdown(
        id='medal-dropdown',
        options=medal_options,
        multi=False,
        style={'width': '50%', 'margin': '10px auto'}
    ),

    # Pie chart to display discipline distribution
    dcc.Graph(id='discipline-pie-chart')
])

# Define callback to update the pie chart based on the selected city, country, and medal type
@app.callback(
    Output('discipline-pie-chart', 'figure'),
    [Input('city-dropdown', 'value'),
     Input('country-dropdown', 'value'),
     Input('medal-dropdown', 'value')]
)
def update_pie_chart(selected_city, selected_country, selected_medal):
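    # Filter by city, then by country and medal type only when one is selected
    # (a dropdown with no selection passes None, which would otherwise match no rows)
    mask = data['City'] == selected_city
    if selected_country is not None:
        mask &= data['Country'] == selected_country
    if selected_medal is not None:
        mask &= data['Medal'] == selected_medal
    filtered_data = data[mask]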

    # Pie chart for discipline distribution
    discipline_pie_chart = px.pie(
        filtered_data,
        names='Discipline',
        title=f"Discipline Distribution in {selected_city}",
        template='plotly_dark'
    )

    return discipline_pie_chart

# Run the app
if __name__ == '__main__':
    app.run_server(debug=True, port=8181)


# Medal Distribution

1.	Pie chart dynamically displays medal distribution across disciplines for a selected city, country, and medal type.

2.	Aids performance analysis for a specific country, informs resource allocation decisions, and identifies popular disciplines.

3.	Facilitates benchmarking against previous Olympics and guides targeted athlete development programs.


In [11]:
import pandas as pd
import dash
from dash import dcc, html  # replaces the deprecated dash_core_components / dash_html_components packages
from dash.dependencies import Input, Output
import plotly.express as px

# Read the CSV file with a different encoding
file_path = "Olympic 1976-2008.csv"
data = pd.read_csv(file_path, encoding='latin-1')

# Remove null values from the 'City' column
cities_without_null = data['City'].dropna().unique()

# Create a list of dictionaries for dropdown options
city_options = [{'label': city, 'value': city} for city in cities_without_null]

# Create a list of dictionaries for country slicer options
country_options = [{'label': country, 'value': country} for country in data['Country'].unique() if pd.notnull(country)]

# Create a list of dictionaries for medal type options
medal_options = [{'label': medal, 'value': medal} for medal in data['Medal'].unique() if pd.notnull(medal)]

# Initialize the Dash app
app = dash.Dash(__name__)

# Define the layout of the app
app.layout = html.Div([
    html.H1("Olympic Dashboard", style={'text-align': 'center', 'color': 'Black'}),

    # Dropdown for selecting Olympic City
    html.Label("Select Olympic City:"),
    dcc.Dropdown(
        id='city-dropdown',
        options=city_options,
        value=cities_without_null[0],
        multi=False,
        style={'width': '50%', 'margin': '10px auto'}
    ),

    # Dropdown for selecting Country
    html.Label("Select Country:"),
    dcc.Dropdown(
        id='country-dropdown',
        options=country_options,
        multi=False,
        style={'width': '50%', 'margin': '10px auto'}
    ),

    # Dropdown for selecting Medal Type
    html.Label("Select Medal Type:"),
    dcc.Dropdown(
        id='medal-dropdown',
        options=medal_options,
        multi=False,
        style={'width': '50%', 'margin': '10px auto'}
    ),

    # Slider for setting minimum medal count threshold
    dcc.Slider(
        id='medal-threshold-slider',
        min=1,
        max=20,
        step=1,
        value=5,
        marks={i: str(i) for i in range(1, 21)},
        tooltip={'placement': 'bottom', 'always_visible': True}
    ),

    # Pie chart to display discipline distribution
    dcc.Graph(id='discipline-pie-chart'),

    # Bar chart showing the distribution of medals by sport
    dcc.Graph(id='sport-bar-chart', style={'height': '400px'}),

    # Bar chart showing the distribution of medals by medal type
    dcc.Graph(id='medal-bar-chart', style={'height': '400px'}),

    # Histogram showing the distribution of medals won by athletes
    dcc.Graph(id='athlete-histogram', style={'height': '400px'}),

    # Pie chart showing the gender split of medals for the selected country and medal type
    dcc.Graph(id='medal-country-pie-chart', style={'height': '400px'}),

    # Bar chart showing the most successful athletes
    dcc.Graph(id='successful-athletes-bar-chart', style={'height': '400px'}),

    # Scatter plot showing the correlation between the number of athletes and medals
    dcc.Graph(id='athletes-vs-medals-scatter-plot', style={'height': '400px'}),

    # Bubble chart showing the most successful athlete in each discipline
    dcc.Graph(id='athlete-bubble-chart', style={'height': '600px'}),

    # Bar chart for discipline vs number of medals
    dcc.Graph(id='discipline-vs-medals-bar-chart'),

    # Text component to display additional information
    # Dark text: the page uses the browser's default (white) background, so '#fff' would be invisible
    html.Div(id='additional-info', style={'color': '#000', 'margin-top': '20px'})
])

# Define callback to update all charts based on the selected city, country, and medal type
@app.callback(
    [Output('discipline-pie-chart', 'figure'),
     Output('sport-bar-chart', 'figure'),
     Output('medal-bar-chart', 'figure'),
     Output('athlete-histogram', 'figure'),
     Output('medal-country-pie-chart', 'figure'),
     Output('successful-athletes-bar-chart', 'figure'),
     Output('athletes-vs-medals-scatter-plot', 'figure'),
     Output('athlete-bubble-chart', 'figure'),
     Output('discipline-vs-medals-bar-chart', 'figure'),
     Output('additional-info', 'children')],
    [Input('city-dropdown', 'value'),
     Input('country-dropdown', 'value'),
     Input('medal-dropdown', 'value'),
     Input('medal-threshold-slider', 'value')]
)
def update_charts(selected_city, selected_country, selected_medal, medal_threshold):
    # Filter data by selected city, country, and medal type, keeping only athletes
    # whose total medal count across the whole dataset meets the slider threshold
    filtered_data = data[(data['City'] == selected_city) & 
                         (data['Country'] == selected_country) & 
                         (data['Medal'] == selected_medal) &
                         (data.groupby('Athlete')['Medal'].transform('count') >= medal_threshold)]

    # Update pie chart for discipline distribution
    discipline_pie_chart = px.pie(
        filtered_data,
        names='Discipline',
        title=f"Discipline Distribution in {selected_city}",
        template='plotly_dark'
    )

    # Update bar chart for distribution of medals by sport
    # Name the columns explicitly: pandas 2.0 changed what value_counts().reset_index() returns
    sport_counts = filtered_data['Sport'].value_counts().rename_axis('Sport').reset_index(name='Count')
    sport_bar_chart = px.bar(
        sport_counts,
        x='Sport',
        y='Count',
        title=f'Distribution of Medals by Sport in {selected_city}',
        height=400,
        color='Count',  # Color bars by medal count for gradient colors
        color_continuous_scale='Viridis',  # Use Viridis color scale for gradient colors
        template='plotly_dark'
    )

    sport_bar_chart.update_layout(
        xaxis=dict(title='Sport', showgrid=False, zeroline=False),
        yaxis=dict(title='Count', showgrid=False, zeroline=False),
    )

    # Update bar chart for distribution of medals by medal type
    # Name the columns explicitly: pandas 2.0 changed what value_counts().reset_index() returns
    medal_type_counts = filtered_data['Medal'].value_counts().rename_axis('Medal Type').reset_index(name='Count')
    medal_bar_chart = px.bar(
        medal_type_counts,
        x='Medal Type',
        y='Count',
        title=f'Distribution of Medals by Medal Type in {selected_city}',
        height=400,
        color='Count',  # Color bars by medal count for gradient colors
        color_continuous_scale='Viridis',  # Use Viridis color scale for gradient colors
        template='plotly_dark'
    )

    medal_bar_chart.update_layout(
        xaxis=dict(title='Medal Type', showgrid=False, zeroline=False),
        yaxis=dict(title='Count', showgrid=False, zeroline=False),
    )

    # Update histogram for distribution of medals won by athletes
    athlete_histogram = px.histogram(
        filtered_data,
        x='Athlete',
        title=f'Distribution of Medals Won by Athletes in {selected_city}',
        labels={'Athlete': 'Athlete', 'count': 'Medal Count'},
        height=400,
        color_discrete_sequence=px.colors.qualitative.Set3,  # Set color scheme for athlete
        template='plotly_dark'
    )

    athlete_histogram.update_layout(
        xaxis=dict(title='Athlete', showgrid=False, zeroline=False),
        yaxis=dict(title='Medal Count', showgrid=False, zeroline=False),
    )

    # Update pie chart for the gender split of medals for the selected country and medal type
    medal_country_pie_chart = px.pie(
        filtered_data,
        names='Gender',
        title=f'Distribution of {selected_medal} Medals by Gender in {selected_country}',
        template='plotly_dark'
    )

    # Update bar chart for the most successful athletes
    successful_athletes_bar_chart = px.bar(
        filtered_data.groupby('Athlete')['Medal'].count().reset_index().nlargest(10, 'Medal'),
        x='Medal',
        y='Athlete',
        title=f'Top 10 Most Successful Athletes in {selected_city} ({selected_medal} Medals)',
        labels={'Athlete': 'Athlete', 'Medal': 'Medal Count'},
        height=400,
        color_discrete_sequence=px.colors.qualitative.Set1,  # Use Set1 color scheme
        template='plotly_dark'
    )

    # Update layout for the successful athletes bar chart
    successful_athletes_bar_chart.update_layout(
        xaxis=dict(title='Medal Count', showgrid=False, zeroline=False),
        yaxis=dict(title='Athlete', showgrid=False, zeroline=False),
    )

    # Scatter plot of unique medal-winning athletes against medal count per country
    # (filtered_data holds only the selected country, so this plots a single point)
    athletes_vs_medals_scatter_plot = px.scatter(
        filtered_data.groupby('Country').agg({'Athlete': 'nunique', 'Medal': 'count'}).reset_index(),
        x='Athlete',
        y='Medal',
        title=f'Correlation between Number of Athletes and Medals in {selected_country}',
        labels={'Athlete': 'Number of Athletes', 'Medal': 'Number of Medals'},
        template='plotly_dark',
        color_discrete_sequence=['#1f77b4'],  # Set color
        hover_name='Country'  # Show country names on hover
    )

    # Update layout for the scatter plot
    athletes_vs_medals_scatter_plot.update_layout(
        xaxis=dict(title='Number of Athletes', showgrid=False, zeroline=False),
        yaxis=dict(title='Number of Medals', showgrid=False, zeroline=False),
    )

    # Bubble chart showing the most successful athlete in each discipline
    most_successful_athletes = data.groupby('Discipline')['Athlete'].agg(lambda x: x.value_counts().idxmax()).reset_index()
    medal_counts = data.groupby(['Discipline', 'Athlete']).size().reset_index(name='Medal Count')
    most_successful_athletes = pd.merge(most_successful_athletes, medal_counts, on=['Discipline', 'Athlete'], how='left')

    bubble_chart = px.scatter(
        most_successful_athletes,
        x='Discipline',
        y='Athlete',
        size='Medal Count',
        color='Medal Count',
        hover_data=['Medal Count'],
        # Computed from the full dataset above, not from the selected city
        title='Most Successful Athlete in Each Discipline (All Games in Dataset)',
        labels={'Athlete': 'Athlete', 'Discipline': 'Discipline'},
        template='plotly_dark'
    )

    bubble_chart.update_layout(
        xaxis=dict(title='Discipline', showgrid=False, zeroline=False),
        yaxis=dict(title='Athlete', showgrid=False, zeroline=False),
    )

    # Bar chart for discipline vs number of medals
    discipline_vs_medals_bar_chart = px.bar(
        filtered_data.groupby('Discipline')['Medal'].count().reset_index(),
        x='Discipline',
        y='Medal',
        title=f'Discipline vs Number of Medals in {selected_city}',
        labels={'Discipline': 'Discipline', 'Medal': 'Number of Medals'},
        height=400,
        color='Medal',  # Set color scale to 'Medal' for gradient colors
        color_continuous_scale='Viridis',  # Use Viridis color scale for gradient colors
        template='plotly_dark'
    )

    discipline_vs_medals_bar_chart.update_layout(
        xaxis=dict(title='Discipline', showgrid=False, zeroline=False),
        yaxis=dict(title='Number of Medals', showgrid=False, zeroline=False),
    )

    # Additional information
    additional_info = f"Selected City: {selected_city}, Selected Country: {selected_country}, Selected Medal: {selected_medal}, Medal Threshold: {medal_threshold}"

    return (
        discipline_pie_chart, sport_bar_chart, medal_bar_chart, athlete_histogram, medal_country_pie_chart,
        successful_athletes_bar_chart, athletes_vs_medals_scatter_plot, bubble_chart, discipline_vs_medals_bar_chart,
        additional_info
    )

# Run the app (ports below 1024, such as 299, usually need elevated privileges on Unix-like systems)
if __name__ == '__main__':
    app.run_server(debug=True, port=299)
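
In the dashboard above, the country and medal dropdowns have no initial `value`, so the callback receives `None` for them and the equality filters match no rows until both are chosen. One possible way to treat an empty dropdown as "no filter" is sketched below. `filter_medals` is a hypothetical helper, not part of the notebook, and `sample` holds only the five rows printed by `data.head()`.

```python
import pandas as pd

# The five rows shown by data.head() at the top of the notebook (not the full dataset).
sample = pd.DataFrame({
    "City": ["Montreal"] * 5,
    "Country": ["East Germany", "Soviet Union", "United States", "Italy", "United States"],
    "Medal": ["Silver", "Bronze", "Gold", "Silver", "Bronze"],
})

def filter_medals(df, city=None, country=None, medal=None):
    """Hypothetical helper: keep rows matching every filter that is not None."""
    mask = pd.Series(True, index=df.index)
    for column, value in (("City", city), ("Country", country), ("Medal", medal)):
        if value is not None:
            mask = mask & (df[column] == value)
    return df[mask]

print(len(filter_medals(sample)))                                        # no filters
print(len(filter_medals(sample, country="United States")))               # one filter
print(len(filter_medals(sample, country="United States", medal="Gold"))) # two filters
```

Inside the callback, such a helper could replace the chained comparisons, with the athlete threshold applied as one more condition. Chart titles that interpolate the selections would then need a fallback label for `None`.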


# Project Report:

Objective:

The objective of studying Olympic data from 1976 to 2008 is to understand global sports performance dynamics. This involves analyzing athlete, team, and national performance patterns and identifying success factors and challenges. Insights into how sports have evolved, including changes in training, technology, and socio-political context, can inform national and international sports policies and strengthen infrastructure and talent programs. The project supports strategic planning for future Olympics by highlighting growth areas and guiding resource allocation. Trends in talent identification help refine selection processes and guide athlete development programs. Comparing performance across the global sports landscape helps identify best practices. Sports management benefits from data-driven decisions about coaching, training, sponsorship, and resource allocation. The dataset also contributes to academic research on sports trends and performance analytics. By promoting data-driven decisions in the sports industry, the project aims to preserve the rich history of Olympic competition and inspire future generations.


Key Findings:


Dominant Nations: The United States emerges as the overall medal leader, but rankings shift across Olympic cycles. The Soviet Union and East Germany ranked high in the earlier Games, while China's rapid rise in later Games suggests strategic investment in talent development.

Discipline-Specific Stars: Bubble charts reveal top athletes by discipline, offering valuable insights for talent scouting and training programs.

Gender Gaps and Parity: Bar and pie charts dissect medal distribution by gender, highlighting discrepancies and potential areas for promoting equality.

Sport Variations: Horizontal bar charts illustrate the spread of medals across different sports, showcasing national strengths and potential development areas.

Athlete Profiles: Histograms unveil the distribution of athletes based on medal counts, offering insights into the competitive landscape and identifying potential training targets.

Country Comparisons: Dynamic charts enable benchmarking against successful nations, aiding in strategic planning and resource allocation.
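
Several of the findings above reduce to simple pandas summaries. The sketch below shows one way they could be computed. It is illustrative only: `sample` holds just the five rows printed by `data.head()` at the top of the notebook, and the variable names are hypothetical.

```python
import pandas as pd

# The five rows shown by data.head() at the top of the notebook;
# with the real file, the full `data` frame would be used instead.
sample = pd.DataFrame({
    "Year": [1976.0] * 5,
    "Athlete": ["KÖHLER, Christa", "KOSENKOV, Aleksandr", "BOGGS, Philip George",
                "CAGNOTTO, Giorgio Franco", "WILSON, Deborah Keplar"],
    "Gender": ["Women", "Men", "Men", "Men", "Women"],
    "Country": ["East Germany", "Soviet Union", "United States", "Italy", "United States"],
    "Medal": ["Silver", "Bronze", "Gold", "Silver", "Bronze"],
})

# Dominant nations: medal rows per country, largest first.
country_totals = sample["Country"].value_counts()

# Gender parity: share of medal rows won by each gender at each Games.
gender_share = pd.crosstab(sample["Year"], sample["Gender"], normalize="index")

# Athlete profiles: how many athletes hold 1, 2, 3, ... medals.
medals_per_athlete = sample.groupby("Athlete").size()
athlete_profile = medals_per_athlete.value_counts().sort_index()

print(country_totals)
print(gender_share)
print(athlete_profile)
```

Each row in the dataset is one athlete's medal, so if team events are recorded once per team member, these totals count medal rows rather than events.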


Managerial Implications:


Strategic Resource Allocation:

Sports organizations can strategically allocate resources based on the distribution of medals by sport and discipline. This information guides investment in training programs, coaching staff, and facilities to enhance performance in areas of historical success.

Talent Identification and Development:

The application aids sports managers in identifying key athletes and disciplines where consistent success has been achieved. This information is crucial for talent scouting, development programs, and strategic planning to nurture future champions.

Gender Diversity Initiatives:

The gender distribution analysis informs sports organizations about the participation and success of male and female athletes. It provides a basis for initiatives promoting gender equality and helps identify areas for improvement in gender-specific sports.

Strategic Partnerships and Sponsorships:

Recognizing the most successful athletes and disciplines allows for targeted partnerships and sponsorships. Sponsors may be interested in supporting athletes with proven track records, contributing to the financial sustainability of sports programs.

Performance Benchmarking:

Sports organizations and countries can benchmark their performance against historical data, setting realistic goals for future events. Benchmarking against successful countries and athletes provides a reference point for continuous improvement.
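
As a rough illustration of benchmarking, the sketch below compares one country's medal rows at each Games with those of the leading nation. `benchmark_against_leader` is a hypothetical helper, and `sample` again holds only the five rows printed by `data.head()`, so the numbers say nothing about the full dataset.

```python
import pandas as pd

# The five rows shown by data.head() at the top of the notebook (not the full dataset).
sample = pd.DataFrame({
    "Year": [1976.0] * 5,
    "Country": ["East Germany", "Soviet Union", "United States", "Italy", "United States"],
    "Medal": ["Silver", "Bronze", "Gold", "Silver", "Bronze"],
})

def benchmark_against_leader(df, country):
    """Hypothetical helper: per Games, a country's medal rows versus the leading nation's."""
    # Rows are Games (Year), columns are countries, values are medal-row counts.
    table = df.groupby(["Year", "Country"]).size().unstack(fill_value=0)
    result = pd.DataFrame({
        # reindex yields a column of zeros if the country never appears
        "country_medals": table.reindex(columns=[country], fill_value=0)[country],
        "leader": table.idxmax(axis=1),
        "leader_medals": table.max(axis=1),
    })
    result["gap"] = result["leader_medals"] - result["country_medals"]
    return result

italy = benchmark_against_leader(sample, "Italy")
print(italy)
```

A shrinking `gap` across successive Games would be one simple indicator of progress toward the leading nation.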


Recommendations:

Strategic Partnerships: Leverage data to identify countries with complementary strengths for mutually beneficial collaborations.

Targeted Training Programs: Invest in disciplines with high potential based on historical trends and future Olympic projections.

Gender Equity Focus: Implement initiatives to attract and retain female athletes in traditionally male-dominated sports.

Benchmarking and Goal Setting: Continuously assess progress against successful nations and athletes to drive improvement.

Data-Driven Decisions: Integrate data analysis into coaching methodologies and athlete sponsorship decisions.

Conclusion:

This comprehensive analysis of Olympic data provides a roadmap for sports organizations to optimize their performance and achieve sustained success. By leveraging the power of data visualization and insightful analysis, we can empower future generations of athletes to reach their full potential and leave their mark on the Olympic stage. This project paves the way for a data-driven approach to sports management, ensuring our dedication to athletic excellence transcends generations and inspires the world.