KeanuSwart/codedex-track3-hackathon

codedex-track3-hackathon

Please read below!

The purpose behind this Track was to predict the winners of the 2024 Paris Summer Olympics.

In my implementation I made use of Python (Flask) to build a web application that displays my predicted findings.

I will go over how I was able to produce my Project:

So, initially I got the following dataset from Kaggle: https://www.kaggle.com/datasets/piterfm/olympic-games-medals-19862018 [results.csv]. This dataset holds all Olympic event results from 1896-2022. I started by cleaning this dataset, since I realised we would only need historic data for the Summer Olympics, and I figured that it wouldn't be relevant to include data so old that it would not affect current predictions. With this in mind, I deleted all Winter Olympics records as well as all Summer Olympics records older than London 2012. In addition, I simplified the data into the following fields: discipline_title, event_title, slug_game, medal_type, rank_equal, rank_position, country_name, country_code, country_3_letter_code
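A minimal sketch of this filtering step with pandas. It is not the script used in the project: the slug_game values ("london-2012", "rio-2016", "tokyo-2020"), the hard-coded set of Games and the function name filter_results are assumptions for illustration, and the sample rows are placeholders rather than real results.

```python
import pandas as pd

# Columns kept in the simplified dataset, as listed above.
KEEP_COLUMNS = [
    "discipline_title", "event_title", "slug_game", "medal_type",
    "rank_equal", "rank_position", "country_name", "country_code",
    "country_3_letter_code",
]

# Assumption: slug_game identifies each Games as "<city>-<year>".
SUMMER_GAMES_SINCE_2012 = {"london-2012", "rio-2016", "tokyo-2020"}


def filter_results(results: pd.DataFrame) -> pd.DataFrame:
    """Keep Summer Games from London 2012 onwards and only the simplified columns."""
    recent_summer = results[results["slug_game"].isin(SUMMER_GAMES_SINCE_2012)]
    return recent_summer[KEEP_COLUMNS].reset_index(drop=True)


# Placeholder rows standing in for results.csv.
base = {col: "placeholder" for col in KEEP_COLUMNS}
sample = pd.DataFrame([
    {**base, "slug_game": "tokyo-2020", "extra_column": "dropped"},
    {**base, "slug_game": "beijing-2022", "extra_column": "dropped"},  # Winter Games
    {**base, "slug_game": "beijing-2008", "extra_column": "dropped"},  # before London 2012
    {**base, "slug_game": "london-2012", "extra_column": "dropped"},
])

filtered = filter_results(sample)
print(filtered["slug_game"].tolist())
```

With the real file, you would load it with pd.read_csv("results.csv") and write the result with to_csv("updated_filtered_results.csv", index=False).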

This can be seen in the cleaned dataset [updated_filtered_results.csv]: image

With this dataset, I wanted to simplify it even more by only storing the nations that were in the top 5 of each event. To do this, I ran the following code [remove_data.py]: image

The above code produced olympic_rank_counts.csv, which can be seen below: image
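Since remove_data.py is only shown as an image, here is a rough sketch of how top-5 finishes could be counted per nation. The output layout (one rank_1 to rank_5 column per nation and Games), the function name count_top5 and the sample rows are assumptions, so the real olympic_rank_counts.csv may look different.

```python
import pandas as pd


def count_top5(results: pd.DataFrame) -> pd.DataFrame:
    """Count how often each nation finished in positions 1 to 5 at each Games."""
    # rank_position may hold non-numeric entries such as "DNF"; those become NaN.
    ranks = pd.to_numeric(results["rank_position"], errors="coerce")
    top5 = results.assign(rank_position=ranks).loc[ranks.between(1, 5)]
    top5 = top5.astype({"rank_position": int})

    counts = (
        top5.groupby(["slug_game", "country_name", "rank_position"])
        .size()
        .unstack("rank_position", fill_value=0)
        .reindex(columns=range(1, 6), fill_value=0)  # always emit ranks 1..5
        .add_prefix("rank_")
        .reset_index()
    )
    counts.columns.name = None
    return counts


# Placeholder rows, not real results.
sample = pd.DataFrame({
    "slug_game": ["london-2012"] * 6,
    "country_name": ["Nation A", "Nation A", "Nation A",
                     "Nation B", "Nation B", "Nation B"],
    "rank_position": ["1", "1", "3", "2", "7", "DNF"],
})

rank_counts = count_top5(sample)
print(rank_counts)
```

Rows ranked outside the top 5, or without a numeric rank, are dropped before counting.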

Now that the data is in a more appropriate format, I was able to start making predictions on the dataset. I made use of sklearn in order to develop a prediction model (The following segment can be seen in predict_script.py).

I started by loading the data and normalizing it by assigning weights based on the Olympic Games in question. I figured that the more recent the Games, the greater the weight should be, since a nation is more likely to achieve gold, silver or bronze if it achieved those medals more recently. I followed this by defining the target variables for the prediction, which were the events per nation, and then merged these columns into a new dataframe: image
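To illustrate the recency weighting idea, here is a small sketch. The actual weights in predict_script.py are not stated in this readme, so the 1/2/3 weights, the gold/silver/bronze column names and the function weighted_medals are all hypothetical.

```python
import pandas as pd

# Hypothetical raw weights: more recent Games count for more.
RAW_WEIGHTS = {"london-2012": 1.0, "rio-2016": 2.0, "tokyo-2020": 3.0}

# Normalize so the weights sum to 1.
_total = sum(RAW_WEIGHTS.values())
GAME_WEIGHTS = {game: weight / _total for game, weight in RAW_WEIGHTS.items()}

MEDAL_COLUMNS = ["gold", "silver", "bronze"]  # assumed column names


def weighted_medals(counts: pd.DataFrame) -> pd.DataFrame:
    """Scale each Games' medal counts by its weight and sum per nation."""
    weighted = counts.copy()
    weight = weighted["slug_game"].map(GAME_WEIGHTS)
    for col in MEDAL_COLUMNS:
        weighted[f"weighted_{col}"] = weighted[col] * weight
    weighted_cols = [f"weighted_{col}" for col in MEDAL_COLUMNS]
    return weighted.groupby("country_name")[weighted_cols].sum().reset_index()


# Placeholder medal counts, not real data.
sample = pd.DataFrame({
    "slug_game": ["london-2012", "rio-2016", "tokyo-2020"] * 2,
    "country_name": ["Nation A"] * 3 + ["Nation B"] * 3,
    "gold": [6, 6, 6, 0, 0, 6],
    "silver": [0, 0, 0, 0, 0, 0],
    "bronze": [0, 0, 0, 0, 0, 0],
})

totals = weighted_medals(sample)
print(totals)
```

Nation B only won gold at the most recent Games, yet still gets half of Nation A's weighted score, which is the effect the weighting is meant to have.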

I then constructed the model and split the data into training and testing sets. Following this, I was able to make predictions and evaluate them using the trained model: image
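A rough sketch of the split, fit, predict and evaluate flow with scikit-learn. The readme does not say which estimator or metric predict_script.py uses, so LinearRegression and mean squared error are stand-ins, and the features and targets below are synthetic numbers generated only to make the example runnable.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in data: two hypothetical weighted medal features per nation.
X = rng.integers(0, 40, size=(60, 2)).astype(float)
# Synthetic target: a noisy blend of the two features.
y = 0.4 * X[:, 0] + 0.6 * X[:, 1] + rng.normal(0.0, 1.0, size=60)

# Hold out 20% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Assumption: a plain linear model; the project may use a different estimator.
model = LinearRegression()
model.fit(X_train, y_train)

# Predict on the held-out rows and measure the error.
y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
print(f"Test MSE: {mse:.3f}")
```

Evaluating on the held-out test rows, rather than on the rows the model was fitted on, gives a fairer picture of how the model handles unseen data.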

I then stored the prediction data in a file named olympic_predictions_2024.csv: image

Which looked like the following: image

This was great, however, I wanted to clean this up so that we only have the predicted values instead of the weights as well, so I made use of clean_data.py to clean the data and ended up getting the following [processed_olympic_predictions_2024.csv]: image

And from this point forward, I made use of processed_olympic_predictions.csv as my prediction set :)

In order to run this application on your side, I would suggest installing the latest versions of:

  • Python
  • Pip
  • Flask
  • scikit-learn (imported as sklearn)
  • pandas
  • plotly

Once these have been installed, you can navigate to the folder that contains app.py and run the following: python app.py
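For readers new to Flask, this is roughly what a minimal app that serves predictions looks like. It is not the project's app.py: the /api/predictions route, the PREDICTIONS rows and the field names are hypothetical.

```python
from flask import Flask, jsonify

app = Flask(__name__)

# Hypothetical placeholder rows; the real app reads its prediction CSV.
PREDICTIONS = [
    {"country_name": "Nation A", "gold": 3, "silver": 2, "bronze": 1},
    {"country_name": "Nation B", "gold": 1, "silver": 4, "bronze": 2},
]


@app.route("/api/predictions")
def predictions():
    """Return the predicted medal counts as JSON."""
    return jsonify(PREDICTIONS)


# To serve this locally you would call app.run(debug=True), which is
# typically what running `python app.py` ends up doing.
```

Flask's built-in test client (app.test_client()) lets you call the route without starting a server.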

You will be met with the following screen (Yes, I know my frontend skills aren't the greatest haha but I did make sure to use the Paris 2024 colours. I also didn't want to use the logos because of potential copyright issues? I wasn't sure but I hope you enjoy): image

Following this you can click on the 'Let's Predict' button, where the winner will be announced!: image

I am sure you guys are happy with the predicted winner! Anyways, you can click on the 'Show Predictions' button and will be met with the predictions page: image

This is your 2024 Summer Olympics Prediction Dashboard! You are met with a table that displays the Top 5 Rankings. This table includes how many Golds, Silvers and Bronzes each nation is predicted to win.

I would like to mention that all plots are interactive, so if you want to see how well a specific nation did, you can click on the name of the nation in the plot and it will show you their analysis! (Double-click to return to the default analysis.)

Underneath this, you will see a Filters Tab. This tab works with the plots below the tab (Bar Graph, Pie Chart and Table). Let's say, for example, you wanted to see who got Gold medals at the Athletics. You will enter the following: image

And you can view the analysis below!: image image
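Behind a filter like the one above, the logic can be as simple as a pandas mask. This is only a sketch: the function filter_predictions, the "GOLD" medal label and the sample rows are assumptions, not the app's actual code or data.

```python
import pandas as pd


def filter_predictions(predictions, discipline=None, medal=None):
    """Return the rows that match the chosen discipline and/or medal type."""
    mask = pd.Series(True, index=predictions.index)
    if discipline is not None:
        mask &= predictions["discipline_title"] == discipline
    if medal is not None:
        mask &= predictions["medal_type"] == medal
    return predictions[mask]


# Placeholder predictions, not the real output of the model.
sample = pd.DataFrame({
    "discipline_title": ["Athletics", "Athletics", "Swimming", "Athletics"],
    "medal_type": ["GOLD", "SILVER", "GOLD", "GOLD"],
    "country_name": ["Nation A", "Nation B", "Nation A", "Nation C"],
})

athletics_gold = filter_predictions(sample, discipline="Athletics", medal="GOLD")
print(athletics_gold["country_name"].tolist())
```

Leaving a filter as None skips it, so the same function covers "all medals in Athletics" or "all Gold medals" as well.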

I would advise playing around with the filters. It is very fun to see the different predictions for each event :)

Moving on, we can navigate to the 'Compare Nations' tab at the top of the page, where you will be met with the following: image

Woah, it's so empty :( It's because you will need to select the two nations you want to compare! For example, let's compare USA to China (1st and 2nd Place based on the predictions): image

And if you scroll down, you will see the following table, helping visualize the data that is in the bar graph: image

Once again, feel free to play around, it is pretty fun.

If you navigate to 'View All Data', you will be met with the following screen: image

There is a brief analysis of which nation achieved the highest number of Gold, Silver and Bronze medals, as well as the nation with the fewest! Below this is a table of all the nations that achieved at least one medal at the Olympics.

If you navigate to 'Data Description', you will be met with descriptions of the datasets that have been used, as well as the process and scripts used to clean and engineer the data that has been used. Feel free to go through this - it is informative, but not as informative as this readme :): image

With regards to the challenges that I faced when creating this application, the largest hurdle to get over was definitely the data preparation steps... It was a real struggle trying to find data that could be used appropriately. Some datasets would have different names for the same sports, and it looked very scruffy when applying that to a UI. So, cleaning, preparing and engineering the data definitely took the longest. I think it is also very apparent that I am not a Front-End Dev, however I really enjoyed making use of Flask, and being able to visualize the data that I had predicted was very rewarding!

And that is all! Thanks a lot for setting up this hackathon. It has been a great pleasure to be involved in it, and I would like to extend my thanks to the mentors/collaborators and members of the team that spent countless hours setting the tracks up! I hope you enjoyed my project and I hope to be involved in more in the future :) Have a great day further! Also, if you have any questions, please feel free to reach out to me! I have included my LinkedIn below: https://www.linkedin.com/in/keanu-swart-930013210/

About

My implementation of Track 3 for Codedex Summer Hackathon 2024
