👨💻 Author: Anton Reyes
This project is more for my own practice with data visualization, dashboarding, processing/cleaning, and prediction.
This project contains notebooks in which I attempted to predict the winner of RuPaul's Drag Race Season 15 using machine learning, based on the data of winners across all the different franchises and countries. It also contains a dashboard that summarizes the data of all finalists across those franchises.
(Note: This project contains all the finalists up until US Season 15)
Other than that, this project is mainly for my own practice with data visualization, dashboarding, processing/cleaning, and prediction in a domain that I am familiar with.
Project Dashboard: Click here (Hosting in progress)
Notebook | Description | Status |
---|---|---|
ModelP1.ipynb | Contains the EDA and preprocessing for the prediction model | Complete |
ModelP2.ipynb | Contains the prediction models and results | Complete |
Processing.ipynb | EDA and preprocessing of finalist data for the dashboard | Complete |
dashboard.py | Contains visualizations and dashboard | Complete |
- Exploratory Data Analysis (EDA)
- Data preprocessing
- Data cleaning
- Data visualization
- Predictive modeling
- Python
- Jupyter
- Dash
- VSCode
- Google Sheets
- Render
Tracking the contestants' progress across the different franchises of RuPaul's Drag Race has been a weekly thing for me since March 2020. I enjoyed watching the show as well as recording the weekly placements of each contestant. Eventually, after so many franchises, I was able to build my own criteria for which contestants would reach the finale of their season.
As of April 2023, the dataset has more than 10 tabs for the different franchises, seasons, and countries - and it's still growing.
Aside from manually entering each episode on a weekly basis, I wanted to take it a step further and apply what I've learned from my minor program in Data Science.
I consider this a culminating mini final project for myself.
No web scraping of any kind was done to get this dataset.
This is where all the contestant progress is stored. Regardless of the franchise or country, their progress is recorded on a weekly basis.
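As a rough illustration of how weekly records like these can be summarized, here is a minimal sketch. The row layout `(franchise, season, contestant, episode, placement)`, the placement labels, and the contestant names are all assumptions made up for this example; they are not the actual columns or data of the tracker.

```python
from collections import Counter

# Hypothetical tracker rows: (franchise, season, contestant, episode, placement).
# Names and placements are invented for illustration only.
ROWS = [
    ("US", 1, "Queen A", 1, "WIN"),
    ("US", 1, "Queen A", 2, "SAFE"),
    ("US", 1, "Queen A", 3, "WIN"),
    ("US", 1, "Queen B", 1, "SAFE"),
    ("US", 1, "Queen B", 2, "BTM"),
    ("US", 1, "Queen B", 3, "HIGH"),
]


def summarize(rows):
    """Count how often each contestant received each placement."""
    summary = {}
    for franchise, season, contestant, _episode, placement in rows:
        key = (franchise, season, contestant)
        summary.setdefault(key, Counter())[placement] += 1
    return summary


track_records = summarize(ROWS)
for key, counts in track_records.items():
    print(key, dict(counts))
```

Keeping one row per contestant per episode makes it straightforward to aggregate a track record regardless of franchise or country.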
Since watching all episodes from different countries would be a limitation for me (some franchises are not available or are hard to access in the Philippines), I go to Fandom to check the contestant progress. After that, I check their placements and encode them into the Google Sheet tracker. Though unreliable for other cases, I use Wikipedia to cross-check any placements with Fandom and YouTube.
Since some franchises are hard to get access to from the Philippines, YouTube becomes an indicator for me to record and update the tracker. Additionally, in the event that I get skeptical with any of my sources, I get to see for myself if the said progress is justified based on the episode itself.
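The cross-checking in this project was done by hand, but the same idea can be sketched in code. The example below is only an illustration under assumptions: the function name `find_mismatches`, the `(contestant, episode)` keys, and the sample placements are all hypothetical and not taken from the actual tracker.

```python
def find_mismatches(primary, secondary):
    """Return entries whose placement differs between two sources.

    Both arguments map (contestant, episode) to a placement label.
    Only keys present in both sources are compared.
    """
    mismatches = {}
    for key in primary.keys() & secondary.keys():
        if primary[key] != secondary[key]:
            mismatches[key] = (primary[key], secondary[key])
    return mismatches


# Invented sample data standing in for placements copied from two sources.
fandom = {
    ("Queen A", 1): "WIN",
    ("Queen A", 2): "SAFE",
    ("Queen B", 1): "BTM",
}
wikipedia = {
    ("Queen A", 1): "WIN",
    ("Queen A", 2): "HIGH",
    ("Queen B", 1): "BTM",
}

to_review = find_mismatches(fandom, wikipedia)
print(to_review)
```

Any entry that shows up in `to_review` would be a candidate for watching the episode itself to settle which source is right.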
- Gathering the winners and identifying the returning challenges
- Doing the modeling and applying it to my datasets
- Relearning Dash, HTML, and CSS or styling for Dash components (dbc)
- Dealing with inconsistencies in the dataset
- Data exploration/descriptive statistics
- Data Preprocessing/Cleaning
- Writeup/Storytelling
- Layout study for dashboard
First off, thanks for considering contributing to this project! Contributions are what make the open-source community such an amazing place to learn, inspire, and create. Any contributions you make will benefit everybody else and are greatly appreciated.
You can contribute by:
- Improving documentation
- Implementing a new feature
- Discussing potential ways to improve the project
- Adding another graph/plot
Just make sure that your contributions or reports are:
- Reproducible. Include steps to reproduce the problem.
- Specific. Include as much detail as possible: which version, what environment, etc.
- Unique. Do not duplicate existing opened issues.