shubham14yadav/Attendance-Prediction

Analyzing Sports Event Attendance Using Pandas, SQL, and Data Visualization

Description:
This project explores sports event attendance data to uncover insights using Python libraries such as Pandas, pandasql, and visualization tools like Matplotlib and Seaborn. The project involves loading and manipulating sports event data from Excel files, performing SQL-like queries in Python, and visualizing trends in attendance based on various factors like game timing and weather conditions.

Key Features and Analysis Performed:

  1. Data Loading and Preparation: Imported datasets from Excel files into Pandas DataFrames, including game sales, UCP scans, and reservations.

  2. SQL-like Data Queries: Utilized pandasql for SQL-style querying within Python. Analyses included comparing attendance for evening and afternoon games, monthly attendance trends, and UCP member behavior.

  3. Attendance Analysis: Investigated the correlation between game timing (evening vs. afternoon) and attendance, average attendance by month, and the behavior of UCP members with specific reservation and scan thresholds.

  4. Web Scraping: Used the requests and BeautifulSoup libraries to scrape game data from a sports website and add it to the dataset.

  5. Data Transformation and Feature Engineering: Enhanced the data with calculated fields such as game time and attendance status, and merged with weather data for comprehensive analysis.

  6. Data Visualization: Utilized Matplotlib and Seaborn for plotting various aspects of the data, such as attendance trends by game time, month, and other factors.

  7. Predictive Modeling and Machine Learning: Explored Linear Regression, Random Forest, and XGBoost models to predict attendance, with hyperparameter tuning and model evaluation using RMSE and R² score.

  8. Feature Importance Analysis: Determined the importance of various features in the attendance prediction using the Random Forest model.

  9. Custom Predictions and Insights: Made custom attendance predictions based on specific game conditions and visualized true vs. predicted attendance comparisons.
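As an illustration of the feature engineering and attendance comparisons in items 3 and 5, here is a minimal sketch in plain Pandas. The column names (`start_time`, `attendance`), the sample rows, and the 17:00 cutoff between afternoon and evening games are assumptions made for this example, not values taken from the project's data.

```python
import pandas as pd

# Hypothetical sample rows; the real project loads its data from Excel files.
games = pd.DataFrame({
    "start_time": pd.to_datetime([
        "2023-04-01 13:05", "2023-04-02 19:10",
        "2023-05-06 13:05", "2023-05-07 18:40",
    ]),
    "attendance": [12000, 18000, 15000, 21000],
})

# Assumed cutoff: games starting at 17:00 or later count as evening games.
EVENING_START_HOUR = 17

# Derived fields: game time category and calendar month.
games["game_time"] = games["start_time"].dt.hour.map(
    lambda h: "evening" if h >= EVENING_START_HOUR else "afternoon"
)
games["month"] = games["start_time"].dt.month

# Average attendance by game time and by month.
avg_by_game_time = games.groupby("game_time")["attendance"].mean()
avg_by_month = games.groupby("month")["attendance"].mean()

print(avg_by_game_time)
print(avg_by_month)
```

With pandasql, the first aggregation could instead be written as a query along the lines of `SELECT game_time, AVG(attendance) FROM games GROUP BY game_time` passed to `sqldf`; the Pandas `groupby` form is used here only to keep the sketch dependency-light.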

Technologies Used: Python, Pandas, pandasql, Matplotlib, Seaborn, requests, BeautifulSoup, Machine Learning (scikit-learn, XGBoost).
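The modeling, evaluation, and feature importance steps (items 7 and 8) could look roughly like the following scikit-learn sketch. It uses synthetic data and hypothetical feature names (`is_evening`, `month`, `temperature`), since the project's actual features and hyperparameters are not listed here, and it covers only the Random Forest model.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; feature names and effect sizes are assumptions.
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame({
    "is_evening": rng.integers(0, 2, n),
    "month": rng.integers(4, 10, n),
    "temperature": rng.normal(70, 10, n),
})
y = 10000 + 4000 * X["is_evening"] + 50 * X["temperature"] + rng.normal(0, 500, n)

# Hold out a test set so the metrics reflect unseen games.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
pred = model.predict(X_test)

# RMSE is the square root of the mean squared error.
rmse = float(np.sqrt(mean_squared_error(y_test, pred)))
r2 = r2_score(y_test, pred)

# Impurity-based feature importances from the fitted forest.
importances = pd.Series(
    model.feature_importances_, index=X.columns
).sort_values(ascending=False)

print(f"RMSE: {rmse:.1f}, R2: {r2:.3f}")
print(importances)
```

A custom prediction for a specific game (item 9) would then be a call such as `model.predict(pd.DataFrame([{"is_evening": 1, "month": 7, "temperature": 80.0}]))`, again with illustrative values.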

Project Outcome:

The project examined how factors such as game timing, month, and weather relate to sports event attendance, combining Python and SQL-style queries for analysis with machine learning models for prediction. The resulting visualizations and models can support sports event management and marketing decisions.
