This project analyzes video game sales data using both SQL and Pandas in Python. It compares average global sales before and after 2005 using PYTHON and MYSQL
These instructions will help you set up the project on your local machine for development and testing purposes. For deployment instructions, refer to the Deployment section
The Python script in this repository performs the following tasks:
- Loads video game sales data from a CSV file
- Inserts the data into a MySQL database using SQLAlchemy
- Adds a new column to categorize games as pre-2005 or post-2005 using SQL
- Calculates average global sales for both periods using SQL queries
- Performs the same calculations using Pandas for comparison
- Compares the results from both SQL and Pandas methods
To run this project, you need the following software installed:
- Python software
- Pandas library
- SQLAlchemy library
- MySQL Connector for Python
- A MySQL server with a database named `sales_db'
Follow these steps to get your development environment up and running:
- Download the Project Files: Download the project files directly from the repository.
- Navigate to the Project Directory: cd path/to/your/project
- Install Required Python Packages: pip install pandas sqlalchemy mysql-connector-python
- Prepare the MySQL Database: Ensure you have a MySQL database named
sales_dbset up and ready for use. - Update the File Path: Modify the file path in the script to point to your local CSV file containing the video game sales data.
- Run the Analysis Script: python vgsales_analysis.py This will execute the script and output the average global sales before and after 2005.
To verify the functionality of the Video Game Sales Analysis script, follow these steps:
- Set Up Your Environment: Ensure you have Python, Pandas, SQLAlchemy, and MySQL Connector installed, and that your MySQL server is running.
- Prepare Your Data:
- Place the
vgsales.csvfile in the specified location and update thefile_pathvariable in the script: - file_path = 'C:\Users\path\vgsales.csv' # Change this to your actual file path
- Place the
- Configure the Database:
- Create a MySQL database named
sales_dband update the connection string in the script with your MySQL credentials: engine = create_engine('mysql+mysqlconnector://username:password@localhost/sales_db') # Replace 'username' and 'password' with your actual MySQL username and password.
- Create a MySQL database named
- Run the Script: Execute the following command in your terminal: python vgsales_analysis.py
- The script inserts video game sales data from a CSV file into a MySQL table.
- It adds a new column to categorize the games as 'pre-2005' or 'post-2005'.
- The script calculates average global sales for both periods using SQL queries.
- It performs the same calculations using Pandas for comparison.
- Results from both SQL and Pandas are printed to the console.
For deploying this project on a live system, ensure that:
- Your MySQL server is operational.
- The
sales_dbdatabase is created. - Update the database connection string in the script with your MySQL credentials.
- Pandas - For data manipulation and analysis.
- SQLAlchemy - For database interaction using Python.
- MySQL Connector - To connect Python with MySQL.
[Martins Bash 100890325]
This project is licensed under the MIT License - see the LICENSE.md) file for details.