MySQL Python Data Pipeline Mini Project

📖 Abstract

This project develops a basic data pipeline for an event ticketing system, integrating CSV-based vendor feeds with a relational database. The system simulates how major ticket platforms manage direct ticket sales alongside third-party reseller submissions.

The workflow includes:

Designing and creating a sales table in MySQL with fields for ticket, customer, event, and purchase details.
Using Python’s MySQL connector to establish database connections and ingest new sales records from daily CSV submissions.
Implementing a loader function to iterate through CSV rows and insert them into the database table.
Running SQL queries to generate statistical insights, such as identifying the most popular events in the past month based on ticket volume.
Formatting query results into human-readable output, e.g., "The North American International Auto Show" as a top-selling event.

Deliverables include Python code, the sales table definition, ingestion logic, and example query results.

Through this project, I gained practice in Python-MySQL integration, database schema creation, batch CSV ingestion, and SQL-based analytics, building foundational ETL pipeline skills.

🛠 Requirements

Python 3.8+
MySQL Server running locally or remote
mysql-connector-python package
Provided ticket_sales.csv dataset
GitHub repo for code + logs submission

🧰 Setup

Install mysql-connector-python (pip install mysql-connector-python)
Create a MySQL database (e.g., ticketdb)
Define and create sales table per schema in rubric
Place provided CSV (ticket_sales.csv) in project folder

📊 Dataset

ticket_sales.csv (third-party reseller records)
Schema includes:

ticket_id, trans_date, event_id, event_name, event_date, event_type,

event_city, customer_id, price, num_tickets

⏱️ Run Steps

Connect to MySQL using get_db_connection()
Load CSV into sales table with load_third_party()
Commit and close connection
Query popular tickets in past month using query_popular_tickets()
Print results to stdout in user-friendly format

📈 Outputs

sales table populated with ticket_sales.csv records
Console output listing top-selling events (past month)

📎 Deliverables

🛠️ Architecture

Data pipeline architecture:
- Source: CSV (third-party ticket sales)
- Ingestion: Python (mysql-connector)
- Storage: MySQL sales table
- Analytics: SQL queries for popular events

🔍 Monitoring

Manual validation via MySQL queries (SELECT TOP records)
Review of command-line output logs

♻️ Cleanup

Drop sales table if no longer needed
Delete ticket_sales.csv from project folder
Remove Python venv (optional)

Generated automatically via Python + Jinja2 + SQL Server table tblMiniProjectProgress on 11-11-2025 15:31:07

Name		Name	Last commit message	Last commit date
Latest commit History 13 Commits
deliverables		deliverables
.gitattributes		.gitattributes
README.md		README.md
create_ticketsales_table.sql		create_ticketsales_table.sql
event_ticket_system.py		event_ticket_system.py
execution_log.txt		execution_log.txt
requirements.txt		requirements.txt
third_party_sales_1.csv		third_party_sales_1.csv

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

MySQL Python Data Pipeline Mini Project

📖 Abstract

🛠 Requirements

🧰 Setup

📊 Dataset

⏱️ Run Steps

📈 Outputs

📎 Deliverables

🛠️ Architecture

🔍 Monitoring

♻️ Cleanup

About

Uh oh!

Releases

Packages

Languages

mtholahan/mysql-python-data-pipeline-mini-project

Folders and files

Latest commit

History

Repository files navigation

MySQL Python Data Pipeline Mini Project

📖 Abstract

🛠 Requirements

🧰 Setup

📊 Dataset

⏱️ Run Steps

📈 Outputs

📎 Deliverables

🛠️ Architecture

🔍 Monitoring

♻️ Cleanup

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages