SQL-Driven E-Commerce Database Optimization

This project demonstrates an end-to-end SQL query optimization system with ML-based prediction, dynamic indexing, real-time monitoring via InfluxDB + Grafana, and benchmarking using JMeter. The backend is powered by a Flask API that applies query rewrites, logs performance metrics, and adapts based on historical patterns.

How to Compile and Run

Clone the repository
Install Python dependencies
Ensure you are using Python 3.8+. Install requirements: pip install mysql-connector-python flask scikit-learn pandas
Start MySQL and InfluxDB
Make sure the following services are running locally:
MySQL (with user: your_username & password, database: ecommerce_db)
InfluxDB (port 8086, database: jmeter)
Set up MySQL schema and data
Run the provided SQL setup file or manually execute sql_setup.sql to populate tables like products, reviews, orders.
Start the Flask API
python ml_optimization_script.py
The API will start at: http://localhost:5000/query/
Run JMeter Benchmark
Open comparison.jmx in Apache JMeter
Add Backend Listener for InfluxDB
Ensure JSON Extractors and JSR223 Listeners are properly placed
Start test
View in Grafana
Access Grafana at http://localhost:3000
--> Data Source Configuration
- Go to: Gear Icon → Data Sources
- Click: Add data source
- Select: InfluxDB
- Set:
- URL: http://localhost:8086
- Database: jmeter
- Version: InfluxQL
- Click Save & Test
Import the dashboard JSON provided or manually configure panels to visualize metrics from:
jmeter_metrics (JDBC requests)
optimized_query_exec (Flask-optimized responses)

Requirements & Installation

Python 3.8+
Flask
pandas, numpy, scikit-learn
MySQL Server
InfluxDB (v1.x)
Grafana
Apache JMeter
MySQL Connector for Python

Team Contributions

--> Ananya Penuballi
Designed and created the MySQL schema and dataset for products, reviews, and orders.
Implemented core SQL optimization logic, including:
- Index suggestion and creation logic
- Query rewriting strategies (e.g., USE INDEX, SQL_NO_CACHE)
- Handling of subqueries, JOINs, and aggregations

--> Sai Sanjay Yerunkar
Led the Flask API development, responsible for:
Predictive optimization using a trained ML model (RandomForestRegressor)
- Executing optimized queries and logging performance
- Returning clean JSON responses with applied optimizations
Built the JMeter setup for benchmarking each query endpoint
Integrated InfluxDB and Grafana for real-time visualization
Ensured correct query execution time logging via JSR223 scripting and JSON extractors

Notes

The optimized_query_exec InfluxDB measurement logs only query execution time, not API overhead.
Indexes are created dynamically if missing, with error-handling for duplicate index cases.
All dashboards are parameterized to allow per-query comparison over time.

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
README.md		README.md
comparison.jmx		comparison.jmx
ml_optimization_script.py		ml_optimization_script.py
sql_setup.sql		sql_setup.sql

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

SQL-Driven E-Commerce Database Optimization

How to Compile and Run

Requirements & Installation

Team Contributions

Notes

About

Uh oh!

Releases

Packages

Languages

saiyerunkar/CS257_SQL_Query_Optimization

Folders and files

Latest commit

History

Repository files navigation

SQL-Driven E-Commerce Database Optimization

How to Compile and Run

Requirements & Installation

Team Contributions

Notes

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages