Skip to content

saiyerunkar/CS257_SQL_Query_Optimization

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 
 
 

Repository files navigation

SQL-Driven E-Commerce Database Optimization

This project demonstrates an end-to-end SQL query optimization system with ML-based prediction, dynamic indexing, real-time monitoring via InfluxDB + Grafana, and benchmarking using JMeter. The backend is powered by a Flask API that applies query rewrites, logs performance metrics, and adapts based on historical patterns.

How to Compile and Run

  1. Clone the repository

  2. Install Python dependencies
    Ensure you are using Python 3.8+. Install requirements: pip install mysql-connector-python flask scikit-learn pandas

  3. Start MySQL and InfluxDB
    Make sure the following services are running locally:
    MySQL (with user: your_username & password, database: ecommerce_db)
    InfluxDB (port 8086, database: jmeter)

  4. Set up MySQL schema and data
    Run the provided SQL setup file or manually execute sql_setup.sql to populate tables like products, reviews, orders.

  5. Start the Flask API
    python ml_optimization_script.py
    The API will start at: http://localhost:5000/query/

  6. Run JMeter Benchmark
    Open comparison.jmx in Apache JMeter
    Add Backend Listener for InfluxDB
    Ensure JSON Extractors and JSR223 Listeners are properly placed
    Start test

  7. View in Grafana
    Access Grafana at http://localhost:3000
    --> Data Source Configuration
    - Go to: Gear Icon → Data Sources
    - Click: Add data source
    - Select: InfluxDB
    - Set:
    - URL: http://localhost:8086
    - Database: jmeter
    - Version: InfluxQL
    - Click Save & Test
    Import the dashboard JSON provided or manually configure panels to visualize metrics from:
    jmeter_metrics (JDBC requests)
    optimized_query_exec (Flask-optimized responses)

Requirements & Installation

Python 3.8+
Flask
pandas, numpy, scikit-learn
MySQL Server
InfluxDB (v1.x)
Grafana
Apache JMeter
MySQL Connector for Python

Team Contributions

--> Ananya Penuballi
Designed and created the MySQL schema and dataset for products, reviews, and orders.
Implemented core SQL optimization logic, including:
- Index suggestion and creation logic
- Query rewriting strategies (e.g., USE INDEX, SQL_NO_CACHE)
- Handling of subqueries, JOINs, and aggregations


--> Sai Sanjay Yerunkar
Led the Flask API development, responsible for:
Predictive optimization using a trained ML model (RandomForestRegressor)
- Executing optimized queries and logging performance
- Returning clean JSON responses with applied optimizations
Built the JMeter setup for benchmarking each query endpoint
Integrated InfluxDB and Grafana for real-time visualization
Ensured correct query execution time logging via JSR223 scripting and JSON extractors

Notes

The optimized_query_exec InfluxDB measurement logs only query execution time, not API overhead.
Indexes are created dynamically if missing, with error-handling for duplicate index cases.
All dashboards are parameterized to allow per-query comparison over time.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages