# 🟢 SQL Mini-Projects

## 🎯 Goals:
- Practice SELECT, JOIN, GROUP BY, HAVING.
- Learn to integrate SQL with Pandas.
- Build ready-made mini-reports for a portfolio.

### 📌 Variant 1. “Sales Dashboard (SQL + Pandas)”
- Connect to `sales.db`.
- Run SQL queries:
  - Top-3 products by revenue.
  - Revenue by category.
  - Revenue by region and date.
- Load results into Pandas.
- Build one integrated DataFrame and save to `sql_dashboard.csv`.

### 📌 Variant 2. “Business Insights Report”
- SQL query with HAVING: find categories with total revenue > 150.
- SQL query with a subquery: products with revenue above average.
- Add a `report_type` column (`categories` / `products`) to each result.
- Combine the results in Pandas and save them to `insights.csv`.

### 📌 Variant 3. “SQL + Pandas Mix”
- SQL query to fetch all sales.
- In Pandas, calculate:
  - average revenue,
  - min/max,
  - standard deviation.
- Save everything into `sql_stats.csv`.
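
The cells below expect `db/sales.db` to exist already. A minimal sketch for recreating it from the rows printed in the first cell's output might look like this. The column types and the helper name `build_sales_db` are assumptions, since the original schema is not shown.

```python
import sqlite3

# Rows copied from the SALES, PRODUCTS and REGIONS tables printed in this notebook.
SALES = [
    ("2025-09-01", 101, 50.0), ("2025-09-01", 102, 30.0),
    ("2025-09-02", 101, 90.0), ("2025-09-02", 103, 40.0),
    ("2025-09-03", 102, 60.0), ("2025-09-03", 104, 75.0),
    ("2025-09-04", 105, 20.0), ("2025-09-04", 101, 65.0),
]
PRODUCTS = [
    (101, "Laptop", "Electronics"), (102, "Phone", "Electronics"),
    (103, "Backpack", "Fashion"), (104, "Tablet", "Electronics"),
    (105, "Headphones", "Accessories"),
]
REGIONS = [
    ("2025-09-01", "North"), ("2025-09-01", "East"),
    ("2025-09-02", "West"), ("2025-09-02", "North"),
    ("2025-09-03", "South"), ("2025-09-03", "East"),
    ("2025-09-04", "West"), ("2025-09-04", "South"),
]


def build_sales_db(path=":memory:"):
    """Hypothetical helper: create the three tables and return an open connection."""
    conn = sqlite3.connect(path)
    # Column types are assumed; the notebook only shows the data.
    conn.executescript("""
        DROP TABLE IF EXISTS sales;
        DROP TABLE IF EXISTS products;
        DROP TABLE IF EXISTS regions;
        CREATE TABLE sales (date TEXT, product_id INTEGER, revenue REAL);
        CREATE TABLE products (product_id INTEGER, product_name TEXT, category TEXT);
        CREATE TABLE regions (date TEXT, region TEXT);
    """)
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", SALES)
    conn.executemany("INSERT INTO products VALUES (?, ?, ?)", PRODUCTS)
    conn.executemany("INSERT INTO regions VALUES (?, ?)", REGIONS)
    conn.commit()
    return conn


# In-memory by default; pass 'db/sales.db' to write the file the cells read.
conn = build_sales_db()
print(conn.execute("SELECT COUNT(*), SUM(revenue) FROM sales").fetchone())  # (8, 430.0)
conn.close()
```

Writing to a file path requires the `db/` directory to exist beforehand.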


In [21]:
import sqlite3
import pandas as pd
# Open the SQLite database and preview the three source tables.
conn = sqlite3.connect('db/sales.db')
print('SALES\n', pd.read_sql("SELECT * FROM sales", conn), '\n')
print('PRODUCTS\n', pd.read_sql("SELECT * FROM products", conn), '\n')
print('REGIONS\n', pd.read_sql("SELECT * FROM regions", conn), '\n')

# Top-3 products by total revenue.
top3_products = """
    SELECT p.product_name, SUM(s.revenue) AS total_revenue
    FROM products p
    INNER JOIN sales s ON p.product_id = s.product_id
    GROUP BY p.product_name
    ORDER BY total_revenue DESC
    LIMIT 3
"""
df1 = pd.read_sql(top3_products, conn)

# Total revenue per product category.
categories_revenue = """
    SELECT p.category, SUM(s.revenue) AS total_revenue
    FROM products p
    INNER JOIN sales s ON p.product_id = s.product_id
    GROUP BY p.category
"""
df2 = pd.read_sql(categories_revenue, conn)

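# Total revenue per region and date.
# Note: `regions` lists two regions per date and has no product key, so joining
# on date alone attributes every sale to both regions of that date. The
# regional totals therefore overlap and should not be summed across regions.
rev_regions_and_dates = """
    SELECT r.region, r.date, SUM(s.revenue) AS total_revenue
    FROM regions r
    INNER JOIN sales s ON s.date = r.date
    GROUP BY r.region, r.date
"""
df3 = pd.read_sql(rev_regions_and_dates, conn)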

# Stack the three results; columns a query does not return are filled with 'N/A'.
report = pd.concat([df1, df2, df3], axis=0, ignore_index=True).fillna('N/A')
report.to_csv('csv/sql_dashboard.csv', index=False)
conn.close()

SALES
          date  product_id  revenue
0  2025-09-01         101     50.0
1  2025-09-01         102     30.0
2  2025-09-02         101     90.0
3  2025-09-02         103     40.0
4  2025-09-03         102     60.0
5  2025-09-03         104     75.0
6  2025-09-04         105     20.0
7  2025-09-04         101     65.0 

PRODUCTS
    product_id product_name     category
0         101       Laptop  Electronics
1         102        Phone  Electronics
2         103     Backpack      Fashion
3         104       Tablet  Electronics
4         105   Headphones  Accessories 

REGIONS
          date region
0  2025-09-01  North
1  2025-09-01   East
2  2025-09-02   West
3  2025-09-02  North
4  2025-09-03  South
5  2025-09-03   East
6  2025-09-04   West
7  2025-09-04  South 
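
The cell above stacks three result sets with different columns, so the saved CSV does not say which query a row came from. One possible way to keep that information is to pass a dict to `pd.concat`, which turns the dict keys into an index level that can then become a column. The sketch below uses hand-built stand-in frames whose values are derived from the tables printed above; the `section` labels are illustrative names, not part of the original notebook.

```python
import pandas as pd

# Stand-ins for two of the query results (totals computed from the printed tables).
top3 = pd.DataFrame({
    "product_name": ["Laptop", "Phone", "Tablet"],
    "total_revenue": [205.0, 90.0, 75.0],
})
by_category = pd.DataFrame({
    "category": ["Accessories", "Electronics", "Fashion"],
    "total_revenue": [20.0, 370.0, 40.0],
})

# Dict keys become the outer index level, named "section" here.
report = (
    pd.concat(
        {"top3_products": top3, "revenue_by_category": by_category},
        names=["section", "row"],
    )
    .reset_index(level="section")  # move the label into a regular column
    .reset_index(drop=True)        # discard the per-frame row numbers
)
print(report)
```

Missing cells stay as `NaN` in this sketch; the `fillna('N/A')` call from the cell above could still be applied before saving.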



In [37]:
import sqlite3
import pandas as pd
conn = sqlite3.connect('db/sales.db')

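# Categories whose total revenue exceeds 150. HAVING filters after aggregation;
# repeating the aggregate instead of its alias keeps the query portable.
categories_over150_rev = """
    SELECT p.category, SUM(s.revenue) AS total_revenue
    FROM products p
    INNER JOIN sales s ON p.product_id = s.product_id
    GROUP BY p.category
    HAVING SUM(s.revenue) > 150
"""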
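# Sales whose revenue is above the average sale (scalar subquery).
# Note: the comparison is per sale row, so a product can appear more than once.
product_rev_above_avg = """
    SELECT p.product_name, s.revenue
    FROM products p
    INNER JOIN sales s ON p.product_id = s.product_id
    WHERE s.revenue > (SELECT AVG(revenue) FROM sales)
"""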

df1 = pd.read_sql(categories_over150_rev, conn)
df2 = pd.read_sql(product_rev_above_avg, conn)
df1['report_type'] = 'categories'
df2['report_type'] = 'products'

report = pd.concat([df1, df2], axis=0, ignore_index=True).fillna('N/A')
# index=False keeps the row index out of the CSV, as in the other reports.
report.to_csv('csv/insights.csv', index=False)

conn.close()
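
The subquery in the cell above compares individual sales with the average sale. If "products with revenue above average" is read at the product level instead, one possible variant is to total revenue per product first and compare each total with the average of those totals. The sketch below is an alternative reading, not the notebook's own query. It runs against an in-memory copy of the printed data so that it is self-contained.

```python
import sqlite3
import pandas as pd

# In-memory copy of the relevant columns from the tables printed earlier.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (product_id INTEGER, product_name TEXT);
    CREATE TABLE sales (product_id INTEGER, revenue REAL);
    INSERT INTO products VALUES
        (101, 'Laptop'), (102, 'Phone'), (103, 'Backpack'),
        (104, 'Tablet'), (105, 'Headphones');
    INSERT INTO sales VALUES
        (101, 50), (102, 30), (101, 90), (103, 40),
        (102, 60), (104, 75), (105, 20), (101, 65);
""")

# Inner subquery: one total per product; outer subquery: average of those totals.
query = """
    SELECT p.product_name, SUM(s.revenue) AS total_revenue
    FROM products p
    INNER JOIN sales s ON p.product_id = s.product_id
    GROUP BY p.product_id, p.product_name
    HAVING SUM(s.revenue) > (
        SELECT AVG(product_total)
        FROM (SELECT SUM(revenue) AS product_total
              FROM sales
              GROUP BY product_id)
    )
    ORDER BY total_revenue DESC
"""
above_avg = pd.read_sql(query, conn)
conn.close()
print(above_avg)
```

With the printed data, the per-product totals average 86, so only Laptop (205) and Phone (90) pass the filter, and each product appears exactly once.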

In [60]:
import sqlite3
import pandas as pd
conn = sqlite3.connect('db/sales.db')
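# Fetch each sale exactly once, enriched with product details. Joining `regions`
# on date would return every sale twice (two regions per date) and skew the
# standard deviation.
query = """
    SELECT s.date, p.product_id, p.product_name, p.category, s.revenue
    FROM sales s
    INNER JOIN products p ON p.product_id = s.product_id
"""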
df = pd.read_sql(query, conn)
# Summary statistics over the revenue column (std is the sample standard deviation).
data = {
    'avg_rev': df['revenue'].mean(),
    'min_rev': df['revenue'].min(),
    'max_rev': df['revenue'].max(),
    'std': round(df['revenue'].std(), 2)
}
report = pd.DataFrame([data])
report.to_csv('csv/sql_stats.csv', index=False)
conn.close()
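
As a possible alternative to building the dictionary by hand, `Series.agg` can compute the same statistics in one call. The sketch below uses the eight revenue values printed in the SALES table rather than a database connection. It also checks what happens when every row is duplicated, which is the effect of joining `regions` on `date` alone with this data.

```python
import pandas as pd

# Revenue values copied from the SALES table printed earlier.
revenue = pd.Series([50.0, 30.0, 90.0, 40.0, 60.0, 75.0, 20.0, 65.0], name="revenue")

# One call for all four statistics; std is the sample standard deviation (ddof=1).
stats = revenue.agg(["mean", "min", "max", "std"]).round(2)
report = stats.to_frame().T  # one row, one column per statistic
print(report)

# Duplicating every row leaves mean, min and max unchanged,
# but it shrinks the sample standard deviation.
doubled = pd.concat([revenue, revenue], ignore_index=True)
print(round(revenue.std(), 2), round(doubled.std(), 2))
```

The duplicated series has the same mean of 53.75 but a smaller sample standard deviation, which is why the row count of the joined result matters for this report.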