# Ungraded Lab: AI-Powered Schema Optimization

## 📋 Overview 
In this lab, you'll explore how to leverage AI to optimize SQL queries and database schemas. As a data analyst at BookCycle, you'll use generative AI to suggest improvements to your existing queries and database structure. This hands-on experience will help you understand how AI can enhance database performance and efficiency.


## 🎯 Learning Outcomes
By the end of this lab, you will be able to:

- Use generative AI to analyze and optimize SQL queries
- Identify indexing opportunities to improve query performance
- Compare AI suggestions with best practices for database schema design


## 📚 Dataset Information
We'll be working with the BookCycle database, which includes tables for <b>transactions, books, and customers</b>. This data represents sales and inventory information for a small bookstore chain.

## 🖥️ Activities

### Activity 1: Analyze Existing Query 

As a data analyst at BookCycle, you've been asked to optimize a query that's running slowly. This query retrieves information about high-value transactions.

<b>Step 1:</b> Review the following query:

```
SELECT t.transaction_id, t.date_time, t.store_location, b.title, t.sale_price
FROM transactions t
JOIN customers c ON t.customer_id = c.customer_id
JOIN books b ON t.book_id = b.book_id
WHERE t.sale_price > 10
ORDER BY t.sale_price DESC;
```

<b>Step 2:</b> Run the query and observe its performance:

In [1]:
import sqlite3
import pandas as pd

# Setting up the database
from db_setup_2 import setup_database
setup_database() 

conn = sqlite3.connect('bookcycle.db')

query = """
SELECT t.transaction_id, t.date_time, t.store_location, b.title, t.sale_price
FROM transactions t
JOIN customers c ON t.customer_id = c.customer_id
JOIN books b ON t.book_id = b.book_id
WHERE t.sale_price > 10
ORDER BY t.sale_price DESC;
"""

df = pd.read_sql_query(query, conn)
display(df)

✅ Database setup complete: Tables created and populated with data!


| | transaction_id | date_time | store_location | title | sale_price |
|---|---|---|---|---|---|
| 0 | T1004 | 2023-01-15 11:30:15 | University | The Catcher in the Rye | 13.99 |
| 1 | T1032 | 2023-01-19 11:40:45 | University | Thus Spoke Zarathustra | 13.99 |
| 2 | T1039 | 2023-01-20 13:30:12 | University | The Catcher in the Rye | 13.99 |
| 3 | T1051 | 2023-01-23 09:15:22 | University | The Catcher in the Rye | 13.99 |
| 4 | T1062 | 2023-01-24 14:40:45 | University | Ulysses | 13.99 |
| ... | ... | ... | ... | ... | ... |
| 59 | T1079 | 2023-01-27 11:40:12 | Suburban | The Adventures of Sherlock Holmes | 10.99 |
| 60 | T1086 | 2023-01-28 14:25:22 | Downtown | Jane Eyre | 10.99 |
| 61 | T1090 | 2023-01-29 13:45:48 | Suburban | The Adventures of Sherlock Holmes | 10.99 |
| 62 | T1092 | 2023-01-30 09:35:45 | University | The Scarlet Letter | 10.99 |


💡 Tip: Note how long the query takes to run so that you have a baseline to compare against after optimizing.
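
The cell above does not record a timing, so here is a minimal sketch of one way to measure it with `time.perf_counter`. To stay self-contained it builds a tiny in-memory stand-in database with made-up rows; `demo_conn` and `time_query` are hypothetical names, not part of the lab's setup code. In the lab you would pass the existing `conn` instead.

```python
import sqlite3
import time

# Stand-in database (assumption): a tiny in-memory version of the three
# tables with made-up rows, so the sketch runs without bookcycle.db.
demo_conn = sqlite3.connect(":memory:")
demo_conn.executescript("""
CREATE TABLE customers (customer_id TEXT);
CREATE TABLE books (book_id TEXT, title TEXT);
CREATE TABLE transactions (
    transaction_id TEXT, date_time TEXT, store_location TEXT,
    customer_id TEXT, book_id TEXT, sale_price REAL
);
INSERT INTO customers VALUES ('C1');
INSERT INTO books VALUES ('B1', 'Ulysses'), ('B2', 'Jane Eyre');
INSERT INTO transactions VALUES
    ('T1', '2023-01-15 11:30:15', 'University', 'C1', 'B1', 13.99),
    ('T2', '2023-01-16 09:10:00', 'Downtown', 'C1', 'B2', 10.99),
    ('T3', '2023-01-17 14:45:30', 'Suburban', 'C1', 'B2', 8.50);
""")

# Same query as in the lab.
query = """
SELECT t.transaction_id, t.date_time, t.store_location, b.title, t.sale_price
FROM transactions t
JOIN customers c ON t.customer_id = c.customer_id
JOIN books b ON t.book_id = b.book_id
WHERE t.sale_price > 10
ORDER BY t.sale_price DESC;
"""


def time_query(connection, sql, repeats=5):
    """Run the query several times; return the rows and the fastest run in seconds."""
    best = float("inf")
    rows = []
    for _ in range(repeats):
        start = time.perf_counter()
        rows = connection.execute(sql).fetchall()
        best = min(best, time.perf_counter() - start)
    return rows, best


rows, seconds = time_query(demo_conn, query)
print(f"{len(rows)} rows, best of 5 runs: {seconds * 1000:.3f} ms")
```

Taking the fastest of several runs reduces noise from caching and other processes, which matters when the query finishes in well under a millisecond.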

### Activity 2: AI-Powered Query Optimization

Now, let's use AI to suggest optimizations for our query.

<b>Step 1:</b> Use an AI-powered coding assistant (such as ChatGPT, Gemini, or another AI tool of your choice) to analyze the query. Copy and paste the following into your chosen AI assistant and see what the AI suggests:

```
# AI Assistant, please analyze the following SQL query and suggest optimizations:
# [Paste the original query here]
```

<b>Step 2:</b> Review the AI's suggestions. They might include:
- Adding indexes on frequently used columns
- Rewriting the query to use subqueries or CTEs
- Suggesting changes to the database schema

<b>Step 3:</b> Implement one of the AI's suggestions. For example, if it recommends adding an index, create that index on the suggested column before rerunning the query.
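
As an illustration of what adding an index might look like, the sketch below creates one on `transactions.sale_price`, the column the original query filters and sorts on. The index name `idx_transactions_sale_price` is a hypothetical choice, and the in-memory table is a stand-in so the example runs on its own; your AI assistant may recommend a different column or a composite index.

```python
import sqlite3

# Stand-in table (assumption): only the columns needed for the illustration.
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute(
    "CREATE TABLE transactions (transaction_id TEXT, sale_price REAL)"
)

# Hypothetical index name; IF NOT EXISTS makes the cell safe to rerun.
demo_conn.execute(
    "CREATE INDEX IF NOT EXISTS idx_transactions_sale_price "
    "ON transactions (sale_price)"
)
demo_conn.commit()

# PRAGMA index_list returns one row per index; the name is the second field.
index_names = [
    row[1] for row in demo_conn.execute("PRAGMA index_list(transactions)")
]
print(index_names)
```

In the lab itself, the same `CREATE INDEX` statement can be run against the existing `conn`.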

<b>Step 4:</b> Run the optimized query and compare its performance to the original. Note: for small datasets, the performance improvement may be imperceptible.
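
One possible way to make the comparison concrete is to time the same statement before and after creating the index and confirm that the results are identical. The sketch below uses a synthetic 50,000-row table and a more selective filter (`sale_price > 14.9`) than the lab's query, both assumptions chosen so that any difference is easier to see; actual timings depend on your machine and data.

```python
import random
import sqlite3
import time

# Synthetic data (assumption): 50,000 made-up transactions in memory.
random.seed(0)
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute(
    "CREATE TABLE transactions (transaction_id TEXT, sale_price REAL)"
)
demo_conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    ((f"T{i}", round(random.uniform(1, 15), 2)) for i in range(50_000)),
)
demo_conn.commit()

# transaction_id is added to ORDER BY so ties come back in a fixed order.
sql = """
SELECT transaction_id, sale_price
FROM transactions
WHERE sale_price > 14.9
ORDER BY sale_price DESC, transaction_id
"""


def best_time(connection, statement, repeats=5):
    """Return the rows and the fastest of several runs, in seconds."""
    best = float("inf")
    rows = []
    for _ in range(repeats):
        start = time.perf_counter()
        rows = connection.execute(statement).fetchall()
        best = min(best, time.perf_counter() - start)
    return rows, best


rows_before, time_before = best_time(demo_conn, sql)

# Hypothetical index name.
demo_conn.execute(
    "CREATE INDEX idx_transactions_sale_price ON transactions (sale_price)"
)
rows_after, time_after = best_time(demo_conn, sql)

print(f"Before index: {time_before * 1000:.3f} ms")
print(f"After index:  {time_after * 1000:.3f} ms")
print("Same rows returned:", rows_before == rows_after)
```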

💡 Tip: Remember that AI suggestions should be critically evaluated and tested before implementation.

### Activity 3: Schema Optimization Analysis 
Let's use AI to analyze our overall database schema for potential improvements.

<b>Step 1:</b> Review the current schema:

In [2]:
# Display current schema
for table in ['transactions', 'books', 'customers']:
    print(f"Table: {table}")
    cursor = conn.execute(f"PRAGMA table_info({table})")
    for column in cursor.fetchall():
        print(f"  {column[1]} ({column[2]})")
    print()

Table: transactions
  transaction_id (TEXT)
  date_time (TEXT)
  store_location (TEXT)
  customer_id (TEXT)
  book_id (TEXT)
  sale_price (REAL)
  payment_method (TEXT)
  is_online (INTEGER)

Table: books
  book_id (TEXT)
  title (TEXT)
  author (TEXT)
  isbn (INTEGER)
  genre (TEXT)
  condition (TEXT)
  purchase_price (REAL)
  list_price (REAL)
  date_acquired (TEXT)
  current_location (TEXT)
  quantity (INTEGER)

Table: customers
  customer_id (TEXT)
  join_date (TEXT)
  is_member (INTEGER)
  zip_code (INTEGER)
  birth_year (INTEGER)
  preferred_store (TEXT)



<b>Step 2:</b> Ask the AI for schema optimization suggestions:

```
# AI Assistant, please analyze our current database schema and suggest optimizations:
# [Paste the schema information here]
```
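
If you would rather paste the full table definitions than the column listing, one option is to read the stored `CREATE` statements from SQLite's `sqlite_master` table. This is a sketch on a small stand-in database with made-up tables; in the lab you would run the `SELECT` against `conn` before closing it.

```python
import sqlite3

# Stand-in database (assumption): two simplified tables for illustration.
demo_conn = sqlite3.connect(":memory:")
demo_conn.executescript("""
CREATE TABLE customers (customer_id TEXT, zip_code INTEGER);
CREATE TABLE books (book_id TEXT, title TEXT, isbn INTEGER);
""")

# sqlite_master keeps the original CREATE text for tables and indexes.
# Automatically created indexes have a NULL sql value, so they are skipped.
ddl_rows = demo_conn.execute(
    "SELECT sql FROM sqlite_master "
    "WHERE type IN ('table', 'index') AND sql IS NOT NULL "
    "ORDER BY name"
).fetchall()

schema_text = ";\n\n".join(row[0] for row in ddl_rows) + ";"
print(schema_text)
```

The resulting `schema_text` can be pasted into the prompt in place of the placeholder.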

<b>Step 3:</b> Review and discuss the AI's suggestions. Consider:
- How do they align with database best practices?
- What are the potential impacts on query performance?
- Are there any suggestions that might not be suitable for BookCycle's needs?
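
As an example of checking a suggestion rather than accepting it, consider the column types shown in Step 1: `zip_code` and `isbn` are declared as `INTEGER`. An assistant may or may not flag this, but you can test the consequence yourself. The sketch below, using two hypothetical stand-in tables, shows how SQLite's type affinity treats a ZIP code with a leading zero.

```python
import sqlite3

# Hypothetical stand-in tables that differ only in the declared column type.
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute("CREATE TABLE customers_int (zip_code INTEGER)")
demo_conn.execute("CREATE TABLE customers_text (zip_code TEXT)")

# Insert the same text value into both tables.
demo_conn.execute("INSERT INTO customers_int VALUES (?)", ("02139",))
demo_conn.execute("INSERT INTO customers_text VALUES (?)", ("02139",))

as_integer = demo_conn.execute(
    "SELECT zip_code, typeof(zip_code) FROM customers_int"
).fetchone()
as_text = demo_conn.execute(
    "SELECT zip_code, typeof(zip_code) FROM customers_text"
).fetchone()

print(as_integer)  # the INTEGER column converts the text to a number
print(as_text)     # the TEXT column keeps the value as entered
```

Whether this matters for BookCycle depends on the ZIP codes and ISBNs actually stored, which is the kind of question Step 3 asks you to weigh.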

<b>Step 4:</b> Close the connection: 

In [3]:
# Close the connection
conn.close()

### ⚙️ Test Your Work:
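Verify that any schema changes suggested by the AI don't break existing functionality. For example, rerun the original query after each change and confirm that it still returns the same rows.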
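
One hedged way to run this check without touching the real database is to apply each proposed change to an in-memory copy and compare query results. The sketch below uses `sqlite3.Connection.backup` to make the copy; the helper names, the sample rows, and the two example changes (an index and a column rename) are all hypothetical. The rename uses `ALTER TABLE ... RENAME COLUMN`, which requires SQLite 3.25 or later.

```python
import sqlite3

# A simplified version of the lab query (assumption), with a tie-breaker
# in ORDER BY so row order is deterministic.
QUERY = """
SELECT transaction_id, date_time, sale_price
FROM transactions
WHERE sale_price > 10
ORDER BY sale_price DESC, transaction_id
"""


def make_copy(source):
    """Copy a database into a new in-memory connection."""
    copy = sqlite3.connect(":memory:")
    source.backup(copy)
    return copy


def query_still_matches(original, candidate, sql):
    """Return True if the query gives the same rows on both databases."""
    expected = original.execute(sql).fetchall()
    try:
        return candidate.execute(sql).fetchall() == expected
    except sqlite3.Error as error:
        print(f"Query failed after the change: {error}")
        return False


# Stand-in for bookcycle.db with made-up rows.
original = sqlite3.connect(":memory:")
original.executescript("""
CREATE TABLE transactions (transaction_id TEXT, date_time TEXT, sale_price REAL);
INSERT INTO transactions VALUES
    ('T1', '2023-01-15 11:30:15', 13.99),
    ('T2', '2023-01-16 09:10:00', 8.50);
""")

# Change 1: adding an index should not alter results.
with_index = make_copy(original)
with_index.execute(
    "CREATE INDEX idx_transactions_sale_price ON transactions (sale_price)"
)
index_ok = query_still_matches(original, with_index, QUERY)

# Change 2: renaming a column the query depends on should be caught.
renamed = make_copy(original)
renamed.execute("ALTER TABLE transactions RENAME COLUMN date_time TO sold_at")
rename_ok = query_still_matches(original, renamed, QUERY)

print("Index change keeps results:", index_ok)
print("Rename change keeps results:", rename_ok)
```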

## ✅ Success Checklist
- You've successfully run and analyzed the original query
- You've received and implemented at least one AI-suggested optimization
- You've critically evaluated AI suggestions for schema optimization

## 🔍 Common Issues & Solutions 

- Problem: AI suggests an optimization that doesn't improve performance
    - Solution: Remember that AI suggestions are not always perfect. Always test and validate suggestions before implementing them in a production environment.

- Problem: Implementing an index doesn't significantly improve query performance 
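    - Solution: Ensure that the index is on the correct column(s) and that the query actually uses it. You can run `EXPLAIN QUERY PLAN` to check whether the index is being used. On small tables, a full scan may already be fast enough that an index makes no measurable difference.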
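
The `EXPLAIN QUERY PLAN` check mentioned above can be sketched as follows. The table is an in-memory stand-in and `idx_transactions_sale_price` is a hypothetical index name. The exact wording of the plan text varies between SQLite versions, so the sketch only looks for the index name in the plan's detail column.

```python
import sqlite3

# Stand-in table and hypothetical index for illustration.
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute(
    "CREATE TABLE transactions (transaction_id TEXT, sale_price REAL)"
)
demo_conn.execute(
    "CREATE INDEX idx_transactions_sale_price ON transactions (sale_price)"
)

# Prefix the statement with EXPLAIN QUERY PLAN instead of running it.
plan_rows = demo_conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT transaction_id, sale_price FROM transactions "
    "WHERE sale_price > 10 ORDER BY sale_price DESC"
).fetchall()

# Each plan row has four fields; the last one is a human-readable description.
plan_details = [row[3] for row in plan_rows]
for detail in plan_details:
    print(detail)

uses_index = any("idx_transactions_sale_price" in d for d in plan_details)
print("Index used:", uses_index)
```

If the index name does not appear in the plan, the query is most likely scanning the table, and the index may be on the wrong column or not useful for that filter.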


## ➡️ Summary
By completing this lab, you've practiced using AI to analyze SQL queries, identify indexing opportunities, and review a database schema. You've also seen why each AI suggestion needs to be tested and evaluated before you adopt it.

### 🔑 Key Points
- AI can provide valuable insights for query and schema optimization
- Always critically evaluate and test AI suggestions
- Indexing and query rewriting are common optimization techniques
- Schema design impacts query performance and should be regularly reviewed