## 1. Import and Setup

Let's start by importing all necessary libraries. In notebook style, we can run this cell first and see any import errors immediately.

In [None]:
# Essential imports for our NL query system
import sqlite3
import json
import pandas as pd
import os

print("✅ Basic libraries imported successfully!")

In [None]:
# LangChain imports for AI functionality
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import FAISS
from langchain.chat_models import ChatOpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

print("✅ LangChain libraries imported successfully!")

## 2. Environment Configuration

**⚠️ SECURITY NOTE**: Never put API keys directly in notebooks that might be shared!
Let's set up environment variables properly.

In [None]:
# Check if API key is already set in environment
if "OPENAI_API_KEY" in os.environ:
    print("✅ OpenAI API key found in environment")
    api_key_status = "Set"
else:
    print("⚠️ OpenAI API key not found in environment")
    api_key_status = "Not Set"
    
print(f"API Key Status: {api_key_status}")
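
When confirming which key is loaded, it helps to avoid echoing the secret itself into the notebook output. The following is a minimal sketch, assuming a hypothetical helper `mask_secret` (not part of any library) that hides everything except the last four characters:

```python
import os

def mask_secret(value: str, visible: int = 4) -> str:
    """Return the secret with all but the last `visible` characters hidden."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]

# Hypothetical usage: report the key's presence without revealing it
api_key = os.environ.get("OPENAI_API_KEY")
if api_key:
    print(f"API key loaded: {mask_secret(api_key)}")
else:
    print("API key not set")
```

Because notebook outputs are saved with the file, printing a masked value keeps the key out of any shared `.ipynb`.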

In [None]:
# Prompt for the API key if it is not already set (this session only).
# Never hardcode the key in the notebook; revoke any key that was ever committed or shared.
import getpass

if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter your OpenAI API key: ")

print("✅ API key set for this session")

## 3. OpenAI Client Setup

Now let's initialize the OpenAI client and test the connection.

In [None]:
from openai import OpenAI

# Initialize OpenAI client
client = OpenAI()
print("✅ OpenAI client initialized")

# Test the client with a simple request
print("Testing OpenAI connection...")

## 4. Testing OpenAI Response

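**Note**: The original call had a few errors, which the cell below corrects:
- `client.response.create` is not a method of the client; this cell uses `client.chat.completions.create`
- `GPT-5 mini` is not a valid model identifier as written; this cell uses `gpt-4o-mini` (`gpt-3.5-turbo` is another option)
- The answer text is read from `response.choices[0].message.content`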

In [None]:
# Fixed version of your OpenAI call
try:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # Fixed model name
        messages=[
            {"role": "user", "content": "How much gold would it take to coat the Statue of Liberty in a 1mm layer?"}
        ],
        max_tokens=200
    )
    
    # Extract and display the response
    answer = response.choices[0].message.content
    print("🤖 OpenAI Response:")
    print(answer)
    
except Exception as e:
    print(f"❌ Error calling OpenAI: {e}")

## 5. Database Setup (SQLite)

Let's set up a sample database to demonstrate the NL query functionality.

In [None]:
# Create a sample database for testing
db_path = "sample_database.db"
conn = sqlite3.connect(db_path)
cursor = conn.cursor()

# Create a sample table
cursor.execute('''
CREATE TABLE IF NOT EXISTS employees (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    department TEXT,
    salary REAL,
    hire_date DATE
)
''')

# Insert sample data
sample_data = [
    (1, 'Alice Johnson', 'Engineering', 75000, '2023-01-15'),
    (2, 'Bob Smith', 'Marketing', 65000, '2023-02-20'),
    (3, 'Charlie Brown', 'Engineering', 80000, '2022-11-10'),
    (4, 'Diana Prince', 'HR', 70000, '2023-03-05'),
    (5, 'Eve Wilson', 'Engineering', 85000, '2022-08-12')
]

cursor.executemany('INSERT OR REPLACE INTO employees VALUES (?, ?, ?, ?, ?)', sample_data)
conn.commit()

print("✅ Sample database created with employee data")
print(f"Database location: {db_path}")

In [None]:
# Verify the data was inserted
df = pd.read_sql_query("SELECT * FROM employees", conn)
print("📊 Sample Data:")
print(df)
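
Once the table is populated, queries that take user-supplied values are safer with `?` placeholders than with string formatting. The sketch below uses a separate in-memory database and a hypothetical helper `average_salary` (both are assumptions for illustration, not part of the notebook above) so it can run on its own:

```python
import sqlite3

# In-memory database so this sketch does not touch sample_database.db
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute(
    "CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT NOT NULL, "
    "department TEXT, salary REAL, hire_date DATE)"
)
demo_conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Alice Johnson", "Engineering", 75000, "2023-01-15"),
        (2, "Bob Smith", "Marketing", 65000, "2023-02-20"),
        (3, "Charlie Brown", "Engineering", 80000, "2022-11-10"),
    ],
)

def average_salary(conn, department):
    """Average salary for one department, or None if it has no rows."""
    row = conn.execute(
        "SELECT AVG(salary) FROM employees WHERE department = ?",
        (department,),  # bound as a parameter, never interpolated into the SQL
    ).fetchone()
    return row[0]

engineering_avg = average_salary(demo_conn, "Engineering")
missing_avg = average_salary(demo_conn, "Legal")
print(engineering_avg, missing_avg)
demo_conn.close()
```

Binding values this way matters for an NL query system in particular, since parts of the query may originate from user text.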

## 6. Vector Database Setup (FAISS)

Now let's set up FAISS for semantic search capabilities.

In [None]:
# Initialize OpenAI embeddings
embeddings = OpenAIEmbeddings()

# Create sample text documents about our database schema
schema_docs = [
    "employees table contains employee information",
    "employees table has columns: id, name, department, salary, hire_date",
    "department column contains values like Engineering, Marketing, HR",
    "salary column contains numeric salary values",
    "hire_date column contains employment start dates"
]

# Create FAISS vector store
try:
    vectorstore = FAISS.from_texts(schema_docs, embeddings)
    print("✅ FAISS vector store created successfully")
    print(f"Number of documents indexed: {len(schema_docs)}")
except Exception as e:
    print(f"❌ Error creating vector store: {e}")
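
To build intuition for what a vector store does, here is a toy sketch of similarity search that needs no API key. It is an assumption-laden stand-in, not how FAISS or OpenAI embeddings work internally: `toy_embed` (hypothetical) uses plain word counts instead of learned embeddings, so it only matches shared words, whereas real embeddings also capture meaning.

```python
import math
from collections import Counter

def toy_embed(text):
    """Toy stand-in for an embedding: lowercase word counts."""
    cleaned = text.lower().replace(",", " ").replace(":", " ")
    return Counter(cleaned.split())

def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def toy_similarity_search(docs, query, k=2):
    """Return the k documents whose vectors are closest to the query vector."""
    query_vec = toy_embed(query)
    ranked = sorted(
        docs,
        key=lambda doc: cosine_similarity(toy_embed(doc), query_vec),
        reverse=True,
    )
    return ranked[:k]

docs = [
    "employees table contains employee information",
    "salary column contains numeric salary values",
    "hire_date column contains employment start dates",
]
top = toy_similarity_search(docs, "numeric salary values", k=1)
print(top)
```

The real vector store follows the same outline (embed the documents, embed the query, rank by closeness) with far richer vectors and an index built for speed.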

## 7. Interactive Testing

This is where notebooks shine! You can now test different queries and see results immediately.

In [None]:
# Test semantic search
query = "show me information about employee salaries"
try:
    results = vectorstore.similarity_search(query, k=2)
    print(f"🔍 Query: '{query}'")
    print("📝 Most relevant schema information:")
    for i, doc in enumerate(results, 1):
        print(f"{i}. {doc.page_content}")
except Exception as e:
    print(f"❌ Error in similarity search: {e}")
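
One possible next step is to combine the retrieved schema snippets with the user's question into a prompt asking the model for SQL. The sketch below is only an illustration: `build_sql_prompt` and the prompt wording are assumptions, and in the notebook the snippets would come from `doc.page_content` of the search results rather than a hardcoded list.

```python
def build_sql_prompt(question, schema_snippets):
    """Combine retrieved schema snippets and a question into one prompt string."""
    schema_block = "\n".join(f"- {snippet}" for snippet in schema_snippets)
    return (
        "You write SQLite queries.\n"
        "Relevant schema information:\n"
        f"{schema_block}\n"
        f"Question: {question}\n"
        "Return a single SQL query and nothing else."
    )

# Hardcoded here so the sketch runs without an API key
snippets = [
    "employees table has columns: id, name, department, salary, hire_date",
    "salary column contains numeric salary values",
]
prompt = build_sql_prompt("What is the average salary per department?", snippets)
print(prompt)
```

The resulting string could be passed as the `content` of a user message to `client.chat.completions.create`. Any SQL the model returns should be reviewed before it is executed against a real database.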

## 8. Cleanup

Don't forget to close database connections!

In [None]:
# Close database connection
conn.close()
print("✅ Database connection closed")
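
In a script, where cells are not run by hand, the close call is easy to skip when an exception occurs. A minimal sketch using `contextlib.closing` from the standard library, with an in-memory database and a throwaway table `t` chosen purely for illustration:

```python
import sqlite3
from contextlib import closing

# closing() calls demo_conn.close() when the block exits, even after an exception
with closing(sqlite3.connect(":memory:")) as demo_conn:
    demo_conn.execute("CREATE TABLE t (x INTEGER)")
    demo_conn.execute("INSERT INTO t VALUES (1)")
    row_count = demo_conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]

# Using the connection after the block raises sqlite3.ProgrammingError
try:
    demo_conn.execute("SELECT 1")
    closed = False
except sqlite3.ProgrammingError:
    closed = True

print(row_count, closed)
```

Note that `with sqlite3.connect(...) as conn:` on its own manages the transaction (commit or rollback) but does not close the connection, which is why `closing` is used here.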

## Summary: Script vs Notebook Execution

### Python Script (`.py` file):
- ✅ Runs all code at once
- ✅ Good for production/final code
- ❌ Hard to debug
- ❌ Can't inspect intermediate results
- ❌ Must re-run everything to test changes

### Jupyter Notebook (`.ipynb` file):
- ✅ Run cell by cell
- ✅ See intermediate results
- ✅ Easy debugging and experimentation
- ✅ Variables persist between cells
- ✅ Can add documentation with Markdown
- ✅ Perfect for data science and AI development

**Recommendation**: Use notebooks for development and experimentation, then convert to scripts for production deployment!