# SUTRA - Complete Usage Guide
## Natural Language to SQL with Visualization

This notebook walks through the SUTRA library step by step: installation, data upload, direct SQL, natural language queries, visualization, export, and feedback.

## Step 1: Install SUTRA

First, install the library using pip:

In [None]:
# Run this in your terminal or uncomment to run here
# !pip install sutra

## Step 2: Import and Initialize

Import SUTRA and initialize with your OpenAI API key:

In [None]:
from sutra import Sutra

# Initialize SUTRA with your OpenAI API key
sutra = Sutra(api_key="your-openai-api-key-here")

# Alternative: Set API key as environment variable
# import os
# os.environ['OPENAI_API_KEY'] = 'your-api-key'
# sutra = Sutra()
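
Hard-coding a key in a notebook makes it easy to leak when the notebook is shared. Below is a minimal sketch of reading the key from the environment with only the standard library. The helper name `resolve_api_key` is hypothetical and not part of SUTRA; `OPENAI_API_KEY` is the variable name used in the cell above.

```python
import os


def resolve_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in `env_var`, or raise a clear error.

    Hypothetical helper for this guide; not part of the SUTRA library.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set; export it in your shell before starting the notebook."
        )
    return key


# Usage (assumes the variable is already set in your environment):
# sutra = Sutra(api_key=resolve_api_key())
```

Failing early with a named variable tends to be easier to debug than an authentication error raised later by the first query.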

## Step 3: Upload Data

Upload your data files. The examples below use CSV, Excel, JSON, and Parquet; other formats may also be supported.

In [None]:
# Upload CSV file
sutra.upload_data("sales_data.csv")

# Upload Excel file with custom table name
sutra.upload_data("products.xlsx", table_name="products")

# Upload JSON file
sutra.upload_data("customers.json")

# You can upload any supported format!
# sutra.upload_data("data.parquet")
# sutra.upload_data("report.pdf")  # If PDF parsing is supported
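
The upload calls above assume the files already exist next to the notebook. To try the guide without real data, a small sample file can be generated with the standard library. The column names (`region`, `product`, `revenue`) and the values are invented for illustration; they are an assumption, not a schema SUTRA requires.

```python
import csv
from pathlib import Path

# Invented sample rows; replace with your own data.
SAMPLE_ROWS = [
    {"region": "North", "product": "Widget", "revenue": 1200.0},
    {"region": "North", "product": "Gadget", "revenue": 800.0},
    {"region": "South", "product": "Widget", "revenue": 950.0},
    {"region": "South", "product": "Gadget", "revenue": 400.0},
]


def write_sample_csv(path: str = "sales_data.csv") -> Path:
    """Write SAMPLE_ROWS to `path` as CSV and return the path."""
    target = Path(path)
    with target.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["region", "product", "revenue"])
        writer.writeheader()
        writer.writerows(SAMPLE_ROWS)
    return target


# Usage:
# sample_path = write_sample_csv()
# sutra.upload_data(str(sample_path))
```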

## Step 4: View Available Tables

Check what tables are available in your database:

In [None]:
# List all tables
tables = sutra.list_tables()

# Get detailed info about a specific table
sutra.show_table_info("sales_data")

## Step 5: View Database Schema

See the complete structure of your database:

In [None]:
# Get the full database schema
schema = sutra.get_schema()
print(schema)

## Step 6: Direct SQL Queries (Without API)

If you already know SQL, run queries directly; no OpenAI API call is made:

In [None]:
# Execute a direct SQL query (no API call needed)
result = sutra.direct_sql("SELECT * FROM sales_data LIMIT 10")

# Display the results
print(result['data'])

# SQL query with visualization
result = sutra.direct_sql(
    "SELECT region, SUM(revenue) as total_revenue FROM sales_data GROUP BY region",
    visualize=True
)

# Show the data
print(result['data'])

# Show the visualization
if 'visualization' in result:
    result['visualization'].show()
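
To see what the aggregate query above computes independently of SUTRA, the same SQL can be run against an in-memory SQLite database from the standard library. This only illustrates the SQL: whether SUTRA itself uses SQLite is not stated in this guide, and the rows are invented.

```python
import sqlite3

# Throwaway in-memory database with a few invented rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_data (region TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO sales_data (region, revenue) VALUES (?, ?)",
    [("North", 1200.0), ("North", 800.0), ("South", 950.0), ("South", 400.0)],
)

# Same aggregate as above; ORDER BY is added only to make the row order stable.
totals = conn.execute(
    "SELECT region, SUM(revenue) AS total_revenue "
    "FROM sales_data GROUP BY region ORDER BY region"
).fetchall()
conn.close()

for region, total_revenue in totals:
    print(region, total_revenue)
```

Each region appears once in the output, with its revenue rows summed into a single total.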

## Step 7: Natural Language Queries

Ask questions in plain English; SUTRA generates the SQL and runs it:

In [None]:
# Simple query with automatic visualization
result = sutra.query("What are the top 5 products by revenue?")

# View the generated SQL
print("Generated SQL:", result['sql'])

# View the data
print("\nResults:")
print(result['data'])

# Display visualization
if 'visualization' in result:
    result['visualization'].show()

In [None]:
# More complex queries
result = sutra.query("Show me monthly sales trends for the last year")
if 'visualization' in result:
    result['visualization'].show()

In [None]:
# Query without visualization
result = sutra.query(
    "What is the average order value by customer segment?",
    visualize=False
)
print(result['data'])

## Step 8: Visualization Options

Use the `visualize` parameter to control whether a visualization is created:

In [None]:
# WITH visualization (default)
result = sutra.query("Compare sales across regions", visualize=True)
if 'visualization' in result:
    result['visualization'].show()

# WITHOUT visualization
result = sutra.query("Compare sales across regions", visualize=False)
print(result['data'])
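
Because a result may not include a figure (for example with `visualize=False`), a small helper can centralize the check. This sketch assumes the result is a dict-like mapping with an optional `'visualization'` entry that exposes `.show()`, as in the cells above; `show_if_present` is a hypothetical name, not a SUTRA function.

```python
from typing import Any, Mapping


def show_if_present(result: Mapping[str, Any]) -> bool:
    """Display the figure stored under 'visualization', if any.

    Returns True when a figure was shown, False otherwise.
    """
    figure = result.get("visualization")
    if figure is None:
        return False
    figure.show()
    return True


# Usage (assumes `result` came from sutra.query or sutra.direct_sql):
# show_if_present(result)
```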

## Step 9: Export Results

Save your query results to files:

In [None]:
# Run a query
result = sutra.query("Show me all sales data")

# Export to CSV
sutra.export_results(result, "sales_export.csv", format='csv')

# Export to Excel
sutra.export_results(result, "sales_export.xlsx", format='excel')

# Export to JSON
sutra.export_results(result, "sales_export.json", format='json')
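
In each call above the `format` argument mirrors the file extension, so it can be derived instead of repeated. The sketch below covers only the three format strings shown in this guide (`'csv'`, `'excel'`, `'json'`); `infer_format` is a hypothetical helper, not part of SUTRA.

```python
from pathlib import Path

# Extension -> format string, limited to the three formats shown above.
_FORMATS = {".csv": "csv", ".xlsx": "excel", ".json": "json"}


def infer_format(filename: str) -> str:
    """Return the export format implied by the file extension."""
    suffix = Path(filename).suffix.lower()
    try:
        return _FORMATS[suffix]
    except KeyError:
        raise ValueError(f"No known export format for {suffix!r}") from None


# Usage (assumes `sutra` and `result` from the cell above):
# name = "sales_export.xlsx"
# sutra.export_results(result, name, format=infer_format(name))
```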

## Step 10: Provide Feedback

Tell SUTRA whether the generated SQL was correct; if it was not, supply the corrected SQL:

In [None]:
# If the generated SQL was correct
sutra.provide_feedback(
    query="Show me total sales",
    sql="SELECT SUM(amount) FROM sales",
    is_correct=True
)

# If the generated SQL was incorrect, provide the correct version
sutra.provide_feedback(
    query="Show me average sales per customer",
    sql="SELECT AVG(amount) FROM sales",
    is_correct=False,
    correct_sql="SELECT customer_id, AVG(amount) FROM sales GROUP BY customer_id"
)
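
The difference between the two statements in the second feedback call can be checked on a few invented rows with the standard-library `sqlite3` module: the first returns one overall average, while the corrected query returns one average per customer. The table contents are made up for illustration.

```python
import sqlite3

# Throwaway in-memory table with invented rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO sales (customer_id, amount) VALUES (?, ?)",
    [(1, 10.0), (1, 30.0), (2, 50.0)],
)

# Generated (incorrect) SQL: a single average over every row.
overall = conn.execute("SELECT AVG(amount) FROM sales").fetchall()

# Corrected SQL: one average per customer (ORDER BY added for stable output).
per_customer = conn.execute(
    "SELECT customer_id, AVG(amount) FROM sales "
    "GROUP BY customer_id ORDER BY customer_id"
).fetchall()
conn.close()

print(overall)
print(per_customer)
```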

## Complete Example Workflow

In [None]:
from sutra import Sutra

# 1. Initialize
sutra = Sutra(api_key="your-openai-api-key")

# 2. Upload data
sutra.upload_data("sales.csv")
sutra.upload_data("customers.xlsx")

# 3. Check available tables
sutra.list_tables()

# 4. Direct SQL query (no API call)
result = sutra.direct_sql("SELECT COUNT(*) FROM sales")
print(f"Total sales records: {result['data'].iloc[0, 0]}")

# 5. Natural language query with visualization
result = sutra.query(
    "What are the top 10 customers by total purchase amount?",
    visualize=True
)

# 6. Display results
print(result['data'])
if 'visualization' in result:
    result['visualization'].show()

# 7. Export results
sutra.export_results(result, "top_customers.csv")

# 8. Close connection
sutra.close()
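
If any step in the workflow raises, the final `sutra.close()` is never reached. One way to guarantee cleanup is `contextlib.closing`, which calls `.close()` on exit for any object that has that method. The sketch uses a stand-in class so it runs without SUTRA; `FakeClient` is hypothetical.

```python
from contextlib import closing


class FakeClient:
    """Stand-in for a client object that exposes close()."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


client = FakeClient()
try:
    with closing(client) as session:
        # Simulate a query or upload failing partway through the workflow.
        raise ValueError("simulated failure mid-workflow")
except ValueError:
    pass

# close() ran even though the body raised.
print(client.closed)
```

With SUTRA the same pattern would read `with closing(Sutra(api_key=...)) as sutra:`, relying only on the `close()` method shown in step 8 above.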

## Advanced Usage Examples

In [None]:
# Multiple file uploads
files = ["sales_2023.csv", "sales_2024.csv", "products.xlsx", "customers.json"]
for file in files:
    sutra.upload_data(file)

# Complex analytical queries
queries = [
    "Show me year-over-year growth by product category",
    "What is the customer retention rate?",
    "Which products have the highest profit margins?",
    "Show me sales distribution by geographic region",
    "What are the seasonal trends in our sales data?"
]

for q in queries:
    result = sutra.query(q, visualize=True)
    print(f"\n{q}")
    print(result['data'].head())
    if 'visualization' in result:
        result['visualization'].show()
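
In a batch like the one above, a single failing question stops the whole loop. Below is a sketch of a wrapper that records failures and continues. `run_batch` is a hypothetical helper, and it accepts any callable so it can be tried without an API key.

```python
from typing import Any, Callable, Dict, Iterable, Tuple


def run_batch(
    ask: Callable[[str], Any], questions: Iterable[str]
) -> Tuple[Dict[str, Any], Dict[str, str]]:
    """Call `ask` on each question; return (results, errors) keyed by question."""
    results: Dict[str, Any] = {}
    errors: Dict[str, str] = {}
    for question in questions:
        try:
            results[question] = ask(question)
        except Exception as exc:  # keep going; record what failed
            errors[question] = f"{type(exc).__name__}: {exc}"
    return results, errors


# Usage with SUTRA (assumes `sutra` and `queries` from the cell above):
# results, errors = run_batch(lambda q: sutra.query(q, visualize=True), queries)
```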

## Tips and Best Practices

1. **API Key**: Store your API key in an environment variable rather than hard-coding it in the notebook
2. **Data Upload**: Upload related data files together for better query understanding
3. **Table Names**: Use descriptive table names when uploading data
4. **Direct SQL**: Use `direct_sql` for queries you can write yourself; it makes no API call, so it incurs no API cost
5. **Feedback**: Provide feedback to improve query accuracy over time
6. **Visualization**: Pass `visualize=True` when a chart helps interpret the result, and `visualize=False` when you only need the data
7. **Export**: Export important results for sharing or further analysis