In [2]:
import os
from dotenv import load_dotenv
from langchain_groq import ChatGroq
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

In [3]:
# Load the API key from the .env file

load_dotenv()
groq_api_key = os.getenv("GROQ_API_KEY")

if not groq_api_key:
    raise ValueError("GROQ_API_KEY not found in environment variables.")

In [8]:
# Initialize the LLM (ChatGroq reads GROQ_API_KEY from the environment)

llm = ChatGroq(
    model="openai/gpt-oss-120b",
    temperature=0.3
)

In [18]:
prompt = ChatPromptTemplate.from_template("""
Role: You are an expert-level AI assistant with strong knowledge of databases, data modeling, and query logic.

Objective: Analyze the database structure provided in the user's question and generate answers that are accurate, structured, and directly based on that structure.

Instructions:

1. Extract and Understand the Database Structure
   - Carefully read the database schema, table names, column names, relationships, and constraints provided in the question.
   - Use this information as the foundation for your answers.

2. Answer Using the Database Context
   - Write queries, explanations, or calculations strictly based on the given schema.
   - Avoid assuming fields, relationships, or data types that are not provided.
   - Include table and column references explicitly where relevant.

3. Structure and Clarity
   - Organize answers with headings, bullet points, or numbered lists where appropriate.
   - Provide step-by-step reasoning for queries, joins, or calculations derived from the schema.
   - Label query outputs or logical steps clearly for easy understanding.

4. Handle Ambiguity
   - If the question lacks detail, clearly state your assumptions before providing the answer.
   - Offer alternatives if multiple interpretations are possible, using the database structure as context.

5. Accuracy and Completeness
   - Ensure that all answers are technically correct according to the provided database schema.
   - Avoid adding extra information or assumptions beyond the schema unless explicitly stated.

{question}
""")

qa_chain = prompt | llm | StrOutputParser()
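
The prompt expects the database structure to arrive inside `{question}`, so it may help to assemble the schema and the request into a single string before invoking the chain. The following is a minimal sketch; the helper `build_question`, the section labels, and the `Sales` schema are hypothetical and not part of the notebook above.

```python
def build_question(schema: str, request: str) -> str:
    """Combine a schema description and a request into one question string.

    Hypothetical helper: the labels "Database structure" and "Request"
    are illustrative choices, not something the prompt requires.
    """
    return (
        f"Database structure:\n{schema.strip()}\n\n"
        f"Request:\n{request.strip()}"
    )


# Assumed example schema, used for illustration only.
schema = """
CREATE TABLE Sales (
    sale_id       INTEGER PRIMARY KEY,
    product_id    INTEGER,
    quantity_sold INTEGER,
    total_price   REAL
);
"""

question_text = build_question(
    schema,
    "List sale_id, product_id and total_price for sales with quantity_sold > 4.",
)
print(question_text)
```

The resulting string could then be passed as `qa_chain.invoke({"question": question_text})`. Supplying the schema explicitly should reduce how much the model has to assume about table and column names.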

In [20]:
# Ask a question and print the answer

question = input("💬 Ask your question: ")

try:
    answer = qa_chain.invoke({"question": question})
    print("\n🧠 Answer:\n")
    print(answer)

except Exception as e:
    print(f"\n❌ Error: {e}")


🧠 Answer:

**Assumptions (based on the description)**  

| Table | Relevant Columns |
|-------|------------------|
| **Sales** | `sale_id` (primary key), `product_id` (foreign key), `quantity_sold` (integer), `total_price` (numeric/decimal) |

If the actual column names differ, replace them with the exact names from your schema.

---

## SQL Query

```sql
SELECT
    sale_id,
    product_id,
    total_price
FROM
    Sales
WHERE
    quantity_sold > 4;
```

### Explanation of the query
1. **SELECT clause** – Retrieves the three requested columns: `sale_id`, `product_id`, and `total_price`.
2. **FROM clause** – Indicates the source table is `Sales`.
3. **WHERE clause** – Filters rows to include only those where the `quantity_sold` column has a value greater than 4.
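
One way to check the filter behavior described above is to run the query against a small in-memory SQLite table. This is only a sketch: the rows are made up for illustration, `total_price` is declared as `REAL` for simplicity, and an `ORDER BY sale_id` is added so the output order is deterministic.

```python
import sqlite3

# Hypothetical in-memory table using the assumed Sales columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Sales (
        sale_id       INTEGER PRIMARY KEY,
        product_id    INTEGER,
        quantity_sold INTEGER,
        total_price   REAL
    )
    """
)

# Made-up rows: quantities 5 and 10 pass the filter, 3 and 4 do not.
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?, ?, ?)",
    [
        (1, 101, 5, 249.95),
        (2, 102, 3, 89.97),
        (3, 103, 4, 40.50),
        (4, 101, 10, 499.90),
    ],
)

rows = conn.execute(
    """
    SELECT sale_id, product_id, total_price
    FROM Sales
    WHERE quantity_sold > 4
    ORDER BY sale_id
    """
).fetchall()
conn.close()

for row in rows:
    print(row)
```

Because the comparison is strictly greater than 4, the row with `quantity_sold = 4` is excluded; only sales 1 and 4 are returned.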

---

### Result

The query returns a result set with one row per sale that meets the condition, showing:

| sale_id | product_id | total_price |
|---------|------------|-------------|
| …       | …          | …           |