In [1]:
%%capture
%pip install -U transformers accelerate

In [2]:
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
import torch


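# Local Kaggle path to the Llama 3.2 3B Instruct weights
base_model = "/kaggle/input/llama-3.2/transformers/3b-instruct/1"

tokenizer = AutoTokenizer.from_pretrained(base_model)

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    return_dict=True,
    low_cpu_mem_usage=True,  # avoid holding a second full copy of the weights in CPU RAM
    torch_dtype=torch.float16,  # half precision to reduce GPU memory use
    device_map="auto",  # let accelerate place the weights on the available device(s)
    trust_remote_code=True,
)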

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

In [3]:

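# Fall back to the EOS token when no padding token is defined
if tokenizer.pad_token_id is None:
    tokenizer.pad_token_id = tokenizer.eos_token_id
if model.config.pad_token_id is None:
    # config.eos_token_id can be a list of ids, so reuse the tokenizer's single pad id
    model.config.pad_token_id = tokenizer.pad_token_id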

In [4]:
# The model was already loaded in float16 with device_map="auto",
# so the pipeline reuses it without its own dtype or device settings
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)

Device set to use cuda:0


In [5]:
messages = [{"role": "user", "content": "Who is Vincent van Gogh?"}]

prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

outputs = pipe(prompt, max_new_tokens=120, do_sample=True)

print(outputs[0]["generated_text"])

<|begin_of_text|><|start_header_id|>user<|end_header_id|>

Who is Vincent van Gogh?<|eot_id|><|start_header_id|>assistant<|end_header_id|>

Vincent van Gogh (1853-1890) was a Dutch post-impressionist artist, widely considered one of the most influential and iconic figures in the history of art. He is known for his bold, expressive, and emotionally charged paintings that captured the beauty of the natural world and the human experience.

**Early Life and Career**

Van Gogh was born in Groot-Zundert, Netherlands, to a family of Protestant ministers. He was the eldest of six children, and his early life was marked by a series of struggles with mental health and financial instability. He worked as an
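
The printed text above contains the full prompt, including the Llama 3 header tokens, because the text-generation pipeline returns prompt plus completion by default. The answer also stops mid-sentence because generation was capped at `max_new_tokens=120`. Passing `return_full_text=False` in the `pipe(...)` call is one way to get only the completion. Alternatively, the string can be post-processed. The following is a minimal sketch of that second option; the helper name `extract_assistant_reply` and the shortened sample string are illustrative, not part of the notebook.

```python
ASSISTANT_HEADER = "<|start_header_id|>assistant<|end_header_id|>"
END_OF_TURN = "<|eot_id|>"


def extract_assistant_reply(generated_text: str) -> str:
    """Return the text after the last assistant header, without special tokens."""
    # rpartition keeps everything after the final assistant header; if the
    # header is missing, the whole string is kept.
    _, _, reply = generated_text.rpartition(ASSISTANT_HEADER)
    # Keep only the text before the first end-of-turn marker, if there is one.
    reply = reply.split(END_OF_TURN)[0]
    return reply.strip()


# Shortened stand-in for outputs[0]["generated_text"]
sample = (
    "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n"
    "Who is Vincent van Gogh?<|eot_id|>"
    "<|start_header_id|>assistant<|end_header_id|>\n\n"
    "Vincent van Gogh (1853-1890) was a Dutch post-impressionist artist."
)
print(extract_assistant_reply(sample))
```

In the notebook, the same helper could be applied to `outputs[0]["generated_text"]` before printing or rendering it.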


In [6]:
from IPython.display import Markdown, display

messages = [
    {
        "role": "system",
        "content": "You are a skilled Python developer specializing in database management and optimization.",
    },
    {
        "role": "user",
        "content": "I'm experiencing a sorting issue in my database. Could you please provide Python code to help resolve this problem?",
    },
]

prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

outputs = pipe(prompt, max_new_tokens=512, do_sample=True)

display(
    Markdown(
        outputs[0]["generated_text"].split(
            "<|start_header_id|>assistant<|end_header_id|>"
        )[1]
    )
)



To assist you with sorting issues in your database, I'll need a bit more information about the problem you're experiencing. However, I can provide a general example of how you might approach sorting data using Python and a popular database library, such as SQLAlchemy for PostgreSQL or MySQL, or Pandas for general-purpose data manipulation.

Let's assume you're working with a simple database table and you want to sort the data by a specific column.

**Example using SQLAlchemy and PostgreSQL:**

```python
from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
import pandas as pd

# Create a connection to the database
engine = create_engine('postgresql://user:password@localhost/dbname')

# Create a configured "Session" class
Session = sessionmaker(bind=engine)

# Create a base class for declarative class definitions
Base = declarative_base()

# Define a class for the table
class MyTable(Base):
    __tablename__ ='my_table'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    age = Column(Integer)

# Create the table
Base.metadata.create_all(engine)

# Create a session
session = Session()

# Fetch all rows from the table
rows = session.query(MyTable).all()

# Convert the rows to a Pandas DataFrame for sorting
df = pd.DataFrame([row.__dict__ for row in rows])

# Sort the DataFrame by the 'age' column
df_sorted = df.sort_values(by='age')

# Print the sorted DataFrame
print(df_sorted)

# Close the session
session.close()
```

**Example using Pandas:**

```python
import pandas as pd

# Load the data from the database into a Pandas DataFrame
df = pd.read_sql_query("SELECT * FROM my_table", engine)

# Sort the DataFrame by the 'age' column
df_sorted = df.sort_values(by='age')

# Print the sorted DataFrame
print(df_sorted)
```

Please note that these examples are simplified and you should adjust them according to your specific database schema and requirements.

To better assist you, could you please provide more information about your problem, such as:

* The specific database you're using (e.g., PostgreSQL, MySQL, SQLite)
* The table and columns involved in the sorting issue
* Any error messages or unexpected behavior you're experiencing

I'll be happy to help you troubleshoot and provide more tailored solutions.
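
The generated answer above loads every row and then sorts in pandas. Its `row.__dict__` conversion would also carry SQLAlchemy's internal `_sa_instance_state` entry into the DataFrame. When the goal is only to return rows in order, the sort can usually be pushed into the query with `ORDER BY`. Here is a minimal sketch using the standard-library `sqlite3` module with an in-memory database. The table name `my_table` and its columns mirror the generated example; the three rows are made-up sample data.

```python
import sqlite3

# In-memory database so the sketch runs without any external server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)"
)

# Hypothetical rows, inserted out of age order on purpose.
conn.executemany(
    "INSERT INTO my_table (name, age) VALUES (?, ?)",
    [("Ann", 41), ("Ben", 23), ("Cy", 35)],
)

# Let the database do the sorting; id breaks ties deterministically.
sorted_rows = conn.execute(
    "SELECT name, age FROM my_table ORDER BY age ASC, id ASC"
).fetchall()

print(sorted_rows)
conn.close()
```

Sorting in the database avoids transferring and re-sorting the whole table in Python, which matters more as the table grows.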

In [10]:
%pip install -U langchain langchain-experimental langchain-community langchain-huggingface pandas torch accelerate transformers huggingface_hub


Collecting langchain
  Downloading langchain-0.3.18-py3-none-any.whl.metadata (7.8 kB)
Collecting langchain-experimental
  Downloading langchain_experimental-0.3.4-py3-none-any.whl.metadata (1.7 kB)
Collecting langchain-community
  Downloading langchain_community-0.3.17-py3-none-any.whl.metadata (2.4 kB)
Collecting langchain-huggingface
  Downloading langchain_huggingface-0.1.2-py3-none-any.whl.metadata (1.3 kB)
Collecting torch
  Downloading torch-2.6.0-cp310-cp310-manylinux1_x86_64.whl.metadata (28 kB)
Collecting langchain-core<1.0.0,>=0.3.34 (from langchain)
  Downloading langchain_core-0.3.34-py3-none-any.whl.metadata (5.9 kB)
Collecting langchain-text-splitters<1.0.0,>=0.3.6 (from langchain)
  Downloading langchain_text_splitters-0.3.6-py3-none-any.whl.metadata (1.9 kB)
Collecting pydantic-settings<3.0.0,>=2.4.0 (from langchain-community)
  Downloading pydantic_settings-2.7.1-py3-none-any.whl.metadata (3.5 kB)
Collecting httpx-sse<1.0.0,>=0.4.0 (from langchain-community)
  Downloa

In [17]:
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
import torch
from langchain_huggingface import HuggingFacePipeline
from langchain_core.prompts import PromptTemplate
from langchain.chains import LLMChain
import pandas as pd

# Load LLaMA 3.2 3B-Instruct Model
base_model = "/kaggle/input/llama-3.2/transformers/3b-instruct/1"
tokenizer = AutoTokenizer.from_pretrained(base_model)

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    return_dict=True,
    low_cpu_mem_usage=True,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,
)

# Ensure padding token is set correctly
if tokenizer.pad_token_id is None:
    tokenizer.pad_token_id = tokenizer.eos_token_id
if model.config.pad_token_id is None:
    # config.eos_token_id can be a list of ids, so reuse the tokenizer's single pad id
    model.config.pad_token_id = tokenizer.pad_token_id

# Create a text generation pipeline
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=1024,  # Upper bound on the number of tokens generated per call
)

# Wrap the pipeline inside LangChain's HuggingFacePipeline
llm = HuggingFacePipeline(pipeline=pipe)

print("✅ LLaMA 3.2 3B-Instruct Model Loaded with LangChain")

# Create Sample CSV Data
data = {
    "Customer": ["Alice", "Bob", "Charlie", "David", "Eve"],
    "Product": ["Laptop", "Phone", "Tablet", "Monitor", "Keyboard"],
    "Quantity": [1, 2, 1, 3, 2],
    "Price": [1200, 800, 500, 300, 100],
    "Total": [1200, 1600, 500, 900, 200]
}

# Save CSV File
csv_path = "sales_data.csv"
df = pd.DataFrame(data)
df.to_csv(csv_path, index=False)

# Load CSV File
df = pd.read_csv(csv_path)
print("✅ CSV File Loaded\n", df.head())

# Define Prompt Template for CSV Queries
prompt_template = PromptTemplate(
    input_variables=["query", "data"],
    template="You are an AI assistant analyzing sales data. Answer the query: {query}\nData:\n{data}"
)

# Create LLMChain for Querying CSV Data
llm_chain = LLMChain(llm=llm, prompt=prompt_template)

# Function to run AI queries on CSV data
def ask_csv(query):
    """Processes a natural language query on the CSV file using LLaMA 3.2."""
    data_str = df.to_string(index=False)  # Convert DataFrame to string
    response = llm_chain.invoke({"query": query, "data": data_str})
    return response["text"]  # Extract text from the response

# Run AI queries
query_1 = "What is the total revenue in the dataset?"
query_2 = "Which product generated the highest sales?"
query_3 = "Who is the best customer based on spending?"

print("📝 Query 1 Response:", ask_csv(query_1))
print("📝 Query 2 Response:", ask_csv(query_2))
print("📝 Query 3 Response:", ask_csv(query_3))


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Device set to use cuda:0


✅ LLaMA 3.2 3B-Instruct Model Loaded with LangChain
✅ CSV File Loaded
   Customer   Product  Quantity  Price  Total
0    Alice    Laptop         1   1200   1200
1      Bob     Phone         2    800   1600
2  Charlie    Tablet         1    500    500
3    David   Monitor         3    300    900
4      Eve  Keyboard         2    100    200
📝 Query 1 Response: You are an AI assistant analyzing sales data. Answer the query: What is the total revenue in the dataset?
Data:
Customer  Product  Quantity  Price  Total
   Alice   Laptop         1   1200   1200
     Bob    Phone         2    800   1600
 Charlie   Tablet         1    500    500
   David  Monitor         3    300    900
     Eve Keyboard         2    100    200
     Frank  Mouse         1    80    80
     George   Headset         1   200    200
     Helen   Speaker         2    150    300
     Ivan   Printer         1    250   250
     Julia  Scanner         1   180   180
     Kate   Camera         2   200   400
     Leo    Tablet 
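
The response to Query 1 above repeats the prompt and then continues the table with rows (Frank, George, Helen, ...) that are not in `sales_data.csv`, so the model's answers are worth checking against values computed directly from the data. Below is a minimal sketch in plain Python that uses the same five rows as the notebook. The variable names are illustrative, and in the notebook the same figures could be read from `df` with pandas instead.

```python
# Same five rows the notebook writes to sales_data.csv.
rows = [
    {"Customer": "Alice", "Product": "Laptop", "Quantity": 1, "Price": 1200, "Total": 1200},
    {"Customer": "Bob", "Product": "Phone", "Quantity": 2, "Price": 800, "Total": 1600},
    {"Customer": "Charlie", "Product": "Tablet", "Quantity": 1, "Price": 500, "Total": 500},
    {"Customer": "David", "Product": "Monitor", "Quantity": 3, "Price": 300, "Total": 900},
    {"Customer": "Eve", "Product": "Keyboard", "Quantity": 2, "Price": 100, "Total": 200},
]

# Query 1: total revenue is the sum of the Total column.
total_revenue = sum(r["Total"] for r in rows)

# Queries 2 and 3: each row is one product bought by one customer,
# so the row with the largest Total answers both.
top_row = max(rows, key=lambda r: r["Total"])

print("Total revenue:", total_revenue)
print("Top product:", top_row["Product"])
print("Top customer:", top_row["Customer"])
```

For this data the reference values are a total revenue of 4400, with Phone as the top product and Bob as the top customer. Any model answer that differs from these points to a prompt or generation problem rather than a data problem.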