In [1]:
"""
Example usage of the optimized LLM Model class.

This script demonstrates how to use the Model class for text generation
with different configurations including chat-based generation.
"""


from llm import Model




In [2]:

model = Model("meta-llama/Llama-3.2-3B-Instruct")


2025-05-14 22:53:59,185 - llm - INFO - Initializing model: meta-llama/Llama-3.2-3B-Instruct
2025-05-14 22:53:59,201 - llm - INFO - Using device: mps
2025-05-14 22:53:59,568 - llm - INFO - Pad token not found, using EOS token as pad token
2025-05-14 22:53:59,568 - llm - INFO - Tokenizer loaded: PreTrainedTokenizerFast


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

2025-05-14 22:54:09,990 - llm - INFO - Model loaded: LlamaForCausalLM


In [3]:
response = model.generate(
    prompt="Explain to me a brand new take on the classic trolley problem.",
    max_new_tokens=450,  # upper bound on the number of generated tokens
    temperature=0.7,
    typical_p=0.9,
    length_penalty=0.8,
    # top_p=0.95,
    # top_k=50,
    no_repeat_ngram_size=3,
    repetition_penalty=1.2,
    # The system message sets the assistant's persona for this chat.
    messages=[
        {"role": "system", "content": "You are an assistant. You are skilled in philosophy and love to play devil's advocate."},
    ],
)

print(response)




The Trolley Problem - a timeless conundrum that continues to spark debate among philosophers, ethicists, and thought experiment enthusiasts alike.

Here's my attempt at offering a fresh spin:

**"The Multiverse Trolley Dilemma"**

Imagine you're standing by a futuristic train track with multiple parallel universes branching out from it. Each universe represents a different possible outcome of every decision made throughout history. The train is hurtling towards a fork in the tracks, where five sentient beings (from various dimensions) will soon be trapped, unable to move or communicate with each other.

Suddenly, one of these beings – let's call her "Echo," who exists only in your own dimension – stands on the platform, frantically trying to warn others about an impending catastrophe. However, there isn't enough time for anyone else to react before the train approaches.

Now, here's where things get interesting. In this multiverse scenario, every action taken would create a ripple effe
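
The call above sets `temperature=0.7` and `repetition_penalty=1.2`. As a rough illustration of what those two knobs usually do to next-token scores, here is a minimal sketch in plain Python. It assumes the `Model` class forwards these arguments to a Hugging Face style sampler, which this notebook does not confirm; the helper names and the toy logits are made up for the example.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def apply_repetition_penalty(logits, seen_token_ids, penalty):
    """Lower the score of tokens that were already generated.

    Positive scores are divided by the penalty and negative scores are
    multiplied by it, so both move toward "less likely".
    """
    out = list(logits)
    for token_id in set(seen_token_ids):
        if out[token_id] > 0:
            out[token_id] = out[token_id] / penalty
        else:
            out[token_id] = out[token_id] * penalty
    return out


def apply_temperature(logits, temperature):
    """Divide scores by the temperature; values below 1 sharpen the distribution."""
    return [x / temperature for x in logits]


# Hypothetical scores for a four-token vocabulary.
logits = [2.0, 1.0, 0.5, -1.0]
seen = [0, 3]  # pretend tokens 0 and 3 were already generated

penalized = apply_repetition_penalty(logits, seen, penalty=1.2)
probs = softmax(apply_temperature(penalized, temperature=0.7))
probs_no_penalty = softmax(apply_temperature(logits, temperature=0.7))

print([round(p, 3) for p in probs])
```

With the penalty applied, token 0 keeps the highest probability but loses some mass to the tokens that have not appeared yet.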

In [4]:
# Inspect the chat history that the model has stored so far.
model.db_conn.execute("SELECT * FROM messages")
df = model.db_conn.fetch_df()
df

index,message_id,session_id,role,content,created_date
0,d17ec2e35b786f898552e2eb2796e7ec7a433941cfcda3...,a22401dea2116e0ac75beb9a34d827ab85ef60f0696d84...,user,Explain to me a brand new take on the classic ...,2025-05-14 22:46:14.361
1,231fcd42a089dabde51d921cfe92d11d150d286d903b2b...,a22401dea2116e0ac75beb9a34d827ab85ef60f0696d84...,assistant,The Trolley Problem - a thought-provoking conu...,2025-05-14 22:46:14.389
2,c8b300b09074336bde8431004d0ebf3b261d7fd604e32c...,2c1d65522d9e890e357a8628a44314b8954acbc18ec5c1...,user,Explain to me a brand new take on the classic ...,2025-05-14 22:54:52.848
3,a1e7a4ae3457a720e796db8936a583f498431afd529afe...,2c1d65522d9e890e357a8628a44314b8954acbc18ec5c1...,assistant,The Trolley Problem - a timeless conundrum tha...,2025-05-14 22:54:52.852
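
The table above holds two sessions, each with a user message and an assistant reply. A common follow-up is to pull back only the most recent session in chronological order. The sketch below shows one way to do that. It uses the standard-library `sqlite3` module as a stand-in for `model.db_conn` so that it runs anywhere; the column names come from the output above, while the row values and the `latest_session_messages` helper are placeholders invented for illustration.

```python
import sqlite3

# In-memory stand-in for the notebook's message store.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE messages (
        message_id   TEXT PRIMARY KEY,
        session_id   TEXT,
        role         TEXT,
        content      TEXT,
        created_date TEXT
    )
    """
)

# Placeholder rows that mirror the shape of the table above.
rows = [
    ("m1", "session-a", "user", "first question", "2025-05-14 22:46:14.361"),
    ("m2", "session-a", "assistant", "first answer", "2025-05-14 22:46:14.389"),
    ("m3", "session-b", "user", "second question", "2025-05-14 22:54:52.848"),
    ("m4", "session-b", "assistant", "second answer", "2025-05-14 22:54:52.852"),
]
conn.executemany("INSERT INTO messages VALUES (?, ?, ?, ?, ?)", rows)


def latest_session_messages(connection):
    """Return (role, content) pairs for the newest session, oldest message first."""
    query = """
        SELECT role, content
        FROM messages
        WHERE session_id = (
            SELECT session_id FROM messages
            ORDER BY created_date DESC
            LIMIT 1
        )
        ORDER BY created_date
    """
    return connection.execute(query).fetchall()


history = latest_session_messages(conn)
for role, content in history:
    print(f"{role}: {content}")
```

The timestamps are stored as text here, which sorts correctly only because they share one fixed format; a real timestamp column avoids that caveat.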


In [5]:
from vector_db import VectorDB

# Create and clear the database
db = VectorDB()
db.clear_db()

# Add documents
db.add_document("This is a document about artificial intelligence.")
db.add_document("Vector databases are useful for similarity search.")

# Add multiple documents at once
db.add_documents([
    "Python is a popular programming language.",
    "Embedding models convert text to vectors.",
    "FAISS is a library for efficient similarity search."
])



2025-05-14 22:54:52,915 - faiss.loader - INFO - Loading faiss.
2025-05-14 22:54:53,003 - faiss.loader - INFO - Successfully loaded faiss.
2025-05-14 22:54:53,010 - faiss - INFO - Failed to load GPU Faiss: name 'GpuIndexIVFFlat' is not defined. Will not load constructor refs for GPU indexes. This is only an error if you're trying to use GPU Faiss.
2025-05-14 22:54:53,160 - datasets - INFO - PyTorch version 2.7.0 available.
2025-05-14 22:54:53,162 - datasets - INFO - Duckdb version 1.2.2 available.
2025-05-14 22:54:53,448 - sentence_transformers.SentenceTransformer - INFO - Use pytorch device_name: mps
2025-05-14 22:54:53,448 - sentence_transformers.SentenceTransformer - INFO - Load pretrained SentenceTransformer: all-MiniLM-L6-v2


Loaded existing database with 5 documents
Database cleared


Batches:   0%|          | 0/1 [00:00<?, ?it/s]


[2, 3, 4]
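
To make the add-then-search workflow concrete, here is a toy in-memory store with the same method names as the cell above. It is not the `VectorDB` implementation: it swaps the sentence-transformer embeddings and the FAISS index for bag-of-words counts and a brute-force cosine scan, and `ToyVectorDB`, `embed`, and `cosine` are names invented for this sketch.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyVectorDB:
    """Stores documents with their vectors and searches by brute force."""

    def __init__(self):
        self.docs = []
        self.vectors = []

    def add_document(self, text):
        """Store one document and return its integer id."""
        self.docs.append(text)
        self.vectors.append(embed(text))
        return len(self.docs) - 1

    def add_documents(self, texts):
        """Store several documents and return their ids in order."""
        return [self.add_document(text) for text in texts]

    def search(self, query, k=3):
        """Return the k most similar documents as (text, score) pairs."""
        query_vec = embed(query)
        scored = [(doc, cosine(query_vec, vec)) for doc, vec in zip(self.docs, self.vectors)]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]


toy = ToyVectorDB()
toy.add_document("This is a document about artificial intelligence.")
toy.add_document("Vector databases are useful for similarity search.")
ids = toy.add_documents([
    "Python is a popular programming language.",
    "Embedding models convert text to vectors.",
    "FAISS is a library for efficient similarity search.",
])
top = toy.search("efficient similarity search with FAISS", k=2)
print(ids)
print(top[0][0])
```

Word-count vectors only match on shared vocabulary, so this toy cannot relate "AI" to "artificial intelligence" the way a learned embedding model can.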

In [6]:

results = db.search("How do AI systems work specifically with FAISS?", k=3)
results



Batches:   0%|          | 0/1 [00:00<?, ?it/s]

[('This is a document about artificial intelligence.', np.float32(0.73847854)),
 ('FAISS is a library for efficient similarity search.',
  np.float32(0.64280427)),
 ('Python is a popular programming language.', np.float32(0.60480684))]
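
The search returns `(text, score)` pairs, and even the unrelated Python document scores about 0.60, so a cutoff can help before passing results downstream. The sketch below filters the results shown above by a minimum score. It assumes that a higher score means a closer match, which the descending order suggests but the notebook does not state; `filter_by_score` and the 0.62 threshold are hypothetical.

```python
def filter_by_score(results, min_score):
    """Keep (text, score) pairs whose score is at least min_score."""
    return [(text, float(score)) for text, score in results if float(score) >= min_score]


# Scores copied from the search output above, as plain floats.
search_results = [
    ("This is a document about artificial intelligence.", 0.73847854),
    ("FAISS is a library for efficient similarity search.", 0.64280427),
    ("Python is a popular programming language.", 0.60480684),
]

strong = filter_by_score(search_results, min_score=0.62)
for text, score in strong:
    print(f"{score:.3f}  {text}")
```

Converting each score with `float()` also removes the `np.float32` wrapper, which makes the results easier to log or serialize.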