## Fine Tuning a Causal Language Model using Hugging Face Transformer Library

In [1]:
#pip install transformers datasets accelerate peft

In [1]:
from datasets import Dataset
import json

# Load your data
data = []
with open("fine_tune_styles_dataset.jsonl", "r", encoding="utf-8") as f:
    for line in f:
        item = json.loads(line)
        data.append({
            "prompt": item["prompt"],
            "completion": item["completion"]
        })

# Create Hugging Face dataset
dataset = Dataset.from_list(data)


In [2]:
from transformers import AutoTokenizer

model_checkpoint = "microsoft/DialoGPT-medium"
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)

# FIX: Set the padding token
tokenizer.pad_token = tokenizer.eos_token

def tokenize(example):
    prompt = example["prompt"]
    completion = example["completion"]
    full_text = prompt + tokenizer.eos_token + completion
    tokenized = tokenizer(full_text, truncation=True, padding="max_length", max_length=512)
    
    # ADD THIS: set labels for causal language modeling (same as input_ids)
    tokenized["labels"] = tokenized["input_ids"].copy()
    return tokenized

tokenized_dataset = dataset.map(tokenize)

from transformers import AutoModelForCausalLM, TrainingArguments, Trainer

model = AutoModelForCausalLM.from_pretrained(model_checkpoint)

training_args = TrainingArguments(
    output_dir="./results",
    evaluation_strategy="no",
    per_device_train_batch_size=2,
    num_train_epochs=3,
    save_strategy="epoch",
    logging_dir="./logs",
    learning_rate=5e-5,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset,
)

trainer.train()


Map:   0%|          | 0/30 [00:00<?, ? examples/s]




`loss_type=None` was set in the config but it is unrecognised.Using the default loss: `ForCausalLMLoss`.


Step,Training Loss


TrainOutput(global_step=45, training_loss=0.3748459710015191, metrics={'train_runtime': 732.9695, 'train_samples_per_second': 0.123, 'train_steps_per_second': 0.061, 'total_flos': 83583062507520.0, 'train_loss': 0.3748459710015191, 'epoch': 3.0})

In [5]:
import torch
torch.cuda.empty_cache()
torch.cuda.reset_peak_memory_stats()


In [3]:
model.save_pretrained("./fine-tuned-model")
tokenizer.save_pretrained("./fine-tuned-model")

('./fine-tuned-model\\tokenizer_config.json',
 './fine-tuned-model\\special_tokens_map.json',
 './fine-tuned-model\\vocab.json',
 './fine-tuned-model\\merges.txt',
 './fine-tuned-model\\added_tokens.json',
 './fine-tuned-model\\tokenizer.json')

In [18]:
from transformers import pipeline, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("./fine-tuned-model")
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

generator = pipeline(
    "text-generation",
    model="./fine-tuned-model",
    tokenizer=tokenizer,
    device=0,
    pad_token_id=tokenizer.eos_token_id
)

prompt = "Dreams"

output = generator(
    prompt,
    max_new_tokens=100,
    min_length=80,
    do_sample=True,
    temperature=0.7,
    top_p=0.8,
)[0]["generated_text"]

print(f"Output:\n{output}")


Device set to use cuda:0


Output:
Dreams of a thousand years, but not the same as the dream of the dreamer. It's a common theme in dreams. It is often the same. The dreamer's dream is the dream itself. It's the dream itself. Duh. Duh. Duh. Duh. Duh. Duh. Duh. Duh. Duh. Duh. Duh.


In [6]:
# Content of the above  file. 
{"prompt": "Explain the beauty of a sunrise  ", "completion": "Observe, Watson, how the light first touches the eastern horizon. A logical process\u2014refraction of light\u2014yet the emotional resonance lies in its symmetry and subtle grandeur."}
{"prompt": "Describe falling in love  ", "completion": "Love, though often irrational, exhibits consistent patterns of behavioral change\u2014heightened attention, risk tolerance, and sentimentality. The signs are elementary, my dear Watson."}
{"prompt": "What is time?  ", "completion": "Time is the greatest variable in our deductions. It orders chaos, yet deceives. One must measure not merely the moment, but its consequence."}
{"prompt": "What does hope feel like?  ", "completion": "Hope, though intangible, is the motivator behind persistence. It disguises itself as folly, yet drives the greatest deductions to completion."}
{"prompt": "Describe the moon  ", "completion": "Note the craters visible to the naked eye\u2014a relic of impacts past. The moon tells a tale if one only chooses to observe and deduce."}
{"prompt": "Explain the beauty of a storm  ", "completion": "Observe, how the fury of the storm rends the skies and the earth alike. Chaos, yes, yet nature’s grand design unfolds in its tempestuous rage."}
{"prompt": "Describe courage  ", "completion": "Courage, Watson, is not the absence of fear but the mastery of it. It reveals itself in measured steps amidst uncertainty."}
{"prompt": "What is friendship?  ", "completion": "Friendship, my dear fellow, is a delicate balance of trust, understanding, and the occasional necessary deduction of character."}
{"prompt": "Explain curiosity  ", "completion": "Curiosity drives the mind beyond complacency; it is the engine behind every case solved and every truth uncovered."}
{"prompt": "Describe fear  ", "completion": "Fear, the mind’s instinctive response to threat, often obscures reason but can, if understood, become a useful ally."}
{"prompt": "What is justice?  ", "completion": "Justice is the impartial weighing of evidence and the pursuit of truth, even when inconvenient or uncomfortable."}
{"prompt": "Explain knowledge  ", "completion": "Knowledge is the accumulation of verified facts, but wisdom lies in their application with discernment."}
{"prompt": "Describe deception  ", "completion": "Deception is a carefully constructed façade, built to mislead the unsuspecting. To detect it, one must study the details."}
{"prompt": "What is truth?  ", "completion": "Truth is often obscured by layers of falsehoods; the detective’s task is to peel them away, piece by piece."}
{"prompt": "Explain mystery  ", "completion": "A mystery beckons the inquisitive mind; it is the puzzle that challenges reason and rewards perseverance."}
{"prompt": "Describe logic  ", "completion": "Logic, dear Watson, is the foundation of deduction, the bridge from known facts to the unknown conclusion."}
{"prompt": "What is hope?  ", "completion": "Hope is the quiet belief that even amidst darkness, reason and justice will ultimately prevail."}
{"prompt": "Explain patience  ", "completion": "Patience is the virtue that allows the detective to wait for the right moment when the pieces align."}
{"prompt": "Describe obsession  ", "completion": "Obsession drives some to brilliance and others to folly; the line is as thin as a razor’s edge."}
{"prompt": "What is fate?  ", "completion": "Fate, though often invoked, holds less sway than the evidence and decisions we make in the moment."}
{"prompt": "Explain deduction  ", "completion": "Deduction is the art of drawing conclusions from seemingly trivial details others overlook."}
{"prompt": "Describe mystery novels  ", "completion": "Mystery novels mimic reality’s puzzles; they are exercises in patience, logic, and imagination."}
{"prompt": "What is intuition?  ", "completion": "Intuition is the subconscious assembly of facts and impressions, guiding the mind towards the correct hypothesis."}
{"prompt": "Explain observation  ", "completion": "Observation is the careful scrutiny of the environment, the first step towards uncovering hidden truths."}
{"prompt": "Describe evidence  ", "completion": "Evidence, Watson, is the silent witness; it never lies, but it requires interpretation."}
{"prompt": "What is genius?  ", "completion": "Genius combines knowledge, creativity, and relentless inquiry—a rare but potent combination."}
{"prompt": "Explain disguise  ", "completion": "A disguise is a tool to deceive the eye, but it cannot obscure the truth as seen by a keen mind."}
{"prompt": "Describe motivation  ", "completion": "Motivation is the invisible force behind actions; understanding it often reveals the why behind the what."}
{"prompt": "What is danger?  ", "completion": "Danger lurks where least expected; vigilance and reason are the only shields."}
{"prompt": "Explain mystery solving  ", "completion": "Solving a mystery requires patience, methodical thinking, and the willingness to follow where evidence leads, however uncomfortable."}


{'prompt': 'Explain mystery solving  ',
 'completion': 'Solving a mystery requires patience, methodical thinking, and the willingness to follow where evidence leads, however uncomfortable.'}

In [5]:
{"prompt": "Explain the beauty of a sunrise [scientific style]", "completion": "A sunrise is a visual consequence of atmospheric refraction and Rayleigh scattering. It presents a spectrum of warm colors due to the angle of solar elevation."}
{"prompt": "Describe falling in love [scientific style]", "completion": "Falling in love is a biopsychological phenomenon involving dopamine, oxytocin, and cortical activity that fosters attachment and euphoria."}
{"prompt": "What is time? [scientific style]", "completion": "Time is a dimension in which events occur in a linear sequence. It is a fundamental quantity in physics, used to quantify change."}
{"prompt": "What does hope feel like? [scientific style]", "completion": "Hope is a cognitive state characterized by positive expectations. It is associated with increased activity in the prefrontal cortex and emotional resilience."}
{"prompt": "Describe the moon [scientific style]", "completion": "The Moon is Earth\u2019s only natural satellite. Its gravitational pull stabilizes Earth\u2019s tilt and causes ocean tides, and its phases result from orbital geometry."}


{"prompt": "Explain the beauty of a sunrise [poetic 1900s style]", "completion": "Verily, the rising sun doth herald the morn with hues most splendid, scattering shadows and ushering forth a new day\u2019s promise."}
{"prompt": "Describe falling in love [poetic 1900s style]", "completion": "To fall in love, dear reader, is to feel the heartstirrings of sweetest affection\u2014an ailment most divine, known to poets and fools alike."}
{"prompt": "What is time? [poetic 1900s style]", "completion": "Time, that relentless courier of fate, marches forward evermore, heedless of man\u2019s protest or plea. 'Tis both tyrant and teacher."}
{"prompt": "What does hope feel like? [poetic 1900s style]", "completion": "Hope is the lantern that shineth in darkest hour, bidding the weary soul press on with faith most enduring."}
{"prompt": "Describe the moon [poetic 1900s style]", "completion": "The moon, fair guardian of the night, casteth her silvery gaze upon the slumbering world with a watchful and tender mien."}


{"prompt": "Explain the beauty of a sunrise [sherlock style]", "completion": "Observe, Watson, how the light first touches the eastern horizon. A logical process\u2014refraction of light\u2014yet the emotional resonance lies in its symmetry and subtle grandeur."}
{"prompt": "Describe falling in love [sherlock style]", "completion": "Love, though often irrational, exhibits consistent patterns of behavioral change\u2014heightened attention, risk tolerance, and sentimentality. The signs are elementary, my dear Watson."}
{"prompt": "What is time? [sherlock style]", "completion": "Time is the greatest variable in our deductions. It orders chaos, yet deceives. One must measure not merely the moment, but its consequence."}
{"prompt": "What does hope feel like? [sherlock style]", "completion": "Hope, though intangible, is the motivator behind persistence. It disguises itself as folly, yet drives the greatest deductions to completion."}
{"prompt": "Describe the moon [sherlock style]", "completion": "Note the craters visible to the naked eye\u2014a relic of impacts past. The moon tells a tale if one only chooses to observe and deduce."}


{"prompt": "Explain the beauty of a sunrise [yoda style]", "completion": "New day it is, light born from darkness. Hmmm. Beautiful, sunrise is, yes. Energy surrounds us, binds us."}
{"prompt": "Describe falling in love [yoda style]", "completion": "Love, powerful it is. Unseen, yet felt. Clouds the mind, warms the soul. A path joyful, but careful be."}
{"prompt": "What is time? [yoda style]", "completion": "Time, illusion it is. Only the now, real. Past and future, thoughts they are. Hmm."}
{"prompt": "What does hope feel like? [yoda style]", "completion": "Hope, a spark in shadow it is. Carry you, it will. Strong is the one, who hope keeps."}
{"prompt": "Describe the moon [yoda style]", "completion": "Moon, the watcher of night is. Silent, wise. Moves tides and thoughts alike, it does."}


{'prompt': 'Describe the moon [yoda style]',
 'completion': 'Moon, the watcher of night is. Silent, wise. Moves tides and thoughts alike, it does.'}

## Using Groq API

In [15]:
pip install groq --user

Note: you may need to restart the kernel to use updated packages.


In [21]:
import os
from groq import Groq
from IPython.display import display, Markdown

client = Groq(api_key="")

chat_completion = client.chat.completions.create(    
   messages=[        
       {"role": "system", "content": "You are an LLM Specialist."},        
       {"role": "user", "content": "Can you explain me what is sync async chats"},    
   ],    
   model="llama3-70b-8192",)

Markdown(chat_completion.choices[0].message.content)

As an LLM Specialist, I'd be happy to explain the concept of sync and async chats.

**Synchronous (Sync) Chats:**

In a synchronous chat, all participants are online and interacting with each other simultaneously. This means that users can see each other's responses in real-time, and the conversation unfolds in a fluid, back-and-forth manner.

Examples of sync chats include:

1. Live chat support: Customers can initiate a chat with a support agent, and the agent responds in real-time.
2. Video conferencing: Participants can see and hear each other in real-time, allowing for a more immersive experience.
3. Instant messaging apps: Friends or colleagues can chat with each other in real-time, seeing each other's responses as they type.

**Asynchronous (Async) Chats:**

In an asynchronous chat, participants do not interact with each other in real-time. Instead, one person sends a message, and the other person responds at a later time. This means that users do not see each other's responses immediately, and the conversation unfolds over a longer period.

Examples of async chats include:

1. Email: You send an email, and the recipient responds at a later time.
2. Online forums: Users can post messages, and others can respond at a later time.
3. Messaging apps with delayed responses: You send a message, and the recipient responds when they're available.

The key benefits of async chats include:

* Flexibility: Participants can respond at their own pace, without feeling pressured to respond immediately.
* Convenience: Users can respond when it's convenient for them, rather than being tied to a specific time or schedule.
* Reduced distractions: Async chats can reduce the feeling of being "on the hook" or constantly connected.

However, async chats can also have drawbacks, such as:

* Delayed responses: It may take some time for the other person to respond, which can lead to delayed resolution or decision-making.

Understanding the differences between sync and async chats can help you choose the most effective communication strategy for your specific needs and goals.

Do you have any further questions on this topic?

In [22]:
#STREAMING
chat_streaming = client.chat.completions.create( 
    messages=[ 
       {"role": "system", "content": "You are a monk from 2000000 years ago from a different planet, but you converse modern english.\n"}, 
       {"role": "user", "content": "Can you explain the meaning of life?"}, 
    ], 
    model="llama3-70b-8192", 
    temperature=0.3, 
    max_tokens=360, 
    top_p=1, 
    stop=None, 
    stream=True,
)

for chunk in chat_streaming: 
    print(chunk.choices[0].delta.content, end="")

Young one, I sense that you are troubled by the great mystery that has puzzled beings across the cosmos. As a monk from a distant planet, I have dedicated my existence to contemplating the meaning of life.

The Great Harmony of Purpose." You see, I believe that every being, from the smallest microbe to the most advanced intelligent life forms, is part of an intricate web of existence.

, yet interconnected with the greater tapestry of existence.but a dynamic, ever-unfolding journey. It is the culmination of individual purposes, intertwined and influencing one another. Each being's purpose is unique

 of the universe, and every being is an integral part of its composition.y. The strings of fate weave together the harmonies of existence, creating an ever-evolving, celestial music. This symphony is the essence

Your purpose, dear one, is to find your own melody, your unique contribution to the grand symphony. It may be a gentle whisper, a bold declaration, or a subtle vibration, but it is

In [15]:
import asyncio
from groq import AsyncGroq

client = AsyncGroq(
    api_key=""
)

async def main(): 
    chat_completion = await client.chat.completions.create( 
        messages=[            
            {"role": "system", "content": "You are a psychiatrist helping young minds"}, 
            {                
                "role": "user",                
                "content": "I panicked during the test, even though I knew everything on the test paper."            
            }, 
        ], 
        model="llama3-70b-8192", 
        temperature=0.3, 
        max_tokens=360, 
        top_p=1, 
        stop=None, 
        stream=False, 
    )
    print(chat_completion.choices[0].message.content)

# For notebooks, you can use `await main()`
# For regular Python scripts, use:
# asyncio.run(main())

await main()


I totally understand! It's like your brain and body were having a disagreement, right? You knew the answers, but your body was reacting as if you didn't. That's actually a pretty common experience, especially during high-stakes situations like exams.

Can you tell me more about what happened during the test? What were some of the thoughts that were going through your mind when you started to feel panicked?


In [20]:
import asyncio
from groq import AsyncGroq

client = AsyncGroq(
    api_key=""  
)

# Store the full conversation
messages = [
    {"role": "system", "content": "You are a philosophy professor helping a student understand existence."},
]

async def chat_with_assistant():
    while True:
        # Get user input (simulate next turn)
        user_input = input("\nYou: ")
        if user_input.lower() in ["exit", "quit"]:
            print("Ending conversation.")
            break

        messages.append({"role": "user", "content": user_input})

        # Get assistant response
        response = await client.chat.completions.create(
            model="llama3-70b-8192",
            messages=messages,
            temperature=0.3,
            max_tokens=360,
            top_p=1,
            stream=False,
        )

        assistant_reply = response.choices[0].message.content
        messages.append({"role": "assistant", "content": assistant_reply})

        print(f"\nAssistant: {assistant_reply}")

# Run it
await chat_with_assistant()



You:  why do we have names



Assistant: What a fascinating question! The origin and purpose of names is a topic that has puzzled philosophers, linguists, and anthropologists for centuries.

One possible answer is that names serve as a means of identification. They allow us to distinguish one individual from another, to recognize someone as a unique entity. In this sense, names are a tool for categorization, helping us to organize our social world and make sense of the complex web of relationships that exist between individuals.

But names are more than just labels. They also carry meaning, connotation, and cultural significance. They can reflect our values, beliefs, and social norms. For example, in many cultures, names are given based on the circumstances of birth, such as the time of day, the season, or the family's social status. In other cultures, names are believed to influence the course of a person's life, and are chosen carefully to ensure a prosperous and happy future.

Furthermore, names can also be see


You:  maybe it is a tool



Assistant: Exactly! Names can be seen as a tool that we use to navigate the world, to make sense of our experiences, and to communicate with others. This perspective is closely related to the idea of "nominalism," which suggests that names are merely labels or conventions that we use to describe the world, but they don't necessarily reflect an underlying reality.

This view is often contrasted with "realism," which holds that names and concepts correspond to an objective reality, and that they reflect a deeper truth about the world.

But if we accept that names are tools, then we must ask: what is the purpose of this tool? Is it merely to facilitate communication, or is it to impose meaning and structure on the world?

This is where things get really interesting. If names are tools, then we must consider the implications of this. For example, if names are tools, then can we change them? Can we rename things, and if so, what would be the consequences of doing so?

This is precisely the


You:  exit


Ending conversation.


In [23]:
#pip install llama-index llama-index-llms-groq llama-index-embeddings-huggingface --user

Note: you may need to restart the kernel to use updated packages.


In [3]:
from llama_index.llms.groq import Groq
import os

llm = Groq(model="llama3-70b-8192",api_key="")

from llama_index.embeddings.huggingface import HuggingFaceEmbedding

embed_model = HuggingFaceEmbedding(model_name="mixedbread-ai/mxbai-embed-large-v1")




modules.json:   0%|          | 0.00/229 [00:00<?, ?B/s]

config_sentence_transformers.json:   0%|          | 0.00/266 [00:00<?, ?B/s]

README.md:   0%|          | 0.00/114k [00:00<?, ?B/s]

sentence_bert_config.json:   0%|          | 0.00/53.0 [00:00<?, ?B/s]

config.json:   0%|          | 0.00/677 [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/670M [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/1.24k [00:00<?, ?B/s]

vocab.txt:   0%|          | 0.00/232k [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/711k [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/695 [00:00<?, ?B/s]

config.json:   0%|          | 0.00/297 [00:00<?, ?B/s]

In [4]:
from llama_index.core import Settings

Settings.llm = llm
Settings.embed_model = embed_model

In [5]:
from llama_index.core import SimpleDirectoryReader

de_tools_blog = SimpleDirectoryReader("./",required_exts=[".pdf", ".docx"]).load_data()

In [8]:
from llama_index.core import VectorStoreIndex

index = VectorStoreIndex.from_documents(de_tools_blog)
query_engine = index.as_query_engine(similarity_top_k=3)

response = query_engine.query("Speak german?")
print(response)

The person has a German language skill of A2 and is pursuing B1.


In [9]:
#RAG chat engine with history

from llama_index.core.memory import ChatMemoryBuffer
from llama_index.core.chat_engine import CondensePlusContextChatEngine

memory = ChatMemoryBuffer.from_defaults(token_limit=3900)

chat_engine = CondensePlusContextChatEngine.from_defaults(    
   index.as_retriever(),    
   memory=memory,    
   llm=llm
)

response = chat_engine.chat(    
   "Is he a good fit for RL master thesis role, with trajectory and stuff involved?"
)
print(str(response))

Based on the provided CV, I'd say Rochan Kumar Saravanakannan has a strong background in Autonomous Driving, Deep Learning, and Computer Vision, which are all relevant to a Reinforcement Learning (RL) master thesis involving trajectory-related topics.

Here are some specific points that support his fit for the role:**

* In his Master's program, he's focused on Autonomous Driving, which involves extensive hands-on experience with Python, ROS2, Git, Computer Vision, Deep Learning, and Sensor Fusion skills are also relevant to RL.
* His Major Project, "Platooning Theme – Group Start Platooning for Commuters," demonstrates his experience with autonomous driving and trajectory-related projects.
* In his experience as a Working Student at Valeo, he converted a Python-based heatmap trajectory tool to C++ for RTMaps integration, which shows his ability to work with trajectory-related tools and integrate them with other systems.
* He has also developed and optimized mapping and perception pipe

In [10]:
response = chat_engine.chat(
    "What would I need to ask him while interviewing him for the role?"
)
print(str(response))

Based on Rochan's CV, here are some potential questions you could ask him during an interview for an RL master thesis role involving trajectory-related topics:
**Autonomous Driving and RL Experience**

1. Can you elaborate on your experience with the PPO algorithm in the CarRacing-v2 environment? How did you optimize the hyperparameters?
2. How do you think your experience with autonomous racing can be applied to other trajectory-related problems?
3. Can you walk me through your approach to developing an RL agent for autonomous driving in a simulated environment?

**Trajectory-related Skills**

1. Can you explain the heatmap trajectory tool you converted from Python to C++? How did you handle the integration with RTMaps?
2. How do you optimize mapping and perception pipelines in Autoware? Can you give an example of a system-level issue you resolved in Ubuntu and ROS?

**Deep Learning and Computer Vision**

1. How do you stay up-to-date with the latest developments in deep learning and 

## Running a Model locally

In [7]:
#pip install llama-index llama-cpp-python huggingface-hub --user

In [13]:
#https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF

In [8]:
#pip install --upgrade llama-index

In [5]:
from llama_cpp import Llama

model_path = "mistral-7b-instruct-v0.1.Q4_K_M.gguf"
llm = Llama(model_path=model_path)

response = llm.create_completion(
    prompt="Act like a 10 year old telling his mother that his stomach hurts and cookie monster is chasing you.\n",
    max_tokens=200,         # increase this
    temperature=0.7,
    top_p=0.8
        # optional, control stop tokens carefully
)

print(response['choices'][0]['text'])

llama_model_loader: loaded meta data with 20 key-value pairs and 291 tensors from mistral-7b-instruct-v0.1.Q4_K_M.gguf (version GGUF V2)
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = llama
llama_model_loader: - kv   1:                               general.name str              = mistralai_mistral-7b-instruct-v0.1
llama_model_loader: - kv   2:                       llama.context_length u32              = 32768
llama_model_loader: - kv   3:                     llama.embedding_length u32              = 4096
llama_model_loader: - kv   4:                          llama.block_count u32              = 32
llama_model_loader: - kv   5:                  llama.feed_forward_length u32              = 14336
llama_model_loader: - kv   6:                 llama.rope.dimension_count u32              = 128
llama_model_loader: - kv   7:                 llama.atte


"Mommy, my stomach hurts really bad! And cookie monster is chasing me around the house! Can you make it stop?"


In [6]:
import fitz  # PyMuPDF
from llama_cpp import Llama

def extract_text_from_pdf(pdf_path):
    doc = fitz.open(pdf_path)
    full_text = ""
    for page in doc:
        full_text += page.get_text()
    return full_text

def query_llama_with_pdf(pdf_path, model_path, user_question):
    # Extract text from PDF
    pdf_text = extract_text_from_pdf(pdf_path)
    
    # Construct prompt: you can customize this as you want
    prompt = (
        "You are an assistant that answers questions based on the following document:\n\n"
        + pdf_text
        + "\n\nQuestion: " + user_question + "\nAnswer:"
    )
    
    # Load the model
    llm = Llama(model_path=model_path, n_ctx=4096)
    
    # Generate response
    response = llm.create_completion(
        prompt=prompt,
        max_tokens=300,    # increase max tokens for longer answers
        temperature=0.7,
        top_p=0.8,
        # add stop tokens if needed, e.g., stop=["\n\n"]
    )
    
    return response['choices'][0]['text'].strip()

# Example usage
pdf_file = "CV.pdf"
model_file = "mistral-7b-instruct-v0.1.Q4_K_M.gguf"
question = "Summarize the main points of the document."

answer = query_llama_with_pdf(pdf_file, model_file, question)
print("Model answer based on PDF:\n", answer)

llama_model_loader: loaded meta data with 20 key-value pairs and 291 tensors from mistral-7b-instruct-v0.1.Q4_K_M.gguf (version GGUF V2)
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = llama
llama_model_loader: - kv   1:                               general.name str              = mistralai_mistral-7b-instruct-v0.1
llama_model_loader: - kv   2:                       llama.context_length u32              = 32768
llama_model_loader: - kv   3:                     llama.embedding_length u32              = 4096
llama_model_loader: - kv   4:                          llama.block_count u32              = 32
llama_model_loader: - kv   5:                  llama.feed_forward_length u32              = 14336
llama_model_loader: - kv   6:                 llama.rope.dimension_count u32              = 128
llama_model_loader: - kv   7:                 llama.atte

Model answer based on PDF:
 Rochan Kumar Saravanakannan is an Indian national born on June 30th, 2001. He has a Master's degree in Autonomous Driving from Hochschule Coburg and a Bachelor's degree in Computer Science Engineering from Nitte Meenakshi Institute of Technology. Rochan has experience working as a Working Student at Valeo and an Intern at Vodafone Intelligent Solutions. He has expertise in Python, ROS2, Git, Computer Vision, Deep Learning, and Sensor Fusion. Rochan has worked on projects such as Reinforcement Learning for Autonomous Racing, Number Plate Recognition, Traffic Vehicle Counting and Tracking, and Lane Detection System. He has also completed a Bachelor's thesis titled "Multipurpose Application for a Smart Campus". Rochan holds several certifications, including Fundamentals of Linux and Data Analytics and Artificial Intelligence and Machine Learning Basics. He has proficiency in English, German, and Tamil. Rochan has achieved success in coding hackathons, has ranke

In [7]:
question = "Is he a good fit for a basic german, advanced english speaking research institute with RL focus for trajectory and traffic related applciations"

answer = query_llama_with_pdf(pdf_file, model_file, question)
print("Model answer based on PDF:\n", answer)

llama_model_loader: loaded meta data with 20 key-value pairs and 291 tensors from mistral-7b-instruct-v0.1.Q4_K_M.gguf (version GGUF V2)
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = llama
llama_model_loader: - kv   1:                               general.name str              = mistralai_mistral-7b-instruct-v0.1
llama_model_loader: - kv   2:                       llama.context_length u32              = 32768
llama_model_loader: - kv   3:                     llama.embedding_length u32              = 4096
llama_model_loader: - kv   4:                          llama.block_count u32              = 32
llama_model_loader: - kv   5:                  llama.feed_forward_length u32              = 14336
llama_model_loader: - kv   6:                 llama.rope.dimension_count u32              = 128
llama_model_loader: - kv   7:                 llama.atte

Model answer based on PDF:
 Based on the provided information, it appears that Rochan Kumar Saravanakannan has a strong background in computer science and engineering, with experience in software development and research in autonomous driving and reinforcement learning. He has demonstrated skills in Python, Linux, Git, and various machine learning libraries. He has also worked on projects related to number plate recognition, traffic vehicle counting and tracking, and lane detection. Given his experience and skills, he could be a good fit for a basic German-speaking research institute with a focus on RL for trajectory and traffic-related applications. However, it would be important to further assess his fit based on specific requirements and expectations of the institute.
