In [1]:
!df -h

Filesystem        Size    Used   Avail Capacity iused ifree %iused  Mounted on
/dev/disk1s4s1   932Gi   9.6Gi   845Gi     2%    404k  4.3G    0%   /
devfs            205Ki   205Ki     0Bi   100%     710     0  100%   /dev
/dev/disk1s2     932Gi   2.0Gi   845Gi     1%    1.2k  8.9G    0%   /System/Volumes/Preboot
/dev/disk1s6     932Gi   5.0Gi   845Gi     1%       5  8.9G    0%   /System/Volumes/VM
/dev/disk1s5     932Gi   2.7Mi   845Gi     1%      25  8.9G    0%   /System/Volumes/Update
/dev/disk1s1     932Gi    69Gi   845Gi     8%    1.6M  8.9G    0%   /System/Volumes/Data
map auto_home      0Bi     0Bi     0Bi   100%       0     0     -   /System/Volumes/Data/home


## Running LLMs locally with Ollama

In [10]:
from openai import OpenAI

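# Ollama serves an OpenAI-compatible API, by default on port 11434.
# The client requires an api_key, but Ollama ignores its value.
client = OpenAI(
    base_url='http://localhost:11434/v1/',
    api_key='ollama',
)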

In [2]:
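import requests
import minsearch

# Download the course FAQ documents.
docs_url = 'https://github.com/DataTalksClub/llm-zoomcamp/blob/main/01-intro/documents.json?raw=1'
docs_response = requests.get(docs_url)
docs_response.raise_for_status()  # fail early if the download did not succeed
documents_raw = docs_response.json()

# Flatten the per-course structure into one list, tagging each doc with its course.
documents = []

for course in documents_raw:
    course_name = course['course']

    for doc in course['documents']:
        doc['course'] = course_name
        documents.append(doc)

# Text fields are searched; keyword fields are used for exact-match filtering.
index = minsearch.Index(
    text_fields=["question", "text", "section"],
    keyword_fields=["course"]
)

index.fit(documents)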

<minsearch.Index at 0x1366a7250>

In [3]:
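def search(query):
    # Weight matches in 'question' more heavily and matches in 'section' less.
    boost = {'question': 3.0, 'section': 0.5}

    # Restrict results to one course and return the five best matches.
    results = index.search(
        query=query,
        filter_dict={'course': 'data-engineering-zoomcamp'},
        boost_dict=boost,
        num_results=5
    )

    return results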
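
`filter_dict` and `boost_dict` do two different jobs in the call above: the filter keeps only documents whose keyword field matches exactly, while the boost scales how much a text field contributes to the score. The toy scorer below illustrates that split using plain term overlap. It is only a sketch: it is not minsearch's actual scoring, and `toy_search` and the sample documents are made up for illustration.

```python
def toy_search(docs, query, text_fields, filter_dict=None, boost_dict=None, num_results=5):
    """Hypothetical term-overlap search with exact-match filters and per-field boosts."""
    filter_dict = filter_dict or {}
    boost_dict = boost_dict or {}
    query_terms = set(query.lower().split())

    scored = []
    for doc in docs:
        # Filtering: skip documents whose keyword fields do not match exactly.
        if any(doc.get(key) != value for key, value in filter_dict.items()):
            continue

        score = 0.0
        for field in text_fields:
            field_terms = set(str(doc.get(field, "")).lower().split())
            overlap = len(query_terms & field_terms)
            # Boosting: scale this field's contribution (assumed default weight 1.0).
            score += boost_dict.get(field, 1.0) * overlap

        if score > 0:
            scored.append((score, doc))

    # Highest score first; ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:num_results]]


# Made-up documents, only to exercise the function.
sample_docs = [
    {"course": "a", "question": "how to install docker", "text": "use the installer"},
    {"course": "a", "question": "what is kafka", "text": "install docker first, then kafka"},
    {"course": "b", "question": "install docker on windows", "text": "see the guide"},
]

results = toy_search(
    sample_docs,
    query="install docker",
    text_fields=["question", "text"],
    filter_dict={"course": "a"},
    boost_dict={"question": 3.0},
)
print([d["question"] for d in results])
```

The document from course `b` is dropped by the filter even though its question matches, and the question-field match outranks the text-field match because of the boost.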

In [5]:
def build_prompt(query, search_results):
    prompt_template = """
You're a course teaching assistant. Answer the QUESTION based on the CONTEXT from the FAQ database.
Use only the facts from the CONTEXT when answering the QUESTION.

QUESTION: {question}

CONTEXT:
{context}
""".strip()

    # Render each retrieved FAQ entry as a small block, separated by blank lines.
    context = "\n\n".join(
        f"section: {doc['section']}\nquestion: {doc['question']}\nanswer: {doc['text']}"
        for doc in search_results
    )

    return prompt_template.format(question=query, context=context).strip()

def llm(prompt):
    # Send the prompt as a single user message to the local phi3 model served by Ollama.
    response = client.chat.completions.create(
        model='phi3',
        messages=[{"role": "user", "content": prompt}]
    )

    return response.choices[0].message.content
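
The three helpers defined so far (`search`, `build_prompt`, `llm`) can be chained into a single retrieval-augmented call. The sketch below shows one possible way to wire them together. The name `rag` and the `fake_*` stubs are hypothetical: the stubs stand in for the real index and the Ollama-backed model so the example runs without a server.

```python
from typing import Callable


def rag(
    query: str,
    search_fn: Callable[[str], list],
    prompt_fn: Callable[[str, list], str],
    llm_fn: Callable[[str], str],
) -> str:
    """Retrieve documents, build a grounded prompt, and ask the model."""
    search_results = search_fn(query)
    prompt = prompt_fn(query, search_results)
    return llm_fn(prompt)


# Hypothetical stand-ins so the sketch runs without minsearch or Ollama.
def fake_search(query: str) -> list:
    return [{"section": "General", "question": "Can I still join?", "text": "Yes."}]


def fake_build_prompt(query: str, search_results: list) -> str:
    context = "\n\n".join(
        f"section: {d['section']}\nquestion: {d['question']}\nanswer: {d['text']}"
        for d in search_results
    )
    return f"QUESTION: {query}\n\nCONTEXT:\n{context}"


def fake_llm(prompt: str) -> str:
    # Report the prompt length instead of calling a model.
    return f"(stub answer for a prompt of {len(prompt)} characters)"


answer = rag("Can I still join the course?", fake_search, fake_build_prompt, fake_llm)
print(answer)
```

In this notebook, the equivalent call with the real helpers would be `rag(query, search, build_prompt, llm)`, which requires the Ollama server to be running.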

In [6]:
llm('write that this is a test')

' This is a test.\n\n\nTo conduct this simple task, I followed these steps:\n\n1. Understood the instruction to simply write out "this is a test."\n\n2. Kept it brief and clear following the essence of what was requested without embellishment or additional detail beyond what\'s explicitly asked for in the command.'

In [7]:
print(_)

 This is a test.


To conduct this simple task, I followed these steps:

1. Understood the instruction to simply write out "this is a test."

2. Kept it brief and clear following the essence of what was requested without embellishment or additional detail beyond what's explicitly asked for in the command.


In [11]:
llm('what are llms?')

' LLMs, or Large Language Models, refer to a class of artificial intelligence algorithms designed for understanding and generating human-like text. These models learn from vast amounts of data so they can produce coherent and contextually relevant sentences across various topics and formats – be it essays, emails, stories, code snippets, questions, or conversational speech.\n\nLarge Language Models are based on deep learning architectures known as Transformer neural networks, which were introduced by Google\'s researchers in a paper called "Attention is All You Need" (Vaswani et al., 2017). The attention mechanism allows the model to weigh different words or tokens differently when constructing responses and understanding input. This has led to remarkable improvements over previous language processing models like GPT-3, which stands for Generative Pre-trained Transformer-3, one of the most powerful LLMs as it can generate humanlike text based on a given prompt with little additional fi

In [12]:
print(_)

 LLMs, or Large Language Models, refer to a class of artificial intelligence algorithms designed for understanding and generating human-like text. These models learn from vast amounts of data so they can produce coherent and contextually relevant sentences across various topics and formats – be it essays, emails, stories, code snippets, questions, or conversational speech.

Large Language Models are based on deep learning architectures known as Transformer neural networks, which were introduced by Google's researchers in a paper called "Attention is All You Need" (Vaswani et al., 2017). The attention mechanism allows the model to weigh different words or tokens differently when constructing responses and understanding input. This has led to remarkable improvements over previous language processing models like GPT-3, which stands for Generative Pre-trained Transformer-3, one of the most powerful LLMs as it can generate humanlike text based on a given prompt with little additional fine-t

In [13]:
query = """What is ollama?
What are the different models to run with ollama?
How can I set it up locally?
"""

llm(query)

' Ollama stands for "One Language Learner\'s Assistant Machine," and it serves as an AI-powered chatbot designed primarily for educational purposes, such as language learning. It functions using state-of-the-art models like GPT (Generative Pre-trained Transformer) to provide learners with a conversational platform where they can practice speaking or listening in their target language within the Ollama framework.\n\nHere are different ways you could potentially run an instance of Open Language Learning Machine Assistant, though it\'s crucial to understand that as of my knowledge cutoff date in March 2023, there is no direct equivalent named \'Ollama,\' and I have interpreted this request based on your description:\n\n1. **GPT-powered chatbots** are a common approach for language learning tools where you can interact with AI to practice speaking or listening skills in various languages using large pretrained models like GPT (e.g., Codex, Phi 2). Here\'s how they might work:\n\n   - You c

In [14]:
print(_)

 Ollama stands for "One Language Learner's Assistant Machine," and it serves as an AI-powered chatbot designed primarily for educational purposes, such as language learning. It functions using state-of-the-art models like GPT (Generative Pre-trained Transformer) to provide learners with a conversational platform where they can practice speaking or listening in their target language within the Ollama framework.

Here are different ways you could potentially run an instance of Open Language Learning Machine Assistant, though it's crucial to understand that as of my knowledge cutoff date in March 2023, there is no direct equivalent named 'Ollama,' and I have interpreted this request based on your description:

1. **GPT-powered chatbots** are a common approach for language learning tools where you can interact with AI to practice speaking or listening skills in various languages using large pretrained models like GPT (e.g., Codex, Phi 2). Here's how they might work:

   - You choose the de