*** Work in Progress ***
# Local RAG (pipeline in Python) with Ollama/Weaviate

Ollama
- Open-source framework (LLM backend server) designed to facilitate the deployment of LLMs in local environments (CPU and/or GPU)
  - User-friendly Command Line Interface (CLI) / chat prompt
  - Cost-effective: runs pre-trained models on local hardware (training and hosting models are expensive)
  - Valuable tool for LLM experimentation and customization based on user needs
  - Supports many models
 



### Quick Start Guide for Ollama Setup


1. **Download and Install Ollama**  
   [Download Ollama](https://ollama.com/download) and follow the installation instructions

2. **Select Models**  
   Browse and choose models from the [Ollama library](https://ollama.com/library)

3. **Pull the Models**  
   Open a terminal and pull the following models:

```bash
ollama pull llama2      # Language model
ollama pull all-minilm  # Embedding model
```

4. **Install Required Python Library**  
   Install the [Ollama Python library](https://github.com/ollama/ollama-python/blob/main/README.md):

```bash
pip install ollama==0.1.8 # Install Ollama Python library (version 0.1.8)
```

5. **Verify Model Execution**  
   Run the model in the terminal to verify that it works:

```bash
ollama run llama2
```

6. **Start the Ollama Service for Jupyter Notebook Connection**  
   Run the following command in the terminal to start the Ollama service:

```bash
ollama serve &
```


In [32]:
# Import the Ollama library to interact with the language models
import ollama

# Send a chat request to the 'llama2' model with a user message
response = ollama.chat(model='llama2', messages=[
  {
    'role': 'user',
    'content': 'Why is the sky blue?',  # Message content that the user sends to the model
  },
])

# Print the response from the model, displaying the answer to the user's question
print(response['message']['content'])




The sky appears blue because of a phenomenon called Rayleigh scattering. When sunlight enters Earth's atmosphere, it encounters tiny molecules of gases such as nitrogen and oxygen. These molecules scatter the light in all directions, but they scatter shorter (blue) wavelengths more than longer (red) wavelengths. This is known as Rayleigh scattering.

As a result of this scattering, the blue light is distributed throughout the atmosphere, giving the sky its blue color. The other colors we see in the sky, such as red and yellow, are also present, but they are less abundant and have smaller wavelengths, so they are not as easily scattered.

It's worth noting that the color of the sky can appear different under different conditions. For example, during sunrise and sunset, the sky can take on hues of red, orange, and pink due to the angle of the sunlight and the scattering of light by particles in the atmosphere. Additionally, pollution and other atmospheric factors can also affect the col

In [33]:
# Generate vector embeddings for the given text prompt using the specified model
# Embeddings are used to represent the semantic meaning of the text in a numerical format

ollama.embeddings(
    model="all-minilm",
    prompt="Llamas are members of the camelid family meaning they're pretty closely related to vicuñas and camels",
)


{'embedding': [0.0479680672287941,
  0.11637094616889954,
  -0.24570561945438385,
  -0.04406300559639931,
  -0.24932530522346497,
  0.12218563258647919,
  -0.48447176814079285,
  -0.1940533071756363,
  0.27372273802757263,
  0.1956769824028015,
  0.26291224360466003,
  -0.31583428382873535,
  0.28280389308929443,
  0.03434046357870102,
  0.17188657820224762,
  -0.13632719218730927,
  0.03992735221982002,
  0.13770951330661774,
  -0.33948075771331787,
  0.3291258215904236,
  0.1720881313085556,
  0.08957856893539429,
  0.33579421043395996,
  0.1561996340751648,
  -0.11858808994293213,
  0.43885111808776855,
  -0.11902756989002228,
  0.11152736097574234,
  -0.012823819182813168,
  0.16673514246940613,
  -0.34044894576072693,
  0.019135579466819763,
  -0.110402412712574,
  -0.059086285531520844,
  -0.3229251801967621,
  0.10102441906929016,
  0.2166978120803833,
  0.74030601978302,
  0.3397805392742157,
  0.2394002228975296,
  0.11844772100448608,
  -0.1658744364976883,
  0.32743486762046
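
Because embeddings place semantically similar texts close together, a common way to compare two of them is cosine similarity. The following is a minimal sketch in plain Python. The function name `cosine_similarity` and the toy vectors are illustrative and not part of the Ollama library; real vectors would come from `ollama.embeddings(...)["embedding"]`.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Toy 3-dimensional vectors standing in for real embeddings (illustrative values only)
v1 = [1.0, 0.0, 1.0]
v2 = [2.0, 0.0, 2.0]   # same direction as v1, so similarity is close to 1
v3 = [0.0, 1.0, 0.0]   # orthogonal to v1, so similarity is 0

print(cosine_similarity(v1, v2))
print(cosine_similarity(v1, v3))
```

Values near 1 indicate vectors pointing in nearly the same direction, which is the property a vector database relies on when it searches for the closest documents.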

### Quick Start Guide for Weaviate Setup


1. **Install Required Python Library**  
   Install the [Python Weaviate library](https://weaviate.io/developers/weaviate/client-libraries/python):

```bash
pip install -U weaviate-client  # Install or upgrade to the latest Weaviate client; pin with ==4.5.5 to reproduce that version
```




### Step 1: Generate Embeddings

Weaviate is an open-source, AI-native vector database.

The Weaviate client library includes helper functions to facilitate interaction with the Weaviate database. It provides tools to manage various aspects of your Weaviate instance, including data ingestion, querying, and performing search operations.


In [11]:
documents = [
  "Llamas are members of the camelid family meaning they're pretty closely related to vicuñas and camels",
  "Llamas were first domesticated and used as pack animals 4,000 to 5,000 years ago in the Peruvian highlands",
  "Llamas can grow as much as 6 feet tall though the average llama is between 5 feet 6 inches and 5 feet 9 inches tall",
  "Llamas weigh between 280 and 450 pounds and can carry 25 to 30 percent of their body weight",
  "Llamas are vegetarians and have very efficient digestive systems",
  "Llamas live to be about 20 years old, though some only live for 15 years and others live to be 30 years old",
]

In [14]:
# Import the Weaviate client library to interact with the Weaviate database
import weaviate

# Import specific classes from Weaviate to work with data schema and configs
import weaviate.classes as wvc
from weaviate.classes.config import Property, DataType

# Start an embedded Weaviate instance and connect to it. The Python client talks to the
# local server over HTTP (REST) and over gRPC, which allows faster data transfer

# Start process ID path: /Users/briankaewell/.cache/weaviate-embedded/
# Serving Weaviate at http://127.0.0.1:8079

client = weaviate.connect_to_embedded()

# Check if the Weaviate instance is ready and print the connection status
print(client.is_ready())


#try:
#    pass  # Do something with the client

#finally:
#    client.close()  # Ensure the connection is closed


{"action":"startup","default_vectorizer_module":"none","level":"info","msg":"the default vectorizer modules is set to \"none\", as a result all new schema classes without an explicit vectorizer setting, will use this vectorizer","time":"2024-10-14T17:43:58-04:00"}
{"action":"startup","auto_schema_enabled":true,"level":"info","msg":"auto schema enabled setting is set to \"true\"","time":"2024-10-14T17:43:58-04:00"}
{"level":"info","msg":"No resource limits set, weaviate will use all available memory and CPU. To limit resources, set LIMIT_RESOURCES=true","time":"2024-10-14T17:43:58-04:00"}
{"level":"info","msg":"module offload-s3 is enabled","time":"2024-10-14T17:43:58-04:00"}
{"level":"info","msg":"open cluster service","servers":{"Embedded_at_8079":60187},"time":"2024-10-14T17:43:58-04:00"}
{"address":"192.168.0.249:60188","level":"info","msg":"starting cloud rpc server ...","time":"2024-10-14T17:43:58-04:00"}
{"level":"info","msg":"starting raft sub-system ...","time":"2024-10-14T17:4

True


{"action":"telemetry_push","level":"info","msg":"telemetry started","payload":"\u0026{MachineID:49a4c446-9fa0-4c2b-9046-de4871732f84 Type:INIT Version:1.26.1 NumObjects:0 OS:darwin Arch:arm64 UsedModules:[]}","time":"2024-10-14T17:44:00-04:00"}
{"action":"bootstrap","level":"info","msg":"node reporting ready, node has probably recovered cluster from raft config. Exiting bootstrap process","time":"2024-10-14T17:44:00-04:00"}
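
The commented-out `try`/`finally` in the cell above hints at closing the connection once the work is done. One way to guarantee that is sketched here with the standard library's `contextlib.closing`, which calls `.close()` on exit even if an exception is raised. `run_with_client` is a hypothetical helper name, and `FakeClient` is a stand-in (not a Weaviate class) so the sketch runs without Weaviate installed.

```python
from contextlib import closing

def run_with_client(connect, work):
    """Open a client via connect(), hand it to work(), and always close it."""
    with closing(connect()) as client:
        return work(client)

class FakeClient:
    """Stand-in exposing only the two methods used here: is_ready and close."""
    def __init__(self):
        self.closed = False

    def is_ready(self):
        return True

    def close(self):
        self.closed = True

fake = FakeClient()
print(run_with_client(lambda: fake, lambda c: c.is_ready()))
print(fake.closed)

# With the real client, the call would look like:
# run_with_client(weaviate.connect_to_embedded, lambda c: c.is_ready())
```

In a notebook where one client is shared across cells, this pattern is less convenient; calling `client.close()` in the final cell serves the same purpose.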


In [15]:
# Define the name of the collection to be created or deleted
collection_name = "docs"

# Check if the collection already exists; if so, delete it to avoid conflicts
if client.collections.exists(collection_name):
    client.collections.delete(collection_name)

# Create a new collection with the specified name and define its properties
collection = client.collections.create(
    collection_name,
    properties=[
        Property(name="text", data_type=DataType.TEXT),
    ],
)

{"action":"hnsw_prefill_cache_async","level":"info","msg":"not waiting for vector cache prefill, running in background","time":"2024-10-14T18:14:55-04:00","wait_for_cache_prefill":false}
{"level":"info","msg":"Created shard docs_DaQHQt0mRIJa in 13.180042ms","time":"2024-10-14T18:14:55-04:00"}
{"action":"hnsw_vector_cache_prefill","count":1000,"index_id":"main","level":"info","limit":1000000000000,"msg":"prefilled vector cache","time":"2024-10-14T18:14:55-04:00","took":744542}


In [30]:
# Store each document and its embedding in the Weaviate collection
with collection.batch.dynamic() as batch:
    for i, d in enumerate(documents):
        response = ollama.embeddings(model="all-minilm", prompt=d)
        embedding = response["embedding"]
        # display({f'Document {i}': d, "Embedding": embedding})  # Show the document and its embedding
        batch.add_object(
            properties={"text": d},
            vector=embedding,
        )
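
Re-running the cell above adds the same six documents again, because each `add_object` call without an explicit ID gets a fresh random one (the shutdown log later in these notes reports `NumObjects:60`, which would be consistent with repeated runs). A possible way to make the import repeatable is to derive the ID from the text itself. The sketch below assumes `batch.add_object` accepts a `uuid` argument, as the v4 client documents; verify this against the installed version. The names `doc_uuid`, `add_documents`, and `RecordingBatch` are hypothetical.

```python
import uuid

def doc_uuid(text):
    """Derive a stable UUID from the document text (same text, same ID)."""
    return str(uuid.uuid5(uuid.NAMESPACE_URL, text))

def add_documents(batch, docs, embed):
    """Add each document with its embedding and a text-derived UUID.

    batch: object exposing add_object(properties=..., vector=..., uuid=...)
    embed: callable mapping a string to a list of floats
    """
    for d in docs:
        batch.add_object(properties={"text": d}, vector=embed(d), uuid=doc_uuid(d))

class RecordingBatch:
    """Stand-in that records calls so the sketch runs without Weaviate."""
    def __init__(self):
        self.objects = {}

    def add_object(self, properties, vector, uuid):
        self.objects[uuid] = (properties, vector)  # same UUID overwrites the entry

demo = RecordingBatch()
for _ in range(2):  # simulate running the cell twice
    add_documents(demo, ["llama fact A", "llama fact B"], embed=lambda t: [float(len(t))])
print(len(demo.objects))

# With the real objects from this notebook, the call would look like:
# with collection.batch.dynamic() as batch:
#     add_documents(batch, documents,
#                   lambda d: ollama.embeddings(model="all-minilm", prompt=d)["embedding"])
```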

In [18]:
# Query the collection to fetch objects, limiting the results to 1
# and including their vector representations in the response
collection.query.fetch_objects(limit=1, include_vector=True)

QueryReturn(objects=[Object(uuid=_WeaviateUUIDInt('24d2898a-0b30-4c4a-a8d0-1486600be17a'), metadata=MetadataReturn(creation_time=None, last_update_time=None, distance=None, certainty=None, score=None, explain_score=None, is_consistent=None, rerank_score=None), properties={'text': 'Llamas are vegetarians and have very efficient digestive systems'}, references=None, vector={'default': [0.37259310483932495, -0.03630436211824417, -0.3288803994655609, 0.11415065079927444, -0.238313689827919, -0.11097043007612228, -0.5469349026679993, -0.41987085342407227, 0.33660489320755005, 0.2649965286254883, 0.18338069319725037, -0.3405776619911194, -2.6931986212730408e-05, -0.0862647220492363, 0.13983894884586334, -0.03347517177462578, 0.5734943151473999, 0.11323723942041397, 0.012516232207417488, 0.1321765035390854, 0.19562536478042603, 0.0997595489025116, 0.3124409317970276, 0.0982760414481163, 0.07328663766384125, -0.059406060725450516, 0.18460765480995178, -0.17324599623680115, -0.05991170182824135

In [None]:
# Step 2: Retrieve
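
A minimal sketch of what the retrieval step could look like, assuming the `collection` and the `all-minilm` embeddings from the cells above: embed the question with the same model used for the documents, then ask Weaviate for the nearest stored vectors via `collection.query.near_vector`. The helper name `retrieve`, its parameters, and the stand-in objects are hypothetical; the stand-ins only let the sketch run without a live server.

```python
from types import SimpleNamespace

def retrieve(collection, embed, question, k=1):
    """Return the text of the k stored documents closest to the question."""
    query_vector = embed(question)  # must come from the same model as the stored vectors
    results = collection.query.near_vector(near_vector=query_vector, limit=k)
    return [obj.properties["text"] for obj in results.objects]

# Stand-in collection that mimics only the attributes used above (not Weaviate code)
def _fake_near_vector(near_vector, limit):
    docs = ["Llamas are vegetarians and have very efficient digestive systems"]
    return SimpleNamespace(
        objects=[SimpleNamespace(properties={"text": t}) for t in docs[:limit]]
    )

fake_collection = SimpleNamespace(query=SimpleNamespace(near_vector=_fake_near_vector))
print(retrieve(fake_collection, embed=lambda q: [0.0, 1.0], question="What do llamas eat?"))

# With the real objects from this notebook, the call would look like:
# retrieve(collection,
#          lambda q: ollama.embeddings(model="all-minilm", prompt=q)["embedding"],
#          "What animals are llamas related to?")
```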

In [None]:
# Step 3: Generate
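
One way the generation step could be filled in: place the retrieved text into the prompt and ask `llama2` to answer from it. The prompt wording and the names `build_prompt` and `generate_answer` are illustrative assumptions. `ollama` is imported lazily inside `generate_answer` so that the prompt builder can be tried without the library or server available.

```python
def build_prompt(context, question):
    """Combine retrieved context and the user's question into one prompt string."""
    return f"Using this data: {context}\nRespond to this prompt: {question}"

def generate_answer(context, question, model="llama2"):
    """Send the augmented prompt to a local Ollama model and return its text."""
    import ollama  # imported here so build_prompt works without the library installed
    output = ollama.generate(model=model, prompt=build_prompt(context, question))
    return output["response"]

# Only the prompt is built here; generate_answer needs a running Ollama server
print(build_prompt(
    "Llamas are vegetarians and have very efficient digestive systems",
    "What do llamas eat?",
))
```

Keeping prompt construction separate from the model call makes it easy to inspect exactly what the model receives before spending time on generation.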

In [31]:
# client.close()

{"action":"restapi_management","docker_image_tag":"unknown","level":"info","msg":"Shutting down... ","time":"2024-10-14T18:46:47-04:00"}
{"action":"restapi_management","docker_image_tag":"unknown","level":"info","msg":"Stopped serving weaviate at http://127.0.0.1:8079","time":"2024-10-14T18:46:47-04:00"}
{"action":"telemetry_push","level":"info","msg":"telemetry terminated","payload":"\u0026{MachineID:49a4c446-9fa0-4c2b-9046-de4871732f84 Type:TERMINATE Version:1.26.1 NumObjects:60 OS:darwin Arch:arm64 UsedModules:[]}","time":"2024-10-14T18:46:48-04:00"}
{"level":"info","msg":"closing raft FSM store ...","time":"2024-10-14T18:46:48-04:00"}
{"level":"info","msg":"shutting down raft sub-system ...","time":"2024-10-14T18:46:48-04:00"}
{"level":"info","msg":"transferring leadership to another server","time":"2024-10-14T18:46:48-04:00"}
{"error":"cannot find peer","level":"error","msg":"transferring leadership","time":"2024-10-14T18:46:48-04:00"}
{"level":"info","msg":"closing raft-net ...",

**Filtering the `lsof` output with `grep LISTEN` shows only the ports that are actively listening for incoming connections**

```bash
sudo lsof -i -P -n | grep LISTEN

ollama    27370   IPv4 0xdfbc846e3834546d      0t0    TCP 127.0.0.1:11434 (LISTEN)

weaviate- 35147   IPv6 0x53a13eedd360c342      0t0    TCP *:60186 (LISTEN)
weaviate- 35147   IPv6 0x91f60d5cf4124a61      0t0    TCP *:6060 (LISTEN)
weaviate- 35147   IPv4 0x2acb043cb27d74a0      0t0    TCP 192.168.0.249:60188 (LISTEN)
weaviate- 35147   IPv6 0x41d21a8cc4514707      0t0    TCP *:60187 (LISTEN)
weaviate- 35147   IPv4 0xf61841243b061880      0t0    TCP 192.168.0.249:60187 (LISTEN)
weaviate- 35147   IPv6 0x515655760ad88e38      0t0    TCP *:50050 (LISTEN)
weaviate- 35147   IPv4 0x379147eef7c0a3e5      0t0    TCP 127.0.0.1:8079 (LISTEN)
```
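
A similar check can be done from Python without `sudo` by attempting a TCP connection to each port. This is a small sketch using only the standard library; the function name `port_is_open` is hypothetical, and the ports are the ones shown in the listing above (11434 for Ollama, 8079 for embedded Weaviate).

```python
import socket

def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Report whether each local service accepts connections right now
for name, port in [("ollama", 11434), ("weaviate", 8079)]:
    print(name, port, port_is_open("127.0.0.1", port))
```

Unlike `lsof`, this only confirms that something accepts connections on the port; it does not identify which process owns it.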
