## Install ollama - local language models

Note: we need to install Ollama and LangChain in the default environment (see https://docs.aws.amazon.com/sagemaker/latest/dg/studio-lab-environments.html for details).

LangChain is already installed in the persistent environment, but it is an older version.

Some libraries for integrating LangChain with Ollama when working with text:

https://pypi.org/project/langchain-community/

https://github.com/chroma-core/chroma

ChromaDB, often referred to simply as Chroma, is an open-source embedding database designed to store, query, and manage embeddings efficiently. Embeddings are numerical representations of data, typically derived from machine learning models, that capture semantic information in a high-dimensional space. They are widely used in applications like natural language processing, recommendation systems, and image recognition.
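
To make the idea of "semantic information in a high-dimensional space" concrete, here is a minimal sketch of comparing embeddings with cosine similarity. The three vectors are made-up, hypothetical values chosen for illustration; they do not come from any real model, and Chroma's own implementation may differ.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical 3-dimensional "embeddings"; real models produce hundreds of dimensions.
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
car = [0.0, 0.2, 0.9]

print(cosine_similarity(cat, kitten))  # close to 1: similar direction
print(cosine_similarity(cat, car))     # close to 0: nearly orthogonal
```

Texts with related meanings should end up with vectors pointing in similar directions, which is what an embedding database exploits when it answers a query.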

In [None]:
!mkdir -p tmp
print("Downloading ollama...")
!curl --fail --show-error --location --progress-bar -o tmp/ollama "https://ollama.com/download/ollama-linux-amd64"
print("Installing ollama...")
!install -m755 tmp/ollama /home/studio-lab-user/.conda/envs/default/bin/ollama
print("Cleaning up...")
!rm -rf tmp
print("Upgrading langchain to work with latest models...")
%pip install -U langchain-community chromadb --quiet
print("Install complete!")


## Next, we'll launch Ollama

Note: all this code does is run `ollama serve`, but it does so in a separate thread so that the notebook is not blocked and we can keep running other cells. Running the command in a separate terminal achieves the same effect.

In [None]:
import asyncio
import threading

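async def run_process(cmd):
    print('>>> starting', *cmd)
    # Discard the server's output: a PIPE that is never read can fill up and stall the process
    process = await asyncio.create_subprocess_exec(
        *cmd,
        stdout=asyncio.subprocess.DEVNULL,
        stderr=asyncio.subprocess.DEVNULL
    )
    # Keep the coroutine (and its event loop) alive for as long as the server runs
    await process.wait()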

    
async def start_ollama_serve():
    await run_process(['ollama', 'serve'])

def run_async_in_thread(loop, coro):
    asyncio.set_event_loop(loop)
    loop.run_until_complete(coro) 
    loop.close()

# Create a new event loop that will run in a new thread 
new_loop = asyncio.new_event_loop() 

# Start ollama serve in a separate thread so the cell won't block execution 
thread = threading.Thread(target=run_async_in_thread, args=(new_loop, start_ollama_serve()))
thread.start() 
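
Because the server starts in the background, the next cell can run before Ollama is ready to accept requests. The following is a minimal sketch of a readiness check, assuming the server listens on `http://localhost:11434` as in the cells below. The helper name `wait_for_server` and its parameters are hypothetical and not part of Ollama or LangChain.

```python
import time
import urllib.error
import urllib.request


def wait_for_server(url="http://localhost:11434", timeout=30.0, interval=0.5):
    """Poll `url` until it answers an HTTP request or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval):
                return True
        except urllib.error.HTTPError:
            # The server answered, even if with an error status
            return True
        except (urllib.error.URLError, OSError):
            # Not reachable yet; wait and try again
            time.sleep(interval)
    return False

# Example: wait_for_server() returns True once the server answers, False on timeout.
```

Calling `wait_for_server()` before the first `llm.invoke(...)` avoids a connection error if the server is still starting up.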

In [None]:
from langchain_community.llms import Ollama

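# Download the model first (assumes it has not been pulled yet; skip otherwise)
!ollama pull llama3

llm = Ollama(model="llama3", base_url='http://localhost:11434')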

print(llm.invoke("Tell me a joke"))

In [None]:
print(llm.invoke("Tell me another joke"))

In [None]:
print(llm.invoke("Tell me a joke about AI"))

In [None]:
print(llm.invoke("why is the sky blue"))

## Working with text data (RAG - Retrieval-Augmented Generation)

https://ollama.com/blog/embedding-models


In LangChain, `VectorstoreIndexCreator` is a helper that builds an index for storing and retrieving vectors efficiently. Vectors are points in a multi-dimensional space (here, the embeddings of the loaded text) and are widely used in data analysis, machine learning, and natural language processing. The index organizes these vectors so that they can be searched and queried quickly.

The `as_retriever` method converts the index's vector store, `index.vectorstore`, into a retriever object that performs (approximate) nearest-neighbor search. The `search_kwargs={"k": 10}` parameter tells the retriever to return the 10 nearest neighbors of each query vector. Nearness is measured with a similarity metric such as cosine similarity or Euclidean distance.
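
The retrieval step can be illustrated with a minimal sketch of top-`k` search. It uses exact, brute-force Euclidean distance over a tiny in-memory list, whereas a real vector store may use an approximate index. The function `top_k`, the texts, and the 2-dimensional vectors are all hypothetical and only serve as an illustration.

```python
import heapq
import math


def top_k(query_vec, store, k):
    """Return the k (text, vector) entries closest to query_vec by Euclidean distance."""
    return heapq.nsmallest(k, store, key=lambda item: math.dist(query_vec, item[1]))


# Hypothetical store of (text, embedding) pairs with made-up 2-d vectors.
store = [
    ("regression models", [0.1, 0.9]),
    ("deep learning", [0.9, 0.2]),
    ("neural networks", [0.8, 0.3]),
    ("survey sampling", [0.2, 0.1]),
]

for text, _ in top_k([0.9, 0.25], store, k=2):
    print(text)
```

With this data the two nearest entries are "deep learning" and "neural networks". In the chain, the retrieved chunks play the same role: they are passed to the language model as context for answering the question.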


In [None]:
course_codes = ['STATS5073',
                'STATS5074',
                'STATS5075',
                'STATS5076',
                'STATS5077',
                'STATS5078',
                'STATS5079',
                'STATS5080',
                'STATS5081',
                'STATS5082',
                'STATS5083',
                'STATS5084']

base_url = "https://www.gla.ac.uk/postgraduate/taught/dataanalyticsonline/?card=course&code="

course_list = [base_url + c for c in course_codes]



In [None]:
course_list[10]

In [None]:

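from langchain.chains import ConversationalRetrievalChain
from langchain.indexes import VectorstoreIndexCreator
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.embeddings import OllamaEmbeddings

# Download the embedding model (assumes it has not been pulled yet; skip otherwise)
!ollama pull nomic-embed-text

# Embed text with a local Ollama embedding model
oembed = OllamaEmbeddings(base_url="http://localhost:11434", model="nomic-embed-text")

# Load the course pages, then embed and index them
loader = WebBaseLoader(web_paths=course_list[0:12])
index = VectorstoreIndexCreator(embedding=oembed).from_loaders([loader])

# Answer questions over the 10 most similar chunks, taking the chat history into account
chain = ConversationalRetrievalChain.from_llm(
  llm=llm,
  retriever=index.vectorstore.as_retriever(search_kwargs={"k": 10}),
)

chat_history = []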


In [None]:
query = "Summarize the combined content of all the courses. Group similar topics together"
result = chain.invoke({"question": query, "chat_history": chat_history})
print(result['answer'])

chat_history.append((query, result['answer']))

In [None]:
query = "Summarize the deep learning components of the programme"
result = chain.invoke({"question": query, "chat_history": chat_history})
print(result['answer'])

chat_history.append((query, result['answer']))

In [None]:
mostpopularpage = ['https://en.wikipedia.org/wiki/Bill_Walton']


oembed = OllamaEmbeddings(base_url="http://localhost:11434", model="nomic-embed-text")

loader = WebBaseLoader(web_paths=mostpopularpage)
index = VectorstoreIndexCreator(embedding=oembed).from_loaders([loader])

chain = ConversationalRetrievalChain.from_llm(
  llm=llm,
  retriever=index.vectorstore.as_retriever(search_kwargs={"k": 10}),
)

chat_history = []

In [None]:
query = "Where was Bill Walton born?"
result = chain.invoke({"question": query, "chat_history": chat_history})
print(result['answer'])

chat_history.append((query, result['answer']))

In [None]:
query = "Are you sure you don't have information about his early life?"
result = chain.invoke({"question": query, "chat_history": chat_history})
print(result['answer'])

chat_history.append((query, result['answer']))

In [None]:
from langchain_community.llms import Ollama
ollama = Ollama(
    base_url='http://localhost:11434',
    model="llama3"
)
print(ollama.invoke("why is the sky blue"))