## Testing Llama2 and Solar Models
This section outlines the preliminary tests that will be conducted with the Llama2 and Solar models to evaluate their performance in a local environment. The models have been configured to run completely locally, without the need for connection to external servers.
![alt text](../sagemaker_documentation/images/Notebook1_arch.png)

##### Conclusions
1. The tests in this notebook show that the local Llama2 (3 GB) and Solar (8 GB) models cannot process simple prompts and contexts of fewer than 20 tokens in an acceptable time. The exercise is therefore developed with the OpenAI API as a proof of concept, to speed up the feasibility exploration.
2. A locally stored vector store may also require significant processing as the solution scales, so it was decided to use [Chroma](https://github.com/chroma-core/chroma).


### Initial Setup

In [1]:
# Imports:
import os
from dotenv import load_dotenv

from langchain_openai.chat_models import ChatOpenAI
from langchain_community.llms import Ollama
from langchain_core.output_parsers import StrOutputParser
from langchain_community.document_loaders import PyPDFLoader
from langchain.prompts import PromptTemplate
from langchain_openai.embeddings import OpenAIEmbeddings
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import DocArrayHnswSearch

load_dotenv()
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
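
`os.getenv` returns `None` when a variable is missing, so a misconfigured `.env` file would only surface later as an authentication error. A minimal sketch of an early check follows; `require_env` is a hypothetical helper that is not part of the notebook, and the demo variable is a throwaway value so the example runs without a real key.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or fail with a clear message.

    Hypothetical helper for illustration; not part of the original notebook.
    """
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Environment variable {name!r} is not set; check the .env file.")
    return value


# Demonstration with a throwaway variable (assumption: no real key is needed here).
os.environ["DEMO_API_KEY"] = "sk-demo"
print(require_env("DEMO_API_KEY"))
```

In the notebook this could be called as `require_env("OPENAI_API_KEY")` right after `load_dotenv()`, but only when an OpenAI model is selected.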

In [2]:
# Model definition:

MODEL = "gpt-3.5-turbo"
# MODEL = "llama2"
# MODEL = "solar"

**Model selector.** The development is meant to run with, and be compared across, three models: two local models (Llama2 and Solar) and the OpenAI model.

In [3]:
# Model decision:

if MODEL.startswith("gpt"):
    model = ChatOpenAI(api_key=OPENAI_API_KEY, model=MODEL)
    embeddings = OpenAIEmbeddings()
else:
    model = Ollama(model=MODEL)
    # Use the selected local model for embeddings instead of the OllamaEmbeddings default.
    embeddings = OllamaEmbeddings(model=MODEL)

# test
model.invoke("What is Uganda?")

AIMessage(content='Uganda is a landlocked country located in East Africa. It is bordered by South Sudan to the north, Kenya to the east, Tanzania to the south, Rwanda to the southwest, and the Democratic Republic of the Congo to the west. The capital city of Uganda is Kampala. The country is known for its diverse wildlife, including mountain gorillas, as well as its beautiful landscapes and friendly people. Uganda gained independence from British colonial rule in 1962.', response_metadata={'token_usage': {'completion_tokens': 94, 'prompt_tokens': 11, 'total_tokens': 105}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': 'fp_c2295e73ad', 'finish_reason': 'stop', 'logprobs': None}, id='run-61680b69-e9c1-4a47-88cb-9c633eaa3de7-0')
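
The branch in the cell above picks a backend from the model name alone. As a sketch, the same rule can be isolated in a pure function so it can be checked without instantiating any client. `backend_for` and the labels it returns are illustrative names, not LangChain API.

```python
def backend_for(model_name: str) -> str:
    """Map a model name to the backend that serves it.

    Mirrors the notebook's rule: names starting with "gpt" go to OpenAI,
    everything else is assumed to be served locally by Ollama.
    """
    return "openai" if model_name.startswith("gpt") else "ollama"


for name in ("gpt-3.5-turbo", "llama2", "solar"):
    print(f"{name} -> {backend_for(name)}")
```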

#### Run times of the different models ####
>- The gpt-3.5-turbo model takes 14 seconds
>- The Llama2 model takes 70 seconds
>- The Solar model takes 440 seconds
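
The notebook does not show how these run times were measured. One way to collect comparable numbers is to wrap each call in a timer. The sketch below uses `time.perf_counter`; `timed_call` is a hypothetical helper and `fake_model` is a stand-in for `model.invoke` so the example runs without any model installed.

```python
import time
from typing import Any, Callable, Tuple


def timed_call(fn: Callable[[str], Any], prompt: str) -> Tuple[Any, float]:
    """Call fn(prompt) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(prompt)
    elapsed = time.perf_counter() - start
    return result, elapsed


def fake_model(prompt: str) -> str:
    """Stand-in for model.invoke; sleeps briefly to simulate latency."""
    time.sleep(0.05)
    return f"echo: {prompt}"


answer, seconds = timed_call(fake_model, "What is Uganda?")
print(f"{seconds:.2f} s -> {answer}")
```

In the notebook, `timed_call(model.invoke, "What is Uganda?")` would time whichever model is currently selected.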

**The embedding size is evaluated.**

In [4]:
test_text = "This is a test sentence."
embedding_vector = embeddings.embed_documents([test_text])

In [5]:
len(embedding_vector[0])

1536
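
`embed_documents` returns one vector per input text, so the embedding size is the length of any one of them. The sketch below also checks that every vector has the same length before the size is reused elsewhere (for example as `n_dim`). `embedding_dim` is a hypothetical helper and the vectors are toy data, not real embeddings.

```python
from typing import List, Sequence


def embedding_dim(vectors: Sequence[Sequence[float]]) -> int:
    """Return the shared length of all vectors, or raise if they differ."""
    if not vectors:
        raise ValueError("no vectors given")
    sizes = {len(v) for v in vectors}
    if len(sizes) != 1:
        raise ValueError(f"inconsistent embedding sizes: {sorted(sizes)}")
    return sizes.pop()


# Toy 3-dimensional vectors standing in for real embeddings.
toy_vectors: List[List[float]] = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
print(embedding_dim(toy_vectors))  # 3
```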

**LangChain testing: input and output parsing**

In [6]:
# Parse outputs

parser = StrOutputParser()

chain = model | parser
chain.invoke("what is Uganda?")

'Uganda is a country located in East Africa, bordered by Kenya to the east, South Sudan to the north, the Democratic Republic of the Congo to the west, Rwanda to the southwest, and Tanzania to the south. The capital city of Uganda is Kampala, and the official languages are English and Swahili. Uganda is known for its diverse wildlife, including the endangered mountain gorillas in Bwindi Impenetrable National Park, as well as its beautiful landscapes and friendly people.'
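
The `model | parser` expression chains two steps so that the output of the first becomes the input of the second. The toy classes below illustrate that idea with Python's `__or__` operator. They are a simplified sketch, not LangChain's actual implementation, and `Step`, `toy_model`, and `to_text` are made-up names.

```python
from typing import Any, Callable


class Step:
    """A callable wrapper that supports `a | b` composition."""

    def __init__(self, fn: Callable[[Any], Any]) -> None:
        self.fn = fn

    def invoke(self, value: Any) -> Any:
        return self.fn(value)

    def __or__(self, other: "Step") -> "Step":
        # Run self first, then feed its result into `other`.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Stand-in "model" returning a message-like dict, and a "parser" extracting its text.
toy_model = Step(lambda prompt: {"content": f"Answer to: {prompt}"})
to_text = Step(lambda message: message["content"])

toy_chain = toy_model | to_text
print(toy_chain.invoke("what is Uganda?"))
```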

In [8]:
# Load pdf in memory

loader = PyPDFLoader("C:/Users/user/Documents/LOKA_Test/Senior_ML_Tech_Assessment_LOKA.pdf")
pages = loader.load_and_split()
pages

[Document(page_content='AI Engineering\nSenior ML Engineer Tech Assessment\nIntroduction\nThank you for accepting the technical challenge for Loka. The goal of this challenge is to evaluate\nyour problem-solving skills, technical knowledge, and ability to put it into practice.\nWe recommend that you first think through your solution and draft it before implementing as much\nas possible. Note that Loka works closely with AWS, so we will highly value solutions that are cloud-\ncompatible, particularly with AWS. We do not expect you to spend money or have a powerful machine at\nyour disposal, so feel free to not deploy the most expensive/compute-intensive parts of your architecture\nand just mock them (locally, on colab, or however makes sense to you).\nWe provide you with a dataset of documents and a description of a general case that is common in the\nfield and based on an internal project named Clementine. Note that for simplicity, the dataset provided\nis AWS documentation (same as Cl

In [10]:
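for doc in pages:
    # Note: dir(pages) inspects the list itself, so the same list methods are printed once per page.
    # Use dir(doc) here to display the attributes and methods of each Document instead.
    print(dir(pages))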

['__add__', '__class__', '__class_getitem__', '__contains__', '__delattr__', '__delitem__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getstate__', '__gt__', '__hash__', '__iadd__', '__imul__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__reversed__', '__rmul__', '__setattr__', '__setitem__', '__sizeof__', '__str__', '__subclasshook__', 'append', 'clear', 'copy', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']
['__add__', '__class__', '__class_getitem__', '__contains__', '__delattr__', '__delitem__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getstate__', '__gt__', '__hash__', '__iadd__', '__imul__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__r
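
To see what a single page object holds, it is usually more informative to inspect one element than the list that contains it. The sketch below uses a stand-in dataclass with the two fields that LangChain documents expose (`page_content` and `metadata`). `PageStub` and its sample values are hypothetical so the example runs without loading a PDF.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class PageStub:
    """Stand-in for a loaded page: text plus a metadata dictionary."""
    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


# Sample values are invented for the demonstration.
stub_pages: List[PageStub] = [
    PageStub("first page text", {"page": 0}),
    PageStub("second page text", {"page": 1}),
]

for doc in stub_pages:
    # vars(doc) lists the instance fields of each element, not the list's methods.
    print(sorted(vars(doc)), len(doc.page_content))
```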

In [11]:
# Prompt template
template = """
Answer the question based only on the context below. If you can't answer, say 'I don't know'.

## Do not give your opinion or an explanation; reply based only on the context.

Context: {context}

Question: {question}
"""

prompt = PromptTemplate.from_template(template)
print(prompt.format(context="here is some context", question="Here is a question"))


Answer the question based only on the context below. If you can't answer, say 'I don't know'.

## Do not give your opinion or an explanation; reply based only on the context.

Context: here is some context

Question: Here is a question



In [12]:
# usable chain
chain = prompt | model | parser

In [13]:
chain.input_schema.schema()

{'title': 'PromptInput',
 'type': 'object',
 'properties': {'context': {'title': 'Context', 'type': 'string'},
  'question': {'title': 'Question', 'type': 'string'}}}
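
The schema lists `context` and `question` because those are the placeholders in the template. As a rough illustration of where such names come from, the standard-library `string.Formatter` can extract them from a format string. This is a sketch of the idea, not how LangChain builds its schema, and `placeholder_names` is a hypothetical helper.

```python
from string import Formatter
from typing import List


def placeholder_names(template: str) -> List[str]:
    """Return the distinct `{name}` fields of a format string, in order of appearance."""
    names: List[str] = []
    for _literal, field_name, _spec, _conversion in Formatter().parse(template):
        # field_name is None for trailing literal text with no placeholder.
        if field_name and field_name not in names:
            names.append(field_name)
    return names


toy_template = "Context: {context}\n\nQuestion: {question}\n"
print(placeholder_names(toy_template))  # ['context', 'question']
```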

In [14]:
chain.invoke(
    {
        "context": "The name i was given is Andres",
        "question": "what is my name?"
    }
)

'Your name is Andres.'

##### Vector storage

In [15]:


vectores = DocArrayHnswSearch.from_documents(
    pages,
    embedding=embeddings,
    work_dir='C:/Users/user/Documents/Code/rag_open_src_model_V0',
    n_dim=1536,
)

KeyError: 'space'
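
The cell above fails before an index is built. Independently of which store is used, the retrieval step it is meant to provide is a nearest-neighbour lookup over embedding vectors. The brute-force sketch below shows that lookup on toy 3-dimensional vectors using only the standard library. It is an illustration under simplified assumptions (tiny data, cosine similarity), not a replacement for DocArray or Chroma, and all names in it are made up.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float], store: List[Tuple[str, Sequence[float]]], k: int = 1) -> List[str]:
    """Return the texts of the k stored vectors most similar to the query."""
    ranked = sorted(store, key=lambda item: cosine(query, item[1]), reverse=True)
    return [text for text, _vector in ranked[:k]]


# Toy store: (text, vector) pairs with invented values.
toy_store = [
    ("page about AWS", [1.0, 0.0, 0.0]),
    ("page about the challenge", [0.0, 1.0, 0.0]),
]
print(top_k([0.9, 0.1, 0.0], toy_store))  # ['page about AWS']
```

A real vector store replaces the full sort with an approximate index, which is what keeps lookups fast as the number of pages grows.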