#**Llama 2**

Llama 2 is a collection of pretrained and fine-tuned generative text models, ranging from 7 billion to 70 billion parameters. The fine-tuned variants, called Llama 2-Chat, are optimized for dialogue use cases.

According to Meta, the chat models outperform open-source chat models on most benchmarks tested and are on par with some popular closed-source models in human evaluations for helpfulness and safety.

[Llama 2 13B-chat](https://huggingface.co/meta-llama/Llama-2-13b-chat)

#  Quantized Models from the Hugging Face Community

#**Step 1: Install All the Required Packages**

#**Step 2: Import All the Required Libraries**

In [None]:
import os
import sys
import pinecone
from langchain.llms import Replicate
from langchain.vectorstores import Pinecone
from langchain.text_splitter import CharacterTextSplitter
from langchain.document_loaders import PyPDFLoader
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.chains import ConversationalRetrievalChain

In [6]:
# Replicate API token
!python key_setup.py

In [8]:
#https://towardsai.net/p/machine-learning/fine-tuning-a-llama-2-7b-model-for-python-code-generation

In [19]:
# Load and preprocess the PDF document
loader = PyPDFLoader('./data/complete_works.pdf')
documents = loader.load()

# Split the documents into smaller chunks for processing
text_splitter = CharacterTextSplitter(chunk_size=3000, chunk_overlap=0, separator='\n')
texts = text_splitter.split_documents(documents)
# reference https://medium.com/@woyera/how-to-chat-with-your-pdf-using-python-llama-2-41df80c4e674
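
To make the effect of `chunk_size` and `separator` concrete, here is a minimal sketch of greedy, separator-based chunking. It is a simplified illustration, not LangChain's actual implementation: `split_into_chunks` is a hypothetical helper, it ignores `chunk_overlap`, and it uses a tiny `chunk_size` so the result is easy to read.

```python
def split_into_chunks(text: str, chunk_size: int, separator: str = "\n") -> list[str]:
    """Greedily pack separator-delimited pieces into chunks of at most chunk_size characters."""
    chunks: list[str] = []
    current = ""
    for piece in text.split(separator):
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size or not current:
            # Keep growing the current chunk; a single oversized piece is kept whole.
            current = candidate
        else:
            # Adding this piece would exceed the limit, so close the chunk and start a new one.
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


sample = "line one\nline two\nline three\nline four"
chunks = split_into_chunks(sample, chunk_size=20)
for chunk in chunks:
    print(repr(chunk))
```

With `chunk_size=3000` as in the cell above, each chunk would hold as many whole lines as fit within roughly 3000 characters.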

In [23]:
# Use HuggingFace embeddings for transforming text into numerical vectors
embeddings = HuggingFaceEmbeddings()
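
Once text is turned into vectors, "similar" passages are those whose vectors point in similar directions. The sketch below shows cosine similarity on hand-made toy vectors; the three-dimensional vectors are made up for illustration (real embeddings have hundreds of dimensions), and the metric a given Pinecone index actually uses depends on how that index was configured.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


# Toy vectors (hypothetical values, not real embedding output).
query_vec = [1.0, 0.0, 1.0]
doc_close = [0.9, 0.1, 0.8]
doc_far = [0.0, 1.0, 0.0]

print(cosine_similarity(query_vec, doc_close))
print(cosine_similarity(query_vec, doc_far))
```

The first score is close to 1 and the second is 0, so a retriever ranking by this score would prefer `doc_close`.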

## Create Pinecone vector Database

In [24]:
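import pinecone
# Set up the Pinecone API key and environment.
# Assumption: credentials are read from the PINECONE_API_KEY and PINECONE_ENV
# environment variables (hypothetical names); the original cell omits them.
pinecone.init(
    api_key=os.environ["PINECONE_API_KEY"],
    environment=os.environ["PINECONE_ENV"],
)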

In [25]:
# Set up the Pinecone vector database
index_name = "heartful-index"
vectordb = Pinecone.from_documents(texts, embeddings, index_name=index_name)

## Use replicate to run queries

In [38]:
# Initialize the Replicate Llama 2 model.
# Alternative (13B chat): "a16z-infra/llama13b-v2-chat:df7690f1994d94e96ad9d568eac121aecf50684a0b0963b25a41cc40061269e5"
llm = Replicate(
    model="meta/llama-2-7b-chat:8e6975e5ed6174911a6ff3d60540dfd4844201974602551e10e9e87ab143d81e",
    input={"temperature": 0.2, "max_length": 3000}
)

## Create Chain

In [39]:
# Set up the Conversational Retrieval Chain
qa_chain = ConversationalRetrievalChain.from_llm(
    llm,
    vectordb.as_retriever(search_kwargs={'k': 10}),
    return_source_documents=True
)
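
Conceptually, a conversational retrieval chain does three things per turn: it folds the chat history into a standalone question, retrieves the `k` most relevant chunks, and asks the LLM to answer from that context. The sketch below mimics this flow with stand-ins. Everything in it is hypothetical: word overlap replaces vector search, string concatenation replaces the LLM-based question rewriting, and `echo_llm` replaces the real model.

```python
from typing import Callable, Dict, List, Tuple


def word_overlap(query: str, doc: str) -> int:
    """Toy relevance score: number of distinct words shared by query and doc."""
    return len(set(query.lower().split()) & set(doc.lower().split()))


def retrieve_top_k(query: str, docs: List[str], k: int) -> List[str]:
    """Return the k documents with the highest toy relevance score."""
    return sorted(docs, key=lambda d: word_overlap(query, d), reverse=True)[:k]


def answer_with_retrieval(
    question: str,
    chat_history: List[Tuple[str, str]],
    docs: List[str],
    llm: Callable[[str], str],
    k: int = 2,
) -> Dict[str, object]:
    # Step 1: build a standalone question (stand-in: prepend earlier questions).
    previous = " ".join(q for q, _ in chat_history)
    standalone = (previous + " " + question).strip()
    # Step 2: fetch the k most relevant chunks for the standalone question.
    context = retrieve_top_k(standalone, docs, k)
    # Step 3: ask the model to answer using only the retrieved context.
    prompt = "Context:\n" + "\n".join(context) + "\n\nQuestion: " + question
    return {"answer": llm(prompt), "source_documents": context}


def echo_llm(prompt: str) -> str:
    """Stub model that only reports how much prompt text it received."""
    return f"stub answer from {len(prompt)} prompt characters"


docs = ["meditation calms the mind", "pinecone stores vectors", "llama 2 generates text"]
result = answer_with_retrieval("what is meditation", [], docs, echo_llm, k=1)
print(result["source_documents"])
```

A larger `k`, such as the `k=10` used above, gives the model more context per question at the cost of a longer prompt.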

### Chatting with the chat bot

In [41]:
chat_history = []

In [None]:
# Start chatting with the chatbot
while True:
    query = input('Prompt: ')
    if query.lower() in ["exit", "quit", "q"]:
        print('Exiting')
        # break leaves the loop without stopping the notebook kernel
        # (sys.exit() would raise SystemExit inside the notebook)
        break
    result = qa_chain({'question': query, 'chat_history': chat_history})
    print('Answer: ' + result['answer'] + '\n')
    chat_history.append((query, result['answer']))
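
The loop logic can also be exercised without typing at a prompt or calling the real model. The following is a minimal sketch under stated assumptions: `run_chat` and `fake_chain` are hypothetical names, and the fake chain only echoes the question, but it accepts and returns the same dictionary keys used in the loop above.

```python
from typing import Callable, Dict, Iterable, List, Tuple

EXIT_WORDS = {"exit", "quit", "q"}


def run_chat(
    queries: Iterable[str],
    chain: Callable[[Dict[str, object]], Dict[str, str]],
) -> List[Tuple[str, str]]:
    """Feed queries to the chain until an exit word appears; return the history."""
    history: List[Tuple[str, str]] = []
    for query in queries:
        if query.lower() in EXIT_WORDS:
            break
        result = chain({"question": query, "chat_history": history})
        # Each turn is stored as a (question, answer) tuple, as in the loop above.
        history.append((query, result["answer"]))
    return history


def fake_chain(inputs: Dict[str, object]) -> Dict[str, str]:
    """Stand-in for qa_chain: echoes the question and the turn number."""
    turn = len(inputs["chat_history"]) + 1
    return {"answer": f"echo: {inputs['question']} (turn {turn})"}


history = run_chat(["hello", "what is meditation", "quit", "ignored"], fake_chain)
print(history)
```

Because the history is passed back in on every call, the chain can resolve follow-up questions that refer to earlier turns.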

In [43]:
chat_history

[('What is maxim 1 in complete works book',
  ' Thank you for providing me with the context! Based on the text provided, Maxim 1 in "Complete Works" is:\n"The external ways adopted for the purpose began to cast their effect upon the mind and thus the internal purity too began to develop. This continued process supplemented by our firm attention upon the Ideal contributed greatly to the attainment of the highest purity."'),
 ('what is meditation as per the book',
  ' According to the "Complete Works," the author defines meditation as creating disturbances and worries in our minds. By meditating, we create a temporary lull in our mind and calmness prevails for the time during which we are in touch with the divine force. However, meditation only at a certain fixed hour is not enough, as we are thus in touch with the sacred thought only for a while after which we have no idea of God whatsoever and are for most part of the day away from the path of service and devotion. This is why often af

#**Step 5: Create a Prompt Template**

In [None]:
prompt = "You are a meditation trainer"
prompt_template = f'''SYSTEM: You are a helpful, respectful and honest guru. Always answer as helpfully as possible.

USER: {prompt}

ASSISTANT:
'''
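
To reuse the same layout with different messages, the template can be wrapped in a small function. This is a sketch only: `build_prompt` is a hypothetical helper, and the `SYSTEM` / `USER` / `ASSISTANT` layout simply mirrors the template above rather than any format the model requires.

```python
def build_prompt(system_message: str, user_message: str) -> str:
    """Fill the SYSTEM / USER / ASSISTANT layout with the given messages."""
    return (
        f"SYSTEM: {system_message}\n\n"
        f"USER: {user_message}\n\n"
        "ASSISTANT:\n"
    )


example_prompt = build_prompt(
    "You are a helpful, respectful and honest guru.",
    "How do I start a daily meditation practice?",  # hypothetical user question
)
print(example_prompt)
```

Keeping the system message and the user message as separate arguments makes it easier to change one without touching the other.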