Following the tutorial https://medium.com/@thakermadhav/build-your-own-rag-with-mistral-7b-and-langchain-97d0c92fa146 to build a RAG (retrieval-augmented generation) pipeline that answers queries about research being done at UCSD. The database info is pulled from Dimensions.

Also referencing the LangChain quickstart: https://python.langchain.com/v0.1/docs/use_cases/question_answering/quickstart/

Using the Mistral-7B model from HuggingFace. Installation instructions: https://huggingface.co/docs/transformers/installation

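Pip installations: <br/>
pip install --upgrade huggingface_hub <br/>
install PyTorch (the exact command depends on OS and CUDA version, see https://pytorch.org/get-started/locally/) <br/>
pip install transformers <br/>
pip install python-dotenv (needed for the load_dotenv import below)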

For the LangChain packages:
pip install --upgrade --quiet  langchain langchain-community langchainhub langchain-openai langchain-chroma bs4

LangChain is for prototyping; LangSmith is for production. LangSmith traces and monitors the LLM calls to help make the application more reliable, and it has a UI for inspecting what each step of the chain is doing.

In [2]:
from transformers import AutoModelForCausalLM, AutoTokenizer

In [6]:
import os
from dotenv import load_dotenv
import getpass

load_dotenv()

hf_access_token = os.getenv('ACCESS_TOKEN') # get access token from .env file
openai_api_key = os.getenv('OPENAI_API_KEY')
lc_api_key = os.getenv('LANGCHAIN_API_KEY')
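
Since `os.getenv` returns `None` silently when a key is absent, a quick check right after loading can save a confusing authentication error later. A minimal sketch, assuming the three variable names read above are the ones in the `.env` file; `missing_env_vars` and `REQUIRED_KEYS` are hypothetical names, not part of any library:

```python
import os


def missing_env_vars(names, env=None):
    """Return the names in `names` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


# Assumption: these are the same variable names read with os.getenv above.
REQUIRED_KEYS = ["ACCESS_TOKEN", "OPENAI_API_KEY", "LANGCHAIN_API_KEY"]

missing = missing_env_vars(REQUIRED_KEYS)
if missing:
    print("Missing from .env:", ", ".join(missing))
```

Passing a plain dict as `env` makes the helper easy to try without touching the real environment.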

In [8]:
# start logging traces to use LangSmith
os.environ["LANGCHAIN_TRACING_V2"] = "true"

In [9]:
import bs4
from langchain import hub
from langchain_community.document_loaders import WebBaseLoader
from langchain_chroma import Chroma
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

In [10]:
# use DocumentLoaders: objects that load data from a source and return a list of Documents for our vector database
# each Document has page_content and metadata
# TODO: make a csv file with publications that we want to load in
# https://python.langchain.com/v0.1/docs/modules/data_connection/document_loaders/csv/ 
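
To get a feel for the "one Document has page_content and metadata" shape before the real CSV exists, here is a standard-library sketch of a row-per-document conversion. It is only an approximation of what a CSV loader such as the one linked above produces. `Doc`, `rows_to_docs`, the `title`/`year` columns, the `publications.csv` file name, and the sample rows are all placeholders, not the actual Dimensions export:

```python
import csv
import io
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a LangChain Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def rows_to_docs(csv_file, source="publications.csv"):
    """Turn each CSV row into one Doc; columns become 'name: value' lines."""
    docs = []
    for i, row in enumerate(csv.DictReader(csv_file)):
        text = "\n".join(f"{key}: {value}" for key, value in row.items())
        docs.append(Doc(page_content=text, metadata={"source": source, "row": i}))
    return docs


# Placeholder data standing in for the publications file from the TODO.
sample = io.StringIO("title,year\nExample paper A,2021\nExample paper B,2023\n")
docs = rows_to_docs(sample)
print(docs[0].page_content)
print(docs[0].metadata)
```

Keeping the row index and source in `metadata` means a retrieved chunk can be traced back to the publication it came from.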

In [None]:
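import torch

# the device to load the model onto; fall back to CPU when no GPU is available
device = "cuda" if torch.cuda.is_available() else "cpu"

model_id = "mistralai/Mistral-7B-Instruct-v0.2"
# half precision on GPU roughly halves memory use compared with the float32 default
dtype = torch.float16 if device == "cuda" else torch.float32
model = AutoModelForCausalLM.from_pretrained(model_id, token=hf_access_token, torch_dtype=dtype)
tokenizer = AutoTokenizer.from_pretrained(model_id, token=hf_access_token)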

messages = [
    {"role": "user", "content": "What is your favourite condiment?"},
    {"role": "assistant", "content": "Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!"},
    {"role": "user", "content": "Do you have mayonnaise recipes?"}
]

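# format the conversation with the model's chat template and tokenize it
encodeds = tokenizer.apply_chat_template(messages, return_tensors="pt")

model_inputs = encodeds.to(device)
model.to(device)

# sample up to 1000 new tokens, then decode prompt plus completion back to text
generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)
decoded = tokenizer.batch_decode(generated_ids)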
print(decoded[0])
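
For the RAG step, the `messages` list above would carry retrieved text instead of a hard-coded chat. A rough sketch of one way to fold retrieved snippets into a single user turn, which keeps the same user/assistant message format as the example. `build_rag_messages`, the prompt wording, and the placeholder snippets are all assumptions, not taken from the tutorial:

```python
def build_rag_messages(question, snippets):
    """Fold retrieved snippets and the question into a single user turn."""
    # Number the snippets so the answer can refer back to them.
    context = "\n\n".join(f"[{i + 1}] {s}" for i, s in enumerate(snippets))
    prompt = (
        "Answer the question using only the context below. "
        "If the context is not enough, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return [{"role": "user", "content": prompt}]


# Placeholder inputs; in the real pipeline the snippets come from the vector store.
messages = build_rag_messages(
    "Who works on topic X?",
    ["placeholder snippet one", "placeholder snippet two"],
)
print(messages[0]["content"])
```

The returned list can be passed to `tokenizer.apply_chat_template` in place of the condiment conversation.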


