##### pip install langsmith

##### pip install langchain

#### LLM chain using a local open-source model (llama3 via Ollama)

In [4]:
from langchain_community.llms import Ollama

llm = Ollama(model="llama3")
llm.invoke("what are good llm models?")

'There are many excellent LLM (Large Language Model) models out there, each with their own strengths and weaknesses. Here are some notable ones:\n\n1. **BERT** (Bidirectional Encoder Representations from Transformers): Developed by Google, BERT is a pioneering model that has set the bar high for many other LLMs. It\'s widely used in NLP tasks like question answering, sentiment analysis, and text classification.\n2. **RoBERTa** (Robustly Optimized BERT Pre-training Approach): Another Google-developed model, RoBERTa is an improvement over BERT, with better performance on long-range dependencies and more efficient training procedures.\n3. **Transformer-XL**: This model, developed by the Google Brain team, is a larger and more powerful version of BERT. It\'s designed for longer-range dependencies and has achieved state-of-the-art results in various NLP tasks.\n4. **Longformer**: Developed by Facebook AI, Longformer is a specialized LLM designed specifically for long-range dependencies and 

###### langchain_core.prompts is the LangChain module for creating and manipulating prompts for language models. ChatPromptTemplate is a class in this module that structures chat-based interactions with a language model as a list of role-tagged messages.
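
As a rough illustration of what a chat prompt template does, the sketch below fills `{placeholders}` in a list of (role, text) pairs with plain `str.format`. This is not LangChain's implementation; `fill_chat_template` is a hypothetical helper introduced only for this example.

```python
# Hypothetical stand-in for a chat prompt template: a list of (role, text) pairs.
messages = [
    ("system", "You are a helpful assistant."),
    ("user", "{user_input}"),
]


def fill_chat_template(template, **variables):
    """Return the messages with every {placeholder} replaced by its value."""
    return [(role, text.format(**variables)) for role, text in template]


filled = fill_chat_template(messages, user_input="what are good llm models?")
for role, text in filled:
    print(f"{role}: {text}")
```

The real class does more (message objects, validation of input variables), but the core idea is the same: the template is fixed, and only the variables change per call.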

In [17]:
from langchain_core.prompts import ChatPromptTemplate
# from transformers import AutoModelForCausalLM, AutoTokenizer
# from langchain_core.prompts import BaseMessagePromptTemplate

template = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("user", "{user_input}"),
])

chain = template | llm 
chain.invoke({"user_input": "what are good llm models?"})



"As your helpful assistant, I'd be happy to give you some insights on popular LLM (Large Language Model) models!\n\nThere are many excellent LLMs out there, and the best one for you depends on your specific needs and goals. Here are a few notable ones:\n\n1. **BERT (Bidirectional Encoder Representations from Transformers)**: Developed by Google, BERT is a groundbreaking model that has achieved state-of-the-art results in various NLP tasks. It's widely used for natural language processing, question-answering, and text classification.\n2. **RoBERTa (Robustly Optimized BERT Pretraining Approach)**: Another impressive model from the same authors as BERT, RoBERTa is designed to handle longer sequences and has been shown to perform well on various tasks like sentence classification and sentiment analysis.\n3. **DistilBERT**: This compact version of BERT is trained using a distillation process, which enables it to be smaller (only 40% the size of the original BERT) while still retaining most 

##### The StrOutputParser in the langchain_core.output_parsers module is used to parse the output of a language model into a string format
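
To make the parsing step concrete, here is a minimal pure-Python sketch of a pipeline chained with `|` whose last stage returns plain text. It assumes the parser's job is to hand back a string whether the model produced a string or a message-like object with a `.content` field. `Step`, `FakeMessage`, `fake_llm`, and the other names are hypothetical and only mimic the idea; they are not LangChain classes.

```python
class Step:
    """Wraps a one-argument function so steps can be chained with `|`."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # a | b -> a new Step that runs a, then feeds its result to b
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.func(value)


class FakeMessage:
    """Stand-in for a chat-model reply that carries its text in `.content`."""

    def __init__(self, content):
        self.content = content


# Hypothetical stages: a prompt builder, a fake model, and a string parser.
build_prompt = Step(lambda variables: f"user: {variables['user_input']}")
fake_llm = Step(lambda prompt: FakeMessage(f"echo -> {prompt}"))
to_string = Step(lambda out: out if isinstance(out, str) else out.content)

sketch_chain = build_prompt | fake_llm | to_string
result = sketch_chain.invoke({"user_input": "what are good llm models?"})
print(result)
```

Because every stage exposes the same `invoke` method, stages can be swapped or appended without touching the rest of the pipeline.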

In [25]:
from langchain_core.output_parsers import StrOutputParser

parser = StrOutputParser()

# chain = template | llm | parser builds a three-stage pipeline:
# 1. The prompt template fills in the user input to produce a prompt.
# 2. The prompt is passed to the language model (LLM), which generates a response.
# 3. The parser converts the response into a plain string.
chain = template | llm | parser
chain.invoke({"user_input": "what are good llm models?"})

"I'm glad you asked!\n\nThere are many excellent LLM (Large Language Model) models out there, each with their own strengths and specialties. Here are some popular ones:\n\n1. **BERT (Bidirectional Encoder Representations from Transformers)**: A widely used and highly effective model for natural language processing tasks like question answering, sentiment analysis, and text classification.\n2. **RoBERTa (Robustly Optimized BERT Pretraining Approach)**: An improved version of BERT that uses a different approach to optimize the model's performance on masked language modeling.\n3. **DistilBERT**: A smaller, more efficient version of BERT designed for deployment in production environments.\n4. **XLNet**: A general-purpose LLM that excels at tasks like machine translation, text summarization, and question answering.\n5. **T5 (Text-to-Text Transfer with Cross-Attention)**: A model that can be fine-tuned for various NLP tasks, including text generation, machine translation, and text classifica

##### pip install beautifulsoup4

###### BeautifulSoup integrates well with other Python libraries, such as requests for making HTTP requests to fetch web pages and pandas for data manipulation and analysis

In [12]:
import requests
from bs4 import BeautifulSoup

url = 'https://en.wikipedia.org/wiki/Large_language_model'
response = requests.get(url)

# parse the content
soup = BeautifulSoup(response.content, 'html.parser')
title = soup.title.string

# Find every <a> tag on the page and print its href attribute
# (link.get('href') returns None when a tag has no href)
links = soup.find_all('a')
for link in links:
    print(link.get('href'))

#bodyContent
/wiki/Main_Page
/wiki/Wikipedia:Contents
/wiki/Portal:Current_events
/wiki/Special:Random
/wiki/Wikipedia:About
//en.wikipedia.org/wiki/Wikipedia:Contact_us
https://donate.wikimedia.org/wiki/Special:FundraiserRedirector?utm_source=donate&utm_medium=sidebar&utm_campaign=C13_en.wikipedia.org&uselang=en
/wiki/Help:Contents
/wiki/Help:Introduction
/wiki/Wikipedia:Community_portal
/wiki/Special:RecentChanges
/wiki/Wikipedia:File_upload_wizard
/wiki/Main_Page
/wiki/Special:Search
/w/index.php?title=Special:CreateAccount&returnto=Large+language+model
/w/index.php?title=Special:UserLogin&returnto=Large+language+model
/w/index.php?title=Special:CreateAccount&returnto=Large+language+model
/w/index.php?title=Special:UserLogin&returnto=Large+language+model
/wiki/Help:Introduction
/wiki/Special:MyContributions
/wiki/Special:MyTalk
#
#History
#Alternative_architecture
#Dataset_preprocessing
#Probabilistic_tokenization
#BPE
#Dataset_cleaning
#Synthetic_data
#Training_and_architecture
#Re
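
The hrefs printed above are a mix of site-relative paths, protocol-relative URLs, and in-page fragments. A small sketch of how they could be resolved into absolute URLs with the standard library's `urllib.parse.urljoin` follows; the sample hrefs are copied from the output, and `sample_hrefs` and `absolute_links` are names introduced only for this example.

```python
from urllib.parse import urljoin

base_url = "https://en.wikipedia.org/wiki/Large_language_model"

# A few hrefs copied from the output above
sample_hrefs = [
    "/wiki/Main_Page",
    "//en.wikipedia.org/wiki/Wikipedia:Contact_us",
    "#History",
    None,  # link.get('href') returns None for <a> tags without an href
]

# Skip missing hrefs and resolve the rest against the page URL
absolute_links = [urljoin(base_url, href) for href in sample_hrefs if href]
for link in absolute_links:
    print(link)
```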

##### pip install faiss-cpu (using FAISS vectorstore)

###### FAISS stands for Facebook AI Similarity Search, which is a library for efficient similarity search and clustering of dense vectors. It's commonly used for building large-scale vector stores where vectors can be efficiently indexed and queried based on their similarity.
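
To show what "querying vectors by similarity" means, here is a brute-force sketch in plain Python. It does not call FAISS; it compares the query against every stored vector, which is the exhaustive baseline that a library such as FAISS is built to make efficient at scale. Cosine similarity is only one possible metric, and the toy vectors and the names `toy_index` and `search` are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 3-dimensional "embeddings"; real embedding vectors have far more dimensions.
toy_index = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.0, 1.0, 0.0],
    "doc_c": [0.9, 0.1, 0.0],
}


def search(query_vector, index, k=2):
    """Return the names of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query_vector, vec), name) for name, vec in index.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]


top = search([1.0, 0.0, 0.0], toy_index)
print(top)
```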

##### Using an embedding model to ingest documents into a vector store (llama3)

In [57]:
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import WebBaseLoader
from langchain_core.prompts import ChatPromptTemplate
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_community.llms import Ollama
from langchain_core.documents import Document
from langchain.chains import create_retrieval_chain



llm = Ollama(model="llama3")

loader = WebBaseLoader("https://en.wikipedia.org/wiki/Large_language_model")
docs = loader.load()

# Initialize embeddings and text splitter
embeddings = OllamaEmbeddings(model="llama3")
text_splitter = RecursiveCharacterTextSplitter()

# Split the page into chunks, embed each chunk, and index the vectors in FAISS
documents = text_splitter.split_documents(docs)
vector = FAISS.from_documents(documents, embeddings)

prompt = ChatPromptTemplate.from_template("""Answer the following question based only on the provided context:

<context>
{context}
</context>

Question: {input}""")

# Build a chain that inserts ("stuffs") all context documents into the {context} slot of the prompt and sends the result to the LLM
document_chain = create_stuff_documents_chain(llm, prompt)



document_chain.invoke({
    "input": "what are good llm models?",
    "context": [Document(page_content="Large language model")]
})


"Based on my training data, I'd say that some well-known and widely-used Large Language Models (LLMs) include:\n\n1. BERT (Bidirectional Encoder Representations from Transformers): Developed by Google, it's a pre-trained language model that's achieved state-of-the-art results in many NLP tasks.\n2. RoBERTa (Robustly Optimized BERT Pre-training Approach): Another popular LLM developed by Facebook AI, which has shown improved performance over BERT on some benchmarks.\n3. T5 (Text-to-Text Transformer): A text-to-text transformer model designed for a wide range of NLP tasks, including language translation, question answering, and more.\n4. DistillBERT: A smaller, more efficient version of BERT that's been distilled from the original model using knowledge distillation.\n5. Electra: A pre-training task-based approach to LLMs that uses an adversarial training procedure to improve its performance.\n\nThese are just a few examples, and there are many other LLM models out there with their own st

In [58]:
# Retrieval pipeline: fetch the most similar chunks from the vector store, then pass them to document_chain as context
retriever = vector.as_retriever()
retrieval_chain = create_retrieval_chain(retriever, document_chain)
response = retrieval_chain.invoke({"input": "what are good llm models?"})
print(response["answer"])

Based on the provided context, some good LLM (Large Language Model) models mentioned include:

1. Mistral 8x7b: This model has been described as the most powerful open LLM, being more powerful than GPT-3.5 but not as powerful as GPT-4.
2. GPT-4 Turbo: This model has a maximum output of 4096 tokens and is capable of generating long conversations.

It's worth noting that these models are based on the Transformer architecture, which is currently the most popular architecture for LLMs. Additionally, other models like Google's Gemini 1.5 and Anthropic's Claude 2.1 have also been mentioned as having large context windows and being able to generate long conversations.
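
The retrieve-then-answer flow used above can be sketched without any library. In this hedged example a word-overlap score stands in for vector similarity and no model is called; the sketch only shows how the selected chunks end up inside the prompt. `retrieve`, `stuff_prompt`, and the sample chunks are hypothetical.

```python
def retrieve(question, chunks, k=2):
    """Toy retriever: rank chunks by how many question words they contain."""
    words = set(question.lower().split())
    ranked = sorted(
        chunks,
        key=lambda chunk: len(words & set(chunk.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def stuff_prompt(question, context_chunks):
    """Place every retrieved chunk into the context slot of the prompt."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the following question based only on the provided context:\n\n"
        f"<context>\n{context}\n</context>\n\n"
        f"Question: {question}"
    )


chunks = [
    "A large language model is trained on large amounts of text.",
    "FAISS is a library for similarity search over dense vectors.",
    "Bananas are rich in potassium.",
]

question = "what is a large language model"
selected = retrieve(question, chunks)
final_prompt = stuff_prompt(question, selected)
print(final_prompt)
```

The answer can only be as good as the retrieved chunks, which is why the second run above cites models mentioned on the page rather than the ones the model listed from memory.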


In [67]:
from langchain.chains import create_history_aware_retriever
from langchain_core.prompts import MessagesPlaceholder
from langchain_core.messages import HumanMessage, AIMessage


# Define the prompt template with placeholders for the chat history and the latest user input
prompt = ChatPromptTemplate.from_messages([
    MessagesPlaceholder(variable_name="chat_history"),
    ("user", "{input}"),
    ("user", "Given the above conversation, generate a search query to look up to get information relevant to the conversation"),
])

# Create the history-aware retriever from the LLM, the base retriever, and the prompt template
history_aware_retriever = create_history_aware_retriever(llm, retriever, prompt)


chat_history = [HumanMessage(content="what embeddings perform?"), AIMessage(content="Yes!")]
history_aware_retriever.invoke({
    "chat_history": chat_history,
    "input": "what are embeddings"
})



[Document(page_content='^ "bigscience/bloom · Hugging Face". huggingface.co.\n\n^ Taylor, Ross; Kardas, Marcin; Cucurull, Guillem; Scialom, Thomas; Hartshorn, Anthony; Saravia, Elvis; Poulton, Andrew; Kerkez, Viktor; Stojnic, Robert (16 November 2022). "Galactica: A Large Language Model for Science". arXiv:2211.09085 [cs.CL].\n\n^ "20B-parameter Alexa model sets new marks in few-shot learning". Amazon Science. 2 August 2022.\n\n^ Soltan, Saleh; Ananthakrishnan, Shankar; FitzGerald, Jack; et\xa0al. (3 August 2022). "AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model". arXiv:2208.01448 [cs.CL].\n\n^ "AlexaTM 20B is now available in Amazon SageMaker JumpStart | AWS Machine Learning Blog". aws.amazon.com. 17 November 2022. Retrieved 13 March 2023.\n\n^ a b c "Introducing LLaMA: A foundational, 65-billion-parameter large language model". Meta AI. 24 February 2023.\n\n^ a b c "The Falcon has landed in the Hugging Face ecosystem". huggingface.co. Retrieved 2023-06-2
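
The control flow behind a history-aware retriever can be sketched as follows, assuming it works like this: with no chat history the user input is searched as-is, and otherwise a model first rewrites the conversation into a standalone search query. `history_aware_search`, `fake_rewrite`, and `fake_search` are hypothetical stand-ins for the LLM call and the vector-store lookup, and the sample conversation is invented for the example.

```python
def history_aware_search(user_input, chat_history, rewrite_query, search):
    """Search the raw input when there is no history, else a rewritten query."""
    if not chat_history:
        query = user_input
    else:
        query = rewrite_query(chat_history, user_input)
    return search(query)


def fake_rewrite(chat_history, user_input):
    """Stand-in for the LLM: prepend earlier human turns to the new input."""
    earlier = " ".join(text for role, text in chat_history if role == "human")
    return f"{earlier} {user_input}".strip()


def fake_search(query):
    """Stand-in for the retriever: echo the query it was asked to look up."""
    return [f"results for: {query}"]


no_history = history_aware_search("what are embeddings", [], fake_rewrite, fake_search)
with_history = history_aware_search(
    "how are they trained",
    [("human", "what are embeddings"), ("ai", "Vectors that represent text.")],
    fake_rewrite,
    fake_search,
)
print(no_history)
print(with_history)
```

The rewrite step matters for follow-up questions such as "how are they trained", which carry no searchable subject on their own.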