**Large Language Models (LLMs)** are deep learning models with billions of parameters that perform well across a wide range of natural language processing tasks. They can handle translation, sentiment analysis, and chatbot conversations without being trained specifically for each task. Instead of fine-tuning, an LLM can be steered toward a task through **prompting**.

**Architecture**: LLMs are typically built from stacked neural network layers, including embedding layers, attention layers, and feedforward layers. Together, these layers process the input text and produce predictions for the output.

* [1. Maximum number of tokens](#max_tokens)
* [2. Token Distributions and Predicting the Next Token](#distr)
    * [2.1. Tracking Token Usage](#tracking-usage)
* [3. Few-shot learning](#few-shot)
* [4. Prompt Examples](#prompts)
    * [4.1. Question-Answering Prompt Template](#qa)
    * [4.2. Text Summarization](#summarization)
    * [4.3. Text Translation](#translation)

<hr>
<a class="anchor" id="max_tokens">
    
## 1. Maximum number of tokens
    
</a>

In LangChain, the context size (the maximum number of tokens the model can process) is determined by the underlying model rather than by the library. For example, the original GPT-3 model supports at most 2,049 tokens, while `text-davinci-003`, used below, supports 4,097.

The input text must not exceed the maximum number of tokens the model supports. For OpenAI completion models, this limit covers the prompt and the generated completion together, so the prompt needs to leave room for the output. When the input is too long, one option is to split it into smaller chunks, process each chunk separately, and then combine the results as needed.

In [1]:
import os
from keys import OPENAI_API_KEY
os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY

In [None]:
####################################################################
## PSEUDOCODE to handle text that exceeds the maximum token limit ##
####################################################################
from langchain.llms import OpenAI

llm = OpenAI(model_name="text-davinci-003")

# Define the input text
input_text = "your_long_input_text"  ## this input can be really long, exceeding the token limit of the given model

# Determine the maximum number of tokens from documentation
max_tokens = 4097

# Split the input text into chunks based on the max tokens
text_chunks = split_text_into_chunks(input_text, max_tokens)

# Process each chunk separately
results = []
for chunk in text_chunks:
    result = llm(chunk)  # LLM objects are called directly, as in the cells below
    results.append(result)

# Combine the results as needed
final_result = combine_results(results)

##
## NOTE: split_text_into_chunks and combine_results 
## are custom functions that will be covered later
##
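The two helpers named in the pseudocode above are not defined in this section. As a rough illustration only, they could look like the sketch below. It assumes that whitespace-separated words are an acceptable stand-in for tokens, which is a simplification: real tokenizers usually produce more tokens than words, so the chunk size should leave headroom. The joining strategy in `combine_results` is likewise a hypothetical choice.

```python
from typing import List


def split_text_into_chunks(text: str, max_tokens: int) -> List[str]:
    """Split text into chunks of at most max_tokens whitespace-separated words.

    ASSUMPTION: one word is treated as one token. This undercounts real
    tokens, so pick max_tokens well below the model's actual limit.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    words = text.split()
    return [
        " ".join(words[i:i + max_tokens])
        for i in range(0, len(words), max_tokens)
    ]


def combine_results(results: List[str]) -> str:
    """Join per-chunk outputs with blank lines (hypothetical strategy)."""
    return "\n\n".join(r.strip() for r in results)


chunks = split_text_into_chunks("one two three four five", max_tokens=2)
print(chunks)  # ['one two', 'three four', 'five']
```

Splitting on word boundaries keeps the example dependency-free; a tokenizer-based splitter would count tokens more accurately.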

<hr>
<a class="anchor" id="distr">
    
## 2. Token Distributions and Predicting the Next Token
    
</a>

GPT-3 and GPT-4, two prominent large language models, are pretrained on vast quantities of text. They learn to predict the next token in a sequence from the context provided by the preceding tokens. GPT-family models use causal language modeling: each prediction has access only to the tokens before it, which lets the model generate contextually relevant text one token at a time.
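To make the idea of a token distribution concrete, here is a toy sketch of how raw scores for candidate next tokens can be turned into probabilities and a single choice. The candidate tokens and their scores are invented for illustration and do not come from any real model. Picking the most probable token is the greedy strategy, which is roughly what a temperature of 0 asks for.

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Convert raw scores into a probability distribution.

    Lower temperatures sharpen the distribution; temperature must be > 0
    (greedy selection is the limiting case as it approaches 0).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for continuing "The capital of France is"
candidates = ["Paris", "London", "pizza"]
logits = [4.0, 2.0, 0.5]

probs = softmax(logits)
next_token = candidates[probs.index(max(probs))]  # greedy choice
print(next_token)  # Paris
```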

In [2]:
from langchain.llms import OpenAI

llm = OpenAI(model_name="text-davinci-003", temperature=0)
text = "What would be a good name for a company that produces colorful t-shirts?"

print(llm(text))



Vivid Threads.


<hr>
<a class="anchor" id="tracking-usage">
    
### 2.1. Tracking Token Usage
    
</a>

In [5]:
from langchain.llms import OpenAI
from langchain.callbacks import get_openai_callback

llm = OpenAI(model_name="text-davinci-003", n=2, best_of=2)

with get_openai_callback() as cb:
    result = llm("Tell me a joke about dogs")
    print(cb)

Tokens Used: 50
	Prompt Tokens: 6
	Completion Tokens: 44
Successful Requests: 1
Total Cost (USD): $0.001


In [6]:
result

'\n\nQ: What do you call a dog magician?\nA: A labracadabrador!'
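The figures in the usage report above can be checked by hand: the total is the sum of prompt and completion tokens, and the cost is proportional to that total. The sketch below assumes a flat rate of 0.02 USD per 1,000 tokens. That rate is an assumption chosen because it is consistent with the printed total, not a quoted price, and `estimate_cost` is a hypothetical helper rather than part of LangChain.

```python
def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  usd_per_1k_tokens: float) -> float:
    """Estimate request cost from token counts and a flat per-1K-token rate."""
    total_tokens = prompt_tokens + completion_tokens
    return total_tokens * usd_per_1k_tokens / 1000


# Token counts taken from the callback output above; the rate is assumed.
cost = estimate_cost(prompt_tokens=6, completion_tokens=44, usd_per_1k_tokens=0.02)
print(f"${cost:.3f}")  # $0.001
```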

<hr>
<a class="anchor" id="few-shot">
    
## 3. Few-shot learning
    
</a>

Few-shot learning is the ability of LLMs to pick up a task and generalize from a handful of examples supplied directly in the prompt, without any change to the model's weights. The prompt is the model's input, so the way the examples are written and arranged largely determines how well this works.

In [7]:
# Creating Template 

from langchain import PromptTemplate
from langchain import FewShotPromptTemplate

# Create examples
examples = [
    {
        "query": "What's the weather like?",
        "answer": "It's raining cats and dogs, better bring an umbrella!"
    }, {
        "query": "How old are you?",
        "answer": "Age is just a number, but I'm timeless."
    }
]

# Create an example template
example_template = """
User: {query}
AI: {answer}
"""

# Create a prompt example from the above template
example_prompt = PromptTemplate(
    input_variables=["query", "answer"],
    template=example_template
)

# Break the previous prompt into a prefix and suffix
# the prefix is our instructions
# and the suffix is the user input and output indicator
prefix = """The following are excerpts from conversations with an AI
assistant. The assistant is known for its humor and wit, providing
entertaining and amusing responses to users' questions. Here are some
examples:
"""
suffix = """
User: {query}
AI: """

# Create the few-shot prompt template
few_shot_prompt_template = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    prefix=prefix,
    suffix=suffix,
    input_variables=["query"],
    example_separator="\n\n"
)
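It can help to see the kind of text a few-shot template produces before it reaches the model. The sketch below is a plain-Python approximation of the assembly step (prefix, rendered examples, then suffix, joined by a separator). It is not LangChain's implementation, and `build_few_shot_prompt` and the shortened prefix are hypothetical names and values used only for illustration.

```python
from typing import Dict, List


def build_few_shot_prompt(prefix: str, examples: List[Dict[str, str]],
                          example_template: str, suffix: str,
                          separator: str = "\n\n", **inputs: str) -> str:
    """Render each example, fill the suffix with the user input, and join."""
    rendered = [example_template.format(**ex) for ex in examples]
    pieces = [prefix, *rendered, suffix.format(**inputs)]
    return separator.join(pieces)


demo_examples = [
    {"query": "How old are you?",
     "answer": "Age is just a number, but I'm timeless."},
]

prompt_text = build_few_shot_prompt(
    prefix="The assistant is known for its humor and wit. Here are some examples:",
    examples=demo_examples,
    example_template="User: {query}\nAI: {answer}",
    suffix="User: {query}\nAI: ",
    query="What's the meaning of life?",
)
print(prompt_text)
```

The prompt ends with `AI: ` so that the model's most natural continuation is the assistant's reply in the same style as the examples.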

In [8]:
from langchain.chat_models import ChatOpenAI
from langchain import LLMChain

# Load the model
chat = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0.0)

chain = LLMChain(llm=chat, prompt=few_shot_prompt_template)
chain.run("What's the meaning of life?")

'To find the perfect balance between pizza and ice cream.'

<hr>
<a class="anchor" id="prompts">
    
## 4. Prompt Examples
    
</a>

In [9]:
!pip install -q huggingface_hub

In [10]:
from keys import HUGGINGFACEHUB_API_TOKEN
os.environ["HUGGINGFACEHUB_API_TOKEN"] = HUGGINGFACEHUB_API_TOKEN

<hr>
<a class="anchor" id="qa">
    
### 4.1. Question-Answering Prompt Template
    
</a>

In [11]:
# Creating a simple question-answering prompt template using LangChain
from langchain import PromptTemplate

template = """Question: {question}

Answer: """
prompt = PromptTemplate(
    template=template,
    input_variables=['question']
)

# User question
question = "What is the capital city of France?"

In [12]:
# Using the Hugging Face model "google/flan-t5-large" to answer the question
from langchain import HuggingFaceHub, LLMChain

# Initialize Hub LLM
hub_llm = HuggingFaceHub(
    repo_id='google/flan-t5-large',
    model_kwargs={'temperature':0}
)

# Create prompt template > LLM chain
llm_chain = LLMChain(
    prompt=prompt,
    llm=hub_llm
)

# Ask the user question about the capital of France
print(llm_chain.run(question))

paris


In [13]:
# Asking Multiple Questions

# Approach 1
# Iterating through all questions one at a time

qa = [
    {'question': "What is the capital city of France?"},
    {'question': "What is the largest mammal on Earth?"},
    {'question': "Which gas is most abundant in Earth's atmosphere?"},
    {'question': "What color is a ripe banana?"}
]

res = llm_chain.generate(qa)
print( res )

generations=[[Generation(text='paris', generation_info=None)], [Generation(text='giraffe', generation_info=None)], [Generation(text='nitrogen', generation_info=None)], [Generation(text='yellow', generation_info=None)]] llm_output=None run=RunInfo(run_id=UUID('587dabe2-2813-4f2e-9369-49e97dafa377'))


In [14]:
# Asking Multiple Questions

# Approach 2
# Placing all questions into a single prompt
# This method performs best on more capable models

multi_template = """Answer the following questions one at a time.

Questions:
{questions}

Answers:
"""
long_prompt = PromptTemplate(template=multi_template, input_variables=["questions"])

llm_chain = LLMChain(
    prompt=long_prompt,
    llm=llm  # reuses the OpenAI llm defined in an earlier cell
)

qs_str = (
    "What is the capital city of France?\n" +
    "What is the largest mammal on Earth?\n" +
    "Which gas is most abundant in Earth's atmosphere?\n" +
    "What color is a ripe banana?\n"
)

llm_chain.run(qs_str)

'Paris\nBlue whale\nNitrogen\nYellow'
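With the single-prompt approach, the answers come back as one string, so they still need to be matched to their questions. A minimal sketch follows, assuming the model returns exactly one answer per line in the same order as the questions. That is not guaranteed, which is why the hypothetical helper `pair_answers` checks the count before pairing.

```python
from typing import Dict, List


def pair_answers(questions: List[str], raw_answer: str) -> Dict[str, str]:
    """Map each question to the answer on the corresponding output line."""
    answers = [line.strip() for line in raw_answer.splitlines() if line.strip()]
    if len(answers) != len(questions):
        raise ValueError(
            f"expected {len(questions)} answers, got {len(answers)}"
        )
    return dict(zip(questions, answers))


questions = [
    "What is the capital city of France?",
    "What is the largest mammal on Earth?",
    "Which gas is most abundant in Earth's atmosphere?",
    "What color is a ripe banana?",
]

# The string returned by the chain in the cell above
qa_pairs = pair_answers(questions, "Paris\nBlue whale\nNitrogen\nYellow")
print(qa_pairs["What is the largest mammal on Earth?"])  # Blue whale
```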

<hr>
<a class="anchor" id="summarization">
    
### 4.2. Text Summarization
   
</a>

In [26]:
# Set up the necessary imports and an instance of the OpenAI language model
from langchain.chat_models import ChatOpenAI
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate

llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)

In [27]:
# Define a prompt template for summarization
summarization_template = "Summarize the following text to one short sentence: {text}"
summarization_prompt = PromptTemplate(input_variables=["text"], template=summarization_template)
summarization_chain = LLMChain(llm=llm, prompt=summarization_prompt)

In [28]:
# Call the predict method with the text to be summarized
text = "LangChain provides many modules that can be used to build language model applications. Modules can be combined to create more complex applications, or be used individually for simple applications. The most basic building block of LangChain is calling an LLM on some input. Let’s walk through a simple example of how to do this. For this purpose, let’s pretend we are building a service that generates a company name based on what the company makes."
summarized_text = summarization_chain.predict(text=text)

In [29]:
summarized_text

'LangChain offers various modules for building language model applications, allowing users to combine them for complex applications or use them individually for simpler ones, with the basic building block being calling an LLM on input, as demonstrated in the example of generating company names based on their products.'

<hr>
<a class="anchor" id="translation">
    
### 4.3. Text Translation
   
</a>

In [32]:
# We reuse the llm variable defined in the summarization section.
# This time the prompt asks the model to translate the text from source_language to target_language.

translation_template = "Translate the following text from {source_language} to {target_language}: {text}"
translation_prompt = PromptTemplate(input_variables=["source_language", "target_language", "text"], 
                                    template=translation_template)
translation_chain = LLMChain(llm=llm, prompt=translation_prompt)

In [34]:
# Call the predict method to use the translation chain
source_language = "English"
target_language = "French"
text = "Today is the perfect day for studying"

translated_text = translation_chain.predict(source_language=source_language, 
                                            target_language=target_language, 
                                            text=text)
translated_text

"Aujourd'hui est le jour parfait pour étudier."