# Introduction
- explore how large language models learn token distributions and predict the next token, allowing them to generate human-like text that can both amaze and perplex us
## LLMs in general
- deep learning models with billions of parameters that excel at a wide range of natural language processing tasks
- perform tasks like translation, sentiment analysis, and chatbot conversations without being specifically trained for them.
### Architecture
- LLM = neural network built from embedding layers + attention layers + feed-forward layers
### Maximum tokens
- The maximum number of input tokens depends on the model, ranging from a few thousand to tens of thousands

In [8]:
text = 'My name is Tyler Durden'
max_tokens = 2
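def split_text(text, max_tokens):
    # Approximates tokens with whitespace-separated words
    i = 0
    prompt = text.split()
    to_be_returned = list()
    while i < len(prompt):
        to_be_returned.append(' '.join(prompt[i:i + max_tokens]))
        i = i + max_tokens  # advance to the start of the next chunk
    return to_be_returned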

answer = split_text(text,max_tokens)
print(answer)


['My name', 'is Tyler', 'Durden']


In [14]:
# If the input has more tokens than the model's maximum,
# it gets truncated or the request fails with an error.
# Typically, better models allow a higher maximum number of tokens.
# When the input is too long, we can use a chop-and-feed approach.

from langchain.llms import OpenAI
llm = OpenAI(model = 'text-davinci-003')
input_text = 'Some long text of yours'

# max tokens 
max_tokens = 1000

text_chunks = split_text(input_text,max_tokens)

# since we are feeding chunk by chunk, we will store the result chunk by chunk too
results = list()
for chunk in text_chunks:
    pass
#     results.append(llm(chunk)) uncomment at time of real execution
    
def combine_result(results):
#     write some combine function to connect all those separately produced results
    pass

final_result = combine_result(results)
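
The chop-and-feed loop can be tried end to end without calling a real API. In the sketch below, `fake_llm` is a hypothetical stand-in for the model call, `chunk_words` approximates tokens with whitespace-separated words, and joining the outputs with a space is only one possible combine strategy.

```python
def chunk_words(text, max_words):
    """Split text into chunks of at most max_words whitespace-separated words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def fake_llm(prompt):
    """Hypothetical stand-in for a real model call: it just upper-cases the prompt."""
    return prompt.upper()


def combine_results(results):
    """One simple combine strategy: join the per-chunk outputs in order."""
    return " ".join(results)


demo_text = "one two three four five six seven"
demo_chunks = chunk_words(demo_text, max_words=3)
demo_results = [fake_llm(chunk) for chunk in demo_chunks]
demo_final = combine_results(demo_results)

print(demo_chunks)
print(demo_final)
```

For real tasks such as summarization, the combine step usually needs more than a join, for example a second model call over the per-chunk results.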

# Tokens Distributions and Predicting the Next Token
- LLMs are trained to predict the next token given the tokens that came before it
- The GPT family uses Causal Language Modeling (CLM) to predict the next token from the knowledge of previous ones
- This helps the LLM produce relevant text
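
To make next-token prediction concrete, here is a toy sketch that estimates a next-token distribution from bigram counts. It is far simpler than what GPT models do internally (they use neural networks over long contexts); the corpus and the function names are made up for illustration.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus; real models train on vastly more text.
corpus = "the cat sat on the mat . the cat ate the fish .".split()

# Count how often each token follows each previous token.
next_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    next_counts[prev][nxt] += 1


def next_token_distribution(prev):
    """Return P(next token | previous token) as a dict of probabilities."""
    counts = next_counts[prev]
    total = sum(counts.values())
    return {token: count / total for token, count in counts.items()}


def predict_next(prev):
    """Greedy prediction: pick the most probable next token."""
    return next_counts[prev].most_common(1)[0][0]


print(next_token_distribution("the"))  # {'cat': 0.5, 'mat': 0.25, 'fish': 0.25}
print(predict_next("the"))             # cat
```

Setting `temperature=0` in an LLM call is loosely analogous to the greedy choice in `predict_next`; higher temperatures sample from the distribution instead.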

In [15]:
from langchain.llms import OpenAI

llm = OpenAI(model_name="text-davinci-003", temperature=0)

text = "What would be a good company name for a company that makes colorful socks?"

print(llm(text))



Rainbow Socks Co.


OpenAI gives a $5 credit for API usage on a free account.

We can check how much we are charged per API call using callbacks.

In [18]:
from langchain.callbacks import get_openai_callback
llm = OpenAI(model_name = 'text-davinci-003',n = 2, best_of = 2)
with get_openai_callback() as cb:
    result = llm('Tell me a LLM JOke')
    print(cb)

Tokens Used: 72
	Prompt Tokens: 7
	Completion Tokens: 65
Successful Requests: 1
Total Cost (USD): $0.00144
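
The reported cost is consistent with a flat per-token price. As a quick check, assuming a rate of $0.02 per 1,000 tokens for this model (an assumption for illustration; check the current pricing page), the arithmetic works out as follows:

```python
# Assumed flat price in USD per 1,000 tokens (an assumption, not read from the API).
PRICE_PER_1K_TOKENS = 0.02

prompt_tokens = 7
completion_tokens = 65
total_tokens = prompt_tokens + completion_tokens  # 72, as reported by the callback

estimated_cost = total_tokens / 1000 * PRICE_PER_1K_TOKENS
print(f"{total_tokens} tokens -> ${estimated_cost:.5f}")  # 72 tokens -> $0.00144
```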


In [19]:
# let's see what the joke was
print(result)



Q: What did the LLM student say to the barista?
A: "Espresso me, please!"


# Few Shot Learning
- The ability of an LLM to learn and generalize from a limited number of examples
## Process to apply FSL
- We use an instance of the `FewShotPromptTemplate` class, which takes a `PromptTemplate` and a list of few-shot examples as input
- Each example is formatted with the `PromptTemplate` passed in and inserted into the final prompt
- These examples help the model generate better responses

Let's see how to use it.

In [21]:
from langchain import PromptTemplate, FewShotPromptTemplate

# create our examples
examples = [
    {
        "query": "What's the weather like?",
        "answer": "It's raining cats and dogs, better bring an umbrella!"
    }, {
        "query": "How old are you?",
        "answer": "Age is just a number, but I'm timeless."
    }
]

# create an example template
example_template = """
User: {query}
AI: {answer}
"""

# create a prompt example from above template
example_prompt = PromptTemplate(
    input_variables=["query", "answer"],
    template=example_template
)

# now break our previous prompt into a prefix and suffix
# the prefix is our instructions
prefix = """This text was generated by:
"""
# and the suffix our user input and output indicator
suffix = """
User: {query}
AI: """

# now create the few-shot prompt template
few_shot_prompt_template = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    prefix=prefix,
    suffix=suffix,
    input_variables=["query"],
    example_separator="\n\n"
)

In [30]:
from langchain import LLMChain

chain = LLMChain(llm=llm, prompt=few_shot_prompt_template)
chain.run("What's the meaning of life?")

' The meaning of life is to find your own meaning and purpose.'
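
To see roughly what text the model receives, the sketch below assembles the same few-shot prompt with plain string formatting. It approximates how the pieces are joined (prefix, formatted examples, suffix, separated by the example separator); `build_few_shot_prompt` is a hypothetical helper written for this note, not part of LangChain.

```python
examples = [
    {
        "query": "What's the weather like?",
        "answer": "It's raining cats and dogs, better bring an umbrella!",
    },
    {
        "query": "How old are you?",
        "answer": "Age is just a number, but I'm timeless.",
    },
]

example_template = """
User: {query}
AI: {answer}
"""

prefix = """This text was generated by:
"""

suffix = """
User: {query}
AI: """


def build_few_shot_prompt(query, separator="\n\n"):
    """Hypothetical helper: join prefix, formatted examples, and suffix."""
    formatted_examples = [example_template.format(**ex) for ex in examples]
    pieces = [prefix, *formatted_examples, suffix.format(query=query)]
    return separator.join(pieces)


prompt_text = build_few_shot_prompt("What's the meaning of life?")
print(prompt_text)
```

Printing the assembled prompt is a cheap way to debug few-shot templates before spending API credits on them.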

# Emergent Abilities of LLMs

- LLMs possess emergent abilities due to extensive pre-training on large datasets.
- These abilities are not explicitly programmed but emerge as the model learns patterns.
- LangChain lets us tap these emergent abilities for different tasks through its various model types.
- LLMs can perform tasks like answering questions, generating text, and making recommendations.

# Scaling Laws and Performance

- Scaling laws describe the relationship between model size, training data, and performance.
- Generally, larger models with more training data tend to perform better.
- Performance gains show diminishing returns: improvement is not linear in model size or training data.
- Balance between model size, training data, performance, and resources is crucial.
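
As a rough illustration of diminishing returns, the sketch below assumes that loss falls as a power law in parameter count. The functional form and the constants `A` and `ALPHA` are made up for illustration and are not fitted to any real model.

```python
A = 10.0     # hypothetical scale constant
ALPHA = 0.1  # hypothetical power-law exponent


def toy_loss(num_params):
    """Toy power law: loss shrinks as the parameter count grows."""
    return A * num_params ** (-ALPHA)


sizes = [10**8, 10**9, 10**10, 10**11]
losses = [toy_loss(n) for n in sizes]

# Absolute improvement gained by each 10x increase in size.
improvements = [earlier - later for earlier, later in zip(losses, losses[1:])]

for n, loss in zip(sizes, losses):
    print(f"{n:>15,d} params -> toy loss {loss:.3f}")
print("Gain from each 10x step:", [round(g, 3) for g in improvements])
```

Under this assumed form, every 10x step costs roughly ten times more compute while buying a smaller absolute gain than the step before it, which is why the balance with resources matters.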

# Limitations of LLMs

- LLMs have notable capabilities but also limitations.
- Hallucinations can occur, producing text that seems plausible but is factually incorrect.
- Biases from training data can lead to outputs that perpetuate stereotypes or yield undesired results.

In [41]:
# !pip install -q huggingface_hub
# install hf hub if you haven't already

In [34]:
# create a short and simple question and answer template
from langchain import PromptTemplate

template = """Question: {question}

Answer: """
prompt = PromptTemplate(
    template=template,
    input_variables=['question']
)

# user question
question = "What is the capital city of France?"

- Now we will use the flan-t5-large model from the Hugging Face Hub. Since we have provided the API token in our environment variables, the model is accessed through the API.

In [38]:
from langchain import HuggingFaceHub, LLMChain
import os
huggingfacehub_api_token = os.environ.get('HUGGINGFACEHUB_API_TOKEN')
# initialize Hub LLM
hub_llm = HuggingFaceHub(
    huggingfacehub_api_token=huggingfacehub_api_token,
    repo_id='google/flan-t5-large',
    model_kwargs={'temperature': 0}
)

# create prompt template > LLM chain
llm_chain = LLMChain(
    prompt=prompt,
    llm=hub_llm
)

# ask the user question about the capital of France
print(llm_chain.run(question))

paris


Let's modify our prompt so the model can answer multiple questions.
- There are two approaches:

- **Approach 1 - Feed one by one**

In [42]:
qa = [
    {'question': "What is the capital city of France?"},
    {'question': "What is the largest mammal on Earth?"},
    {'question': "Which gas is most abundant in Earth's atmosphere?"},
    {'question': "What color is a ripe banana?"}
]
res = llm_chain.generate(qa)
print( res )

generations=[[Generation(text='paris', generation_info=None)], [Generation(text='giraffe', generation_info=None)], [Generation(text='nitrogen', generation_info=None)], [Generation(text='yellow', generation_info=None)]] llm_output=None run=[RunInfo(run_id=UUID('09876aa8-0ffa-4100-b74a-4b37bcb65da6')), RunInfo(run_id=UUID('da9e6756-3ac0-46c3-983c-6da7ed9d71cb')), RunInfo(run_id=UUID('eec65899-5ca1-4167-83b6-ddb83dd8a736')), RunInfo(run_id=UUID('d5034348-f6f8-4e95-a080-e39505ebd118'))]


- **Approach 2 - feed all at once**

In [45]:
multi_template = """Answer the following questions one at a time.

Questions:
{questions}

Answers:
"""
long_prompt = PromptTemplate(template=multi_template, input_variables=["questions"])

# note: this chain uses the OpenAI `llm` defined earlier, not `hub_llm`
llm_chain = LLMChain(
    prompt=long_prompt,
    llm=llm
)

qs_str = (
    "What is the capital city of France?\n" +
    "What is the largest mammal on Earth?\n" +
    "Which gas is most abundant in Earth's atmosphere?\n" +
    "What color is a ripe banana?\n"
)
llm_chain.run(qs_str)

'Paris\nBlue Whale\nNitrogen\nYellow'
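
Because the all-at-once approach returns a single string, the answers have to be split back out. A minimal sketch, assuming the model returns exactly one answer per line in question order (which is not guaranteed), with `raw_output` copied from the result above:

```python
questions = [
    "What is the capital city of France?",
    "What is the largest mammal on Earth?",
    "Which gas is most abundant in Earth's atmosphere?",
    "What color is a ripe banana?",
]

# Output string copied from the run above.
raw_output = "Paris\nBlue Whale\nNitrogen\nYellow"

# One answer per non-empty line.
answers = [line.strip() for line in raw_output.splitlines() if line.strip()]

# Fail loudly if the model merged or skipped answers.
if len(answers) != len(questions):
    raise ValueError("Model returned a different number of answers than questions")

paired = dict(zip(questions, answers))
for q, a in paired.items():
    print(f"{q} -> {a}")
```

The length check matters in practice: with approach 1 each answer is tied to its question by position in `generations`, while here the pairing depends entirely on the model following the format.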

# Text Summarization

In [49]:
summarize_template = 'summarize the following text in one sentence: {text}'
summarize_prompt = PromptTemplate(template = summarize_template,input_variables = ['text'])
summarize_chain = LLMChain(llm = llm,prompt=summarize_prompt)

In [50]:
text = "LangChain provides many modules that can be used to build language model applications. Modules can be combined to create more complex applications, or be used individually for simple applications. The most basic building block of LangChain is calling an LLM on some input. Let’s walk through a simple example of how to do this. For this purpose, let’s pretend we are building a service that generates a company name based on what the company makes."
summarized_text = summarize_chain.predict(text=text)

In [52]:
summarized_text.strip()

'LangChain provides modules that can be used both individually and in combination to create language model applications, with the most basic building block being the calling of an LLM on some input.'

# Text Translation

In [53]:
translation_template = "Translate the following text from {source_language} to {target_language}: {text}"
translation_prompt = PromptTemplate(input_variables=["source_language", "target_language", "text"], template=translation_template)
translation_chain = LLMChain(llm=llm, prompt=translation_prompt)

In [57]:
source_language = "English"
target_language = "Spanish"
text = "Sujay Will be new Andrej Kerpathy"
translated_text = translation_chain.predict(source_language=source_language, target_language=target_language, text=text)

In [58]:
translated_text.strip()

'Sujay será el nuevo Andrej Kerpathy.'