<h1><center>Introduction to Large Language Models (LLMs)</center></h1>
<hr><hr><hr>

In [7]:
import os
from dotenv import load_dotenv

load_dotenv()

True

### Configurations:-
----------------------

In [2]:
azure_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT")
azure_openai_api_key = os.getenv("AZURE_OPENAI_KEY")
azure_openai_api_version = "2023-05-15"
llm_deployment_name = os.getenv("GPT_DEPLOYMENT_NAME")
embedding_model_deployment_name = os.getenv("AZURE_OPENAI_ADA_EMBEDDING_DEPLOYMENT_NAME")

os.environ["OPENAI_API_TYPE"]     = "azure"
os.environ["OPENAI_API_VERSION"]  = azure_openai_api_version
os.environ["OPENAI_API_KEY"]      = azure_openai_api_key

In [3]:
from langchain_openai import AzureChatOpenAI, AzureOpenAIEmbeddings

## Tokens Distributions and Predicting the Next Token:-
---------------------------------------------------------
Large language models such as GPT-3 and GPT-4 are pretrained on vast amounts of text and learn to predict the next token in a sequence from the tokens that precede it. GPT-family models use **causal language modeling**: at each position the model sees only the earlier tokens and produces a probability distribution over the vocabulary for the next one. Generating text means repeatedly sampling from that distribution and appending the chosen token to the context, which is how LLMs produce contextually relevant text.
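
As a rough illustration of the sampling step, the sketch below turns a handful of scores into probabilities and draws one token. It is not how any particular model is implemented: the vocabulary, the scores, and the helper names `softmax` and `sample_next_token` are all made up for this example, and it assumes a temperature greater than 0.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Convert raw scores into a probability distribution."""
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_next_token(vocab, logits, temperature=1.0, rng=random):
    """Sample one token from the distribution implied by the logits."""
    probs = softmax(logits, temperature)
    return rng.choices(vocab, weights=probs, k=1)[0]

# Hypothetical candidates and scores for the context "The cat sat on the"
vocab = ["mat", "sofa", "roof", "moon"]
logits = [3.0, 2.0, 1.0, -1.0]

# A lower temperature concentrates probability on the top-scoring token.
for temperature in (0.2, 1.0):
    probs = softmax(logits, temperature)
    print(temperature, [round(p, 3) for p in probs])

print(sample_next_token(vocab, logits, temperature=0.9))
```

This is also why the `temperature` argument in the cells of this notebook matters: values near 0 make the output close to deterministic, while higher values give more varied completions.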

The following code uses LangChain and Azure OpenAI's GPT-3.5 model to complete a prompt; the completion it generates is the answer.

In [10]:
llm = AzureChatOpenAI(
    openai_api_version = azure_openai_api_version,
    azure_deployment = llm_deployment_name,
    temperature = 0.9
)

In [14]:
text = "What would be a good company name for a company that makes colorful socks?"
response = llm.invoke( text )
print( response.content )

Socktastic


## Tracking Token Usage:-
--------------------------
- You can use the **LangChain library's callback mechanism** to track token usage.
- The callback will track the tokens used, successful requests, and total cost. <br>
This is currently implemented only for the OpenAI and AzureOpenAI APIs:

In [13]:
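# get_openai_callback now lives in langchain_community; the old langchain.callbacks path is deprecated
from langchain_community.callbacks import get_openai_callback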

llm = AzureChatOpenAI(
    openai_api_version = azure_openai_api_version,
    azure_deployment = llm_deployment_name,
    temperature = 0,
    n=2
)

In [14]:
with get_openai_callback() as cb:
    result = llm.invoke("Tell me a joke")
    print(cb)

Tokens Used: 57
	Prompt Tokens: 11
	Completion Tokens: 46
Successful Requests: 1
Total Cost (USD): $0.0001085
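
To see where the total comes from, the arithmetic can be sketched by hand. The per-1K-token prices below are assumptions, chosen because they reproduce the total printed above; actual rates depend on the model and change over time. `estimate_cost` is a hypothetical helper, not part of LangChain.

```python
def estimate_cost(prompt_tokens, completion_tokens,
                  prompt_price_per_1k, completion_price_per_1k):
    """Return the estimated request cost in USD."""
    prompt_cost = prompt_tokens / 1000 * prompt_price_per_1k
    completion_cost = completion_tokens / 1000 * completion_price_per_1k
    return prompt_cost + completion_cost

# Token counts come from the callback output above; the prices are assumed.
cost = estimate_cost(
    prompt_tokens=11,
    completion_tokens=46,
    prompt_price_per_1k=0.0015,
    completion_price_per_1k=0.002,
)
print(f"Estimated cost (USD): ${cost:.7f}")
```

Prompt and completion tokens are priced separately here, which is why the callback reports them as separate counts.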


## Few-shot learning:-
------------------------
- Few-shot learning is a remarkable ability that allows LLMs to learn and generalize from limited examples.
- Prompts serve as the input to these models and play a crucial role in achieving this feature.
- With LangChain, examples can be hard-coded, but selecting them dynamically is often more powerful, letting LLMs adapt quickly to new tasks with minimal training data.

This approach uses the `FewShotPromptTemplate` class, which takes a `PromptTemplate` and a list of few-shot examples. The class formats each example with the prompt template and places the results between a prefix and a suffix, which helps the language model generate a better response:

In [16]:
# import from the submodules; the top-level `langchain` imports are deprecated
from langchain.prompts import PromptTemplate, FewShotPromptTemplate
from langchain.chains import LLMChain

llm = AzureChatOpenAI(
    openai_api_version = azure_openai_api_version,
    azure_deployment = llm_deployment_name,
    temperature = 0
)

In [18]:
# create our examples
examples = [
    {
        "query": "What's the weather like?",
        "answer": "It's raining cats and dogs, better bring an umbrella!"
    }, {
        "query": "How old are you?",
        "answer": "Age is just a number, but I'm timeless."
    }, {
        "query": "How much intelligence do you have?",
        "answer": "The word 'Intelligence' gets its meaning from me, so they are meaningless to me!"
    }
]

# create an example template
example_template = """
User: {query}
AI: {answer}
"""

# create a prompt example from above template
example_prompt = PromptTemplate(
    input_variables=["query", "answer"],
    template=example_template
)

# now break our previous prompt into a prefix and suffix
# the prefix is our instructions
prefix = """The following are excerpts from conversations with an AI
assistant. The assistant is known for its humor and wit, providing
entertaining and amusing responses to users' questions. Here are some
examples:
"""
# and the suffix our user input and output indicator
suffix = """
User: {query}
AI: """

# now create the few-shot prompt template
few_shot_prompt_template = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    prefix=prefix,
    suffix=suffix,
    input_variables=["query"],
    example_separator="\n\n"
)

fewshot_chain = LLMChain(llm=llm, prompt=few_shot_prompt_template)

In [21]:
response = fewshot_chain.invoke("What's the meaning of life?")
print( response["text"] )

To find the perfect balance between pizza and ice cream.
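
The examples above are hard-coded. One way to picture the dynamic selection mentioned earlier is the plain-Python sketch below, which keeps examples only while a word budget allows and then assembles the prompt. It does not use LangChain's own example selectors; `select_examples`, `build_prompt`, the `max_words` budget, and the shortened prefix are all hypothetical choices made for illustration.

```python
def select_examples(examples, query, max_words=40):
    """Keep examples, in order, while the prompt stays under a word budget."""
    budget = max_words - len(query.split())
    selected = []
    for example in examples:
        cost = len(example["query"].split()) + len(example["answer"].split())
        if cost > budget:
            break  # stop at the first example that no longer fits
        selected.append(example)
        budget -= cost
    return selected

def build_prompt(prefix, examples, query):
    """Assemble the prefix, the formatted examples, and the final user query."""
    blocks = [prefix.strip()]
    blocks += [f"User: {ex['query']}\nAI: {ex['answer']}" for ex in examples]
    blocks.append(f"User: {query}\nAI: ")
    return "\n\n".join(blocks)

examples = [
    {"query": "What's the weather like?",
     "answer": "It's raining cats and dogs, better bring an umbrella!"},
    {"query": "How old are you?",
     "answer": "Age is just a number, but I'm timeless."},
]
prefix = "The assistant is known for its humor and wit. Here are some examples:"

short_query = "What's the meaning of life?"
long_query = "Please explain " + "very " * 30 + "carefully what life means."

# A short query leaves room for both examples; a long one leaves room for none.
print(len(select_examples(examples, short_query)))
print(len(select_examples(examples, long_query)))
print(build_prompt(prefix, select_examples(examples, short_query), short_query))
```

Counting words is a crude stand-in for counting tokens; a real budget would be measured with the model's tokenizer.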


## Strengths of LLMs:
-----------------------
### Emergent abilities:
One strength of LLMs is their **emergent abilities**, which arise from extensive pre-training on vast datasets. These capabilities are not explicitly programmed; they emerge as the model discerns patterns in the data. LangChain exposes these abilities through several model types, such as chat models and text embedding models, so LLMs can be applied to diverse tasks, from answering questions to generating text and offering recommendations.

### Scaling laws
**Scaling laws** describe the relationship between model size, training data, and performance. Generally, as the model size and training data volume increase, so does the model's performance. However, this improvement is subject to diminishing returns and may not follow a linear pattern. It is essential to weigh the trade-off between model size, training data, performance, and resources spent on training when selecting and fine-tuning LLMs for specific tasks.
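
The diminishing returns can be illustrated with a toy power-law curve. This is only a sketch of the general shape: the `scale` and `exponent` constants are invented for the example and are not fitted to any real model, and `power_law_loss` is a hypothetical helper.

```python
def power_law_loss(num_params, scale=10.0, exponent=0.1):
    """Toy loss curve: loss falls as a power of model size."""
    return scale * num_params ** (-exponent)

sizes = [1e8, 1e9, 1e10, 1e11]
losses = [power_law_loss(n) for n in sizes]

for n, loss in zip(sizes, losses):
    print(f"{n:.0e} parameters -> loss {loss:.3f}")

# Improvement gained from each 10x increase in size
gains = [before - after for before, after in zip(losses, losses[1:])]
print([round(g, 3) for g in gains])
```

Under this toy curve every tenfold increase in size multiplies the loss by the same factor, so the absolute improvement shrinks at each step while the training cost keeps growing.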

## Shortcomings of LLMs:
---------------------------
### Hallucination:
One notable limitation is hallucination: the model produces text that appears plausible on the surface but is factually incorrect or unrelated to the given input.

### Biases:
LLMs may exhibit biases originating from their training data, resulting in outputs that can perpetuate stereotypes or generate undesired outcomes.

## Examples with Easy Prompts: Text Summarization, Text Translation, and Question Answering:
----------------------------------------------------------------------------------------------
In natural language processing, Large Language Models have become a popular tool for tackling various text-based tasks. These models can be prompted in different ways to produce a range of results, depending on the desired outcome.

### Setting Up the Environment:
---------------------------------
- To begin, install the `huggingface_hub` library in addition to the previously installed packages and dependencies.
- Create a Hugging Face API key on the Access Tokens page under your account's Settings.
- Set the key as an environment variable named `HUGGINGFACEHUB_API_TOKEN`.
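
Before calling the API, it can help to confirm that the variable is actually visible to the notebook. The check below is a minimal sketch using only the standard library; `get_token` is a hypothetical helper, not part of `huggingface_hub` or LangChain.

```python
import os

TOKEN_ENV_VAR = "HUGGINGFACEHUB_API_TOKEN"

def get_token(env=os.environ):
    """Return the Hugging Face token from the environment, or None if unset or empty."""
    value = env.get(TOKEN_ENV_VAR)
    return value if value else None

if get_token() is None:
    print(f"{TOKEN_ENV_VAR} is not set; add it to your .env file or shell environment.")
else:
    print(f"{TOKEN_ENV_VAR} is set.")
```

The check never prints the token itself, which keeps the key out of saved notebook outputs.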

In [5]:
pip show huggingface_hub

Name: huggingface-hub
Version: 0.21.3
Summary: Client library to download and publish models, datasets and other repos on the huggingface.co hub
Home-page: https://github.com/huggingface/huggingface_hub
Author: Hugging Face, Inc.
Author-email: julien@huggingface.co
License: Apache
Location: e:\programs & codes\generative ai\_genai_venv\lib\site-packages
Requires: filelock, fsspec, packaging, pyyaml, requests, tqdm, typing-extensions
Required-by: 
Note: you may need to restart the kernel to use updated packages.


### Creating a Question-Answering Prompt Template:
------------------------------------------------------
Let's create a simple question-answering prompt template using LangChain:

In [6]:
from langchain.prompts import PromptTemplate

In [11]:
template = """Question: {question}

Answer: """
question_prompt = PromptTemplate(
    template=template,
    input_variables=['question']
)

# user question
question = "What is the capital city of France?"

## Using Huggingface models:-
--------------------------------
- We will use the Hugging Face model `google/flan-t5-large` to answer the question.
- The `HuggingFaceHub` class connects to Hugging Face's Inference API and runs the specified model remotely.

In [9]:
huggingface_token = os.getenv("HUGGINGFACEHUB_API_TOKEN")

In [10]:
from langchain_community.llms import HuggingFaceHub
from langchain.chains import LLMChain

In [46]:
# Note: this cell emits a LangChainDeprecationWarning. The class
# `langchain_community.llms.huggingface_hub.HuggingFaceHub` was deprecated in
# langchain-community 0.0.21 and will be removed in 0.2.0. Use HuggingFaceEndpoint instead.

# initialize Hub LLM
hub_llm = HuggingFaceHub(
    repo_id='google/flan-t5-large',
    model_kwargs={'temperature':0}
)

# create prompt template > LLM chain
llm_chain = LLMChain(
    prompt=question_prompt,
    llm=hub_llm
)

In [33]:
# ask the user question about the capital of France
response = llm_chain.invoke(question)
print(response["text"])

charles henry scott


In [45]:
from langchain_community.llms import HuggingFaceEndpoint

# initialize Hub LLM
hub_llm2 = HuggingFaceEndpoint(
    repo_id='google/flan-t5-large',
    temperature=0,
    max_new_tokens=210
)

# create prompt template > LLM chain
llm_chain2 = LLMChain(
    prompt=question_prompt,
    llm=hub_llm2
)

Token has not been saved to git credential helper. Pass `add_to_git_credential=True` if you want to set the git credential as well.
Token is valid (permission: write).
Your token has been saved to C:\Users\Debanjan Sarkar\.cache\huggingface\token
Login successful


In [30]:
question = "Who is the inventor of digit 0?"
response = llm_chain2.invoke(question)
print(response["text"])

BadRequestError:  (Request ID: 6GPSb085Nw4JLE6Uv_cDX)

Bad request:
The following `model_kwargs` are not used by the model: ['return_full_text', 'stop', 'stop_sequences', 'watermark'] (note: typos in the generate arguments will also show up in this list)

## Asking Multiple Questions:
-------------------------------
To ask multiple questions, we can either:
- Iterate through all questions one at a time, or
- Place all questions into a single prompt for more advanced LLMs.


### First approach:

In [34]:
qa = [
    {'question': "What is the capital city of France?"},
    {'question': "What is the largest mammal on Earth?"},
    {'question': "Which gas is most abundant in Earth's atmosphere?"},
    {'question': "What color is a ripe banana?"}
]

res = llm_chain.generate(qa)
print( res )

generations=[[Generation(text='paris')], [Generation(text='giraffe')], [Generation(text='nitrogen')], [Generation(text='yellow')]] llm_output=None run=[RunInfo(run_id=UUID('d099b17c-7948-4ecb-97e2-41da70530674')), RunInfo(run_id=UUID('0c9fd2fc-05ef-41c3-8fdf-9aea78ac68a8')), RunInfo(run_id=UUID('aec620f2-f1c4-49ef-8ca5-5b980075b04d')), RunInfo(run_id=UUID('03377d2c-cd63-48f0-8750-82619f453e0b'))]


In [41]:
# each entry holds the generations for one question
for generation in res.generations:
    print(generation)

[Generation(text='paris')]
[Generation(text='giraffe')]
[Generation(text='nitrogen')]
[Generation(text='yellow')]


### Second approach:
- For the second approach, we modify the prompt template so that it includes all the questions at once.
- The language model will understand that we have multiple questions and answer them sequentially.
- This method performs best on more capable models.

In [54]:
# initialize Hub LLM
hub_llm = HuggingFaceHub(
    repo_id='google/flan-t5-large',
    model_kwargs={'temperature':0}
)

multi_template = """Answer the following questions one at a time.

Questions:
{questions}

Answers:
"""

long_prompt = PromptTemplate( 
    input_variables=["questions"],
    template=multi_template
)


llm_chain = LLMChain(
    prompt=long_prompt,
    llm=hub_llm
)


qs_str = (
    "What is the capital city of France?\n" +
    "What is the largest mammal on Earth?\n" +
    "Which gas is most abundant in Earth's atmosphere?\n" +
    "What color is a ripe banana?\n"
)

In [55]:
response = llm_chain.invoke(qs_str)

print( response )

{'questions': "What is the capital city of France?\nWhat is the largest mammal on Earth?\nWhich gas is most abundant in Earth's atmosphere?\nWhat color is a ripe banana?\n", 'text': 'Paris is the capital of France. It is also the largest city in the world. It is'}
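
When the questions already live in a list, the combined prompt can be built programmatically instead of by concatenating strings by hand. The sketch below reuses the template text from above; the helper name `build_multi_question_prompt` and the numbering of the questions are assumptions added for illustration, not part of the original cells.

```python
multi_template = """Answer the following questions one at a time.

Questions:
{questions}

Answers:
"""

def build_multi_question_prompt(questions, template=multi_template):
    """Number each question and insert the block into the template."""
    numbered = "\n".join(
        f"{index}. {question}" for index, question in enumerate(questions, start=1)
    )
    return template.format(questions=numbered)

questions = [
    "What is the capital city of France?",
    "What is the largest mammal on Earth?",
    "Which gas is most abundant in Earth's atmosphere?",
    "What color is a ripe banana?",
]

print(build_multi_question_prompt(questions))
```

Numbering gives the model an explicit cue that several separate answers are expected, though how well it follows that cue still depends on the model's capability.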
