<a href="https://colab.research.google.com/github/sheldonkemper/portfolio/blob/main/CAM_DS_Intro_to_LangChain_2_1_1_a.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**First things first** - please go to 'File' and select 'Save a copy in Drive' so that you have your own version of this activity set up and ready to use.
Remember to update the portfolio index link to your own work once completed!

# Demonstration 2.1.1.a Introduction to LangChain

In this demonstration, you will learn how to:
- Request an API key for calling an LLM from OpenAI.
- Load a closed-source model and an open-source model.
- Use the LangChain package to create templates and set up a pipeline.

**Important**: This demonstration uses closed-source models from OpenAI that require an API key. If you do not already have an account, register for one at the OpenAI developer platform. API keys are for personal use only and are subject to OpenAI’s rate limits. When this programme was written, OpenAI offered enough free usage to complete the activity; following a more recent change, you must add a small amount of credit to your account before your queries will work. You will be reimbursed for this credit.

#### Get your OpenAI key

1. Log in at [OpenAI developer platform](https://platform.openai.com/api-keys).
2. Create a new secret key.
3. Copy and paste the key into a document for safe-keeping.
4. Paste the key into the *two* locations below marked 'REPLACE WITH YOUR API KEY'.

Note that each time you run a code cell that calls the model, it sends a request to the OpenAI API. There are [OpenAI rate limits](https://platform.openai.com/docs/guides/rate-limits/usage-tiers); for example, for gpt-3.5-turbo, requests are limited to 3 per minute or 200 per day. Requests will fail if you exceed these limits.
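One way to stay under a per-minute limit is to pace requests on the client side. The sketch below is a minimal illustration, assuming the 3-requests-per-minute figure quoted above; `RequestThrottle` is a hypothetical helper, not part of the OpenAI or LangChain packages.

```python
import time
from collections import deque


class RequestThrottle:
    """Hypothetical helper: allow at most `max_calls` requests per `period` seconds."""

    def __init__(self, max_calls=3, period=60.0, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock
        self._sleep = sleep
        self._sent = deque()  # timestamps of recent requests

    def wait(self):
        """Block until one more request fits inside the rolling window."""
        now = self._clock()
        # Forget requests that have already left the rolling window.
        while self._sent and now - self._sent[0] >= self.period:
            self._sent.popleft()
        if len(self._sent) >= self.max_calls:
            # Sleep until the oldest request in the window expires.
            self._sleep(self.period - (now - self._sent[0]))
            now = self._clock()
            self._sent.popleft()
        self._sent.append(now)
```

Calling `throttle.wait()` immediately before each API call would pause the notebook rather than let a request fail. A daily cap would still need to be tracked separately.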




In [None]:
!pip install python-dotenv
!pip install openai
#!pip install chromadb
#!pip install tiktoken
!pip install torch transformers accelerate bitsandbytes sentence-transformers
!pip install --upgrade langchain
!pip install langchain_community
!pip install langchain-openai
!pip install langchain-huggingface

Collecting python-dotenv
  Downloading python_dotenv-1.0.1-py3-none-any.whl.metadata (23 kB)
Downloading python_dotenv-1.0.1-py3-none-any.whl (19 kB)
Installing collected packages: python-dotenv
Successfully installed python-dotenv-1.0.1
Collecting openai
  Downloading openai-1.52.1-py3-none-any.whl.metadata (24 kB)
Collecting httpx<1,>=0.23.0 (from openai)
  Downloading httpx-0.27.2-py3-none-any.whl.metadata (7.1 kB)
Collecting jiter<1,>=0.4.0 (from openai)
  Downloading jiter-0.6.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (5.2 kB)
Collecting httpcore==1.* (from httpx<1,>=0.23.0->openai)
  Downloading httpcore-1.0.6-py3-none-any.whl.metadata (21 kB)
Collecting h11<0.15,>=0.13 (from httpcore==1.*->httpx<1,>=0.23.0->openai)
  Downloading h11-0.14.0-py3-none-any.whl.metadata (8.2 kB)
Downloading openai-1.52.1-py3-none-any.whl (386 kB)

In [None]:
import os

os.environ['OPENAI_API_KEY'] = 'REPLACE WITH YOUR API KEY'
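A key typed directly into a cell is easy to leak when the notebook is shared. As a hedged alternative, the sketch below reads the key from the environment and only prompts for it (without echoing) when it is missing. `load_api_key` is a hypothetical helper name; the environment variable name is the one used in this notebook.

```python
import os
from getpass import getpass


def load_api_key(env_var="OPENAI_API_KEY", prompt_fn=getpass):
    """Return the key stored in `env_var`, prompting for it only if it is not set."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        # getpass hides the typed value, so the key never appears in the notebook output.
        key = prompt_fn(f"Enter {env_var}: ").strip()
        os.environ[env_var] = key
    return key
```

Running `load_api_key()` once at the top of the notebook would leave `os.environ['OPENAI_API_KEY']` populated for the later cells.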

## OpenAI without LangChain

In [None]:
from openai import OpenAI
import os

client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY", "REPLACE WITH YOUR API KEY"))

In [None]:
# View an example with a system message.
MODEL = "gpt-3.5-turbo"
response = client.chat.completions.create(
    model=MODEL,
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain Mean squared Error"},
    ],
    temperature=0,
)

print(response.choices[0].message.content)

Mean squared error (MSE) is a common metric used to evaluate the performance of a predictive model. It measures the average of the squares of the errors or the differences between the actual values and the predicted values. 

To calculate the MSE, you would first calculate the error for each data point by subtracting the actual value from the predicted value. Then, you square each of these errors, sum them up, and divide by the total number of data points. The formula for MSE is:

MSE = Σ(yi - ŷi)² / n

Where:
- yi is the actual value
- ŷi is the predicted value
- n is the total number of data points

A lower MSE value indicates that the model is better at predicting the target variable, as it means that the predicted values are closer to the actual values.
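The formula in the model's answer can be checked with a few lines of plain Python. This is an illustrative sketch with made-up numbers, not part of the model output.

```python
def mean_squared_error(actual, predicted):
    """MSE = sum((y_i - y_hat_i)^2) / n, as in the formula above."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one data point is required")
    squared_errors = [(y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)]
    return sum(squared_errors) / len(squared_errors)


# Illustrative values only: squared errors are 1, 0, 4 and 1, so MSE = 6 / 4.
actual = [3.0, 5.0, 2.0, 8.0]
predicted = [2.0, 5.0, 4.0, 7.0]
print(mean_squared_error(actual, predicted))  # 1.5
```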


In [None]:
prompt = "What are pre-trained Language Models"
response = client.chat.completions.create(
    model=MODEL,
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": prompt},
    ],
    temperature=0,
)

answer = response.choices[0].message.content

In [None]:
print(answer)

Pre-trained language models are large neural network models that have been trained on vast amounts of text data to understand and generate human language. These models are trained on a diverse range of text sources, such as books, articles, and websites, to learn the patterns and structures of language. Once trained, these models can be fine-tuned on specific tasks, such as text classification, language translation, or text generation, to achieve high performance with minimal additional training data. Popular pre-trained language models include BERT, GPT-3, and RoBERTa.


## LLM parameters

## Some key parameters

* model: the name of the model you want to use (e.g. gpt-3.5-turbo, gpt-4, gpt-3.5-turbo-16k)
* messages: a list of message objects, where each object has **two required fields**:

1.   role: the role of the messenger (one of system, user, assistant, or tool)
2.   content: the content of the message (e.g. 'Write me a beautiful poem.')
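The two required fields can be checked before a request is sent. The sketch below is a simplified illustration; `validate_messages` is a hypothetical helper, and the real API applies further rules (for example, for tool messages) that are not modelled here.

```python
ALLOWED_ROLES = {"system", "user", "assistant", "tool"}


def validate_messages(messages):
    """Return a list of problems found; an empty list means the basic structure looks valid."""
    problems = []
    for index, message in enumerate(messages):
        for field in ("role", "content"):
            if field not in message:
                problems.append(f"message {index}: missing '{field}'")
        role = message.get("role")
        if role is not None and role not in ALLOWED_ROLES:
            problems.append(f"message {index}: unknown role '{role}'")
    return problems


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Write me a beautiful poem."},
]
print(validate_messages(messages))  # []
```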


## Optional parameters

* frequency_penalty: Penalises tokens based on their frequency, reducing repetition.
* logit_bias: Modifies the likelihood of specified tokens with bias values.
* logprobs: Returns log probabilities of output tokens if true.
* top_logprobs: Specifies the number of most likely tokens to return at each position.
* max_tokens: Sets the maximum number of generated tokens in chat completion.
* n: Generates a specified number of chat completion choices for each input.
* presence_penalty: Penalises new tokens based on their presence in the text.
* response_format: Specifies the output format, e.g. JSON mode.
* seed: Requests reproducible sampling with a specified seed; determinism is best-effort and not guaranteed.
* stop: Specifies up to 4 sequences where the API should stop generating tokens.
* stream: Sends partial message deltas as tokens become available.
* temperature: Sets the sampling temperature between 0 and 2.
* top_p: Uses nucleus sampling; considers tokens with top_p probability mass.
* tools: Lists functions the model may call.
* tool_choice: Controls the model's function calls (none/auto/function).
* user: Provides a unique identifier for end-user monitoring and abuse detection.
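To build intuition for `temperature` and `top_p`, the sketch below applies the standard textbook formulation to a toy set of three token scores. It is an assumption-laden illustration of the idea, not a description of how OpenAI implements sampling, and the function names are hypothetical.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities; lower temperatures sharpen the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


def top_p_filter(probs, top_p):
    """Keep the most likely tokens until their combined mass reaches top_p, then renormalise."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in ranked:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    return {i: probs[i] / mass for i in kept}


toy_logits = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens
base = softmax_with_temperature(toy_logits, 1.0)
sharp = softmax_with_temperature(toy_logits, 0.5)
print(base, sharp, top_p_filter(base, 0.9))
```

With these toy scores, halving the temperature moves probability towards the top token, and `top_p=0.9` drops the least likely of the three tokens before sampling.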


### LangChain

LangChain is a framework for developing applications powered by language models. It simplifies the development process in two key ways:

1. **Integration**: Incorporating external data, such as personal files, other applications, and API data.
2. **Agency**: Allowing language models to interact with their environment and make decisions.

### Benefits of using LangChain

1. **Modular components**: Easy swapping of crucial components.
2. **Customisable workflows**: Creating and customising 'chains' of actions.
3. **Rapid development**: Fast feature updates.
4. **Vibrant community**: Active community support and events.

While language models operate on a basic input-output basis, LangChain helps manage the complexities of advanced applications.

For more information, refer to the [LangChain Conceptual Documentation](https://docs.langchain.com/docs/).
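The idea of a 'chain' can be sketched without any library: each step takes the previous step's output. The example below is a plain-Python illustration with a stand-in model so that it runs offline; `make_chain`, `build_prompt`, `fake_model` and `parse_output` are hypothetical names. In recent LangChain versions the equivalent composition is usually written with the pipe operator, along the lines of `prompt | llm | StrOutputParser()`.

```python
def make_chain(*steps):
    """Compose steps so that each one receives the previous step's output."""
    def run(value):
        for step in steps:
            value = step(value)
        return value
    return run


def build_prompt(inputs):
    return f"Translate the following text into French\nText: {inputs['query']}\nFrench:"


def fake_model(prompt):
    # Stand-in for a real LLM call: it echoes the second line of the prompt.
    return {"content": f"[model reply to: {prompt.splitlines()[1]}]"}


def parse_output(reply):
    return reply["content"].strip()


chain = make_chain(build_prompt, fake_model, parse_output)
print(chain({"query": "I like data science"}))
```

Because every step has the same shape (one input, one output), any step can be swapped for another, which is the modularity benefit listed above.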

### Chat messages
Chat messages are like plain text, but each one is tagged with a message type (System, Human, or AI):

* **System:** Helpful background context that tells the AI what to do.
* **Human:** Messages that are intended to represent the user.
* **AI:** Messages that show what the AI responded with.

For more information, refer to OpenAI's [documentation](https://platform.openai.com/docs/guides/chat/introduction).
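LangChain's message types line up with the role names used in the raw OpenAI calls earlier: System with `system`, Human with `user`, and AI with `assistant`. The sketch below shows that correspondence using plain tuples; `to_role_dicts` is a hypothetical helper for illustration only.

```python
ROLE_FOR_TYPE = {"system": "system", "human": "user", "ai": "assistant"}


def to_role_dicts(typed_messages):
    """Convert (message_type, text) pairs into OpenAI-style role/content dictionaries."""
    return [{"role": ROLE_FOR_TYPE[kind], "content": text} for kind, text in typed_messages]


conversation = [
    ("system", "You are a helpful assistant."),
    ("human", "Explain mean squared error"),
]
print(to_role_dicts(conversation))
```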

In [None]:
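from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, SystemMessage, AIMessage
# Note: this rebinds the name OpenAI, replacing the client class imported from the openai package earlier.
from langchain_openai import OpenAI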

chat = ChatOpenAI(model_name="gpt-3.5-turbo")

llm = OpenAI(model_name="gpt-3.5-turbo-instruct")

Now, let's create a few messages that simulate a chat experience with a bot.

In [None]:
chat.invoke(
    [
        SystemMessage(content="You are a helpful assistant."),
        HumanMessage(content="Explain mean squared error")
    ]
)

AIMessage(content="Mean squared error (MSE) is a common metric used to evaluate the accuracy of a prediction model. It measures the average of the squares of the errors between the predicted values and the actual values in a dataset. The formula for calculating MSE is:\n\nMSE = 1/n * Σ(yi - ŷi)^2\n\nWhere:\n- n is the number of data points\n- yi is the actual value of the target variable for data point i\n- ŷi is the predicted value of the target variable for data point i\n- Σ denotes the sum over all data points\n\nA lower MSE value indicates that the model's predictions are closer to the actual values, while a higher MSE value indicates that the model's predictions are farther from the actual values. It is important to minimize the MSE when training a model to improve its predictive accuracy.", additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 174, 'prompt_tokens': 22, 'total_tokens': 196, 'completion_tokens_details': {'audio_tokens': None, 

In [None]:
prompt = "What are pre-trained language models"
response = chat.invoke(
    [
        SystemMessage(content="You are a helpful assistant."),
        HumanMessage(content=prompt)
    ]
)



In [None]:
answer = response.content

In [None]:
print(answer)

Pre-trained language models are large neural network models that have been trained on vast amounts of text data to understand and generate human language. These models are trained using techniques such as unsupervised learning and transfer learning to learn the language patterns and structures present in the data. Once trained, these models can be fine-tuned on specific tasks, such as text generation, translation, sentiment analysis, and more. Pre-trained language models have been shown to achieve state-of-the-art performance on a wide range of natural language processing tasks.


## Loading open-source models

In [None]:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

model_name = 'HuggingFaceH4/zephyr-7b-beta'

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

model = AutoModelForCausalLM.from_pretrained(model_name, quantization_config=bnb_config)
tokenizer = AutoTokenizer.from_pretrained(model_name)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.
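The 4-bit configuration matters mainly because of memory. As a rough back-of-the-envelope sketch, assuming about 7 billion parameters (taken from the '7b' in the model name) and ignoring quantisation constants, activations and the generation cache, the weight footprint scales with the number of bits per parameter:

```python
def weight_memory_gb(num_parameters, bits_per_parameter):
    """Approximate memory needed for the weights alone, in gigabytes (10^9 bytes)."""
    return num_parameters * bits_per_parameter / 8 / 1e9


params = 7e9  # assumption: roughly 7 billion parameters
for label, bits in [("float32", 32), ("bfloat16", 16), ("4-bit", 4)]:
    print(f"{label:>8}: {weight_memory_gb(params, bits):.1f} GB")
```

Under these assumptions the weights shrink from roughly 14 GB in bfloat16 to roughly 3.5 GB in 4-bit, which is what makes the model practical on a single Colab GPU.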



In [None]:
from langchain_huggingface import HuggingFacePipeline
from langchain_core.prompts import PromptTemplate
from transformers import pipeline
from langchain_core.output_parsers import StrOutputParser

text_generation_pipeline = pipeline(
    model=model,
    tokenizer=tokenizer,
    task="text-generation",
    temperature=0.2,
    do_sample=True,
    repetition_penalty=1.1,
    return_full_text=True,
    max_new_tokens=400,
)

llm_hug = HuggingFacePipeline(pipeline=text_generation_pipeline)

prompt_template = """
<|system|>
Answer the question based on your knowledge. Use the following context to help:

{context}

</s>
<|user|>
{question}
</s>
<|assistant|>

 """

prompt = PromptTemplate(
    input_variables=["context", "question"],
    template=prompt_template,
)


## Models: Prompts and prompt templates

### Prompt
A prompt is the text you pass to the underlying model:

In [None]:
llm = OpenAI(model_name="gpt-3.5-turbo-instruct", temperature=0.0)

prompt = """
Translate the following text into French
Text: I like data science
French:
"""

print(llm.invoke(prompt))


J'aime la science des données.


### Prompt template
The prompt template classes in LangChain are built to make constructing prompts with dynamic inputs easier. Of these classes, the simplest is the PromptTemplate. We will test this by adding a single dynamic input to our previous prompt: the user query.
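Conceptually, a prompt template is a string with named placeholders plus a check that every placeholder receives a value. The sketch below mimics that idea with the standard library only; `template_variables` and `fill_template` are hypothetical helpers, not LangChain functions.

```python
from string import Formatter


def template_variables(template):
    """Return the sorted placeholder names found in a template string."""
    return sorted({name for _, name, _, _ in Formatter().parse(template) if name})


def fill_template(template, **values):
    """Fill the template, raising a clear error if any placeholder has no value."""
    missing = [name for name in template_variables(template) if name not in values]
    if missing:
        raise KeyError(f"missing template variables: {missing}")
    return template.format(**values)


demo_template = "Translate the following text into French\nText: {query}\nFrench:"
print(template_variables(demo_template))
print(fill_template(demo_template, query="I like data science"))
```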

In [None]:
from langchain_core.prompts import PromptTemplate

template = """
Translate the following text into French
Text: {query}
French:
"""

prompt = PromptTemplate(
    input_variables=["query"],
    template=template,
)

final_prompt = prompt.format(query='I like data science')

print(f"Final Prompt: {final_prompt}")
print("-----------")
print(f"LLM Output: {llm.invoke(final_prompt)}")

Final Prompt: 
Translate the following text into French
Text: I like data science
French:

-----------
LLM Output: 
J'aime la science des données.


In [None]:
template = """
 "Tell me a {adjective} joke about {content}."
"""

prompt = PromptTemplate(
    input_variables=["adjective","content"],
    template=template,
)

final_prompt = prompt.format(adjective="funny", content="chickens")

print(f"Final Prompt: {final_prompt}")
print("-----------")
print(f"LLM Output: {llm.invoke(final_prompt)}")

Final Prompt: 
 "Tell me a funny joke about chickens."

-----------
LLM Output: 
Why did the chicken cross the playground?

To get to the other slide!


In [None]:
llm_hug = HuggingFacePipeline(pipeline=text_generation_pipeline)

prompt_template = """
<|system|>

You are an AI language assistant
</s>
<|user|>
Translate the following text into French
Text: {query}
French:
</s>
<|assistant|>

 """

prompt = PromptTemplate(
    input_variables=["query"],
    template=prompt_template,
)

final_prompt = prompt.format(query='I like data science')

print(f"Final Prompt: {final_prompt}")
print("-----------")
print(f"LLM Output: {llm_hug.invoke(final_prompt)}")

Final Prompt: 
<|system|>

You are an AI language assistant
</s>
<|user|>
Translate the following text into French
Text: I like data science
French:
</s>
<|assistant|>

 
-----------




LLM Output: 
<|system|>

You are an AI language assistant
</s>
<|user|>
Translate the following text into French
Text: I like data science
French:
</s>
<|assistant|>

  J'aime la science des données (prononcé : je l'ah-MEE la sêNSEES deh dôNNEY)

Explication :

1. "I" est traduit par "j'" en français.
2. "like" est traduit par "aime" en français.
3. "data science" est traduit par "science des données" en français.
4. Les espaces entre les mots sont conservés pour une meilleure compréhension et prononciation.
5. La prononciation française est indiquée entre parenthèses pour aider à la prononciation correcte. Notez que le "s" final dans "science" est muet en français, donc il n'y a pas de "z" sonore à la fin de cette syllabe.
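The output above repeats the whole prompt because the pipeline was created with `return_full_text=True`. Setting that parameter to `False` is one option; another is to trim the text afterwards. The sketch below assumes the Zephyr-style `<|assistant|>` marker used in this notebook's template, and `extract_assistant_reply` is a hypothetical helper.

```python
def extract_assistant_reply(generated_text, marker="<|assistant|>"):
    """Return only the text after the last assistant marker, or the full text if it is absent."""
    _, found, reply = generated_text.rpartition(marker)
    return reply.strip() if found else generated_text.strip()


full_output = (
    "<|system|>\nYou are an AI language assistant\n</s>\n"
    "<|user|>\nTranslate the following text into French\n"
    "Text: I like data science\nFrench:\n</s>\n"
    "<|assistant|>\n\n  J'aime la science des données"
)
print(extract_assistant_reply(full_output))
```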


## Chat prompt templates

In [None]:
template_string =  """
Translate the following text into French
Text: {query}
French:
"""

In [None]:
from langchain_core.prompts import ChatPromptTemplate

prompt_template = ChatPromptTemplate.from_template(template_string)

In [None]:
prompt_template.messages[0].prompt

PromptTemplate(input_variables=['query'], input_types={}, partial_variables={}, template='\nTranslate the following text into French\nText: {query}\nFrench:\n')

In [None]:
prompt_template.messages[0].prompt.input_variables

['query']

In [None]:
prompt = prompt_template.format_messages(
  query = "I like data science")


In [None]:
response = chat.invoke(prompt)
print(response.content)

J'aime la science des données


## Key information
You have learned how to install the LangChain packages and use various techniques to create prompt templates.

## Reflect
How does the LangChain pipeline with an open-source LLM compare to the process of calling a closed-source model from OpenAI? Note some of your findings below.

> Select the pen from the toolbar to add your entry.