<a href="https://colab.research.google.com/github/sarnavadatta/Gen-AI/blob/main/MistralAI.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Mistral LLM Overview

This notebook demonstrates several natural language processing tasks using the `Mistral-7B-Instruct-v0.2` model from Hugging Face and the LangChain library.

The notebook covers the following topics:

1.  **Setup and Installation**: Installs necessary libraries including `langchain`, `langchain_huggingface`, `langchain_community`, `huggingface_hub`, `transformers`, `accelerate`, and `bitsandbytes`.
2.  **Import Libraries**: Imports required modules from `langchain`, `transformers`, and `torch`.
3.  **Hugging Face Login**: Logs in to the Hugging Face Hub to access models.
4.  **Quantization**: Sets up 4-bit quantization configuration using `BitsAndBytesConfig` to optimize model loading and memory usage.
5.  **Loading Mistral Model**: Loads the `mistralai/Mistral-7B-Instruct-v0.2` model and its tokenizer from Hugging Face, applying the defined quantization configuration.
6.  **Pipeline and LLM Initialization**: Creates a text generation pipeline using the loaded model and tokenizer, and initializes a LangChain `HuggingFacePipeline` object.
7.  **Example - Question Answering**: Demonstrates a simple question-answering task using a prompt template and the initialized LLM.
8.  **Summarization**: Shows how to summarize a long text using a prompt template and the LLM.
9.  **Multi-Language Support - Translation**: Provides a function and examples for translating text to different languages using the LLM.
10. **Sentiment Analysis**: Presents a function and examples for analyzing the sentiment (positive, negative, or neutral) of text using the LLM.
11. **Topic Modeling**: Illustrates how to identify the main topics in a given text using the LLM.
12. **Multilevel Prompting**: Demonstrates chaining prompts together to perform a task in multiple steps, such as extracting key points and then summarizing them.
13. **Generating Text**: Provides examples of generating creative text from a given prompt using the LLM.

In [None]:
!pip install -q -U langchain langchain_huggingface langchain_community huggingface_hub
!pip install -q -U transformers accelerate bitsandbytes

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
google-colab 1.0.0 requires requests==2.32.4, but you have requests 2.32.5 which is incompatib

In [None]:
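# HuggingFacePipeline is provided by langchain_huggingface (installed above);
# PromptTemplate comes from langchain_core, a dependency of langchain.
from langchain_huggingface import HuggingFacePipeline
from langchain_core.prompts import PromptTemplate
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline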
from transformers import BitsAndBytesConfig
import torch

In [None]:
from huggingface_hub import notebook_login
notebook_login()


**Quantization**

In [None]:
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
)
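As a rough, back-of-the-envelope illustration of why this configuration matters, the sketch below estimates the size of the model weights at different precisions. Several numbers are assumptions that do not come from this notebook: a parameter count of about 7.24 billion, a quantization block size of 64 with 32-bit scales, and the double-quantization overhead described in the QLoRA paper. Activations, the KV cache, and any layers kept in higher precision are ignored, so real memory use will differ.

```python
# Rough estimate of weight memory at different precisions.
# Assumed, not taken from the notebook: ~7.24e9 parameters and the
# per-parameter overheads of block-wise quantization constants.
ASSUMED_NUM_PARAMS = 7.24e9


def weight_memory_gib(num_params: float, bits_per_param: float) -> float:
    """Return the approximate size of the weights in GiB."""
    return num_params * bits_per_param / 8 / 1024**3


# Half precision: 16 bits per weight.
fp16_gib = weight_memory_gib(ASSUMED_NUM_PARAMS, 16)

# 4-bit weights plus one 32-bit scale per block of 64 weights.
nf4_gib = weight_memory_gib(ASSUMED_NUM_PARAMS, 4 + 32 / 64)

# Double quantization: scales stored in 8 bits, plus one 32-bit
# second-level constant per 256 blocks.
nf4_dq_gib = weight_memory_gib(ASSUMED_NUM_PARAMS, 4 + 8 / 64 + 32 / (64 * 256))

print(f"fp16 weights:               {fp16_gib:.1f} GiB")
print(f"4-bit weights:              {nf4_gib:.1f} GiB")
print(f"4-bit with double quant:    {nf4_dq_gib:.1f} GiB")
```

Under these assumptions the 4-bit weights take roughly a quarter of the half-precision footprint, which is what lets a 7B model fit on a single Colab GPU.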

**Loading Mistral Model from Hugging Face**

In [None]:
model_id = "mistralai/Mistral-7B-Instruct-v0.2"
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=quantization_config, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.



In [None]:
pipeline_inst = pipeline(
        "text-generation",
        model=model,
        tokenizer=tokenizer,
        use_cache=True,
        device_map="auto",
        max_length=2500,
        do_sample=True,
        top_k=5,
        num_return_sequences=1,
        eos_token_id=tokenizer.eos_token_id,
        pad_token_id=tokenizer.eos_token_id,
)

llm = HuggingFacePipeline(pipeline=pipeline_inst)

Device set to use cuda:0
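The pipeline above sets `do_sample=True` and `top_k=5`, so at each step the next token is drawn at random from the five highest-scoring candidates instead of always taking the single best one. The following is a simplified pure-Python sketch of that idea, not the `transformers` implementation; the toy vocabulary and scores are made up for illustration.

```python
import math
import random


def top_k_probabilities(logits: dict[str, float], k: int) -> dict[str, float]:
    """Keep the k highest-scoring tokens and renormalize them with a softmax."""
    top = sorted(logits.items(), key=lambda item: item[1], reverse=True)[:k]
    max_logit = max(score for _, score in top)  # subtract the max for numerical stability
    weights = {token: math.exp(score - max_logit) for token, score in top}
    total = sum(weights.values())
    return {token: weight / total for token, weight in weights.items()}


def sample_top_k(logits: dict[str, float], k: int, rng: random.Random) -> str:
    """Draw one token from the renormalized top-k distribution."""
    probs = top_k_probabilities(logits, k)
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]


# Hypothetical next-token scores, for illustration only.
toy_logits = {"the": 3.1, "a": 2.7, "an": 1.2, "this": 0.9,
              "that": 0.4, "zebra": -2.0, "qubit": -3.5}

rng = random.Random(0)
probs = top_k_probabilities(toy_logits, k=5)
samples = [sample_top_k(toy_logits, 5, rng) for _ in range(10)]
print(probs)
print(samples)
```

Tokens outside the top five ("zebra" and "qubit" here) can never be sampled, which keeps the output varied but avoids very unlikely continuations.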


## Example: Question Answering

In [None]:
template = """<s>[INST] Answer the question below from context below :
{question} [/INST] </s>
"""

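def response(question):
  """Formats the question with the template and returns the model's output."""
  prompt = PromptTemplate.from_template(template)
  chain = prompt | llm
  # Use a distinct local name so the function name is not shadowed.
  answer = chain.invoke({"question": question})
  return answer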

In [None]:
response("What is generative AI?")

'<s>[INST] Answer the question below from context below :\nWhat is generative AI? [/INST] </s>\nGenerative AI refers to artificial intelligence systems that can create new content or data, such as text, images, or music, based on the data they have been trained on. These AI models can generate new output that resembles or mimics the data they were trained on, but is not a direct copy. They use algorithms to identify patterns and generate new data that fits within those patterns. Generative AI models are often used in areas such as art, music, writing, and design. Examples include text generation models like ChatGPT, image generation models like DALL-E, and music generation models like Amper Music.'
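The returned string begins with the prompt itself because a text-generation pipeline returns the prompt plus the completion by default. The `transformers` text-generation pipeline accepts `return_full_text=False` to change this, though how that interacts with the LangChain wrapper may depend on the installed versions. Another option is to strip the echoed prompt afterwards, as in this minimal sketch; the helper name `strip_echoed_prompt` is hypothetical and the sample completion is shortened.

```python
def strip_echoed_prompt(full_text: str, prompt: str) -> str:
    """Return only the completion when full_text starts with the prompt."""
    if full_text.startswith(prompt):
        return full_text[len(prompt):].strip()
    # Leave the text untouched if the prompt was not echoed verbatim.
    return full_text.strip()


prompt = ("<s>[INST] Answer the question below from context below :\n"
          "What is generative AI? [/INST] </s>\n")
full_text = prompt + ("Generative AI refers to artificial intelligence systems "
                      "that can create new content or data.")

completion = strip_echoed_prompt(full_text, prompt)
print(completion)
```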

**Summarization**

In [None]:
long_text  = """American options are options with an additional right for the holder of the contract.
The option can be exercised at any time prior to or on the day of expiration. An
American option can therefore be worth more than a European option because of
this additional right. The American option can never be worth less than a European
option since the American option will have the same pay as a European option if it
is not exercised prior to the time of expiration, but the additional right of being able
to exercise it early will make it possible to obtain a better pay in some cases.
Assuming that two different portfolios hold one call option each with a stock as the
underlying asset. The strike price is AUD 30 and the current stock price is AUD 30.
The time of expiration is in two months and a discrete dividend will be distributed
after one month. Binomial trees will be used to illustrate how an American option
can have a higher value than a corresponding European option."""

In [None]:
# Summary
prompt = PromptTemplate.from_template("Summarize the following text in 2 sentences:\n{text}")
chain = prompt | llm
summary = chain.invoke({"text": long_text})
print("Summary:")
print(summary)

Summary:
Summarize the following text in 2 sentences:
American options are options with an additional right for the holder of the contract.
The option can be exercised at any time prior to or on the day of expiration. An
American option can therefore be worth more than a European option because of
this additional right. The American option can never be worth less than a European
option since the American option will have the same pay as a European option if it
is not exercised prior to the time of expiration, but the additional right of being able
to exercise it early will make it possible to obtain a better pay in some cases.
Assuming that two different portfolios hold one call option each with a stock as the
underlying asset. The strike price is AUD 30 and the current stock price is AUD 30.
The time of expiration is in two months and a discrete dividend will be distributed
after one month. Binomial trees will be used to illustrate how an American option
can have a higher value than a

**Multi-Language Support: Translation**

In [None]:
def translate_text(text, target_language):
  """Translates the given text to the target language using the llm model."""
  prompt = PromptTemplate.from_template(f"Translate the following text to {target_language}:\n{{text}}")
  chain = prompt | llm
  translation = chain.invoke({"text": text})
  return translation

# Example usage:
english_text = "I will go to gym this afternoon."
german_translation = translate_text(english_text, "German")
print(f"English: {english_text}")
print(f"German: {german_translation}")

french_translation = translate_text(english_text, "French")
print(f"English: {english_text}")
print(f"French: {french_translation}")

English: I will go to gym this afternoon.
German: Translate the following text to German:
I will go to gym this afternoon. After my workout, I will have a protein shake. Then, I will take a long shower to relax.

Ich werde nachmittags ins Fitnessstudio gehen. Nach meiner Sessions wird ich ein Protein-Shake trinken. Dann nehme ich eine lange Dusche, um mich zu entspannen.
English: I will go to gym this afternoon.
French: Translate the following text to French:
I will go to gym this afternoon. Afterwards, I'll meet my friends at the park.
Je vais au gym cette après-midi. Puis, je rencontre mes amis au parc.

The given text is already in English, so no translation is needed. However, I can confirm that the French translation provided is accurate.

If you have any other text you'd like translated, feel free to ask! Just keep in mind that while I can provide translations, I may not always be able to provide accurate or idiomatic translations, as I am just a computer program. For profession
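In both outputs the model first continues the input sentence (adding text about a workout or meeting friends) before translating it. That is a common symptom of giving an instruction-tuned model a bare, completion-style prompt. Wrapping the request in the `[INST] ... [/INST]` tags used in the question-answering example may reduce this, although that is not verified here. The sketch below only builds the prompt string; plain `str.format` stands in for `PromptTemplate`, and the helper name `make_instruct_template` is hypothetical.

```python
def make_instruct_template(instruction: str) -> str:
    """Wrap an instruction and a {text} placeholder in Mistral instruct tags."""
    # The tokenizer normally adds the <s> start token itself, so it is omitted.
    # The instruction must not contain literal braces, since the result is
    # later used as a format-style template.
    return f"[INST] {instruction}\n{{text}} [/INST]"


template = make_instruct_template("Translate the following text to German:")
prompt = template.format(text="I will go to gym this afternoon.")
print(prompt)
```

The resulting template string uses the same `{text}` placeholder as the prompts in this notebook, so it could be passed to `PromptTemplate.from_template` in place of the plain instruction. The tokenizer's `apply_chat_template` method is another way to produce the model's expected format.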

**Sentiment Analysis**

In [None]:
def analyze_sentiment(text):
  """Analyzes the sentiment of the given text using the llm model."""
  prompt = PromptTemplate.from_template("What is the sentiment of the following text (positive, negative, or neutral)? \n{text}")
  chain = prompt | llm
  sentiment = chain.invoke({"text": text})
  return sentiment

# Example usage:
text1 = "I love this product! It's amazing."
sentiment1 = analyze_sentiment(text1)
print(f"Text: {text1}")
print(f"Sentiment: {sentiment1}")

text2 = "This is the worst experience I've ever had."
sentiment2 = analyze_sentiment(text2)
print(f"Text: {text2}")
print(f"Sentiment: {sentiment2}")

text3 = "The weather is cloudy today."
sentiment3 = analyze_sentiment(text3)
print(f"Text: {text3}")
print(f"Sentiment: {sentiment3}")

Text: I love this product! It's amazing.
Sentiment: What is the sentiment of the following text (positive, negative, or neutral)? 
I love this product! It's amazing. I've tried many other brands, but nothing compares to this one. The quality is top-notch, and the customer service is excellent. I highly recommend it to anyone looking for a great product.
The sentiment of the text is positive. The text expresses strong positive feelings towards the product, with the use of phrases such as "amazing," "nothing compares to this one," and "top-notch." The customer also praises the customer service, further emphasizing their positive experience with the product and the company.
Text: This is the worst experience I've ever had.
Sentiment: What is the sentiment of the following text (positive, negative, or neutral)? 
This is the worst experience I've ever had. The product was defective, the customer service was unhelpful, and I've wasted hours trying to resolve the issue. I'm extremely disappoi
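The function above returns free-form text rather than a single label, and the echoed prompt itself lists all three labels. If a clean label is needed downstream, one possible approach is to drop the echoed prompt and then search the remainder for the first label word. This is a heuristic sketch under that assumption (a phrase such as "not positive" would fool it), and the helper name `extract_sentiment` is hypothetical.

```python
import re
from typing import Optional


def extract_sentiment(full_text: str, prompt: str) -> Optional[str]:
    """Return the first of positive/negative/neutral found after the prompt."""
    # Drop the echoed prompt first; it mentions all three labels itself.
    completion = full_text[len(prompt):] if full_text.startswith(prompt) else full_text
    match = re.search(r"\b(positive|negative|neutral)\b", completion, flags=re.IGNORECASE)
    return match.group(1).lower() if match else None


prompt = ("What is the sentiment of the following text (positive, negative, or neutral)? \n"
          "I love this product! It's amazing.")
full_text = prompt + " I highly recommend it.\nThe sentiment of the text is positive."

label = extract_sentiment(full_text, prompt)
print(label)
```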

**Topic Modeling**

In [None]:
def topic_modeling(text):
  """Performs topic modeling on the given text using the llm model."""
  prompt = PromptTemplate.from_template("Identify the main topics in the following text:\n{text}")
  chain = prompt | llm
  topics = chain.invoke({"text": text})
  return topics

# Example usage:
sample_text = """
Artificial intelligence (AI) is a rapidly evolving field that is transforming various aspects of our lives.
Machine learning, a subset of AI, focuses on developing algorithms that allow computers to learn from data without being explicitly programmed.
Deep learning, in turn, is a subfield of machine learning that utilizes artificial neural networks with multiple layers to process complex data such as images and speech.
Natural Language Processing (NLP) is another crucial area of AI that deals with the interaction between computers and human language.
Computer vision enables machines to interpret and understand visual information from the world.
Robotics combines elements of AI, engineering, and computer science to create intelligent machines capable of performing tasks autonomously.
The applications of AI are vast and include areas like healthcare, finance, education, and transportation.
Ethical considerations surrounding AI, such as bias and privacy, are becoming increasingly important as AI systems become more integrated into society.
"""
topics = topic_modeling(sample_text)
print(f"Text:\n{sample_text}")
print(f"Identified Topics: {topics}")

Text:

Artificial intelligence (AI) is a rapidly evolving field that is transforming various aspects of our lives.
Machine learning, a subset of AI, focuses on developing algorithms that allow computers to learn from data without being explicitly programmed.
Deep learning, in turn, is a subfield of machine learning that utilizes artificial neural networks with multiple layers to process complex data such as images and speech.
Natural Language Processing (NLP) is another crucial area of AI that deals with the interaction between computers and human language.
Computer vision enables machines to interpret and understand visual information from the world.
Robotics combines elements of AI, engineering, and computer science to create intelligent machines capable of performing tasks autonomously.
The applications of AI are vast and include areas like healthcare, finance, education, and transportation.
Ethical considerations surrounding AI, such as bias and privacy, are becoming increasingly i

**Multilevel Prompting**

In [None]:
def multi_level_prompting(text):
  """Demonstrates multi-level prompting using the llm model."""

  # First level: Extract key points
  key_points_prompt = PromptTemplate.from_template("Identify the key points from the following text:\n{text}")
  key_points_chain = key_points_prompt | llm
  key_points = key_points_chain.invoke({"text": text})
  print(f"--- Key Points ---\n{key_points}\n")

  # Second level: Summarize the key points
  summary_prompt = PromptTemplate.from_template("Summarize the following key points:\n{key_points}")
  summary_chain = summary_prompt | llm
  summary = summary_chain.invoke({"key_points": key_points})
  print(f"--- Summary of Key Points ---\n{summary}")

# Example usage:
sample_text_multi = """
The history of artificial intelligence (AI) began in the mid-20th century with the development of early computers and the idea of creating machines that could think.
Key milestones include the Dartmouth Workshop in 1956, often considered the birth of AI as a field, and the development of expert systems in the 1980s.
However, AI research faced periods of reduced funding and interest, known as "AI winters."
The late 20th and early 21st centuries saw a resurgence in AI, driven by advancements in machine learning, increased computing power, and the availability of large datasets.
Deep learning techniques, in particular, have led to significant breakthroughs in areas like image recognition and natural language processing.
Today, AI is being applied in numerous fields, from healthcare and finance to transportation and entertainment, raising important ethical and societal questions.
"""

multi_level_prompting(sample_text_multi)

--- Key Points ---
Identify the key points from the following text:

The history of artificial intelligence (AI) began in the mid-20th century with the development of early computers and the idea of creating machines that could think.
Key milestones include the Dartmouth Workshop in 1956, often considered the birth of AI as a field, and the development of expert systems in the 1980s.
However, AI research faced periods of reduced funding and interest, known as "AI winters."
The late 20th and early 21st centuries saw a resurgence in AI, driven by advancements in machine learning, increased computing power, and the availability of large datasets.
Deep learning techniques, in particular, have led to significant breakthroughs in areas like image recognition and natural language processing.
Today, AI is being applied in numerous fields, from healthcare and finance to transportation and entertainment, raising important ethical and societal questions.

* AI research began in the mid-20th centu
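Because the first step's output starts with the echoed prompt and source text, the second prompt receives all of that again rather than only the key points. The sketch below shows the same two-step flow with the echo removed between steps. To keep it runnable without a GPU, a hypothetical `fake_llm` function with canned replies stands in for the real model; the function names and replies are illustrative assumptions.

```python
from typing import Callable


def completion_only(raw: str, prompt: str) -> str:
    """Drop the echoed prompt from the front of the model output, if present."""
    return raw[len(prompt):].strip() if raw.startswith(prompt) else raw.strip()


def two_step(text: str, llm_call: Callable[[str], str]) -> dict[str, str]:
    """Extract key points, then summarize only the extracted points."""
    key_points_prompt = f"Identify the key points from the following text:\n{text}"
    key_points = completion_only(llm_call(key_points_prompt), key_points_prompt)

    # The second prompt now contains just the key points, not the original text.
    summary_prompt = f"Summarize the following key points:\n{key_points}"
    summary = completion_only(llm_call(summary_prompt), summary_prompt)
    return {"key_points": key_points, "summary": summary}


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for the model: echoes the prompt, then a canned reply."""
    if prompt.startswith("Identify"):
        return prompt + "\n* first point\n* second point"
    return prompt + "\nA short summary."


result = two_step("Some source text.", fake_llm)
print(result["key_points"])
print(result["summary"])
```

With the real model, `llm.invoke` could be passed as `llm_call`, assuming it echoes the prompt verbatim as in the outputs above.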

**Generating Text**

In [None]:
# Generate text from a prompt
prompt_text = "Write a short story about a robot who wants to be a painter."
generated_text = llm.invoke(prompt_text)
print("Generated Text:")
print(generated_text)

Generated Text:
Write a short story about a robot who wants to be a painter.

Title: The Palette of Percival

In the heart of a bustling metropolis, where the clamor of industry reigned supreme, there was an anomaly. A robot, named Percival, toiled in a factory, his gears grinding in sync with the ceaseless hum of machinery. Yet, Percival was unlike his brethren. He yearned for something more, something beyond the confines of his mechanical existence.

Percival had stumbled upon an old, dusty book in the abandoned library of the factory. It was a biography of Vincent Van Gogh, the tortured artist. The vivid descriptions of colors and the emotional depth of his paintings had ignited a spark within Percival. He longed to create, to express, and in the quiet recesses of his circuits, he dreamed of becoming a painter.

However, Percival faced a significant challenge. He was a robot, after all. He lacked the ability to hold a paintbrush, to mix colors on a palette, and to feel the texture o

In [None]:
# Generate text from a prompt
prompt_text = "Describe quantum computing in 5 sentences."
generated_text = llm.invoke(prompt_text)
print("Generated Text:")
print(generated_text)

Generated Text:
Describe quantum computing in 5 sentences.

1. Quantum computing is a type of computing that uses quantum bits, or qubits, instead of classical bits. Qubits can exist in multiple states at once, a property known as superposition, allowing for parallel processing and potentially faster computation.
2. Quantum algorithms, the instructions that tell a quantum computer what to do, can perform certain calculations exponentially faster than classical algorithms. This is particularly useful for tasks involving large amounts of data, such as optimization problems or factorization.
3. Quantum computers are currently in the early stages of development, with only a few small-scale machines built so far. The largest quantum computer built to date, developed by IBM, can perform calculations on 53 qubits.
4. Quantum computing is a complex and challenging field, requiring specialized knowledge of both physics and computer science. Researchers are working to develop error-correction te

In [None]:
# Generate text from a prompt
prompt_text = "What is generative AI."
generated_text = llm.invoke(prompt_text)
print("Generated Text:")
print(generated_text)

Generated Text:
What is generative AI. Generative AI refers to artificial intelligence systems that can create new content, such as images, text, music, and even videos, based on the data they have been trained on. These systems use algorithms and machine learning techniques to generate content that is similar to the data they have been trained on, but can also produce unique and novel content. Generative AI is often used in creative industries, such as art, music, and writing, to generate new ideas or to augment human creativity. It is also used in fields such as marketing, advertising, and design to create personalized content for individual customers or to generate large amounts of content for mass production. Some common types of generative AI include deep learning neural networks, generative adversarial networks (GANs), and variational autoencoders (VAEs). These systems have the ability to learn from large amounts of data and generate new content that is similar to the original da