### **SET A**

-------

#### A1 - Natural Language Processing (Simple Chatbot)

In [None]:
!pip install transformers

In [None]:
#This code suppresses the warning messages

from transformers.utils import logging
logging.set_verbosity_error()

In [None]:
from transformers import pipeline

A smaller model is chosen so that it fits in memory on lower-end hardware.

In [None]:
chatbot = pipeline(task="conversational", model="./models/facebook/blenderbot-400M-distill")

In [None]:
user_message = """
What are some fun activities I can do in the winter?
"""

In [None]:
from transformers import Conversation

In [None]:
conversation = Conversation(user_message)
print(conversation)

In [None]:
conversation = chatbot(conversation)
print(conversation)

We can try to continue the conversation by sending a follow-up message in a new `Conversation`:

In [None]:
print(chatbot(Conversation("What else do you recommend?")))

However, the chatbot may give an unrelated response, because a new `Conversation` object has no memory of the earlier exchange. To include the earlier turns in the LLM's context, add the follow-up as a new message on the existing `conversation` object, which already holds the chat history:

In [None]:
conversation.add_message(
    {"role": "user",
     "content": """
What else do you recommend?
"""
    })

print(conversation)

In [None]:
conversation = chatbot(conversation)

print(conversation)
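
To make the role of the chat history concrete, here is a minimal sketch in plain Python. It does not use the real model: `toy_bot` is a hypothetical stand-in that only reports how many user messages it was given, and the `history` list of role/content dictionaries is an assumed picture of what the conversation object carries between turns.

```python
# Hypothetical stand-in for the chatbot (NOT the real model):
# it only reports how many user messages are visible in its context.
def toy_bot(history):
    user_turns = [m["content"] for m in history if m["role"] == "user"]
    return f"I can see {len(user_turns)} user message(s)."

# Keep every turn in one list, so each reply is generated with the full history.
history = [{"role": "user",
            "content": "What are some fun activities I can do in the winter?"}]
history.append({"role": "assistant", "content": toy_bot(history)})
history.append({"role": "user", "content": "What else do you recommend?"})
history.append({"role": "assistant", "content": toy_bot(history)})

# A fresh list, by contrast, contains only the follow-up question.
fresh = [{"role": "user", "content": "What else do you recommend?"}]

print(toy_bot(fresh))          # context without the earlier turns
print(history[-1]["content"])  # context with the earlier turns
```

The follow-up question only makes sense to the model when the earlier turns are part of the input, which is why the message is appended to the existing conversation rather than sent on its own.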

**USEFUL LINKS:**

- https://huggingface.co/chat
- https://huggingface.co/spaces/open-llm-leaderboard
- https://huggingface.co/spaces/chatbot-arena-leaderboard

-------

#### A2 - Translation and Summarization

In [None]:
!pip install transformers 
!pip install torch

In [None]:
#This code suppresses the warning messages

from transformers.utils import logging
logging.set_verbosity_error()

**Translation**

In [None]:
from transformers import pipeline 
import torch

Load the pipeline.
Here, *torch_dtype=torch.bfloat16* loads the model weights in bfloat16 (a 16-bit floating-point format) instead of the default 32-bit floats. This roughly halves the memory the model needs, usually with little loss in output quality.

In [None]:
translator = pipeline(task="translation", model="./models/facebook/nllb-200-distilled-600M", torch_dtype=torch.bfloat16) 
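
As a rough illustration of what bfloat16 trades away, the sketch below uses only the standard library. It relies on the fact that bfloat16 keeps the sign bit, the 8 exponent bits and the top 7 mantissa bits of a float32. The helper names are hypothetical, and simple truncation is an assumption made for brevity (real conversions typically round to nearest).

```python
import struct

def to_bfloat16_bits(x: float) -> int:
    """Return a 16-bit bfloat16 pattern by truncating the float32 encoding of x."""
    (bits32,) = struct.unpack(">I", struct.pack(">f", x))
    return bits32 >> 16  # drop the 16 low mantissa bits

def from_bfloat16_bits(bits16: int) -> float:
    """Expand a 16-bit bfloat16 pattern back to a Python float."""
    (x,) = struct.unpack(">f", struct.pack(">I", bits16 << 16))
    return x

value = 3.14159265
approx = from_bfloat16_bits(to_bfloat16_bits(value))

print(value, approx)  # the second number is stored in half as many bits
```

Each weight takes 2 bytes instead of 4, at the cost of keeping only about two to three significant decimal digits per value.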

In [None]:
text = """\
चंदा भी दीवाना है तेरा, \
जलती है तुझसे, \
सारी चकोरियाँ.
"""

In [None]:
text_translated = translator(text, src_lang="hin_Deva", tgt_lang="eng_Latn")
print(text_translated)

**For the codes of other languages, refer to this link: https://github.com/facebookresearch/flores/blob/main/flores200/README.md#languages-in-flores-200**

***CODE TO FREE UP MEMORY BEFORE LOADING ANOTHER MODEL***

In [None]:
import gc

del translator

gc.collect()
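
The effect of `del` followed by `gc.collect()` can be checked on a small scale. The sketch below is only an illustration: `DummyModel` is a hypothetical placeholder for a loaded pipeline, and a weak reference is used to observe whether the object still exists.

```python
import gc
import weakref

class DummyModel:
    """Hypothetical placeholder for a large model held in memory."""
    def __init__(self):
        self.weights = [0.0] * 1_000_000

dummy_model = DummyModel()
probe = weakref.ref(dummy_model)  # observes the object without keeping it alive

del dummy_model  # remove the last strong reference to the object
gc.collect()     # also clean up anything kept alive only by reference cycles

print(probe() is None)  # True once the object has been released
```

The memory is only released if no other variable still refers to the object, so any other names bound to the old pipeline need to be deleted as well.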

**Summarization**

In [None]:
summarizer = pipeline(task="summarization", model="./models/facebook/bart-large-cnn", torch_dtype=torch.bfloat16)

In [None]:
text = """Paris is the capital and most populous city of France, with
          an estimated population of 2,175,601 residents as of 2018,
          in an area of more than 105 square kilometres (41 square
          miles). The City of Paris is the centre and seat of
          government of the region and province of Île-de-France, or
          Paris Region, which has an estimated population of
          12,174,880, or about 18 percent of the population of France
          as of 2017."""

In [None]:
summary = summarizer(text, min_length=10, max_length=100)

In [None]:
print(summary)

------

#### A3 - Sentence Embedding (Finding similarities between Texts)

In [None]:
!pip install sentence-transformers

In [None]:
#Suppresses warning messages

from transformers.utils import logging
logging.set_verbosity_error()

The *SentenceTransformer* class can load many different pretrained sentence-embedding models by name.

In [None]:
from sentence_transformers import SentenceTransformer

In [None]:
model = SentenceTransformer("all-MiniLM-L6-v2")

In [None]:
sentences1 = ['The cat sits outside',
              'A man is playing guitar',
              'The movies are awesome']

**Sentence similarity models convert input text into vectors (called embeddings).** The encoding step below therefore turns each sentence into an embedding.

In [None]:
embeddings1 = model.encode(sentences1, convert_to_tensor=True)

In [None]:
embeddings1

In [None]:
sentences2 = ['The dog plays in the garden',
              'A woman watches TV',
              'The new movie is so great']

In [None]:
embeddings2 = model.encode(sentences2, convert_to_tensor=True)

In [None]:
print(embeddings2)

Now, calculate the ***cosine similarity* between two sentences as a measure of how similar they are to each other**.

In [None]:
from sentence_transformers import util

In [None]:
cosine_scores = util.cos_sim(embeddings1,embeddings2)

In [None]:
print(cosine_scores)
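
For reference, cosine similarity is the dot product of two vectors divided by the product of their lengths. The sketch below computes the same kind of score matrix by hand on tiny made-up 2-D vectors; the function name `cosine_similarity` and the toy vectors are illustrative assumptions, not real embeddings.

```python
import math

def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their Euclidean norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for two small sets of embeddings (made up).
toy1 = [[1.0, 0.0], [0.0, 1.0]]
toy2 = [[2.0, 0.0], [1.0, 1.0]]

# Entry [i][j] compares toy1[i] with toy2[j], like the matrix printed above.
toy_scores = [[cosine_similarity(u, v) for v in toy2] for u in toy1]

for row in toy_scores:
    print(["{:.4f}".format(s) for s in row])
```

Vectors pointing in the same direction score 1 regardless of their length, and perpendicular vectors score 0.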

The diagonal values of the score matrix compare each sentence in `sentences1` with the sentence at the same position in `sentences2`, so print those pairs with their scores.

In [None]:
for i in range(len(sentences1)):
    print("{} \t\t {} \t\t Score: {:.4f}".format(sentences1[i], sentences2[i], cosine_scores[i][i]))
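
The diagonal is only one way to read the score matrix. As a small sketch on invented numbers (these are not outputs of the model), the code below pulls out the diagonal and also finds, for each row, the column with the highest score. The names `example_scores`, `paired` and `best_match` are hypothetical.

```python
# Invented 3x3 score matrix for illustration only (NOT real model output).
example_scores = [
    [0.28, -0.05, 0.01],
    [0.04, 0.12, -0.03],
    [-0.02, 0.07, 0.65],
]

# Diagonal: sentence i of the first list against sentence i of the second list.
paired = [example_scores[i][i] for i in range(len(example_scores))]

# Row-wise best match: index of the most similar sentence in the second list.
best_match = [max(range(len(row)), key=lambda j: row[j]) for row in example_scores]

print(paired)
print(best_match)
```

When the two lists are not aligned position by position, the row-wise maximum is the more informative reading.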

-----