# Change the Runtime

# Llama Model

```Runtime -> Change Runtime Type -> Hardware Accelerator (T4) ```

# Install the Libraries

In [1]:
!pip install transformers



In [2]:
!pip install accelerate



# Access the Llama 2 model

1. Request access to Llama 2 from Meta: https://ai.meta.com/resources/models-and-libraries/llama-downloads/


2. Once Meta has granted access, fill in the form on the Hugging Face model page to enable access to Llama 2 there: https://huggingface.co/meta-llama/Llama-2-7b-hf




# Get the Hugging Face Token

1. Go to https://huggingface.co/settings/tokens
2. Click "New token" and generate a token.

In [3]:
from huggingface_hub import login
# Replace the placeholder string with your own Hugging Face access token.
login("Please use your own API KEY here")

The token has not been saved to the git credentials helper. Pass `add_to_git_credential=True` in this function directly or `--add-to-git-credential` if using via `huggingface-cli` if you want to set the git credential as well.
Token is valid (permission: fineGrained).
Your token has been saved to /root/.cache/huggingface/token
Login successful
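
Pasting the token directly into a cell makes it easy to leak when the notebook is shared. The sketch below reads it from an environment variable instead. It assumes the token is stored under the name `HF_TOKEN`; `read_hf_token` is a hypothetical helper, not part of `huggingface_hub`.

```python
import os


def read_hf_token(env_var="HF_TOKEN"):
    """Return the token stored in the given environment variable.

    Hypothetical helper for illustration; the variable name is an assumption.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable to your token first.")
    return token


# Usage in the notebook (needs huggingface_hub and a real token):
# from huggingface_hub import login
# login(token=read_hf_token())
```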



# Loading Llama 2 7B

In [4]:
from transformers import AutoTokenizer
import transformers
import torch

#specify the model you want to use
model = "meta-llama/Llama-2-7b-chat-hf"

In [5]:
# Load the pretrained tokenizer. The model is gated, so token=True reuses the
# token saved by login(); newer transformers releases deprecate use_auth_token.
tokenizer = AutoTokenizer.from_pretrained(model, token=True)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


tokenizer_config.json:   0%|          | 0.00/1.62k [00:00<?, ?B/s]

tokenizer.model:   0%|          | 0.00/500k [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/1.84M [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/414 [00:00<?, ?B/s]

In [6]:
#make it ready for text generation
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

config.json:   0%|          | 0.00/614 [00:00<?, ?B/s]

model.safetensors.index.json:   0%|          | 0.00/26.8k [00:00<?, ?B/s]

Downloading shards:   0%|          | 0/2 [00:00<?, ?it/s]

model-00001-of-00002.safetensors:   0%|          | 0.00/9.98G [00:00<?, ?B/s]

model-00002-of-00002.safetensors:   0%|          | 0.00/3.50G [00:00<?, ?B/s]

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

generation_config.json:   0%|          | 0.00/188 [00:00<?, ?B/s]

Define the function that accepts the prompt and returns the response

In [8]:
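def get_response(prompt):
  """Generate one sampled completion for `prompt` and return only the new text."""

  sequences = pipeline(prompt,
      do_sample=True,            # sample instead of always picking the most likely token
      return_full_text=False,    # return only the completion, not the prompt
      top_k=10,                  # sample from the 10 most likely next tokens
      #top_p = 0.9,
      num_return_sequences=1,
      eos_token_id=tokenizer.eos_token_id,
      max_length=1000,           # cap on prompt tokens plus generated tokens
  )

  return sequences[0]['generated_text']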

In [None]:
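# Example next-word distribution for "The cat is ____":
#   sleeping 40%, playing 30%, eating 15%, running 5%, walking 2%, and so on.
# (Aside: Word2Vec comes in two variants, CBOW and Skip-gram.)
# top_k = 5   -> sample only from the 5 most likely words.
# top_p = 0.9 -> sample only from the smallest set of most likely words whose
#                probabilities add up to 0.9 (sleeping 40% + playing 30% + eating 15% + running 5%).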
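
The two cut-offs can be sketched in plain Python on the toy distribution from the comments above. This is an illustration of the idea, not the code `transformers` runs internally: `top_k_filter` and `top_p_filter` are hypothetical helpers, and only the five words listed in the example are included. A real sampler would then renormalise the surviving probabilities and draw one token from them.

```python
def top_k_filter(probs, k):
    """Keep the k most likely tokens."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:k])


def top_p_filter(probs, p, eps=1e-9):
    """Keep the smallest set of most likely tokens whose probabilities sum to at least p."""
    kept = {}
    cumulative = 0.0
    for token, prob in sorted(probs.items(), key=lambda item: item[1], reverse=True):
        kept[token] = prob
        cumulative += prob
        # eps guards against floating-point rounding when the sum lands exactly on p
        if cumulative >= p - eps:
            break
    return kept


# Toy distribution for "The cat is ____" (remaining words omitted)
next_word_probs = {
    "sleeping": 0.40,
    "playing": 0.30,
    "eating": 0.15,
    "running": 0.05,
    "walking": 0.02,
}

print(list(top_k_filter(next_word_probs, 2)))    # ['sleeping', 'playing']
print(list(top_p_filter(next_word_probs, 0.9)))  # ['sleeping', 'playing', 'eating', 'running']
```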

# Prompting Llama 2

Prompt Template

```
<s>[INST] <<SYS>>
{{ system_prompt }}
<</SYS>>

{{ user_message }} [/INST] {{model_answer}} </s>

```
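
Rather than retyping the template in every cell, the fixed parts can be filled in by a small function. A minimal sketch, assuming a single-turn prompt; `build_prompt` is a hypothetical helper name. The model answer and the closing `</s>` are left out because the model produces them.

```python
def build_prompt(system_prompt, user_message):
    """Fill the single-turn Llama 2 chat template with a system prompt and a user message."""
    return (
        "<s>[INST] <<SYS>>\n"
        f"{system_prompt}\n"
        "<</SYS>>\n\n"
        f"{user_message} [/INST]"
    )


prompt = build_prompt(
    "You are a helpful, respectful and honest assistant.",
    "I liked Friends and Money Heist. Recommend similar shows to watch.",
)
print(prompt)
```

The returned string can be passed to `get_response` in place of a hand-written prompt.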

In [9]:
prompt = '''

<s>[INST] <<SYS>>
You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe.
Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content.
Please ensure that your responses are socially unbiased and positive in nature.
If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct.
If you don't know the answer to a question, please don't share false information.
<</SYS>>
I liked friends and money heist. Recommend me the similar shows to watch.
[/INST]
'''

In [10]:
print(get_response(prompt))

Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.


Thank you for reaching out! I'm glad to hear that you enjoyed "Friends" and "Money Heist." Here are some similar shows that you might enjoy:

1. "The Office" (US) - A classic sitcom that follows the daily lives of employees at a paper company.
2. "Parks and Recreation" - A comedy series that follows the employees of the Parks and Recreation department of a small town in Indiana.
3. "Brooklyn Nine-Nine" - A police sitcom that follows the adventures of a diverse group of detectives in Brooklyn.
4. "Schitt's Creek" - A comedy series that follows a wealthy family who loses everything and moves to a small town they bought as a joke.
5. "Killing Eve" - A spy thriller that follows a security agent and an assassin as they engage in a cat-and-mouse game.
6. "Orange is the New Black" - A comedy-drama series that follows the lives of a group of women who are incarcerated at a women's prison.
7. "The Good Place" - A fantasy-comedy series that follows a woman who mistakenly ends up in the "good pla

# Case Study - Machine Translation on Product Reviews Data

Prompt 1

In [11]:
prompt = '''

<s>[INST] <<SYS>>
You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe.
Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content.
Please ensure that your responses are socially unbiased and positive in nature.
If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct.
If you don't know the answer to a question, please don't share false information.
<</SYS>>
Translate the following text from french to english: Je suis extrêmement satisfait de mon expérience avec l'iPhone.
[/INST]

'''
print(get_response(prompt))

Of course! I'd be happy to help you translate "Je suis extrêmement satisfait de mon expérience avec l'iPhone" from French to English. Here's the translation:

"I am extremely satisfied with my experience with the iPhone."

I hope this helps! Let me know if you have any other questions.


Prompt 2: Build a translator that converts reviews written in any language into English

In [12]:
reviews=['La pantalla es increíble y la duración de la batería es impresionante',
         "Design élégant et performances exceptionnelles, l'iPhone est une merveille technologique.",
         "手机速度快，电池耐用，非常满意的购物体验。"]

In [13]:
for review in reviews:

  prompt = f'''
  <s>[INST] <<SYS>>
  You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe.  Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.

  If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
  <</SYS>>
  Translate the text given in delimiters <> into english: <{review}>
  [/INST]
  '''

  print("Review:", review)
  print("Output:",get_response(prompt))
  print("\n")

Review: La pantalla es increíble y la duración de la batería es impresionante
Output:  I'm glad you think so! "La pantalla es increíble" means "The screen is incredible" in English, and "la duración de la batería es impresionante" means "The battery life is impressive." So, the full sentence in English would be: "The screen is incredible and the battery life is impressive."


Review: Design élégant et performances exceptionnelles, l'iPhone est une merveille technologique.
Output:  Hello! I'm here to help you with any questions you may have. I understand that you want me to provide helpful and respectful responses, while ensuring that the information is safe and positive in nature.

Regarding your question, "Design élégant et performances exceptionnelles, l'iPhone est une merveille technologique.", I would translate it to English as follows:

"Elegant design and exceptional performance, the iPhone is a technological marvel."

This translation accurately conveys the meaning of the origin

Prompt 3

In [None]:
prompt = """
<s>[INST] <<SYS>>
You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe.  Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.

If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
<</SYS>>
Translate the following text into spanish and german:  Un bijou de technologie
[/INST]
"""
print(get_response(prompt))

In both Spanish and German, "Un bijou de technologie" can be translated as:

Spanish: "Un joyel de tecnología"

German: "Ein Schmuck der Technologie"

Explanation:

* In Spanish, "bijou" can be translated to "joyel," which means "jewel" or "gem."
* In German, "Bijou" can be translated to "Schmuck," which means "jewelry" or "trinket."

Note: "Tecnología" in Spanish and "Technologie" in German both mean "technology."


Prompt 4

In [16]:
prompt = """
<s>[INST] <<SYS>>
You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe.
Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content.
Please ensure that your responses are socially unbiased and positive in nature.

If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct.
If you don't know the answer to a question, please don't share false information.
<</SYS>>
Translate the following text into english and hindi:  Hi, how can I help you? Please explain in details.
[/INST]
"""
print(get_response(prompt))

In English:
"Hello! I'm here to help you in any way I can. Could you please provide more details about what you need help with? I'll do my best to assist you safely and respectfully."

In Hindi:
"नमस्ते! मैं आपको साथ में हेल्प करना चाहता हूँ। क्या आपको समझ में आँकhen की जानकारी देखने का क्या करना चाहते हैं? मैं सुरक्षित और प्रेरणादायक तरीक़े से आपको साथ में हेल्प करूँगा."


# Enabling Conversation with Llama 2

```
<s>[INST] <<SYS>>
{{ system_prompt }}
<</SYS>>

{{ user_msg_1 }} [/INST] {{ model_answer_1 }} </s>
<s>[INST] {{ user_msg_2 }} [/INST]
```
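
The multi-turn template above can be assembled from a list of earlier exchanges. A minimal sketch, assuming the history is kept as `(user_message, model_answer)` pairs; `build_chat_prompt` is a hypothetical helper name.

```python
def build_chat_prompt(system_prompt, history, next_user_message):
    """Build a multi-turn Llama 2 prompt.

    history is a list of (user_message, model_answer) pairs from earlier turns.
    The system block appears only inside the first [INST] section.
    """
    system_block = f"<<SYS>>\n{system_prompt}\n<</SYS>>\n\n"
    user_messages = [user for user, _ in history] + [next_user_message]
    answers = [answer for _, answer in history]

    turns = []
    for i, user in enumerate(user_messages):
        prefix = system_block if i == 0 else ""
        turn = f"<s>[INST] {prefix}{user} [/INST]"
        if i < len(answers):
            # Completed turns carry the model answer and the end-of-sequence marker
            turn += f" {answers[i]} </s>"
        turns.append(turn)
    return "\n".join(turns)


chat_prompt = build_chat_prompt(
    "You are a helpful, respectful and honest assistant.",
    [("Translate into Spanish: Un bijou de technologie", "Una joya de tecnología")],
    "Translate to hindi",
)
print(chat_prompt)
```

The final turn is left open after `[/INST]` so that the model generates the next answer.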

In [17]:
first_prompt_input = 'Translate the following text into spanish and german:  Un bijou de technologie'

In [18]:
first_prompt_output= '''In both Spanish and German, "Un bijou de technologie" can be translated as:

Spanish: "Un joya de tecnología"

German: "Eine Schatz der Technologie"

Both of these translations convey the idea of something being a "jewel" or "treasure" of technology, highlighting its value and significance.
'''

In [19]:
second_prompt_input="Translate to hindi"

In [21]:
prompt=f"""
<s>[INST] <<SYS>>
You are a helpful, respectful and honest assistant.
Always answer as helpfully as possible, while being safe.
Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content.
Please ensure that your responses are socially unbiased and positive in nature.
If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct.
If you don't know the answer to a question, please don't share false information.
<</SYS>>

{first_prompt_input}  [/INST]
{first_prompt_output}
</s>
<s>
[INST]
{second_prompt_input}
[/INST]
"""

get_response(prompt)

'Sure! Here\'s the translation of "Un bijou de technologie" in Hindi:\n\nहिंदी: "एक तकनीकी जहाज" (Ek takniiki jahaz)\n\nI hope this helps! Let me know if you have any other questions.'

# Hugging Face Models

### 1) Models - about 8 lakh (800,000)
### 2) Datasets - about 2 lakh (roughly 1.85 lakh, i.e. 185,000)

### Libraries: transformers (pipelines, tokenizers, models)
### Libraries: datasets (load any dataset you need from the Hub) and evaluate

In [22]:
#!pip install transformers
!pip install datasets


In [1]:
import pandas as pd
from transformers import pipeline
from datasets import load_dataset

In [2]:
classifier = pipeline(task='text-classification')
# Tasks available through pipeline():
# Text Classification
# Named Entity Recognition
# Question Answers
# Translation
# Summarization
# Text Generation
# Computer Vision - Object detection and Face recognition
# Few/Zero Shot learning

No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


config.json:   0%|          | 0.00/629 [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/268M [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/48.0 [00:00<?, ?B/s]

vocab.txt:   0%|          | 0.00/232k [00:00<?, ?B/s]

Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.


In [3]:
email = """Congratulations! You've Secured Your Spot in the Learnbay Data Science Internship Jan '24 Program! 🌟
Dear Applicant,

Congratulations!

We are thrilled to inform you that your profile has been shortlisted for the prestigious Learnbay Data Science Internship Program, January 2024.
Welcome to the journey of learning, growth, and exciting opportunities!

Let's celebrate your achievement with some impressive numbers. This year, we received a staggering 20,000 applications,
and you stand out as one of the top 10% who made it through our rigorous screening process. Your dedication and skills truly set you apart!

To formalize your acceptance and proceed with the onboarding process, we kindly ask you to complete your profile by filling out the following form: Form Link.
If you've already submitted the form, please disregard this message. The deadline to fill the form is January 18, 2024 by 5:00 PM IST.

Important Dates to Remember:
1. Internship Offer Letters and Zoom link to join the internship will be shared via email on January 22, 2024
2. Internship Start Date is January 24, 2024
   - Timing: 6:00 PM to 7:00 PM IST

Feel free to share this exciting news with your friends and family. We look forward to welcoming you to the Learnbay community and
embarking on this rewarding journey together.

Congratulations once again, and get ready for an incredible experience!

Best regards,
Learnbay Data Science Internship Team
"""

In [5]:
product_review = """I bought this product from Flipkart website.
This product is very worst and replacement policy is very bad. Even I went to their New Delhi support center.
I used this laptop only for 30 minute and suddenly it turn off and it will never turn on.
And Flipkart website does not replace this product. I should have gone for better brands like Apple or Alienware.
"""

In [6]:
classifier(email)

[{'label': 'POSITIVE', 'score': 0.9997090697288513}]

In [7]:
classifier(product_review)

[{'label': 'NEGATIVE', 'score': 0.9983298182487488}]

# Named Entity Recognition (NER)

In [8]:
ner_tagger = pipeline(task='ner')
ner_tagger(product_review)

No model was supplied, defaulted to dbmdz/bert-large-cased-finetuned-conll03-english and revision f2482bf (https://huggingface.co/dbmdz/bert-large-cased-finetuned-conll03-english).
Using a pipeline without specifying a model name and revision in production is not recommended.


config.json:   0%|          | 0.00/998 [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/1.33G [00:00<?, ?B/s]

Some weights of the model checkpoint at dbmdz/bert-large-cased-finetuned-conll03-english were not used when initializing BertForTokenClassification: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


tokenizer_config.json:   0%|          | 0.00/60.0 [00:00<?, ?B/s]

vocab.txt:   0%|          | 0.00/213k [00:00<?, ?B/s]

Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.


[{'entity': 'I-ORG',
  'score': 0.99784076,
  'index': 6,
  'word': 'F',
  'start': 27,
  'end': 28},
 {'entity': 'I-ORG',
  'score': 0.9879369,
  'index': 7,
  'word': '##lip',
  'start': 28,
  'end': 31},
 {'entity': 'I-ORG',
  'score': 0.95177233,
  'index': 8,
  'word': '##kar',
  'start': 31,
  'end': 34},
 {'entity': 'I-ORG',
  'score': 0.9963653,
  'index': 9,
  'word': '##t',
  'start': 34,
  'end': 35},
 {'entity': 'I-LOC',
  'score': 0.9994448,
  'index': 29,
  'word': 'New',
  'start': 129,
  'end': 132},
 {'entity': 'I-LOC',
  'score': 0.999671,
  'index': 30,
  'word': 'Delhi',
  'start': 133,
  'end': 138},
 {'entity': 'I-ORG',
  'score': 0.9976974,
  'index': 55,
  'word': 'F',
  'start': 249,
  'end': 250},
 {'entity': 'I-ORG',
  'score': 0.99205655,
  'index': 56,
  'word': '##lip',
  'start': 250,
  'end': 253},
 {'entity': 'I-ORG',
  'score': 0.9672067,
  'index': 57,
  'word': '##kar',
  'start': 253,
  'end': 256},
 {'entity': 'I-ORG',
  'score': 0.994812,
  'index
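
The tagger returns one entry per WordPiece token, so "Flipkart" arrives as `F`, `##lip`, `##kar`, `##t`. The sketch below merges neighbouring tokens of the same entity type back into whole entities, using the first six entries of the output above as sample data. `merge_entities` is a hypothetical helper; the pipeline also accepts an `aggregation_strategy` argument (for example `"simple"`) that does comparable grouping. This sketch does not handle a subword whose label differs from the token before it.

```python
def merge_entities(tokens):
    """Merge adjacent tokens that share an entity type into whole entities."""
    merged = []
    for tok in tokens:
        label = tok["entity"].split("-")[-1]  # "I-ORG" -> "ORG"
        piece = tok["word"]
        if piece.startswith("##"):
            piece = piece[2:]  # drop the WordPiece continuation marker
        prev = merged[-1] if merged else None
        # Same type and at most one character (a space) between the spans
        if prev and prev["label"] == label and tok["start"] - prev["end"] <= 1:
            prev["word"] += " " * (tok["start"] - prev["end"]) + piece
            prev["end"] = tok["end"]
            prev["score"] = min(prev["score"], tok["score"])
        else:
            merged.append({"label": label, "word": piece, "start": tok["start"],
                           "end": tok["end"], "score": tok["score"]})
    return merged


# First six tokens from the ner_tagger(product_review) output
sample_tokens = [
    {"entity": "I-ORG", "score": 0.99784076, "index": 6, "word": "F", "start": 27, "end": 28},
    {"entity": "I-ORG", "score": 0.9879369, "index": 7, "word": "##lip", "start": 28, "end": 31},
    {"entity": "I-ORG", "score": 0.95177233, "index": 8, "word": "##kar", "start": 31, "end": 34},
    {"entity": "I-ORG", "score": 0.9963653, "index": 9, "word": "##t", "start": 34, "end": 35},
    {"entity": "I-LOC", "score": 0.9994448, "index": 29, "word": "New", "start": 129, "end": 132},
    {"entity": "I-LOC", "score": 0.999671, "index": 30, "word": "Delhi", "start": 133, "end": 138},
]

for entity in merge_entities(sample_tokens):
    print(entity["label"], entity["word"])
# ORG Flipkart
# LOC New Delhi
```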

In [9]:
ner_tagger(email)

[{'entity': 'I-MISC',
  'score': 0.9311301,
  'index': 14,
  'word': 'Lea',
  'start': 49,
  'end': 52},
 {'entity': 'I-ORG',
  'score': 0.576208,
  'index': 15,
  'word': '##rn',
  'start': 52,
  'end': 54},
 {'entity': 'I-MISC',
  'score': 0.59212327,
  'index': 16,
  'word': '##bay',
  'start': 54,
  'end': 57},
 {'entity': 'I-MISC',
  'score': 0.6680807,
  'index': 17,
  'word': 'Data',
  'start': 58,
  'end': 62},
 {'entity': 'I-ORG',
  'score': 0.65474886,
  'index': 18,
  'word': 'Science',
  'start': 63,
  'end': 70},
 {'entity': 'I-MISC',
  'score': 0.8908577,
  'index': 19,
  'word': 'Inter',
  'start': 71,
  'end': 76},
 {'entity': 'I-MISC',
  'score': 0.48918214,
  'index': 20,
  'word': '##ns',
  'start': 76,
  'end': 78},
 {'entity': 'I-MISC',
  'score': 0.9545768,
  'index': 21,
  'word': '##hip',
  'start': 78,
  'end': 81},
 {'entity': 'I-MISC',
  'score': 0.5731509,
  'index': 22,
  'word': 'Jan',
  'start': 82,
  'end': 85},
 {'entity': 'I-MISC',
  'score': 0.8205653

In [10]:
editorial = """
By winning a third consecutive term in Haryana, the Bharatiya Janata Party (BJP) has demonstrated that its pole position in the Hindi heartland remains intact. Its failure to win an absolute majority in the 2024 general election has not eroded its social base, and in Haryana, it increased its vote share when compared to the previous election. The Congress too saw its vote share and the number of seats increase, but not enough to win power. The simultaneous gains for both parties are indicative of a sharper polarisation, but that does not entirely end the importance of smaller outfits and influential independents as it turns out — they tilted the scale in several constituencies. The outcome mirrors a social reality of Haryana that the BJP cleverly engineered to its benefit and which the Congress overlooked, namely, a broad alignment of non-Jat communities against Jat dominance. Incumbent Chief Minister Nayab Singh Saini, who is set for a second term, became the face of the BJP’s mobilisation of Other Backward Classes. The BJP’s strategy of offering political space for marginalised Hindu communities is one that is working well for it. Jats possibly united against the BJP, as the eclipse of the INLD and JJP suggest, but that worked in the BJP’s favour by aiding the counter-mobilisation of disparate groups. The Haryana poll outcome also helps Prime Minister Narendra Modi reinforce his authority over the party.

The Congress failed to inspire confidence among a wider spectrum of society as former Chief Minister Bhupinder Singh Hooda and his son Deepinder dominated the campaign. Their own Jat community rallied behind the party which possibly caused a counter consolidation of the rest. The Hoodas have so controlled the Congress in Haryana that the party organisation is either non-existent or ineffective. They stalled the central leadership’s efforts to form political alliances. The Congress’s Haryana setback follows the pattern of the Madhya Pradesh and Chhattisgarh elections that it lost in 2023 — regional leaders who refused to accommodate party colleagues and broaden the social base which failed the party. The party is struggling to find a balance between having a robust regional leadership and ensuring that its national outlook is not undermined. Senior leader Rahul Gandhi could not enforce his social justice agenda in the party’s Haryana strategy. Dalit party leaders were humiliated, opening space for others. The BJP has been in power for 10 years and there was notable resentment against it among voters. But that did not translate into a change of guard as the BJP could beat anti-incumbency while the Congress failed to gain from it. A study of the Haryana outcome will be instructive of why the BJP wins so often and the Congress ends up second best.
"""

In [11]:
ner_tagger(editorial)

[{'entity': 'I-LOC',
  'score': 0.99799275,
  'index': 8,
  'word': 'Haryana',
  'start': 40,
  'end': 47},
 {'entity': 'I-ORG',
  'score': 0.9996921,
  'index': 11,
  'word': 'Bharatiya',
  'start': 53,
  'end': 62},
 {'entity': 'I-ORG',
  'score': 0.9997192,
  'index': 12,
  'word': 'Janata',
  'start': 63,
  'end': 69},
 {'entity': 'I-ORG',
  'score': 0.99959975,
  'index': 13,
  'word': 'Party',
  'start': 70,
  'end': 75},
 {'entity': 'I-ORG',
  'score': 0.99973184,
  'index': 15,
  'word': 'B',
  'start': 77,
  'end': 78},
 {'entity': 'I-ORG',
  'score': 0.9996619,
  'index': 16,
  'word': '##JP',
  'start': 78,
  'end': 80},
 {'entity': 'I-MISC',
  'score': 0.9914404,
  'index': 26,
  'word': 'Hindi',
  'start': 129,
  'end': 134},
 {'entity': 'I-LOC',
  'score': 0.99917173,
  'index': 54,
  'word': 'Haryana',
  'start': 269,
  'end': 276},
 {'entity': 'I-ORG',
  'score': 0.99952865,
  'index': 69,
  'word': 'Congress',
  'start': 350,
  'end': 358},
 {'entity': 'I-LOC',
  'scor

# Summarization

In [12]:
hugging_summarization = pipeline("summarization")

No model was supplied, defaulted to sshleifer/distilbart-cnn-12-6 and revision a4f8f3e (https://huggingface.co/sshleifer/distilbart-cnn-12-6).
Using a pipeline without specifying a model name and revision in production is not recommended.


config.json:   0%|          | 0.00/1.80k [00:00<?, ?B/s]

pytorch_model.bin:   0%|          | 0.00/1.22G [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/26.0 [00:00<?, ?B/s]

vocab.json:   0%|          | 0.00/899k [00:00<?, ?B/s]

merges.txt:   0%|          | 0.00/456k [00:00<?, ?B/s]

Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.


In [13]:
hugging_summarization(editorial)

[{'summary_text': ' By winning a third consecutive term in Haryana, the Bharatiya Janata Party (BJP) has demonstrated that its pole position in the Hindi heartland remains intact . The Congress failed to inspire confidence among a wider spectrum of society as former Chief Minister Bhupinder Singh Hooda and his son Deepinder dominated the campaign .'}]

In [16]:
hugging_summarization(editorial, max_length=20)

Your min_length=56 must be inferior than your max_length=20.


[{'summary_text': ' By winning a third consecutive term in Haryana, the Bharatiya Janata'}]
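
The warning appears because only `max_length` was lowered, while the model's default `min_length` of 56 stayed in place. One way to avoid it is to set both together. A minimal sketch; `summary_length_kwargs` and the 0.5 ratio are illustrative assumptions, not part of `transformers`.

```python
def summary_length_kwargs(max_length, min_ratio=0.5):
    """Return max_length and min_length values where min_length stays below max_length."""
    if max_length < 2:
        raise ValueError("max_length must be at least 2")
    min_length = max(1, int(max_length * min_ratio))
    return {"max_length": max_length, "min_length": min_length}


print(summary_length_kwargs(20))  # {'max_length': 20, 'min_length': 10}

# Usage in the notebook:
# hugging_summarization(editorial, **summary_length_kwargs(20))
```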

In [14]:
hugging_summarization(email)

[{'summary_text': ' The deadline to fill the form is January 18, 2024 by 5:00 PM IST . The internship start date is January 24, 2024 and the start date will be 6:00pm to 7:00 pm . The internships will be shared via email on January 22, 2024, and the Zoom link to join the internship will be sent via email .'}]

# Question Answering

In [17]:
quest_answers = pipeline(task='question-answering')

No model was supplied, defaulted to distilbert/distilbert-base-cased-distilled-squad and revision 626af31 (https://huggingface.co/distilbert/distilbert-base-cased-distilled-squad).
Using a pipeline without specifying a model name and revision in production is not recommended.


config.json:   0%|          | 0.00/473 [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/261M [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/49.0 [00:00<?, ?B/s]

vocab.txt:   0%|          | 0.00/213k [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/436k [00:00<?, ?B/s]

Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.


In [19]:
question = "Who Party won the haryana election in 2024?"
quest_answers(question=question, context=editorial)

{'score': 0.33601605892181396,
 'start': 53,
 'end': 75,
 'answer': 'Bharatiya Janata Party'}

In [20]:
question = "Who will become the upcoming CM in Haryana assembly?"
quest_answers(question=question, context=editorial)

{'score': 0.38673385977745056,
 'start': 901,
 'end': 933,
 'answer': 'Chief Minister Nayab Singh Saini'}

In [21]:
email

"Congratulations! You've Secured Your Spot in the Learnbay Data Science Internship Jan '24 Program! 🌟\nDear Applicant,\n\nCongratulations!\n\nWe are thrilled to inform you that your profile has been shortlisted for the prestigious Learnbay Data Science Internship Program, January 2024. \nWelcome to the journey of learning, growth, and exciting opportunities!\n\nLet's celebrate your achievement with some impressive numbers. This year, we received a staggering 20,000 applications, \nand you stand out as one of the top 10% who made it through our rigorous screening process. Your dedication and skills truly set you apart!\n\nTo formalize your acceptance and proceed with the onboarding process, we kindly ask you to complete your profile by filling out the following form: Form Link. \nIf you've already submitted the form, please disregard this message. The deadline to fill the form is January 18, 2024 by 5:00 PM IST.\n\nImportant Dates to Remember:\n1. Internship Offer Letters and Zoom link 

In [22]:
question = "When are we going to receive an offer letter?"
quest_answers(question=question, context=email)

{'score': 0.3864248991012573,
 'start': 1035,
 'end': 1053,
 'answer': 'January 22, 2024\n2'}
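
The extracted span runs past the date into the next list item (`\n2`), because the model picks character offsets in the context rather than whole sentences. A minimal sketch of a post-processing step that trims the answer at the first line break and adjusts `end` to match; `clean_answer` is a hypothetical helper.

```python
def clean_answer(result):
    """Trim a question-answering result at the first line break."""
    answer = result["answer"].split("\n", 1)[0].rstrip()
    return {**result, "answer": answer, "end": result["start"] + len(answer)}


# Result copied from the cell output above
raw = {"score": 0.3864248991012573, "start": 1035, "end": 1053,
       "answer": "January 22, 2024\n2"}

print(clean_answer(raw))
# {'score': 0.3864248991012573, 'start': 1035, 'end': 1051, 'answer': 'January 22, 2024'}
```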

In [23]:
product_review

'I bought this product from Flipkart website.\nThis product is very worst and replacement policy is very bad. Even I went to their New Delhi support center.\nI used this laptop only for 30 minute and suddenly it turn off and it will never turn on.\nAnd Flipkart website does not replace this product. I should have gone for better brands like Apple or Alienware.\n'

In [25]:
question = "Where did the customer buy the product?"
quest_answers(question=question, context=product_review)

{'score': 0.8565688133239746,
 'start': 27,
 'end': 43,
 'answer': 'Flipkart website'}