## **Question Answering (Q&A)**

In [1]:
import torch
print(torch.cuda.is_available())
print(torch.cuda.get_device_name(0))



True
NVIDIA GeForce RTX 4050 Laptop GPU
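
The check above calls `torch.cuda.get_device_name(0)` unconditionally, which is expected to fail on a machine without a CUDA device. A minimal sketch of a more defensive check follows; the `device` name is an assumption introduced here and does not appear in the original notebook.

```python
# Defensive device check: fall back to CPU when CUDA (or PyTorch) is missing.
try:
    import torch
    cuda_available = torch.cuda.is_available()
except ImportError:  # keeps the sketch runnable where PyTorch is not installed
    torch = None
    cuda_available = False

# `device` is a hypothetical name used only for this illustration.
device = "cuda:0" if cuda_available else "cpu"
print(f"Using device: {device}")

# Only query the GPU name when a GPU is actually present.
if cuda_available:
    print(torch.cuda.get_device_name(0))
```

Note that the manual BERT cells below never move the model or its inputs to the GPU, so they appear to run on CPU. Calling `model.to(device)` and moving each input tensor with `.to(device)` would be one way to change that.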


In [3]:
from transformers import BertForQuestionAnswering, BertTokenizer



In [4]:
# Load model & tokenizer
tokenizer = BertTokenizer.from_pretrained('bert-large-uncased-whole-word-masking-finetuned-squad')
model = BertForQuestionAnswering.from_pretrained('bert-large-uncased-whole-word-masking-finetuned-squad')

To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
Xet Storage is enabled for this repo, but the 'hf_xet' package is not installed. Falling back to regular HTTP download. For better performance, install the package with: `pip install huggingface_hub[hf_xet]` or `pip install hf_xet`
Some weights of the model checkpoint at bert-large-uncased-whole-word-masking-finetuned-squad were not used when initializing BertForQuestionAnswering: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForQuestionAnswering from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForQue

In [5]:
context = """
The Amazon rainforest, often referred to as the "lungs of the Earth," is the largest tropical rainforest in the world, covering over 5.5 million square
 kilometers across nine countries in South America. It plays a crucial role in regulating the global climate by absorbing large amounts of carbon dioxide.
 The forest is home to millions of species of plants, animals, and insects, many of which are not found anywhere else on Earth. However, deforestation caused by
 logging, agriculture, and mining has led to significant habitat loss and environmental concerns. Efforts are being made by governments and organizations worldwide
 to protect and restore the rainforest through sustainable practices and conservation initiatives.
"""

In [6]:
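def ask_question(question):
    # Encode the question and the global `context` as a single sequence pair
    inputs = tokenizer(question, context, return_tensors='pt')
    # Inference only, so skip gradient tracking
    with torch.no_grad():
        output = model(**inputs)
    # Most likely start and end token positions (+1 makes the slice end-exclusive)
    answer_start = torch.argmax(output.start_logits)
    answer_end = torch.argmax(output.end_logits) + 1
    # Convert the selected token ids back into a readable string
    answer = tokenizer.convert_tokens_to_string(
        tokenizer.convert_ids_to_tokens(inputs['input_ids'][0][answer_start:answer_end])
    )
    print(f"Question : {question}")
    print(f"Answer:{answer.strip()}")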
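
`ask_question` takes the argmax of the start logits and the argmax of the end logits independently, so nothing stops the predicted end from landing before the start, which yields an empty answer. The sketch below shows one common alternative: score every valid `(start, end)` pair jointly. It uses plain Python lists, and `best_span` and `max_answer_len` are hypothetical names introduced here. It is a simplification, not the method the `transformers` library itself uses.

```python
def best_span(start_logits, end_logits, max_answer_len=30):
    """Return (start, end) token indices, end inclusive.

    Picks the pair maximizing start_logits[start] + end_logits[end]
    subject to start <= end < start + max_answer_len.
    """
    best_score = float("-inf")
    best = (0, 0)
    for s, s_logit in enumerate(start_logits):
        last = min(s + max_answer_len, len(end_logits))
        for e in range(s, last):
            score = s_logit + end_logits[e]
            if score > best_score:
                best_score = score
                best = (s, e)
    return best


# Toy logits where the independent argmaxes disagree:
# argmax(start) = 2 but argmax(end) = 1, which would give an empty slice.
start_logits = [0.1, 0.2, 5.0, 0.3]
end_logits = [0.0, 4.0, 0.5, 3.0]

span = best_span(start_logits, end_logits)
print(span)  # (2, 3)
```

In the notebook, the lists would presumably come from `output.start_logits[0].tolist()` and `output.end_logits[0].tolist()`, and the answer tokens would be `inputs['input_ids'][0][start:end + 1]`. This sketch does not exclude spans that fall inside the question tokens.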


In [7]:
ask_question("What is the Amazon rainforest often called ?")

Question : What is the Amazon rainforest often called ?
Answer:lungs of the earth


In [8]:
ask_question("Why is the Amazon rainforest important for the global climate?")

Question : Why is the Amazon rainforest important for the global climate?
Answer:absorbing large amounts of carbon dioxide


In [9]:
ask_question("What are some of the major causes of deforestation in the amazon")

Question : What are some of the major causes of deforestation in the amazon
Answer:logging , agriculture , and mining


In [10]:
# you can simplify the above with the pipeline API

from transformers import pipeline

# Load the Q&A pipeline
qa_pipeline = pipeline(
    "question-answering",
    model='bert-large-uncased-whole-word-masking-finetuned-squad'
)

Some weights of the model checkpoint at bert-large-uncased-whole-word-masking-finetuned-squad were not used when initializing BertForQuestionAnswering: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForQuestionAnswering from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForQuestionAnswering from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Device set to use cuda:0


In [11]:
# Pass question and context as keyword arguments
result = qa_pipeline(
    question="What is the Amazon rainforest often called ?",
    context=context
)
print(result['answer'])



lungs of the Earth
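
Besides `'answer'`, the dictionary returned by the question-answering pipeline also carries a `'score'` and the `'start'` and `'end'` character offsets of the answer within the context. The sketch below uses those offsets to mark the answer in place. `highlight_answer`, `demo_context`, and `demo_result` are hypothetical names, and the result dictionary is built by hand for illustration rather than taken from a real pipeline run.

```python
def highlight_answer(context: str, result: dict, marker: str = "**") -> str:
    """Wrap the answer span (given by character offsets) in `marker`."""
    start, end = result["start"], result["end"]
    return context[:start] + marker + context[start:end] + marker + context[end:]


# Hand-built stand-in for a pipeline result (no model is run here).
demo_context = 'The Amazon rainforest is often called the "lungs of the Earth".'
answer = "lungs of the Earth"
start = demo_context.index(answer)
demo_result = {"start": start, "end": start + len(answer), "answer": answer}

print(highlight_answer(demo_context, demo_result))
```

With the real pipeline, the same helper would presumably be called as `highlight_answer(context, result)`.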


## **Text Summarization**

In [None]:
from transformers import BartForConditionalGeneration, BartTokenizer

# Load model and tokenizer
# Note: this rebinds `tokenizer` and `model`, so ask_question() above
# will no longer work after this cell runs
model_name = 'facebook/bart-large-cnn'
tokenizer = BartTokenizer.from_pretrained(model_name)
model = BartForConditionalGeneration.from_pretrained(model_name)



In [None]:
text = """
Artificial intelligence (AI) has rapidly evolved over the past decade, transforming industries and daily life in remarkable ways.
 From virtual assistants that can understand natural language to self-driving cars capable of navigating complex environments,
 AI systems are becoming increasingly sophisticated. One of the key drivers behind this progress is the availability of massive amounts of data and
 improvements in computational power. Companies across sectors, including healthcare, finance, and education, are leveraging AI to enhance efficiency
 and decision-making. However, this rapid advancement also raises concerns about privacy, job displacement, and ethical use. For instance, the use of facial
 recognition technology has sparked debates over surveillance and individual rights. As AI continues to integrate into society, finding the balance between
 innovation and responsibility becomes essential. Governments, researchers, and organizations must collaborate to establish regulations and best practices
 that ensure AI benefits humanity while minimizing potential harm.
"""

In [None]:
# Tokenize and summarize
# truncation=True is needed for max_length to actually cap the input at 1024 tokens
inputs = tokenizer([text], max_length=1024, truncation=True, return_tensors='pt')
summary_ids = model.generate(inputs['input_ids'])
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)

print("Summary :\n", summary)

Summary :
 Artificial intelligence (AI) has rapidly evolved over the past decade, transforming industries and daily life in remarkable ways. From virtual assistants that can understand natural language to self-driving cars capable of navigating complex environments, AI systems are becoming increasingly sophisticated. Governments, researchers, and organizations must collaborate to establish regulations and best practices that ensure AI benefits humanity while minimizing potential harm.
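
The tokenizer call above uses `max_length=1024`, so a longer document would need to be cut down or split before summarizing. One possible workaround is to split the text into sentence-aligned chunks and summarize each chunk separately. The sketch below is an assumption-laden illustration: `chunk_by_sentences` is a hypothetical helper, and it counts whitespace-separated words as a rough proxy for tokens rather than using the real tokenizer.

```python
import re


def chunk_by_sentences(text: str, max_words: int = 400) -> list[str]:
    """Greedily pack whole sentences into chunks of at most `max_words` words.

    A single sentence longer than `max_words` still becomes its own chunk.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current, count = [], [], 0
    for sentence in sentences:
        if not sentence:
            continue
        n = len(sentence.split())
        # Start a new chunk when adding this sentence would exceed the budget.
        if current and count + n > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks


sample = "One two three. Four five six. Seven eight."
print(chunk_by_sentences(sample, max_words=6))
```

Each chunk could then be passed through the tokenize, generate, and decode steps shown above, and the partial summaries joined afterwards.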


In [None]:
# you can simplify the above with the pipeline API
from transformers import pipeline

summarizer = pipeline("summarization", model="Falconsai/text_summarization")
print(summarizer(text, do_sample=False)[0]['summary_text'])

Device set to use cpu
Your max_length is set to 200, but your input_length is only 186. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=93)


AI has rapidly evolved over the past decade, transforming industries and daily life in remarkable ways . From virtual assistants that can understand natural language to self-driving cars capable of navigating complex environments, AI systems are becoming increasingly sophisticated . This rapid advancement also raises concerns about privacy, job displacement, and ethical use .
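
The warning above notes that the default `max_length` of 200 exceeds the 186-token input and suggests `max_length=93`, roughly half the input length. The sketch below turns that rule of thumb into a small helper. `suggest_length_bounds` and its one-third `min_length` ratio are assumptions made for illustration and are not part of the `transformers` API.

```python
def suggest_length_bounds(input_length: int) -> dict:
    """Suggest generation length bounds from the input token count.

    max_length follows the warning's hint (about half the input);
    min_length is an arbitrary one third of max_length (assumption).
    """
    max_length = max(2, input_length // 2)
    min_length = max(1, max_length // 3)
    return {"max_length": max_length, "min_length": min_length}


# 186 is the input length reported in the warning above.
bounds = suggest_length_bounds(186)
print(bounds)  # {'max_length': 93, 'min_length': 31}
```

These values could then be passed along with the existing call, for example `summarizer(text, do_sample=False, **bounds)`, since the summarization pipeline forwards `max_length` and `min_length` to generation.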
