In [None]:
from transformers import pipeline
import pandas as pd


## Named Entity Recognition


In [None]:
# Create an NER pipeline; with no model argument, transformers falls back to a default checkpoint
ner_nlp = pipeline('ner')

No model was supplied, defaulted to dbmdz/bert-large-cased-finetuned-conll03-english and revision f2482bf (https://huggingface.co/dbmdz/bert-large-cased-finetuned-conll03-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
Some weights of the model checkpoint at dbmdz/bert-large-cased-finetuned-conll03-english were not used when initializing BertForTokenClassification: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
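The warning above recommends naming the model and revision explicitly. A minimal sketch of how that might look, using the checkpoint names and revision hashes that this notebook's own warnings print. The `PINNED` table and the helpers `pinned_kwargs` and `load_pinned` are hypothetical names introduced only for illustration.

```python
# Default checkpoints and revisions, copied from the warnings printed in this notebook.
PINNED = {
    "ner": ("dbmdz/bert-large-cased-finetuned-conll03-english", "f2482bf"),
    "question-answering": ("distilbert/distilbert-base-cased-distilled-squad", "626af31"),
    "summarization": ("sshleifer/distilbart-cnn-12-6", "a4f8f3e"),
}


def pinned_kwargs(task):
    """Build keyword arguments for pipeline() with an explicit model and revision."""
    model, revision = PINNED[task]
    return {"task": task, "model": model, "revision": revision}


def load_pinned(task):
    """Create the pipeline; transformers is imported lazily so the table can be inspected without it."""
    from transformers import pipeline
    return pipeline(**pinned_kwargs(task))


print(pinned_kwargs("ner"))
```

Calling `load_pinned('ner')` should then give the same model as `pipeline('ner')` does here, but the choice no longer depends on whatever default the installed transformers version ships with.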


In [None]:
# Pass a sentence to the NER pipeline to identify the names in it
result = ner_nlp('I am a machine learning instructor, and my name is Ridwan. ')

print('Result: ', result)

Result:  [{'entity': 'I-PER', 'score': 0.99924266, 'index': 12, 'word': 'R', 'start': 51, 'end': 52}, {'entity': 'I-PER', 'score': 0.9638671, 'index': 13, 'word': '##id', 'start': 52, 'end': 54}, {'entity': 'I-PER', 'score': 0.9941451, 'index': 14, 'word': '##wan', 'start': 54, 'end': 57}]


In [None]:
#better visualize the result in dataframe
df = pd.DataFrame(result)

In [None]:
print(df)  # each row is a sub-word token tagged I-PER (part of a person's name), with its score, token index, word piece, and character span (start, end)

  entity     score  index   word  start  end
0  I-PER  0.999243     12      R     51   52
1  I-PER  0.963867     13   ##id     52   54
2  I-PER  0.994145     14  ##wan     54   57
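The table shows the name split into three WordPiece tokens (`R`, `##id`, `##wan`), where the `##` prefix marks a piece that continues the previous token. A rough sketch of gluing those pieces back into one entity, working directly on the result printed above. The helper `merge_word_pieces` is a hypothetical name, and taking the minimum score across pieces is an assumption made here, not something the pipeline prescribes.

```python
def merge_word_pieces(tokens):
    """Merge adjacent '##' continuation pieces of the same entity type into whole words."""
    merged = []
    for tok in tokens:
        label = tok["entity"].split("-")[-1]  # 'I-PER' -> 'PER'
        piece = tok["word"]
        is_continuation = piece.startswith("##")
        if (
            merged
            and is_continuation
            and merged[-1]["entity"] == label
            and merged[-1]["end"] == tok["start"]  # pieces must be contiguous in the text
        ):
            last = merged[-1]
            last["word"] += piece[2:]
            last["end"] = tok["end"]
            last["score"] = min(last["score"], tok["score"])  # assumption: keep the weakest score
        else:
            merged.append({
                "entity": label,
                "word": piece[2:] if is_continuation else piece,
                "start": tok["start"],
                "end": tok["end"],
                "score": tok["score"],
            })
    return merged


sentence = 'I am a machine learning instructor, and my name is Ridwan. '
result = [
    {'entity': 'I-PER', 'score': 0.99924266, 'index': 12, 'word': 'R', 'start': 51, 'end': 52},
    {'entity': 'I-PER', 'score': 0.9638671, 'index': 13, 'word': '##id', 'start': 52, 'end': 54},
    {'entity': 'I-PER', 'score': 0.9941451, 'index': 14, 'word': '##wan', 'start': 54, 'end': 57},
]

entities = merge_word_pieces(result)
print(entities)
print(sentence[entities[0]["start"]:entities[0]["end"]])  # Ridwan
```

This only rejoins word pieces; a multi-word name would still come back as separate entries. The NER pipeline also accepts an `aggregation_strategy` argument (for example `'simple'`) that groups tokens for you, which is likely the better choice in practice.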


## Question Answering

In [None]:
# Create a QA pipeline by passing 'question-answering' as the task

qa_nlp = pipeline('question-answering')

No model was supplied, defaulted to distilbert/distilbert-base-cased-distilled-squad and revision 626af31 (https://huggingface.co/distilbert/distilbert-base-cased-distilled-squad).
Using a pipeline without specifying a model name and revision in production is not recommended.


In [None]:
# Define the context passage the model will extract answers from

context = '''
Nigeria is the most populous black nation on earth. It has the population of more than 200 million people. It has more than 360
ethnic groups with Yoruba, Hausa and Igbo being the major ethnic group in the country. The country is facing lots of challenges ranging from corruption,
bad leadership, mismanagement of resources, poverty and insecurity.
'''

In [None]:
# Pass the question and the context to the QA pipeline
result = qa_nlp(question='Describe Nigeria in few sentences', context=context)

In [None]:
print(result)

{'score': 0.051765672862529755, 'start': 1, 'end': 51, 'answer': 'Nigeria is the most populous black nation on earth'}
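The `start` and `end` values are character offsets into `context`, so slicing the context with them should reproduce `answer`. The answer starts at 1 rather than 0 because the triple-quoted string begins with a newline. A small sketch of that check, using only the first sentence of the context so the offsets can be verified by hand; `span_matches` is a hypothetical helper name.

```python
def span_matches(context, result):
    """Return True when the reported character span reproduces the answer text."""
    return context[result["start"]:result["end"]] == result["answer"]


# First sentence of the context only, with the same leading newline as the full string.
short_context = "\nNigeria is the most populous black nation on earth."
result = {
    "score": 0.051765672862529755,
    "start": 1,
    "end": 51,
    "answer": "Nigeria is the most populous black nation on earth",
}

print(span_matches(short_context, result))  # True
```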


In [None]:
# Try a second question; extractive QA returns one span from the context, not a full list
result = qa_nlp(question='what are all the problems in Nigeria', context=context)

print(result)

{'score': 0.03296797350049019, 'start': 328, 'end': 350, 'answer': 'poverty and insecurity'}
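Both answers above come back with scores of roughly 0.03 to 0.05, which suggests the model is not confident in either span. That is plausible for open-ended questions, since an extractive model can only return one stretch of the context. One possible way to handle this is to discard answers below a cutoff. The sketch below assumes a threshold of 0.5, which is an arbitrary value chosen for illustration, and `confident_answer` is a hypothetical helper.

```python
def confident_answer(result, min_score=0.5):
    """Return the answer text, or None when the model's score is below min_score."""
    if result["score"] >= min_score:
        return result["answer"]
    return None


# The two results printed in this section.
results = [
    {"score": 0.051765672862529755, "start": 1, "end": 51,
     "answer": "Nigeria is the most populous black nation on earth"},
    {"score": 0.03296797350049019, "start": 328, "end": 350,
     "answer": "poverty and insecurity"},
]

for r in results:
    print(confident_answer(r))
```

With the 0.5 cutoff both answers are rejected (`None`), so a caller could fall back to rephrasing the question more specifically, for example asking about one fact at a time.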


## Text Summarization


In [None]:

# Create the text to summarize

context = '''Our mission is to help prepare young professionals and college interns for their choice
careers and the future of work. It is a known fact that getting the first professional job,
post college education, is universally challenging. Depending on the country, it may be
the most difficult challenge young graduates will ever face. Whether one is successful at
it or not could determine the course of one’s life. This is where Hamoye comes in. We
offer world class, specially designed, internship style learning programs, to help
prospective interns acquire real world work experience; all for free. What’s the catch?
There is no catch! Our goal is to help struggling young professionals or college
students who want to set themselves up for success with some of the tools they need to
differentiate themselves in the tough tech job market.
Our internship programs simulate the real work environment, while cutting out all of the
unnecessary distractions that come with traditional internships. Not only do we have the
capacity to provide everyone with an opportunity to acquire job-ready expertise, we
have the special sauce to ensure our interns get the best out of the experience. Also,
interns will work alone and with peers, in a fast paced, competitive environment, to keep
them motivated throughout their choice program.
Our internship programs are offered 3 to 4 times a year and involve:
(i) specially designed short courses with real-world projects to introduce interns to the
tools they need to stand out
(ii) exposure to industry experts via webinars
(iii) career coaching and
(iv) presentation project with peers.
Our Social Learning Management Platform (hamoye.com) is designed to help interns
participate from anywhere in the world, track their accomplishments, and showcase
themselves to potential employers.
'''

In [None]:
# Check its original length (in characters)
len(context)

1822

In [None]:
# Create a text summarization pipeline by passing 'summarization' as the task
summarizer_nlp = pipeline('summarization')

No model was supplied, defaulted to sshleifer/distilbart-cnn-12-6 and revision a4f8f3e (https://huggingface.co/sshleifer/distilbart-cnn-12-6).
Using a pipeline without specifying a model name and revision in production is not recommended.


In [None]:
# Pass the context to the summarization pipeline
summary = summarizer_nlp(context)

In [None]:
# Print the raw pipeline output
print(summary)

[{'summary_text': ' Hamoye.com offers world class, internship style learning programs, to help prospective interns acquire real world work experience . Our goal is to help struggling young professionals or college students who want to set themselves up for success with some of the tools they need to differentiate themselves in the tough tech job market . Our internship programs simulate the real work environment, while cutting out all of the unnecessary distractions that come with traditional internships .'}]


In [None]:
# The pipeline returns a list, so take the first item
result = summary[0]

In [None]:
# Print the extracted item (a dictionary)
print(result)

{'summary_text': ' Hamoye.com offers world class, internship style learning programs, to help prospective interns acquire real world work experience . Our goal is to help struggling young professionals or college students who want to set themselves up for success with some of the tools they need to differentiate themselves in the tough tech job market . Our internship programs simulate the real work environment, while cutting out all of the unnecessary distractions that come with traditional internships .'}


In [None]:
# Extract the summary text from the dictionary
result = result['summary_text']

In [None]:
# Check the summary's length (in characters)
print(len(result))

492


In [None]:
# Remove leading and trailing whitespace
result = result.strip()

In [None]:
#see result
print(result)

Hamoye.com offers world class, internship style learning programs, to help prospective interns acquire real world work experience . Our goal is to help struggling young professionals or college students who want to set themselves up for success with some of the tools they need to differentiate themselves in the tough tech job market . Our internship programs simulate the real work environment, while cutting out all of the unnecessary distractions that come with traditional internships .


In [None]:
# Split on whitespace and rejoin with single spaces to collapse any repeated spaces
result = ' '.join(result.split())

print(result)

Hamoye.com offers world class, internship style learning programs, to help prospective interns acquire real world work experience . Our goal is to help struggling young professionals or college students who want to set themselves up for success with some of the tools they need to differentiate themselves in the tough tech job market . Our internship programs simulate the real work environment, while cutting out all of the unnecessary distractions that come with traditional internships .
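The cleaned summary still has a space before each full stop (`experience . Our goal`), because splitting and rejoining only normalizes runs of whitespace. A possible follow-up step, sketched with a regular expression on a shortened excerpt of the summary; `tidy_punctuation` is a hypothetical helper, not part of transformers.

```python
import re


def tidy_punctuation(text):
    """Remove whitespace that appears directly before sentence punctuation."""
    return re.sub(r"\s+([.,;:!?])", r"\1", text)


# Shortened excerpt of the summary printed above.
sample = ("Hamoye.com offers world class, internship style learning programs, "
          "to help prospective interns acquire real world work experience . "
          "Our goal is to help struggling young professionals .")

print(tidy_punctuation(sample))
```

The pattern is deliberately simple; it would also close up text such as `costs .5 dollars`, so it is best treated as a starting point and checked against the kind of text being summarized.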


In [None]:
print(len(result))

491
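To put the two lengths side by side, the ratio of summary length to original length can be computed directly. The sketch below uses the character counts reported in this notebook (1822 and 491); note that these are character counts, not token counts, and `compression_ratio` is a hypothetical helper name.

```python
def compression_ratio(original_length, summary_length):
    """Return the summary length as a fraction of the original length."""
    return summary_length / original_length


original_length = 1822  # len(context) reported earlier in this section
summary_length = 491    # len(result) after cleaning

ratio = compression_ratio(original_length, summary_length)
print(f"The summary is {ratio:.1%} of the original length")
```

So the default summarization model reduced the text to a little over a quarter of its original size.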
