# Natural Language Processing
Natural language processing (NLP) is a field of linguistics and machine learning focused on understanding everything related to human language.

The following is a list of common NLP tasks:
1. Classifying whole sentences
2. Classifying each word in a sentence
3. Generating text content
4. Extracting an answer from a text
5. Generating a new sentence from an input text

## Why is it a challenge?
A model cannot learn from raw text directly: the text first has to be converted into a form the model can work with. Because language is complex, we need to think carefully about how this processing is done.
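
To make the idea of "processing text for a model" concrete, here is a minimal sketch that maps words to integer IDs. It is a toy whitespace tokenizer, not what the Transformers library does (real tokenizers usually split text into subword units); the names `build_vocab`, `encode`, and the `[UNK]` placeholder are assumptions chosen for illustration.

```python
def build_vocab(sentences):
    """Assign an integer ID to every distinct lowercase word, in order of first appearance."""
    vocab = {"[UNK]": 0}  # ID 0 is reserved for words never seen while building the vocabulary
    for sentence in sentences:
        for word in sentence.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab)
    return vocab


def encode(sentence, vocab):
    """Turn a sentence into a list of IDs; unknown words map to the [UNK] ID."""
    return [vocab.get(word, vocab["[UNK]"]) for word in sentence.lower().split()]


corpus = ["I love this course", "I hate this meeting"]
vocab = build_vocab(corpus)

print(encode("I love this meeting", vocab))  # every word is in the vocabulary
print(encode("I love pizza", vocab))         # "pizza" falls back to the [UNK] ID
```

Even this toy version shows the kind of decisions involved: how to split the text, how to handle casing, and what to do with words the model has never seen.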

# Transformers
Install the Transformers, Datasets, and Evaluate libraries:

In [None]:
!pip install datasets evaluate transformers[sentencepiece]

The most basic object in the 🤗 Transformers library is the `pipeline()` function. It connects a model with its necessary preprocessing and postprocessing steps, allowing us to directly input any text and get an intelligible answer:

In [None]:
from transformers import pipeline

classifier = pipeline("sentiment-analysis")
classifier("I've been waiting for a RoboAI course on ML and AI, my whole life!")

We can even pass several sentences!

In [None]:
classifier(
    ["I've been waiting for a RoboAI course on ML and AI, my whole life!",
     "I hate this course so much, I just wanna leave this Zoom meeting! Ughh!"]
)

There are three main steps involved when you pass some text to a pipeline:

1. The text is preprocessed into a format the model can understand.
2. The preprocessed inputs are passed to the model.
3. The predictions of the model are post-processed, so you can make sense of them.
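
The three steps above can be sketched with toy stand-ins. This is not the Transformers implementation: the word lists, the counting "model", and the function names are all assumptions made for illustration. Only the overall shape (preprocess, run the model to get raw scores, then apply a softmax and map to a label) mirrors what a sentiment pipeline does.

```python
import math

# Toy word lists standing in for a trained model (illustrative assumption).
POSITIVE_WORDS = {"love", "great", "enjoy"}
NEGATIVE_WORDS = {"hate", "awful", "ughh"}
LABELS = ["NEGATIVE", "POSITIVE"]


def preprocess(text):
    """Step 1: turn raw text into lowercase tokens without surrounding punctuation."""
    return [word.strip(".,!?").lower() for word in text.split()]


def model(tokens):
    """Step 2: produce one raw score (logit) per class, ordered as in LABELS."""
    negative = sum(token in NEGATIVE_WORDS for token in tokens)
    positive = sum(token in POSITIVE_WORDS for token in tokens)
    return [float(negative), float(positive)]


def postprocess(logits):
    """Step 3: softmax the logits into probabilities and return a readable label."""
    shift = max(logits)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": LABELS[best], "score": probs[best]}


def toy_pipeline(text):
    return postprocess(model(preprocess(text)))


print(toy_pipeline("I hate this course so much!"))
print(toy_pipeline("I love this course"))
```

In the real library, step 1 is handled by a tokenizer and step 2 by a pretrained neural network, but the final label and score come from the same kind of post-processing.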

## Zero-shot classification

You've already seen how a pipeline can classify a sentence as positive or negative. With the zero-shot classification pipeline, you can classify text using any other set of labels you like, without fine-tuning the model on those labels:

In [None]:
from transformers import pipeline

classifier = pipeline("zero-shot-classification")
classifier(
    "This is a course about the Transformers library",
    candidate_labels=["education", "politics", "business"],
)
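
The zero-shot pipeline returns a dictionary holding the input `sequence`, the candidate `labels`, and their matching `scores`. A small helper for reading that result might look like the sketch below. The helper name `best_label` is an assumption, and the scores in `example_result` are made-up placeholder values, not real model output.

```python
def best_label(result):
    """Return the (label, score) pair with the highest score."""
    pairs = zip(result["labels"], result["scores"])
    return max(pairs, key=lambda pair: pair[1])


# Hypothetical result with the same shape as the pipeline output.
# The scores below are placeholders, not values produced by a model.
example_result = {
    "sequence": "This is a course about the Transformers library",
    "labels": ["education", "business", "politics"],
    "scores": [0.8, 0.15, 0.05],
}

label, score = best_label(example_result)
print(f"{label}: {score:.2f}")
```

With a real pipeline you would pass the return value of `classifier(...)` to `best_label` instead of the placeholder dictionary.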

## Text generation


In [None]:
from transformers import pipeline

generator = pipeline("text-generation")
generator("In this course on ML and AI by RoboAI, we will teach you how to")

In [None]:
# You can also choose which model to use; here, distilgpt2
from transformers import pipeline

generator = pipeline("text-generation", model="distilgpt2")
generator(
    "In this course on ML and AI by RoboAI, we will teach you how to",
    max_length=30,
    num_return_sequences=2,
)

## Mask filling

In [None]:
from transformers import pipeline

unmasker = pipeline("fill-mask")
unmasker("This course will teach you all about <mask> models.", top_k=2)

## Named entity recognition

In [None]:
from transformers import pipeline

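# aggregation_strategy="simple" groups sub-tokens into whole entities;
# it replaces the deprecated grouped_entities=True argument
ner = pipeline("ner", aggregation_strategy="simple")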
ner("My name is Keivalya and I am the CTO of QuickBot Tech in Vadodara.")

## Question answering

In [None]:
from transformers import pipeline

question_answerer = pipeline("question-answering")
question_answerer(
    question="Where do I work?",
    context="My name is Keivalya and I am the CTO of QuickBot Tech in Vadodara.",
)

## Summarization

In [None]:
from transformers import pipeline

summarizer = pipeline("summarization")
summarizer(
    """
    America has changed dramatically during recent years. Not only has the number of
    graduates in traditional engineering disciplines such as mechanical, civil,
    electrical, chemical, and aeronautical engineering declined, but in most of
    the premier American universities engineering curricula now concentrate on
    and encourage largely the study of engineering science. As a result, there
    are declining offerings in engineering subjects dealing with infrastructure,
    the environment, and related issues, and greater concentration on high
    technology subjects, largely supporting increasingly complex scientific
    developments. While the latter is important, it should not be at the expense
    of more traditional engineering.

    Rapidly developing economies such as China and India, as well as other
    industrial countries in Europe and Asia, continue to encourage and advance
    the teaching of engineering. Both China and India, respectively, graduate
    six and eight times as many traditional engineers as does the United States.
    Other industrial countries at minimum maintain their output, while America
    suffers an increasingly serious decline in the number of engineering graduates
    and a lack of well-educated engineers.
"""
)

As with text generation, you can specify a `max_length` or a `min_length` for the result.
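
As a rough sketch of how those length bounds might be passed, the helper below forwards `min_length` and `max_length` (both counted in tokens) to a summarization pipeline and pulls out the `summary_text` field. The helper name `summarize` and the default values 30 and 80 are assumptions picked for illustration, not recommended settings.

```python
def summarize(summarizer, text, min_length=30, max_length=80):
    """Call a summarization pipeline with explicit length bounds, given in tokens."""
    result = summarizer(text, min_length=min_length, max_length=max_length)
    # The pipeline returns a list with one dict per input text.
    return result[0]["summary_text"]


# Usage with the real pipeline (downloads a model on first run):
#
#   from transformers import pipeline
#   summarizer = pipeline("summarization")
#   print(summarize(summarizer, long_text, min_length=20, max_length=60))
```

Taking the pipeline as an argument means it is created only once and can be reused across many calls.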

## Translation

In [None]:
from transformers import pipeline

translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-hi")
translator("My name is Keivalya, and I live in Vadodara.")