Text Classification

In [1]:
from transformers import pipeline

In [2]:
# Load the sentiment-analysis pipeline
classifier = pipeline('sentiment-analysis')

No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.


Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.
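The warnings above recommend naming the model and revision explicitly, and note that the model stays on CPU unless a `device` argument is passed. A minimal sketch of doing both, reusing the model name and revision printed in the log. The helper names `pinned_pipeline_kwargs` and `load_pinned_classifier` are hypothetical, and `device=0` assumes the first CUDA GPU is the intended target:

```python
def pinned_pipeline_kwargs(use_gpu=False):
    """Build keyword arguments that pin the model and revision named in the log."""
    return {
        "task": "sentiment-analysis",
        "model": "distilbert/distilbert-base-uncased-finetuned-sst-2-english",
        "revision": "af0f99b",
        # device=-1 keeps the model on CPU; device=0 selects the first GPU
        "device": 0 if use_gpu else -1,
    }


def load_pinned_classifier(use_gpu=False):
    """Create the pipeline; the import is deferred so the helper above has no dependencies."""
    from transformers import pipeline
    return pipeline(**pinned_pipeline_kwargs(use_gpu))
```

Calling `load_pinned_classifier(use_gpu=True)` in a GPU runtime should avoid both warnings, since the model, revision, and device are all stated explicitly.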


In [3]:
# Sample texts to classify
texts = [
    "I love this product! It's amazing.",
    "I'm very disappointed. It broke after one day.",
    "This movie was a waste of time.",
    "Such a wonderful experience, highly recommended!"
]

In [4]:
# Classify the sentiment of each text (the pipeline accepts a list and returns one result per item)
results = classifier(texts)

In [5]:
# Display the results
for text, result in zip(texts, results):
    print(f"Text: {text}\nSentiment: {result['label']}, Confidence: {result['score']:.2f}\n")

Text: I love this product! It's amazing.
Sentiment: POSITIVE, Confidence: 1.00

Text: I'm very disappointed. It broke after one day.
Sentiment: NEGATIVE, Confidence: 1.00

Text: This movie was a waste of time.
Sentiment: NEGATIVE, Confidence: 1.00

Text: Such a wonderful experience, highly recommended!
Sentiment: POSITIVE, Confidence: 1.00
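
Every prediction above is reported with a confidence of 1.00, but less clear-cut texts can score lower. The sketch below tallies labels and sets aside predictions under a threshold. The function name `tally_sentiments`, the 0.9 cutoff, and the example scores are all assumptions made for illustration; only the `label` and `score` keys come from the notebook.

```python
from collections import Counter


def tally_sentiments(results, min_score=0.9):
    """Count labels among confident predictions and collect indices of uncertain ones."""
    counts = Counter()
    uncertain = []
    for index, result in enumerate(results):
        if result["score"] >= min_score:
            counts[result["label"]] += 1
        else:
            uncertain.append(index)
    return dict(counts), uncertain


# Hypothetical results in the same shape as the classifier output
example_results = [
    {"label": "POSITIVE", "score": 0.99},
    {"label": "NEGATIVE", "score": 0.98},
    {"label": "NEGATIVE", "score": 0.62},
]
counts, uncertain = tally_sentiments(example_results)
print(counts, uncertain)
```

The indices in `uncertain` point back into the input list, so the matching texts can be reviewed by hand.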



Named Entity Recognition

In [6]:
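# Load the NER pipeline; aggregation_strategy='simple' merges word pieces into whole entities
# (it replaces the deprecated grouped_entities=True)
ner_classifier = pipeline('ner', aggregation_strategy='simple')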

No model was supplied, defaulted to dbmdz/bert-large-cased-finetuned-conll03-english and revision f2482bf (https://huggingface.co/dbmdz/bert-large-cased-finetuned-conll03-english).
Using a pipeline without specifying a model name and revision in production is not recommended.


Some weights of the model checkpoint at dbmdz/bert-large-cased-finetuned-conll03-english were not used when initializing BertForTokenClassification: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.


In [7]:
# Sample text
text = "Apple is looking at buying U.K. startup for $1 billion. Elon Musk, CEO of SpaceX, gave a speech in San Francisco."

In [8]:
# Perform NER on the text
ner_results = ner_classifier(text)

In [9]:
# Display the results
for entity in ner_results:
    print(f"Entity: {entity['word']}, Type: {entity['entity_group']}, Confidence: {entity['score']:.2f}")

Entity: Apple, Type: ORG, Confidence: 1.00
Entity: U, Type: LOC, Confidence: 1.00
Entity: K, Type: LOC, Confidence: 1.00
Entity: Elon Musk, Type: PER, Confidence: 1.00
Entity: SpaceX, Type: ORG, Confidence: 1.00
Entity: San Francisco, Type: LOC, Confidence: 1.00
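
In the output above, "U.K." is split into two separate LOC entities, "U" and "K". One possible post-processing step is to merge neighbouring entities of the same type when only a period separates them, using the `start` and `end` character offsets that the aggregated NER output carries. This is a sketch, not part of the pipeline: `merge_adjacent_entities` and its `joiners` parameter are hypothetical, and the scores in the example are made up.

```python
def merge_adjacent_entities(text, entities, joiners=(".",)):
    """Merge same-type entities whose gap in `text` is exactly one of `joiners`."""
    merged = []
    for entity in sorted(entities, key=lambda e: e["start"]):
        previous = merged[-1] if merged else None
        if (
            previous is not None
            and previous["entity_group"] == entity["entity_group"]
            and text[previous["end"]:entity["start"]] in joiners
        ):
            # Extend the previous entity and re-read its surface form from the text
            previous["end"] = entity["end"]
            previous["word"] = text[previous["start"]:previous["end"]]
            # Keep the lower score as a conservative confidence for the merged span
            previous["score"] = min(previous["score"], entity["score"])
        else:
            merged.append(dict(entity))
    return merged


text = "Apple is looking at buying U.K. startup for $1 billion."
u_start = text.index("U.K.")
# Illustrative entities shaped like the pipeline output (scores are invented)
entities = [
    {"entity_group": "ORG", "word": "Apple", "score": 0.99, "start": 0, "end": 5},
    {"entity_group": "LOC", "word": "U", "score": 0.99, "start": u_start, "end": u_start + 1},
    {"entity_group": "LOC", "word": "K", "score": 0.98, "start": u_start + 2, "end": u_start + 3},
]
merged = merge_adjacent_entities(text, entities)
print([e["word"] for e in merged])  # ['Apple', 'U.K']
```

The trailing period is not part of either original span, so the merged entity reads "U.K" rather than "U.K.".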


Summarization


In [17]:
from transformers import pipeline

# Load the summarization pipeline
summarizer = pipeline('summarization')

# Sample text to summarize
text = """
Artificial Intelligence (AI) is transforming various industries, from healthcare to transportation.
By leveraging machine learning algorithms and vast amounts of data, AI systems can perform tasks that
traditionally require human intelligence. These include recognizing speech, making decisions, and even
completing complex creative tasks. As AI continues to evolve, it holds the promise of unlocking new possibilities,
though concerns about ethics and job displacement are often discussed in parallel.
"""

# Generate a short summary. max_length and min_length are counted in tokens,
# so a tight max_length can cut the summary off mid-sentence.
summary = summarizer(text, max_length=20, min_length=10, do_sample=False)

# Display the summary
print("Summary:", summary[0]['summary_text'])


No model was supplied, defaulted to sshleifer/distilbart-cnn-12-6 and revision a4f8f3e (https://huggingface.co/sshleifer/distilbart-cnn-12-6).
Using a pipeline without specifying a model name and revision in production is not recommended.
Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.


Summary:  Artificial Intelligence (AI) is transforming various industries, from healthcare to transportation . AI systems
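
The summary above stops mid-sentence ("AI systems") because generation is capped at 20 tokens. Raising `max_length` is the direct fix. As an alternative, the sketch below trims a trailing fragment after the last sentence-ending punctuation mark. The helper `trim_to_complete_sentence` is hypothetical, and it is deliberately naive: it treats every `.`, `!`, or `?` as a sentence boundary, so abbreviations would confuse it.

```python
import re


def trim_to_complete_sentence(summary):
    """Drop any trailing fragment after the last '.', '!' or '?'."""
    summary = summary.strip()
    matches = list(re.finditer(r"[.!?]", summary))
    if not matches:
        return summary  # no sentence boundary found; return the text unchanged
    trimmed = summary[: matches[-1].end()]
    # The model output has a space before the period; remove it
    return re.sub(r"\s+([.!?])", r"\1", trimmed)


raw = (" Artificial Intelligence (AI) is transforming various industries, "
       "from healthcare to transportation . AI systems")
print(trim_to_complete_sentence(raw))
```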


Question Answering

In [18]:
from transformers import pipeline

# Load the question-answering pipeline
qa_pipeline = pipeline('question-answering')

# Define the context (text from which the answer will be extracted)
context = """
The Eiffel Tower is located in Paris, France. It was constructed in 1887–1889 as the entrance arch to the 1889 World's Fair.
It is one of the most iconic structures in the world, attracting millions of tourists annually. The tower was designed
by engineer Gustave Eiffel and stands at a height of 330 meters (1,083 feet). Initially criticized by some artists and
intellectuals for its design, it has since become a symbol of French engineering and culture.
"""

# Define the question
question = "Who designed the Eiffel Tower?"

# Get the answer
answer = qa_pipeline(question=question, context=context)

# Display the result
print(f"Question: {question}")
print(f"Answer: {answer['answer']}")


No model was supplied, defaulted to distilbert/distilbert-base-cased-distilled-squad and revision 626af31 (https://huggingface.co/distilbert/distilbert-base-cased-distilled-squad).
Using a pipeline without specifying a model name and revision in production is not recommended.


Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.


Question: Who designed the Eiffel Tower?
Answer: Gustave Eiffel
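
Besides `answer`, the question-answering result also carries a `score` and the `start` and `end` character offsets of the answer within the context. The sketch below uses them to show the confidence and the surrounding text. The helper `describe_answer`, the 30-character window, and the score in the example are assumptions made for illustration.

```python
def describe_answer(answer, context, window=30):
    """Format the answer with its score and a snippet of surrounding context."""
    start, end = answer["start"], answer["end"]
    snippet = context[max(0, start - window): end + window]
    snippet = " ".join(snippet.split())  # collapse newlines and repeated spaces
    return f"{answer['answer']} (score {answer['score']:.2f}): ...{snippet}..."


example_context = (
    "The tower was designed by engineer Gustave Eiffel "
    "and stands at a height of 330 meters (1,083 feet)."
)
answer_start = example_context.index("Gustave Eiffel")
# Hypothetical result in the same shape as the pipeline output (score is invented)
example_answer = {
    "answer": "Gustave Eiffel",
    "score": 0.97,
    "start": answer_start,
    "end": answer_start + len("Gustave Eiffel"),
}
description = describe_answer(example_answer, example_context)
print(description)
```

A low score alongside a plausible-looking answer is a hint to check the snippet before trusting the extraction.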
