In [1]:
import pandas as pd
from transformers import pipeline



In [2]:
# src: https://www.msn.com/en-us/news/technology/us-researchers-develop-unhackable-computer-chip-that-works-on-light/ar-BB1in4AY?ocid=socialshare&pc=ASTS&cvid=1256c5647de9453a9666fe00fb8faa13&ei=47
text = """Researchers at the University of Pennsylvania have developed a new computer chip that uses light instead of electricity. This could improve the training of artificial intelligence (AI) models by improving the speed of data transfer and, more efficiently, reducing the amount of electricity consumed.
Humanity is building the exascale supercomputers today that can carry out a quintillion computations per second. While the scale of the computation may have increased, computing technology is still working on the principles that were first used in the 1960s.
Researchers have been working on developing computing systems based on quantum mechanics, too, but these computers are at least a few years from becoming widely available if not more. The recent explosion of AI models in technology has resulted in a demand for computers that can process large sets of information. The inefficient computing systems, though, result in high consumption of energy.
"""

In [3]:
# Text Classification
classifier = pipeline("text-classification")

print(classifier(text))

No model was supplied, defaulted to distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.


[{'label': 'NEGATIVE', 'score': 0.8756259083747864}]
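The log above recommends naming the model and revision explicitly. Here is a minimal sketch of one way to do that, using only the model names and revision hashes printed in this notebook's own logs. `PINNED_MODELS` and `load_pinned` are hypothetical names introduced for illustration, and the sketch assumes those revisions are still available on the Hub.

```python
# Model names and revisions copied from the "No model was supplied" log lines
# printed by the cells in this notebook. PINNED_MODELS and load_pinned are
# illustrative names, not part of the transformers library.
PINNED_MODELS = {
    "text-classification": ("distilbert-base-uncased-finetuned-sst-2-english", "af0f99b"),
    "ner": ("dbmdz/bert-large-cased-finetuned-conll03-english", "f2482bf"),
    "question-answering": ("distilbert-base-cased-distilled-squad", "626af31"),
    "summarization": ("sshleifer/distilbart-cnn-12-6", "a4f8f3e"),
    "text-generation": ("gpt2", "6c0e608"),
}


def load_pinned(task):
    """Build a pipeline for `task` with an explicit model and revision."""
    # Imported lazily so the table can be inspected without loading transformers.
    from transformers import pipeline

    model, revision = PINNED_MODELS[task]
    return pipeline(task, model=model, revision=revision)
```

With this in place, `classifier = load_pinned("text-classification")` would stand in for the unpinned call. Note that the default checkpoint is a sentiment model fine-tuned on SST-2, so the `NEGATIVE` label is a sentiment judgment rather than a topic.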


In [4]:
# Named Entity Recognition
ner = pipeline("ner")
print(ner(text))

No model was supplied, defaulted to dbmdz/bert-large-cased-finetuned-conll03-english and revision f2482bf (https://huggingface.co/dbmdz/bert-large-cased-finetuned-conll03-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
Some weights of the model checkpoint at dbmdz/bert-large-cased-finetuned-conll03-english were not used when initializing BertForTokenClassification: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


[{'entity': 'I-ORG', 'score': 0.9950655, 'index': 4, 'word': 'University', 'start': 19, 'end': 29}, {'entity': 'I-ORG', 'score': 0.9960758, 'index': 5, 'word': 'of', 'start': 30, 'end': 32}, {'entity': 'I-ORG', 'score': 0.9958604, 'index': 6, 'word': 'Pennsylvania', 'start': 33, 'end': 45}, {'entity': 'I-MISC', 'score': 0.9550863, 'index': 29, 'word': 'AI', 'start': 181, 'end': 183}, {'entity': 'I-MISC', 'score': 0.97647524, 'index': 142, 'word': 'AI', 'start': 767, 'end': 769}]
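The output above is token-level: "University", "of", and "Pennsylvania" come back as three separate `I-ORG` entries. The pipeline also accepts an `aggregation_strategy` argument that groups tokens inside the library. The sketch below instead does the grouping by hand on the printed result, to show what that step amounts to. `group_entities` is a hypothetical helper, and taking the minimum score per group is an arbitrary, conservative choice.

```python
# Token-level result copied from the NER output above.
ner_tokens = [
    {"entity": "I-ORG", "score": 0.9950655, "index": 4, "word": "University", "start": 19, "end": 29},
    {"entity": "I-ORG", "score": 0.9960758, "index": 5, "word": "of", "start": 30, "end": 32},
    {"entity": "I-ORG", "score": 0.9958604, "index": 6, "word": "Pennsylvania", "start": 33, "end": 45},
    {"entity": "I-MISC", "score": 0.9550863, "index": 29, "word": "AI", "start": 181, "end": 183},
    {"entity": "I-MISC", "score": 0.97647524, "index": 142, "word": "AI", "start": 767, "end": 769},
]


def group_entities(tokens):
    """Merge adjacent tokens (consecutive `index`) that share an entity type."""
    groups = []
    for tok in tokens:
        label = tok["entity"].split("-", 1)[-1]  # "I-ORG" -> "ORG"
        last = groups[-1] if groups else None
        if last and last["entity_group"] == label and tok["index"] == last["_last_index"] + 1:
            # Word pieces start with "##" and attach without a space.
            if tok["word"].startswith("##"):
                last["word"] += tok["word"][2:]
            else:
                last["word"] += " " + tok["word"]
            last["end"] = tok["end"]
            last["score"] = min(last["score"], tok["score"])
            last["_last_index"] = tok["index"]
        else:
            groups.append({
                "entity_group": label,
                "word": tok["word"],
                "start": tok["start"],
                "end": tok["end"],
                "score": tok["score"],
                "_last_index": tok["index"],
            })
    # Drop the bookkeeping key before returning.
    return [{k: v for k, v in g.items() if k != "_last_index"} for g in groups]


for span in group_entities(ner_tokens):
    print(span["entity_group"], span["word"], span["start"], span["end"])
```

The two "AI" mentions stay separate because their token indices are not adjacent.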


In [5]:
# Question Answering
question_answerer = pipeline("question-answering")
question = "Why the current computing systems are inefficient?"
print(question_answerer(question=question, context=text))

No model was supplied, defaulted to distilbert-base-cased-distilled-squad and revision 626af31 (https://huggingface.co/distilbert-base-cased-distilled-squad).
Using a pipeline without specifying a model name and revision in production is not recommended.


{'score': 0.7762266397476196, 'start': 927, 'end': 953, 'answer': 'high consumption of energy'}
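The `start` and `end` fields in the result above are character offsets into the context, so `text[927:953]` should reproduce the answer string. The sketch below uses that to show an answer with some surrounding context. `show_answer_in_context` is a hypothetical helper, and the short context is just the last sentence of the article, with the offsets computed rather than hard-coded.

```python
def show_answer_in_context(context, result, window=30):
    """Return the answer in brackets with `window` characters on each side."""
    start, end = result["start"], result["end"]
    left = context[max(0, start - window):start]
    right = context[end:end + window]
    return f"{left}[{context[start:end]}]{right}"


# A shortened context for illustration; offsets are derived from it directly.
context = "The inefficient computing systems, though, result in high consumption of energy."
answer = "high consumption of energy"
start = context.find(answer)
result = {"start": start, "end": start + len(answer), "answer": answer}

# The offsets slice the context back to the answer text.
assert context[result["start"]:result["end"]] == result["answer"]
print(show_answer_in_context(context, result, window=20))
```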


In [6]:
# Summarization
summarizer = pipeline("summarization")
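# Note: max_length=50 is below this model's default min_length of 56,
# which triggers the min_length/max_length warning in the output below.
print(summarizer(text, max_length=50, clean_up_tokenization_spaces=True))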

No model was supplied, defaulted to sshleifer/distilbart-cnn-12-6 and revision a4f8f3e (https://huggingface.co/sshleifer/distilbart-cnn-12-6).
Using a pipeline without specifying a model name and revision in production is not recommended.
Your min_length=56 must be inferior than your max_length=50.


[{'summary_text': ' Researchers at the University of Pennsylvania have developed a new computer chip that uses light instead of electricity. This could improve the training of artificial intelligence (AI) models by improving the speed of data transfer and, more efficiently, reducing the amount of'}]
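The warning above appears because the default `min_length` (56) is larger than the requested `max_length` (50), and the summary stops mid-sentence at the length cap. One way to avoid the conflict is to pass both bounds explicitly. This is a minimal sketch: `summary_length_kwargs` is a hypothetical helper, and using half of `max_length` as the lower bound is an arbitrary choice.

```python
def summary_length_kwargs(max_length):
    """Return consistent min_length/max_length generation arguments."""
    if max_length < 2:
        raise ValueError("max_length must be at least 2")
    # Keep min_length strictly below max_length; half is an arbitrary choice.
    min_length = max(1, max_length // 2)
    return {"min_length": min_length, "max_length": max_length}


kwargs = summary_length_kwargs(50)
print(kwargs)

# With the summarizer defined earlier, the call would then look like:
# summarizer(text, clean_up_tokenization_spaces=True, **kwargs)
```

A larger `max_length` would also give the model room to finish its last sentence.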


In [7]:
# Translation
translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-ro")
outputs = translator(text, min_length=60, clean_up_tokenization_spaces=True)
print(outputs)



[{'translation_text': 'Cercetătorii de la Universitatea din Pennsylvania au dezvoltat un nou cip de calculator care utilizează lumina în loc de electricitate. Acest lucru ar putea îmbunătăți formarea modelelor de inteligență artificială (AI) prin îmbunătățirea vitezei transferului de date și, mai eficient, reducerea cantității de energie electrică consumată. Umanitatea construiește supercomputer-urile exascale astăzi, care pot efectua un cvintilioane de calcule pe secundă. În timp ce scara de calcul poate să fi crescut, tehnologia de calcul este încă de lucru pe principiile care au fost utilizate pentru prima dată în anii 1960. Cercetătorii au lucrat la dezvoltarea sistemelor de calcul bazate pe mecanica cuantică, de asemenea, dar aceste calculatoare sunt de cel puțin câțiva ani de la a deveni disponibile pe scară largă, dacă nu mai mult. Explozia recentă a modelelor AI în tehnologie a dus la o cerere de calculatoare care pot procesa seturi mari de informații. Sistemele de calcul inefi
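The printed translation above stops mid-word. Whether that comes from a generation length limit or from how the output was captured is not clear from the notebook. If long inputs do turn out to hit a length limit, one option is to translate the text in smaller pieces. The sketch below splits text into sentences and packs them into chunks. `split_sentences`, `chunk_sentences`, and the `max_chars` budget are hypothetical, and the regex split is a rough heuristic rather than a real sentence tokenizer.

```python
import re


def split_sentences(text):
    """Split on whitespace that follows '.', '!' or '?' (a rough heuristic)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def chunk_sentences(sentences, max_chars=400):
    """Pack whole sentences into chunks of at most max_chars where possible."""
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + 1 + len(sentence) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}" if current else sentence
    if current:
        chunks.append(current)
    return chunks


sample = "Researchers have built a chip. It uses light. It may save energy."
chunks = chunk_sentences(split_sentences(sample), max_chars=40)
print(chunks)

# With the translator defined earlier, each chunk could be translated in turn:
# translated = [translator(c)[0]["translation_text"] for c in chunks]
```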

In [9]:
# Text Generation
generator = pipeline("text-generation")
response = "Yes, the new computer chip is a great innovation."
prompt = text + "\n\n You are right :\n" + response
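# Note: max_length counts the prompt tokens as well as the generated ones,
# so a long prompt leaves little or no room for new text; max_new_tokens
# would bound only the continuation.
outputs = generator(prompt, max_length=200)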
print(outputs[0]['generated_text'])

No model was supplied, defaulted to gpt2 and revision 6c0e608 (https://huggingface.co/gpt2).
Using a pipeline without specifying a model name and revision in production is not recommended.
Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Researchers at the University of Pennsylvania have developed a new computer chip that uses light instead of electricity. This could improve the training of artificial intelligence (AI) models by improving the speed of data transfer and, more efficiently, reducing the amount of electricity consumed.
Humanity is building the exascale supercomputers today that can carry out a quintillion computations per second. While the scale of the computation may have increased, computing technology is still working on the principles that were first used in the 1960s.
Researchers have been working on developing computing systems based on quantum mechanics, too, but these computers are at least a few years from becoming widely available if not more. The recent explosion of AI models in technology has resulted in a demand for computers that can process large sets of information. The inefficient computing systems, though, result in high consumption of energy.


 You are right :
Yes, the new computer chip