## <U>Pipelines</U>

### <u>Sentiment Analysis</u>

In [1]:
from transformers import pipeline

sentiment_classifier_pipeline = pipeline("sentiment-analysis")
sentiment_classifier_pipeline("The dhurandhar movie was really good, a bit gory and having blood shed but great nevertheless.")

No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision 714eb0f (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.






Device set to use cpu


[{'label': 'POSITIVE', 'score': 0.9997439980506897}]
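
The log above warns against relying on the default model in production. A minimal sketch of pinning the model follows, using the model id and revision that the log itself reports. The helper name `build_sentiment_pipeline` and the two constants are illustrative, and the sketch assumes the short revision hash printed in the log is accepted by the `revision` argument.

```python
# Model id and revision copied from the log output above.
SENTIMENT_MODEL = "distilbert/distilbert-base-uncased-finetuned-sst-2-english"
SENTIMENT_REVISION = "714eb0f"


def build_sentiment_pipeline():
    """Create the sentiment pipeline with an explicit model and revision.

    The import is local so that defining this helper does not load
    transformers or trigger a download (hypothetical helper, not part
    of the original notebook).
    """
    from transformers import pipeline

    return pipeline(
        "sentiment-analysis",
        model=SENTIMENT_MODEL,
        revision=SENTIMENT_REVISION,
    )
```

Calling `build_sentiment_pipeline()` once and reusing the returned object avoids reloading the model for every input.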

### <u>Zero-shot text classification</u>

In [3]:
zero_shot_txcl_pipeline = pipeline("zero-shot-classification")
zero_shot_txcl_pipeline("The ball is hit with a bat. Feet are only used to run between the wickets.\
            Batter is not allowed to hit to ball with his feet unlike football.", 
                           candidate_labels=["cricket", "football", "hockey"],)

No model was supplied, defaulted to facebook/bart-large-mnli and revision d7645e1 (https://huggingface.co/facebook/bart-large-mnli).
Using a pipeline without specifying a model name and revision in production is not recommended.
Device set to use cpu


{'sequence': 'The ball is hit with a bat. Feet are only used to run between the wickets.            Batter is not allowed to hit to ball with his feet unlike football.',
 'labels': ['football', 'cricket', 'hockey'],
 'scores': [0.6336216330528259, 0.2596712112426758, 0.10670717805624008]}
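
The result lists the labels sorted by descending score, and the scores over the candidate labels sum to roughly 1. A small sketch of reading the top label with a confidence cutoff is shown below. The `top_label` helper and the `min_score` threshold are illustrative, and the dictionary is copied from the output above with the `sequence` field omitted.

```python
# Labels and scores copied from the output above ("sequence" omitted).
result = {
    "labels": ["football", "cricket", "hockey"],
    "scores": [0.6336216330528259, 0.2596712112426758, 0.10670717805624008],
}


def top_label(result, min_score=0.5):
    """Return (label, score) for the best label, or None if it scores below min_score."""
    label, score = max(zip(result["labels"], result["scores"]), key=lambda pair: pair[1])
    return (label, score) if score >= min_score else None


print(top_label(result))                 # ('football', 0.6336216330528259)
print(top_label(result, min_score=0.9))  # None
```

Here the model ranks "football" first even though the text describes cricket, so a cutoff like this can help flag low-confidence predictions for review.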

### <u>Generating text</u>

In [4]:
text_gen_pipeline = pipeline("text-generation")
text_gen_pipeline("Usually after my works ends I")

No model was supplied, defaulted to openai-community/gpt2 and revision 607a30d (https://huggingface.co/openai-community/gpt2).
Using a pipeline without specifying a model name and revision in production is not recommended.



Device set to use cpu
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': 'Usually after my works ends I think about it, what does it mean to be a writer? Do I feel like I\'m writing more or less the same, to be a good writer? Do I have some kind of inner voice in my head that I can manipulate as I see fit?\n\nIt\'s almost like a kind of a question.\n\nSo you\'re in the process of trying to create something, a story that\'s really different from anything that\'s already out there and that really sets the tone for the rest of your life.\n\n"That\'s what I\'m trying to do. That\'s the big question. That\'s how you create something."\n\nCan you explain how you came up with your writing style and what drew you to writing?\n\nI started writing when I was 10 or 11. I just started to write because I always wanted to be a writer and I wanted to be free. I always wanted to be free of my writing and I didn\'t want to be a musician or an artist. I was always working on my writing. I started to write when I was around 11 or 12. Then I started to writ

### <u>Generating text with a specific model</u>

In [5]:
generate_text_using_distilgpt2_pipeline = pipeline("text-generation", model="distilgpt2")
generate_text_using_distilgpt2_pipeline("Usually after my work ends I")


Device set to use cpu
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': "Usually after my work ends I have to set up a few days early to prepare for this week's show.\n\nI set up the night before, with a small set of set sets starting at 8pm, and a night before.\nI set up the night before, with a small set of sets starting at 8pm, and a night before.\nI set up the night before, with a small set of sets starting at 8pm, and a night before.\nThe show is just a few weeks away from its first run in the show, and I have plans to cover it with a couple of days later.\nI have plans to cover it with a couple of days later.\nThe show is just a few weeks away from its first run in the show, and I have plans to cover it with a couple of days later.\nThe show is just a few weeks away from its first run in the show, and I have plans to cover it with a couple of days later.\nI have plans to cover it with a couple of days later.\nThe show is just a few weeks away from its first run in the show, and I have plans to cover it with a couple of days later.
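
The distilgpt2 output above repeats several sentences. One possible way to shorten the output and discourage repetition is to pass generation parameters through the pipeline call. The sketch below is an assumption-laden example: the `generate` helper is hypothetical, and the parameter values are illustrative rather than tuned.

```python
# Generation settings; the specific values are illustrative, not tuned.
GENERATION_KWARGS = {
    "max_new_tokens": 40,       # cap on the number of newly generated tokens
    "do_sample": True,          # sample from the distribution rather than always taking the top token
    "top_p": 0.9,               # nucleus sampling threshold
    "no_repeat_ngram_size": 3,  # forbid repeating any 3-token sequence
}


def generate(prompt, model_name="distilgpt2", seed=0):
    """Generate a continuation for `prompt` (hypothetical helper).

    The import is local so that defining this helper does not load
    transformers or trigger a download.
    """
    from transformers import pipeline, set_seed

    set_seed(seed)  # fix the random seed so sampling is repeatable
    generator = pipeline("text-generation", model=model_name)
    return generator(prompt, **GENERATION_KWARGS)[0]["generated_text"]
```

A call such as `generate("Usually after my work ends I")` would return a single string; the exact text depends on the model version and seed.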

### <u>Fill the masked token</u>

In [6]:
fill_in_the_blank_pipeline = pipeline("fill-mask")
fill_in_the_blank_pipeline("I go to the gym and lift <mask>.")

No model was supplied, defaulted to distilbert/distilroberta-base and revision fb53ab8 (https://huggingface.co/distilbert/distilroberta-base).
Using a pipeline without specifying a model name and revision in production is not recommended.



Some weights of the model checkpoint at distilbert/distilroberta-base were not used when initializing RobertaForMaskedLM: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
- This IS expected if you are initializing RobertaForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).



Device set to use cpu


[{'score': 0.9833672046661377,
  'token': 23341,
  'token_str': ' weights',
  'sequence': 'I go to the gym and lift weights.'},
 {'score': 0.002324433298781514,
  'token': 2408,
  'token_str': ' weight',
  'sequence': 'I go to the gym and lift weight.'},
 {'score': 0.001323976437561214,
  'token': 2185,
  'token_str': ' myself',
  'sequence': 'I go to the gym and lift myself.'},
 {'score': 0.00048466766020283103,
  'token': 62,
  'token_str': ' up',
  'sequence': 'I go to the gym and lift up.'},
 {'score': 0.0004324756737332791,
  'token': 8698,
  'token_str': ' muscle',
  'sequence': 'I go to the gym and lift muscle.'}]
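
Each prediction carries a `score` and a `token_str`; the leading space in `token_str` is part of the token in this tokenizer's vocabulary. A short sketch of keeping only confident candidates follows. The `confident_tokens` helper and its `min_score` cutoff are illustrative, and the list is a subset of the output above.

```python
# First three predictions copied from the output above (fields trimmed).
predictions = [
    {"score": 0.9833672046661377, "token_str": " weights"},
    {"score": 0.002324433298781514, "token_str": " weight"},
    {"score": 0.001323976437561214, "token_str": " myself"},
]


def confident_tokens(predictions, min_score=0.01):
    """Return the stripped token strings whose score is at least min_score."""
    return [p["token_str"].strip() for p in predictions if p["score"] >= min_score]


print(confident_tokens(predictions))  # ['weights']
```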

### <u>Named entity recognition</u>

In [8]:
named_entity_recog_pipeline = pipeline("ner")
# aggregation_strategy="simple" replaces the deprecated grouped_entities=True
named_entity_recog_pipeline("Robert works at Google in Mountain View, California.", aggregation_strategy="simple")

No model was supplied, defaulted to dbmdz/bert-large-cased-finetuned-conll03-english and revision 4c53496 (https://huggingface.co/dbmdz/bert-large-cased-finetuned-conll03-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
Some weights of the model checkpoint at dbmdz/bert-large-cased-finetuned-conll03-english were not used when initializing BertForTokenClassification: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Device set to use cpu


[{'entity_group': 'PER',
  'score': np.float32(0.99709904),
  'word': 'Robert',
  'start': 0,
  'end': 6},
 {'entity_group': 'ORG',
  'score': np.float32(0.99873585),
  'word': 'Google',
  'start': 16,
  'end': 22},
 {'entity_group': 'LOC',
  'score': np.float32(0.9961477),
  'word': 'Mountain View',
  'start': 26,
  'end': 39},
 {'entity_group': 'LOC',
  'score': np.float32(0.998719),
  'word': 'California',
  'start': 41,
  'end': 51}]
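
The `start` and `end` fields are character offsets into the input string, so each entity can be recovered by slicing. A minimal sketch, using values copied from the output above (scores omitted) and an illustrative grouping by entity type:

```python
text = "Robert works at Google in Mountain View, California."

# Entities copied from the output above (scores omitted).
entities = [
    {"entity_group": "PER", "word": "Robert", "start": 0, "end": 6},
    {"entity_group": "ORG", "word": "Google", "start": 16, "end": 22},
    {"entity_group": "LOC", "word": "Mountain View", "start": 26, "end": 39},
    {"entity_group": "LOC", "word": "California", "start": 41, "end": 51},
]

# Group the entity text by type, slicing the original string with the offsets.
by_type = {}
for entity in entities:
    span = text[entity["start"]:entity["end"]]
    by_type.setdefault(entity["entity_group"], []).append(span)

print(by_type)
# {'PER': ['Robert'], 'ORG': ['Google'], 'LOC': ['Mountain View', 'California']}
```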