# Hugging Face Pipeline

Running a Hugging Face pipeline locally on your PC lets you apply state-of-the-art pre-trained models to tasks such as text classification, question answering, and text generation, without deep knowledge of natural language processing or machine learning model architectures. This notebook walks through sentiment analysis, text generation, question answering, and named entity recognition, one pipeline at a time.

The first time you run a pipeline for a specific model, the transformers library will download the necessary model and tokenizer files, which might take some time depending on your internet connection. Subsequent uses of the same model will be much faster since the model will be cached locally.
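To see how much disk space the cached models occupy, a small helper along these lines may be useful. It is a sketch that assumes the default Hugging Face Hub cache layout (`~/.cache/huggingface/hub`, overridable through the `HF_HOME` or `HF_HUB_CACHE` environment variables); the function names `hub_cache_dir` and `dir_size_mb` are hypothetical helpers, not part of any library.

```python
import os
from pathlib import Path


def hub_cache_dir() -> Path:
    """Resolve the Hub cache directory (assumes the default layout)."""
    if "HF_HUB_CACHE" in os.environ:
        return Path(os.environ["HF_HUB_CACHE"])
    hf_home = os.environ.get("HF_HOME", Path.home() / ".cache" / "huggingface")
    return Path(hf_home) / "hub"


def dir_size_mb(path: Path) -> float:
    """Sum the sizes of all regular files below `path`, in megabytes."""
    if not path.exists():
        return 0.0
    # Symlinks are skipped so that linked files are not counted twice.
    total = sum(
        f.stat().st_size
        for f in path.rglob("*")
        if f.is_file() and not f.is_symlink()
    )
    return total / 1e6


cache = hub_cache_dir()
print(f"{cache}: {dir_size_mb(cache):.1f} MB")
```

If the directory does not exist yet (no model has been downloaded), the helper simply reports 0.0 MB.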

**Considerations:**

- **Model Download Size:** Some models are large and need significant disk space and bandwidth. The default models downloaded in this notebook range from about 260 MB to 1.33 GB.
- **Computational Resources:** Running large models, especially for text generation, can be resource-intensive. Ensure your PC has adequate memory and, if available, a compatible GPU to accelerate computation.
- **Environment Management:** Consider using a virtual environment (e.g., via venv or conda) to manage dependencies and avoid conflicts between projects.

By following these steps, you'll be able to run Hugging Face pipelines locally on your PC, giving you access to a wide range of pre-trained models for natural language processing tasks.
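As an illustration of the GPU point, the sketch below picks a device index that can be passed to `pipeline(..., device=...)`, where `-1` means CPU and `0` means the first GPU. It assumes PyTorch is the installed backend and falls back to CPU when PyTorch is missing; `pick_device` is a hypothetical helper name.

```python
def pick_device() -> int:
    """Return 0 (first CUDA GPU) if PyTorch can see one, otherwise -1 (CPU)."""
    try:
        import torch  # optional dependency; assumed backend for this sketch
    except ImportError:
        return -1
    return 0 if torch.cuda.is_available() else -1


device = pick_device()
print("device index:", device)

# Possible usage (downloads a model on first run):
# from transformers import pipeline
# classifier = pipeline("sentiment-analysis", device=device)
```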



In [None]:
# !pip install transformers

In [16]:
from transformers import pipeline

In [None]:
# %pip install tf-keras

In [5]:
# load a pipeline for sentiment analysis

classifier = pipeline("sentiment-analysis")

No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.





model.safetensors:   0%|          | 0.00/268M [00:00<?, ?B/s]




All PyTorch model weights were used when initializing TFDistilBertForSequenceClassification.

All the weights of TFDistilBertForSequenceClassification were initialized from the PyTorch model.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFDistilBertForSequenceClassification for predictions without further training.


tokenizer_config.json:   0%|          | 0.00/48.0 [00:00<?, ?B/s]

vocab.txt:   0%|          | 0.00/232k [00:00<?, ?B/s]
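The log above warns against relying on the default model. One way to address it, sketched here, is to pass the model name and revision explicitly. The values are the ones reported in that log; the constant and function names are hypothetical.

```python
# Model and revision as reported in the pipeline's log message.
SENTIMENT_MODEL = "distilbert/distilbert-base-uncased-finetuned-sst-2-english"
SENTIMENT_REVISION = "af0f99b"


def load_sentiment_classifier():
    """Build the sentiment pipeline with an explicit model and revision."""
    # Imported lazily so the constants can be reused without loading the library.
    from transformers import pipeline

    return pipeline(
        "sentiment-analysis",
        model=SENTIMENT_MODEL,
        revision=SENTIMENT_REVISION,
    )


# Possible usage (downloads the model on first run):
# classifier = load_sentiment_classifier()
```

Pinning both values means a later change to the library's default model should not silently change the results.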

In [6]:
# Use the pipeline to classify the sentiment of a sentence
result = classifier("I love using natural language processing tools!")
print(result)

[{'label': 'POSITIVE', 'score': 0.9996317625045776}]


In [10]:
# Use the pipeline to classify the sentiment of a sentence
result = classifier("I did full efort to teach but students have average performance")
print(result)

[{'label': 'NEGATIVE', 'score': 0.6452227234840393}]
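The two results above differ sharply in confidence: about 0.9996 for the first sentence and about 0.645 for the second. A simple way to act on that, sketched below, is to treat low-scoring predictions as uncertain. The threshold of 0.75 and the `UNCERTAIN` label are arbitrary assumptions for illustration, not something the pipeline provides.

```python
def label_with_threshold(prediction: dict, threshold: float = 0.75) -> str:
    """Keep the predicted label only if its score reaches the threshold."""
    if prediction["score"] >= threshold:
        return prediction["label"]
    return "UNCERTAIN"


# The two predictions printed by the cells above.
predictions = [
    {"label": "POSITIVE", "score": 0.9996317625045776},
    {"label": "NEGATIVE", "score": 0.6452227234840393},
]

for prediction in predictions:
    print(label_with_threshold(prediction))
# prints:
# POSITIVE
# UNCERTAIN
```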


# Explore Text Generation

In [27]:
# pip install torch torchvision torchaudio

generator = pipeline('text-generation')
result = generator("In this course, we will teach you how to")
print(result)

No model was supplied, defaulted to openai-community/gpt2 and revision 6c0e608 (https://huggingface.co/openai-community/gpt2).
Using a pipeline without specifying a model name and revision in production is not recommended.


config.json:   0%|          | 0.00/665 [00:00<?, ?B/s]

To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to see activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development


model.safetensors:   0%|          | 0.00/548M [00:00<?, ?B/s]

All PyTorch model weights were used when initializing TFGPT2LMHeadModel.

All the weights of TFGPT2LMHeadModel were initialized from the PyTorch model.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFGPT2LMHeadModel for predictions without further training.


tokenizer_config.json:   0%|          | 0.00/26.0 [00:00<?, ?B/s]

vocab.json:   0%|          | 0.00/1.04M [00:00<?, ?B/s]

merges.txt:   0%|          | 0.00/456k [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/1.36M [00:00<?, ?B/s]

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': 'In this course, we will teach you how to create and maintain a secure network of servers of the sort you will probably see in an office or workroom, using our software or other tools.\n\nThe purpose of this course is to make this'}]
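The `generated_text` field contains the prompt followed by the model's continuation. The sketch below separates the two with plain string handling; `continuation` is a hypothetical helper, and the sample result is a shortened copy of the output above. The text-generation pipeline also accepts `return_full_text=False`, which should have a similar effect at generation time. Because generation usually involves sampling, rerunning the cell may produce different text.

```python
def continuation(prompt: str, generated_text: str) -> str:
    """Return only the text the model added after the prompt."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):]
    return generated_text  # prompt not found verbatim; leave text untouched


prompt = "In this course, we will teach you how to"
# Shortened copy of the pipeline output shown above.
result = [{"generated_text": prompt + " create and maintain a secure network of servers"}]

print(continuation(prompt, result[0]["generated_text"]).strip())
# prints: create and maintain a secure network of servers
```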


In [28]:
# question answering
question_answerer = pipeline('question-answering')
context = "The name of the course is Natural Language Processing"
result = question_answerer(question="What is the name of the course to learn Generative AI?", context=context)
print(result)

No model was supplied, defaulted to distilbert/distilbert-base-cased-distilled-squad and revision 626af31 (https://huggingface.co/distilbert/distilbert-base-cased-distilled-squad).
Using a pipeline without specifying a model name and revision in production is not recommended.


config.json:   0%|          | 0.00/473 [00:00<?, ?B/s]


model.safetensors:   0%|          | 0.00/261M [00:00<?, ?B/s]

All PyTorch model weights were used when initializing TFDistilBertForQuestionAnswering.

All the weights of TFDistilBertForQuestionAnswering were initialized from the PyTorch model.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFDistilBertForQuestionAnswering for predictions without further training.


tokenizer_config.json:   0%|          | 0.00/49.0 [00:00<?, ?B/s]

vocab.txt:   0%|          | 0.00/213k [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/436k [00:00<?, ?B/s]

{'score': 0.9824060797691345, 'start': 26, 'end': 53, 'answer': 'Natural Language Processing'}
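The `start` and `end` values in the result are character offsets into the context string, so the answer can be recovered by slicing. The short check below reuses the context and result from the cell above; the bracket highlighting is only an illustration.

```python
context = "The name of the course is Natural Language Processing"
# Result dictionary copied from the output above.
result = {
    "score": 0.9824060797691345,
    "start": 26,
    "end": 53,
    "answer": "Natural Language Processing",
}

# Slice the context with the reported character offsets.
span = context[result["start"]:result["end"]]
print(span == result["answer"])
# prints: True

# Mark the answer inside the original context.
highlighted = context[:result["start"]] + "[" + span + "]" + context[result["end"]:]
print(highlighted)
# prints: The name of the course is [Natural Language Processing]
```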


In [29]:
# ner (named entity recognition) pipeline
ner = pipeline('ner')


No model was supplied, defaulted to dbmdz/bert-large-cased-finetuned-conll03-english and revision f2482bf (https://huggingface.co/dbmdz/bert-large-cased-finetuned-conll03-english).
Using a pipeline without specifying a model name and revision in production is not recommended.


config.json:   0%|          | 0.00/998 [00:00<?, ?B/s]


model.safetensors:   0%|          | 0.00/1.33G [00:00<?, ?B/s]

All PyTorch model weights were used when initializing TFBertForTokenClassification.

All the weights of TFBertForTokenClassification were initialized from the PyTorch model.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFBertForTokenClassification for predictions without further training.


tokenizer_config.json:   0%|          | 0.00/60.0 [00:00<?, ?B/s]

vocab.txt:   0%|          | 0.00/213k [00:00<?, ?B/s]

[{'entity': 'I-PER', 'score': 0.99917185, 'index': 8, 'word': 'Muhammad', 'start': 28, 'end': 36}, {'entity': 'I-PER', 'score': 0.99955934, 'index': 9, 'word': 'A', 'start': 37, 'end': 38}, {'entity': 'I-PER', 'score': 0.9983216, 'index': 10, 'word': '##am', 'start': 38, 'end': 40}, {'entity': 'I-PER', 'score': 0.9992617, 'index': 11, 'word': '##mar', 'start': 40, 'end': 43}, {'entity': 'I-PER', 'score': 0.9995291, 'index': 12, 'word': 'Tu', 'start': 44, 'end': 46}, {'entity': 'I-PER', 'score': 0.9364727, 'index': 13, 'word': '##fa', 'start': 46, 'end': 48}, {'entity': 'I-PER', 'score': 0.98786885, 'index': 14, 'word': '##il', 'start': 48, 'end': 50}]


In [30]:
result = ner("The course is taught by Dr. Muhammad Adeel munir")
print(result)

[{'entity': 'I-PER', 'score': 0.99894696, 'index': 8, 'word': 'Muhammad', 'start': 28, 'end': 36}, {'entity': 'I-PER', 'score': 0.9993374, 'index': 9, 'word': 'Ad', 'start': 37, 'end': 39}, {'entity': 'I-PER', 'score': 0.9985158, 'index': 10, 'word': '##eel', 'start': 39, 'end': 42}, {'entity': 'I-PER', 'score': 0.9798789, 'index': 11, 'word': 'm', 'start': 43, 'end': 44}, {'entity': 'I-PER', 'score': 0.87590235, 'index': 12, 'word': '##uni', 'start': 44, 'end': 47}, {'entity': 'I-PER', 'score': 0.8741402, 'index': 13, 'word': '##r', 'start': 47, 'end': 48}]


In [31]:
# print the result in a more readable format
for entity in result:
    print(f"{entity['entity']} : {entity['word']}")

I-PER : Muhammad
I-PER : Ad
I-PER : ##eel
I-PER : m
I-PER : ##uni
I-PER : ##r
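The output above is split into word pieces (`Ad`, `##eel`, `m`, `##uni`, `##r`) because the model works on subword tokens. The sketch below merges consecutive tokens of the same entity type back into one span using the character offsets in the result. `merge_entities` is a hypothetical helper written for this notebook; it ignores the B-/I- distinction, so two adjacent entities of the same type would be merged into one. The `ner` pipeline also offers an `aggregation_strategy` argument (for example `"simple"`) that performs grouping inside the library.

```python
def merge_entities(text: str, tokens: list) -> list:
    """Group consecutive tokens of the same entity type into text spans."""
    groups = []
    for tok in tokens:
        label = tok["entity"].split("-", 1)[-1]  # "I-PER" -> "PER"
        last = groups[-1] if groups else None
        if last and last["label"] == label and tok["index"] == last["last_index"] + 1:
            # Token directly follows the previous one: extend the current span.
            last["end"] = tok["end"]
            last["last_index"] = tok["index"]
        else:
            groups.append({"label": label, "start": tok["start"],
                           "end": tok["end"], "last_index": tok["index"]})
    return [{"label": g["label"], "text": text[g["start"]:g["end"]]} for g in groups]


text = "The course is taught by Dr. Muhammad Adeel munir"
# Token fields copied from the output above (scores omitted for brevity).
tokens = [
    {"entity": "I-PER", "index": 8, "word": "Muhammad", "start": 28, "end": 36},
    {"entity": "I-PER", "index": 9, "word": "Ad", "start": 37, "end": 39},
    {"entity": "I-PER", "index": 10, "word": "##eel", "start": 39, "end": 42},
    {"entity": "I-PER", "index": 11, "word": "m", "start": 43, "end": 44},
    {"entity": "I-PER", "index": 12, "word": "##uni", "start": 44, "end": 47},
    {"entity": "I-PER", "index": 13, "word": "##r", "start": 47, "end": 48},
]

for entity in merge_entities(text, tokens):
    print(f"{entity['label']} : {entity['text']}")
# prints: PER : Muhammad Adeel munir
```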
