By default, this pipeline selects a pretrained model that has been fine-tuned for sentiment analysis in English. The model is downloaded and cached when you create the classifier object. If you rerun the command, the cached model is used and nothing is downloaded again.

In [1]:
from transformers import pipeline

classifier = pipeline("sentiment-analysis")
classifier("I've been waiting for a HuggingFace course my whole life.")

No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision 714eb0f (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.


config.json:   0%|          | 0.00/629 [00:00<?, ?B/s]

To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
Xet Storage is enabled for this repo, but the 'hf_xet' package is not installed. Falling back to regular HTTP download. For better performance, install the package with: `pip install huggingface_hub[hf_xet]` or `pip install hf_xet`


model.safetensors:   0%|          | 0.00/268M [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/48.0 [00:00<?, ?B/s]

vocab.txt: 0.00B [00:00, ?B/s]

Device set to use cpu


[{'label': 'POSITIVE', 'score': 0.9598049521446228}]
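
The log message above advises against relying on the default checkpoint in production. A minimal sketch of pinning it explicitly follows, using the model name and revision printed in that message. The helper name `build_sentiment_classifier` is hypothetical, and the import sits inside the function only so that defining the helper does not load the library.

```python
# Values copied from the "No model was supplied" message in the output above.
MODEL_ID = "distilbert/distilbert-base-uncased-finetuned-sst-2-english"
REVISION = "714eb0f"


def build_sentiment_classifier():
    """Create the sentiment pipeline with model and revision pinned."""
    # Imported here so that defining the helper does not load the library.
    from transformers import pipeline

    return pipeline("sentiment-analysis", model=MODEL_ID, revision=REVISION)


# Usage (downloads the model on the first call, then reads it from the cache):
# classifier = build_sentiment_classifier()
# classifier("I've been waiting for a HuggingFace course my whole life.")
```

With the checkpoint pinned, a later change to the library's default should not silently change which model runs.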

In [2]:
classifier(
    ["I've been waiting for a HuggingFace course my whole life.", "I hate this so much!"]
)

[{'label': 'POSITIVE', 'score': 0.9598049521446228},
 {'label': 'NEGATIVE', 'score': 0.9994558691978455}]
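
When a list of texts is passed, the results come back in the same order as the inputs. The sketch below pairs each input with its prediction. It reuses the output shown above as literal data, so it runs without loading a model; the helper name `pair_results` is hypothetical.

```python
texts = [
    "I've been waiting for a HuggingFace course my whole life.",
    "I hate this so much!",
]

# Copied from the cell output above, so no model is needed to run this.
results = [
    {"label": "POSITIVE", "score": 0.9598049521446228},
    {"label": "NEGATIVE", "score": 0.9994558691978455},
]


def pair_results(texts, results):
    """Return (text, label, rounded score) tuples, one per input text."""
    return [
        (text, result["label"], round(result["score"], 4))
        for text, result in zip(texts, results)
    ]


for text, label, score in pair_results(texts, results):
    print(f"{label:<8} {score:.4f}  {text}")
```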

## Available pipelines for different modalities

### Text pipelines
* text-generation: Generate text from a prompt
* text-classification: Classify text into predefined categories
* summarization: Create a shorter version of a text while preserving key information
* translation: Translate text from one language to another
* zero-shot-classification: Classify text without prior training on specific labels
* feature-extraction: Extract vector representations of text
### Image pipelines
* image-to-text: Generate text descriptions of images
* image-classification: Identify objects in an image
* object-detection: Locate and identify objects in images
### Audio pipelines
* automatic-speech-recognition: Convert speech to text
* audio-classification: Classify audio into categories
* text-to-speech: Convert text to spoken audio
### Multimodal pipelines
* image-text-to-text: Respond to an image based on a text prompt
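
Each identifier in the lists above is the string passed as the first argument to `pipeline()`. As a rough sketch, the identifiers can be grouped by modality and handed to a small loader. The names `TASKS_BY_MODALITY`, `tasks_for`, and `load_pipeline` are hypothetical, and the grouping simply mirrors the lists above.

```python
# Task identifiers copied from the lists above, grouped by modality.
TASKS_BY_MODALITY = {
    "text": [
        "text-generation",
        "text-classification",
        "summarization",
        "translation",
        "zero-shot-classification",
        "feature-extraction",
    ],
    "image": ["image-to-text", "image-classification", "object-detection"],
    "audio": [
        "automatic-speech-recognition",
        "audio-classification",
        "text-to-speech",
    ],
    "multimodal": ["image-text-to-text"],
}


def tasks_for(modality):
    """Return the task identifiers listed for one modality."""
    return TASKS_BY_MODALITY[modality]


def load_pipeline(task, **kwargs):
    """Create a pipeline for a task identifier (downloads a model if needed)."""
    # Imported here so that defining the helper does not load the library.
    from transformers import pipeline

    return pipeline(task, **kwargs)


print(tasks_for("audio"))
```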

In [3]:
classifier = pipeline("zero-shot-classification")
classifier(
    "This is a course about the Transformers library",
    candidate_labels=["education", "politics", "business"],
)

No model was supplied, defaulted to facebook/bart-large-mnli and revision d7645e1 (https://huggingface.co/facebook/bart-large-mnli).
Using a pipeline without specifying a model name and revision in production is not recommended.


config.json: 0.00B [00:00, ?B/s]

model.safetensors:   0%|          | 0.00/1.63G [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/26.0 [00:00<?, ?B/s]

vocab.json: 0.00B [00:00, ?B/s]

merges.txt: 0.00B [00:00, ?B/s]

tokenizer.json: 0.00B [00:00, ?B/s]

Device set to use cpu


{'sequence': 'This is a course about the Transformers library',
 'labels': ['education', 'business', 'politics'],
 'scores': [0.8445992469787598, 0.11197402328252792, 0.043426740914583206]}
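
The result lists the labels sorted by descending score, and `labels[i]` corresponds to `scores[i]`. The sketch below reads off the top label and checks that the scores in this output sum to roughly 1. It uses the output above as literal data; the helper name `top_label` is hypothetical.

```python
# Copied from the cell output above, so no model is needed to run this.
result = {
    "sequence": "This is a course about the Transformers library",
    "labels": ["education", "business", "politics"],
    "scores": [0.8445992469787598, 0.11197402328252792, 0.043426740914583206],
}


def top_label(result):
    """Return the (label, score) pair with the highest score."""
    return max(zip(result["labels"], result["scores"]), key=lambda pair: pair[1])


label, score = top_label(result)
print(f"{label}: {score:.3f}")
print(f"sum of scores: {sum(result['scores']):.6f}")
```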

## Using any model from the Hub in a pipeline with a custom cache path

In [4]:
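from transformers import pipeline, AutoModelForCausalLM, AutoTokenizer

model_id = "HuggingFaceTB/SmolLM2-360M"
custom_path = "My_AI_Models/SmolLM2"  # for example, a custom cache directory on disk

# cache_dir stores the downloaded files under custom_path instead of the default cache
model = AutoModelForCausalLM.from_pretrained(model_id, cache_dir=custom_path)
tokenizer = AutoTokenizer.from_pretrained(model_id, cache_dir=custom_path)

# Pass the loaded objects to the pipeline instead of a model name
generator = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)

# max_length=30 has no effect in this run: as the warning in the output says,
# max_new_tokens (=256) takes precedence
generator(
    "In this course, we will teach you how to",
    max_length=30,
    num_return_sequences=2,
)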

config.json:   0%|          | 0.00/689 [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/724M [00:00<?, ?B/s]

generation_config.json:   0%|          | 0.00/111 [00:00<?, ?B/s]

tokenizer_config.json: 0.00B [00:00, ?B/s]

vocab.json: 0.00B [00:00, ?B/s]

merges.txt: 0.00B [00:00, ?B/s]

tokenizer.json: 0.00B [00:00, ?B/s]

special_tokens_map.json:   0%|          | 0.00/831 [00:00<?, ?B/s]

Device set to use cpu
Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:0 for open-end generation.
Both `max_new_tokens` (=256) and `max_length`(=30) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


[{'generated_text': 'In this course, we will teach you how to do all kinds of math problems, including the type of math you have to do in the real world.\n\nThis course is for all you math students who want to know how to get better at math.\n\nIt’s not just what you learn in school, but the way you learn it.\n\nIt also includes tips for how to do better in math and how you can learn math faster.\n\nYou’ll learn how to solve math problems, how to use math to solve problems, and how you can use math to study for a math test.\n\nYou’ll also learn how many times you’ll need to study for math, how many times you’ll need an exam to get a good grade, and how to study for the exam.\n\nThe course is designed to help you learn math faster and more efficiently.\n\nIt’s not a course for everyone, but it can help you get better at math, and it can help you be more successful in math.\n\nThe course has been designed by people who have been in math for many years, including some of the top math teac
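
To check where the files for this model ended up, the sketch below builds the folder path expected under `custom_path`. It assumes the Hub cache's usual naming scheme, `models--<org>--<name>`; the helper name `expected_repo_folder` is hypothetical and is not part of any library.

```python
from pathlib import Path


def expected_repo_folder(model_id, cache_dir):
    """Build the folder the Hub cache is assumed to use for a model repo."""
    # Assumed layout: <cache_dir>/models--<org>--<name>
    return Path(cache_dir) / ("models--" + model_id.replace("/", "--"))


folder = expected_repo_folder("HuggingFaceTB/SmolLM2-360M", "My_AI_Models/SmolLM2")
print(folder)
print("already downloaded:", folder.exists())
```

If the folder exists, calling `from_pretrained` again with the same `cache_dir` should read from it rather than download the files again.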

In [5]:
generator(
    "In this course, we will teach you how to",
    max_length=30,
    num_return_sequences=2,
)

Setting `pad_token_id` to `eos_token_id`:0 for open-end generation.
Both `max_new_tokens` (=256) and `max_length`(=30) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


[{'generated_text': "In this course, we will teach you how to teach yourself to code!\n\nWe'll teach you the fundamentals of programming, including how to use a programming language that's easy to learn and understand, and how to build your own code.\n\nWe'll start with the basics, like how to read a program and use it to solve problems. We'll also cover how to write code.\n\nYou'll learn how to write code for a wide range of programming languages. You'll also learn how to use programming languages to solve problems.\n\nWe cover a variety of programming languages, including C, C++, JavaScript, and Python. We'll teach you how to use these languages to build your own programs.\n\nThe goal of this course is to help you learn how to code, and to teach you how to use programming languages to solve problems. We'll also provide you with the resources you need to get started.\n\nWe'll cover the basics, including how to read code and use it to solve problems. We'll also teach you how to write c
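
Both runs print far more than 30 tokens because, as the warning states, `max_new_tokens` (=256) takes precedence over `max_length` (=30). The sketch below is a simplified model of that rule, not the library's own code. It assumes that `max_length` counts prompt tokens plus generated tokens while `max_new_tokens` counts generated tokens only. The function name and the prompt length of 10 tokens are hypothetical.

```python
from typing import Optional


def new_token_budget(
    prompt_tokens: int,
    max_length: Optional[int] = None,
    max_new_tokens: Optional[int] = None,
) -> Optional[int]:
    """Return how many tokens may be generated under the precedence rule."""
    if max_new_tokens is not None:
        # max_new_tokens wins whenever both limits are set.
        return max_new_tokens
    if max_length is not None:
        # max_length includes the prompt, so subtract the prompt length.
        return max(max_length - prompt_tokens, 0)
    return None


# Hypothetical prompt length of 10 tokens.
print(new_token_budget(10, max_length=30, max_new_tokens=256))
print(new_token_budget(10, max_length=30))
```

To cap the output at about 30 generated tokens, pass `max_new_tokens=30` to `generator(...)` instead of `max_length=30`.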