<a href="https://colab.research.google.com/github/radve88/Learning-AI/blob/main/huggingface3_pynb.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Here is the complete pipeline with no shortcuts, so you can see the tokenizer, model, forward pass, and probability computation laid out step by step.

AutoTokenizer turns raw text → token IDs.

AutoModelForSequenceClassification loads the model with its classification head and runs the forward pass → logits.

softmax converts logits → probabilities.

Compared with the pipeline() abstraction, every intermediate value (token IDs, logits, probabilities) is visible and under your control.
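
To make the softmax step concrete, here is a minimal pure-Python sketch of what a softmax over one row of logits computes. The `softmax` helper and the example logit values are illustrative assumptions, not values produced by the model.

```python
import math

def softmax(logits):
    """Convert a list of raw scores (logits) into probabilities that sum to 1."""
    # Subtract the max for numerical stability; this does not change the result.
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for a two-class classifier.
example_logits = [-4.0, 4.5]
probs = softmax(example_logits)
print([round(p, 4) for p in probs])
```

The larger logit takes almost all of the probability mass, because softmax exponentiates the scores before normalizing them.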



In [1]:
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
import torch.nn.functional as F

# 1. Load the pre-trained model and tokenizer
model_name = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# 2. Text input
text = "I love Hugging Face!"

# 3. Tokenize input
inputs = tokenizer(text, return_tensors="pt")

# 4. Run the forward pass with gradient tracking disabled (inference only)
with torch.no_grad():
    outputs = model(**inputs)

# 5. Extract logits and apply softmax to get probabilities
logits = outputs.logits
probs = F.softmax(logits, dim=-1)

# 6. Get the predicted class index and its confidence
predicted_class = torch.argmax(probs, dim=-1).item()
confidence = probs[0][predicted_class].item()

# 7. Map class index to label name
labels = model.config.id2label
predicted_label = labels[predicted_class]

# 8. Print results
print(f"Text: {text}")
print(f"Label: {predicted_label}, Score: {confidence:.4f}")


Text: I love Hugging Face!
Label: POSITIVE, Score: 0.9999
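
As a rough sanity check on the score above: with two classes, softmax reduces to a sigmoid of the difference between the two logits, so a confidence of 0.9999 implies a logit gap of roughly 9.2. The sketch below works through that arithmetic. The helper names are hypothetical, and because the printed score is rounded to four decimals, the gap it recovers is only approximate.

```python
import math

def two_class_confidence(logit_gap):
    """Softmax probability of the winning class, given (winner logit - loser logit)."""
    return 1.0 / (1.0 + math.exp(-logit_gap))

def gap_for_confidence(p):
    """Inverse of the above: the logit gap that yields confidence p (0 < p < 1)."""
    return math.log(p / (1.0 - p))

gap = gap_for_confidence(0.9999)
print(f"Logit gap for 0.9999 confidence: {gap:.2f}")
print(f"Round trip: {two_class_confidence(gap):.4f}")
```

To see the gap the model actually produced, print `logits` from the cell above and subtract the smaller value from the larger one.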
