## 📚 1. Installing Essential Libraries

Before we can start working with powerful language models, we need to install the necessary libraries. This cell handles the installation of key packages from the Python Package Index (PyPI).

- **`transformers`**: This is the core library from Hugging Face. It provides the tools and models we need for various NLP tasks, including the `pipeline` API which makes it incredibly simple to use pre-trained models.
- **`sentencepiece`**: This is a tokenizer/detokenizer library, required by many modern transformer models (like T5, ALBERT, XLNet) to properly process text.
- **`sacremoses`**: A Python port of the Moses tokenizer and normalizer. A few tokenizers in the `transformers` library (for example, the one for XLM) depend on it for language-aware text preprocessing.

In [None]:
!pip install -U transformers
!pip install -U sentencepiece
!pip install -U sacremoses
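
Once the install cell has finished, a quick check can confirm which versions are actually available to the kernel. The following is a minimal sketch that uses only the standard library; the helper name `installed_version` is hypothetical and is not part of any of the packages above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Report the three packages installed above.
for name in ("transformers", "sentencepiece", "sacremoses"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

If a package shows up as "not installed" right after a successful `pip install`, restarting the notebook kernel is usually the first thing to try.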

## 📂 2. Setting a Custom Cache Directory (Optional)

Hugging Face models can be quite large (sometimes several gigabytes!). When you use a model for the first time, the `transformers` library downloads it and saves it to a cache directory. By default, this is `~/.cache/huggingface` in your user home folder.

This code block is useful for managing storage. It sets the `HF_HOME` environment variable to a custom path (`X:\AI-learin\courss\Fine-Tuning-LLM-with-HuggingFace-main\models`), which tells Hugging Face to download and store all models under that folder. Run it before importing `transformers`, because the library reads `HF_HOME` when it is first imported. This keeps your main drive clean and your model files organized, especially if you work with many different models.

In [None]:
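import os
# Raw string, so the backslashes in the Windows path are not read as escape sequences.
new_cache_dir = r"X:\AI-learin\courss\Fine-Tuning-LLM-with-HuggingFace-main\models"
# Set this before `transformers` is imported so the custom cache location is picked up.
os.environ['HF_HOME'] = new_cache_dir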
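
If the target folder might not exist yet, the same idea can be wrapped in a small helper that creates it first. This is only a sketch under stated assumptions: `set_hf_home` is a hypothetical helper name, and `hf_models_example` is a placeholder folder, not the path used in this notebook.

```python
import os
from pathlib import Path


def set_hf_home(cache_dir) -> Path:
    """Create `cache_dir` if needed and point the HF_HOME variable at it."""
    path = Path(cache_dir).expanduser().resolve()
    path.mkdir(parents=True, exist_ok=True)
    os.environ["HF_HOME"] = str(path)
    return path


# Hypothetical location (relative to the working directory); substitute your own folder.
cache_path = set_hf_home("hf_models_example")
print("HF_HOME =", os.environ["HF_HOME"])
```

As with the cell above, this needs to run before `transformers` is imported for the new location to take effect.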

## 📦 3. Importing Necessary Modules

Now that our environment is set up, we import the specific tools we'll use in this script.

- **`pipeline` from `transformers`**: This is a high-level, easy-to-use API that simplifies the process of using models for inference. It handles all the complex background steps like tokenization, feeding data to the model, and decoding the output.
- **`pandas` as `pd`**: Pandas is a powerful data analysis and manipulation library. We'll use it here to display the model's output in a clean, readable table called a DataFrame.

In [1]:
from transformers import pipeline
import pandas as pd

## ✨ 4. Performing Text Classification for Emotion Detection

This is where the magic happens! We'll use the `pipeline` to classify the emotion of a given text.

1.  **Model Selection**: We specify the model we want to use from the Hugging Face Hub. Here, we're using `"SamLowe/roberta-base-go_emotions"`, a RoBERTa model fine-tuned on the GoEmotions dataset, so it predicts fine-grained emotion labels such as `surprise` rather than plain positive/negative sentiment.

2.  **Creating the Pipeline**: We instantiate the `pipeline` for `"text-classification"`. We pass our chosen model and specify `device="cuda"` to instruct the pipeline to use a GPU for computation. Using a GPU significantly speeds up the process. If you don't have a GPU or CUDA is not configured, you can remove this argument or set it to `device=-1` to use the CPU.

3.  **Inference**: We provide a sample sentence, `text = "wow! we have come across this far"`, to our `classifier`.

4.  **Displaying Results**: The classifier returns a list of dictionaries, where each dictionary contains a predicted `label` (the emotion) and a `score` (the model's confidence). By default, only the highest-scoring label is returned for each input. We wrap this output in a `pd.DataFrame()` to present the results in a clear and organized table.
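
The CPU fallback mentioned in step 2 can also be chosen automatically. The sketch below assumes PyTorch is the backend behind the pipeline; `pick_device` is a hypothetical helper, not part of `transformers`.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch  # assumption: the pipeline runs on the PyTorch backend
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print("Using device:", device)
# The result can then be passed along as pipeline(..., device=device).
```

This keeps the notebook runnable on machines without a GPU, at the cost of slower inference.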

In [2]:

model = "SamLowe/roberta-base-go_emotions"
# Alternative sentiment models to try instead (uncomment one to override the line above):
# model = "tabularisai/multilingual-sentiment-analysis"
# model = "cardiffnlp/twitter-roberta-base-sentiment-latest"

classifier = pipeline("text-classification", model=model, device="cuda")

text = "wow! we have come across this far"
outputs = classifier(text)
pd.DataFrame(outputs)

Device set to use cuda


|   | label    | score    |
|---|----------|----------|
| 0 | surprise | 0.630752 |
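
The list-of-dictionaries result can also be post-processed without pandas, for example to keep only confident predictions. This is a minimal sketch: `labels_above` and its `threshold` parameter are hypothetical names, and the sample input is the single prediction shown above. In recent `transformers` releases, calling the classifier with `top_k=None` should return a score for every label rather than only the top one, but treat that as an assumption to verify against your installed version.

```python
def labels_above(outputs, threshold=0.5):
    """Return (label, score) pairs with score >= threshold, highest score first."""
    kept = [(o["label"], o["score"]) for o in outputs if o["score"] >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# The single prediction shown in the output above.
outputs = [{"label": "surprise", "score": 0.630752}]
print(labels_above(outputs))  # [('surprise', 0.630752)]
```

The threshold of 0.5 is an arbitrary starting point; a suitable value depends on the model and on how many labels you want to surface.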
