# My first Hugging Face test

## Installing transformers, tokenizers, datasets packages in main Python environment

I used this page to get going with Hugging Face and transformers. 

https://www.freecodecamp.org/news/get-started-with-hugging-face/

So on my main Python installation I ran:

In [None]:
#remember to use the main Python installation kernel at this point!
%pip install transformers

%pip install tokenizers datasets

I'm not sure whether it is really necessary to do this on the main Python installation, since I later use a virtual environment where I installed transformers separately, which also pulled in tokenizers (see steps below).


## Creating a virtual Python environment

I created the virtual environment as described on this page:

https://docs.python.org/3/library/venv.html

In a cmd window I ran the below to create the virtual environment:

In [None]:
REM remember to use the main Python installation at this point!
python -m venv "C:\Temp\PythonEnvironments\HuggingFaceTest1"
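
As an alternative to the cmd command, the same kind of environment could probably also be created from Python with the standard-library `venv` module. This is only a sketch: the helper name `create_env` is my own, and the path in the comment is the one from the command above.

```python
import venv
from pathlib import Path


def create_env(env_dir, with_pip=True):
    """Create a virtual environment at env_dir and return its path."""
    env_path = Path(env_dir)
    # with_pip=True bootstraps pip into the new environment,
    # similar to what `python -m venv` does by default
    venv.create(env_path, with_pip=with_pip)
    return env_path


# Example call (path taken from the cmd command above):
# create_env(r"C:\Temp\PythonEnvironments\HuggingFaceTest1")
```

Activation and kernel registration still have to be done separately, as in the following steps.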

Then I ran the below in the same cmd window to activate the virtual environment:

In [None]:
C:\Temp\PythonEnvironments\HuggingFaceTest1\Scripts\activate

Then I ran the below in the same cmd window to install IPython Kernel:

In [None]:
pip install ipykernel

Then I ran the below in the same cmd window to add the virtual environment as a kernel in Jupyter:

In [None]:
python -m ipykernel install --user --name=HuggingFaceTest1 --display-name="Python (HuggingFaceTest1)"

Now, from this Jupyter notebook, you can switch the kernel by clicking on the kernel name (usually displayed at the top right of the notebook interface).  
You should see the new kernel in the list of available kernels.

**Note that a restart of the Jupyter notebook (or Visual Studio Code) might be necessary to see the new kernel in the list of available kernels.**

To make sure we are using our new virtual environment we can run the below to see where pip is installed.  
In my case I saw two entries for pip, one in the virtual environment and one in the main Python installation:  
```console
c:\Temp\PythonEnvironments\HuggingFaceTest1\Scripts\pip.exe  
C:\Users\dingvars\AppData\Local\Programs\Python\Python312\Scripts\pip.exe
```

In [None]:
#here we should switch to the new virtual Python environment kernel
!where pip

We can also run the below to see which pip is being used.  
It should be the one in the virtual environment.

In [None]:
!pip -V 
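
The check can also be made from Python itself, without relying on which `pip` is found first on the path. A minimal sketch, assuming the kernel was created with `venv` as above (the helper name `in_virtualenv` is my own):

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the main Python installation
    return sys.prefix != sys.base_prefix


print(sys.executable)   # path of the interpreter behind the current kernel
print(in_virtualenv())
```

If the second line prints `False`, the notebook is most likely still running on the main Python installation kernel.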


We can then use the below to see which packages are installed in the virtual environment.  
I did not see transformers or tokenizers packages installed in the virtual environment at this point. 

In [None]:
!pip list

## Installing transformers package in the virtual environment

Then I installed Hugging Face Transformers in the virtual environment with the below command.


In [None]:
%pip install transformers

Using the below command I could see that transformers (and tokenizers) were now installed in the virtual environment.

In [None]:
!pip list
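
Instead of scanning the whole `pip list` output, the two packages of interest can be looked up directly. This is a small sketch using the standard-library `importlib.metadata`; the helper name `installed_version` is my own.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


for name in ("transformers", "tokenizers"):
    print(name, installed_version(name) or "not installed")
```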

## Installing misc packages in the virtual environment

At this point I tried to run the test code below but received a number of error messages.  
In the end I needed to run the below pip install commands to install misc packages to get the code to execute without errors.

**After these packages are installed you must restart the virtual environment kernel to use updated packages.**

In [None]:
%pip install torch

%pip install tensorflow

%pip install ipywidgets

%pip install tf-keras
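
After restarting the kernel, a quick way to see whether anything is still missing is to ask Python which modules it can find, without actually importing them. A rough sketch, assuming the import names `torch`, `tensorflow`, `ipywidgets` and `tf_keras` for the packages installed above (the helper name `missing_modules` is my own):

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


# Import names assumed for the packages installed above
print(missing_modules(["torch", "tensorflow", "ipywidgets", "tf_keras"]))
```

An empty list means all four can be located. It does not guarantee that they import without errors.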

## Running a model from Hugging Face

At this point we are ready for our first test of the Hugging Face Transformers library.

In [None]:
from transformers import pipeline

# Load a pre-trained pipeline for sentiment analysis
sentiment_analysis = pipeline("sentiment-analysis")

# Use the pipeline
result = sentiment_analysis("I love using Hugging Face Transformers!")
print(result)
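
The pipeline returns a list with one dictionary per input text, each holding a `label` and a `score`. The sketch below shows one way to turn that into something more readable. The helper name `summarize` is my own, and the sample values are made up for illustration, not actual model output.

```python
def summarize(results):
    """Format sentiment results as 'LABEL (score as percent)' strings."""
    return [f"{r['label']} ({r['score']:.2%})" for r in results]


# Illustrative values only, NOT real output from the model
sample = [{"label": "POSITIVE", "score": 0.9998}]
print(summarize(sample))
```

In the notebook, `summarize(result)` could be called on the `result` variable from the cell above.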

## Running the models on the GPU

I wanted to use the Nvidia RTX A2000 GPU that I have in my computer to run the models.

To achieve that I had to (re)install the Torch package with CUDA support.  
https://en.wikipedia.org/wiki/CUDA  

First I uninstalled the existing Torch package:

In [None]:
%pip uninstall -y torch torchvision torchaudio


Then I (re)installed the Torch package with CUDA support.  
I used this page to find the correct command:  
https://pytorch.org/get-started/locally/

**After these packages are installed you must restart the virtual environment kernel to use updated packages.**

In [None]:
%pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
#the cu118 part defines CUDA 11.8

At this point I received errors when trying to execute the code below to verify the Torch installation.  
To solve the issue I had to manually download and put the libomp140.x86_64.dll in the  
C:\Temp\PythonEnvironments\HuggingFaceTest1\Lib\site-packages\torch\lib folder.
  
I downloaded the dll from:  
https://www.dllme.com/dll/files/libomp140_x86_64/00637fe34a6043031c9ae4c6cf0a891d/download

**After doing this I had to restart the kernel for the code below to work.**

The error message I received did not directly mention the libomp140.x86_64.dll file.  
It looked something like:  
"*WinError 126, error loading fbgemm.dll or dependencies*".  
I used the Dependencies tool to find out that it was actually the libomp140.x86_64.dll file that was missing.  
https://github.com/lucasg/Dependencies

After this I could run the code below to verify that Torch was installed correctly and that the GPU was available.

In [1]:
import torch

print(torch.__version__) # Should print the version of PyTorch you have installed
print(torch.cuda.is_available())  # Should return True if CUDA is available and correctly set up
print(torch.cuda.current_device())  # Should return 0 for the first GPU device
print(torch.cuda.get_device_name(0))  # Should print the name of your GPU

2.4.0+cu118
True
0
NVIDIA RTX A2000 Laptop GPU
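
Note that `torch.cuda.current_device()` and `torch.cuda.get_device_name(0)` raise an error when CUDA is not available, and that a missing DLL makes `import torch` itself fail. A more defensive variant of the check might look like the sketch below. The helper name `describe_cuda` and the shape of the returned dictionary are my own choices.

```python
def describe_cuda():
    """Report the Torch version and GPU status without raising."""
    try:
        import torch
    except (ImportError, OSError) as exc:
        # ImportError: torch is not installed in this environment
        # OSError: torch is installed but a DLL failed to load (e.g. WinError 126)
        return {"torch": None, "cuda": False, "device": None, "error": str(exc)}

    available = torch.cuda.is_available()
    return {
        "torch": torch.__version__,
        "cuda": available,
        # Only ask for the device name when CUDA is actually usable
        "device": torch.cuda.get_device_name(0) if available else None,
        "error": None,
    }


print(describe_cuda())
```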


The below code should now work OK, but in the output you should see that the GPU is not being used.  
The message should be something like:  
*"Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU."*

In [None]:
from transformers import pipeline

generator = pipeline('text-generation', model='gpt2')

generator("In this course, we will teach you how to", 
                   max_length=30, 
                   num_return_sequences=100, 
                   truncation=True)

The below code does the same thing as the code above but here we specify that the first GPU device should be used for the model.  
In my case this is my Nvidia RTX A2000 GPU.  
This code also runs considerably faster than the code above since it uses the GPU for parallel processing.  
The gain is larger the more sequences we process.

In [3]:
import torch
from transformers import pipeline

# Check if GPU is available and set the device accordingly
device = 0 if torch.cuda.is_available() else -1  # 0 for GPU, -1 for CPU

# Initialize the pipeline with the device argument
generator = pipeline('text-generation', model='gpt2', device=device)

# Call the generator with truncation=True
generator("In this course, we will teach you how to", 
                   max_length=30, 
                   num_return_sequences=100, 
                   truncation=True)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': 'In this course, we will teach you how to create a powerful web library. You will then move on to the next subject.\n\nOur goal'},
 {'generated_text': 'In this course, we will teach you how to create reusable content, how to share it, and how to create scalable web applications.'},
 {'generated_text': 'In this course, we will teach you how to build a custom dashboard for React.js and a web app.\n\nYou will also learn a'},
 {'generated_text': 'In this course, we will teach you how to use the built-in C++11 template functions.\n\nFor the C++11 template classes'},
 {'generated_text': 'In this course, we will teach you how to build your first project on the ground. Here you will be able to view some of the basics of'},
 {'generated_text': "In this course, we will teach you how to develop the confidence level of your skills, the mindset of your partner's thinking and the overall decision making"},
 {'generated_text': 'In this course, we will teach you how to use differen
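
To put a number on the CPU versus GPU difference, each call can be timed. Below is a minimal sketch with a helper named `timed` (my own name) built on `time.perf_counter`. It uses `sum` as a stand-in workload so that it runs without downloading a model.

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Stand-in workload, used here only so the sketch is self-contained
_, seconds = timed(sum, range(1_000_000))
print(f"{seconds:.4f} s")
```

In the notebook, the same helper could wrap the calls above, for example `timed(generator, "In this course, we will teach you how to", max_length=30, num_return_sequences=100, truncation=True)`, once with the CPU pipeline and once with the GPU pipeline.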

## Finding other models to play around with

https://huggingface.co/models