<a href="https://www.kaggle.com/code/kapilstp84/guide-to-run-llama2-using-hugging-face?scriptVersionId=140144058" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

# Beginner's Guide to Running Llama 2 Using Hugging Face

### Running Llama 2 on Kaggle after getting access from Meta

### The notebook below also compares the output of Llama 2 7B and ChatGPT

* Make sure you have permission to access the Hugging Face repository used in this notebook. If not, follow the steps below:
* Go to the Meta website and log in or sign up. Use the same email ID/username to request access to Llama 2 on Hugging Face. Meta can take 1-2 days to grant access, although it generally takes only a few hours.

### In this example we run the pre-trained model "meta-llama/Llama-2-7b-chat-hf"

## Creating a token to log in to Hugging Face

* To access private repositories or other restricted resources on the Hugging Face model hub, you need an authentication token.
* This token verifies your identity and permissions. You can generate one from your Hugging Face account settings.
* Visit https://huggingface.co/settings/tokens while logged in to your Hugging Face account.
* Click "New Token" and provide a name for the token (e.g., "My Token for Hugging Face CLI"). Then click "Create" to generate the token.
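
For non-interactive runs, one option is to read the token from an environment variable instead of pasting it into a prompt. The sketch below assumes a variable named `HF_TOKEN`, and the helper names `read_hf_token` and `login_with_token` are illustrative only. `login(token=...)` is the `huggingface_hub` call that accepts the token directly.

```python
import os


def read_hf_token(env=os.environ, var_name="HF_TOKEN"):
    """Return the token stored under var_name, or None if it is unset or blank."""
    token = env.get(var_name, "").strip()
    return token or None


def login_with_token(token):
    """Log in to the Hugging Face Hub without the interactive widget."""
    # Imported lazily so read_hf_token works even without huggingface_hub installed.
    from huggingface_hub import login

    login(token=token)


# Demonstration with a fake environment; a real token would come from os.environ.
fake_env = {"HF_TOKEN": "  hf_example  "}
print(read_hf_token(fake_env))
```

In a real session you would call `login_with_token(read_hf_token())` only when the helper returns a value, and fall back to the interactive `login()` prompt otherwise.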

In [1]:
pip install transformers

Note: you may need to restart the kernel to use updated packages.


In [2]:
from huggingface_hub import login
login()

# Paste your Hugging Face token into the prompt that appears after running this cell


In [3]:
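# Only the tokenizer and causal language model classes are used in this notebook
from transformers import AutoTokenizer, AutoModelForCausalLM

# Gated chat checkpoint; requires the access granted by Meta
model_name = "meta-llama/Llama-2-7b-chat-hf"
# Download the tokenizer that matches the model
Tokenizer = AutoTokenizer.from_pretrained(model_name)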

caused by: ['/opt/conda/lib/python3.10/site-packages/tensorflow_io/python/ops/libtensorflow_io_plugins.so: undefined symbol: _ZN3tsl6StatusC1EN10tensorflow5error4CodeESt17basic_string_viewIcSt11char_traitsIcEENS_14SourceLocationE']
caused by: ['/opt/conda/lib/python3.10/site-packages/tensorflow_io/python/ops/libtensorflow_io.so: undefined symbol: _ZTVN10tensorflow13GcsFileSystemE']


Downloading (…)okenizer_config.json:   0%|          | 0.00/776 [00:00<?, ?B/s]

Downloading tokenizer.model:   0%|          | 0.00/500k [00:00<?, ?B/s]

Downloading (…)/main/tokenizer.json:   0%|          | 0.00/1.84M [00:00<?, ?B/s]

Downloading (…)cial_tokens_map.json:   0%|          | 0.00/414 [00:00<?, ?B/s]

In [4]:
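# use_auth_token=True reuses the token saved by login(); newer transformers releases deprecate it in favor of token=True
model = AutoModelForCausalLM.from_pretrained(model_name, use_auth_token=True)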

Downloading (…)lve/main/config.json:   0%|          | 0.00/614 [00:00<?, ?B/s]

Downloading (…)fetensors.index.json:   0%|          | 0.00/26.8k [00:00<?, ?B/s]

Downloading shards:   0%|          | 0/2 [00:00<?, ?it/s]

Downloading (…)of-00002.safetensors:   0%|          | 0.00/9.98G [00:00<?, ?B/s]

Downloading (…)of-00002.safetensors:   0%|          | 0.00/3.50G [00:00<?, ?B/s]

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Downloading (…)neration_config.json:   0%|          | 0.00/188 [00:00<?, ?B/s]
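
The shard sizes in the progress bars above give a rough idea of how much memory the model needs. The back-of-the-envelope sketch below rests on two assumptions that you should check against your own setup: that the checkpoint stores weights as 16-bit floats (2 bytes per parameter), and that `from_pretrained` loads them as 32-bit floats (4 bytes per parameter) when no `torch_dtype` is passed. All variable names are illustrative.

```python
# Shard sizes as reported by the download progress bars (in GB).
shard_sizes_gb = [9.98, 3.50]

BYTES_PER_PARAM_FP16 = 2  # assumption: checkpoint stored in 16-bit floats
BYTES_PER_PARAM_FP32 = 4  # assumption: weights loaded as 32-bit floats

checkpoint_gb = sum(shard_sizes_gb)
approx_params_billion = checkpoint_gb / BYTES_PER_PARAM_FP16
fp32_memory_gb = approx_params_billion * BYTES_PER_PARAM_FP32

print(f"Checkpoint on disk:      {checkpoint_gb:.2f} GB")
print(f"Approximate parameters:  {approx_params_billion:.2f} billion")
print(f"Estimated fp32 footprint: {fp32_memory_gb:.2f} GB")
```

The estimate covers the weights only; activations and the Python process itself need additional memory.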

### The "input" code below tokenizes text with the Hugging Face Transformers library, using the pre-trained tokenizer loaded above.

* Input_text is a variable that holds a text string: whatever you want to send to the model.
* Tokenizer is the tokenizer object loaded from the Hugging Face Transformers library. A tokenizer converts text into a format that machine learning models can understand. It typically splits the text into tokens and converts them into numerical IDs.
* The encode method of the tokenizer processes Input_text and converts it into a sequence of token IDs that can be fed into the model.
* The return_tensors parameter specifies the format in which the encoded data is returned. Here, 'pt' stands for PyTorch tensors.

### In summary, the code takes Input_text, encodes it with the tokenizer, and returns the token IDs as a PyTorch tensor. Encoded text like this can then be used as input for natural language processing tasks such as language generation, sentiment analysis, or text classification, using pre-trained models available in the Hugging Face Transformers library.
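
To illustrate the encode and decode round trip without downloading anything, here is a toy sketch. `ToyTokenizer` is a hypothetical class that splits on whitespace; it is not Llama 2's real tokenizer, which uses a learned subword vocabulary, and the ID assignments are made up for the example.

```python
class ToyTokenizer:
    """Whitespace tokenizer for illustration only (not the Llama 2 tokenizer)."""

    UNKNOWN_ID = 0
    BOS_ID = 1  # "beginning of sequence" marker, prepended to every encoding

    def __init__(self, corpus):
        words = sorted(set(corpus.split()))
        # IDs 0 and 1 are reserved, so real words start at 2.
        self.word_to_id = {word: index + 2 for index, word in enumerate(words)}
        self.id_to_word = {index: word for word, index in self.word_to_id.items()}

    def encode(self, text):
        ids = [self.BOS_ID]
        ids += [self.word_to_id.get(word, self.UNKNOWN_ID) for word in text.split()]
        return [ids]  # a batch containing one sequence, like return_tensors='pt'

    def decode(self, ids):
        words = [self.id_to_word.get(i, "<unk>") for i in ids if i != self.BOS_ID]
        return " ".join(words)


toy = ToyTokenizer("my favourite movie is the departed")
encoded = toy.encode("my favourite movie")
print(encoded)
print(toy.decode(encoded[0]))
```

The real `Tokenizer.encode(..., return_tensors='pt')` call follows the same pattern but returns a PyTorch tensor of shape (1, number of tokens) rather than a nested list.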

In [13]:
Input_text = "My favourite movie is The Departed. Can you recommend me other similar movies i might like?"
Inputs = Tokenizer.encode(Input_text, return_tensors = 'pt')

### The "output" code below uses the language generation model loaded above to generate text from the encoded input.

> * outputs is a variable that holds the sequences generated by the model. They are returned as token IDs and still need to be decoded into text.
> * model refers to the language generation model loaded earlier from the Hugging Face Transformers library.
> * The generate method of the model generates text. It takes various parameters that control the generation process.
> * Inputs holds the encoded input text prepared earlier with the tokenizer. It serves as the prompt for the language generation model.
> * The max_length parameter caps the total length of each sequence in tokens, counting the prompt tokens as well as the newly generated ones.
> * The num_return_sequences parameter specifies how many different text sequences you want the model to generate.
> * The temperature parameter controls the randomness of the generation process. A higher value (e.g., 1.0) makes the generated text more random, while a lower value (e.g., 0.7) makes it more focused and predictable. It only has an effect when sampling is enabled.
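
To make the effect of temperature concrete, here is a small self-contained sketch that divides a set of scores (logits) by the temperature before applying softmax, which is the usual way temperature is applied during sampling. The function name and the example logits are illustrative assumptions, not values taken from Llama 2.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    scaled = [value / temperature for value in logits]
    largest = max(scaled)  # subtract the maximum for numerical stability
    exponentials = [math.exp(value - largest) for value in scaled]
    total = sum(exponentials)
    return [value / total for value in exponentials]


# Made-up scores for three candidate next tokens.
example_logits = [2.0, 1.0, 0.5]

for temperature in (1.0, 0.7):
    probabilities = softmax_with_temperature(example_logits, temperature)
    print(temperature, [round(p, 3) for p in probabilities])
```

At the lower temperature the most likely token receives a larger share of the probability mass, so sampled text varies less from run to run.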

In [14]:
# Generate text. do_sample=True makes sampling explicit: temperature and
# num_return_sequences > 1 both rely on sampling (or on beam search).
outputs = model.generate(Inputs, max_length=100, num_return_sequences=5, do_sample=True, temperature=0.7)

# Print generated text
print("Generated text:")
# Number of tokens in the prompt, so they can be skipped when decoding
prompt_length = Inputs.shape[-1]
for i, output in enumerate(outputs):
    # Decode only the newly generated tokens, leaving out the input/question.
    # Slicing by token count avoids relying on a string search for the prompt.
    generated_text = Tokenizer.decode(output[prompt_length:], skip_special_tokens=True)
    print(f"{i}: {generated_text}")

Generated text:
0: 

I'm looking for crime dramas with complex characters, intricate plots and a good mix of action and suspense.

You could try:

1. Infernal Affairs (2002) - A Hong Kong crime drama that explores the cat-and-mouse game between an undercover cop and a crime boss.
2. The
1: 
The Departed is a crime drama film directed by Martin Scorsese, released in 2006. It follows an undercover cop (Leonardo DiCaprio) who infiltrates a Boston crime syndicate, while a mole within the police department (Matt Damon) works to uncover his identity. The movie features a star-studded cast
2: 
The Departed is a crime drama directed by Martin Scorsese, starring Leonardo DiCaprio, Matt Damon, and Jack Nicholson. It tells the story of an undercover cop who infiltrates a Boston crime syndicate, while a mole within the police department works to uncover his identity. The movie is known for its gripping plot,
3: 
The Departed is a crime drama directed by Martin Scorsese, starring Leonardo DiCaprio,

# ChatGPT response to the same prompt:

### My favourite movie is The Departed. Can you recommend me other similar movies i might like?
### Return only 5 sequences with max length = 500

### For the above prompt on **ChatGPT**, I got the following response:

Certainly! If you loved "The Departed," you might also enjoy these:

Infernal Affairs (2002) - Hong Kong thriller, undercover cops.

Heat (1995) - Crime, detective vs. criminal.

Donnie Brasco (1997) - FBI agent infiltrates mafia.

The Town (2010) - Bank robbers, moral conflicts.

American Gangster (2007) - Detective vs. drug lord.