# Text Generation Using the LLaMA Model

### This task shows a straightforward way to use a LLaMA model to generate text from a given prompt. With the Hugging Face Transformers library, users can load a pre-trained model, encode the input, generate text, and decode the output. The approach applies to natural language processing tasks such as storytelling and dialogue generation. The prompt and the generation parameters can both be adjusted to tailor the output to specific needs.

<figure>
        <img src="https://debuggercafe.com/wp-content/uploads/2023/11/Text-Generation-with-Transformers-e1700010212188.png" alt="Text generation with Transformers" style='width:800px;height:500px;'>
        <figcaption></figcaption>
</figure>

In [1]:
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

# Load pre-trained model and tokenizer
model_name = "openlm-research/open_llama_3b"  # You can use other LLaMA models as available
model = LlamaForCausalLM.from_pretrained(model_name)
tokenizer = LlamaTokenizer.from_pretrained(model_name)

# Set the device to GPU if available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

# Define the prompt
prompt = "Once upon a time in a faraway land, there was a small village where people lived in harmony with nature. One day,"

# Encode the input prompt
inputs = tokenizer.encode(prompt, return_tensors="pt").to(device)

# Generate text
max_length = 200  # Total length in tokens, including the prompt; adjust as needed
outputs = model.generate(inputs, max_length=max_length, num_return_sequences=1)

# Decode the generated text
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)

# Print the generated text
print("Generated Text:")
print(generated_text)

Generated Text:
Once upon a time in a faraway land, there was a small village where people lived in harmony with nature. One day, a group of greedy men came to the village and said, “We want to build a dam and flood your village. We will give you a lot of money to move to a new place.” The people of the village were very happy to move to a new place, but they were sad to leave their old home.
The greedy men built the dam and flooded the village. The people of the village were sad and angry. They said, “We want to go back to our old home. We want to rebuild our village.” The greedy men said, “No, you can’t go back. You have to move to a new place.” The people of the village were very sad and angry. They said, “We want to go back to our old home. We want to rebuild our village.” The greedy men said, “
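
The sample above stops mid-sentence and loops on the same two lines of dialogue. Two settings in the call to `model.generate` probably explain this: `max_length` counts the prompt tokens as well as the new ones, and with no sampling options set, decoding is typically greedy. The sketch below collects a few standard `generate` keyword arguments that are commonly used to counter both effects. The helper name `sampling_kwargs` and every default value in it are illustrative assumptions, not settings from the original run, and they would need tuning for a given model and prompt.

```python
def sampling_kwargs(max_new_tokens=150, temperature=0.8, top_p=0.95,
                    repetition_penalty=1.2):
    """Return keyword arguments for `model.generate` (hypothetical helper).

    All default values are illustrative assumptions, not tuned settings.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive when sampling")
    if not 0 < top_p <= 1:
        raise ValueError("top_p must be in (0, 1]")
    return {
        # Counts only newly generated tokens, not the prompt
        "max_new_tokens": max_new_tokens,
        # Sample from the distribution instead of always taking the top token
        "do_sample": True,
        # Below 1 sharpens the distribution, above 1 flattens it
        "temperature": temperature,
        # Nucleus sampling: keep the smallest token set reaching this probability mass
        "top_p": top_p,
        # Values above 1 penalize tokens that have already appeared
        "repetition_penalty": repetition_penalty,
    }


gen_kwargs = sampling_kwargs()
print(gen_kwargs)

# With the model and inputs from the cell above, the call would become:
# outputs = model.generate(inputs, **gen_kwargs)
```

Because sampling is random, each run gives a different continuation. Calling `torch.manual_seed` with a fixed value before `generate` should make a run repeatable on the same setup.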
