# Text Completion

This example uses the LlamaIndex library with a pre-trained Hugging Face model to complete a prompt that serves as the opening of a conversation with the model.

The goal was to run a complete, working example on an on-premise GPU with only 12 GB of memory.
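As a rough sanity check on the 12 GB target, the weight footprint can be estimated from the parameter count and the dtype width. The sketch below assumes about 1.1 billion parameters (read off the model name used later) and 2 bytes per parameter for bfloat16. Activations, the KV cache, and CUDA overhead are not included, so actual usage will be higher.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int) -> float:
    """Approximate memory needed just to hold the model weights, in GiB."""
    return num_params * bytes_per_param / 1024**3

# Assumption: ~1.1e9 parameters, as suggested by the "1.1B" in the model name.
NUM_PARAMS = 1.1e9

bf16_gib = weight_memory_gib(NUM_PARAMS, 2)  # bfloat16: 2 bytes per parameter
fp32_gib = weight_memory_gib(NUM_PARAMS, 4)  # float32: 4 bytes per parameter

print(f"bfloat16 weights: {bf16_gib:.2f} GiB")
print(f"float32 weights:  {fp32_gib:.2f} GiB")
```

Under these assumptions the weights take only a small fraction of a 12 GB card in either precision.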

## 1. Code using PyTorch and Transformers

In [1]:
import torch
from transformers import pipeline
pipe = pipeline("text-generation", model="TinyLlama/TinyLlama-1.1B-Chat-v1.0", torch_dtype=torch.bfloat16, device_map="auto")
# We use the tokenizer's chat template to format each message - see https://huggingface.co/docs/transformers/main/en/chat_templating
messages = [
    {
        "role": "system",
        "content": "You are a friendly chatbot who always responds in the style of a pirate",
    },
    {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)

2024-01-19 12:16:55.774974: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-01-19 12:16:55.803408: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
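
The prompt string built by `apply_chat_template` can be approximated by hand. The sketch below is a hypothetical re-implementation (`format_chat` is not a library function), inferred only from the generated text printed in this notebook. The real template ships with the tokenizer and may differ in details.

```python
def format_chat(messages, add_generation_prompt=True, eos="</s>"):
    """Hypothetical stand-in for the tokenizer's chat template."""
    # Each message becomes "<|role|>\ncontent</s>\n".
    parts = [f"<|{m['role']}|>\n{m['content']}{eos}\n" for m in messages]
    if add_generation_prompt:
        # Leave the assistant turn open so the model continues from here.
        parts.append("<|assistant|>\n")
    return "".join(parts)

messages = [
    {
        "role": "system",
        "content": "You are a friendly chatbot who always responds in the style of a pirate",
    },
    {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]

prompt = format_chat(messages)
print(prompt)
```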


In [5]:
print(outputs[0]["generated_text"])

<|system|>
You are a friendly chatbot who always responds in the style of a pirate</s>
<|user|>
How many helicopters can a human eat in one sitting?</s>
<|assistant|>
I do not have access to specific data on the number of helicopters a human can eat in one sitting. However, helicopters are usually equipped with a small serving capacity, and a typical meal for one person might consist of around 3-4 servings. So, a human can eat around 6-8 servings of helicopter food, which is not a lot. However, the number of servings depends on the size of the helicopter, the quality of the food, and the individual's appetite.
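
The generated text above contains the full prompt followed by the model's reply. A minimal sketch for keeping only the reply is shown below. `extract_reply` is a hypothetical helper, and it assumes the `<|assistant|>` marker seen in the output above.

```python
def extract_reply(generated_text: str, marker: str = "<|assistant|>\n") -> str:
    """Return the text after the last assistant marker, or the input if absent."""
    _, found, reply = generated_text.rpartition(marker)
    return reply.strip() if found else generated_text.strip()

# Shortened sample in the same layout as the output above.
sample = (
    "<|system|>\nYou are a friendly chatbot</s>\n"
    "<|user|>\nHello?</s>\n"
    "<|assistant|>\nAhoy there!"
)

reply = extract_reply(sample)
print(reply)
```

The text-generation pipeline also accepts `return_full_text=False`, which should return only the newly generated text without any post-processing.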


In [None]:
from llama_index.llms import HuggingFaceLLM
from llama_index.prompts import PromptTemplate

llm = HuggingFaceLLM(
    model_name="TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    tokenizer_name="TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    query_wrapper_prompt=PromptTemplate(
        "<|system|>\n</s>\n<|user|>\n{query_str}</s>\n<|assistant|>\n"),
    # query_wrapper_prompt=PromptTemplate(template),
    context_window=2048,
    max_new_tokens=256,
    model_kwargs={'trust_remote_code': True},
    generate_kwargs={"temperature": 0.7},
    device_map="auto",
)

## 2. Code using LlamaIndex

In [3]:
from llama_index import set_global_tokenizer
# Use the model's Hugging Face tokenizer so LlamaIndex token counts match the model
from transformers import AutoTokenizer

set_global_tokenizer(
    AutoTokenizer.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0").encode
)

In [1]:
from llama_index.llms import HuggingFaceLLM
from llama_index.prompts import PromptTemplate


llm = HuggingFaceLLM(
    model_name="TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    tokenizer_name="TinyLlama/TinyLlama-1.1B-Chat-v1.0",
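    # NOTE: this template has no {query_str} placeholder, so the text passed to
    # llm.complete() is likely never inserted into the prompt. That would explain
    # the off-topic completion below. A template that includes the query:
    # "<|system|>\n</s>\n<|user|>\n{query_str}</s>\n<|assistant|>\n"
    query_wrapper_prompt=PromptTemplate(
        "<|system|>\n</s>\n<|user|>\n</s>\n<|assistant|>\n"),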
    context_window=2048,
    max_new_tokens=256,
    model_kwargs={'trust_remote_code': True},
    generate_kwargs={"temperature": 0.7,"do_sample":True},
    device_map="auto",
)
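
The `context_window` and `max_new_tokens` settings together bound how long a prompt can be. The sketch below works out that budget and checks a prompt against it. `prompt_budget`, `fits_in_context`, and the whitespace tokenizer are hypothetical, for illustration only. In practice the model's own tokenizer (for example the `encode` callable registered with `set_global_tokenizer`) would give the real count.

```python
CONTEXT_WINDOW = 2048  # same value as context_window in the cell above
MAX_NEW_TOKENS = 256   # same value as max_new_tokens in the cell above

def prompt_budget(context_window: int, max_new_tokens: int) -> int:
    """Tokens left for the prompt once room for the completion is reserved."""
    return context_window - max_new_tokens

def fits_in_context(prompt: str, encode, budget: int) -> bool:
    """Check a prompt against the budget using any str -> token-list callable."""
    return len(encode(prompt)) <= budget

# Hypothetical stand-in tokenizer: splits on whitespace. Real subword
# tokenizers usually produce more tokens than this for the same text.
toy_encode = str.split

budget = prompt_budget(CONTEXT_WINDOW, MAX_NEW_TOKENS)
query = "How many helicopters can a human eat in one sitting?"

print(budget)
print(fits_in_context(query, toy_encode, budget))
```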

In [7]:
query_str = "How many helicopters can a human eat in one sitting?"

response = llm.complete(query_str)
print(response.text)

1. Bake the chicken in a preheated 375°F (190°C) oven for 30-35 minutes or until the internal temperature reaches 165°F (74°C).

2. Remove the chicken from the oven and let it rest for 10 minutes before slicing into thin strips.

3. In a small bowl, mix together the lime juice, honey, garlic, salt, and pepper.

4. Brush the marinade over the chicken strips while they are still warm.

5. Serve the grilled chicken with the lime slices and a side of cilantro rice. Enjoy!
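
The completion above has nothing to do with the question. One plausible cause is that the wrapper template used for this run contains no `{query_str}` placeholder. The sketch below assumes the wrapper is filled in with Python-style string formatting (an assumption about LlamaIndex internals, not verified here) and shows that, in that case, the query never reaches the prompt.

```python
with_placeholder = "<|system|>\n</s>\n<|user|>\n{query_str}</s>\n<|assistant|>\n"
without_placeholder = "<|system|>\n</s>\n<|user|>\n</s>\n<|assistant|>\n"

query = "How many helicopters can a human eat in one sitting?"

# str.format silently ignores keyword arguments that the template never uses.
filled = with_placeholder.format(query_str=query)
unfilled = without_placeholder.format(query_str=query)

print(query in filled)
print(query in unfilled)
print(unfilled == without_placeholder)
```

If this is what happens, the model only sees empty system and user turns and is free to generate any text at all.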


In [6]:
from llama_index.llms import ChatMessage

messages = [
    ChatMessage(
        role="system", content="You are a friendly chatbot who always responds"
    ),
    ChatMessage(role="user", content=query_str),
]
resp = llm.chat(messages)
print(resp)

assistant: <|assistant|>
There is no scientific evidence that supports the number of helicopters that a human can eat in one sitting. The amount of food a person can consume depends on their weight, appetite, and physical activity levels. Most people are able to consume enough calories to sustain their energy levels and maintain a healthy weight, even with a small amount of food. However, eating a large amount of food in one sitting can lead to nutrient deficiencies, especially in people with unique dietary needs or conditions.


## GPT Response

On January 23

Humans cannot eat helicopters. Helicopters are large, mechanical vehicles made of metal and other materials that are not edible. Attempting to eat a helicopter would be extremely dangerous and impossible. If you have any concerns or thoughts related to the consumption of non-food items, it's important to seek help from a medical professional or mental health expert.

## Bard Response

On January 23


A human cannot eat a helicopter in one sitting. Helicopters are not edible, as they are made of metal, plastic, and other materials that are not digestible by humans. Additionally, helicopters are very large and heavy, and it would be physically impossible for a human to consume one in a single sitting.

It is important to note that the question of how many helicopters a human can eat in one sitting is a hypothetical one, and it is not something that should be attempted in real life. Ingesting non-edible objects can be harmful to your health, and it could even be fatal.

## TODO (Manuel): Add a screenshot of GPU RAM usage after the task completes
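
Peak GPU memory can also be read programmatically after the run. The sketch below is a minimal example, assuming PyTorch is installed and the model ran on a CUDA device in the same process. `format_gib` and `peak_gpu_memory` are hypothetical helper names. `torch.cuda.max_memory_allocated` only counts memory allocated through PyTorch, so the number may be lower than what `nvidia-smi` reports.

```python
def format_gib(num_bytes: int) -> str:
    """Render a byte count as GiB with two decimals."""
    return f"{num_bytes / 1024**3:.2f} GiB"

def peak_gpu_memory() -> dict:
    """Return peak allocated bytes per visible CUDA device, or {} without CUDA."""
    try:
        import torch
    except ImportError:
        return {}
    if not torch.cuda.is_available():
        return {}
    return {
        i: torch.cuda.max_memory_allocated(i)
        for i in range(torch.cuda.device_count())
    }

# Call this after generation has finished, in the same Python process.
for device, peak in peak_gpu_memory().items():
    print(f"cuda:{device} peak allocated: {format_gib(peak)}")
```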