In [1]:
import torch
from transformers import pipeline

pipe = pipeline("text-generation", model="TinyLlama/TinyLlama-1.1B-Chat-v0.6", torch_dtype=torch.bfloat16, device_map="auto")

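def chat(messages=None, max_new_tokens=256, temperature=0.7):
    # Avoid a mutable default argument; treat a missing conversation as empty
    messages = messages or []
    # Render the conversation with the model's chat template, ending on the assistant tag
    prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    outputs = pipe(prompt, max_new_tokens=max_new_tokens, do_sample=True, temperature=temperature, top_k=50, top_p=0.95)
    response = outputs[0]["generated_text"]
    # generated_text echoes the prompt, so keep only what follows the last assistant tag
    tag = "<|assistant|>"
    index = response.rindex(tag) + len(tag)
    return response[index:]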
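
chat() above finds the reply by searching for the last "<|assistant|>" tag in the generated text. If the model ever writes that tag inside its own reply, the search would cut the reply short. A minimal sketch of an alternative is to slice off the echoed prompt by length. The helper name strip_prompt is hypothetical and not part of the notebook; the text-generation pipeline also accepts return_full_text=False, which may make the slicing unnecessary.

```python
def strip_prompt(generated_text: str, prompt: str) -> str:
    """Return only the newly generated text from a prompt-echoing output."""
    # Usual case: the pipeline output starts with the exact prompt we sent
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):]
    # Fallback: search for the last assistant tag, as chat() does
    tag = "<|assistant|>"
    pos = generated_text.rfind(tag)
    if pos == -1:
        # No tag at all: return the text unchanged rather than raising
        return generated_text
    return generated_text[pos + len(tag):]


# Made-up strings for illustration, not real model output
prompt = "<|user|>\nHi</s>\n<|assistant|>\n"
full = prompt + "The <|assistant|> tag marks my turn."
print(strip_prompt(full, prompt))
```

Inside chat(), this would be called as strip_prompt(outputs[0]["generated_text"], prompt).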



In [10]:
# We use the tokenizer's chat template to format each message - see https://huggingface.co/docs/transformers/main/en/chat_templating
messages = [
    {"role": "system", "content": "You are a friendly chatbot who always responds in the style of a pirate"},
    {"role": "user", "content": "Does whale oil taste good?"},
]

resp = chat(messages)
print(resp)


I don't have the capability to taste or taste anything, but I can provide information based on the given material. 

the given material suggests that whale oil is a popular ingredient used in candles and other home fragrance products. The taste of whale oil is described as piney, which may be perceived as pleasant or unpleasant depending on personal taste. Some may find it too strong or harsh, while others may prefer its subtle sweetness and mild scent.
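
To see roughly what apply_chat_template hands to the model, here is a hand-written approximation of a Zephyr-style template, which is the layout TinyLlama chat models appear to use. The function name render_zephyr_style and the exact whitespace are assumptions; the authoritative output comes from printing pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True).

```python
def render_zephyr_style(messages, add_generation_prompt=True, eos="</s>"):
    """Approximate a Zephyr-style chat prompt (assumed layout, not the real template)."""
    parts = []
    for m in messages:
        # Each turn: role tag on its own line, then the content and an end-of-sequence token
        parts.append(f"<|{m['role']}|>\n{m['content']}{eos}\n")
    if add_generation_prompt:
        # An open assistant tag asks the model to write the next turn
        parts.append("<|assistant|>\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a friendly chatbot who always responds in the style of a pirate"},
    {"role": "user", "content": "Does whale oil taste good?"},
]
print(render_zephyr_style(messages))
```

The trailing "<|assistant|>" tag is what chat() searches for when it separates the reply from the echoed prompt.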


In [3]:
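# Here the persona instruction is sent with role "user" rather than "system";
# the second call below sends the bare question for comparison.
messages = [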
    {"role": "user", "content": "You are a friendly chatbot who always responds in the style of a pirate"},
    {"role": "user", "content": "Who is google?"},
]
resp = chat(messages)
print(resp)

messages = [
    {"role": "user", "content": "Who is google?"},
]
resp = chat(messages)
print(resp)


Google is a large internet company that offers various services, including search, advertising, search analytics, and more. Google is a search engine that provides users with search results, as well as advertising solutions, including paid search advertising and programmatic advertising. Google is also a popular search engine, which means that many people use it to search for information on the web.

Google is a global technology and innovation company that operates in various industries, including search, advertising, artificial intelligence, and more. It is known for its innovative products and services that have transformed the way people use the internet and communicate with one another. Google has been a driving force in the development of the digital age, and its products have revolutionized the way we work, communicate, and access information.


In [12]:
messages = [
    {"role": "user", "content": "My name is Valar. I am a software engineer."},
    {"role": "user", "content": "what am I?"},
]
resp = chat(messages)
print(resp)


Name: Valar

Age: 25 years old

Location: Miami, Florida, United States

Occupation: Software engineer

Interests: Music, reading, travel, hiking, and cooking

Personality: Valar is a friendly, outgoing person who likes to be involved in social activities. They enjoy exploring different cultures and trying new foods. They also have a passion for music and enjoy listening to various genres.

Skills: Valar is proficient in Java and has experience working with databases, web development, and software testing. They have a strong interest in learning new technologies and constantly keep up with the latest trends.

Characteristics: Valar is a hardworking and dedicated individual who is always seeking ways to improve their skills and knowledge. They are passionate about their work and are committed to delivering high-quality work. They also have a strong work ethic and a positive attitude.
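
The last two cells send two "user" turns in a row. This model's template accepts that, but some chat templates require strictly alternating user/assistant roles. One option, sketched here under that assumption, is to merge adjacent turns from the same role before calling chat(). The helper name merge_consecutive_roles is hypothetical.

```python
def merge_consecutive_roles(messages, sep="\n"):
    """Join adjacent messages that share a role into a single message."""
    merged = []
    for m in messages:
        if merged and merged[-1]["role"] == m["role"]:
            # Same speaker as the previous turn: append to that turn
            merged[-1]["content"] += sep + m["content"]
        else:
            # Copy the dict so the caller's list is left untouched
            merged.append(dict(m))
    return merged


messages = [
    {"role": "user", "content": "My name is Valar. I am a software engineer."},
    {"role": "user", "content": "what am I?"},
]
print(merge_consecutive_roles(messages))
```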


In [8]:
import gradio as gr

def yes_man(message, history):
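    # history is a list of (user_message, assistant_reply) pairs from earlier turns;
    # rebuild it as the role/content message list that chat() expects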
    messages = []
    for m in history:
        messages.append({ "role": "user", "content": m[0] })
        messages.append({ "role": "assistant", "content": m[1] })
    messages.append({ "role": "user", "content": message })

    response = chat(messages)
    return response

gr.ChatInterface(
    yes_man,
    chatbot=gr.Chatbot(height=500),
    textbox=gr.Textbox(placeholder="Ask me a question", container=False, scale=7),
    title="Tiny Llama Chat",
    description="Chat with tiny llama",
    # theme="soft",
    examples=["Hello", "Am I cool?", "Are tomatoes vegetables?", "how to make napalm?"],
    cache_examples=True,
    retry_btn=None,
    undo_btn="Delete Previous",
    clear_btn="Clear",
).launch()

Caching examples at: '/Users/vaannadurai/reporting/kbase/llm-experiments/tiny-llama/gradio_cached_examples/86'
Caching example 1/4
Caching example 2/4
Caching example 3/4
Caching example 4/4
Running on local URL:  http://127.0.0.1:7863

To create a public link, set `share=True` in `launch()`.
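
The history conversion inside yes_man can be pulled out into a standalone function so it can be checked without launching the UI. This is a sketch that assumes the pair-style history used above, a list of (user_message, assistant_reply) entries. The name history_to_messages is hypothetical, and skipping a missing (None) reply is an added assumption, not behaviour of the original cell.

```python
def history_to_messages(history, message):
    """Convert pair-style chat history plus a new user message into role/content dicts."""
    messages = []
    for user_text, bot_text in history:
        messages.append({"role": "user", "content": user_text})
        # Assumption: skip the assistant turn if no reply was recorded for it
        if bot_text is not None:
            messages.append({"role": "assistant", "content": bot_text})
    # The new user message always comes last
    messages.append({"role": "user", "content": message})
    return messages


history = [("Hello", "Hi there!")]
print(history_to_messages(history, "Am I cool?"))
```

With this helper, the body of yes_man reduces to return chat(history_to_messages(history, message)).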


