# Gai/Gen: Text-to-Text Generation using Gai with ExLlama

This is useful for running an LLM locally on a PC with a consumer-grade graphics card.

## Setting Up

1. Create a conda environment called `TTT2`, if not already created, and install the dependencies:

    ```bash
    sudo apt update -y && sudo apt install ffmpeg git git-lfs -y
    conda create -n TTT2 python=3.10.10 -y
    conda activate TTT2
    cd ../../gai-gen
    pip install -e ".[TTT2]"
    ```

2. Download the quantized Llama 3 8B Instruct and Mistral 7B Instruct v0.3 models (EXL2 format) into the `~/gai/models` directory.


In [None]:
%%bash
huggingface-cli download turboderp/Llama-3-8B-Instruct-exl2 \
    --revision 97d15f8fb9808afd51d7bccc3c3204ef3714f65a \
    --local-dir ~/gai/models/Llama-3-8B-Instruct-exl2 \
    --local-dir-use-symlinks False


In [None]:
%%bash
huggingface-cli download bartowski/Mistral-7B-Instruct-v0.3-exl2 \
    --revision 1a09a351a5fb5a356102bfca2d26507cdab11111 \
    --local-dir ~/gai/models/Mistral-7B-Instruct-v0.3-exl2 \
    --local-dir-use-symlinks False
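
Before loading a generator, it can help to confirm that both downloads landed where expected. The following is a minimal sketch, not part of Gai: `missing_models` is a hypothetical helper, the directory names are copied from the `--local-dir` arguments above, and it assumes each model repository has a `config.json` at its top level. Note that the logs further down load Llama 3 from `Meta-Llama-3-8B-Instruct-EXL2`, so the names may need adjusting to match the paths configured in `gai.json`.

```python
from pathlib import Path

# Directory names copied from the --local-dir arguments of the download commands.
MODEL_DIRS = ["Llama-3-8B-Instruct-exl2", "Mistral-7B-Instruct-v0.3-exl2"]


def missing_models(models_root, names=MODEL_DIRS):
    """Return the names whose directory is absent or has no config.json.

    Assumption: a complete download contains a top-level config.json.
    """
    root = Path(models_root).expanduser()
    return [name for name in names if not (root / name / "config.json").is_file()]


if __name__ == "__main__":
    missing = missing_models("~/gai/models")
    print("Missing models:", missing if missing else "none")
```

An empty list means both directories look complete under the stated assumption.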


---
## Chat Completion


### Generating

In [5]:
from gai.gen import Gaigen
gen = Gaigen.GetInstance("../../../gai-gen/gai.json").load('llama3-exllama2')
response = gen.create(messages=[
    {'role':'system','content':'You are a helpful assistant that can generate short stories. You can generate a short story based on a prompt.'},
    {'role':'user','content':'Tell me a one paragraph short story.'},
    {'role':'assistant','content':''}], max_tokens=1000,stream=False)
gen.unload()
response

2024-06-08 23:21:11 INFO gai.gen.Gaigen:Gaigen: Loading generator llama3-exllama2...
2024-06-08 23:21:11 INFO gai.gen.ttt.TTT:Using engine ExLlamaV2_TTT...
2024-06-08 23:21:11 INFO gai.gen.ttt.TTT:Loading model from models/Meta-Llama-3-8B-Instruct-EXL2
2024-06-08 23:21:11 INFO gai.gen.ttt.ExLlamav2_TTT:ExLlama_TTT2.load: Loading model from /home/roylai/gai/models/Meta-Llama-3-8B-Instruct-EXL2


ChatCompletion(id='chatcmpl-326f0d90-c6af-4dfe-ac55-3bc76ab24196', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='As the last rays of sunlight faded from the sky, a lone figure emerged from the dense forest, her eyes fixed on the crumbling mansion looming before her. The wind whispered secrets in her ear, and she felt an inexplicable pull towards the ancient structure, as if the very stones held a tale waiting to be unearthed. With each step, the creaking of the wooden floorboards beneath her feet seemed to echo through the stillness, like a gentle summons to explore the mysteries within. And yet, as she pushed open the creaking door, a faint light flickered to life, illuminating the dusty halls, revealing the secrets of the forgotten mansion, and beckoning her deeper into its depths.', role='assistant', function_call=None, tool_calls=None))], created=1717860093, model='llama3-exllama2', object='chat.completion', system_fingerprint=

In [4]:
from gai.gen import Gaigen
gen = Gaigen.GetInstance("../../../gai-gen/gai.json").load('mistral7b-exllama2')
response = gen.create(messages=[
    {'role':'system','content':'You are a helpful assistant that can generate short stories. You can generate a short story based on a prompt.'},
    {'role':'user','content':'Tell me a one paragraph short story.'},
    {'role':'assistant','content':''}], max_tokens=1000,stream=False)
gen.unload()
response

2024-06-08 23:20:20 INFO gai.gen.Gaigen:Gaigen: Loading generator mistral7b-exllama2...
2024-06-08 23:20:20 INFO gai.gen.ttt.TTT:Using engine ExLlamaV2_TTT...
2024-06-08 23:20:20 INFO gai.gen.ttt.TTT:Loading model from models/Mistral-7B-Instruct-v0.3-exl2
2024-06-08 23:20:21 INFO gai.gen.ttt.ExLlamav2_TTT:ExLlama_TTT2.load: Loading model from /home/roylai/gai/models/Mistral-7B-Instruct-v0.3-exl2


ChatCompletion(id='chatcmpl-d017a330-6793-48ef-b901-3e2145779b8d', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="In the quiet town of Meadowbrook, a peculiar little store called The Curiosity Shop was nestled between quaint houses. The shop, with its old wooden sign and cobweb-covered windows, was known for its mysterious allure. One day, a young girl named Lily wandered in, her eyes wide with curiosity. As she explored the shop, she stumbled upon a small, antique music box. When she opened it, a beautiful melody filled the quiet room, and the world outside seemed to pause. As the melody ended, a golden key appeared within the box. The key was said to unlock the door to an enchanted garden, but only the one who truly loved the music could find it. Lily closed her eyes, listened to the melody once more, and felt a warm, golden light fill the room. When she opened her eyes, the key was in her hand. With a heart full of hope, she foll
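
The printed `ChatCompletion` objects carry the generated text in `choices[0].message.content`, alongside a `finish_reason`. A minimal sketch of pulling those two values out is shown here. `completion_text` is a hypothetical helper, and `fake_response` is a stand-in built with `SimpleNamespace` that only mimics the attribute layout visible in the output above; in the notebook, the object returned by `gen.create(...)` would be passed instead.

```python
from types import SimpleNamespace


def completion_text(response):
    """Return (text, finish_reason) for the first choice of a chat completion."""
    choice = response.choices[0]
    # content may be None or empty, so normalise it to an empty string.
    return (choice.message.content or ""), choice.finish_reason


# Stand-in with the same attribute layout as the ChatCompletion printed above.
fake_response = SimpleNamespace(
    choices=[
        SimpleNamespace(
            finish_reason="stop",
            index=0,
            message=SimpleNamespace(role="assistant", content="Once upon a time..."),
        )
    ]
)

text, reason = completion_text(fake_response)
print(f"[{reason}] {text}")
```

Checking `finish_reason` is a cheap way to notice when a story stopped for a reason other than `'stop'`, for example because `max_tokens` was reached.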

### Streaming

In [3]:
from gai.gen import Gaigen
gen = Gaigen.GetInstance("../../../gai-gen/gai.json").load('llama3-exllama2')
response = gen.create(messages=[
    {'role':'user','content':'Tell me a one paragraph short story.'},
    {'role':'assistant','content':''}
    ], max_tokens=1000,stream=True)
for chunk in response:
    chunk=chunk.choices[0].delta.content
    if chunk:
        print(chunk, end='', flush=True)    
print()
gen.unload()


2024-06-08 23:19:47 INFO gai.gen.Gaigen:Gaigen: Loading generator llama3-exllama2...
2024-06-08 23:19:47 INFO gai.gen.ttt.TTT:Using engine ExLlamaV2_TTT...
2024-06-08 23:19:47 INFO gai.gen.ttt.TTT:Loading model from models/Meta-Llama-3-8B-Instruct-EXL2
2024-06-08 23:19:47 INFO gai.gen.ttt.ExLlamav2_TTT:ExLlama_TTT2.load: Loading model from /home/roylai/gai/models/Meta-Llama-3-8B-Instruct-EXL2


As the last rays of sunlight faded from the small town of Willow Creek, a lone figure emerged from the shadows. It was Emily, a young woman with a mop of curly brown hair and a smile that could light up the darkest of rooms. She had returned to her hometown after years away, seeking solace in the familiar streets and sounds of her childhood. But as she walked down Main Street, the once-familiar buildings now seemed to loom over her like giants, their windows like empty eyes staring back. Emily quickened her pace, her heart pounding in her chest, until she reached the old oak tree where she had carved her initials with her high school sweetheart all those years ago. As she touched the scars, a warm breeze rustled the leaves, and for a moment, she felt the weight of her past lift, like the first whispered promise of a new beginning.


<gai.gen.Gaigen.Gaigen at 0x7f7d3ce1d540>

In [1]:
from gai.gen import Gaigen
gen = Gaigen.GetInstance("../../../gai-gen/gai.json").load('mistral7b-exllama2')
response = gen.create(messages=[
    {'role':'user','content':'Tell me a one paragraph short story.'},
    {'role':'assistant','content':''}
    ], max_tokens=1000,stream=True)
for chunk in response:
    chunk=chunk.choices[0].delta.content
    if chunk:
        print(chunk, end='', flush=True)
print()
gen.unload()

2024-06-08 23:17:59 INFO gai.gen.Gaigen:Gaigen: Loading generator mistral7b-exllama2...
2024-06-08 23:17:59 INFO gai.gen.ttt.TTT:Using engine ExLlamaV2_TTT...
2024-06-08 23:17:59 INFO gai.gen.ttt.TTT:Loading model from models/Mistral-7B-Instruct-v0.3-exl2
2024-06-08 23:17:59 INFO gai.gen.ttt.ExLlamav2_TTT:ExLlama_TTT2.load: Loading model from /home/roylai/gai/models/Mistral-7B-Instruct-v0.3-exl2


Once upon a time, in a small coastal town, a young fisherman named Juan discovered a mysterious, glowing stone while mending his net. Overwhelmed by the peculiar artifact, he sought help from the wise old lighthouse keeper, who revealed it was the legendary Sea Stone, said to summon the mighty sea serpent Leviathan. Fearing the unknown power, Juan embarked on a journey to find the ancient seer who could guide him, hoping to harness the stone's power to protect his beloved town and sea creatures.




<gai.gen.Gaigen.Gaigen at 0x7f7d3ce1d540>
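
The streaming loops above print each delta and then discard it. If the full text is also needed afterwards, the deltas can be collected while printing. The sketch below assumes only the chunk layout used in those loops (`chunk.choices[0].delta.content`); `collect_stream` and `fake_stream` are hypothetical names, and `fake_stream` stands in for the iterator returned by `gen.create(..., stream=True)`.

```python
from types import SimpleNamespace


def collect_stream(chunks, echo=True):
    """Optionally print each delta as it arrives and return the joined text."""
    parts = []
    for chunk in chunks:
        piece = chunk.choices[0].delta.content
        if piece:  # skip None or empty deltas
            parts.append(piece)
            if echo:
                print(piece, end="", flush=True)
    if echo:
        print()
    return "".join(parts)


def fake_stream(pieces):
    """Hypothetical stand-in yielding chunks shaped like the ones above."""
    for piece in pieces:
        delta = SimpleNamespace(content=piece)
        yield SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


story = collect_stream(fake_stream(["Once ", "upon ", None, "a time."]))
```

Using a separate name for the delta text (`piece`) also avoids rebinding the loop variable `chunk`, which makes the loop a little easier to follow.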

---
## JSON Mode

In [None]:
from lmformatenforcer import JsonSchemaParser
from pydantic import BaseModel
from gai.gen import Gaigen
gen = Gaigen.GetInstance("../../../gai-gen/gai.json").load('mistral7b-exllama2')

# Define Schema
class Book(BaseModel):
    title: str
    summary: str
    author: str
    published_year: int

text = """Foundation is a science fiction novel by American writer
Isaac Asimov. It is the first published in his Foundation Trilogy (later
expanded into the Foundation series). Foundation is a cycle of five
interrelated short stories, first published as a single book by Gnome Press
in 1951. Collectively they tell the early story of the Foundation,
an institute founded by psychohistorian Hari Seldon to preserve the best
of galactic civilization after the collapse of the Galactic Empire.
"""

response = gen.create(
    messages=[
        {'role':'USER','content':text},
        {'role':'ASSISTANT','content':''}],
    schema=Book.schema(),
    max_tokens=1000,
    stream=False)
print(response)
gen.unload()
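
The cell above prints the raw response. To use the result as data, the JSON text has to be parsed and checked against the schema. The following is a standard-library sketch of that step, under the assumption that the JSON string is available as `response.choices[0].message.content`; `parse_book` and `BOOK_FIELDS` are hypothetical names that mirror the `Book` model, and `sample` is hand-written from the passage above rather than actual model output.

```python
import json

# Field names and types mirror the Book model defined above.
BOOK_FIELDS = {"title": str, "summary": str, "author": str, "published_year": int}


def parse_book(raw):
    """Parse a JSON string and verify it has every Book field with the right type."""
    data = json.loads(raw)
    for name, expected in BOOK_FIELDS.items():
        if name not in data:
            raise ValueError(f"missing field: {name}")
        value = data[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            raise ValueError(f"{name} should be of type {expected.__name__}")
    return data


# Hand-written sample in the shape the schema describes (not real model output).
sample = json.dumps({
    "title": "Foundation",
    "summary": "A cycle of five interrelated short stories.",
    "author": "Isaac Asimov",
    "published_year": 1951,
})

book = parse_book(sample)
print(book["title"], book["published_year"])
```

Since `Book` is already a pydantic model, the same check can likely be done with pydantic itself (`Book.model_validate_json` in pydantic v2, `Book.parse_raw` in v1), which also handles nested and optional fields.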

---
## Function Calling

In [None]:
from gai.gen import Gaigen
gen = Gaigen.GetInstance("../../../gai-gen/gai.json").load('mistral7b-exllama2')
tools = [
    {
        "type": "function",
        "function": {
            "name": "google",
            "description": "The 'google' function is a powerful tool that allows the AI to gather external information from the internet using Google search. It can be invoked when the AI needs to answer a question or provide information that requires up-to-date, comprehensive, and diverse sources which are not inherently known by the AI. For instance, it can be used to find current date, current news, weather updates, latest sports scores, trending topics, specific facts, or even the current date and time. The usage of this tool should be considered when the user's query implies or explicitly requests recent or wide-ranging data, or when the AI's inherent knowledge base may not have the required or most current information. The 'search_query' parameter should be a concise and accurate representation of the information needed.",
            "parameters": {
                "type": "object",
                "properties": {
                    "search_query": {
                        "type": "string",
                        "description": "The search query to search google with. For example, to find the current date or time, use 'current date' or 'current time' respectively."
                    }
                },
                "required": ["search_query"]
            }
        }
    }
]

from gai.common.notebook import highlight

highlight("Model decided to use tool: ")
user_prompt = "What time is it in Singapore right now?"
response = gen.create(
    messages=[
        {'role':'user','content':user_prompt},
        {'role':'assistant','content':''}],
    tools=tools,
    stream=False,
    max_new_tokens=200)
print(response)

highlight("Model decided not to use tool: ")
user_prompt = "Tell me a one paragraph story."
response = gen.create(
    messages=[
        {'role':'user','content':user_prompt},
        {'role':'assistant','content':''}],
    tools=tools,
    stream=False,
    max_new_tokens=200)
print(response)
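
Once the model decides to use a tool, the caller still has to run it. The sketch below shows one way to dispatch tool calls, assuming the response follows the OpenAI-style layout suggested by the `tool_calls` field in the outputs above: each entry exposes `function.name` and `function.arguments`, with the arguments encoded as a JSON string. That layout is an assumption about Gai's output, and `google`, `TOOL_REGISTRY`, `run_tool_calls` and the two stand-in responses are all hypothetical.

```python
import json
from types import SimpleNamespace


def google(search_query):
    """Hypothetical local implementation of the 'google' tool."""
    return f"(search results for: {search_query})"


# Maps tool names declared in `tools` to local callables.
TOOL_REGISTRY = {"google": google}


def run_tool_calls(response, registry=TOOL_REGISTRY):
    """Run every tool call in the first choice; return [] when there are none."""
    message = response.choices[0].message
    results = []
    for call in message.tool_calls or []:
        func = registry[call.function.name]
        kwargs = json.loads(call.function.arguments)  # assumed to be a JSON string
        results.append(func(**kwargs))
    return results


# Stand-ins for the two cases above: a tool call and a plain text answer.
tool_response = SimpleNamespace(choices=[SimpleNamespace(message=SimpleNamespace(
    content=None,
    tool_calls=[SimpleNamespace(function=SimpleNamespace(
        name="google",
        arguments='{"search_query": "current time Singapore"}'))]))])
text_response = SimpleNamespace(choices=[SimpleNamespace(message=SimpleNamespace(
    content="A one paragraph story.", tool_calls=None))])

print(run_tool_calls(tool_response))
print(run_tool_calls(text_response))
```

Keeping the registry separate from the `tools` schema means a tool name the model invents raises a `KeyError` instead of silently running something unexpected.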
