# Gai/Gen: Text-to-Code (TTC)

## 1.1 Setting Up

We will create a separate virtual environment for TTC to avoid conflicts between the dependencies that each underlying model requires.

```sh
conda create -n TTC python=3.10.10 -y
conda activate TTC
pip install -e ".[TTC]"
```

NOTE: The installation depends on `requirements_ttc.txt`, which is based on https://raw.githubusercontent.com/deepseek-ai/DeepSeek-Coder/main/requirements.txt

Download the DeepSeek Coder model with the command below.

In [None]:
%%bash
huggingface-cli download TheBloke/deepseek-coder-6.7B-instruct-GPTQ \
        --local-dir ~/gai/models/deepseek-coder-6.7b-instruct \
        --local-dir-use-symlinks False
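
After the download finishes, it can help to confirm that the files landed where expected. The following is a minimal sketch, not part of Gai/Gen: the helper name `missing_model_files` is hypothetical, `model.safetensors` is taken from the load log later in this document, and `config.json` is an assumption based on the usual Hugging Face layout.

```python
from pathlib import Path

# Assumption: the loader needs at least these files in the model directory.
# "model.safetensors" appears in the load log; "config.json" is assumed
# from the usual Hugging Face repository layout.
REQUIRED_FILES = ("model.safetensors", "config.json")


def missing_model_files(model_dir, required=REQUIRED_FILES):
    """Return the required file names that are absent from model_dir."""
    root = Path(model_dir).expanduser()
    return [name for name in required if not (root / name).is_file()]


if __name__ == "__main__":
    # Directory passed to --local-dir in the download command above.
    model_dir = "~/gai/models/deepseek-coder-6.7b-instruct"
    missing = missing_model_files(model_dir)
    if missing:
        print(f"Missing from {model_dir}: {', '.join(missing)}")
    else:
        print("Model directory looks complete.")
```

An empty list means every expected file is present; anything else names the files to download again.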

Install auto-gptq.

In [None]:
%%bash
pip install auto-gptq

---

### Create a new virtual environment TTC

The TTC environment is based on the TTT environment, except that it uses auto-gptq instead of exllama.

Refer to `requirements_ttc.txt` for the dependency list.

### config

```json
        "deepseek-gptq": {
            "type": "ttc",
            "model_name": "deepseek-coder-6.7B",
            "engine": "Deepseek_TTC",
            "model_path": "models/deepseek-coder-6.7B-instruct-GPTQ",
        ...
```
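
To illustrate how an entry like this might be consumed, here is a small sketch that parses a trimmed-down copy of the section and resolves `model_path` to an absolute path. Everything beyond the four keys shown above is an assumption: the `resolve_model_path` helper is hypothetical, and the `~/gai` base directory is only a guess for illustration. Note that strict JSON does not allow a trailing comma after the last key, so the sketch omits it.

```python
import json
from pathlib import Path

# Hypothetical, trimmed-down config: only the four keys shown above.
# The real section has further keys that are elided in this document.
CONFIG_TEXT = """
{
    "deepseek-gptq": {
        "type": "ttc",
        "model_name": "deepseek-coder-6.7B",
        "engine": "Deepseek_TTC",
        "model_path": "models/deepseek-coder-6.7B-instruct-GPTQ"
    }
}
"""


def resolve_model_path(config, name, base_dir="~/gai"):
    """Look up a generator entry and return its model path under base_dir.

    base_dir="~/gai" is an assumption made for this example.
    """
    entry = config[name]
    return Path(base_dir).expanduser() / entry["model_path"]


config = json.loads(CONFIG_TEXT)
entry = config["deepseek-gptq"]
print(entry["type"], entry["engine"])
print(resolve_model_path(config, "deepseek-gptq"))
```

The key of the section (`deepseek-gptq`) is the name later passed to `load()`.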


### Create ttc dir and TTC generator

1. Create a new directory called `ttc`.
2. Create `ExLlama2_TTC.py` based on `ExLlama_TTT.py`. The only difference is that it imports ExLlama2 instead of ExLlama.
3. Create the `TTC.py` generator based on `TTT.py`. Update its references to point to the new config section "deepseek-exllama" and the ExLlama2 generator.
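
One way to picture the relationship between the generator and the engine named in the config is a small name-to-class lookup. The sketch below is purely illustrative and is not the Gai/Gen source: `TTCSketch`, `DeepseekTTCStub`, and the `ENGINES` table are hypothetical names, and the stub echoes its input instead of running a model. It uses the engine name `Deepseek_TTC` that appears in the config and logs in this document.

```python
# Hypothetical stand-in for an engine class. A real engine would wrap the
# quantized model; this stub only echoes what it was given.
class DeepseekTTCStub:
    def __init__(self, config):
        self.config = config

    def create(self, messages, **model_params):
        return f"[{self.config['model_name']}] {len(messages)} message(s)"


# Assumed design: map the config's "engine" value to an engine class.
ENGINES = {"Deepseek_TTC": DeepseekTTCStub}


class TTCSketch:
    """Illustrative dispatcher that picks an engine from a config entry."""

    def __init__(self, config):
        engine_name = config["engine"]
        if engine_name not in ENGINES:
            raise ValueError(f"Unknown ttc engine: {engine_name}")
        self.engine = ENGINES[engine_name](config)

    def create(self, **kwargs):
        # Forward the call unchanged to the selected engine.
        return self.engine.create(**kwargs)


ttc = TTCSketch({"type": "ttc", "model_name": "deepseek-coder-6.7B", "engine": "Deepseek_TTC"})
print(ttc.create(messages=[{"role": "USER", "content": "hi"}], max_new_tokens=100))
```

With a lookup like this, supporting another engine would only need a new class and one more table entry.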

## 1.2 Running as a Library

In [1]:
# Load Instance
from gai.gen import Gaigen
gen = Gaigen.GetInstance().load('deepseek-gptq')
response = gen.create(
    messages=[{'role': 'USER', 'content': 'Create a quick sort function in python.'},
              {'role': 'ASSISTANT', 'content': ''}],
    max_new_tokens=100,
    stream=False)

2024-04-21 15:39:57 INFO gai.gen.Gaigen:Gaigen: Loading generator deepseek-gptq...
2024-04-21 15:39:57 INFO gai.gen.ttc.TTC:Using ttc model Deepseek_TTC...
2024-04-21 15:39:57 INFO gai.gen.ttc.Deepseek_TTC:Loading model from /home/roylai/gai/deepseek-coder-6.7B-instruct-GPTQ/model.safetensors
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
INFO - The layer lm_head is not quantized.
2024-04-21 15:39:58 INFO auto_gptq.modeling._base:The layer lm_head is not quantized.
2024-04-21 15:40:12 DEBUG gai.gen.ttc.Deepseek_TTC:Deepseek_TTC.create: model_params={'temperature': 1.2, 'top_p': 0.15, 'min_p': 0.0, 'top_k': 50, 'max_new_tokens': 100, 'typical': 0.0, 'token_repetition_penalty_max': 1.25, 'token_repetition_penalty_sustain': 25

In [3]:
# Load Instance
from gai.gen import Gaigen
gen = Gaigen.GetInstance().load('deepseek-gptq')
response = gen.create(
    messages=[{'role': 'USER', 'content': 'Create a python websocket server using fastAPI and connect to it using python websockets.'},
              {'role': 'ASSISTANT', 'content': ''}],
    max_new_tokens=4000,
    stream=False)

2024-04-21 15:42:09 DEBUG gai.gen.Gaigen:Gaigen.load: Generator is already loaded. Skip loading.
2024-04-21 15:42:09 DEBUG gai.gen.ttc.Deepseek_TTC:Deepseek_TTC.create: model_params={'temperature': 1.2, 'top_p': 0.15, 'min_p': 0.0, 'top_k': 50, 'max_new_tokens': 4000, 'typical': 0.0, 'token_repetition_penalty_max': 1.25, 'token_repetition_penalty_sustain': 256, 'token_repetition_penalty_decay': 128, 'beams': 1, 'beam_length': 1}
2024-04-21 15:42:09 DEBUG gai.gen.ttc.Deepseek_TTC:Deepseek_TTC: prompt=USER: Create a python websocket server using fastAPI and python websocket client that connects to it.
ASSISTANT:
2024-04-21 15:42:09 DEBUG gai.gen.ttc.Deepseek_TTC:Deepseek_TTC: {'max_new_tokens': 4000, 'do_sample': True, 'early_stopping': False, 'encoder_repetition_penalty': 1, 'eos_token_id': 32021, 'length_penalty': 1, 'logits_processor': [], 'min_length': 0, 'no_repeat_ngram_size': 0, 'num_beams': 1, 'penalty_alpha': 0, 'repetition_penalty': 1.17, 'temper
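
The debug log prints the prompt as a `USER: ...` line followed by a bare `ASSISTANT:` line, which suggests that the `messages` list is flattened into role-prefixed text before generation. The sketch below shows one way such a flattening step could look. It is an assumption inferred from the log output, not the actual Deepseek_TTC code, and `format_prompt` is a hypothetical name.

```python
def format_prompt(messages):
    """Flatten role/content messages into 'ROLE: content' lines.

    A message with empty content (the trailing ASSISTANT turn) is rendered
    as 'ROLE:' so that the model continues from that point.
    """
    lines = []
    for message in messages:
        role, content = message["role"], message["content"]
        lines.append(f"{role}: {content}" if content else f"{role}:")
    return "\n".join(lines)


messages = [
    {"role": "USER", "content": "Create a quick sort function in python."},
    {"role": "ASSISTANT", "content": ""},
]
print(format_prompt(messages))
```

This is why the examples above end the `messages` list with an empty `ASSISTANT` entry: it marks where the model's reply should begin.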