# Demo: Comparing Open-source LLMs

`pykoi` provides a simple API for comparing LLMs, including your own fine-tuned LLM, a pretrained LLM from Hugging Face, or the OpenAI/Anthropic/Bedrock APIs. This demo shows how to create and launch a comparison app for open-source LLMs from Hugging Face. Let's get started!

### Prerequisites
To run this Jupyter notebook, you need a `pykoi` environment with the `huggingface` option. You can follow [the installation guide](https://github.com/CambioML/pykoi/tree/install#option-2-rag-gpu) to set up the environment.

You may also need to run `pip install ipykernel` so that the environment can be used as a notebook kernel.


### (Optional) Developer setup
If you are a normal user of `pykoi`, you can skip this step. However, if you modify the pykoi code and want to test your changes, you can uncomment the code below.

In [1]:
# %reload_ext autoreload
# %autoreload 2

# import sys

# sys.path.append(".")
# sys.path.append("..")
# sys.path.append("../..")

### Import Libraries

In [2]:
from pykoi import Application
from pykoi.chat import ModelFactory
from pykoi.chat import QuestionAnswerDatabase
from pykoi.component import Chatbot, Dashboard, Compare

### Load LLMs
#### Creating a Hugging Face model (requires at least an EC2 `g4dn.xlarge` or a GPU with at least 16 GB of memory)

In [3]:
## requires a GPU with at least 16GB memory (e.g. g4dn.xlarge)
huggingface_model_1 = ModelFactory.create_model(
    model_source="huggingface",
    pretrained_model_name_or_path="tiiuae/falcon-rw-1b",
)

[HuggingfaceModel] loading model...
[HuggingfaceModel] loading tokenizer...


In [4]:
## requires a GPU with at least 16GB memory (e.g. g4dn.2xlarge)
huggingface_model_2 = ModelFactory.create_model(
    model_source="huggingface",
    pretrained_model_name_or_path="databricks/dolly-v2-3b",
)

[HuggingfaceModel] loading model...
[HuggingfaceModel] loading tokenizer...


In [5]:
## requires a GPU with at least 24GB memory (e.g. g5.2xlarge)
huggingface_model_3 = ModelFactory.create_model(
    model_source="huggingface",
    pretrained_model_name_or_path="tiiuae/falcon-7b",
)

[HuggingfaceModel] loading model...




Loading checkpoint shards: 100%|██████████| 2/2 [01:52<00:00, 56.27s/it]
Downloading generation_config.json: 100%|██████████| 117/117 [00:00<00:00, 1.03MB/s]


[HuggingfaceModel] loading tokenizer...


Downloading tokenizer_config.json: 100%|██████████| 287/287 [00:00<00:00, 2.68MB/s]
Downloading tokenizer.json: 100%|██████████| 2.73M/2.73M [00:00<00:00, 70.6MB/s]
Downloading (…)cial_tokens_map.json: 100%|██████████| 281/281 [00:00<00:00, 2.27MB/s]
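
The GPU requirements noted in the cells above grow with model size. As a rough sanity check before loading a model, the memory needed for the weights alone is approximately the parameter count times the bytes per parameter. The sketch below is only an estimate under stated assumptions: the helper `estimate_weight_gib` is hypothetical (not part of `pykoi`), the parameter counts are nominal values read off the model names, and the precision `pykoi` actually loads these models in is not stated in this demo. Activations and framework overhead come on top of these numbers.

```python
def estimate_weight_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory for the model weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


# Assumption: nominal sizes taken from the model names; real counts differ somewhat.
nominal_params = {
    "tiiuae/falcon-rw-1b": 1e9,
    "databricks/dolly-v2-3b": 3e9,
    "tiiuae/falcon-7b": 7e9,
}

for name, params in nominal_params.items():
    half = estimate_weight_gib(params, bytes_per_param=2)  # 16-bit weights
    full = estimate_weight_gib(params, bytes_per_param=4)  # 32-bit weights
    print(f"{name}: ~{half:.1f} GiB (16-bit), ~{full:.1f} GiB (32-bit)")
```

If the estimate for your chosen precision is close to the GPU's total memory, pick a larger instance or a smaller model.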


### Create a chatbot comparator

#### Add `nest_asyncio` 
Apply `nest_asyncio` before launching the app. A Jupyter notebook already runs an asyncio event loop, and `uvicorn.run()` calls `asyncio.run()`, which cannot be called from inside a running event loop. Without the patch, starting the server from a notebook cell raises a `RuntimeError`.

In [6]:
# !pip install -q nest_asyncio
import nest_asyncio
nest_asyncio.apply()
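
To see the problem that `nest_asyncio` works around, the following standard-library sketch reproduces the error outside of `pykoi`. It is meant to be run as a plain Python script, not inside the notebook: the outer `asyncio.run()` stands in for the notebook's already-running loop, and the hypothetical coroutine `start_server_like_uvicorn` stands in for a server launcher that calls `asyncio.run()` internally.

```python
import asyncio


async def start_server_like_uvicorn() -> str:
    # Stand-in for a launcher that calls asyncio.run() internally.
    coro = asyncio.sleep(0)
    try:
        asyncio.run(coro)  # we are already inside a running loop here
    except RuntimeError as err:
        return str(err)
    finally:
        coro.close()  # avoid a "coroutine was never awaited" warning
    return "no error"


# The outer asyncio.run() plays the role of Jupyter's running event loop.
message = asyncio.run(start_server_like_uvicorn())
print(message)
```

Calling `nest_asyncio.apply()` first patches asyncio so that this kind of nested call is allowed, which is why the cell above is needed before `app.run()`.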

In [7]:
# pass in a list of models to compare
chatbot_comparator = Compare(models=[huggingface_model_1, huggingface_model_2, huggingface_model_3])
# chatbot_comparator.add(openai_model_3)

#### Add ngrok auth (TODO: change to bash file)

In [8]:
# !ngrok config add-authtoken xxxxxxxx

In [9]:
app = Application(debug=False, share=False)
app.add_component(chatbot_comparator)
app.run()

INFO:     Started server process [11390]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:5000 (Press CTRL+C to quit)


The ranking chatbot is now running. Open the link printed above and you should see an interface similar to this:

<p align="center">
    <img src="../image/drag_and_rank_crop_2x.gif" width="75%" height="75%" />
</p>



You can also check the dashboard for your ranking of the model answers:

<p align="center">
    <img src="../image/comparisonDemoSmall_2x.gif" width="75%" height="75%" />
</p>