# ModelArena WalkThrough Section 2 - Inference

In [1]:
import warnings

from model_arena import ModelArena

# Ignoring warnings is not good practice in general.
# However, bytedmysql emits warnings that it never resolves,
# so we silence them here to keep the output readable.
warnings.filterwarnings("ignore")
ma = ModelArena()
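
The blanket filter above silences every warning for the rest of the session. A narrower option, sketched here with only the standard library, is to suppress warnings just around the noisy call. `noisy_call` is a hypothetical stand-in for whatever triggers the warnings; it is not part of model_arena.

```python
import warnings


def noisy_call():
    """Hypothetical stand-in for a library call that emits a warning."""
    warnings.warn("deprecated option", DeprecationWarning)
    return "ok"


# Suppress warnings only inside this block; the previous filters
# are restored automatically when the block exits.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    result = noisy_call()

# Outside the block warnings are active again, so they can be recorded.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    noisy_call()
```

With this pattern, warnings raised elsewhere in the notebook stay visible.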

## Dataset and Model

We use a demo dataset and gpt-4-0613 to walk through the inference process.

In [2]:
dataset = "demo"
model = "gpt-4-0613"
model_path = ma.models.get_model_path(model)

In [3]:
ma.datasets.get(datasets=dataset)

Unnamed: 0,dataset_name,dataset_id,raw_dataset_id,tag,instruction,output
0,demo,a051970d3095432f967e68c3049313dd,a19b72fbaa5e4bc4a3405dfed904650d,nl2code,write a quick sort in python.,
1,demo,95d7220b5e814c2eadbabaab4decc4f7,2290052ff9ea4be8bb38095247713cf0,nl2code,write a bubble sort in c.,


## Define the LLMEngine

To have model_arena run inference for you, you must initialize an appropriate *LLMEngine* for your model framework.

**WIP**: In the future, model_arena will initialize the *LLMEngine* automatically using the framework registered in the model's meta information.
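
Until that lands, a similar dispatch can be done by hand. The sketch below is purely illustrative: the stub classes, the `ENGINE_REGISTRY` mapping, the `framework` key and the `build_engine` helper are all hypothetical names and do not reflect how model_arena stores or will use its meta information.

```python
from dataclasses import dataclass, field


@dataclass
class StubEngine:
    """Stand-in for an engine class; not part of model_arena."""

    model: str
    model_path: str
    generation_kwargs: dict = field(default_factory=dict)


class StubChatEngine(StubEngine):
    """Hypothetical engine for an API-backed chat model."""


class StubVLLMEngine(StubEngine):
    """Hypothetical engine for a locally served model."""


# Hypothetical mapping from a framework label to an engine class.
ENGINE_REGISTRY = {"chatgpt": StubChatEngine, "vllm": StubVLLMEngine}


def build_engine(meta: dict, **generation_kwargs) -> StubEngine:
    """Pick the engine class from meta["framework"] and instantiate it."""
    try:
        engine_cls = ENGINE_REGISTRY[meta["framework"]]
    except KeyError:
        raise ValueError(
            f"No engine registered for framework {meta.get('framework')!r}"
        ) from None
    return engine_cls(meta["name"], meta["path"], generation_kwargs)


engine_demo = build_engine(
    {"name": "some-model", "path": "/path/to/model", "framework": "vllm"},
    max_new_tokens=512,
)
```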

### BytedChatGPTEngine

BytedChatGPTEngine is an engine that calls ChatGPT models through ByteDance authentication.

In [4]:
# register the API token in the environment
import os

os.environ["BYTED_GPT_TOKEN"] = ""
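
The token above is left blank on purpose. To avoid hard-coding a secret in the notebook, one option is to export the variable in the shell before starting Jupyter and only check for it here. The helper below is a minimal sketch; `require_env` is a hypothetical name, not a model_arena function.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`.

    Raises RuntimeError if the variable is unset or blank, so a missing
    token fails early instead of at the first API call.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value
```

Calling `require_env("BYTED_GPT_TOKEN")` before building the engine would then surface a missing token immediately.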

In [5]:
from model_arena.core import BytedChatGPTEngine

engine = BytedChatGPTEngine(
    model=model,
    model_path=model_path,
    generation_kwargs={},
    show_progress=False
)

Just call the *infer* function and model_arena will do most of the work for you!

In [6]:
# set upload=False to inspect the inference results without uploading them
df = ma.infer(dataset=dataset, model=model, engine=engine, upload=False)
df

Unnamed: 0,dataset_name,dataset_id,model_id,prompt,output
0,demo,95d7220b5e814c2eadbabaab4decc4f7,b62e8a8ce26e4b3cb9e208be609c1a5d,write a bubble sort in c.,Here is an example implementation of bubble so...
1,demo,a051970d3095432f967e68c3049313dd,b62e8a8ce26e4b3cb9e208be609c1a5d,write a quick sort in python.,Here's an implementation of quick sort in Pyth...


In [7]:
# you can check the results manually and then upload them
# ma.add_inferences(df)

# if everything works as expected, you can run the whole
# process automatically
# ma.infer(dataset=dataset, model=model, engine=engine, upload=True)
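
Part of the manual check can be automated. The sketch below flags rows whose output is missing or blank; it works on plain dictionaries, which could be obtained from the debug DataFrame with `df.to_dict(orient="records")`. The helper name `find_empty_outputs` and the sample rows are made up for illustration; only the `dataset_id` and `output` column names come from the table above.

```python
def find_empty_outputs(records):
    """Return the dataset_id of every record whose output is missing or blank."""
    return [
        r["dataset_id"]
        for r in records
        if not isinstance(r.get("output"), str) or not r["output"].strip()
    ]


# Made-up rows standing in for df.to_dict(orient="records").
sample_records = [
    {"dataset_id": "id-1", "output": "Here is an implementation..."},
    {"dataset_id": "id-2", "output": "   "},
    {"dataset_id": "id-3", "output": None},
]

bad_ids = find_empty_outputs(sample_records)
```

An empty `bad_ids` list is a reasonable precondition before uploading.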

Once you have uploaded the inference results, you can always use the *get_inferences* method to retrieve your past inference results.

In [8]:
ma.get_inferences(datasets=dataset, models=model)

Unnamed: 0,dataset_name,dataset_id,tag,model_name,prompt,output
0,demo,95d7220b5e814c2eadbabaab4decc4f7,nl2code,gpt-4-0613,write a bubble sort in c.,Here's an implementation of bubble sort in C:\...
1,demo,a051970d3095432f967e68c3049313dd,nl2code,gpt-4-0613,write a quick sort in python.,Here's an implementation of quick sort in Pyth...


Let's change the model to *gpt-3.5-turbo-1106*.

In [9]:
model = "gpt-3.5-turbo-1106"
ma.infer(
    dataset=dataset,
    model=model,
    engine=BytedChatGPTEngine(model=model, model_path=ma.models.get_model_path(model), generation_kwargs={}),
    upload=False,
)

Unnamed: 0,dataset_name,dataset_id,model_id,prompt,output
0,demo,95d7220b5e814c2eadbabaab4decc4f7,dd078c34445049879fbcb5ae72f1d9d5,write a bubble sort in c.,Here is an implementation of bubble sort in C:...
1,demo,a051970d3095432f967e68c3049313dd,dd078c34445049879fbcb5ae72f1d9d5,write a quick sort in python.,Here is an implementation of quick sort in Pyt...


### HuggingFaceEngine

HuggingFaceEngine is an engine that calls the model through the Hugging Face text-generation pipeline.

In [None]:
from model_arena.core import HuggingFaceEngine

model = "deepseek-coder-6.7b-instruct"
model_path = ma.models.get_model_path(model)

engine = HuggingFaceEngine(model=model, model_path=model_path, generation_kwargs={"max_new_tokens": 512})

### VLLMEngine

VLLMEngine is an engine that calls the model through vLLM.

In [None]:
from model_arena.core import VLLMEngine

model = "deepseek-coder-6.7b-instruct-awq"
model_path = ma.models.get_model_path(model)

engine = VLLMEngine(model=model, model_path=model_path, generation_kwargs={"max_new_tokens": 512})

Let's retrieve all of these inference results!

In [10]:
ma.get_inferences(datasets=dataset, models="all")

Unnamed: 0,dataset_name,dataset_id,tag,model_name,prompt,output
0,demo,a051970d3095432f967e68c3049313dd,nl2code,deepseek-coder-6.7b-instruct,"You are an AI programming assistant, utilizing...","Sure, here is a simple implementation of the Q..."
1,demo,a051970d3095432f967e68c3049313dd,nl2code,gpt-4-0613,write a quick sort in python.,Here's an implementation of quick sort in Pyth...
2,demo,a051970d3095432f967e68c3049313dd,nl2code,deepseek-coder-6.7b-instruct-awq,"You are an AI programming assistant, utilizing...",Here is a basic implementation of Quick Sort i...
3,demo,a051970d3095432f967e68c3049313dd,nl2code,gpt-3.5-turbo-1106,write a quick sort in python.,Here's an implementation of quick sort in Pyth...
4,demo,95d7220b5e814c2eadbabaab4decc4f7,nl2code,deepseek-coder-6.7b-instruct,"You are an AI programming assistant, utilizing...","Sure, here is a simple implementation of Bubbl..."
5,demo,95d7220b5e814c2eadbabaab4decc4f7,nl2code,gpt-4-0613,write a bubble sort in c.,Here's an implementation of bubble sort in C:\...
6,demo,95d7220b5e814c2eadbabaab4decc4f7,nl2code,deepseek-coder-6.7b-instruct-awq,"You are an AI programming assistant, utilizing...","Sure, here is a basic implementation of Bubble..."
7,demo,95d7220b5e814c2eadbabaab4decc4f7,nl2code,gpt-3.5-turbo-1106,write a bubble sort in c.,Here is an implementation of bubble sort in C:...
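
The combined table interleaves rows from different models. To compare answers side by side, the rows can be grouped by sample. The sketch below uses plain dictionaries (for example from `to_dict(orient="records")` on the returned DataFrame); `outputs_by_sample` is a hypothetical helper and the rows are placeholders, not real inference results.

```python
from collections import defaultdict


def outputs_by_sample(records):
    """Group outputs as {dataset_id: {model_name: output}}."""
    grouped = defaultdict(dict)
    for r in records:
        grouped[r["dataset_id"]][r["model_name"]] = r["output"]
    return dict(grouped)


# Placeholder rows with the same column names as the table above.
rows = [
    {"dataset_id": "id-1", "model_name": "model-a", "output": "answer A1"},
    {"dataset_id": "id-1", "model_name": "model-b", "output": "answer B1"},
    {"dataset_id": "id-2", "model_name": "model-a", "output": "answer A2"},
]

comparison = outputs_by_sample(rows)
```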
