# Gorilla Inference

## Get Started

You can either run Gorilla through our hosted Colab or chat with it using the CLI. We also provide instructions for evaluating batched prompts. Here are the instructions to run it locally.

New: We release gorilla-mpt-7b-hf-v0 and gorilla-falcon-7b-hf-v0, two Apache 2.0-licensed models (commercially usable).

gorilla-7b-hf-v0 is the first set of weights we released 🎉 It chooses from 925 HF APIs in a 0-shot fashion (without any retrieval). Update: We released gorilla-7b-th-v0 with 94 (exhaustive) APIs from Torch Hub and gorilla-7b-tf-v0 with 626 (exhaustive) APIs from TensorFlow. In the spirit of openness, we do not filter or post-process either the prompt or the response 🎁 Keep in mind that the current gorilla-7b-* models do not have any generic chat capability. We do have a model with all 1600+ APIs that also has chat capability, which we are releasing slowly to accommodate server demand.

All Gorilla weights are hosted at https://huggingface.co/gorilla-llm/.
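If you prefer to script the download rather than use the Hugging Face web UI, here is a minimal sketch using the `huggingface_hub` library (assuming it is installed, e.g. via `pip install huggingface_hub`); the repo id below is one of the published models:

```python
# Minimal sketch: download a Gorilla checkpoint from the Hugging Face Hub.
# Assumes `pip install huggingface_hub`; the repo id matches the hosted weights.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="gorilla-llm/gorilla-falcon-7b-hf-v0")
print(f"Model files downloaded to: {local_dir}")
```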

## Install Dependencies

Install the dependencies with the following commands:

```
conda create -n gorilla python=3.10
conda activate gorilla
pip install -r requirements.txt
```

We release the weights for gorilla-mpt-7b-hf-v0 and gorilla-falcon-7b-hf-v0 on Hugging Face, so you can download them directly. For the LLaMA-finetuned models, we release the weights as a delta to comply with the LLaMA model license. You can apply the delta weights using the commands below.

## Downloading Gorilla Delta Weights

We release the delta weights of Gorilla to comply with the LLaMA model license. You can prepare the Gorilla weights using the following steps:

  1. Get the original LLaMA weights using the link here.
  2. Download the Gorilla delta weights from our Hugging Face.

## Applying Delta Weights

Run the following Python command to apply the delta weights to your LLaMA model:

```
python3 apply_delta.py \
  --base-model-path path/to/hf_llama/ \
  --target-model-path path/to/gorilla-7b-hf-v0 \
  --delta-path path/to/models--gorilla-llm--gorilla-7b-hf-delta-v0
```
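For intuition, delta weights are element-wise offsets from the base model: the finetuned weights are recovered by adding each delta tensor to the corresponding LLaMA tensor. Below is a hedged sketch of that general idea, not the repo's actual apply_delta.py, assuming both checkpoints are Hugging Face causal-LM checkpoints with matching parameter layouts:

```python
# Illustrative sketch of delta-weight merging; see apply_delta.py for the real script.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained(
    "path/to/hf_llama/", torch_dtype=torch.float16
)
delta = AutoModelForCausalLM.from_pretrained(
    "path/to/models--gorilla-llm--gorilla-7b-hf-delta-v0", torch_dtype=torch.float16
)

# gorilla = llama + delta, tensor by tensor (in-place on the delta model).
base_state = base.state_dict()
for name, param in delta.state_dict().items():
    param.data += base_state[name]

delta.save_pretrained("path/to/gorilla-7b-hf-v0")
AutoTokenizer.from_pretrained("path/to/hf_llama/").save_pretrained(
    "path/to/gorilla-7b-hf-v0"
)
```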

## Inference using CLI

Simply run the command below to start chatting with Gorilla:

```
python3 serve/gorilla_cli.py --model-path path/to/gorilla-7b-{hf,th,tf}-v0
```

For the falcon-7b model, you can use the following command:

```
python3 serve/gorilla_falcon_cli.py --model-path path/to/gorilla-falcon-7b-hf-v0
```
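If you want to call the model from Python rather than the CLI, a minimal transformers sketch might look like the following. The plain-question prompt here is an assumption; the CLI may apply its own prompt formatting internally:

```python
# Hedged sketch: programmatic inference with transformers instead of the CLI.
# The prompt formatting is an assumption; gorilla_cli.py may wrap prompts differently.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "path/to/gorilla-falcon-7b-hf-v0"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float16, device_map="auto"
)

prompt = "I want to generate image from text."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```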

## [Optional] Batch Inference on a Prompt File

After downloading the model, create a JSONL file containing all the questions you want to run through Gorilla. Here is one example:

{"question_id": 1, "text": "I want to generate image from text."}
{"question_id": 2, "text": "I want to generate text from image."}

After that, use the following command to get the results:

```
python3 gorilla_eval.py --model-path path/to/gorilla-7b-hf-v0 \
  --question-file path/to/questions.jsonl \
  --answer-file path/to/answers.jsonl
```

You can use your own questions to get Gorilla responses. We also provide the set of questions that we used for evaluation.
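To inspect the results, the answer file can be read back line by line. This sketch assumes each answer line carries at least a question_id and a text field, which may differ from the script's actual output schema:

```python
# Hedged sketch: read the generated answers back from the JSONL file.
# The field names are assumptions; check gorilla_eval.py's actual output schema.
import json

with open("answers.jsonl") as f:
    for line in f:
        answer = json.loads(line)
        print(answer.get("question_id"), answer.get("text"))
```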