# Query LLMs with Aviary

Anyscale Aviary is a library for serving and querying open-source LLMs.

This tutorial shows you how to use Aviary to query and chat with LLMs.

For this tutorial, we have already set up an Aviary backend. You can use the URL and token below to query it.

You can view the Aviary source here: https://github.com/ray-project/aviary.

## Install

Install the Aviary SDK with the command below, then set the `AVIARY_URL` and
`AVIARY_TOKEN` environment variables so that the SDK points to the correct backend.

In [1]:
!pip install "aviary @ git+https://github.com/ray-project/aviary.git@sdk_update"
%env AVIARY_URL=https://aviary-oss-backend-primary-hackathon-nh1z6.cld-ldm5ez4edlp7yh4y.s.anyscaleuserdata.com
%env AVIARY_TOKEN=aviary++5015ff37-3891-46bd-b066-fe224e48f8fc

Collecting aviary@ git+https://github.com/ray-project/aviary.git@sdk_update
  Cloning https://github.com/ray-project/aviary.git (to revision sdk_update) to /private/var/folders/4j/z6dzqmms4xq0hsbh_7lx59f40000gn/T/pip-install-vqfajt4c/aviary_f65d3b3422b64fe288061b0caa105a16
  Running command git clone --filter=blob:none --quiet https://github.com/ray-project/aviary.git /private/var/folders/4j/z6dzqmms4xq0hsbh_7lx59f40000gn/T/pip-install-vqfajt4c/aviary_f65d3b3422b64fe288061b0caa105a16
  Running command git checkout -b sdk_update --track origin/sdk_update
  Switched to a new branch 'sdk_update'
  branch 'sdk_update' set up to track 'origin/sdk_update'.
  Resolved https://github.com/ray-project/aviary.git to commit 97bfec84c4c8fe3d86e459ebbd9b41e66c938367
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Building wheels for collected packages: aviary
  Building wheel for avia
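
Outside a notebook, the `%env` magics are not available. Here is a minimal sketch of the same configuration in plain Python, assuming the SDK reads `AVIARY_URL` and `AVIARY_TOKEN` from the process environment when it connects. `configure_aviary` is a hypothetical helper, not part of the SDK, and the URL and token are placeholders.

```python
import os


def configure_aviary(url: str, token: str) -> None:
    """Export the two settings that this tutorial sets with %env (hypothetical helper)."""
    os.environ["AVIARY_URL"] = url
    os.environ["AVIARY_TOKEN"] = token


# Placeholder values; substitute the URL and token of your own backend.
configure_aviary("https://example.invalid/aviary", "my-token")
print(os.environ["AVIARY_URL"])
```

Setting the variables before the first SDK call keeps the rest of the tutorial code unchanged.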

## Query the LLM

Now that we have configured Aviary, we are ready to query a model. Let's start by viewing which models are available.

In [2]:
import aviary

# View all models available with aviary
aviary.models()

Connecting to Aviary backend at:  https://aviary-oss-backend-primary-hackathon-nh1z6.cld-ldm5ez4edlp7yh4y.s.anyscaleuserdata.com/


['mosaicml/mpt-7b-instruct',
 'amazon/LightGPT',
 'databricks/dolly-v2-12b',
 'CarperAI/stable-vicuna-13b-delta',
 'OpenAssistant/falcon-7b-sft-top1-696',
 'mosaicml/mpt-7b-chat',
 'stabilityai/stablelm-tuned-alpha-7b',
 'lmsys/vicuna-13b-delta-v1.1',
 'mosaicml/mpt-7b-storywriter',
 'h2oai/h2ogpt-oasst1-512-12b',
 'OpenAssistant/oasst-sft-7-llama-30b-xor']

Let's query the amazon/LightGPT model and ask it how to make fried rice.

In [3]:
# Query LightGPT model
response = aviary.completions('amazon/LightGPT', 'How do I make fried rice?')
response

Connecting to Aviary backend at:  https://aviary-oss-backend-primary-hackathon-nh1z6.cld-ldm5ez4edlp7yh4y.s.anyscaleuserdata.com/


{'generated_text': "To make fried rice, start by heating oil in a large skillet over medium-high heat. Add your choice of protein (such as chicken, beef, or tofu) and cook until it's browned. Once cooked through, add your vegetables of choice and stir to combine. Finally, add some soy sauce or other flavorings and let everything simmer for about 5 minutes before serving. Enjoy!",
 'num_input_tokens': 35,
 'num_input_tokens_batch': 35,
 'num_generated_tokens': 79,
 'num_generated_tokens_batch': 79,
 'preprocessing_time': 0.0005139989998497185,
 'generation_time': 1.2390154719996644,
 'postprocessing_time': 0.001852543999120826,
 'generation_time_per_token': 0.010868556771926881,
 'generation_time_per_token_batch': 0.010868556771926881,
 'num_total_tokens': 114,
 'num_total_tokens_batch': 114,
 'total_time': 1.241382014998635,
 'total_time_per_token': 0.010889315921040657,
 'total_time_per_token_batch': 0.010889315921040657}
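
Besides the generated text, the response carries token counts and timings. The sketch below derives a rough generation throughput from two of those fields, assuming `generation_time` is measured in seconds. `tokens_per_second` is a hypothetical helper, and the sample values are copied from the response above.

```python
def tokens_per_second(response: dict) -> float:
    """Generated tokens divided by generation time (assumed to be in seconds)."""
    return response["num_generated_tokens"] / response["generation_time"]


# Values copied from the response shown above.
sample = {"num_generated_tokens": 79, "generation_time": 1.2390154719996644}
throughput = tokens_per_second(sample)
print(f"{throughput:.1f} tokens/s")
```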

Let's pretty-print the generated text.

In [4]:
import textwrap

# Pretty print the response text
print(textwrap.fill(response['generated_text']))

To make fried rice, start by heating oil in a large skillet over
medium-high heat. Add your choice of protein (such as chicken, beef,
or tofu) and cook until it's browned. Once cooked through, add your
vegetables of choice and stir to combine. Finally, add some soy sauce
or other flavorings and let everything simmer for about 5 minutes
before serving. Enjoy!


In [5]:
# You can also send multiple requests as a batch.
# This returns a list of responses, each corresponding
# to the input at the same index.
questions = ['How do I make fried rice?', 'How do I make a cake?']
responses = aviary.batch_completions('amazon/LightGPT', questions)

for question, response in zip(questions, responses):
    print(f"Question: {question}")
    print("Answer:")
    print(textwrap.fill(response['generated_text']))
    print()

Connecting to Aviary backend at:  https://aviary-oss-backend-primary-hackathon-nh1z6.cld-ldm5ez4edlp7yh4y.s.anyscaleuserdata.com/
Question: How do I make fried rice?
Answer:
To make fried rice, you will need to start by heating some oil in a
large skillet over medium-high heat. Once the oil is hot, add your
desired amount of meat and cook until it's cooked through. Next, add
any vegetables or other ingredients that you would like to include in
your fried rice. Finally, stir in your cooked grains such as white
rice or quinoa and season with salt and pepper to taste.

Question: How do I make a cake?
Answer:
To make a cake, you will need to gather the necessary ingredients and
tools. Start by preheating your oven according to the recipe
instructions. Then prepare the batter for the cake by combining flour,
sugar, eggs, butter, and any other flavorings or extracts of your
choice. Once the mixture has been combined, pour it into a greased pan
and bake it in the preheated oven for the specif

## Chat

To chat with one of the models, you need to disable the default prompt format that Aviary provides and apply your own prompt format instead.

You can use `aviary.metadata` on a model to view its prompt format.

Once you understand the prompt format, set `use_prompt_format` to `False` to disable the default prompt format, and then apply your own prompt format manually.




In [6]:
metadata = aviary.metadata('mosaicml/mpt-7b-chat')

# View the prompt format.
metadata['metadata']['model_config']['generation']['prompt_format']

Connecting to Aviary backend at:  https://aviary-oss-backend-primary-hackathon-nh1z6.cld-ldm5ez4edlp7yh4y.s.anyscaleuserdata.com/


'<|im_start|>system\n- You are a helpful assistant chatbot trained by MosaicML.\n- You answer questions.\n- You are excited to be able to help the user, but will refuse to do anything that could be considered harmful to the user.\n- You are more than just an information source, you are also able to write poetry, short stories, and make jokes.<|im_end|>\n<|im_start|>user\n{instruction}<|im_end|><|im_start|>assistant\n'

The prompt format for 'mosaicml/mpt-7b-chat' has the following structure:

* `<|im_start|>system` marks the start of a block of text that is a system directive.
* `<|im_start|>assistant` marks the start of a block of text produced by the model.
* `<|im_start|>user` marks the start of a block of text produced by the user.
* `<|im_end|>` marks the end of a block.

Note that this scheme differs between the models that Aviary supports.
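
To make the structure concrete, the sketch below fills the `{instruction}` placeholder of a shortened copy of this prompt format. How Aviary applies the format internally is an assumption here; `apply_prompt_format` is a hypothetical helper, and the system text is abbreviated for readability.

```python
# Shortened copy of the prompt format shown above (system text abbreviated).
prompt_format = (
    "<|im_start|>system\n- You are a helpful assistant chatbot trained by MosaicML.<|im_end|>\n"
    "<|im_start|>user\n{instruction}<|im_end|><|im_start|>assistant\n"
)


def apply_prompt_format(template: str, instruction: str) -> str:
    """Substitute the user's text for the {instruction} placeholder."""
    # str.replace is used instead of str.format so that braces
    # inside the instruction itself cannot raise an error.
    return template.replace("{instruction}", instruction)


prompt = apply_prompt_format(prompt_format, "What color is an apple?")
print(prompt)
```

The resulting prompt ends with an open `<|im_start|>assistant` block, which is where the model's reply is expected to continue.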

In [7]:
# System block is defined once
system_block = "<|im_start|>system\n- You are a helpful assistant chatbot trained by MosaicML.\n- You answer questions.\n- You are excited to be able to help the user, but will refuse to do anything that could be considered harmful to the user.\n- You are more than just an information source, you are also able to write poetry, short stories, and make jokes.<|im_end|>\n"

# Create a model block; the tag must match the 'assistant' role in the prompt format
def model_block(text: str):
    return f"<|im_start|>assistant\n{text}\n<|im_end|>\n"

# Create a user block
def user_block(text: str):
    return f"<|im_start|>user\n{text}\n<|im_end|>\n"

# To construct a chat:
# 1. Submit the first piece of the chat
# 2. Take the response, and feed it into the next part of the chat
history = system_block

def chat(history: str, question: str):
    history += user_block(question)
    # The history is already formatted, so disable Aviary's default prompt format
    # and end the prompt with an open assistant block for the model to complete
    prompt = history + "<|im_start|>assistant\n"
    response = aviary.completions('mosaicml/mpt-7b-chat', prompt, use_prompt_format=False)

    print('Model: ')
    print(textwrap.fill(response['generated_text']))
    print()

    history += model_block(response['generated_text'])
    return history

history = chat(history, "What color is an apple?")
history = chat(history, "What if it is green?")

Connecting to Aviary backend at:  https://aviary-oss-backend-primary-hackathon-nh1z6.cld-ldm5ez4edlp7yh4y.s.anyscaleuserdata.com/
Model: 
The color of an apple can vary depending on the type of apple, but
most apples are red or green.  Some varieties of apples are yellow,
pink, or even purple.

Connecting to Aviary backend at:  https://aviary-oss-backend-primary-hackathon-nh1z6.cld-ldm5ez4edlp7yh4y.s.anyscaleuserdata.com/
Model: 
If an apple is green, then it’s probably not ripe yet.  It might still
be too hard to eat, unless it’s a variety called Granny Smith, which
is usually very firm and crunchy when it’s unripe.   When an apple is
ripe, its skin turns yellowish-green. If you want to know for sure
whether your apple is ripe enough to eat, try cutting it in half: if
there’s no juice at all inside, then it isn’t ready yet; if there’s
some liquid, then it should be fine to eat.
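
The history handling can also be tried without a backend. The sketch below keeps the conversation as a list of `(role, text)` turns and takes the completion call as a parameter, so a stand-in function can replace the model. `build_prompt`, `chat_turn`, and `fake_complete` are hypothetical names, and the block layout follows the 'mosaicml/mpt-7b-chat' format described above.

```python
from typing import Callable, List, Tuple

Turn = Tuple[str, str]  # (role, text)


def build_prompt(system: str, turns: List[Turn]) -> str:
    """Render the turns as blocks and end with an open assistant block."""
    parts = [f"<|im_start|>system\n{system}<|im_end|>\n"]
    for role, text in turns:
        parts.append(f"<|im_start|>{role}\n{text}<|im_end|>\n")
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


def chat_turn(system: str, turns: List[Turn], question: str,
              complete: Callable[[str], str]) -> Tuple[List[Turn], str]:
    """Append the question, get an answer, and return the extended history."""
    turns = turns + [("user", question)]
    answer = complete(build_prompt(system, turns))
    return turns + [("assistant", answer)], answer


def fake_complete(prompt: str) -> str:
    # Stand-in for the backend: reports how many user blocks the prompt contains.
    return f"seen {prompt.count('<|im_start|>user')} user turn(s)"


turns: List[Turn] = []
turns, first = chat_turn("You are helpful.", turns, "What color is an apple?", fake_complete)
turns, second = chat_turn("You are helpful.", turns, "What if it is green?", fake_complete)
print(first)
print(second)
```

To talk to a real model, `complete` could wrap `aviary.completions` and return the `generated_text` field of its response. The second answer shows that each request carries the whole conversation so far, which is what lets the model resolve a follow-up such as "What if it is green?".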



You can make an interactive chat as shown below:

In [23]:
# Simplest possible chatbot loop
while True:
    query = input("User (x to exit): ")
    if query == 'x':
        break
    print('User: ' + query)
    history = chat(history, query)

User: what color is an apple/
Connecting to Aviary backend at:  https://aviary-oss-backend-primary-hackathon-nh1z6.cld-ldm5ez4edlp7yh4y.s.anyscaleuserdata.com/
Model: 
An apple is usually either red or green. Some types of apples are
yellow, pink, or even purple. If you see a green apple, it isn't fully
ripe yet. It's best to wait until the apple turns red or brown before
eating it. That's when it's at its tastiest.

User: what?
Connecting to Aviary backend at:  https://aviary-oss-backend-primary-hackathon-nh1z6.cld-ldm5ez4edlp7yh4y.s.anyscaleuserdata.com/
Model: 
I'm sorry I misunderstood your question. Do you mean what kind of
fruit is an apple? Or did you want me to tell you about how to
identify different kinds of apples based on their appearance? In any
case, there are many ways to classify fruits according to their shape
(e.g., round vs. oblong), size, texture, flavor, etc. For example:   *
Apples come in two main categories: sweet and tart. Sweet apples are
generally eaten raw a