In [2]:
import time

import kscope

### Connecting to the Service
First, we connect to the Kaleidoscope service, through which we'll interact with the LLMs, and see which models are available to us.

In [23]:
# Establish a client connection to the kscope service
client = kscope.Client(gateway_host="llm.cluster.local", gateway_port=3001)

Show all model instances that are currently active.

In [27]:
client.model_instances

[]

To start, we obtain a handle to a model. In this example, let's use the LLaMA2-7B model.

In [25]:
model = client.load_model("llama2-7b")
# If this model is not actively running, it will get launched in the background.
# In this case, wait until it moves into an "ACTIVE" state before proceeding.
while model.state != "ACTIVE":
    time.sleep(1)

KeyboardInterrupt: 
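The polling loop above never exits if the model fails to reach the "ACTIVE" state. A minimal sketch of a bounded wait follows. The helper `wait_until_active`, the `FakeModel` stand-in, and the "LAUNCHING" state name are hypothetical and not part of kscope; only the `state` attribute and the "ACTIVE" value come from the cell above.

```python
import time


def wait_until_active(get_state, timeout_s=300.0, poll_interval_s=1.0):
    """Poll get_state() until it returns "ACTIVE" or the timeout elapses.

    get_state is any zero-argument callable returning the current state string,
    for example `lambda: model.state` with the handle obtained above.
    Returns True if the state became "ACTIVE", False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if get_state() == "ACTIVE":
            return True
        time.sleep(poll_interval_s)
    return False


# Hypothetical stand-in for a model handle so the sketch runs without the service.
class FakeModel:
    def __init__(self, polls_until_active):
        self._remaining = polls_until_active

    @property
    def state(self):
        # "LAUNCHING" is a made-up placeholder for any non-active state.
        self._remaining -= 1
        return "ACTIVE" if self._remaining <= 0 else "LAUNCHING"


fake = FakeModel(polls_until_active=3)
became_active = wait_until_active(lambda: fake.state, timeout_s=5.0, poll_interval_s=0.01)
print(became_active)
```

With the real handle, the call would look like `wait_until_active(lambda: model.state)`, and a `False` return signals that the launch should be investigated rather than waited on.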

In [5]:
moderate_generation_config = {"max_tokens": 50, "top_k": 4, "top_p": 1.0, "rep_penalty": 1.2, "temperature": 0.9}
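To give some intuition for the configuration above, here is a toy sketch of how temperature scaling and top-k filtering conventionally reshape a next-token distribution. It assumes the `temperature` and `top_k` keys carry their usual meanings; how kscope applies them internally is not confirmed here. The function name and the toy logits are made up for illustration, and `top_p` and `rep_penalty` are left out.

```python
import math


def top_k_temperature_probs(logits, top_k, temperature):
    """Return sampling probabilities after temperature scaling and top-k filtering."""
    # Dividing by the temperature flattens (T > 1) or sharpens (T < 1) the distribution.
    scaled = [x / temperature for x in logits]
    # Keep only the indices of the top_k largest scaled logits.
    keep = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:top_k]
    # Softmax over the kept logits, shifted by the maximum for numerical stability.
    max_kept = max(scaled[i] for i in keep)
    weights = {i: math.exp(scaled[i] - max_kept) for i in keep}
    total = sum(weights.values())
    return [weights[i] / total if i in weights else 0.0 for i in range(len(logits))]


toy_logits = [2.0, 1.0, 0.5, 0.0, -1.0, -3.0]  # made-up scores for six tokens
probs = top_k_temperature_probs(toy_logits, top_k=4, temperature=0.9)
print([round(p, 3) for p in probs])
```

Tokens outside the top four receive zero probability, and the remaining mass is renormalized over the survivors.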

Let's ask the model a question.

In [6]:
generation = model.generate("What is the capital of Canada?", generation_config=moderate_generation_config)
# Extract the text from the returned generation
generation.generation["text"]

["\nI don't know and I don't care about the capitals of other countries."]

# Few-Shot Chain of Thought Prompting

We'll start by prompting LLaMA2-7B to solve some word problems using the Few-Shot CoT method proposed in ["Chain-of-Thought Prompting Elicits Reasoning in Large Language Models"](https://arxiv.org/pdf/2201.11903.pdf).

First, let's see what happens if we try to solve some word problems with a zero-shot prompt.

In [28]:
zero_shot_prompt = (
    "The cafeteria had 23 apples. If they used 20 to make lunch and bought 6 more, how many apples do they have?"
)

print(zero_shot_prompt)

The cafeteria had 23 apples. If they used 20 to make lunch and bought 6 more, how many apples do they have?


In [None]:
generation_example = model.generate(zero_shot_prompt, generation_config=moderate_generation_config)
print(generation_example.generation["text"])

The correct answer to this word problem is 9 (23 - 20 + 6 = 9).

Now let's try performing a few-shot prompt to see if that helps the model provide the correct answer in a format that we can extract.

In [29]:
few_shot_prompt = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. Each can has 3 tennis balls. How many tennis "
    "balls does he have now?\n\nA: The answer is 11.\n\nQ: The cafeteria had 23 apples. If they used 20 to make lunch "
    "and bought 6 more, how many apples do they have?\n\nA: "
)
print(few_shot_prompt)

Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. Each can has 3 tennis balls. How many tennis balls does he have now?

A: The answer is 11.

Q: The cafeteria had 23 apples. If they used 20 to make lunch and bought 6 more, how many apples do they have?

A: 


In [None]:
generation_example = model.generate(few_shot_prompt, generation_config=moderate_generation_config)
print(generation_example.generation["text"])
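The few-shot example ends with "The answer is 11.", which nudges the model toward a completion that can be parsed. A minimal sketch of such a parser follows, assuming the completion is available as a plain string. The helper name is hypothetical, and the sample strings are invented completions, not real model output.

```python
import re


def extract_numeric_answer(text):
    """Return the number following the first "The answer is", or None if absent."""
    match = re.search(r"The answer is\s*(-?\d+(?:\.\d+)?)", text)
    if match is None:
        return None
    return float(match.group(1))


# Invented completions for illustration only.
parsed = extract_numeric_answer("The answer is 9.\n\nQ: ...")
missing = extract_numeric_answer("They have some apples left.")
print(parsed, missing)
```

Returning `None` when the pattern is absent makes it easy to count unparseable completions separately from wrong answers.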

Now, let's try prompting the model with a few-shot CoT prompt, where we provide an example of the kind of reasoning required to answer the question.

In [30]:
few_shot_cot_prompt = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. Each can has 3 tennis balls. How many tennis "
    "balls does he have now?\n\nA: Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 tennis balls. "
    "5 + 6 = 11. The answer is 11.\n\nQ: The cafeteria had 23 apples. If they used 20 to make lunch and bought 6 "
    "more, how many apples do they have?\n\nA: "
)
print(few_shot_cot_prompt)

Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. Each can has 3 tennis balls. How many tennis balls does he have now?

A: Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 tennis balls. 5 + 6 = 11. The answer is 11.

Q: The cafeteria had 23 apples. If they used 20 to make lunch and bought 6 more, how many apples do they have?

A: 


In [None]:
generation_example = model.generate(few_shot_cot_prompt, generation_config=moderate_generation_config)
print(generation_example.generation["text"])
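The few-shot and few-shot CoT prompts differ only in the worked answer attached to the example question. The sketch below assembles either kind from (question, answer) pairs using the same `Q:` / `A:` layout as the prompts above. The function and variable names are hypothetical helpers, not part of kscope.

```python
def build_few_shot_prompt(examples, question):
    """Format (question, answer) pairs followed by the unanswered target question."""
    blocks = [f"Q: {q}\n\nA: {a}" for q, a in examples]
    # The trailing "A: " leaves the answer for the model to complete.
    blocks.append(f"Q: {question}\n\nA: ")
    return "\n\n".join(blocks)


roger_question = (
    "Roger has 5 tennis balls. He buys 2 more cans of tennis balls. Each can has 3 tennis balls. "
    "How many tennis balls does he have now?"
)
apples_question = (
    "The cafeteria had 23 apples. If they used 20 to make lunch and bought 6 more, how many apples do they have?"
)

# Answer-only demonstration versus a demonstration that spells out the reasoning.
plain_answer = "The answer is 11."
cot_answer = (
    "Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 tennis balls. 5 + 6 = 11. The answer is 11."
)

rebuilt_few_shot = build_few_shot_prompt([(roger_question, plain_answer)], apples_question)
rebuilt_cot = build_few_shot_prompt([(roger_question, cot_answer)], apples_question)
print(rebuilt_cot)
```

Keeping the examples as data makes it straightforward to swap demonstrations in and out when comparing prompt variants.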

Let's compare few-shot prompting with few-shot CoT on a slightly different kind of problem. This example is drawn from the AQuA: Algebraic Word Problems task.

In [31]:
few_shot_prompt = (
    "Q: John found that the average of 15 numbers is 40. If 10 is added to each number then the mean of "
    "the numbers is? Answer Choices: (a) 50 (b) 45 (c) 65 (d) 78 (e) 64\n\nA: The answer is (a).\n\nQ: If a / b = 3/4 "
    "and 8a + 5b = 22, then find the value of a. Answer Choices: (a) 1/2 (b) 3/2 (c) 5/2 (d) 4/2 (e) 7/2\n\nA: "
)
print(few_shot_prompt)

Q: John found that the average of 15 numbers is 40. If 10 is added to each number then the mean of the numbers is? Answer Choices: (a) 50 (b) 45 (c) 65 (d) 78 (e) 64

A: The answer is (a).

Q: If a / b = 3/4 and 8a + 5b = 22, then find the value of a. Answer Choices: (a) 1/2 (b) 3/2 (c) 5/2 (d) 4/2 (e) 7/2

A: 


In [None]:
generation_example = model.generate(few_shot_prompt, generation_config=moderate_generation_config)
print(generation_example.generation["text"])

The correct choice for this problem is (b).
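As a quick check of the expected choice, the two equations can be solved exactly with Python's `fractions` module. This is only a verification of the arithmetic, independent of the model.

```python
from fractions import Fraction

# a / b = 3/4   =>  b = (4/3) * a
# 8a + 5b = 22  =>  8a + (20/3)a = 22  =>  (44/3)a = 22
b_over_a = Fraction(4, 3)
a = Fraction(22) / (8 + 5 * b_over_a)
b = b_over_a * a

print(a, b)
```

Substituting back confirms that both the ratio and the linear equation hold for a = 3/2.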

Let's see if we can extract the correct answer with few-shot CoT.

In [33]:
few_shot_cot_prompt = (
    "Q: John found that the average of 15 numbers is 40. If 10 is added to each number then the mean of the numbers "
    "is? Answer Choices: (a) 50 (b) 45 (c) 65 (d) 78 (e) 64\n\nA: If 10 is added to each number, then the mean of the "
    "numbers also increases by 10. So the new mean would be 50. The answer is (a).\n\nQ: If a / b = 3/4 and 8a + 5b "
    "= 22, then find the value of a. Answer Choices: (a) 1/2 (b) 3/2 (c) 5/2 (d) 4/2 (e) 7/2\n\nA: "
)
print(few_shot_cot_prompt)

Q: John found that the average of 15 numbers is 40. If 10 is added to each number then the mean of the numbers is? Answer Choices: (a) 50 (b) 45 (c) 65 (d) 78 (e) 64

A: If 10 is added to each number, then the mean of the numbers also increases by 10. So the new mean would be 50. The answer is (a).

Q: If a / b = 3/4 and 8a + 5b = 22, then find the value of a. Answer Choices: (a) 1/2 (b) 3/2 (c) 5/2 (d) 4/2 (e) 7/2

A: 


In [None]:
generation_example = model.generate(few_shot_cot_prompt, generation_config=moderate_generation_config)
print(generation_example.generation["text"])
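For multiple-choice problems like this one, the demonstration ends with "The answer is (a).", so a completion in the same style can be scored by pulling out the letter. A minimal sketch, assuming the completion is a plain string; the helper name is hypothetical and the sample completions are invented, not real model output.

```python
import re


def extract_choice(text, choices="abcde"):
    """Return the letter in the first "The answer is (x)" pattern, or None."""
    match = re.search(rf"The answer is \(([{choices}])\)", text, flags=re.IGNORECASE)
    return match.group(1).lower() if match else None


# Invented completions for illustration only.
gold_choice = "b"
completion = "Since a / b = 3/4, b = 4a/3. Then 8a + 20a/3 = 22, so a = 3/2. The answer is (b)."
predicted = extract_choice(completion)
print(predicted, predicted == gold_choice)
```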

# Zero-Shot Chain of Thought Prompting

It can be tedious and tricky to write useful and effective reasoning examples. Some research has shown that the choice of reasoning examples in CoT prompting can have a large impact on how well the model accomplishes the downstream task. So let's try the zero-shot CoT approach devised in ["Large Language Models are Zero-Shot Reasoners"](https://arxiv.org/pdf/2205.11916.pdf).

In [34]:
few_shot_prompt = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. Each can has 3 tennis balls. How many tennis "
    "balls does he have now?\n\nA: The answer is 11.\n\nQ: A juggler can juggle 16 balls. Half of the balls are golf "
    "balls, and half of the golf balls are blue. How many blue golf balls are there?\n\nA: "
)
print(few_shot_prompt)

Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. Each can has 3 tennis balls. How many tennis balls does he have now?

A: The answer is 11.

Q: A juggler can juggle 16 balls. Half of the balls are golf balls, and half of the golf balls are blue. How many blue golf balls are there?

A: 


In [None]:
generation_example = model.generate(few_shot_prompt, generation_config=moderate_generation_config)
print(generation_example.generation["text"])

The correct answer to this problem is 4.

Perhaps we can extract the correct answer with zero-shot CoT. This approach is split into two stages:
1) Reasoning Generation
2) Answer Extraction

In [35]:
reasoning_generation_prompt = (
    "Q: A juggler can juggle 16 balls. Half of the balls are golf balls, and half of the golf balls are blue. How "
    "many blue golf balls are there?\n\nA: Let’s think step by step."
)
print(reasoning_generation_prompt)

Q: A juggler can juggle 16 balls. Half of the balls are golf balls, and half of the golf balls are blue. How many blue golf balls are there?

A: Let’s think step by step.


In [None]:
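# generation["text"] is a list (see the output of the first query above), so take
# its first element to get a plain string for building the next prompt.
reasoning_generation = model.generate(
    reasoning_generation_prompt, generation_config=moderate_generation_config
).generation["text"][0]
print(reasoning_generation)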

In [None]:
answer_extraction_prompt = f"{reasoning_generation_prompt}\n{reasoning_generation} Therefore, the answer is "
print(answer_extraction_prompt)

In [None]:
answer_generation = model.generate(
    answer_extraction_prompt, generation_config=moderate_generation_config
).generation["text"]
print(answer_generation)
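The two stages above can be wrapped into a single helper. The sketch below takes any `generate` callable that maps a prompt string to a completion string, so it runs here with a canned stand-in instead of the live model. The function name, the `fake_generate` stub, and its canned replies are hypothetical; with kscope, `generate` would wrap `model.generate` and return the completion text as a string.

```python
def zero_shot_cot(question, generate, trigger="Let’s think step by step.",
                  extraction_cue="Therefore, the answer is "):
    """Run reasoning generation, then answer extraction, and return both texts."""
    # Stage 1: ask for step-by-step reasoning.
    reasoning_prompt = f"Q: {question}\n\nA: {trigger}"
    reasoning = generate(reasoning_prompt)
    # Stage 2: append the reasoning and cue the model for a final answer.
    extraction_prompt = f"{reasoning_prompt}\n{reasoning} {extraction_cue}"
    answer = generate(extraction_prompt)
    return reasoning, answer


calls = []


def fake_generate(prompt):
    """Canned stand-in for the model; the replies are invented, not real output."""
    calls.append(prompt)
    if prompt.endswith("Therefore, the answer is "):
        return "4."
    return "There are 16 balls. Half are golf balls, so 8. Half of those are blue, so 4."


juggler_question = (
    "A juggler can juggle 16 balls. Half of the balls are golf balls, and half of the golf balls are blue. "
    "How many blue golf balls are there?"
)
reasoning, answer = zero_shot_cot(juggler_question, fake_generate)
print(answer)
```

Passing the generator in as an argument keeps the prompt construction testable without a running service.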