In [1]:
import json
import time
import kscope

# Getting Started

Documentation on how to interact with the large models is available [here](https://kaleidoscope-sdk.readthedocs.io/en/latest/). The SDK source is on GitHub [here](https://github.com/VectorInstitute/kaleidoscope-sdk), and the underlying service code is [here](https://github.com/VectorInstitute/kaleidoscope).

First, we connect to the service through which we'll interact with the LLMs and see which models are available to us.

In [2]:
# Establish a client connection to the kscope service
client = kscope.Client(gateway_host="llm.cluster.local", gateway_port=3001)

Show all supported models

In [3]:
client.models

['gpt2',
 'llama2-7b',
 'llama2-7b_chat',
 'llama2-13b',
 'llama2-13b_chat',
 'llama2-70b',
 'llama2-70b_chat',
 'falcon-7b',
 'falcon-40b',
 'sdxl-turbo']

Show all model instances that are currently active

In [4]:
client.model_instances

[{'id': 'a33c0f4d-da2b-4861-8c3d-91e66955e879',
  'name': 'falcon-7b',
  'state': 'ACTIVE'},
 {'id': '7389b196-9637-4a42-adca-7bfb4f59733d',
  'name': 'llama2-7b',
  'state': 'ACTIVE'},
 {'id': 'b486d208-a570-47ed-bb35-9de310f9cd02',
  'name': 'llama2-70b',
  'state': 'ACTIVE'},
 {'id': 'b3871a00-4848-49be-a1c8-c8f6c47ad8b2',
  'name': 'falcon-40b',
  'state': 'ACTIVE'},
 {'id': '99bee87e-abc4-44fd-b4d3-ea2c527bb93e',
  'name': 'llama2-13b',
  'state': 'ACTIVE'}]
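
Before loading a model, it can be handy to check which models already have a running instance. The following is a minimal sketch, assuming each entry of `client.model_instances` is a dictionary with the `name` and `state` keys shown in the output above. `active_model_names` is a hypothetical helper, not part of the SDK, and the `LAUNCHING` state value is made up to illustrate the filtering.

```python
def active_model_names(instances):
    """Return the sorted names of all instances whose state is ACTIVE."""
    return sorted(inst["name"] for inst in instances if inst.get("state") == "ACTIVE")


# Sample data shaped like the output above (IDs omitted for brevity), plus one
# made-up non-active entry to show that it is filtered out.
sample_instances = [
    {"name": "falcon-7b", "state": "ACTIVE"},
    {"name": "llama2-7b", "state": "ACTIVE"},
    {"name": "llama2-13b", "state": "LAUNCHING"},  # hypothetical state value
]
print(active_model_names(sample_instances))
```

With a live connection, the same helper could be called as `active_model_names(client.model_instances)`.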

To start, we obtain a handle to a model. In this example, let's use the Llama 2 7B parameter model.

In [5]:
model = client.load_model("llama2-7b")
while model.state != "ACTIVE":
    time.sleep(1)
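
The loop above waits indefinitely if the model never becomes active. A hedged variant with a timeout is sketched below. It relies only on the `model.state` attribute used above; `wait_until_active` and its parameters are hypothetical names introduced here for illustration.

```python
import time


def wait_until_active(model, timeout_s=300.0, poll_s=1.0):
    """Poll model.state until it equals "ACTIVE"; raise TimeoutError if the deadline passes."""
    deadline = time.monotonic() + timeout_s
    while model.state != "ACTIVE":
        if time.monotonic() >= deadline:
            raise TimeoutError(f"model still in state {model.state!r} after {timeout_s} s")
        time.sleep(poll_s)
    return model
```

Usage would look like `model = wait_until_active(client.load_model("llama2-7b"))`.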

We need to configure the model to generate in the way we want, so we set a number of important parameters. For a discussion of the configuration parameters, see `src/reference_implementations/prompting_vector_llms/CONFIG_README.md`.

**NOTE**: Most of this notebook uses deterministic decoding, since `"do_sample"` is `False` by default. To turn sampling on, add `"do_sample": True` to the `long_generation_config` used throughout this notebook, as is already done in `short_generation_config`.

In [6]:
short_generation_config = {"max_tokens": 20, "top_p": 1.0, "temperature": 0.7, "do_sample": True}
long_generation_config = {"max_tokens": 50, "top_p": 1.0}
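
To switch between deterministic and sampled generation without editing the dictionaries in place, one option is to derive a new config from an existing one. This is a sketch using plain dictionary merging. `with_sampling` is a hypothetical helper, and the keys are the ones already used above.

```python
def with_sampling(config, temperature=0.7):
    """Return a copy of config with sampling turned on; the input dict is not modified."""
    return {**config, "do_sample": True, "temperature": temperature}


long_generation_config = {"max_tokens": 50, "top_p": 1.0}
sampled_long_config = with_sampling(long_generation_config)
print(sampled_long_config)
```

Keeping the original dictionary untouched makes it easy to compare deterministic and sampled outputs for the same prompt.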

In [7]:
def create_rag_prompt(question: str, reference: str) -> str:
    non_reference = f"Question: {question}\n\nAnswer:"
    if len(reference) > 0:
        return f"Reference: {reference}\n\n{non_reference}"
    return non_reference
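
As a quick check of the two branches, the helper can be called with and without a reference. The function is repeated here so the snippet runs on its own; the question and reference strings are made up for illustration.

```python
def create_rag_prompt(question: str, reference: str) -> str:
    non_reference = f"Question: {question}\n\nAnswer:"
    if len(reference) > 0:
        return f"Reference: {reference}\n\n{non_reference}"
    return non_reference


# With a reference, the context is placed before the question.
with_ref = create_rag_prompt("What colour is the door?", "The door is red.")
# With an empty reference, only the question/answer scaffold is returned.
without_ref = create_rag_prompt("What colour is the door?", "")

print(repr(with_ref))
print(repr(without_ref))
```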

### Daniel RAG

In [8]:
reference = '''
Investigation case number: A5678910. The customer, a grocery store and its owner, are suspected of intentionally 
structuring cash deposits to circumvent federal reporting requirements. The customer is also engaged in activity 
indicative of an informal value transfer operation: deposits of bulk cash, third party out of state personal 
checks and money orders, and engaging in aggregate wire transfers to Dubai, UAE. John Doe opened a personal checking account, #12345-6789, in March of 1994. Doe indicated that he was born in 
Yemen, presented a Virginia driver's license as identification, and claimed he was the self-employed owner of a 
grocery store identified as Acme, Inc. A business checking account, #23456-7891, was opened in January of 1998 
for Acme, Inc. Between January 17, 2003, and March 21, 2003, John Doe was the originator of nine wires totaling 
$225,000. The wire transfers were always conducted at the end of each week in the amount of $25,000. 
All of the wires were remitted to the Bank of Anan in Dubai, UAE, to benefit Kulkutta Building Supply Company, 
account #3489728.
'''

In [9]:
QUESTION_1 = 'Where does the client send international wires?'

In [10]:
rag_prompt = create_rag_prompt(reference=reference, question=QUESTION_1)
print(rag_prompt)
generation = model.generate(rag_prompt, long_generation_config)
# Extract the text from the returned generation
print(generation.generation["sequences"][0])

Reference: 
Investigation case number: A5678910. The customer, a grocery store and its owner, are suspected of intentionally 
structuring cash deposits to circumvent federal reporting requirements. The customer is also engaged in activity 
indicative of an informal value transfer operation: deposits of bulk cash, third party out of state personal 
checks and money orders, and engaging in aggregate wire transfers to Dubai, UAE. John Doe opened a personal checking account, #12345-6789, in March of 1994. Doe indicated that he was born in 
Yemen, presented a Virginia driver's license as identification, and claimed he was the self-employed owner of a 
grocery store identified as Acme, Inc. A business checking account, #23456-7891, was opened in January of 1998 
for Acme, Inc. Between January 17, 2003, and March 21, 2003, John Doe was the originator of nine wires totaling 
$225,000. The wire transfers were always conducted at the end of each week in the amount of $25,000. 
All of the wires w

In [11]:
QUESTION_2 = 'What is the investigation case number?'

In [12]:
rag_prompt = create_rag_prompt(reference=reference, question=QUESTION_2)
print(rag_prompt)
generation = model.generate(rag_prompt, long_generation_config)
# Extract the text from the returned generation
print(generation.generation["sequences"][0])

Reference: 
Investigation case number: A5678910. The customer, a grocery store and its owner, are suspected of intentionally 
structuring cash deposits to circumvent federal reporting requirements. The customer is also engaged in activity 
indicative of an informal value transfer operation: deposits of bulk cash, third party out of state personal 
checks and money orders, and engaging in aggregate wire transfers to Dubai, UAE. John Doe opened a personal checking account, #12345-6789, in March of 1994. Doe indicated that he was born in 
Yemen, presented a Virginia driver's license as identification, and claimed he was the self-employed owner of a 
grocery store identified as Acme, Inc. A business checking account, #23456-7891, was opened in January of 1998 
for Acme, Inc. Between January 17, 2003, and March 21, 2003, John Doe was the originator of nine wires totaling 
$225,000. The wire transfers were always conducted at the end of each week in the amount of $25,000. 
All of the wires w

In [13]:
QUESTION_3 = 'Who is the customer under the investigation case number: A5678910?'

In [14]:
rag_prompt = create_rag_prompt(reference=reference, question=QUESTION_3)
print(rag_prompt)
generation = model.generate(rag_prompt, long_generation_config)
# Extract the text from the returned generation
print(generation.generation["sequences"][0])

Reference: 
Investigation case number: A5678910. The customer, a grocery store and its owner, are suspected of intentionally 
structuring cash deposits to circumvent federal reporting requirements. The customer is also engaged in activity 
indicative of an informal value transfer operation: deposits of bulk cash, third party out of state personal 
checks and money orders, and engaging in aggregate wire transfers to Dubai, UAE. John Doe opened a personal checking account, #12345-6789, in March of 1994. Doe indicated that he was born in 
Yemen, presented a Virginia driver's license as identification, and claimed he was the self-employed owner of a 
grocery store identified as Acme, Inc. A business checking account, #23456-7891, was opened in January of 1998 
for Acme, Inc. Between January 17, 2003, and March 21, 2003, John Doe was the originator of nine wires totaling 
$225,000. The wire transfers were always conducted at the end of each week in the amount of $25,000. 
All of the wires w

In [15]:
QUESTION_4 = '''A client can conduct different types of transactions including cash, cheque, email money transfer,
                or wires. A wire transaction can be sent to another country. For example, a client may use wires
                to send money from Canada to Korea. To which country does the client in the case number A5678910 
                send international wires?'''

In [16]:
rag_prompt = create_rag_prompt(reference=reference, question=QUESTION_4)
print(rag_prompt)
generation = model.generate(rag_prompt, long_generation_config)
# Extract the text from the returned generation
print(generation.generation["sequences"][0])

Reference: 
Investigation case number: A5678910. The customer, a grocery store and its owner, are suspected of intentionally 
structuring cash deposits to circumvent federal reporting requirements. The customer is also engaged in activity 
indicative of an informal value transfer operation: deposits of bulk cash, third party out of state personal 
checks and money orders, and engaging in aggregate wire transfers to Dubai, UAE. John Doe opened a personal checking account, #12345-6789, in March of 1994. Doe indicated that he was born in 
Yemen, presented a Virginia driver's license as identification, and claimed he was the self-employed owner of a 
grocery store identified as Acme, Inc. A business checking account, #23456-7891, was opened in January of 1998 
for Acme, Inc. Between January 17, 2003, and March 21, 2003, John Doe was the originator of nine wires totaling 
$225,000. The wire transfers were always conducted at the end of each week in the amount of $25,000. 
All of the wires w

In [17]:
QUESTION_5 = 'What is the total count and value of the wires transactions send by the customer?'

In [18]:
# B_INST, E_INST = "<s>[INST]", "[/INST]"
# B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"
# DEFAULT_SYSTEM_PROMPT = """\
# You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.

# If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information."""

# SYSTEM_PROMPT = B_SYS + DEFAULT_SYSTEM_PROMPT + E_SYS

# def get_prompt(instruction):
#     prompt_template =  B_INST + SYSTEM_PROMPT + instruction + E_INST
#     return prompt_template
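
The commented-out cell above follows the Llama 2 chat prompt layout, which is intended for the `_chat` model variants rather than the base `llama2-7b` model loaded earlier. Below is a minimal runnable sketch of that layout. It assumes the serving side does not already prepend the `<s>` token (this has not been verified for kscope, hence the `add_bos` switch), and it uses the spacing around `[INST]` that, as we understand it, Meta's reference code applies. `get_chat_prompt` and the shortened system prompt are hypothetical.

```python
B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"


def get_chat_prompt(instruction, system_prompt, add_bos=True):
    """Wrap an instruction and a system prompt in the Llama 2 chat markers."""
    bos = "<s>" if add_bos else ""  # assumption: the server may add this itself
    return f"{bos}{B_INST} {B_SYS}{system_prompt}{E_SYS}{instruction} {E_INST}"


# Placeholder system prompt, shortened for illustration
chat_prompt = get_chat_prompt("Question: Who opened the account?\n\nAnswer:", "You are a helpful assistant.")
print(chat_prompt)
```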

In [19]:
rag_prompt = create_rag_prompt(reference=reference, question=QUESTION_5)
# rag_prompt = get_prompt(rag_prompt)
print(rag_prompt)

Reference: 
Investigation case number: A5678910. The customer, a grocery store and its owner, are suspected of intentionally 
structuring cash deposits to circumvent federal reporting requirements. The customer is also engaged in activity 
indicative of an informal value transfer operation: deposits of bulk cash, third party out of state personal 
checks and money orders, and engaging in aggregate wire transfers to Dubai, UAE. John Doe opened a personal checking account, #12345-6789, in March of 1994. Doe indicated that he was born in 
Yemen, presented a Virginia driver's license as identification, and claimed he was the self-employed owner of a 
grocery store identified as Acme, Inc. A business checking account, #23456-7891, was opened in January of 1998 
for Acme, Inc. Between January 17, 2003, and March 21, 2003, John Doe was the originator of nine wires totaling 
$225,000. The wire transfers were always conducted at the end of each week in the amount of $25,000. 
All of the wires w

In [20]:
# generation = model.generate(prompt, long_generation_config)
# # Extract the text from the returned generation
# print(generation.generation["sequences"][0])

In [21]:
generation = model.generate(rag_prompt, long_generation_config)
# Extract the text from the returned generation
print(generation.generation["sequences"][0])


The total count of the wire transactions send by the customer is 1.
The total value of the wire transactions send by the customer is $225,000.
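
The reference states that John Doe originated nine wires of $25,000 each, totaling $225,000, so the generated total is right but the count of 1 is not. The following is a rough sketch of how such answers could be checked automatically with substring matching. `missing_facts` is a hypothetical helper, and the accepted spellings are illustrative choices rather than an established metric.

```python
def missing_facts(answer, expected):
    """Return the labels of expected facts for which no accepted spelling appears in the answer."""
    text = answer.lower()
    return [
        label
        for label, spellings in expected.items()
        if not any(s.lower() in text for s in spellings)
    ]


# The generation shown above
answer = (
    "The total count of the wire transactions send by the customer is 1.\n"
    "The total value of the wire transactions send by the customer is $225,000."
)
# Facts taken from the reference text: nine wires totaling $225,000
expected = {
    "count": ["nine", " 9 ", " 9."],
    "value": ["$225,000"],
}

# Sanity check on the reference itself: nine wires of $25,000 each
assert 9 * 25_000 == 225_000

print(missing_facts(answer, expected))  # → ['count']
```

Substring matching is crude (it would accept "nine" inside an unrelated word), but it is enough to flag answers like this one for manual review.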




