# Working with Aleph Alpha Technology
Hi, great to see you working with Aleph Alpha Technology. 

This notebook will support you on your journey by providing you with some examples on how to use our API and how to solve tasks with it.

## Prerequisites:
- You have an Aleph Alpha account and API key (you can sign up here: https://app.aleph-alpha.com/)
- You have glanced over our documentation (https://docs.aleph-alpha.com/)
- You have played around with our playground (https://app.aleph-alpha.com/playground)


## Content

This notebook will contain information on the following topics:
1. Using our LLMs to generate text and solve tasks
2. Using embeddings to find similar, relevant information
3. Using semantic embeddings and completions to answer questions
4. Chaining multiple requests to solve complex tasks

Let's get started!

In [1]:
# When using a colab notebook:
!pip install aleph-alpha-client scipy qdrant-client



##### These are just some imports to start working with our API
If you are interested, here is what the individual imports do:

| Import | Description |
| --- | --- |
| ``Client`` | This is the main class that you will use to authenticate with the API. |
| ``Prompt`` | We use this class to format information correctly for our models |
| ``CompletionRequest`` | CompletionRequests are used to request our models to generate text, e.g. for solving tasks |
| ``SemanticEmbeddingRequest`` | SemanticEmbeddingRequests are used to request our models to generate embeddings for text, e.g. for searching for information or for classification |
| ``ExplanationRequest`` | ExplanationRequests are used to request our models to generate explanations for text, e.g. for explaining where an answer comes from |


In [2]:
from aleph_alpha_client import Client, Prompt, CompletionRequest, CompletionResponse, SemanticEmbeddingRequest, SemanticEmbeddingResponse, SemanticRepresentation, ExplanationRequest
from scipy import spatial
import numpy as np
import json
import os

## Step 0: Using the client to authenticate with the API
First, we need to authenticate with the API. To do this, we need to create a ``Client`` object and pass it our API key. You can create your API key in your [account settings](https://app.aleph-alpha.com/profile).

If you want to use the local API, you need to also pass the ``host`` parameter to the client.

In [3]:
# Authenticate with the API by using the Client class.
# Paste your API token below; an empty token leads to a 401 "No token provided" error.
client = Client(token="")

## Step 1: Using LLMs to generate text and solve tasks
In this section, we will use our LLMs to generate text and solve tasks.

We will use the same LLM for both, because our LLMs are trained to solve many different tasks.

We will use the completion endpoint to generate text and solve tasks. You can find more information about this endpoint in the [Completion Documentation](https://docs.aleph-alpha.com/docs/tasks/complete/).

With completions we prompt the model to generate text. Depending on the prompt, the model will generate different text. This is a very powerful universal tool to generate text and solve tasks.

However, to get the best results, we need to formulate our prompts correctly. We need to keep in mind the structure that the model expects and also how to word our requests so that the model understands what we want.

### 1.1 Generating text
First, let's just start with generating text. While our API offers different models, we will start with our ``Control-models``. These models are specifically optimized to solve tasks that you give them.

We will stick to the structure that these models expect. This is a good starting point to get familiar with the API.

```markdown
### Instruction:
INPUT YOUR INSTRUCTION HERE

### Input:
YOUR INPUT

### Response:
```

Try to vary the input and see how the model responds. You can also try to change the instruction and see how the model responds.
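
If you want to vary the instruction and the input programmatically, a small helper can assemble the structure shown above. This is only a sketch: `build_control_prompt` is a hypothetical helper name, not part of the client library, and leaving out the `### Input:` section when there is no input is an assumption you should test against the model you use.

```python
def build_control_prompt(instruction: str, input_text: str = "") -> str:
    """Assemble a prompt in the Instruction / Input / Response layout."""
    parts = [f"### Instruction:\n{instruction.strip()}"]
    # Assumption: the Input section is simply omitted when there is no input
    if input_text.strip():
        parts.append(f"### Input:\n{input_text.strip()}")
    # The prompt ends with the Response header so the model continues from there
    parts.append("### Response:")
    return "\n\n".join(parts)


prompt_text = build_control_prompt(
    "Complete the sentence below with a continuation that makes sense.",
    "An apple a day",
)
print(prompt_text)
```

The resulting string can be passed to `Prompt.from_text` in the same way as a hand-written prompt.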

In [4]:
# Write a prompt, so that the model knows what to do
prompt_text = """### Instruction: 
Complete the sentence below with a continuation that makes sense.

### Input:
An apple a day

### Response:"""

# Create the completion request
request = CompletionRequest(
    prompt=Prompt.from_text(prompt_text), 
    maximum_tokens=20, # Parameter to control the maximum length of the completion
    temperature=0.0, # Parameter to control the randomness of the completion
    stop_sequences=["\n"]) # Parameter to control the stopping criteria of the completion

# Send the prompt to the API
response = client.complete(request=request, model="luminous-base-control")

response_text = response.completions[0].completion

# Print the response
print(f"The model returned: `{response_text}`")


### 1.2 Solving specific tasks
Now that we have seen how to generate text, let's try to solve a specific task. We will use the same model as before, but we will give it a different instruction.

This time, we want to create a product text for a new product. We will give the model a short description of the product and ask it to generate a product text.

We will be using both the ``Control-models`` as well as the ``foundation-models``. The ``foundation-models`` are trained on a large amount of data and are able to generate text that is more fluent and coherent. However, they are not as good at solving specific tasks as the ``Control-models``.

While control models work with a specific structure, the foundation models are more flexible. This means that we can use them to generate text in a more natural way. However, they require a ``few-shot`` prompt. This means that we need to give them a few examples of what we want them to do.
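
A few-shot prompt is mostly string assembly: a task line, some solved examples, and the unsolved case, all separated by a marker that later doubles as the stop sequence. The sketch below shows one possible way to build it. The function name `build_few_shot_prompt` and its parameters are made up for illustration, and the `Product:` / `Description:` labels are just one convention.

```python
def build_few_shot_prompt(task, examples, query, separator="###"):
    """Build a few-shot prompt.

    task: one-line task description
    examples: list of (product, description) pairs that are already solved
    query: the product we want a description for
    """
    blocks = [f"Task: {task}"]
    for product, description in examples:
        blocks.append(f"Product:\n{product}\nDescription: {description}")
    # The last block is left open so the model writes the description
    blocks.append(f"Product:\n{query}\nDescription:")
    return f"\n{separator}\n".join(blocks)


few_shot_demo = build_few_shot_prompt(
    task="Generate a product description for the following product.",
    examples=[("- Name: Ergonomic Office Chair\n- Color: Black",
               "This black ergonomic office chair is comfortable to sit on.")],
    query="- Name: Multifunctional Yoga Mat\n- Color: Blue",
)
print(few_shot_demo)
```

Because every example ends with the separator, using the same separator as a stop sequence keeps the model from inventing further examples after its answer.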

In [None]:
# Here is a control model prompt for you to try out
control_prompt_text = """### Instruction:
Generate a product description for the following product.
Only use information from the product description.

### Input:
Name: Multifunctional Yoga Mat
Color: Blue
Material: Rubber
Size: 180 x 60 x 0.5 cm
Uses: Yoga, Pilates, Fitness, Gymnastics, Camping, Picnic, Sleep, Play, etc.

### Response:"""

# Let's send the prompt to the API and see what the model returns
request = CompletionRequest(
    prompt=Prompt.from_text(control_prompt_text),
    maximum_tokens=100,
    temperature=0.0,
    stop_sequences=[])

response = client.complete(request=request, model="luminous-base-control")

response_text = response.completions[0].completion

print(f"The model returned: `{response_text}`")


In [None]:
# This is how we would write the prompt as a few-shot learning prompt
few_shot_prompt_text = """Task: Generate a product description for the following product.
Only use information from the product description.
###
Product:
- Name: Ergonomic Office Chair
- Color: Black
- Material: Plastic, Metal, Fabric
- Functions: Height adjustable, 360 degree swivel, seat tilt, back tilt
- Uses: Office, Home, Gaming, etc.
Description: This ergonomic office chair is made of high-quality materials, such as plastic, metal, and fabric and is very comfortable to sit on. 
It is height adjustable, can swivel 360 degrees, and has a seat and back tilt. 
It is suitable for use in the office, at home, or for gaming.
###
Product:
- Name: Multifunctional Yoga Mat
- Color: Blue
- Material: Rubber
- Size: 180 x 60 x 0.5 cm
- Uses: Yoga, Pilates, Fitness, Gymnastics, Camping, Picnic, Sleep, Play, etc.
Description:"""

# Let's send the prompt to the API and see what the model returns
request = CompletionRequest(
    prompt=Prompt.from_text(few_shot_prompt_text),
    maximum_tokens=100,
    temperature=0.5, # We can use a higher temperature to make the model more creative
    stop_sequences=["###"] # with the foundation models we need to specify the stop sequence
    )

response = client.complete(request=request, model="luminous-extended")


response_text = response.completions[0].completion

print(f"The model returned: `{response_text}`")


### 1.3 Experiment with completion LLMs yourself

Now you can go ahead and experiment with completions yourself. 

Try to solve different tasks with the LLMs. 

Experiment with ``Control-models`` and ``foundation-models``. 
See how they differ in their responses and how they solve tasks.

In [None]:
# TODO Change the prompt to solve a different task
control_prompt_text = """Try to write your own prompt here."""

# Send the prompt to the API and see what the model returns
request = CompletionRequest(    
    prompt=Prompt.from_text(control_prompt_text),
    maximum_tokens=100,
    temperature=0.0,
    stop_sequences=[])

response = client.complete(request=request, model="luminous-base-control")
response_text = response.completions[0].completion

print(f"The model returned: `{response_text}`")

--------------------

## Step 2: Using embeddings to search for information
In many cases, the relevant information to solve a task may not be available or known to the model.

With Semantic Search, we can use the embeddings to search for relevant information in a corpus of documents. The idea is that LLMs are able to understand the meaning of a question and the meaning of a document, and thus, can find the most relevant document to answer the question.

We do this by first encoding the question and the documents into embeddings. Then, we compute the similarity between the question embedding and the document embeddings. Finally, we return the document with the highest similarity score.

You can find more information about this technique in the [Semantic Embedding Documentation](https://docs.aleph-alpha.com/docs/tasks/semantic_embed/).

Let's see how this works in practice.

### 2.1 Creating embeddings for text
In order to find the correct documents, we need to turn our text into numbers.
We do that with semantic embeddings. These are vectors that represent the meaning of the data.

Let's use Aleph Alpha technology to create embeddings for our text.

In [None]:
# Two texts and a question to be embedded and searched for
text_1 = "With our semantic_embed-endpoint you can create semantic embeddings for your text. This functionality can be used in a myriad of ways. For more information please check out our blog-post on Luminous-Explore, introducing the model behind the semantic_embed-endpoint. In order to effectively search through your own documents, it is important to ensure that they can be easily compared to each other. Our asymmetric embeddings are designed to help find the pieces of your documents that are most relevant to a query shorter than the documents in the database. Here we will use short queries and longer splits of law texts."
text_2 = "You can interact with a Luminous model by sending it a text. We call this a prompt. It will then produce text that continues your input and return it to you. This is what we call a completion. Generally speaking, our models attempt to find the best continuation for a given input. Practically, this means that the model first recognizes the style of the prompt and then attempts to continue it accordingly."

question = "How can I search through my documents with embeddings?"

In [None]:
# using the API to embed the text

# We embed the texts as Documents, as they contain a lot of information
request_1 = SemanticEmbeddingRequest(prompt=Prompt.from_text(text_1), representation=SemanticRepresentation.Document)
request_2 = SemanticEmbeddingRequest(prompt=Prompt.from_text(text_2), representation=SemanticRepresentation.Document)

# We embed the question as a Query, as it is a short text
request_question = SemanticEmbeddingRequest(prompt=Prompt.from_text(question), representation=SemanticRepresentation.Query)

# We send the requests to the API
embedding_1 = client.semantic_embed(request_1, model="luminous-base").embedding
embedding_2 = client.semantic_embed(request_2, model="luminous-base").embedding
embedding_question = client.semantic_embed(request_question, model="luminous-base").embedding

### 2.2 Calculating the similarity between embeddings
Now that we have embeddings for our question and our documents, we can calculate the similarity between them.
For that we use the cosine similarity. This is a measure of how similar two vectors are. The higher the value, the more similar the vectors are.
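
For intuition, cosine similarity is the dot product of two vectors divided by the product of their lengths. The sketch below computes it by hand on toy two-dimensional vectors, which are made up for illustration and are not real embeddings. `scipy.spatial.distance.cosine` returns the cosine *distance*, which is why the notebook subtracts it from 1.

```python
import math


def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors: one points in the same direction as the query, one is orthogonal
query = [1.0, 0.0]
doc_same_direction = [2.0, 0.0]
doc_orthogonal = [0.0, 3.0]

print(cosine_similarity(query, doc_same_direction))  # 1.0
print(cosine_similarity(query, doc_orthogonal))      # 0.0
```

Note that the length of a vector does not matter, only its direction.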

In [None]:
# We calculate the cosine similarity between the question and the texts
similarity_1 = 1 - spatial.distance.cosine(embedding_1, embedding_question)
similarity_2 = 1 - spatial.distance.cosine(embedding_2, embedding_question)

# We print the results
print("The similarity between the question and text 1 is: " + str(similarity_1))
print("The similarity between the question and text 2 is: " + str(similarity_2))

We can see that the document with the highest similarity score is the one that we are looking for.
This semantic search is a very powerful tool to find relevant information.

### 2.3 Experiment with embeddings yourself
Now you can go ahead and experiment with embeddings yourself. 

When do they work well? 

When do they not work well?

In [None]:
# TODO Change the text to be embedded and searched for
test_text = "The quick brown fox jumps over the lazy dog."

# TODO Change the question to be embedded and searched for
test_question = "What does the fox do?"

# run the code to embed the text and question and calculate the similarity
request_test_text = SemanticEmbeddingRequest(prompt=Prompt.from_text(test_text), representation=SemanticRepresentation.Document)
request_test_question = SemanticEmbeddingRequest(prompt=Prompt.from_text(test_question), representation=SemanticRepresentation.Query)
embedding_test_text = client.semantic_embed(request_test_text, model="luminous-base").embedding
embedding_test_question = client.semantic_embed(request_test_question, model="luminous-base").embedding
similarity_test = 1 - spatial.distance.cosine(embedding_test_text, embedding_test_question)
print("The similarity between the test question and the test text is: " + str(similarity_test))

--------------------

## Step 3: Using Semantic Embeddings and Completions together to answer questions
In this section, we will use the search and completions endpoints together to answer questions.

With `semantic search`, we can find relevant information. With `completions`, we can generate text and solve tasks.

Our application logic is as follows:
1. We use `semantic search` to make information searchable.
2. We select the most similar document as background information.
3. We use `completions` to generate the answer to the question.

In [None]:
# Let's reuse the texts from the previous task
text_1 = "With our semantic_embed-endpoint you can create semantic embeddings for your text. This functionality can be used in a myriad of ways. For more information please check out our blog-post on Luminous-Explore, introducing the model behind the semantic_embed-endpoint. In order to effectively search through your own documents, it is important to ensure that they can be easily compared to each other. Our asymmetric embeddings are designed to help find the pieces of your documents that are most relevant to a query shorter than the documents in the database. Here we will use short queries and longer splits of law texts."
text_2 = "You can interact with a Luminous model by sending it a text. We call this a prompt. It will then produce text that continues your input and return it to you. This is what we call a completion. Generally speaking, our models attempt to find the best continuation for a given input. Practically, this means that the model first recognizes the style of the prompt and then attempts to continue it accordingly."

question = "How can I search through my documents with embeddings?"


# Creating the embeddings for the texts and the question
request_1 = SemanticEmbeddingRequest(prompt=Prompt.from_text(text_1), representation=SemanticRepresentation.Document)
request_2 = SemanticEmbeddingRequest(prompt=Prompt.from_text(text_2), representation=SemanticRepresentation.Document)
request_question = SemanticEmbeddingRequest(prompt=Prompt.from_text(question), representation=SemanticRepresentation.Query)
embedding_1 = client.semantic_embed(request_1, model="luminous-base").embedding
embedding_2 = client.semantic_embed(request_2, model="luminous-base").embedding
embedding_question = client.semantic_embed(request_question, model="luminous-base").embedding

### Step 3.1: Using a vector database to store embeddings
Instead of doing everything ourselves, we can use a vector database to store the embeddings for us. This makes it easier to search for information.

We will be using Qdrant as our vector database.
Qdrant is an open-source vector database that is easy to use and fast.
You can find more information about qdrant [here](https://qdrant.tech/).
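
To make concrete what a vector database does for us, here is a minimal in-memory sketch. `TinyVectorStore` is a hypothetical class written only for illustration and has nothing to do with Qdrant's implementation; a real vector database adds persistence, indexing, and fast approximate search on top of the same idea.

```python
import math


class TinyVectorStore:
    """Stores (vector, payload) pairs and returns the most similar ones."""

    def __init__(self):
        self._points = {}

    def upsert(self, point_id, vector, payload):
        # Insert a new point or overwrite an existing one with the same id
        self._points[point_id] = (list(vector), payload)

    def search(self, query_vector, limit=5):
        # Brute force: score every stored vector, then keep the best `limit`
        scored = [
            (self._cosine(query_vector, vector), payload)
            for vector, payload in self._points.values()
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:limit]

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


store = TinyVectorStore()
store.upsert(0, [1.0, 0.0], {"text": "about embeddings"})
store.upsert(1, [0.0, 1.0], {"text": "about completions"})

results = store.search([0.9, 0.1], limit=1)
print(results[0][1]["text"])  # about embeddings
```

The brute-force search compares the query with every stored vector, which is fine for a handful of documents but is exactly the part a dedicated vector database optimizes.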

In [None]:
# First we create a local Qdrant instance that stores its data in the "db" folder (no separate server needed)
from qdrant_client import QdrantClient
from qdrant_client.http.models import Distance, VectorParams, Batch

q_client = QdrantClient(path="db")

q_client.recreate_collection(
    collection_name="test_collection",
    vectors_config=VectorParams(size=128, distance=Distance.COSINE),
)

Now we need to store the documents in the vector database.

In [None]:
# Let's create embeddings for each of the texts and store them in a list
texts = [text_1, text_2]
embeddings = []
for text in texts:
    # embed the texts
    embeddings.append(client.semantic_embed(SemanticEmbeddingRequest(prompt=Prompt.from_text(text), representation=SemanticRepresentation.Document, compress_to_size=128), model="luminous-base").embedding)
    
    
# now we can upsert the data into Qdrant
ids = list(range(len(texts)))
payloads = [{"text": text} for text in texts]

q_client.upsert(
     collection_name="test_collection",
     points=Batch(
     ids=ids,
     payloads=payloads,
     vectors=embeddings
     )
)


### Step 3.2 Using semantic search to find relevant information

Now that we have stored our documents in the vector database, we can use semantic search to find relevant information.

For that, we just have to send the embedding of our question to the vector database, and it will return the most similar documents.

In [None]:
# embedding the question

# We embed the question as a Query, as it is a short text
request_question = SemanticEmbeddingRequest(
    prompt=Prompt.from_text(question), 
    representation=SemanticRepresentation.Query, 
    compress_to_size=128)

embedding_question = client.semantic_embed(request_question, model="luminous-base").embedding

search_result = q_client.search(
        collection_name="test_collection",
        query_vector=embedding_question,
        limit=5, # Parameter to control the number of results
    )

for result in search_result:
    print(f"Score: {result.score}, Text: {result.payload['text']}")
    
    
# Let's select the first result to answer the question
background_text = search_result[0].payload["text"]

### Step 3.3 Using completions to generate the answer

Now that we have found the most relevant document, we can use completions to generate the answer.

We will again use a control model, this time with a different instruction: the question itself, with the retrieved document as input.

TODO:
- Add additional documents to the vector database
- Add additional questions
- Try to modify the prompt to get better results

In [None]:
qa_prompt = f"""### Instruction:
{question}

### Input:
{background_text}

### Response:"""

# Let's send the prompt to the API and see what the model returns
request = CompletionRequest(
    prompt=Prompt.from_text(qa_prompt),
    maximum_tokens=100,
    temperature=0.0,
    stop_sequences=["\n"])

response = client.complete(request=request, model="luminous-supreme-control")

response_text = response.completions[0].completion

print(f"The model returned: `{response_text}`")

--------------------

## Step 4: Chaining multiple requests to solve complex tasks
Sometimes, we need to solve complex tasks. For that, we can chain multiple requests together.

Similar to humans, LLMs produce more robust results if they can solve a task in multiple steps, because they can focus on one subtask at a time and do not have to solve everything at once (end-to-end).

While solving tasks end-to-end may be very convenient, it is not always the best solution. This is because the model may not be able to focus on the most important parts of the task. This can lead to worse results.

It is also much more difficult to debug and understand what the model is doing. This is because the model is solving the task in one step and we cannot see what it is doing.

In this example, we will be summarizing a support request. However, instead of just doing this end-to-end, we will first find the relevant information and then summarize it.
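
The chaining idea itself does not depend on any particular model. The sketch below runs a list of named steps in order and keeps every intermediate result, which is what makes a chain easier to debug than a single end-to-end call. `run_chain` and the two toy steps are hypothetical stand-ins for illustration; in the notebook each step would be a completion request.

```python
def run_chain(initial_input, steps):
    """Run (name, function) steps in order and keep every intermediate result."""
    results = {"input": initial_input}
    for name, step in steps:
        # Each step sees the original input and all earlier results
        results[name] = step(results)
    return results


def extract_sender(results):
    # Stand-in for a completion request that extracts structured data
    last_line = results["input"].strip().splitlines()[-1]
    return {"name": last_line}


def summarize(results):
    # Stand-in for a second request that builds on the first step's output
    return f"{results['extract']['name']} needs help with an API error."


trace = run_chain(
    "Dear support team,\nI get an API error.\nBest,\nMarkus Schmitz",
    [("extract", extract_sender), ("summary", summarize)],
)
print(trace["extract"])
print(trace["summary"])
```

If the final result looks wrong, the `trace` dictionary shows which step produced the faulty intermediate value.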

### Step 4.1: Extracting the relevant information from the support request

First, let's extract some general information from the support request.
This will help us generate a more structured output.

In [None]:
# Here is the support request
support_request = """Dear support Team,
I was trying to use one of these cool language models from Aleph Alpha, but I am having trouble with the API.
I am getting the following error message:
```
AttributeError: 'CompletionResponse' object has no attribute 'text'
```
Could you please help me with this?

Best,
Markus Schmitz"""

We will write a control prompt for extracting the relevant information.
Please keep in mind that this is just an example.

You might need to adjust the prompt to your specific use case.

In [None]:
# First, let's use luminous to extract the most important information from the support request

data_extraction_prompt = f"""Extract the most important information from the support request as a JSON object.
These are:
- The "name" of the person
- The "task" that the person is trying to accomplish
- The "error_message"
###
Request: Hey, Timothy Barnes here. I was just spinning up a virtual machine, but that did not work.
I attached the error message below. Could you please help me with this?

```
2023-08-21T22:14:33.974 app[32874367f36228] ams [info] ValueError: (400, '{{"error":"Json deserialize error: invalid type: null, expected a boolean at line 1 column 150","code":"PAYLOAD_ERROR"}}')
```
JSON: {{
    "name": "Timothy Barnes",
    "task": "spinning up a virtual machine",
    "error_message": "ValueError: (400, '{{\\"error\\":\\"Json deserialize error: invalid type: null, expected a boolean at line 1 column 150\\",\\"code\\":\\"PAYLOAD_ERROR\\"}}')"
}}
###
Request: {support_request}
JSON:"""

# Let's send the prompt to the API and see what the model returns
request = CompletionRequest(
    prompt=Prompt.from_text(data_extraction_prompt),
    maximum_tokens=100,
    temperature=0.0,
    stop_sequences=["###"])
    
response = client.complete(request=request, model="luminous-extended")
extracted_data = response.completions[0].completion

# Let's parse the response as a JSON object to make it easier to work with
extracted_data = json.loads(extracted_data)

print(f"The model returned: \n{extracted_data}")
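
A model does not always return clean JSON, in which case `json.loads` raises an error. One possible way to make this step more forgiving is sketched below. `parse_json_completion` is a hypothetical helper, and cutting from the first `{` to the last `}` is a simple heuristic, not a complete solution.

```python
import json


def parse_json_completion(completion, required_keys=()):
    """Return the JSON object found in `completion`, or None if parsing fails."""
    # Heuristic: take everything from the first "{" to the last "}"
    start = completion.find("{")
    end = completion.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        data = json.loads(completion[start:end + 1])
    except json.JSONDecodeError:
        return None
    # Reject results that are not objects or that miss expected fields
    if not isinstance(data, dict) or any(key not in data for key in required_keys):
        return None
    return data


# Made-up completion text with surrounding whitespace
example_completion = ' {"name": "Markus Schmitz", "task": "use the API"}\n'
parsed = parse_json_completion(example_completion, required_keys=("name", "task"))
print(parsed)
```

Returning `None` instead of raising lets the calling code decide whether to retry the request, fall back to defaults, or skip the ticket.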

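### Step 4.2: Summarizing the support request

Next, we let a control model summarize the support request in one sentence.

In [None]: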
summary_prompt = f"""### Instruction:
Write a summary of the support request in one sentence.

### Input:
{support_request}

### Response:
Summary:""" # here we write summary to indicate that the model should write a summary

# Let's send the prompt to the API and see what the model returns
request = CompletionRequest(
    prompt=Prompt.from_text(summary_prompt),
    maximum_tokens=100,
    temperature=0.0,
    stop_sequences=[])

response = client.complete(request=request, model="luminous-extended-control")
summary = response.completions[0].completion

print(f"The model returned the summary: \n{summary}")

### Step 4.3: Putting it all together

Now that we have extracted the relevant information, and created a summary, we can put it all together.

In [None]:
# Creating a ticket using the extracted data and the summary

ticket = {
    "metadata": {
        "name": extracted_data["name"],
        "task": extracted_data["task"],
        "error_message": extracted_data["error_message"]
    },
    "summary": summary,
    "text": support_request
}
# saving the ticket as a json file
with open("ticket.json", "w") as f:
    json.dump(ticket, f)
    
ticket

## AtMan: Understanding the model's decisions
This section will show you how to use AtMan to understand the model's decisions.

With our `explain`-endpoint you can get an explanation of the model's output. In more detail, we return how much the log-probabilities of the already generated completion would change if we suppress individual parts (based on the granularity you chose) of a prompt. Please refer to our documentation (https://docs.aleph-alpha.com/) if you would like to know more about our explainability method in general.

In [None]:
prompt_text = """Answer the question based on the context.

Context: According to tradition, on April 21, 753 BC, Romulus and his twin brother Remus founded Rome in the place where they had been suckled as orphans by a she-wolf.

Q: In which month was Rome founded?

A:"""

params = {
    "prompt": Prompt.from_text(prompt_text),
    "maximum_tokens": 1,
}
request = CompletionRequest(**params)
response = client.complete(request=request, model="luminous-supreme")
completion = response.completions[0].completion

exp_req = ExplanationRequest(Prompt.from_text(prompt_text), completion, prompt_granularity="paragraph")
response_explain = client.explain(exp_req, model="luminous-supreme")

# Scores for the first (and only) prompt item of the first explanation
explanations = response_explain.explanations[0].items[0].scores

for item in explanations:
    start = item.start
    end = item.start + item.length
    print(f"""EXPLAINED TEXT: {prompt_text[start:end]}
Score: {np.round(item.score, decimals=3)}""")

As you can see from the example, the explanation helps us locate the relevant information that Luminous used.
Please keep in mind that control models in particular will assign very high explanation scores to the instructions. This is because they are trained to solve specific tasks, so they will always use the same parts of the instructions to solve the task.

This can be easily managed by only looking at the explainability of the input. This will give us a better understanding of what the model is doing.
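
One way to look only at the input is to keep the scores whose character span lies inside the `### Input:` section of the prompt. The sketch below uses a hypothetical `Score` tuple with the same `start`, `length`, and `score` attributes that the loop above reads; the scores themselves are made up for illustration.

```python
from typing import List, NamedTuple


class Score(NamedTuple):
    # Hypothetical stand-in with the attributes used in the notebook's loop
    start: int
    length: int
    score: float


def scores_within(scores: List[Score], span_start: int, span_end: int) -> List[Score]:
    """Keep only the scores whose text span lies fully inside [span_start, span_end)."""
    return [s for s in scores if s.start >= span_start and s.start + s.length <= span_end]


prompt = (
    "### Instruction:\nAnswer the question.\n\n"
    "### Input:\nRome was founded in April.\n\n"
    "### Response:"
)
input_start = prompt.index("### Input:")
input_end = prompt.index("### Response:")

# Made-up scores: one for the instruction paragraph, one for the input paragraph
scores = [
    Score(0, input_start, 0.9),
    Score(input_start, input_end - input_start, 0.4),
]

for item in scores_within(scores, input_start, input_end):
    print(repr(prompt[item.start:item.start + item.length]), item.score)
```

With paragraph granularity, the spans should line up with the prompt sections as long as the sections are separated by blank lines.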

---------------
## Conclusion
In this tutorial, we have seen how to use our API to generate text, search for information, and solve tasks.

We have also seen how to chain multiple requests together to solve complex tasks.

We hope that this tutorial was helpful to you. If you have any questions, please do not hesitate to ask us.