In [1]:
from aleph_alpha_client import Client, Prompt, CompletionRequest, CompletionResponse, SemanticEmbeddingRequest, SemanticEmbeddingResponse, SemanticRepresentation
from scipy import spatial
import numpy as np
import os
from dotenv import load_dotenv

In [2]:
load_dotenv()

client = Client(token=os.getenv("AA_TOKEN"))

### Step 1: Let's do a really simple QA prompt
In this step we will ask luminous a question from the manual. Let's see how it answers.
Write a completionrequest and make luminous answer it. Don't forget the stop sequences.
You can find the documentation for the completionrequest here: https://docs.aleph-alpha.com/docs/tasks/complete/

In [3]:
prompt = "Q: What is stop category 0?\nA:"

# TODO: Write the completion request
request = CompletionRequest(prompt=Prompt.from_text(prompt), maximum_tokens=32, stop_sequences=["\n"])

# TODO: Call the client to run the completionrequest
response = client.complete(request, model="luminous-base-control")
print(response.completions[0].completion)

 The world is not flat. It is a sphere.


#### Step 2: Let's do a QA prompt with a context
Obviously, we can't expect luminous to answer the question if we don't give it any context. Let's see how it answers when we give it some context.
Write the new prompt and make luminous answer it. Don't forget the stop sequences.

In [5]:
context = """The robot has protective and emergency stop functions (stop category 0 or 1, in accordance with IEC 60204-1).

| Stop category 0 | As defined in IEC 60204-1, stopping by immediate removal of power to the machine actuators. |  
|---|---|  
| Stop category 1 | As defined in IEC 60204-1, a controlled stop with power avail- able to the machine actuators to achieve the stop and then re- moval of power when the stop is achieved. |  """

# TODO write a prompt that makes luminous answer the question based on the context
prompt = f"""### Instructions: Answer the question based on the provided Context. If the answer is not in the text return "not answerable".
### Input: {context}
### Question: How is Stop Category 0 defined?
Short Answer:"""

request = CompletionRequest(prompt=Prompt.from_text(prompt), maximum_tokens=32, stop_sequences=["###"])

response = client.complete(request, model="luminous-base-control")
print(response.completions[0].completion)



 Stop Category 0 is defined as stopping by immediate removal of power to the machine actuators.


#### Step 3: Let's use our search we coded in the previous notebook to find the relevant context for the question and then use that context to answer the question
This step is a bit long, but it is a good exercise to see how we can use the search we coded in the previous notebook to find the relevant context for the question and then use that context to answer the question.

In [6]:
# Read the data in the data.md file
with open("data.md", "r") as f:
    data = f.read()
    
# Split the data into a list of texts
texts = data.split("#####")

print(f"data: {data[:100]}")
print(f"texts: {texts[10][:100]}")

data: SUMMARY

# Most OECD countries have developed and implemented national sustainable development strat
texts:  INDICATORS AND TARGETS

The development and incorporation of quantitative indicators can help remov


In [8]:
question = "What are special commissions and councils that stakeholders can have?"

#TODO: embed the contexts

embedded_texts = []
for text in texts:
    embedded_texts.append(client.semantic_embed(SemanticEmbeddingRequest(prompt=Prompt.from_text(text), representation=SemanticRepresentation.Document, compress_to_size=128), model="luminous-base").embedding)

# TODO: embed the question
embedded_question = client.semantic_embed(SemanticEmbeddingRequest(prompt=Prompt.from_text(question), representation=SemanticRepresentation.Query, compress_to_size=128), model="luminous-base").embedding

# TODO: calculate the cosine similarity between the question and the contexts
similarities = []
for embedded_text in embedded_texts:
    similarities.append(1 - spatial.distance.cosine(embedded_text, embedded_question))

# Get the text in the most similar context
selected_context = texts[np.argmax(similarities)]

# TODO: write a prompt that makes luminous answer the question based on the context
prompt = f"""### Instructions: Answer the question briefly based on the provided Input. If the answer is not in the text return "not answerable".
### Input: {selected_context}
### Question: {question}
### Response:"""

# TODO: write the CompletionRequest
request = CompletionRequest(prompt=Prompt.from_text(prompt), maximum_tokens=64, stop_sequences=["###"])

response = client.complete(request, model="luminous-extended-control")

print(response.completions[0].completion)
print(selected_context)
print(similarities)

 Some countries include stakeholders in special commissions and councils which provide advice to but are separate from the government bodies which implement the strategy.
 STAKEHOLDER PARTICIPATION

Active stakeholder participation *(e.g.,* business, trade unions, non- governmental organisations, indigenous peoples) in the development and implementation of national strategies for sustainable development should be an inherent feature. Sustainable development involves trade-offs among economic, social and ecological objectives which cannot be determined by governments alone. These value judgments require participatory approaches to sustainable development which engage the public through effective communication. However, the extent to which stakeholders are involved in policy processes reflects national institutional settings and preferences. Structures vary widely across OECD countries in terms of the status, timing and breadth of involvement of stakeholders.

Several countries have impl

### Let's try this with a more complex setup and a vector database

In [9]:
# First we spin up the Qdrant server
from qdrant_client import QdrantClient
from qdrant_client.http.models import Distance, VectorParams, Batch

q_client = QdrantClient(path="db")

q_client.recreate_collection(
    collection_name="test_collection",
    vectors_config=VectorParams(size=128, distance=Distance.COSINE),
)



True

In [10]:
# Let's create embeddings for each of the texts and store them in a list
embeddings = []
for text in texts:
    # TODO: embed the texts
    embeddings.append(client.semantic_embed(SemanticEmbeddingRequest(prompt=Prompt.from_text(text), representation=SemanticRepresentation.Document, compress_to_size=128), model="luminous-base").embedding)

In [11]:
# now we can upsert the data into Qdrant
ids = list(range(len(texts)))
payloads = [{"text": text} for text in texts]

q_client.upsert(
     collection_name="test_collection",
     points=Batch(
     ids=ids,
     payloads=payloads,
     vectors=embeddings
     )
)

UpdateResult(operation_id=0, status=<UpdateStatus.COMPLETED: 'completed'>)

In [14]:

# TODO write a function that takes a question and returns an answer by searching in the Qdrant database
def search_and_answer(question):
    # TODO First we embed the question
    embedded_question = client.semantic_embed(SemanticEmbeddingRequest(prompt=Prompt.from_text(question), representation=SemanticRepresentation.Query, compress_to_size=128), model="luminous-base").embedding
    
    # Then we search for the most similar text
    search_result = q_client.search(
        collection_name="test_collection",
        query_vector=embedded_question,
        filter=None,
        top=1
    )
    
    print(search_result)
    
    # Then we get the text from the search result
    text = search_result[0].payload["text"]
    
    # TODO Finally we ask luminous to answer the question based on the text
    prompt = f"""### Instructions: Answer the question briefly based on the provided Input. If the answer is not in the text return "not answerable".
    
    ### Input: {text}
    
    ### Question: {question}
    
    ### Response:"""
    
    # TODO write the CompletionRequest
    request = CompletionRequest(prompt=Prompt.from_text(prompt), maximum_tokens=64, stop_sequences=["###"])
    
    # TODO get the response from luminous
    response = client.complete(request, model="luminous-extended-control")
    
    return response.completions[0].completion

In [15]:
search_and_answer("What is the capital of France?")

[ScoredPoint(id=6, version=0, score=0.11298372325754442, payload={'text': ' ANALYSIS AND ASSESSMENTS\n\nSound analysis is important in helping to identify the underlying trade- offs between economic, environmental and social objectives in priority- setting and policymaking for sustainable development. Such assessments seek to develop information on changing economic, environmental and social conditions, pressures and responses, and their correlations with strategy objectives and indicators. These can build on existing tools including environmental assessments, cost-benefit analyses, accounting frameworks, etc. However, with the exception of a few countries, most national strategies lack provisions for systematically assessing the costs and benefits of alternative actions and informing trade-offs across the full range of sustainable development issues.\n\nThe most commonly used tools come from the environmental policy field. Environmental impact assessments (EIA) attempt to gauge the po

' Paris. It is the capital of France.'

In [16]:
search_and_answer("What are special commissions and councils that stakeholders can have?")

[ScoredPoint(id=9, version=0, score=0.5451891976132429, payload={'text': ' STAKEHOLDER PARTICIPATION\n\nActive stakeholder participation *(e.g.,* business, trade unions, non- governmental organisations, indigenous peoples) in the development and implementation of national strategies for sustainable development should be an inherent feature. Sustainable development involves trade-offs among economic, social and ecological objectives which cannot be determined by governments alone. These value judgments require participatory approaches to sustainable development which engage the public through effective communication. However, the extent to which stakeholders are involved in policy processes reflects national institutional settings and preferences. Structures vary widely across OECD countries in terms of the status, timing and breadth of involvement of stakeholders.\n\nSeveral countries have implemented *ad hoc* participation processes, where stakeholders were consulted in the developmen

' Some countries include stakeholders in special commissions and councils which provide advice to but are separate from the government bodies which implement the strategy. These include the Federal Sustainable Development Council (CFDD) in Belgium, the National Council for Sustainable Development (CNDD) in France, the Council on Sustainable Development (RNE) in Germany'