* https://huggingface.co/learn/cookbook/en/rag_evaluation
* https://sap-my.sharepoint.com/:x:/r/personal/sabine_loss_sap_com/_layouts/15/Doc.aspx?sourcedoc=%7B2F78859D-06EF-413D-9E7A-250936C7B556%7D&file=GoldenDataSet_RAG.xlsx&action=default&mobileredirect=true

* use Ollama models and Hugging Face embeddings (need 2 LLMs: a generator and a critic)
* use help docs from the first 10 rows of the golden dataset: try scraping them with BeautifulSoup, otherwise copy-paste the text into a .txt file
* set up the generator and critic LLMs according to the tutorial
* generate Q-A pairs
* evaluate the Q-A pairs and keep only those with good scores
* human evaluation

# setup llms, embedding_model, and process pdfs

In [4]:
# setup ollama model

from langchain_ollama import ChatOllama

llm_model_name = "llama3.1"

generator_llm = ChatOllama(
    model=llm_model_name,
    temperature=0 # increase temp for more creative answers
) 

critic_llm = ChatOllama(
    model=llm_model_name,
    temperature=0 # increase temp for more creative answers
) 

# test
response = generator_llm.invoke("what is pythagoras theorem")
print(response)

response = critic_llm.invoke("what is pythagoras theorem")
print(response)



content="Pythagoras' Theorem, also known as the Pythagorean Theorem, is a fundamental concept in geometry that describes the relationship between the lengths of the sides of a right-angled triangle. It states:\n\n**a² + b² = c²**\n\nwhere:\n\n* **a** and **b** are the lengths of the two sides (called legs) that form the right angle.\n* **c** is the length of the hypotenuse (the side opposite the right angle).\n\nIn other words, if you square the lengths of the two shorter sides of a right-angled triangle and add them together, the result is equal to the square of the length of the longest side (the hypotenuse).\n\nHere's an example:\n\nSuppose we have a right-angled triangle with one leg that's 3 inches long and another leg that's 4 inches long. Using Pythagoras' Theorem, we can calculate the length of the hypotenuse as follows:\n\n**a² + b² = c²**\n**(3)² + (4)² = c²**\n**9 + 16 = c²**\n**25 = c²**\n\nTo find **c**, we take the square root of both sides:\n\n**c = √25**\n**c = 5 inches

In [14]:
from langchain_community.embeddings import HuggingFaceEmbeddings

embedding_model_name = "sentence-transformers/all-mpnet-base-v2"
model_kwargs = {'device': 'cpu'}
encode_kwargs = {'normalize_embeddings': False}

hf_embedding_model = HuggingFaceEmbeddings(
    model_name=embedding_model_name,
    model_kwargs=model_kwargs,
    encode_kwargs=encode_kwargs
)



In [25]:
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from tqdm import tqdm

def load_pdfs(file_paths):
    """
    Load each PDF in file_paths (every path must end with .pdf).
    PyPDFLoader splits a PDF by page: each page becomes one Document object.
    Note that the page split is not perfect; the recorded page number may be off by 1-2 pages.

    Returns a dict mapping file_path -> list of Document objects.
    """
    documents_dict = {}   
    for f in tqdm(file_paths):
        loader = PyPDFLoader(file_path = f)
        documents = loader.load()
        documents_dict[f] = documents
    return documents_dict


def chunk_list_of_documents(documents):
    """
    input a list of documents as Document objects

    output a list of chunks as Document objects
    """
    text_splitter = RecursiveCharacterTextSplitter(
        chunk_size = 500,
        chunk_overlap = 100, # using 20% is a good start
        length_function=len,
        is_separator_regex=False,
        add_start_index=True
    )

    chunks = text_splitter.split_documents(documents)    
    return chunks


In [29]:
file = "product-allocation.pdf"

documents_dict = load_pdfs([file])

100%|██████████| 1/1 [00:00<00:00,  4.16it/s]


In [35]:
documents_dict.keys() 
# values are the Document objects containing content of each page of the document, ~ 1 page per document object

dict_keys(['product-allocation.pdf'])

In [38]:
chunks = chunk_list_of_documents(documents=documents_dict['product-allocation.pdf'])

In [40]:
len(chunks)

31

In [42]:
chunks[0]



# setup prompt and llm for generator-llm

Now let’s generate our QA couples. For this example, we generate only 10 QA couples and will load the rest from the Hub.

But for your specific knowledge base, given that you want at least ~100 test samples, and that the critique agents will later filter out around half of them, you should generate many more: upwards of 200 samples.
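
The sizing rule above is plain arithmetic. A minimal sketch, assuming the critique step keeps roughly half of the generated pairs (`keep_rate` is an estimate taken from the tutorial text, not something measured in this notebook, and `n_generations_needed` is a hypothetical helper):

```python
import math

def n_generations_needed(target_samples: int, keep_rate: float) -> int:
    """Number of QA pairs to generate so that about `target_samples` survive filtering."""
    if not 0 < keep_rate <= 1:
        raise ValueError("keep_rate must be in (0, 1]")
    return math.ceil(target_samples / keep_rate)

# ~100 samples wanted, about half expected to be filtered out
print(n_generations_needed(100, 0.5))
```

If the observed keep rate turns out lower than one half, the same helper gives the larger number to generate.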

In [78]:
# sample call for langchain_ollama

# Define the prompt template
sample_prompt = """
write a short story about this character. {name} has trait {trait} and lives in {place}.
keep to 40 words only.
"""

# Define the trait and place for the character
name = "cheeky_fella"
trait = "bravery"
place = "a small village in the mountains"

# Format the prompt with the trait and place
formatted_prompt = sample_prompt.format(name=name,trait=trait, place=place)
print(formatted_prompt)
print()

# Call the LLM with the formatted prompt
resp = generator_llm.invoke(
    input=formatted_prompt  # Pass the formatted prompt to the LLM
)

print(resp)


write a short story about this character. cheeky_fella has trait bravery and lives in a small village in the mountains.
keep to 40 words only.


content='In the mountain village of Brindlemark, Cheeky Fella stood tall, his bright smile a beacon of courage. When a fierce storm threatened to destroy the village, he rallied the townsfolk, leading them to safety with bravery and wit, earning their eternal gratitude.' response_metadata={'model': 'llama3.1', 'created_at': '2024-09-23T07:54:31.951553Z', 'message': {'role': 'assistant', 'content': ''}, 'done_reason': 'stop', 'done': True, 'total_duration': 2295223000, 'load_duration': 29363667, 'prompt_eval_count': 42, 'prompt_eval_duration': 185780000, 'eval_count': 57, 'eval_duration': 2078914000} id='run-b5f4155b-6237-4863-aa58-77d84e5f0960-0' usage_metadata={'input_tokens': 42, 'output_tokens': 57, 'total_tokens': 99}


In [82]:
resp

AIMessage(content='In the mountain village of Brindlemark, Cheeky Fella stood tall, his bright smile a beacon of courage. When a fierce storm threatened to destroy the village, he rallied the townsfolk, leading them to safety with bravery and wit, earning their eternal gratitude.', response_metadata={'model': 'llama3.1', 'created_at': '2024-09-23T07:54:31.951553Z', 'message': {'role': 'assistant', 'content': ''}, 'done_reason': 'stop', 'done': True, 'total_duration': 2295223000, 'load_duration': 29363667, 'prompt_eval_count': 42, 'prompt_eval_duration': 185780000, 'eval_count': 57, 'eval_duration': 2078914000}, id='run-b5f4155b-6237-4863-aa58-77d84e5f0960-0', usage_metadata={'input_tokens': 42, 'output_tokens': 57, 'total_tokens': 99})

In [84]:
resp.content

'In the mountain village of Brindlemark, Cheeky Fella stood tall, his bright smile a beacon of courage. When a fierce storm threatened to destroy the village, he rallied the townsfolk, leading them to safety with bravery and wit, earning their eternal gratitude.'

In [88]:
QA_generation_prompt = """
Your task is to write a factoid question and an answer given a context.
Your factoid question should be answerable with a specific, concise piece of factual information from the context.
Your factoid question should be formulated in the same style as questions users could ask in a search engine.
This means that your factoid question MUST NOT mention something like "according to the passage" or "context".
Keep your answer under 300 words.
Provide your answer as follows:

Output:::
Factoid question: (your factoid question)
Answer: (your answer to the factoid question)

Now here is the context.

Context: {context}\n
Output:::"""

In [96]:
# create function to call llm

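def get_generated_qa(llm, prompt, context):
    """
    prompt must contain the placeholder {context}
    context can be a string or a Document (only its page_content is used)
    """
    # use the raw text so the Document repr (metadata included) does not leak into the prompt
    context_text = getattr(context, "page_content", context)

    # add the context and format the prompt
    formatted_prompt = prompt.format(context=context_text)

    # call the LLM with the formatted prompt
    resp = llm.invoke(input=formatted_prompt)

    return resp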

In [98]:
chunks[0]



In [113]:
import random

N_GENERATIONS = 10  

print(f"Generating {N_GENERATIONS} QA couples...")

outputs = []
for sampled_context in tqdm(random.sample(chunks, N_GENERATIONS)):

    # get QA couple
    qa_couple = get_generated_qa(generator_llm,QA_generation_prompt,sampled_context)

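    # parse the reply and skip the pair if the answer is too long
    # (note: this checks 300 characters, while the prompt asks for under 300 words)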
    try:
        question = qa_couple.content.split("Factoid question: ")[-1].split("Answer: ")[0]
        answer = qa_couple.content.split("Answer: ")[-1]
        assert len(answer) < 300, "Answer is too long"
        outputs.append(
            {
                "context": sampled_context.page_content,
                "question": question,
                "answer": answer,
                "source_doc": sampled_context.metadata["source"],
            }
        )
    except Exception:  # failed length check or missing metadata: drop this pair
        continue

Generating 10 QA couples...


100%|██████████| 10/10 [00:16<00:00,  1.67s/it]


In [132]:
len(outputs)

9
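
The split-based parsing in the loop above silently accepts replies that lack the expected labels, because `str.split` just returns the whole text when the delimiter is missing. A minimal sketch of a stricter parser, assuming the reply follows the `Factoid question:` / `Answer:` (or `Question:` / `Answer:`) layout requested in the prompt. `parse_qa` and `QA_PATTERN` are hypothetical names, not part of the tutorial:

```python
import re

# Accepts both "Factoid question:" and "Question:" labels, case-insensitively.
QA_PATTERN = re.compile(
    r"(?:factoid\s+)?question:\s*(?P<question>.+?)\s*answer:\s*(?P<answer>.+)",
    flags=re.IGNORECASE | re.DOTALL,
)

def parse_qa(text: str):
    """Return (question, answer), or None if the expected labels are missing."""
    match = QA_PATTERN.search(text)
    if match is None:
        return None
    question = match.group("question").strip()
    answer = match.group("answer").strip()
    if not question or not answer:
        return None
    return question, answer

print(parse_qa("Factoid question: What is the technical name?\nAnswer: A_ProdAllocationSequence"))
print(parse_qa("a reply without any labels"))
```

Unlike the `split(...)[-1]` approach, this takes the first labelled question in the reply and returns `None` for malformed replies, so the caller can count and skip them explicitly.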

In [117]:
import pandas as pd

qna_df = pd.DataFrame(outputs)
qna_df

|   | context | question | answer | source_doc |
|---|---|---|---|---|
| 0 | allocation sequence and a given product alloca... | What is the technical name of a product alloca... | A_ProdAllocationSequence | product-allocation.pdf |
| 1 | 9/23/2024\n2 This is custom documentation. For... | What is the technical name of the Product Allo... | API_PRODUCT_ALLOC_SEQUENCE_SRV | product-allocation.pdf |
| 2 | 9/23/2024\n1 This is custom documentation. For... | What is the date when this documentation was g... | September 23, 2024. | product-allocation.pdf |
| 3 | Accelerator Hub.\nService Structure\nEntities\... | What is the necessity of the "Product Allocati... | Mandatory. | product-allocation.pdf |
| 4 | a large volume of data to be maintained in the... | What is enabled by this service?\n | The service enables reading of header data for... | product-allocation.pdf |
| 5 | CRUD\nCreate Read Update Delete\nX\nProperties... | What operations are supported for a Product Al... | Read the description of a specific product all... | product-allocation.pdf |
| 6 | Read the description of a specific product allo... | What is the technical name of this entity?\n | A_ProdAllocSqncAssgmt | product-allocation.pdf |
| 7 | POST <host>/sap/opu/odata/SAP/API_PRODUCT_ALLO... | What is the prefix used to enclose a UUID valu... | guid | product-allocation.pdf |
| 8 | ValidityStartUTCDateTime Validity start time\n... | What are the supported operations for a produc... | Read product-location assignments of a specifi... | product-allocation.pdf |


In [130]:
for i,row in qna_df.iterrows():
    print(i)
    print(row['question'])
    print(row['context'])
    print("ANSWER: ",row['answer'])
    print()
    print()

0
What is the technical name of a product allocation sequence?

allocation sequence and a given product allocation sequence UUID.
Related Events
Product Allocation Sequence Events
Additional Information
Product Allocation Sequence
Query String Options
Each of the following service nodes allows standard OData query string parsing with system query options such as $orderby,
$skip, $filter, $top, $select, $format, &expand and $inlinecount.
Product Allocation Sequence
Technical name: A_ProdAllocationSequence
ANSWER:  A_ProdAllocationSequence


1
What is the technical name of the Product Allocation Sequence API?

9/23/2024
2 This is custom documentation. For more information, please visit the SAP Help PortalProduct Allocation Sequence
Technical name: API_PRODUCT_ALLOC_SEQUENCE_SRV
You can use Product Allocation (PAL) to determine the availability of requested products by checking against sales data as well
as data for restricted resources consumed along the value chain of the requested prod

good q-a-c: 
bad q-a-c: 2,3,4,5,6
* questions should not refer to "this", since there is no conversation history when evaluating against the golden dataset

try another 10

we can see that the questions are all simple "What" questions

to do
<!-- * validate: check if the qna generation is good -->
* see if providing examples from the human dataset can improve the LLM generation
    * providing them as conversation history is probably not necessary; just put all the examples in one long prompt
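
A minimal sketch of the "all examples in one long prompt" idea from the to-do list above. `build_few_shot_prompt` and `escape_braces` are hypothetical helpers, and the example row is made up for illustration; real rows would come from the human-written golden dataset. Braces in the examples are doubled because the template is later filled with `str.format`, which would otherwise treat them as placeholders:

```python
def escape_braces(text: str) -> str:
    """Double literal braces so a later str.format() call leaves them untouched."""
    return text.replace("{", "{{").replace("}", "}}")

def build_few_shot_prompt(instructions: str, examples: list) -> str:
    """Assemble one long prompt template that still contains a {context} placeholder."""
    parts = [escape_braces(instructions.strip()), "Refer to some examples here:"]
    for i, ex in enumerate(examples, start=1):
        parts.append(
            f"### Example {i}:\n"
            f"Context: {escape_braces(ex['context'])}\n"
            "Output:::\n"
            f"Question: {escape_braces(ex['question'])}\n"
            f"Answer: {escape_braces(ex['answer'])}"
        )
    parts.append("Now here is the context.\nContext: {context}\nOutput:::")
    return "\n\n".join(parts)

# Hypothetical example row (not taken from the golden dataset).
examples = [
    {
        "context": "PAL allocates materials in short supply, e.g. {region: EMEA}.",
        "question": "What does product allocation mean?",
        "answer": "It allocates materials in short supply to regions or customers.",
    }
]
template = build_few_shot_prompt("Write a question and an answer given a context.", examples)
prompt = template.format(context="Technical name: A_ProdAllocationSequence")
print(prompt)
```

The resulting template can be passed to `get_generated_qa` in place of `QA_generation_prompt`, since it keeps the single `{context}` placeholder.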

In [136]:
import random

N_GENERATIONS = 10  

print(f"Generating {N_GENERATIONS} QA couples...")

outputs = []
for sampled_context in tqdm(random.sample(chunks, N_GENERATIONS)):

    # get QA couple
    qa_couple = get_generated_qa(generator_llm,QA_generation_prompt,sampled_context)

    # condition check if answer is too long
    # try:
    question = qa_couple.content.split("Factoid question: ")[-1].split("Answer: ")[0]
    answer = qa_couple.content.split("Answer: ")[-1]
    assert len(answer) < 300, "Answer is too long"
    outputs.append(
        {
            "context": sampled_context.page_content,
            "question": question,
            "answer": answer,
            "source_doc": sampled_context.metadata["source"],
        }
    )
    # except:
        # continue

qna_df2 = pd.DataFrame(outputs)
qna_df2

Generating 10 QA couples...


100%|██████████| 10/10 [00:15<00:00,  1.57s/it]


|   | context | question | answer | source_doc |
|---|---|---|---|---|
| 0 | Product Allocation Sequence\nTechnical name: A... | What is the technical name of the product allo... | A_ProdAllocationSequence | product-allocation.pdf |
| 1 | Read the description of a specific product allo... | What is the technical name of this entity?\n | A_ProdAllocSqncAssgmt | product-allocation.pdf |
| 2 | 9/23/2024\n2 This is custom documentation. For... | What is the technical name of the Product Allo... | API_PRODUCT_ALLOC_SEQUENCE_SRV | product-allocation.pdf |
| 3 | Mandatory for Update\nProductAllocationSequenc... | What is the technical name for the product all... | ProductAllocationSequence | product-allocation.pdf |
| 4 | existing assignments for a product allocation ... | What external planning systems does the produc... | SAP Merchandise and Assortment Planning (AMR),... | product-allocation.pdf |
| 5 | product allocation consumption unit as well as... | What is the purpose of a product allocation co... | The product allocation consumption unit is use... | product-allocation.pdf |
| 6 | allocation sequence and a given product alloca... | What is the technical name of a product alloca... | A_ProdAllocationSequence | product-allocation.pdf |
| 7 | a large volume of data to be maintained in the... | What is enabled by this service?\n | The service enables reading of header data for... | product-allocation.pdf |
| 8 | Product Allocation Sequence Description\nTechn... | What is the technical name of this entity?\n | A_ProdAllocSequenceT | product-allocation.pdf |
| 9 | 9/23/2024\n3 This is custom documentation. For... | What is the date mentioned in the context?\n | 9/23/2024 | product-allocation.pdf |


In [138]:
for i,row in qna_df2.iterrows():
    print(i)
    print(row['question'])
    print(row['context'])
    print("ANSWER: ",row['answer'])
    print()
    print()

0
What is the technical name of the product allocation sequence entity?

Product Allocation Sequence
Technical name: A_ProdAllocationSequence
This entity selects existing data for a given product allocation sequence.
In addition to the name of the product allocation sequence, this service node selects information about the product allocation
sequence UUID (which is required for the entity Product Allocation Sequence Assignment (A_ProdAllocSqncAssgmt)), the
ANSWER:  A_ProdAllocationSequence


1
What is the technical name of this entity?

Read the description of a speci c product allocation sequence
Read all sequence descriptions
Read the sequence description in a speci c language
Read product-location assignments
Read a speci c product allocation sequence in a speci c language
Product Allocation Sequence Assignment
Technical name: A_ProdAllocSqncAssgmt
This entity enables you to read, update, and insert product-location assignments for a given product allocation sequence and a
given pro

Notes for editing the prompt: need to ensure that
* the LLM doesn't only generate "What" questions
* the LLM doesn't use "this" in the questions
* the LLM learns from examples
* the LLM doesn't generate the same question twice -> keep a list of questions, use something like a BERT model to score the similarity between two questions, and reject if score > similarity_threshold
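
A rough sketch of the duplicate check from the last point. To stay runnable without downloading a model, it uses a toy bag-of-words vector as a stand-in for BERT-style embeddings; the function names and the 0.8 threshold are assumptions for illustration only. With dense vectors (for example from `hf_embedding_model.embed_query`), `cosine` would need to operate on lists of floats instead of `Counter` objects:

```python
import math
import re
from collections import Counter

def bow_embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts. A stand-in for a sentence-embedding model."""
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def is_near_duplicate(question, seen_questions, threshold=0.8, embed=bow_embed):
    """True if `question` is at least `threshold`-similar to any earlier question."""
    candidate = embed(question)
    return any(cosine(candidate, embed(seen)) >= threshold for seen in seen_questions)

seen = ["What is the technical name of this entity?"]
print(is_near_duplicate("What is the technical name of this entity?\n", seen))
print(is_near_duplicate("What is the HTTP method for the update operation?", seen))
```

Word overlap is a crude proxy: it will not catch paraphrases that share few words, which is the case where a real embedding model should help.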

In [142]:
QA_generation_prompt = """
Your task is to write a factoid question and an answer given a context.
Your factoid question should be answerable with a specific, concise piece of factual information from the context.
Your factoid question should be formulated in the same style as questions users could ask in a search engine.
This means that your factoid question MUST NOT mention something like "according to the passage" or "context".
Keep your answer under 300 words.
Provide your answer as follows:

Output:::
Factoid question: (your factoid question)
Answer: (your answer to the factoid question)

Now here is the context.

Context: {context}\n
Output:::"""

In [162]:
QA_generation_prompt2 = """
Your task is to write a question and an answer given a context.
Your question should be answerable with a specific, concise piece of factual information from the context.
Your question should be formulated in the same style as questions users could ask in a search engine.

Rules to follow when formulating the question and answer:
1. Ensure that the question can be answered entirely from the information present in the contexts.
2. Do not frame questions that contains more than 20 words. Use abbreviation wherever possible.
3. Make sure the question is clear and unambiguous.
4. Do not use phrases like 'based on the provided context','according to the context',etc.
5. Keep your answer under 300 words.

Output:::
Question: (your question)
Answer: (your answer to the question)

Now here is the context.

Context: {context}\n
Output:::"""
#     instruction="""""",
#     examples=[
#         {
#             "question": "What is the capital of France?",
#             "context": "France is a country in Western Europe. It has several cities, including Paris, Lyon, and Marseille. Paris is not only known for its cultural landmarks like the Eiffel Tower and the Louvre Museum but also as the administrative center.",
#             "output": "Linking the Eiffel Tower and administrative center, which city stands as both?",
#         },
#         {
#             "question": "What does the append() method do in Python?",
#             "context": "In Python, lists are used to store multiple items in a single variable. Lists are one of 4 built-in data types used to store collections of data. The append() method adds a single item to the end of a list.",
#             "output": "If a list represents a variable collection, what method extends it by one item?",
#         },
#     ],
#     input_keys=["question", "context"],
#     output_key="output",
#     output_type="str",
#     language="english",
# )
# # Complicate the given question by rewriting question into a multi-hop reasoning question based on the provided context.
# # Answering the question should require the reader to make multiple logical connections or inferences using the information available in given context.

In [164]:
import random

N_GENERATIONS = 10  

print(f"Generating {N_GENERATIONS} QA couples...")

outputs = []
for sampled_context in tqdm(random.sample(chunks, N_GENERATIONS)):

    # get QA couple
    qa_couple = get_generated_qa(generator_llm,QA_generation_prompt2,sampled_context)

    # condition check if answer is too long
    # try:
    question = qa_couple.content.split("Question: ")[-1].split("Answer: ")[0]  # prompt2 labels the question "Question: ", not "Factoid question: "
    answer = qa_couple.content.split("Answer: ")[-1]
    # assert len(answer) < 300, "Answer is too long"
    outputs.append(
        {
            "context": sampled_context.page_content,
            "question": question,
            "answer": answer,
            "source_doc": sampled_context.metadata["source"],
        }
    )
    # except:
        # continue

qna_df = pd.DataFrame(outputs)

for i,row in qna_df.iterrows():
    print(i)
    print(row['question'])
    print(row['context'])
    print("ANSWER: ",row['answer'])
    print()
    print()

Generating 10 QA couples...


100%|██████████| 10/10 [00:19<00:00,  1.99s/it]

0
What is the possible way to delete product-location assignments for a product allocation sequence?


given product allocation sequence UUID. Note
To display more information about the properties on the SAP Business Accelerator Hub, open one of the entity's operations
and select Model.
 Note
Deletion of product-location assignments for a product allocation sequence and a product allocation sequence UUID is only
possible via service action DeleteSequenceAssignment. For more information, see Delete Product-Location Assignment.
ANSWER:  Via service action DeleteSequenceAssignment.


1
What is the purpose of the product allocation sequence in this context?


a large volume of data to be maintained in the product allocation object and a high number of products/materials for
assignment in the product allocation sequence to which the product allocation object is linked.
This service enables you to read the header data for a speci c product allocation sequence (including its description) as




In [235]:
QA_generation_prompt = """
Your task is to write a question and an answer given a context.
Your question should be answerable with a specific, concise piece of factual information from the context.
Your question should be formulated in the same style as questions users could ask in a search engine.
Your question can be closed, open, divergent, evaluation, inference, comparison, application or problem-solving question types.

Rules to follow when formulating the question and answer:
1. Ensure that the question can be answered entirely from the information present in the contexts.
2. Do not frame questions that contains more than 20 words. Use abbreviation wherever possible.
3. Make sure the question is clear and unambiguous.

4. **Strictly avoid** phrases like 'based on the provided context', 'according to the context', 'in the context', or 'in the given context'. **These phrases must not be used under any circumstances.**
5. Keep your answer under 300 words.
6. Do not use 'this' in the question as the question must be standalone and 'this' would be ambiguous.

Output:::
Question: (your question)
Answer: (your answer to the question)

Now here is the context.
Context: {context}\n
Output:::"""

# 4. Do not use phrases like 'based on the provided context','according to the context','in the context','in the given context' due to ambiguity.

In [237]:
import random

N_GENERATIONS = 10

print(f"Generating {N_GENERATIONS} QA couples...")

outputs = []
for sampled_context in tqdm(random.sample(chunks, N_GENERATIONS)):

    # get QA couple
    qa_couple = get_generated_qa(generator_llm,QA_generation_prompt,sampled_context)

    # condition check if answer is too long
    # try:
    question = qa_couple.content.split("Question: ")[-1].split("Answer: ")[0]
    answer = qa_couple.content.split("Answer: ")[-1]
    # assert len(answer) < 300, "Answer is too long"
    outputs.append(
        {
            "context": sampled_context.page_content,
            "question": question,
            "answer": answer,
            "source_doc": sampled_context.metadata["source"],
        }
    )
    # except:
        # continue
    print("QUESTION: ", question)
    print("ANSWER: ", answer)

qna_df = pd.DataFrame(outputs)

Generating 10 QA couples...


 10%|█         | 1/10 [00:02<00:23,  2.62s/it]

QUESTION:  What are the supported operations for Product Allocation Sequence?

ANSWER:  Read product-location assignments of a specific sequence, Create Product-Location Assignment for a sequence, and Read all product-location assignments of all product allocation sequences.


 20%|██        | 2/10 [00:04<00:16,  2.11s/it]

QUESTION:  What is the technical name of the entity that enables reading, updating, and inserting product-location assignments?


ANSWER:  A_ProdAllocSqncAssgmt.


 30%|███       | 3/10 [00:05<00:11,  1.65s/it]

QUESTION:  What is the technical name for the product allocation sequence?

ANSWER:  ProductAllocationSequence


 40%|████      | 4/10 [00:07<00:09,  1.61s/it]

QUESTION:  What are the supported operations for a Product Allocation Sequence?

ANSWER:  Read the description of a specific product allocation sequence and Read all sequence descriptions.


 50%|█████     | 5/10 [00:08<00:07,  1.46s/it]

QUESTION:  What is the date mentioned in the context?

ANSWER:  9/23/2024


 60%|██████    | 6/10 [00:09<00:05,  1.36s/it]

QUESTION:  What is the date mentioned in the context?

ANSWER:  9/23/2024


 70%|███████   | 7/10 [00:10<00:03,  1.29s/it]

QUESTION:  What is the HTTP method for the Update Product-Location Assignment operation?

ANSWER:  PATCH


 80%|████████  | 8/10 [00:11<00:02,  1.24s/it]

QUESTION:  What is the purpose of ETags in header entity type?


ANSWER:  Change operations.


 90%|█████████ | 9/10 [00:12<00:01,  1.21s/it]

QUESTION:  What is the necessity of the "Product Allocation Sequence" entity?

ANSWER:  Mandatory.


100%|██████████| 10/10 [00:15<00:00,  1.53s/it]

QUESTION:  What is the URL endpoint for deleting a sequence assignment in the API PRODUCT_ALLOC_SEQUENCE_SRV?


ANSWER:  <host>/sap/opu/odata/SAP/API_PRODUCT_ALLOC_SEQUENCE_SRV/DeleteSequenceAssignment





The LLM still uses 'in the context' despite the emphasis in the prompt
- can handle this in a post-processing step
- store the list of generated questions, to check both for these keywords and for semantically similar questions
- use a while loop to collect N accepted pairs instead of just for _ in range(N)
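
A minimal sketch of that post-processing loop. `generate_fn` is a stand-in for the LLM call plus parsing (it must return a `(question, answer)` tuple), and `collect_qa_pairs`, `is_acceptable`, `BANNED_PHRASES` and `max_attempts` are hypothetical names chosen for this example. The canned replies reuse questions seen in the outputs above. Exact-match deduplication is used here for brevity; a similarity check could replace it:

```python
import itertools
import random
import re

BANNED_PHRASES = (
    "in the context",
    "in the given context",
    "according to the context",
    "based on the provided context",
)

def normalize(question: str) -> str:
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(question.lower().split())

def is_acceptable(question: str, seen_questions: set) -> bool:
    """Reject questions that mention the context, use 'this', or repeat an earlier one."""
    normalized = normalize(question)
    if any(phrase in normalized for phrase in BANNED_PHRASES):
        return False
    if re.search(r"\bthis\b", normalized):
        return False
    return normalized not in seen_questions

def collect_qa_pairs(generate_fn, contexts, n_target, max_attempts=50, seed=0):
    """Cycle through shuffled contexts until n_target pairs are accepted or attempts run out."""
    rng = random.Random(seed)
    pool = itertools.cycle(rng.sample(contexts, len(contexts)))
    accepted, seen, attempts = [], set(), 0
    while len(accepted) < n_target and attempts < max_attempts:
        attempts += 1
        context = next(pool)
        question, answer = generate_fn(context)
        if not is_acceptable(question, seen):
            continue
        seen.add(normalize(question))
        accepted.append({"context": context, "question": question, "answer": answer})
    return accepted

# Canned replies standing in for the LLM, keyed by placeholder chunk names.
canned = {
    "chunk A": ("What is the date mentioned in the context?", "9/23/2024"),
    "chunk B": ("What is the technical name of this entity?", "A_ProdAllocSqncAssgmt"),
    "chunk C": ("What is the HTTP method for the Update Product-Location Assignment operation?", "PATCH"),
}
pairs = collect_qa_pairs(lambda ctx: canned[ctx], list(canned), n_target=2, max_attempts=10)
print(len(pairs), [p["answer"] for p in pairs])
```

The `max_attempts` cap matters: with only one acceptable question available, asking for two would otherwise loop forever.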

In [245]:
QA_generation_prompt = """
Your task is to write a question and an answer given a context.
Your question should be answerable with a specific, concise piece of factual information from the context.
Your question should be formulated in the same style as questions users could ask in a search engine.
Your question can be closed, open, divergent, evaluation, inference, comparison, application or problem-solving question types.
Refer to some examples here:

### Example 1:
Context: Being able to deliver the required quantity of a material to the customer at the requested time demands precise planning and control mechanisms. Unpredictable problems, such as breakdowns in production or increased demand, can lead to critical situations in order processing and must be avoided wherever possible. In advanced Available-to-Promise (aATP) in SAP S/4HANA, you can use Product Allocation (PAL) to avoid critical situations in demand and procurement by allocating materials in short supply to, for example, specific regions and customers for a specific time period. This can help avoid the situation whereby, for example, the entire available quantity of a material in short supply is allocated to a single customer, thereby making it impossible for you to confirm subsequent order requirements for the same material from other customers.\n\n\n\n\n\n Features \n\naATP in SAP S/4HANA supports the following key features for setting up, configuring and monitoring availability checks against product allocation for the business document types sales document and stock transport order:\n\n\n\n Key Feature SAP Fiori App See Also \n\n <br><br> * Create and edit product allocation objects.<br> * Define the period types for product allocation objects.<br> * Define and order the characteristics for product allocation objects.<br> * Deactivate and delete product allocation objects and their characteristics.<br><br><br> Configure Product Allocation Configuring and Executing a Check Against Product Allocation \n <br><br> * Maintain characteristic value combinations and planned allocation quantities for their time periods.<br> * Use spreadsheets (in .xlsx format) or comma separated values files (in .csv format) to maintain characteristic value combinations and their time series.<br> * Reuse assignments and quantities from deleted/deactivated characteristic value combinations to those active characteristic value combinations that are using collectives.<br> * Change the 
activation and constraint status of product allocation objects.<br> * Display the availability situation (planned, available and consumed quantity) for the materials in product allocation objects.<br> * Release product allocation objects and their planning data for productive usage.<br><br><br> Manage Product Allocation Planning Data Configuring and Executing a Check Against Product Allocation \n <br><br> * Create, display and edit product allocation sequences.<br>
Output::
Question: What does product allocation mean?
Answer: Product Allocation (PAL) in advanced Available-to-Promise (aATP) is a mechanism in SAP S/4HANA that helps avoid critical situations in demand and procurement. It allows the allocation of materials in short supply to specific regions and customers for a specific time period. This ensures that the entire available quantity of a material is not allocated to a single customer, enabling subsequent order requirements from other customers to be confirmed. PAL helps in precise planning and control of material delivery to meet customer demands.

### Example 2:
Context: App ID: F3829\n\nCharacteristic catalogs contain attributes that can be used as characteristics, for example, when you run availability checks for sales documents and stock transport orders. With this app, you can adapt characteristic catalogs to suit your business needs, for example, when you check availability against product allocation, execute backorder processing, use an alternative plant to confirm a requirement or protect quantities to prioritize demand. You can use this app if the business role Order Fulfillment Manager (R0226) is assigned to your user.\n\n\n\n Prerequisites \n\nTo be able to see the list of available catalogs in the Manage Characteristic Catalogs app, you need to set the catalog use types in the authorization object M_CAT_UTYP. In the Characteristic Catalog Use Type field, enter your catalog use types (for example, 01, 02 or * for all catalog use types).\n\n\n\n\n\n Key Features \n\nDepending on the selected catalog, you can use this app to:\n\n\n\n\n\n* Add characteristics as standard or custom fields to a characteristic catalog.\n* Define characteristic value groups.\n* Connect characteristics from multiple characteristic catalogs.\n* Rename standard characteristics.\n* Enable value existence check type for characteristics.\n* Define authorizations for characteristic values and groups of characteristic values.\n* Add classification characteristics from Variant Configuration\n\n\n\n\n\n Supported Device Types \n\n\n\n* Desktop\n* Tablet\n\n\n\n\n\nRelated InformationUsing the Manage Characteristic Catalogs App for aATPProduct Allocation (CA-ATP-PAL)Backorder Processing (CA-ATP-BOP)Supply Protection (CA-ATP-SUP)Alternative-Based Confirmation (CA-ATP-ABC)
Output::
Question: How can I access the characteristic catalog?
Answer: To access the characteristic catalog, you can use the "Manage Characteristic Catalogs" app . This app allows you to adapt characteristic catalogs to suit your business needs, add characteristics, define characteristic value groups, and connect characteristics from multiple catalogs . Please note that you need to have the "Order Fulfillment Manager" role assigned to your user to access this app.

### Example 3:
Context: Availability Change Log (ACL) must be activated before the changes are captured. You can activate ACL in Customizing of aATP at !Start of the navigation path Cross-Application Components !Next navigation step Advanced Available-to-Promise (aATP) !Next navigation step Availability Change Log !Next navigation step Activate Availability Change Log![End of the navigation path](URL\n\n\n\n* After ACL is configured, changes impacting the availability situation are captured for all supported documents.\n* A material-plant combination can be excluded from ACL capture by navigating to the Change View \"Availability Change Log scope based on Checking Group\": Overview screen. Add the desired checking group and select the Deactivate Logging flag.
Output::
Question: How do I configure the availability change log? 
Answer: To configure the availability change log, you need to activate it in Customizing of aATP at the start of the navigation path Cross-Application Components, then navigate to Advanced Available-to-Promise (aATP), Availability Change Log, and activate the availability change log.

Rules to follow when formulating the question and answer:
1. Ensure that the question can be answered entirely from the information present in the contexts.
2. Do not frame questions that contain more than 20 words. Use abbreviations wherever possible.
3. Make sure the question is clear and unambiguous.
4. **Strictly avoid** phrases like 'based on the provided context', 'according to the context', 'in the context', or 'in the given context'. **These phrases must not be used under any circumstances.**
5. Do not use 'this' in the question as the question must be standalone and 'this' would be ambiguous.
6. Keep your answer under 300 words.

This is the output format:
Output::
Question: (your question)
Answer: (your answer to the question)

Now here is the context.
Context: {context}\n
Output::"""

# 4. Do not use phrases like 'based on the provided context','according to the context','in the context','in the given context' due to ambiguity.
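
The rules in the prompt above are only requests to the model, so generated pairs can still break them. A minimal sketch of a post-generation check follows. The helper name `check_qa_rules` and the constants are hypothetical, the limits are copied from the prompt rules, and word counts are approximated by whitespace splitting.

```python
import re

# Assumed limits, copied from the rules in the prompt.
MAX_QUESTION_WORDS = 20
MAX_ANSWER_WORDS = 300
BANNED_PHRASES = (
    "based on the provided context",
    "according to the context",
    "in the context",
    "in the given context",
)


def check_qa_rules(question: str, answer: str) -> list:
    """Return the list of violated rules; an empty list means the pair passes."""
    problems = []
    q_lower = question.lower()
    a_lower = answer.lower()
    # Rule 2: the question must not contain more than 20 words.
    if len(question.split()) > MAX_QUESTION_WORDS:
        problems.append("question longer than 20 words")
    # Rule 6: the answer must stay under 300 words (words, not characters).
    if len(answer.split()) >= MAX_ANSWER_WORDS:
        problems.append("answer not under 300 words")
    # Rule 4: no references to "the context" in either field.
    if any(p in q_lower or p in a_lower for p in BANNED_PHRASES):
        problems.append("banned context phrase")
    # Rule 5: the question must be standalone, so no 'this'.
    if re.search(r"\bthis\b", q_lower):
        problems.append("question uses 'this'")
    return problems
```

In the generation loop, a pair could be appended to `outputs` only when `check_qa_rules(question, answer)` returns an empty list. Note that the commented-out `assert len(answer) < 300` counts characters, whereas the prompt rule is stated in words.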

In [243]:
import random

N_GENERATIONS = 10

print(f"Generating {N_GENERATIONS} QA couples...")

outputs = []
for sampled_context in tqdm(random.sample(chunks, N_GENERATIONS)):

    # get QA couple
    qa_couple = get_generated_qa(generator_llm,QA_generation_prompt,sampled_context)

    # condition check if answer is too long
    # try:
    question = qa_couple.content.split("Question: ")[-1].split("Answer: ")[0]
    answer = qa_couple.content.split("Answer: ")[-1]
    # assert len(answer) < 300, "Answer is too long"
    outputs.append(
        {
            "context": sampled_context.page_content,
            "question": question,
            "answer": answer,
            "source_doc": sampled_context.metadata["source"],
        }
    )
    # except:
        # continue
    print("QUESTION: ", question)
    print("ANSWER: ", answer)

qna_df = pd.DataFrame(outputs)

Generating 10 QA couples...


 10%|█         | 1/10 [00:07<01:07,  7.53s/it]

QUESTION:  What is used for optimistic concurrency control in the service?

ANSWER:  Entity tags (ETags) are used for optimistic concurrency control.


 20%|██        | 2/10 [00:09<00:34,  4.29s/it]

QUESTION:  What is the necessity of the ProdAllocSqncAssignmentUUID property?

ANSWER:  The ProdAllocSqncAssignmentUUID property is Optional for Insert and Mandatory for Update.


 30%|███       | 3/10 [00:11<00:22,  3.15s/it]

QUESTION:  What are referred to as packages of supply in the context of product allocation?

ANSWER:  These packages of supply are referred to as product allocations.


 40%|████      | 4/10 [00:14<00:19,  3.17s/it]

QUESTION:  What does the Assignment (A_ProdAllocSqncAssgmt) read, update, and insert in the product-location assignments?

ANSWER:  The Assignment (A_ProdAllocSqncAssgmt) reads, updates, and inserts product-location assignments for a given product allocation sequence and a given product.


 50%|█████     | 5/10 [00:17<00:14,  2.95s/it]

QUESTION:  What is the URL for deleting a sequence assignment in product allocation?

ANSWER:  The URL for deleting a sequence assignment is POST <host>/sap/opu/odata/SAP/API_PRODUCT_ALLOC_SEQUENCE_SRV/DeleteSequenceAssignment.


 60%|██████    | 6/10 [00:20<00:11,  2.99s/it]

QUESTION:  What does product allocation sequence return in SAP?

ANSWER:  Product Allocation Sequence (A_ProdAllocationSequence) returns the name and description of the product allocation sequence as well as information about the product allocation sequence UUID, the product allocation consumption unit, and when the product allocation sequence was created.


 70%|███████   | 7/10 [00:21<00:07,  2.57s/it]

QUESTION:  What is the technical name for the product allocation sequence?

ANSWER:  The technical name for the product allocation sequence is ProductAllocationSequence.


 80%|████████  | 8/10 [00:25<00:05,  2.85s/it]

QUESTION:  What are existing assignments for a product allocation sequence from external planning systems?

ANSWER:  Existing assignments for a product allocation sequence from external planning systems such as SAP Merchandise and Assortment Planning (AMR), SAP Assortment Planning for Retail (APR), SAP Integrated Business Planning (IBP), or any other planning systems.


 90%|█████████ | 9/10 [00:27<00:02,  2.75s/it]

QUESTION:  What does the Technical name of Product Allocation Sequence Assignment refer to?

ANSWER:  This entity enables you to read, update, and insert product-location assignments for a given product allocation sequence and a given product allocation sequence UUID.


100%|██████████| 10/10 [00:30<00:00,  3.06s/it]

QUESTION:  What is Product Allocation (PAL) used for?

ANSWER:  You can use Product Allocation (PAL) to determine the availability of requested products by checking against sales data as well as data for restricted resources consumed along the value chain of the requested product.





Question quality has improved, but the model generates only 'what' questions.
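
One quick way to confirm the observation above is to tally the first word of each generated question. This is a minimal sketch: `leading_word_counts` is a hypothetical helper, and the sample list holds three of the questions printed by the run above.

```python
from collections import Counter


def leading_word_counts(questions):
    """Count questions by their lowercased first word, skipping blank entries."""
    first_words = (q.split()[0].lower() for q in questions if q.strip())
    return Counter(first_words)


# Three of the questions printed by the run above.
sample_questions = [
    "What is used for optimistic concurrency control in the service?",
    "What is the necessity of the ProdAllocSqncAssignmentUUID property?",
    "What is Product Allocation (PAL) used for?",
]
print(leading_word_counts(sample_questions))
```

In the notebook, the same helper can be applied to `qna_df["question"]` after each run to compare prompt variants.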

In [247]:
QA_generation_prompt = """
Your task is to write a question and an answer based on the provided context. 
Your question should be factual, concise, and answerable from the context. Ensure the following variety:
1. Formulate questions using different styles: 'what', 'how', 'why', 'compare', 'explain', etc. 
2. Alternate question types across outputs: closed, open-ended, evaluative, comparative, etc.
3. Avoid repetitive 'what' questions unless the context naturally calls for it. 


Refer to some examples here:
### Example 1:
Context: Being able to deliver the required quantity of a material to the customer at the requested time demands precise planning and control mechanisms. Unpredictable problems, such as breakdowns in production or increased demand, can lead to critical situations in order processing and must be avoided wherever possible. In advanced Available-to-Promise (aATP) in SAP S/4HANA, you can use Product Allocation (PAL) to avoid critical situations in demand and procurement by allocating materials in short supply to, for example, specific regions and customers for a specific time period. This can help avoid the situation whereby, for example, the entire available quantity of a material in short supply is allocated to a single customer, thereby making it impossible for you to confirm subsequent order requirements for the same material from other customers.\n\n\n\n\n\n Features \n\naATP in SAP S/4HANA supports the following key features for setting up, configuring and monitoring availability checks against product allocation for the business document types sales document and stock transport order:\n\n\n\n Key Feature SAP Fiori App See Also \n\n <br><br> * Create and edit product allocation objects.<br> * Define the period types for product allocation objects.<br> * Define and order the characteristics for product allocation objects.<br> * Deactivate and delete product allocation objects and their characteristics.<br><br><br> Configure Product Allocation Configuring and Executing a Check Against Product Allocation \n <br><br> * Maintain characteristic value combinations and planned allocation quantities for their time periods.<br> * Use spreadsheets (in .xlsx format) or comma separated values files (in .csv format) to maintain characteristic value combinations and their time series.<br> * Reuse assignments and quantities from deleted/deactivated characteristic value combinations to those active characteristic value combinations that are using collectives.<br> * Change the 
activation and constraint status of product allocation objects.<br> * Display the availability situation (planned, available and consumed quantity) for the materials in product allocation objects.<br> * Release product allocation objects and their planning data for productive usage.<br><br><br> Manage Product Allocation Planning Data Configuring and Executing a Check Against Product Allocation \n <br><br> * Create, display and edit product allocation sequences.<br>
Output::
Question: What does product allocation mean?
Answer: Product Allocation (PAL) in advanced Available-to-Promise (aATP) is a mechanism in SAP S/4HANA that helps avoid critical situations in demand and procurement. It allows the allocation of materials in short supply to specific regions and customers for a specific time period. This ensures that the entire available quantity of a material is not allocated to a single customer, enabling subsequent order requirements from other customers to be confirmed. PAL helps in precise planning and control of material delivery to meet customer demands.

### Example 2:
Context: App ID: F3829\n\nCharacteristic catalogs contain attributes that can be used as characteristics, for example, when you run availability checks for sales documents and stock transport orders. With this app, you can adapt characteristic catalogs to suit your business needs, for example, when you check availability against product allocation, execute backorder processing, use an alternative plant to confirm a requirement or protect quantities to prioritize demand. You can use this app if the business role Order Fulfillment Manager (R0226) is assigned to your user.\n\n\n\n Prerequisites \n\nTo be able to see the list of available catalogs in the Manage Characteristic Catalogs app, you need to set the catalog use types in the authorization object M_CAT_UTYP. In the Characteristic Catalog Use Type field, enter your catalog use types (for example, 01, 02 or * for all catalog use types).\n\n\n\n\n\n Key Features \n\nDepending on the selected catalog, you can use this app to:\n\n\n\n\n\n* Add characteristics as standard or custom fields to a characteristic catalog.\n* Define characteristic value groups.\n* Connect characteristics from multiple characteristic catalogs.\n* Rename standard characteristics.\n* Enable value existence check type for characteristics.\n* Define authorizations for characteristic values and groups of characteristic values.\n* Add classification characteristics from Variant Configuration\n\n\n\n\n\n Supported Device Types \n\n\n\n* Desktop\n* Tablet\n\n\n\n\n\nRelated InformationUsing the Manage Characteristic Catalogs App for aATPProduct Allocation (CA-ATP-PAL)Backorder Processing (CA-ATP-BOP)Supply Protection (CA-ATP-SUP)Alternative-Based Confirmation (CA-ATP-ABC)
Output::
Question: How can I access the characteristic catalog?
Answer: To access the characteristic catalog, you can use the "Manage Characteristic Catalogs" app. This app allows you to adapt characteristic catalogs to suit your business needs, add characteristics, define characteristic value groups, and connect characteristics from multiple catalogs. Please note that you need to have the "Order Fulfillment Manager" role assigned to your user to access this app.

### Example 3:
Context: Availability Change Log (ACL) must be activated before the changes are captured. You can activate ACL in Customizing of aATP at !Start of the navigation path Cross-Application Components !Next navigation step Advanced Available-to-Promise (aATP) !Next navigation step Availability Change Log !Next navigation step Activate Availability Change Log![End of the navigation path](URL\n\n\n\n* After ACL is configured, changes impacting the availability situation are captured for all supported documents.\n* A material-plant combination can be excluded from ACL capture by navigating to the Change View \"Availability Change Log scope based on Checking Group\": Overview screen. Add the desired checking group and select the Deactivate Logging flag.
Output::
Question: How do I configure the availability change log? 
Answer: To configure the availability change log, you need to activate it in Customizing of aATP at the start of the navigation path Cross-Application Components, then navigate to Advanced Available-to-Promise (aATP), Availability Change Log, and activate the availability change log.


Rules to follow when formulating the question and answer:
1. Ensure that the question can be answered entirely from the information present in the contexts.
2. Do not frame questions that contain more than 20 words. Use abbreviations wherever possible.
3. Make sure the question is clear and unambiguous.
4. **Strictly avoid** phrases like 'based on the provided context', 'according to the context', 'in the context', or 'in the given context'. **These phrases must not be used under any circumstances.**
5. Do not use 'this' in the question as the question must be standalone and 'this' would be ambiguous.
6. Keep your answer under 300 words.

This is the output format:
Output::
Question: (your question)
Answer: (your answer to the question)

Now here is the context.
Context: {context}\n
Output::"""

# 4. Do not use phrases like 'based on the provided context','according to the context','in the context','in the given context' due to ambiguity.
# Your task is to write a question and an answer given a context.
# Your question should be answerable with a specific, concise piece of factual information from the context.
# Your question should be formulated in the same style as questions users could ask in a search engine.
# Your question can be closed, open, divergent, evaluation, inference, comparison, application or problem-solving question types.

In [249]:
import random

N_GENERATIONS = 10

print(f"Generating {N_GENERATIONS} QA couples...")

outputs = []
for sampled_context in tqdm(random.sample(chunks, N_GENERATIONS)):

    # get QA couple
    qa_couple = get_generated_qa(generator_llm,QA_generation_prompt,sampled_context)

    # condition check if answer is too long
    # try:
    question = qa_couple.content.split("Question: ")[-1].split("Answer: ")[0]
    answer = qa_couple.content.split("Answer: ")[-1]
    # assert len(answer) < 300, "Answer is too long"
    outputs.append(
        {
            "context": sampled_context.page_content,
            "question": question,
            "answer": answer,
            "source_doc": sampled_context.metadata["source"],
        }
    )
    # except:
        # continue
    print("QUESTION: ", question)
    print("ANSWER: ", answer)

qna_df = pd.DataFrame(outputs)

Generating 10 QA couples...


 10%|█         | 1/10 [00:11<01:42, 11.40s/it]

QUESTION:  Why is custom documentation not suitable for productive use?

ANSWER:  Custom documentation may not reflect the arrangement of topics in the SAP Help Portal and may be missing important aspects and/or correlations to other topics. Therefore, it is not recommended for productive use.


 20%|██        | 2/10 [00:22<01:30, 11.34s/it]

QUESTION:  What is the purpose of APIs for Availability Checks in SAP S/4HANA?

ANSWER:  The APIs for Availability Checks are used to provide a standardized interface for integrating availability checks into external systems and applications. This allows for seamless integration with other business processes and enables real-time availability information to be shared across different systems.


 30%|███       | 3/10 [00:42<01:45, 15.00s/it]

QUESTION:  What is the purpose of the validity end time in product allocation sequence?

ANSWER:  The validity end time in product allocation sequence represents the Time zone for the validity end time and is represented as ValidityEndUTCDateTime.


 40%|████      | 4/10 [00:50<01:15, 12.58s/it]

QUESTION:  What operations are supported for product allocation sequences?

ANSWER:  The following operations are supported for product allocation sequences: Read all product allocation sequences, Read a specific product allocation sequence, Read the product allocation sequence description, Read product-location assignments, Create Product-Location Assignment, Read a product allocation sequence in a specific language, and Read the product allocation sequence of a specific product-location assignment.


 50%|█████     | 5/10 [00:53<00:45,  9.01s/it]

QUESTION:  What is the purpose of the Product Allocation Sequence Assignment entity?

ANSWER:  The Product Allocation Sequence Assignment entity enables you to read, update, and insert product-location assignments for a given product allocation sequence and a given product allocation sequence UUID.


 60%|██████    | 6/10 [00:56<00:27,  6.97s/it]

QUESTION:  What is the purpose of the product allocation consumption unit?

ANSWER:  The product allocation consumption unit is used to get the header data for exactly one product allocation sequence, including the description in the corresponding system language. It provides information about when the corresponding product allocation sequence was created and changed.


 70%|███████   | 7/10 [00:59<00:17,  5.76s/it]

QUESTION:  What are the operations offered by the Product Allocation Sequence API?

ANSWER:  The Product Allocation Sequence API offers the following operations: Read Product Allocation Sequence, Query Product Allocation Sequence, Read Product Allocation Sequence and Product Location Assignments, Create Product-Location Assignment, Update Product-Location Assignment, and Delete Product-Location Assignment.


 80%|████████  | 8/10 [01:03<00:10,  5.18s/it]

QUESTION:  What are the prerequisites for using a product allocation sequence?

ANSWER:  The product allocation sequence must exist and the service node must know the ProductAllocationSequenceUUID. You can use the entity Product Allocation Sequence (A_ProdAllocationSequence) to get information about the corresponding product allocation sequence. If an existing product-location assignment is to be changed, the UUID for the corresponding assignment is required.


 90%|█████████ | 9/10 [01:06<00:04,  4.44s/it]

QUESTION:  What is the purpose of activating the availability change log in aATP Customizing?

ANSWER:  The availability change log must be activated before changes are captured. This allows capturing changes impacting the availability situation for all supported documents after it's configured.


100%|██████████| 10/10 [01:13<00:00,  7.40s/it]

QUESTION:  What does a product allocation sequence represent?

ANSWER:  A product allocation sequence represents an existing data for a given product allocation. It includes information about the name of the product allocation sequence and its UUID, which is required for the entity Product Allocation Sequence Assignment.





Refine the prompt.

In [253]:
QA_generation_prompt = """
Your task is to generate a question and answer based on the provided context. Ensure the following when formulating the question:

1. Use a variety of question types: 'what', 'how', 'why', 'compare', 'explain', etc.
2. Alternate between closed, open-ended, evaluative, comparative, and problem-solving questions.
3. Avoid repetitive 'what' questions unless necessary for the context.
4. Questions must be answerable solely from the context provided.
5. Ensure questions are clear and unambiguous.
6. **Avoid** phrases like 'based on the provided context', 'according to the context', or similar, and words like 'this', as these are ambiguous references.
7. Answers should not exceed 300 words.

### Example 1:
Context: Being able to deliver the required quantity of a material to the customer at the requested time demands precise planning and control mechanisms. Unpredictable problems, such as breakdowns in production or increased demand, can lead to critical situations in order processing and must be avoided wherever possible. In advanced Available-to-Promise (aATP) in SAP S/4HANA, you can use Product Allocation (PAL) to avoid critical situations in demand and procurement by allocating materials in short supply to, for example, specific regions and customers for a specific time period. This can help avoid the situation whereby, for example, the entire available quantity of a material in short supply is allocated to a single customer, thereby making it impossible for you to confirm subsequent order requirements for the same material from other customers.\n\n\n\n\n\n Features \n\naATP in SAP S/4HANA supports the following key features for setting up, configuring and monitoring availability checks against product allocation for the business document types sales document and stock transport order:\n\n\n\n Key Feature SAP Fiori App See Also \n\n <br><br> * Create and edit product allocation objects.<br> * Define the period types for product allocation objects.<br> * Define and order the characteristics for product allocation objects.<br> * Deactivate and delete product allocation objects and their characteristics.<br><br><br> Configure Product Allocation Configuring and Executing a Check Against Product Allocation \n <br><br> * Maintain characteristic value combinations and planned allocation quantities for their time periods.<br> * Use spreadsheets (in .xlsx format) or comma separated values files (in .csv format) to maintain characteristic value combinations and their time series.<br> * Reuse assignments and quantities from deleted/deactivated characteristic value combinations to those active characteristic value combinations that are using collectives.<br> * Change the 
activation and constraint status of product allocation objects.<br> * Display the availability situation (planned, available and consumed quantity) for the materials in product allocation objects.<br> * Release product allocation objects and their planning data for productive usage.<br><br><br> Manage Product Allocation Planning Data Configuring and Executing a Check Against Product Allocation \n <br><br> * Create, display and edit product allocation sequences.<br>
Output::
Question: What does product allocation mean?
Answer: Product Allocation (PAL) in advanced Available-to-Promise (aATP) is a mechanism in SAP S/4HANA that helps avoid critical situations in demand and procurement. It allows the allocation of materials in short supply to specific regions and customers for a specific time period. This ensures that the entire available quantity of a material is not allocated to a single customer, enabling subsequent order requirements from other customers to be confirmed. PAL helps in precise planning and control of material delivery to meet customer demands.

### Example 2:
Context: App ID: F3829\n\nCharacteristic catalogs contain attributes that can be used as characteristics, for example, when you run availability checks for sales documents and stock transport orders. With this app, you can adapt characteristic catalogs to suit your business needs, for example, when you check availability against product allocation, execute backorder processing, use an alternative plant to confirm a requirement or protect quantities to prioritize demand. You can use this app if the business role Order Fulfillment Manager (R0226) is assigned to your user.\n\n\n\n Prerequisites \n\nTo be able to see the list of available catalogs in the Manage Characteristic Catalogs app, you need to set the catalog use types in the authorization object M_CAT_UTYP. In the Characteristic Catalog Use Type field, enter your catalog use types (for example, 01, 02 or * for all catalog use types).\n\n\n\n\n\n Key Features \n\nDepending on the selected catalog, you can use this app to:\n\n\n\n\n\n* Add characteristics as standard or custom fields to a characteristic catalog.\n* Define characteristic value groups.\n* Connect characteristics from multiple characteristic catalogs.\n* Rename standard characteristics.\n* Enable value existence check type for characteristics.\n* Define authorizations for characteristic values and groups of characteristic values.\n* Add classification characteristics from Variant Configuration\n\n\n\n\n\n Supported Device Types \n\n\n\n* Desktop\n* Tablet\n\n\n\n\n\nRelated InformationUsing the Manage Characteristic Catalogs App for aATPProduct Allocation (CA-ATP-PAL)Backorder Processing (CA-ATP-BOP)Supply Protection (CA-ATP-SUP)Alternative-Based Confirmation (CA-ATP-ABC)
Output::
Question: How can I access the characteristic catalog?
Answer: To access the characteristic catalog, you can use the "Manage Characteristic Catalogs" app. This app allows you to adapt characteristic catalogs to suit your business needs, add characteristics, define characteristic value groups, and connect characteristics from multiple catalogs. Please note that you need to have the "Order Fulfillment Manager" role assigned to your user to access this app.

### Example 3:
Context: Availability Change Log (ACL) must be activated before the changes are captured. You can activate ACL in Customizing of aATP at !Start of the navigation path Cross-Application Components !Next navigation step Advanced Available-to-Promise (aATP) !Next navigation step Availability Change Log !Next navigation step Activate Availability Change Log![End of the navigation path](URL\n\n\n\n* After ACL is configured, changes impacting the availability situation are captured for all supported documents.\n* A material-plant combination can be excluded from ACL capture by navigating to the Change View \"Availability Change Log scope based on Checking Group\": Overview screen. Add the desired checking group and select the Deactivate Logging flag.
Output::
Question: How do I configure the availability change log? 
Answer: To configure the availability change log, you need to activate it in Customizing of aATP at the start of the navigation path Cross-Application Components, then navigate to Advanced Available-to-Promise (aATP), Availability Change Log, and activate the availability change log.

Rules:

Output format:
Output::
Question: (your question)
Answer: (your answer)

Now, here is the context:
Context: {context}\n
Output::"""
# 2. Keep questions under 20 words. Use abbreviations where applicable.

In [255]:
import random

N_GENERATIONS = 10

print(f"Generating {N_GENERATIONS} QA couples...")

outputs = []
for sampled_context in tqdm(random.sample(chunks, N_GENERATIONS)):

    # get QA couple
    qa_couple = get_generated_qa(generator_llm,QA_generation_prompt,sampled_context)

    # condition check if answer is too long
    # try:
    question = qa_couple.content.split("Question: ")[-1].split("Answer: ")[0]
    answer = qa_couple.content.split("Answer: ")[-1]
    # assert len(answer) < 300, "Answer is too long"
    outputs.append(
        {
            "context": sampled_context.page_content,
            "question": question,
            "answer": answer,
            "source_doc": sampled_context.metadata["source"],
        }
    )
    # except:
        # continue
    print("QUESTION: ", question)
    print("ANSWER: ", answer)

qna_df = pd.DataFrame(outputs)

Generating 10 QA couples...


 10%|█         | 1/10 [00:07<01:05,  7.23s/it]

QUESTION:  What information does the Product Allocation Sequence entity return?

ANSWER:  The Product Allocation Sequence entity returns the name and description of the product allocation sequence, as well as information about the product allocation sequence UUID, the product allocation consumption unit, and when the product allocation sequence was created.


 20%|██        | 2/10 [00:09<00:35,  4.38s/it]

QUESTION:  What is the purpose of the Product Allocation Sequence entity?

ANSWER:  The Product Allocation Sequence entity selects existing data for a given product allocation sequence. It is mandatory and contains the service's business data related to product allocation sequences.


 30%|███       | 3/10 [00:12<00:26,  3.78s/it]

QUESTION:  What are product allocations?
### 
ANSWER:  Product allocations refer to packages of supply that are organized in time series. They help avoid critical situations in demand and procurement processes by factoring sales and supply restrictions into the product availability check. The data used in product allocations is usually defined in separate planning systems.


 40%|████      | 4/10 [00:22<00:37,  6.19s/it]

QUESTION:  What information is displayed when a product allocation sequence is changed?

ANSWER:  When a product allocation sequence is changed, the following information is displayed: LastChangedByUser (name of the user who made the change), ProdAllocSqncConsumptionUnit (consumption unit for the product allocation sequence), ProductAllocationSequence (technical name for the product allocation sequence), ProductAllocationSequence_Text (description for the product allocation sequence in the corresponding system language), and ProductAllocationSequenceUUID (UUID for the selected product allocation sequence).


 50%|█████     | 5/10 [00:25<00:24,  4.94s/it]

QUESTION:  What are the supported operations for product allocation sequences?

ANSWER:  The following operations are supported for product allocation sequences: Read product-location assignments of a specific sequence, Create Product-Location Assignment for a sequence, and Read all product-location assignments of all product allocation sequences.


 60%|██████    | 6/10 [00:31<00:21,  5.31s/it]

QUESTION:  Why can deletion of product-location assignments be done only via service action?

ANSWER:  The deletion of product-location assignments for a product allocation sequence and a product allocation sequence UUID is only possible via the service action "DeleteSequenceAssignment". This is to ensure that such deletions are handled in a controlled manner, likely to prevent accidental or unintended changes.


 70%|███████   | 7/10 [00:34<00:14,  4.70s/it]

QUESTION:  What is Product Allocation (PAL) used for in determining product availability?

ANSWER:  Product Allocation (PAL) is used to determine the availability of requested products by checking against sales data as well as data for restricted resources consumed along the value chain of the requested product. This helps in identifying the availability of products and making informed decisions about their allocation.


 80%|████████  | 8/10 [00:56<00:20, 10.08s/it]

QUESTION:  What is the source of information about ETags in this context?

ANSWER:  The source of information about ETags is a webpage linked from the provided URL.


 90%|█████████ | 9/10 [01:00<00:08,  8.07s/it]

QUESTION:  What is the purpose of reading a specific product allocation sequence?

ANSWER:  The purpose of reading a specific product allocation sequence is to enable you to read, update, and insert product-location assignments for a given product allocation sequence and a given product allocation sequence UUID. This allows you to manage product allocation sequences and their associated product-location assignments.


100%|██████████| 10/10 [01:03<00:00,  6.38s/it]

QUESTION:  What operations are supported for product allocation sequences?

ANSWER:  The following operations are supported for product allocation sequences: Read all product allocation sequences, Read a specific product allocation sequence, Read the product allocation sequence description, Read product-location assignments, Create Product-Location Assignment, Read a product allocation sequence in a specific language, and Read the product allocation sequence of a specific product-location assignment.
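
In the run above, a stray `###` leaked into one of the printed questions, because splitting on the literal labels keeps whatever the model writes between them. A minimal sketch of a more tolerant parser follows. `parse_qa` and `QA_PATTERN` are hypothetical names, and unlike the `split(...)[-1]` approach this version uses the first `Question:` label it finds.

```python
import re

# Matches "Question: ... Answer: ..." and tolerates text before the labels.
QA_PATTERN = re.compile(r"Question:\s*(?P<q>.*?)\s*Answer:\s*(?P<a>.*)", re.DOTALL)


def parse_qa(text):
    """Return (question, answer), or None if a label or a field is missing."""
    match = QA_PATTERN.search(text)
    if match is None:
        return None
    # Drop surrounding whitespace and stray markdown heading markers ("###").
    question = match.group("q").strip().strip("#").strip()
    answer = match.group("a").strip().strip("#").strip()
    if not question or not answer:
        return None
    return question, answer
```

In the loop, `parse_qa(qa_couple.content)` could replace the two `split` calls, with samples skipped when it returns `None`.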





Check the actual golden dataset for the number of 'what', 'why', and 'how' questions.

In [259]:
golden_data = pd.read_excel('GoldenDataSet_RAG.xlsx',sheet_name='Test_Records')
golden_data

Unnamed: 0,Product/Area,Question Category,Question,Golden Answer,Links provided to Model,Answer based on information retrieval content generated,Links provided in answer
0,S4H private cloud,Direct,What does product allocation mean?,Product Allocation (PAL) in advanced Available...,APIs for Availability Checks \nhttps://help.sa...,Product Allocation (CA-ATP-PAL)\nhttps://help....,Product Allocation (CA-ATP-PAL).\nhttps://help...
1,S4H private cloud,Direct,What is a product allocation object?,A product allocation object is the main busine...,"""Product Allocation Sequence"",\n ""documen...","Key Concepts in Product Allocation"",\n ...",Key Concepts in Product Allocation\nhttps://he...
2,S4H private cloud,Direct,What is an allocation quantity unit in aATP pr...,An allocation quantity unit in aATP product al...,"""Key Concepts in Product Allocation"",\n ""...","""Key Concepts in Product Allocation"",\n ...",Key Concepts in Product Allocation\nhttps://he...
3,S4H private cloud,Consistency,Where can I maintain authorizations for produc...,You can maintain authorizations for product al...,"""document_title"": ""Authorizations Based on Cha...","""Authorizations Based on Characteristics"",\n ...",Authorizations Based on Characteristics\nhttps...
4,S4H private cloud,Consistency,How can I access the characteristic catalog?,"To access the characteristic catalog, you can ...","""Characteristic Name"",\n ""document_url"": ...","""Manage Characteristic Catalogs"",\n ""...",Manage Characteristics Catalogs\nhttps://help....
...,...,...,...,...,...,...,...
703,IBP,Direct,which methods are available for outlier detection,The methods available for outlier detection ar...,"{\n ""document_title"": ""Outlier Correction...","{\n ""level"": ""info"",\n ""code"": ""IR_SUMMARY_G...",Outlier Correction\nhttps://help.sap.com/docs/...
704,IBP,Direct,Can you delete master data in the Excel add-in?,You can't delete master data records in the SA...,"{\n ""document_title"": ""Maintenance of Mas...","{\n ""level"": ""info"",\n ""code"": ""IR_SUMMARY_G...",Maintenance of Master Data Records Using the M...
705,IBP,Direct,How do I opt out of custom alert definitions?,"To opt out of custom alert definitions, follow...","{\n ""document_title"": ""Opting Out of Cust...","{\n ""level"": ""info"",\n ""code"": ""IR_SUMMARY_G...",Opting Out of Custom Alert Definitions\nhttps:...
706,IBP,Direct,Can I snooze an alert?,"Yes, you can snooze an alert. Snoozing alerts ...","{\n ""document_title"": ""Working with Alert...","{\n ""level"": ""info"",\n ""code"": ""IR_SUMMARY_G...",Working with Alerts\nhttps://help.sap.com/docs...


In [267]:
golden_questions = golden_data.iloc[:,2].tolist()

not_what_q = []
for q in golden_questions:
    if q.split(" ")[0].lower()!='what':
        not_what_q.append(q)

In [269]:
print(f"total count: {len(golden_questions)}")
print(f"total count what ques: {len(golden_questions)-len(not_what_q)}")

total count: 708
total count what ques: 208


In [271]:
not_what_q[:10]

['Where can I maintain authorizations for product allocation characteristics?',
 'How can I access the characteristic catalog?',
 'How do I create a product allocation object?',
 'How do I create a product allocation sequence?',
 'How do I configure Product Allocation?',
 'Configure Activity Attributes?',
 'How can I configure activity attributes?',
 'BPS?',
 'How do I configure the availability change log?',
 'Can I deactivate the availability change log?']

so roughly 2/7 of the questions in the golden dataset (208 of 708, about 29%) are 'what' questions

find out the different question types

In [288]:
set([q.split(" ")[0].lower() for q in golden_questions])

{'',
 'alternative',
 'api',
 'api_subscriptionorder',
 'are',
 'arun?',
 'bom',
 'bop',
 'bps?',
 'can',
 'cc',
 'cds',
 'compare',
 'configurable',
 'configure',
 'configuring',
 'contract',
 'deactivate',
 'deleting',
 'determine',
 'difference',
 'different',
 'do',
 'does',
 'emabrgo',
 'explain',
 'extensibility',
 'first',
 'fulfillment',
 'going',
 'have',
 'help',
 'how',
 'i',
 "i'd",
 "i'm",
 "i've",
 'in',
 'is',
 'list',
 'make',
 'manage',
 'mass',
 'master',
 'mirror',
 'monitor',
 'one',
 'partial',
 'partner',
 'plan',
 'planning',
 'please',
 'product',
 'range',
 'recheck',
 'request',
 'sales',
 'sap',
 'second',
 'service',
 'show',
 'solution',
 'subscription',
 'tell',
 'third',
 'troubleshooting',
 'under',
 'user',
 'value',
 'what',
 "what's",
 'when',
 'where',
 'which',
 'who',
 "who's",
 'why',
 'will',
 'withdraw'}

main ones 

what, when, where, which, why
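
The set above shows which first words occur, but not how often each one does. A minimal sketch of counting them with `collections.Counter` follows. `first_word_counts` and `sample_questions` are hypothetical names (in the notebook the input would be `golden_questions`), and stripping whitespace and a trailing `?` is an assumption about how entries should be normalised.

```python
from collections import Counter

def first_word_counts(questions):
    """Count the lower-cased first word of each question."""
    counts = Counter()
    for q in questions:
        words = q.strip().split()
        if not words:
            continue  # skip empty or whitespace-only entries
        # assumption: treat 'BPS?' and 'bps' as the same first word
        counts[words[0].lower().rstrip("?")] += 1
    return counts

# hypothetical sample; in the notebook this would be golden_questions
sample_questions = [
    "What does product allocation mean?",
    " How can I access the characteristic catalog?",
    "BPS?",
    "what is a product allocation object?",
]
print(first_word_counts(sample_questions).most_common())
```

Sorting by frequency makes it easier to judge which question types deserve their own prompt.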

In [273]:
import random

def test_llm_generation(N_GENERATIONS = 10, printing = True):
    
    print(f"Generating {N_GENERATIONS} QA couples...")
    
    outputs = []
    for sampled_context in tqdm(random.sample(chunks, N_GENERATIONS)):
    
        # get QA couple
        qa_couple = get_generated_qa(generator_llm,QA_generation_prompt,sampled_context)
    
        # condition check if answer is too long
        # try:
        question = qa_couple.content.split("Question: ")[-1].split("Answer: ")[0]
        answer = qa_couple.content.split("Answer: ")[-1]
        # assert len(answer) < 300, "Answer is too long"
        outputs.append(
            {
                "context": sampled_context.page_content,
                "question": question,
                "answer": answer,
                "source_doc": sampled_context.metadata["source"],
            }
        )
        # except:
            # continue
        if printing:
            print("QUESTION: ", question)
            print("ANSWER: ", answer)
    
    qna_df = pd.DataFrame(outputs)
    
    return qna_df
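
The chained `split` calls above keep whatever text sits between the two labels, so stray markers such as `###` can end up in the question. A minimal sketch of a stricter parser, assuming the reply follows the `Question: ... Answer: ...` format requested in the prompt; `parse_qa` is a hypothetical helper, not part of the notebook. Unlike `split(...)[-1]`, it matches the first `Question:` label in the reply.

```python
import re

def parse_qa(text):
    """Split a reply of the form 'Question: ... Answer: ...' into its parts.

    Returns (question, answer), or None if a label or a part is missing.
    """
    match = re.search(r"Question:\s*(.*?)\s*Answer:\s*(.*)", text, flags=re.DOTALL)
    if match is None:
        return None
    # drop surrounding whitespace and stray '#' markers from the question
    question = match.group(1).strip(" \n#")
    answer = match.group(2).strip()
    if not question or not answer:
        return None
    return question, answer

# hypothetical reply, shaped like the generator output
reply = "Output::\nQuestion: What is an ETag?\n### \nAnswer: An entity tag used for concurrency control."
print(parse_qa(reply))
```

Returning `None` for malformed replies lets the caller skip them explicitly instead of relying on a bare `except`.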

In [277]:
QA_generation_prompt = """
Your task is to generate a question and answer based on the provided context. Ensure the following when formulating the question:

1. Use a variety of question types: 'what', 'how', 'why', 'compare', 'explain', etc.
2. Alternate between closed, open-ended, evaluative, comparative, and problem-solving questions.
3. Avoid repetitive 'what' questions unless necessary for the context.
4. Questions must be answerable solely from the context provided.
5. Ensure questions are clear and unambiguous.
6. **Avoid** phrases like 'based on the provided context', 'according to the context', or similar and words like 'this' as these are ambiguous references.
7. Answers should not exceed 300 words.

### Example 1:
Context: Being able to deliver the required quantity of a material to the customer at the requested time demands precise planning and control mechanisms. Unpredictable problems, such as breakdowns in production or increased demand, can lead to critical situations in order processing and must be avoided wherever possible. In advanced Available-to-Promise (aATP) in SAP S/4HANA, you can use Product Allocation (PAL) to avoid critical situations in demand and procurement by allocating materials in short supply to, for example, specific regions and customers for a specific time period. This can help avoid the situation whereby, for example, the entire available quantity of a material in short supply is allocated to a single customer, thereby making it impossible for you to confirm subsequent order requirements for the same material from other customers.\n\n\n\n\n\n Features \n\naATP in SAP S/4HANA supports the following key features for setting up, configuring and monitoring availability checks against product allocation for the business document types sales document and stock transport order:\n\n\n\n Key Feature SAP Fiori App See Also \n\n <br><br> * Create and edit product allocation objects.<br> * Define the period types for product allocation objects.<br> * Define and order the characteristics for product allocation objects.<br> * Deactivate and delete product allocation objects and their characteristics.<br><br><br> Configure Product Allocation Configuring and Executing a Check Against Product Allocation \n <br><br> * Maintain characteristic value combinations and planned allocation quantities for their time periods.<br> * Use spreadsheets (in .xlsx format) or comma separated values files (in .csv format) to maintain characteristic value combinations and their time series.<br> * Reuse assignments and quantities from deleted/deactivated characteristic value combinations to those active characteristic value combinations that are using collectives.<br> * Change the 
activation and constraint status of product allocation objects.<br> * Display the availability situation (planned, available and consumed quantity) for the materials in product allocation objects.<br> * Release product allocation objects and their planning data for productive usage.<br><br><br> Manage Product Allocation Planning Data Configuring and Executing a Check Against Product Allocation \n <br><br> * Create, display and edit product allocation sequences.<br>
Output::
Question: What does product allocation mean?
Answer: Product Allocation (PAL) in advanced Available-to-Promise (aATP) is a mechanism in SAP S/4HANA that helps avoid critical situations in demand and procurement. It allows the allocation of materials in short supply to specific regions and customers for a specific time period. This ensures that the entire available quantity of a material is not allocated to a single customer, enabling subsequent order requirements from other customers to be confirmed. PAL helps in precise planning and control of material delivery to meet customer demands.

### Example 2:
Context: App ID: F3829\n\nCharacteristic catalogs contain attributes that can be used as characteristics, for example, when you run availability checks for sales documents and stock transport orders. With this app, you can adapt characteristic catalogs to suit your business needs, for example, when you check availability against product allocation, execute backorder processing, use an alternative plant to confirm a requirement or protect quantities to prioritize demand. You can use this app if the business role Order Fulfillment Manager (R0226) is assigned to your user.\n\n\n\n Prerequisites \n\nTo be able to see the list of available catalogs in the Manage Characteristic Catalogs app, you need to set the catalog use types in the authorization object M_CAT_UTYP. In the Characteristic Catalog Use Type field, enter your catalog use types (for example, 01, 02 or * for all catalog use types).\n\n\n\n\n\n Key Features \n\nDepending on the selected catalog, you can use this app to:\n\n\n\n\n\n* Add characteristics as standard or custom fields to a characteristic catalog.\n* Define characteristic value groups.\n* Connect characteristics from multiple characteristic catalogs.\n* Rename standard characteristics.\n* Enable value existence check type for characteristics.\n* Define authorizations for characteristic values and groups of characteristic values.\n* Add classification characteristics from Variant Configuration\n\n\n\n\n\n Supported Device Types \n\n\n\n* Desktop\n* Tablet\n\n\n\n\n\nRelated InformationUsing the Manage Characteristic Catalogs App for aATPProduct Allocation (CA-ATP-PAL)Backorder Processing (CA-ATP-BOP)Supply Protection (CA-ATP-SUP)Alternative-Based Confirmation (CA-ATP-ABC)
Output::
Question: How can I access the characteristic catalog?
Answer: To access the characteristic catalog, you can use the "Manage Characteristic Catalogs" app . This app allows you to adapt characteristic catalogs to suit your business needs, add characteristics, define characteristic value groups, and connect characteristics from multiple catalogs . Please note that you need to have the "Order Fulfillment Manager" role assigned to your user to access this app.

### Example 3:
Context: Availability Change Log (ACL) must be activated before the changes are captured. You can activate ACL in Customizing of aATP at !Start of the navigation path Cross-Application Components !Next navigation step Advanced Available-to-Promise (aATP) !Next navigation step Availability Change Log !Next navigation step Activate Availability Change Log![End of the navigation path](URL\n\n\n\n* After ACL is configured, changes impacting the availability situation are captured for all supported documents.\n* A material-plant combination can be excluded from ACL capture by navigating to the Change View \"Availability Change Log scope based on Checking Group\": Overview screen. Add the desired checking group and select the Deactivate Logging flag.
Output::
Question: How do I configure the availability change log? 
Answer: To configure the availability change log, you need to activate it in Customizing of aATP at the start of the navigation path Cross-Application Components, then navigate to Advanced Available-to-Promise (aATP), Availability Change Log, and activate the availability change log.

Rules:

Output format:
Output::
Question: (your question)
Answer: (your answer)

Now, here is the context:
Context: {context}\n
Output::"""
# 2. Keep questions under 20 words. Use abbreviations where applicable.

In [279]:
qna_df = test_llm_generation(N_GENERATIONS = 10, printing = True)

Generating 10 QA couples...


 10%|█         | 1/10 [00:20<03:04, 20.47s/it]

QUESTION:  What is the source of information about ETags in this context?

ANSWER:  The source of information about ETags is a webpage linked from the provided URL.


 20%|██        | 2/10 [00:26<01:35, 11.96s/it]

QUESTION:  What are the properties required to create a product allocation sequence?
### 
ANSWER:  To create a product allocation sequence, you must provide the following properties:
- Material: This is mandatory and refers to the material or product being allocated.
- Plant: Although not necessary, providing the plant location can be useful for specific allocation scenarios. However, it's optional for insertion but becomes mandatory if you're updating an existing record.
- ProdAllocSqncAssignmentUUID: If you're inserting a new record, this UUID is optional. However, if you're updating an existing product allocation sequence, this UUID becomes mandatory to identify the specific assignment being updated.


 30%|███       | 3/10 [00:29<00:54,  7.80s/it]

QUESTION:  What is the purpose of the Product Allocation Sequence Description entity?

ANSWER:  The Product Allocation Sequence Description entity selects the descriptions for a given product allocation sequence in all required languages. It is used to get the descriptions for a single product allocation sequence in all required and available languages.


 40%|████      | 4/10 [00:50<01:19, 13.22s/it]

QUESTION:  How do you reference a product allocation sequence UUID in the URL?

ANSWER:  You need to enclose the UUID value in single quotes and prefix it with "guid" when referencing it in the URL. For example: guid'<ProdAllocSequenceUUID>'


 50%|█████     | 5/10 [00:53<00:47,  9.44s/it]

QUESTION:  What are the supported operations for product allocation sequences?

ANSWER:  The following operations are supported for product allocation sequences: Read product-location assignments of a specific sequence, Create Product-Location Assignment for a sequence, and Read all product-location assignments of all product allocation sequences.


 60%|██████    | 6/10 [00:56<00:29,  7.31s/it]

QUESTION:  What operations are available for the Product Allocation Sequence API?

ANSWER:  The Product Allocation Sequence API offers the following operations: Read Product Allocation Sequence, Query Product Allocation Sequence, Read Product Allocation Sequence and Product Location Assignments, Create Product-Location Assignment, Update Product-Location Assignment, and Delete Product-Location Assignment.


 70%|███████   | 7/10 [01:06<00:24,  8.16s/it]

QUESTION:  How can I update a product-location assignment using the API?

ANSWER:  To update a product-location assignment, you can use the PATCH method on the A_ProdAllocationSequence entity in the API_PRODUCT_ALLOC_SEQUENCE_SRV service. The URL for this operation is <host>/sap/opu/odata/SAP/API_PRODUCT_ALLOC_SEQUENCE_SRV/A_ProdAllocationSequence.


 80%|████████  | 8/10 [01:23<00:21, 10.82s/it]

QUESTION:  What are the features of the availability check API?

ANSWER:  The availability check API provides a standardized interface to perform various business scenarios such as availability checks against product allocation, backorder processing, supply protection, alternative-based confirmation, and more. It also allows you to define characteristic catalogs for attributes used in availability checks and configure the availability change log to capture changes impacting the availability situation.


 90%|█████████ | 9/10 [01:32<00:10, 10.27s/it]

QUESTION:  What types of external planning systems can integrate with product allocation sequences?

ANSWER:  The existing assignments for a product allocation sequence from external planning systems such as SAP Merchandise and Assortment Planning (AMR), SAP Assortment Planning for Retail (APR), SAP Integrated Business Planning (IBP), or any other planning systems can be integrated.


100%|██████████| 10/10 [01:35<00:00,  9.52s/it]

QUESTION:  What is the purpose of the product allocation consumption unit?

ANSWER:  The product allocation consumption unit is used to get the header data for exactly one product allocation sequence, including the description in the corresponding system language. It provides information about when the corresponding product allocation sequence was created and changed.





# create multiple prompts for different question types

main ones 
* what, when, where, which, why

In [294]:
import random

def test_llm_generation(QA_generation_prompt,N_GENERATIONS = 10, printing = True):
    
    print(f"Generating {N_GENERATIONS} QA couples...")
    
    outputs = []
    for sampled_context in tqdm(random.sample(chunks, N_GENERATIONS)):
    
        # get QA couple
        qa_couple = get_generated_qa(generator_llm,QA_generation_prompt,sampled_context)
    
        # condition check if answer is too long
        # try:
        question = qa_couple.content.split("Question: ")[-1].split("Answer: ")[0]
        answer = qa_couple.content.split("Answer: ")[-1]
        # assert len(answer) < 300, "Answer is too long"
        outputs.append(
            {
                "context": sampled_context.page_content,
                "question": question,
                "answer": answer,
                "source_doc": sampled_context.metadata["source"],
            }
        )
        # except:
            # continue
        if printing:
            print("QUESTION: ", question)
            print("ANSWER: ", answer)
    
    qna_df = pd.DataFrame(outputs)
    
    return qna_df
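
The commented-out assertion above compares `len(answer)` with 300, which counts characters, while the prompts ask for answers of at most 300 words. A minimal sketch of a word-based check, assuming whitespace-separated tokens are an acceptable definition of a word; `within_word_limit` is a hypothetical helper.

```python
def within_word_limit(answer, max_words=300):
    """Return True if the answer has at most max_words whitespace-separated words."""
    return len(answer.split()) <= max_words

# hypothetical answers for illustration
short_answer = "Product Allocation helps avoid critical situations in demand and procurement."
long_answer = "word " * 301
print(within_word_limit(short_answer))
print(within_word_limit(long_answer))
```

A check like this could decide whether a generated pair is appended to `outputs` or skipped.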

In [292]:
what_qa_generation_prompt = """
Your task is to generate factual 'what' questions and their answers based on the context. 
Follow these rules:
1. Your 'what' questions should inquire about specific facts, definitions, or descriptions in the context.
2. Questions must be answerable solely from the context provided.
3. Ensure questions are clear and unambiguous.
4. **Avoid** phrases like 'based on the provided context', 'according to the context', or similar and words like 'this' as these are ambiguous references.
5. Answers should not exceed 300 words.

### Example 1:
Context: Being able to deliver the required quantity of a material to the customer at the requested time demands precise planning and control mechanisms. Unpredictable problems, such as breakdowns in production or increased demand, can lead to critical situations in order processing and must be avoided wherever possible. In advanced Available-to-Promise (aATP) in SAP S/4HANA, you can use Product Allocation (PAL) to avoid critical situations in demand and procurement by allocating materials in short supply to, for example, specific regions and customers for a specific time period. This can help avoid the situation whereby, for example, the entire available quantity of a material in short supply is allocated to a single customer, thereby making it impossible for you to confirm subsequent order requirements for the same material from other customers.\n\n\n\n\n\n Features \n\naATP in SAP S/4HANA supports the following key features for setting up, configuring and monitoring availability checks against product allocation for the business document types sales document and stock transport order:\n\n\n\n Key Feature SAP Fiori App See Also \n\n <br><br> * Create and edit product allocation objects.<br> * Define the period types for product allocation objects.<br> * Define and order the characteristics for product allocation objects.<br> * Deactivate and delete product allocation objects and their characteristics.<br><br><br> Configure Product Allocation Configuring and Executing a Check Against Product Allocation \n <br><br> * Maintain characteristic value combinations and planned allocation quantities for their time periods.<br> * Use spreadsheets (in .xlsx format) or comma separated values files (in .csv format) to maintain characteristic value combinations and their time series.<br> * Reuse assignments and quantities from deleted/deactivated characteristic value combinations to those active characteristic value combinations that are using collectives.<br> * Change the 
activation and constraint status of product allocation objects.<br> * Display the availability situation (planned, available and consumed quantity) for the materials in product allocation objects.<br> * Release product allocation objects and their planning data for productive usage.<br><br><br> Manage Product Allocation Planning Data Configuring and Executing a Check Against Product Allocation \n <br><br> * Create, display and edit product allocation sequences.<br>
Output::
Question: What does product allocation mean?
Answer: Product Allocation (PAL) in advanced Available-to-Promise (aATP) is a mechanism in SAP S/4HANA that helps avoid critical situations in demand and procurement. It allows the allocation of materials in short supply to specific regions and customers for a specific time period. This ensures that the entire available quantity of a material is not allocated to a single customer, enabling subsequent order requirements from other customers to be confirmed. PAL helps in precise planning and control of material delivery to meet customer demands.

### Example 2:
Context: The following concepts are of key importance for Product Allocation (PAL) in advanced Available-to-Promise (aATP):\n\n\n\n\n\n Concept: Definition: \n Alias A user-specific description for a characteristic. The alias overrides the characteristic description when the characteristic is used in product allocation objects. An alias is defined in the Manage Characteristic Catalogs app. \n Allocation quantity unit<br><br>Consumption unit of measure<br><br>Allocation rate When a product allocation object is created or maintained, an allocation quantity unit is defined and represents the unit of measure for all planned allocation quantities. When the product allocation object sequence is subsequently defined, a consumption unit of measure is entered. The consumption unit corresponds to (or has to be convertible to) the unit of measure maintained in the material master for the material assigned to the product allocation object. To calculate the planned allocation quantity (in allocation quantity units), an allocation rate is maintained in the product allocation sequence constraint.<br><br>The unit of measure of the requested order quantity is converted into a quantity in the consumption unit of measure (based on the conversion rule in the material master). The quantity of consumption units is then multiplied by the allocation rate to calculate the planned quantity (in allocation quantity units). \n Authorizations based on characteristics Authorizations Based on Characteristics Based on characteristic catalogs, you can restrict reading and editing of characteristic value combinations using authorizations (for example, restricting editing the planning data for a sales organization). You define authorizations in the Manage Characteristic Catalogs app. \n Characteristic catalog The characteristic catalog represents a selection of fields which are part of the data model for sales documents and stock transport orders. 
The characteristic catalog provides the characteristics that can be chosen when you configure a product allocation object (for example, sales organization, division, sold-to party). Characteristic catalogs for sales documents, stock transport orders or sales documents and stock transport orders can be adapted and mixed to create characteristic value groups that suit your business needs in the Manage Characteristic Catalogs app. \n Check date time type Specifies the requested date type used for an availability check against product allocation. If, for example, the check date type Material Availability Date is specified, the check will first determine the requested material availability check date by executing scheduling for the requested delivery date.
Output::
Question: What is an allocation quantity unit in aATP product allocation?
Answer: An allocation quantity unit in aATP product allocation represents the unit of measure for all planned allocation quantities. It is used to calculate the planned allocation quantity based on the consumption unit of measure and the allocation rate .

### Example 3:
Context: CDS View Name I_ProdAllocCheckDateTimeType, I_ProdAllocCheckDateTimeTypeT \n CDS View Description Product Allocation Check Date Time Type, Product Allocation Check Date Time Type Text \n View Type Basic \n Status Released \n\n\n\n\n\n Purpose \n\nThese two CDS views (type code list view and corresponding text view) provide a list of type codes for the allowed date time types (material availability date, goods issue date and delivery date time) used in product allocation checks, and the corresponding text representation.\n\n\n\n\n\n Structure \n\n\n\nObject Types\n\nThe business objects Product Allocation Object and Product Allocation Sequence use the Product Allocation Check Date Time Type.\n\nMain Input Parameters\n\nThe views do not have any parameters.\n\nMeasures and Attributes\n\n\n\n* ProdAllocCheckDateTimeType: The different allowed type codes.\n* ProdAllocChkDateTimeTypeDesc: Textual representation of the allowed type codes.
Output::
Question: What is the check date time type?
Answer: The check date time type refers to the allowed date time types used in product allocation checks, such as material availability date, goods issue date, and delivery date time .

Output format:
Output::
Question: (your question)
Answer: (your answer)

Now, here is the context:
Context: {context}\n
Output::"""

In [296]:
qna_df=test_llm_generation(QA_generation_prompt=what_qa_generation_prompt,N_GENERATIONS=10, printing=True)

Generating 10 QA couples...


 10%|█         | 1/10 [00:12<01:54, 12.76s/it]

QUESTION:  What is the operation for updating a product-location assignment in the Product Allocation Sequence API?

ANSWER:  The operation for updating a product-location assignment in the Product Allocation Sequence API is PATCH.


 20%|██        | 2/10 [00:32<02:13, 16.64s/it]

QUESTION:  Where can you find more information about custom documentation in SAP?

ANSWER:  For more information about custom documentation in SAP, please visit the disclaimer page at https://help.sap.com/docs/disclaimer.


 30%|███       | 3/10 [00:39<01:25, 12.24s/it]

QUESTION:  What is the consumption unit for a product allocation sequence?

ANSWER:  The consumption unit for a product allocation sequence refers to the unit of measure used for the product allocation sequence.


 40%|████      | 4/10 [00:46<01:00, 10.15s/it]

QUESTION:  What are the query string options allowed for a product allocation sequence?

ANSWER:  The query string options allowed for a product allocation sequence include $orderby, $skip, $filter, $top, $select, $format, &expand, and $inlinecount.


 50%|█████     | 5/10 [00:59<00:57, 11.42s/it]

QUESTION:  What is the purpose of reading all sequence descriptions?

ANSWER:  The purpose of reading all sequence descriptions is to retrieve information about multiple sequences at once.


 60%|██████    | 6/10 [01:08<00:42, 10.56s/it]

QUESTION:  What is the purpose of the API PRODUCT_ALLOC_SEQUENCE_SRV?

ANSWER:  The API PRODUCT_ALLOC_SEQUENCE_SRV provides a service for managing product allocation sequences, including creating, updating, deleting, and retrieving information about these sequences.


 70%|███████   | 7/10 [01:29<00:41, 13.89s/it]

QUESTION:  What is the purpose of using ETags in the service?

ANSWER:  The service uses entity tags (ETags) for optimistic concurrency control. This means that when a client requests a modification of a resource on the back-end server, the ETags of the resource on the client and on the back-end server are compared to determine whether any changes have been made to the resource on the back-end server.


 80%|████████  | 8/10 [01:58<00:37, 18.89s/it]

QUESTION:  What is the purpose of the POST operation on /DeleteSequenceAssignment?

ANSWER:  The POST operation on /DeleteSequenceAssignment is used to delete a sequence assignment.


 90%|█████████ | 9/10 [02:11<00:16, 16.95s/it]

QUESTION:  What is the purpose of inserting (ProdAllocSqncAssignmentUUID) into a service node?

ANSWER:  The purpose of inserting (ProdAllocSqncAssignmentUUID) into a service node is to assign a unique identifier for the material-plant assignment in a product allocation sequence.


100%|██████████| 10/10 [02:19<00:00, 13.90s/it]

QUESTION:  What does the A_ProdAllocSqncAssgmt entity read, update, and insert?

ANSWER:  The A_ProdAllocSqncAssgmt entity reads, updates, and inserts product-location assignments for a given product allocation sequence and a given product allocation.





need to generate questions for when, where, which, why

In [300]:
why_qa_generation_prompt = """
Your task is to generate 'why' questions and their answers based on the context.
Follow these rules:
1. Your 'why' questions should focus on reasons, causes, or justifications from the context.
2. Questions must be answerable solely from the context provided.
3. Ensure questions are clear and unambiguous.
4. **Avoid** phrases like 'based on the provided context', 'according to the context', or similar and words like 'this' as these are ambiguous references.
5. Answers should not exceed 300 words.

Output format:
Output::
Question: (your question)
Answer: (your answer)

Now, here is the context:
Context: {context}\n
Output::"""


In [302]:
qna_df=test_llm_generation(QA_generation_prompt=why_qa_generation_prompt,N_GENERATIONS=10, printing=True)

Generating 10 QA couples...


 10%|█         | 1/10 [00:15<02:15, 15.06s/it]

QUESTION:  Why is it important to read a specific product allocation sequence in a specific language?

ANSWER:  Reading a specific product allocation sequence in a specific language ensures that the information is conveyed accurately and effectively to those who need it, regardless of their native language.


 20%|██        | 2/10 [00:19<01:09,  8.65s/it]

QUESTION:  What is the purpose of reading the header data for a specific product allocation sequence using this service?

ANSWER:  The purpose of reading the header data is to include its description as well as any assigned product and product-location assignments.


 30%|███       | 3/10 [00:20<00:38,  5.44s/it]

QUESTION:  Why must the product allocation sequence exist?

ANSWER:  The product allocation sequence must exist so that the service node can know the ProductAllocationSequenceUUID.


 40%|████      | 4/10 [00:28<00:38,  6.40s/it]

QUESTION:  Why are product-location assignments supported operations?

ANSWER:  These assignments might be necessary for managing the physical distribution of products across different locations, ensuring that stock levels are accurate and up-to-date.


 50%|█████     | 5/10 [00:30<00:23,  4.75s/it]

QUESTION:  Why is custom documentation not suitable for productive use?

ANSWER:  The arrangement of topics in the SAP Help Portal may not be reflected, and important aspects or correlations to other topics might be missing.


 60%|██████    | 6/10 [00:32<00:15,  3.77s/it]

QUESTION:  Why are product-location assignments read for a specific sequence supported?

ANSWER:  The operation is supported to allow users to view the current assignments of products to locations within a particular allocation sequence.


 70%|███████   | 7/10 [00:35<00:10,  3.64s/it]

QUESTION:  Why are there different types of HTTP requests (POST, PATCH, DELETE) used in the provided URLs?

ANSWER:  The different types of HTTP requests (POST, PATCH, DELETE) are used to perform various operations on product allocation sequences and assignments. POST is used for creating new resources, PATCH is used for updating existing ones, and DELETE is used for removing them.


 80%|████████  | 8/10 [00:38<00:06,  3.39s/it]

QUESTION:  Why are there different operations for reading and updating product-location assignments?

ANSWER:  The API provides separate operations (Read Product Allocation Sequence and Read Product Allocation Sequence and Product Location Assignments) to allow for the retrieval of product allocation sequences independently or in conjunction with their associated location assignments, depending on the specific use case.


 90%|█████████ | 9/10 [00:40<00:02,  2.87s/it]

QUESTION:  Why was this custom documentation generated?

ANSWER:  This custom documentation was generated for more information, which can be found on the SAP Help Portal.


100%|██████████| 10/10 [00:42<00:00,  4.28s/it]

QUESTION:  Why is a product allocation sequence UUID required for the entity Product Allocation Sequence Assignment?

ANSWER:  The product allocation sequence UUID is required to uniquely identify and distinguish between different product allocation sequences, allowing for accurate assignment of these sequences in the Product Allocation Sequence Assignment entity.
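
One of the questions generated by the 'why' prompt above starts with 'What', so the prompt alone does not guarantee the question type. A minimal sketch of a filter applied after generation, assuming the first word is a good enough signal; `starts_with_type` and the sample rows are hypothetical, shaped like the dicts collected in `outputs`.

```python
def starts_with_type(question, question_type):
    """Return True if the question's first word equals question_type (case-insensitive)."""
    words = question.strip().split()
    return bool(words) and words[0].lower() == question_type.lower()

# hypothetical rows, shaped like the dicts appended to `outputs`
generated = [
    {"question": "Why must the product allocation sequence exist?", "answer": "placeholder"},
    {"question": "What is the purpose of reading the header data?", "answer": "placeholder"},
]
why_only = [row for row in generated if starts_with_type(row["question"], "why")]
print(len(why_only))
```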





to do:
* add in the examples for why
* repeat steps for the other question types like which, where etc
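
For the second to-do item, one option is to build the per-type prompts from a shared skeleton instead of copying the full text for each type. A minimal sketch under several assumptions: `PROMPT_TEMPLATE`, `QUESTION_FOCUS`, and `prompts` are hypothetical names, the focus wording for 'when', 'where', and 'which' is invented for illustration (only the 'why' wording comes from the prompt above), the examples are left out, and the `{context}` placeholder is assumed to be filled later by the generation step.

```python
# Shared skeleton: {question_type} and {focus} are filled per question type,
# while {{context}} survives formatting as a literal {context} placeholder.
PROMPT_TEMPLATE = """
Your task is to generate '{question_type}' questions and their answers based on the context.
Follow these rules:
1. Your '{question_type}' questions should focus on {focus} from the context.
2. Questions must be answerable solely from the context provided.
3. Ensure questions are clear and unambiguous.
4. Answers should not exceed 300 words.

Output format:
Output::
Question: (your question)
Answer: (your answer)

Now, here is the context:
Context: {{context}}
Output::"""

# Assumed focus descriptions; only the 'why' wording comes from the notebook.
QUESTION_FOCUS = {
    "why": "reasons, causes, or justifications",
    "when": "times, conditions, or prerequisites",
    "where": "locations, apps, or navigation paths",
    "which": "choices among the options named",
}

prompts = {
    qtype: PROMPT_TEMPLATE.format(question_type=qtype, focus=focus)
    for qtype, focus in QUESTION_FOCUS.items()
}
print(prompts["where"])
```

Each entry of `prompts` could then be passed as `QA_generation_prompt` to `test_llm_generation`, one question type at a time.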

# setup prompt and llm for critic-llm