# How to use LLMs to solve the tasks

This notebook shows how to use LLMs to solve individual instances of the five semantics-aware process mining tasks.

#### Prerequisites:
- You need a Hugging Face account (see [here](https://huggingface.co)).
- Generate an access token [here](https://huggingface.co/settings/tokens/new?tokenType=fineGrained).
- Create a `.env` file in the root of this project and add the token as `HF_TOKEN` (see `env.example` for an example).
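
Before loading a gated model, it can help to confirm that the token is actually visible to the Python process. The following is a minimal sketch, assuming the token is stored under the name `HF_TOKEN` as described above; the helper `token_is_set` is hypothetical and not part of this project.

```python
import os


def token_is_set(env=None, name="HF_TOKEN"):
    """Return True if `name` holds a non-empty value in the given mapping."""
    env = os.environ if env is None else env
    return bool(env.get(name, "").strip())


# Run this after load_dotenv() has copied the .env entries into os.environ.
if not token_is_set():
    print("HF_TOKEN is not set; check your .env file.")
```

Passing a plain dictionary instead of `os.environ` makes the check easy to try out without touching the real environment.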

In [None]:
from dotenv import load_dotenv
from evaluate_llm import generate_pt_discovery_output, generate_dfg_discovery_output, generate_activity_output, generate_binary_output, get_act_list
from llm.prompts import general_task_prompt_order, general_task_prompt, next_activity_prompt, dfg_task_prompt, pt_task_prompt
from evaluate_llm import get_model_and_tokenizer

load_dotenv()

In [None]:
# specify the model and device to use
MODEL = "meta-llama/Meta-Llama-3-8B-Instruct"  # replace with your model name; or if you want to use a local model, specify the path to the model
DEVICE = "cpu"  # replace with "cuda" if you have a GPU
FINE_TUNED = False  # set to True if you are using a fine-tuned model stored in a local directory or on Hugging Face Hub

model, tokenizer = get_model_and_tokenizer(MODEL, device=DEVICE)
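
Instead of hard-coding `DEVICE`, the value can be derived from the available hardware. This is a minimal sketch, assuming PyTorch is the backend in use; the helper `pick_device` is hypothetical, and it falls back to `"cpu"` when PyTorch is not installed.

```python
def pick_device():
    """Return "cuda" if PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Without PyTorch we cannot probe for a GPU, so stay on the CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print(pick_device())
```

The result could be assigned to `DEVICE` before the model is loaded.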

## T-SAD

#### Input

The input to T-SAD is a trace and a set of possible activities.

Execute the following cell to build a prompt for one task instance.

In [None]:
# define input
trace = ["create purchase order", "approve purchase order", "create invoice", "approve invoice", "pay invoice"]
activities = ["create purchase order", "approve purchase order", "create invoice", "approve invoice", "reject invoice", "pay invoice"]


# get the prompt for the task
if not FINE_TUNED:
    t_sad_prompt = general_task_prompt + "List of process activities: " + str(activities) + "\n" + "Trace:" + str(trace) + "\nValid:"
else:
    t_sad_prompt = "List of process activities: " + str(activities) + "\n" + "Trace:" + str(trace) + "\nValid:"
print(t_sad_prompt)
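
To check several traces in one go, the prompt construction can be wrapped in a function. This is a sketch only: `build_t_sad_prompt` and the example data are hypothetical, and the `instruction` argument stands in for `general_task_prompt` (left empty here, as it would be for a fine-tuned model).

```python
def build_t_sad_prompt(trace, activities, instruction=""):
    """Assemble a T-SAD prompt in the same layout as the cell above."""
    return (
        instruction
        + "List of process activities: " + str(activities) + "\n"
        + "Trace:" + str(trace) + "\nValid:"
    )


# Illustrative data; separate names avoid overwriting the notebook variables.
example_activities = ["create invoice", "approve invoice", "pay invoice"]
example_traces = [
    ["create invoice", "approve invoice", "pay invoice"],
    ["pay invoice", "create invoice"],
]

t_sad_prompts = [build_t_sad_prompt(t, example_activities) for t in example_traces]
print(t_sad_prompts[1])
```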

#### Output

The following cell calls the LLM with the task instance and returns True if the model considers the trace valid and False if it considers the trace invalid.

In [None]:
generate_binary_output(
    model_name=MODEL,
    device=DEVICE,
    model=model,
    tokenizer=tokenizer,
    prompt=t_sad_prompt
)
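
The return type of `generate_binary_output` is not shown in this notebook. If you work with raw generated text instead, a small parser can map it to a Boolean. The helper `parse_binary_answer` below is a hypothetical sketch, not part of this project.

```python
import re


def parse_binary_answer(text):
    """Map a free-text answer to True, False, or None if neither word occurs."""
    match = re.search(r"\b(true|false)\b", text.lower())
    if match is None:
        return None
    # The first occurrence of "true" or "false" decides the result.
    return match.group(1) == "true"


print(parse_binary_answer(" True\n"))
```

Returning `None` for unparseable answers keeps them distinguishable from a genuine `False`.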

## A-SAD

#### Input

The input to A-SAD is a pair of activities and a set of possible activities.

Execute the following cell to build a prompt for one task instance.

In [None]:
pair = ("reject purchase order", "pay invoice")
activities = ["create purchase order", "approve purchase order", "create invoice", "approve invoice", "reject invoice", "pay invoice"]
# get the prompt for the task
if not FINE_TUNED:
    a_sad_prompt = general_task_prompt_order + " List of process activities: " + str(activities) + "\n" + "1. Activity:" + str(pair[0]) + "\n"  + "2. Activity:" + str(pair[1]) + "\nValid:"
else:
    a_sad_prompt = "List of process activities: " + str(activities) + "\n" + "1. Activity:" + str(pair[0]) + "\n"  + "2. Activity:" + str(pair[1]) + "\nValid:"
print(a_sad_prompt)
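
To cover every ordered pair of a process rather than a single one, the pairs can be enumerated with `itertools.permutations`. This is a sketch under assumptions: `build_a_sad_prompt` and the example data are hypothetical, and the `instruction` argument stands in for `general_task_prompt_order`.

```python
from itertools import permutations


def build_a_sad_prompt(pair, activities, instruction=""):
    """Assemble an A-SAD prompt in the same layout as the cell above."""
    return (
        instruction
        + "List of process activities: " + str(activities) + "\n"
        + "1. Activity:" + str(pair[0]) + "\n"
        + "2. Activity:" + str(pair[1]) + "\nValid:"
    )


# Illustrative data; separate names avoid overwriting the notebook variables.
example_activities = ["create invoice", "approve invoice", "pay invoice"]

# Order matters for A-SAD, so (a, b) and (b, a) are separate task instances.
ordered_pairs = list(permutations(example_activities, 2))
a_sad_prompts = [build_a_sad_prompt(p, example_activities) for p in ordered_pairs]
print(len(a_sad_prompts))
```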

#### Output

The following cell calls the LLM with the task instance and returns True if the model considers the pair valid and False if it considers the pair invalid.

In [None]:
generate_binary_output(
    model_name=MODEL,
    device=DEVICE,
    model=model,
    tokenizer=tokenizer,
    prompt=a_sad_prompt
)

## S-NAP

#### Input

The input to S-NAP is a trace-prefix and a set of possible activities.

Execute the following cell to build a prompt for one task instance.

In [None]:
prefix = ["create purchase order", "approve purchase order"]
activities = ["create purchase order", "approve purchase order", "create invoice", "approve invoice", "reject invoice", "pay invoice"]

# get the prompt for the task
if not FINE_TUNED:
    s_nap_prompt = next_activity_prompt + "List of process activities: " + str(activities) + "\n" + "Sequence of activities:" + str(prefix) + "\nAnswer:"
else:
    s_nap_prompt = "List of process activities: " + str(activities) + "\n" + "Sequence of activities:" + str(prefix) + "\nAnswer:"
print(s_nap_prompt)

#### Output

The following cell calls the LLM with the task instance and returns the next activity that continues the prefix.

In [None]:
generate_activity_output(
    model_name=MODEL,
    device=DEVICE,
    model=model,
    tokenizer=tokenizer,
    prompt=s_nap_prompt,
    activities=activities
)
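
How `generate_activity_output` uses the `activities` argument is not shown in this notebook. One plausible way to map a free-text answer onto the list of allowed activities is fuzzy matching with the standard library's `difflib`. The helper `match_activity` below is a hypothetical sketch, not the project's implementation.

```python
from difflib import get_close_matches


def match_activity(answer, activities, cutoff=0.6):
    """Return the activity closest to `answer`, or None if nothing is similar enough."""
    cleaned = answer.strip().strip("'\"").lower()
    lookup = {a.lower(): a for a in activities}
    if cleaned in lookup:
        return lookup[cleaned]
    # Fall back to the single most similar label above the cutoff.
    close = get_close_matches(cleaned, list(lookup), n=1, cutoff=cutoff)
    return lookup[close[0]] if close else None


example_activities = ["create invoice", "approve invoice", "pay invoice"]
print(match_activity("create invoices", example_activities))
```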

## S-DFD

#### Input

The input to S-DFD is a set of possible activities.

Execute the following cell to build a prompt for one task instance.

In [None]:
activities = ["create purchase order", "approve purchase order", "create invoice", "approve invoice", "reject invoice", "pay invoice"]

# get the prompt for the task
if not FINE_TUNED:
    s_sfd_prompt = dfg_task_prompt + "List of process activities: " + str(activities) + "\nPairs of activities:"
else:
    s_sfd_prompt = "List of process activities: " + str(activities) + "\nPairs of activities:"
print(s_sfd_prompt)

#### Output

The following cell calls the LLM with the task instance and returns the directly-follows pairs for the activities.

In [None]:
generate_dfg_discovery_output(
    model_name=MODEL,
    device=DEVICE,
    model=model,
    tokenizer=tokenizer,
    prompt=s_sfd_prompt
)
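
A directly-follows pair `(a, b)` states that `b` can occur immediately after `a`. To sanity-check a model answer, such pairs can be derived from a known trace and compared with the prediction. This is a minimal sketch; the output format of `generate_dfg_discovery_output` is not shown here, so the predicted pairs below are written by hand for illustration.

```python
def directly_follows_pairs(trace):
    """Return the set of (a, b) pairs where b occurs immediately after a."""
    return set(zip(trace, trace[1:]))


reference_trace = ["create invoice", "approve invoice", "pay invoice"]
reference_pairs = directly_follows_pairs(reference_trace)

# Hypothetical model answer, assumed to be available as a set of tuples.
predicted_pairs = {
    ("create invoice", "approve invoice"),
    ("create invoice", "pay invoice"),
}

# Pairs that the prediction and the reference trace agree on.
correct_pairs = predicted_pairs & reference_pairs
print(sorted(correct_pairs))
```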

## S-PTD

#### Input

The input to S-PTD is a set of possible activities.

Execute the following cell to build a prompt for one task instance.

In [None]:
activities = ["create purchase order", "approve purchase order", "create invoice", "approve invoice", "reject invoice", "pay invoice"]

# get the prompt for the task
if not FINE_TUNED:
    s_ptd_prompt = pt_task_prompt + "List of process activities: " + str(activities) + "\nProcess tree:"
else:
    s_ptd_prompt = "List of process activities: " + str(activities) + "\nProcess tree:"
print(s_ptd_prompt)

#### Output

The following cell calls the LLM with the task instance and returns a process tree representation based on the activities.

In [None]:
generate_pt_discovery_output(
    model_name=MODEL,
    device=DEVICE,
    model=model,
    tokenizer=tokenizer,
    prompt=s_ptd_prompt
)
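
A simple check on a discovered process tree is whether every input activity appears as a leaf. The format returned by `generate_pt_discovery_output` is not shown in this notebook, so the sketch below assumes a hypothetical nested `(operator, children)` representation, using `->` for sequence and `X` for exclusive choice.

```python
def leaves(tree):
    """Collect activity labels from a nested (operator, children) structure."""
    if isinstance(tree, str):
        return [tree]
    _operator, children = tree
    return [label for child in children for label in leaves(child)]


# Hypothetical tree: create the invoice, then either approve or reject it.
example_tree = ("->", ["create invoice", ("X", ["approve invoice", "reject invoice"])])
example_activities = ["create invoice", "approve invoice", "reject invoice", "pay invoice"]

# Activities from the input list that the tree does not cover.
missing = set(example_activities) - set(leaves(example_tree))
print(missing)
```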