## Experimental Environment and Design

This notebook shows how to use the experimental environment, with worked examples.
It focuses on:
- Creating `prompt` templates
- Creating and running an `experiment`


### Requirements:
To use the environment, the following is required:
- All libraries listed in 'requirements.txt' must be installed.
- The dataset must have 'record_id' as the ID column and a 'label' column (0 or 1). In addition, at least one of the following must be present:
  * A column with 'openalex' in its name, containing a link to the article on the OpenAlex platform

  OR

  * 'title' and/or 'abstract' and/or 'keywords'
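
The column requirements above can be checked before an experiment is created. The following is a minimal sketch, not part of the environment: `has_required_columns` is a hypothetical helper, and it assumes that column names are matched exactly and case-sensitively.

```python
def has_required_columns(columns):
    """Return True if the column names satisfy the dataset requirements."""
    names = list(columns)
    # 'record_id' and 'label' are always required.
    if "record_id" not in names or "label" not in names:
        return False
    # Either a column with 'openalex' in its name ...
    has_openalex = any("openalex" in name for name in names)
    # ... or at least one of the text columns.
    has_text = any(name in ("title", "abstract", "keywords") for name in names)
    return has_openalex or has_text
```

For instance, `has_required_columns(['record_id', 'label', 'title'])` passes, while a dataset with only 'record_id' and 'label' does not.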

##### Key Parameters Embedded in the Environment

- The cut-off value for the classic models is tau = 0.5
- The stopping criterion for active learning is a run of consecutive negatives whose length equals 5% of the total dataset
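
To make the two parameters concrete, here is a small sketch of how they could be applied. It is an illustration only: `to_label` and `should_stop` are hypothetical names, and two details are assumptions rather than documented behaviour, namely that a score exactly equal to tau counts as positive and that the 5% threshold is rounded up.

```python
import math

TAU = 0.5             # cut-off for the classic models
STOP_FRACTION = 0.05  # share of the dataset that must be negative in a row

def to_label(probability, tau=TAU):
    # Assumption: a score at or above tau counts as positive.
    return 1 if probability >= tau else 0

def should_stop(screened_labels, dataset_size, fraction=STOP_FRACTION):
    """screened_labels: 0/1 labels in the order the records were screened."""
    # Assumption: the threshold is rounded up and is at least 1.
    needed = max(1, math.ceil(fraction * dataset_size))
    # Count the negatives at the end of the screening history.
    run = 0
    for label in reversed(screened_labels):
        if label != 0:
            break
        run += 1
    return run >= needed
```

With a dataset of 100 records, this sketch stops once the five most recently screened records are all negative.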

### `prompt`
`prompt` lets the user define the template of their prompt. It follows RAG-style prompting in that the prompt has two parts: 'augmentation' (i.e. providing context) and 'prediction'. Following the RAG methodology is not mandatory, however; the augmentation part can simply be left out.

In both phases (augmentation and prediction) a list of items is presented. The user does not write the specific data points into the prompt; the environment fills them in from the datasets the user provides. At this stage, the user therefore only specifies the pattern that each item in the augmentation and prediction lists should follow.
The pattern combines user-provided text with placeholders for values extracted from the data points. The following placeholders are supported:
- `{record_id}`: The ID of the article in the dataset
- `{label_token}`: Since the dataset stores the label as 0 or 1, this placeholder lets the user present a different labeling scheme, using the values of the prompt attributes `positive_token` and `negative_token` instead. E.g. if the data point in question is positive, the placeholder is replaced by the value of `positive_token`.
- `{title}`
- `{abstract}`
- `{keywords}`

*Note*: If a list is provided, please place '{}' where the list should appear in the 'augmentation' or 'prediction' part.
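
The substitution described above can be pictured with the following sketch. It is not the environment's implementation: `render_item` and `render_part` are hypothetical helpers, and joining the items with newlines is an assumption. Plain string replacement is used rather than `str.format`, so that literal braces in JSON-like patterns are left untouched.

```python
def render_item(pattern, record, positive_token="", negative_token=""):
    """Fill the supported placeholders in one item pattern."""
    values = {
        "{record_id}": str(record.get("record_id", "")),
        "{title}": str(record.get("title", "")),
        "{abstract}": str(record.get("abstract", "")),
        "{keywords}": str(record.get("keywords", "")),
        # The 0/1 label is shown using the user's own labeling scheme.
        "{label_token}": positive_token if record.get("label") == 1 else negative_token,
    }
    for placeholder, value in values.items():
        pattern = pattern.replace(placeholder, value)
    return pattern

def render_part(template, pattern, records, **tokens):
    """Render every record and place the resulting list at the '{}' marker."""
    items = "\n".join(render_item(pattern, r, **tokens) for r in records)
    return template.replace("{}", items, 1)
```

Calling `render_part('Items: {}', '{"ID":"{record_id}", "STATUS":"{label_token}"}', records, positive_token='<POSITIVE>', negative_token='<NEGATIVE>')` would then yield one rendered line per record after the text 'Items: '.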


#### `positive_token` and `negative_token`:
If the prompt uses a labeling scheme, these attributes define it. They are also required for the environment to read the LLM's response correctly.

#### `prediction_method`:
- `Methods.ID`: Model returns a list of IDs for relevant items.
- `Methods.TOKEN`: Model returns a token (e.g., '<POSITIVE>' or '<NEGATIVE>') for each item.
- `Methods.ID_TOKEN`: Model returns both an ID and a token (e.g. '{134B:<POSITIVE>}') for each item.
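
To illustrate why the tokens matter for reading the response, the sketch below parses replies in the `TOKEN` and `ID_TOKEN` styles. The environment's actual parsing is not shown in this notebook, so both functions (`parse_tokens`, `parse_id_token`) and the regular expressions are assumptions made for illustration.

```python
import re

def parse_tokens(response, positive_token="<POSITIVE>", negative_token="<NEGATIVE>"):
    """TOKEN style: return 1/0 for each token, in order of appearance."""
    tokens = "|".join(re.escape(t) for t in (positive_token, negative_token))
    return [1 if t == positive_token else 0 for t in re.findall(tokens, response)]

def parse_id_token(response, positive_token="<POSITIVE>", negative_token="<NEGATIVE>"):
    """ID_TOKEN style: map record IDs to 1/0 from text such as '{134B:<POSITIVE>}'."""
    tokens = "|".join(re.escape(t) for t in (positive_token, negative_token))
    pattern = re.compile(r"\{\s*([^{}:\s]+)\s*:\s*(" + tokens + r")\s*\}")
    return {
        record_id: 1 if token == positive_token else 0
        for record_id, token in pattern.findall(response)
    }
```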



In [None]:
from prompt import Prompt, Methods

# ID method
prompt_id = Prompt(
    augmentation='List relevant items: {}',
    augmentation_item_pattern='{"ID":"{record_id}", content: {title} {abstract} }',
    prediction='Provide me the IDs of the relevant items here: {}',
    prediction_item_pattern='{"ID":"{record_id}", content: {title} {abstract} }',
    prediction_method=Methods.ID
)

# TOKEN method
prompt_token = Prompt(
    augmentation='Classify each item as <POSITIVE> or <NEGATIVE>: {}',
    augmentation_item_pattern='$$$ {title} {abstract} , STATUS={label_token}  $$$',
    prediction='Predict the STATUS: {}',
    prediction_item_pattern='$$${title} {abstract}, STATUS= ',
    positive_token='<POSITIVE>', 
    negative_token='<NEGATIVE>',
    prediction_method=Methods.TOKEN
)

# ID_TOKEN method
prompt_id_token = Prompt(
    augmentation='List items and their status: {}',
    augmentation_item_pattern='{"ID":"{record_id}", content: {title} {abstract}, "STATUS":"{label_token}" }',
    prediction='Which are relevant? {}',
    prediction_item_pattern='{"ID":"{record_id}", content: {title} {abstract} }',
    positive_token='<POSITIVE>',
    negative_token='<NEGATIVE>',
    prediction_method=Methods.ID_TOKEN
)


### Experiment

The experiment object acts as the interface between the user and the larger environment. To initialize it, pass the dataset to experiment on, either as a local file as shown here or as a direct link to a file. The columns that will be used later must also be specified.


In [None]:
from experiment import Experiment

exp = Experiment('example_dataset.csv', columns=['title', 'abstract'])

#### Approach
The supported approaches are 'active'/'active-learning' and 'zero-shot'/'few-shot'. The latter is resolved in two steps: first, the input has to contain 'shot' in its name; second, whether it is actually zero-shot or few-shot is determined by the number of positive/negative class examples, which is set later.
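
The two-step resolution can be sketched as follows. `resolve_approach` is a hypothetical function, and treating "no examples at all" as the zero-shot case is an assumption about how the environment decides.

```python
def resolve_approach(approach, positives=0, negatives=0):
    """Map the user's approach string to the approach that would actually run."""
    name = approach.lower()
    if "active" in name:
        return "active-learning"
    if "shot" in name:
        # Step 2: the example counts decide between zero-shot and few-shot.
        return "zero-shot" if positives + negatives == 0 else "few-shot"
    raise ValueError(f"Unsupported approach: {approach!r}")
```

Under this reading, passing 'few-shot' without any examples still results in a zero-shot run.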

#### Model
The following classical models are supported:
- **Naive bayes**
- **Random Forest**
- **Logistic Regression**

Simply write any part of the model name and it should be detected.
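
Partial-name detection of this kind could work roughly like the sketch below. `detect_classic_model` is a hypothetical helper; case-insensitive substring matching and rejecting ambiguous inputs are assumptions, not documented behaviour.

```python
CLASSIC_MODELS = ("naive bayes", "random forest", "logistic regression")

def detect_classic_model(name):
    """Return the classic model whose name contains `name`, or None."""
    query = name.strip().lower()
    if not query:
        return None
    matches = [model for model in CLASSIC_MODELS if query in model]
    # Assumption: an input that matches several models is treated as no match.
    return matches[0] if len(matches) == 1 else None
```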

The following LLMs are supported:
- **HU LLM 1 and 3**, e.g. simply write 'HU-LLM3'
- **Gemini**, e.g. 'gemini-2.0-flash' (follows Google's naming scheme)
- **Local models** reachable via OpenWebUI, e.g. 'OpenWebUI/qwen3:8b'
- **Any model reached via Hugging Face's inference API**, e.g. 'hf-inference/meta-llama/Llama-3.3-70B-Instruct'. To use a different provider, simply change the first part (i.e. replace 'hf-inference' with a different supported provider)

*Note*: If you use an LLM other than HU, you need to provide an API key via the `api_key` attribute of the experiment object
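
The 'provider/model' naming used for OpenWebUI and Hugging Face models can be read as a prefix followed by the model identifier. The sketch below shows one way to split such a name; `split_model_name` is hypothetical and the environment may resolve names differently.

```python
def split_model_name(name):
    """Split 'provider/model' at the first slash; provider is None if absent."""
    provider, separator, model = name.partition("/")
    if not separator:
        return None, name
    return provider, model
```

For 'hf-inference/meta-llama/Llama-3.3-70B-Instruct' this gives the provider 'hf-inference' and the model 'meta-llama/Llama-3.3-70B-Instruct', so swapping providers only touches the first part.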


In [None]:
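# Pick one of the following models.
# Classic model
model = 'bayes'
# Google Gemini LLM (this assignment overrides the one above)
model = 'gemini-2.0-flash'

exp.model = model

# Only needed for LLMs other than HU
exp.api_key = 'your_api_key_here'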

## Full Example: Running an Experiment
Below is a complete example using all the parameters above, including prompt, approach, and model.

In [None]:
from experiment import Experiment
from prompt import Prompt, Methods

# Define a prompt using TOKEN method
prompt = Prompt(
    augmentation='Classify each item as <POSITIVE> or <NEGATIVE>: {}',
    augmentation_item_pattern='$$$ {title} {abstract} , STATUS={label_token}  $$$',
    prediction='Predict the STATUS: {}',
    prediction_item_pattern='$$${title} {abstract}, STATUS= ',
    positive_token='<POSITIVE>',
    negative_token='<NEGATIVE>',
    prediction_method=Methods.TOKEN
)

# A link would work too, e.g. 'https://raw.githubusercontent.com/asreview/synergy-dataset/master/datasets/Cohen_2006/Nelson_2002_ids.csv'
exp = Experiment('Nelson_2002_ids.csv', columns=['title', 'abstract'])
exp.prompt = prompt

exp.approach = 'active'
# Define how many data points of the positive and negative class
# are used in the training/augmentation phase
exp.set_initial_data(positives=5, negatives=1)
# Define the batch settings, if applicable
exp.set_batch_settings(max_train_batch_size=50, max_predict_batch_size=10, delay=1)
exp.model = 'gemini-2.0-flash'
exp.api_key = 'your_api_key'
predictions = exp.run()
labels = exp.labels
exp.save()
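
The batch settings in the example above can be pictured with the following sketch. It is only an illustration of the idea: `iter_batches` and `predict_in_batches` are hypothetical, `predict_batch` stands in for whatever call sends one batch to the model, and reading `delay` as a pause in seconds between consecutive batches is an assumption.

```python
import time

def iter_batches(records, max_batch_size):
    """Yield consecutive slices of at most max_batch_size records."""
    for start in range(0, len(records), max_batch_size):
        yield records[start:start + max_batch_size]

def predict_in_batches(records, predict_batch, max_predict_batch_size=10, delay=1.0):
    """Send records to the model batch by batch, pausing between requests."""
    results = []
    for index, batch in enumerate(iter_batches(records, max_predict_batch_size)):
        if index > 0:
            # Assumption: the delay is a pause in seconds between batches.
            time.sleep(delay)
        results.extend(predict_batch(batch))
    return results
```

Smaller batches keep each prompt short, and the pause can help stay within a provider's rate limits.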