# Simulate with Python API

For more control over how ASReview works, use the ASReview Python API. With the API you can, for example, plug in custom models or implement different sampling strategies. This example shows how to simulate a review with the ASReview API and store the results in an ASReview project file.

In [2]:
from pathlib import Path

import asreview as asr
from synergy_dataset import Dataset

In this example, we use a dataset from the SYNERGY collection via the `synergy-dataset` package.

In [6]:
d = Dataset("Hall_2012").to_frame()
d.head()

| openalex_id | doi | title | abstract | label_included |
|---|---|---|---|---|
| https://openalex.org/W2131536587 | https://doi.org/10.1109/indcon.2010.5712716 | Computer vision based offset error computation... | The use of computer vision based approach has ... | 0 |
| https://openalex.org/W2557025555 | https://doi.org/10.1109/induscon.2010.5740045 | Design and development of a software for fault... | This paper presents an on-line fault diagnosis... | 0 |
| https://openalex.org/W2143148279 | https://doi.org/10.1109/tpwrd.2005.848672 | Analytical Approach to Internal Fault Simulati... | A new method for simulating faulted transforme... | 0 |
| https://openalex.org/W2111816457 | https://doi.org/10.1109/icelmach.2008.4799852 | Nonlinear equivalent circuit model of a tracti... | The paper presents the development of an equiv... | 0 |
| https://openalex.org/W3142547111 | https://doi.org/10.1109/ipdps.2006.1639408 | Fault tolerance with real-time Java | After having drawn up a state of the art on th... | 0 |


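Import the model components used in the simulation: a balancer, a classifier, a feature extractor, two queriers, and a stopper.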

In [10]:
from asreview.models.balancers import Balanced
from asreview.models.classifiers import SVM
from asreview.models.feature_extractors import Tfidf
from asreview.models.queriers import Max, TopDown
from asreview.models.stoppers import IsFittable

Define two learning cycles, one that combines the `TopDown` querier with the `IsFittable` stopper and one that combines the `Max` querier with an SVM classifier, and pass them to `asr.Simulate` together with the dataset and its labels.

In [None]:
learners = [
    asr.ActiveLearningCycle(querier=TopDown(), stopper=IsFittable()),
    asr.ActiveLearningCycle(
        querier=Max(),
        classifier=SVM(C=3),
        balancer=Balanced(ratio=5),
        feature_extractor=Tfidf()
    ),
]

sim = asr.Simulate(
    d,
    d["label_included"],
    learners,
)
sim.review()

Relevant records found:   1%|          | 1/104 [00:00<00:17,  5.79it/s]
Records labeled       :   1%|          | 69/8793 [00:00<00:21, 402.65it/s]


Loss: 0.008
NDCG: 0.163
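
To make the two-cycle setup above more concrete, here is a minimal pure-Python sketch of the idea. It is a toy stand-in, not the ASReview implementation. It assumes that the first cycle labels records in dataset order until both a relevant and an irrelevant record have been seen, so that a classifier could be fitted, and that the second cycle then always picks the unlabeled record with the highest score. The names `simulate`, `records_until_all_relevant`, and the fixed `scores` list are hypothetical; a real cycle would refit the model after each newly labeled record.

```python
# Toy stand-in for a two-stage simulation; not the ASReview implementation.


def simulate(labels, scores):
    """Return the order in which record indices are labeled.

    labels: list of 0/1 ground-truth labels (1 = relevant).
    scores: hypothetical relevance scores standing in for classifier output.
    """
    unlabeled = list(range(len(labels)))
    order = []

    # Stage 1 (assumed top-down behavior): label records in dataset order
    # until both classes have been seen at least once.
    seen = set()
    while unlabeled and len(seen) < 2:
        idx = unlabeled.pop(0)
        order.append(idx)
        seen.add(labels[idx])

    # Stage 2 (assumed maximum-score behavior): repeatedly label the
    # unlabeled record with the highest score. The scores stay fixed here,
    # which is a simplification of retraining after each label.
    while unlabeled:
        idx = max(unlabeled, key=lambda i: scores[i])
        unlabeled.remove(idx)
        order.append(idx)

    return order


def records_until_all_relevant(labels, order):
    """Count how many records are screened before all relevant ones are found."""
    remaining = sum(labels)
    if remaining == 0:
        return 0
    for position, idx in enumerate(order, start=1):
        remaining -= labels[idx]
        if remaining == 0:
            return position
    return len(order)


# Hypothetical toy data: 8 records, 3 of them relevant.
labels = [0, 0, 1, 0, 1, 0, 0, 1]
scores = [0.1, 0.2, 0.9, 0.3, 0.8, 0.05, 0.4, 0.7]

order = simulate(labels, scores)
print(order)
print(records_until_all_relevant(labels, order))
```

In this toy run, the first three records are labeled in dataset order because the first relevant record sits at index 2. The remaining records are then labeled by descending score.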



