# DRILL Notebook
This Jupyter notebook runs [DRILL](ontolearn.learners.drill) and generates predictions. If you have not already done so, go to the main directory "Ontolearn" and run the dataset commands listed [here](https://ontolearn-docs-dice-group.netlify.app/usage/02_installation#download-external-files) to download the datasets.

In [None]:
import json
import numpy as np
from ontolearn.knowledge_base import KnowledgeBase
from ontolearn.learners import Drill
from ontolearn.learning_problem import PosNegLPStandard
from owlapy.owl_individual import OWLNamedIndividual, IRI
from ontolearn.metrics import F1
from sklearn.model_selection import StratifiedKFold
from ontolearn.utils.static_funcs import compute_f1_score
from owlapy.render import DLSyntaxObjectRenderer

Open `uncle_lp.json` where we have stored the learning problem for the concept of 'Uncle' and the path to the 'family' ontology.

In [None]:
with open('uncle_lp.json') as json_file:
    settings = json.load(json_file)
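
The contents of `uncle_lp.json` are not shown in this notebook. Judging only from the keys that the later cells read (`data_path`, `Uncle`, `positive_examples`, `negative_examples`), the file probably has roughly the shape sketched below. The path and IRIs here are placeholders, not the real values.

```python
import json

# Hypothetical stand-in for uncle_lp.json: the path and the IRIs are placeholders.
example_text = """
{
  "data_path": "path/to/family-ontology.owl",
  "Uncle": {
    "positive_examples": ["http://example.com/family#A", "http://example.com/family#B"],
    "negative_examples": ["http://example.com/family#C"]
  }
}
"""

example_settings = json.loads(example_text)

# The same lookups that the notebook performs on the real file.
print(example_settings["data_path"])
print(example_settings["Uncle"]["positive_examples"])
print(example_settings["Uncle"]["negative_examples"])
```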

Create an instance of the class `KnowledgeBase` using the path stored in `settings`.

In [None]:
kb = KnowledgeBase(path=settings['data_path'])

Retrieve the IRIs of the positive and negative examples of Uncle from `settings` and create an instance of `StratifiedKFold` so that we can split the examples into training and test sets.

In [None]:
examples = settings['Uncle']
p = set(examples['positive_examples'])
n = set(examples['negative_examples'])

kf = StratifiedKFold(n_splits=10, shuffle=True, random_state=1)
# Sets cannot be concatenated with `+`, so convert them to lists first.
X = np.array(list(p) + list(n))
Y = np.array([1.0 for _ in p] + [0.0 for _ in n])
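
To illustrate what stratification does, here is a small sketch on toy data. The IRIs and the `toy_*` names are made up for illustration. With 20 positives and 20 negatives split into 10 folds, each test fold should contain the same number of examples from both classes.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Toy data: placeholder IRIs, not taken from the family ontology.
toy_pos = [f"http://example.com/toy#pos{i}" for i in range(20)]
toy_neg = [f"http://example.com/toy#neg{i}" for i in range(20)]

toy_X = np.array(toy_pos + toy_neg)
toy_Y = np.array([1.0] * len(toy_pos) + [0.0] * len(toy_neg))

toy_kf = StratifiedKFold(n_splits=10, shuffle=True, random_state=1)

# Count positives and negatives in every test fold.
fold_counts = []
for train_index, test_index in toy_kf.split(toy_X, toy_Y):
    n_pos = int((toy_Y[test_index] == 1).sum())
    n_neg = int((toy_Y[test_index] == 0).sum())
    fold_counts.append((n_pos, n_neg))

print(fold_counts)
```

Because the class ratio is preserved in every fold, train and test scores are computed on comparable mixes of positive and negative examples.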

Create a model of [DRILL](ontolearn.learners.drill).

In [None]:
model = Drill(knowledge_base=kb,
              path_pretrained_kge="../embeddings/ConEx_Family/ConEx_entity_embeddings.csv",
              quality_func=F1(),
              max_runtime=10)

1. For each training/testing set create a learning problem of type `PosNegLPStandard`.
2. Fit the DRILL model on the training learning problem and retrieve the top prediction.
3. Compute the F1 score of the prediction on the train and test sets.
4. Print the prediction together with the quality.

In [None]:
for (ith, (train_index, test_index)) in enumerate(kf.split(X, Y)):
    #  (1)
    train_pos = {pos_individual for pos_individual in X[train_index][Y[train_index] == 1]}
    train_neg = {neg_individual for neg_individual in X[train_index][Y[train_index] == 0]}
    test_pos = {pos_individual for pos_individual in X[test_index][Y[test_index] == 1]}
    test_neg = {neg_individual for neg_individual in X[test_index][Y[test_index] == 0]}
    train_lp = PosNegLPStandard(pos=set(map(OWLNamedIndividual, map(IRI.create, train_pos))),
                                neg=set(map(OWLNamedIndividual, map(IRI.create, train_neg))))

    test_lp = PosNegLPStandard(pos=set(map(OWLNamedIndividual, map(IRI.create, test_pos))),
                               neg=set(map(OWLNamedIndividual, map(IRI.create, test_neg))))
    
    #  (2)
    pred_drill = model.fit(train_lp).best_hypotheses(n=1)

    #  (3)
    # Retrieve the individuals covered by the predicted concept once and reuse them.
    pred_individuals = {i for i in kb.individuals(pred_drill.concept)}
    train_f1_drill = compute_f1_score(individuals=pred_individuals,
                                      pos=train_lp.pos,
                                      neg=train_lp.neg)
    test_f1_drill = compute_f1_score(individuals=pred_individuals,
                                     pos=test_lp.pos,
                                     neg=test_lp.neg)
    
    #  (4)
    print(f"Prediction: {DLSyntaxObjectRenderer().render(pred_drill.concept)} | "
          f"Train Quality: {train_f1_drill:.3f} | "
          f"Test Quality: {test_f1_drill:.3f} \n")
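
For intuition about step (3), the sketch below computes a set-based F1 score under the usual definition: individuals covered by the prediction that are positives count as true positives, covered negatives as false positives, and uncovered positives as false negatives. The helper `f1_from_sets` is hypothetical and is not part of Ontolearn. The library's `compute_f1_score` may handle edge cases differently.

```python
def f1_from_sets(individuals, pos, neg):
    """Set-based F1 of a predicted individual set against pos/neg examples.

    Illustrative helper only; not the Ontolearn implementation.
    """
    tp = len(individuals & pos)   # positives covered by the prediction
    fp = len(individuals & neg)   # negatives covered by the prediction
    fn = len(pos - individuals)   # positives the prediction misses
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example with placeholder names instead of OWL individuals.
covered = {"a", "b", "c"}
toy_positives = {"a", "b", "d"}
toy_negatives = {"c", "e"}

toy_f1 = f1_from_sets(covered, toy_positives, toy_negatives)
print(f"{toy_f1:.3f}")
```

Here the prediction covers two of the three positives and one negative, so precision and recall are both 2/3.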