EmEL-V

EmEL-V is a geometric approach to generating embeddings for the description logic EL++. It is implemented in Python using the PyTorch library.

The code is organized as follows:

Experiments: Contains a separate folder for each ontology on which experiments are run.
The Experiments folder contains the models, data and results folders (create an empty results folder, and any other folders required to store the model).
The models folder contains the code, which takes a dataset name as input.
The corresponding dataset folder must be present in the data folder.
The corresponding results folder stores the trained model parameters (make sure to change the path for out_file in the code).
The evaluation metrics are implemented in Evaluating_HITS.py.
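
The folder layout described above can be prepared with a few lines of Python. This is a minimal sketch and not part of the repository: the helper name make_layout is hypothetical, and it assumes that models, data and results sit directly under Experiments, with one subfolder per dataset inside data.

```python
from pathlib import Path


def make_layout(root, dataset_name):
    """Create the Experiments folder tree for one dataset (hypothetical helper)."""
    experiments = Path(root) / "Experiments"
    created = []
    # Assumed layout: models/, data/<dataset_name>/ and an empty results/
    for sub in ("models", f"data/{dataset_name}", "results"):
        folder = experiments / sub
        # exist_ok=True makes the call safe to repeat on an existing tree
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created
```

Calling make_layout(".", "GALEN") from the repository root would create any of these folders that are missing and leave existing ones untouched.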

Experiments/data/{dataset_name}: This folder contains 4 processed files: the normalized form of the ontology file used for training, and the training, validation and testing sets obtained from the subclass relations in the ontology.

The implementation for the classification task is organised in three parts:

First: Given an ontology OWL file, we normalize it with the Normalizer.groovy script using the jcel jar. The Normalizer file can be found here

Command to normalize: groovy -cp jcel.jar Normalizer.groovy -i -o
Second: Using the normalized ontology, we identify the subclass relations and generate training, testing and validation sets using a 70%-20%-10% split.
Third: We train on the normalized ontology file after removing from it the 30% of subclass relation axioms held out for validation and testing. The validation data is used for hyper-parameter tuning, and the testing data is used to evaluate the tuned models.
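
The second and third parts can be sketched as follows. This is an illustration under stated assumptions, not the repository's preprocessing code: the function names are hypothetical, each subclass axiom is assumed to be a (subclass, superclass) tuple, and the 70%-20%-10% split is read as training, testing and validation in that order.

```python
import random


def split_subclass_axioms(axioms, seed=0):
    """Shuffle and split axioms into 70% train, 20% test, 10% validation."""
    shuffled = list(axioms)
    random.Random(seed).shuffle(shuffled)
    # Integer arithmetic avoids floating-point rounding in the cut points
    n_train = len(shuffled) * 7 // 10
    n_test = len(shuffled) * 2 // 10
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    valid = shuffled[n_train + n_test:]  # remainder, about 10%
    return train, test, valid


def training_axioms(normalized_axioms, test, valid):
    """Drop the held-out subclass axioms from the normalized ontology."""
    held_out = set(test) | set(valid)
    return [ax for ax in normalized_axioms if ax not in held_out]
```

Fixing the seed keeps the three sets reproducible across runs, which matters because the held-out axioms must be exactly the ones removed from the training ontology.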

Associated model files:

Experiments/models/EMEL_trans_m.py: EmEL model implementation with translation operation and variance.
Experiments/models/EMEL_trans_bayes.py: EmEL model implementation with translation operation and Bayesian inference.
Experiments/models/EMEL_sparse.py: EmEL model implementation with relations as matrices.
Experiments/models/EMEL_sparse_m.py: EmEL model implementation with relations as matrices and variance.

Executing the code:

Before executing the code, you need CUDA installed to use a GPU, along with the Python libraries listed in requirements.txt.
To execute the code, keep the directory structure as it is. The example below uses the GO dataset.
Go to the Experiments/models/ folder and run python EMEL_trans_m.py --data GO (provide other arguments if needed).
This starts the training. To change the dimension size, modify it in the code.
Training outputs the corresponding embeddings for classes and relations as pkl files in the results directory.
To evaluate the embeddings, run the Python script Evaluating_HITS.py and provide the path to the pkl files.
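
For readers unfamiliar with the metric, Hits@k is the fraction of test queries whose correct answer is ranked within the top k candidates. The sketch below illustrates the general idea only and is not the code in Evaluating_HITS.py: the function names are hypothetical, and it assumes each candidate class is scored by a distance where smaller means more plausible.

```python
def rank_of(true_class, distances):
    """1-based rank of true_class when candidates are sorted by ascending distance."""
    target = distances[true_class]
    # Count the candidates that are strictly closer than the true class
    return 1 + sum(1 for d in distances.values() if d < target)


def hits_at_k(ranks, k):
    """Fraction of queries whose true answer is ranked in the top k."""
    if not ranks:
        return 0.0
    return sum(1 for r in ranks if r <= k) / len(ranks)
```

With the ranks of all test axioms collected in a list, hits_at_k(ranks, 1), hits_at_k(ranks, 10) and hits_at_k(ranks, 100) give the usual Hits@1, Hits@10 and Hits@100 figures.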

Data

The ontologies used in our evaluation (SNOMED CT, GALEN, GO) are publicly available.

Create an Experiments/data folder to hold all the data.
