This repository hosts the course project for the "LT2318: Artificiell intelligens. This project is a reimpelemntation of a paper titled "Encoding Spatial Relations from Natural Language" [1].
Spatial language involves words and phrases describing objects' position, orientation, movement, and relationships in space. Examples of spatial language include terms such as "above," "below," "near," "far," etc. Spatial language is an important aspect of human communication, as it allows us to describe and understand the world around us. For example, when we see a car moving, we might describe its movement using spatial language, saying that the car is moving "backward" or "forward." This allows us to communicate our observations clearly and helps us understand the spatial relationships between objects in the world around us.
The paper "Encoding Spatial Relations from Natural Language" presents a system capable of capturing the semantics of spatial relations from natural language descriptions. The system uses a multi-modal objective to generate images of scenes from their textual descriptions. The SLIM dataset was proposed for this reason.
Dataset information can be found here
Code is tested using python 3.9. Use conda create --name <env> --file requirements.txt
to create virual env and to install the required libraries.
Dataset download and conversion can be found here
Processed dataset files are under /srv/data/zarzouram/lt2318/slim/turk_torch/
run_train.py
expects the following arguments:
- dataset_dir: The parent directory contains the processed dataset files. It is the sas the output_dir in Section 2.3 Create Dataset
- config_path: Path for the configuration json file default is
./codes/config.json
- checkpoints_dir: directory where checkpoints are saved
- checkpoint_model: if train resuming is needed pass the checkpoint filename
- device: either gpu or cpu
Loss are tracked using Tensorboard. The path to
tensoboard files is ./logs
.
python code/run_train.py [ARGUMENT]
You do not need to create new datasets, use python code/run_train.py
to use
the default arguments.
Model testing are done in the experiments.ipynb
notebook. The notebook is
configured to load the test results done by me from
/srv/data/zarzouram/lt2318/test_outputs
.
If you want to re-test the model, under the Testing Model
section, please
change the retest
value to True
. Please do not forget to change the path
in the save_data
otherwise you will override the saved test results.
Please see the attached report, under paper
[1] Ramalho, T., Kočiský, T., Besse, F., Eslami, S. M., Melis, G., Viola, F., ... & Hermann, K. M. (2018). Encoding spatial relations from natural language. arXiv preprint arXiv:1807.01670.