This repository contains the code to reproduce the experiments in the paper:
Paolo Ferragina, Marco Frasca, Giosuè Cataldo Marinò, and Giorgio Vinciguerra. On Nonlinear Learned String Indexing. 2023. (under review)
Refer to ANN/README.md for instructions on how to train and evaluate the artificial neural network models. Refer to FST/README.md for instructions on how to compile and run the configurations of the succinctly encoded trie. The plots can be created with the Python notebook in the results folder.
The real and synthetic datasets can be downloaded here. The original sources for the real datasets are Google Books Ngram, GeoNames, the Laboratory for Web Algorithmics at the University of Milan, and the Pizza&Chili corpus.