encode-tree-cpp

tl;dr Use autoencoders to facilitate high-dimensional data indexing (C++, LibTorch)

Problem/Motivation: Querying multidimensional data through an index is inefficient, often slower than a linear search. Traditional indexing structures (e.g., R*-Tree, X-Tree) succumb to the curse of dimensionality or to structure-specific issues (e.g., overlapping boundaries). Indexes such as the Pyramid technique and iMinMax adapt better to high-dimensional spaces but require manual fine-tuning according to the data distribution. In addition, all of these methods predate the current big data boom. Novel solutions are therefore required for efficient indexing in high dimensions.

Solution: 1) Use machine learning (i.e., an autoencoder) to reduce the data dimensionality automatically, and 2) index the embedded space with a conventional index structure (e.g., a B+-Tree).
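The two steps can be illustrated with a minimal sketch. It is not the repository's implementation: `ToyEncoder` is a hypothetical fixed linear projection standing in for a trained autoencoder's encoder, `std::multimap` stands in for a B+-Tree (both support ordered range scans), and `EmbeddedIndex` and `near` are names invented for this example. The query follows a filter-and-refine pattern, which is an assumption about how such an index could be searched.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

using Row = std::vector<float>;

// Hypothetical stand-in for a trained encoder: a fixed linear projection
// from d dimensions to a single scalar key.
struct ToyEncoder {
    std::vector<float> weights;

    float encode(const Row& row) const {
        assert(row.size() == weights.size());
        float key = 0.0f;
        for (std::size_t i = 0; i < row.size(); ++i) key += weights[i] * row[i];
        return key;
    }
};

// Ordered index over the 1-D embedded keys.
// std::multimap is used here in place of a B+-Tree (assumption for brevity).
class EmbeddedIndex {
public:
    EmbeddedIndex(ToyEncoder encoder, std::vector<Row> rows)
        : encoder_(std::move(encoder)), rows_(std::move(rows)) {
        for (std::size_t id = 0; id < rows_.size(); ++id)
            keys_.emplace(encoder_.encode(rows_[id]), id);
    }

    // Filter: scan keys within key_radius of the query's key.
    // Refine: keep candidates whose true Euclidean distance is <= dist_radius.
    // Returned row ids are ordered by embedded key.
    std::vector<std::size_t> near(const Row& query, float key_radius,
                                  float dist_radius) const {
        const float key = encoder_.encode(query);
        auto first = keys_.lower_bound(key - key_radius);
        auto last = keys_.upper_bound(key + key_radius);
        std::vector<std::size_t> hits;
        for (auto it = first; it != last; ++it)
            if (distance(rows_[it->second], query) <= dist_radius)
                hits.push_back(it->second);
        return hits;
    }

private:
    static float distance(const Row& a, const Row& b) {
        assert(a.size() == b.size());
        float sum = 0.0f;
        for (std::size_t i = 0; i < a.size(); ++i)
            sum += (a[i] - b[i]) * (a[i] - b[i]);
        return std::sqrt(sum);
    }

    ToyEncoder encoder_;
    std::vector<Row> rows_;
    std::multimap<float, std::size_t> keys_;
};
```

The refine step only removes false positives. If the key window is too narrow for the encoder in use, matching rows can be missed, so `key_radius` would have to be chosen with the encoder's behavior in mind.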

Input: Multidimensional data in tabular form (e.g., Spotify songs)

Output: An index for efficient multidimensional queries.

Notes:

- This project includes an autoencoder written in C++ using LibTorch (PyTorch's C++ API).
- The input data follow a standard CSV format. Examples and Python scripts to create the data are located in the `data` folder.
- `CustomLoaders.h` and `CSVLoader.h` are custom data loaders for reading your data.
- LibTorch is a great C++ library for machine learning but lacks documentation and tutorials. I hope my code will help others.
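As a rough illustration of the tabular CSV input, the sketch below reads numeric CSV text into rows of floats using only the standard library. It is not the code in `CSVLoader.h`. The function name `parse_numeric_csv` is hypothetical, and the format details are assumptions: comma delimiter, an optional single header line, no quoted fields, and every cell numeric.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Parse numeric CSV text into rows of floats.
// Assumptions (not taken from the repository): comma-separated cells,
// optional single header line, no quoted fields, all cells numeric.
// std::stof throws std::invalid_argument on a non-numeric cell.
std::vector<std::vector<float>> parse_numeric_csv(std::istream& in,
                                                  bool has_header) {
    std::vector<std::vector<float>> rows;
    std::string line;
    if (has_header) std::getline(in, line);  // discard the column names

    while (std::getline(in, line)) {
        // Tolerate Windows line endings and skip blank lines.
        if (!line.empty() && line.back() == '\r') line.pop_back();
        if (line.empty()) continue;

        std::vector<float> row;
        std::stringstream cells(line);
        std::string cell;
        while (std::getline(cells, cell, ','))
            row.push_back(std::stof(cell));

        // Every row is expected to have the same number of columns.
        assert(rows.empty() || row.size() == rows.front().size());
        rows.push_back(std::move(row));
    }
    return rows;
}
```

Each parsed row corresponds to one record (e.g., one song) and each column to one feature, which is the shape an autoencoder would take as input.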

