Hi there!
I came across qlever while researching implementations of RDF graph databases.
Though I read the "Knowledge-Base Index" section in the CIKM'17 paper and the "Engines and indexing" section in the 2023 book chapter, and tried to read through `src/index` (especially `IndexImpl`), it is still a bit unclear to me what the high-level process is for indexing an input file of unsorted triples (e.g. from a Turtle or N-Triples file).
Could you please share the high-level steps that are performed, ideally without leaving out optimizations such as parallel processing?
For example, after a triple is parsed from the input file, IDs are derived for its three parts. How are the individual permutations then generated? Are all triples first written in ID representation to a temporary location (e.g. a temporary file or a memory-mapped data structure), and then, for each permutation/index, is all of that ID data re-sorted and re-read to build the current permutation's index?
Such a high-level explanation might be useful as developer documentation and could help onboard new contributors.
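To make the question concrete, the process hypothesized above can be sketched in a few lines of Python. This is only an illustration of the guess in this question, not QLever's actual implementation: all names (`build_vocabulary`, `encode`, `build_permutations`, the toy `TRIPLES`) are hypothetical, everything is kept in memory, and a real engine would presumably use external (on-disk) sorting and parallelism instead.

```python
from itertools import permutations

# Hypothetical toy input: unsorted (subject, predicate, object) triples.
TRIPLES = [
    ("<b>", "<knows>", "<a>"),
    ("<a>", "<knows>", "<b>"),
    ("<a>", "<name>", '"Alice"'),
]


def build_vocabulary(triples):
    """Assign a dense integer ID to every distinct term, in sorted term order."""
    terms = sorted({term for triple in triples for term in triple})
    return {term: idx for idx, term in enumerate(terms)}


def encode(triples, vocab):
    """Replace each term by its ID; this stands in for the temporary ID file."""
    return [tuple(vocab[term] for term in triple) for triple in triples]


def build_permutations(id_triples):
    """Re-sort the same ID triples once per permutation (SPO, SOP, PSO, ...)."""
    index = {}
    for order in permutations(range(3)):
        name = "".join("SPO"[i] for i in order)
        # Reorder the columns of every triple, then sort lexicographically.
        index[name] = sorted(tuple(t[i] for i in order) for t in id_triples)
    return index


vocab = build_vocabulary(TRIPLES)
id_triples = encode(TRIPLES, vocab)
index = build_permutations(id_triples)
print(index["POS"])
```

In this sketch the ID triples are materialized once and then re-sorted six times, once per permutation. Whether QLever does this, or derives some permutations from others more cheaply, is exactly what the question is asking.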