Located Triples #1351

Draft
wants to merge 84 commits into
base: master

Member

@Qup42 Qup42 commented May 19, 2024

SPARQL 1.1 Update

Status

It should mostly work. I'm now cleaning up and adding tests.

  1. Insert triples
INSERT DATA {
  <http://wallscope.co.uk/resource/olympics/athlete/MeganOlwenDevenishTaylorMandevilleEllis> <foo> <baz>
}

or

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
INSERT {
  ?a <foo> <baz>
} WHERE {
  ?a foaf:gender <http://wallscope.co.uk/resource/olympics/gender/F> .
}

  2. Retrieve triples
SELECT * WHERE {
  <http://wallscope.co.uk/resource/olympics/athlete/MeganOlwenDevenishTaylorMandevilleEllis> ?b ?c 
}

or

SELECT * WHERE {
  ?a <foo> <baz>
}
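
For readers new to the approach, here is a minimal C++ sketch of how inserted and deleted triples could be layered over an immutable index so that the queries above see the updated data. This is not the code in this PR: `DeltaOverlay`, `Triple`, and the set-based storage are hypothetical names chosen for illustration, and the sketch assumes that updates are recorded separately from the base index.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <set>
#include <utility>

// Hypothetical stand-in for a triple of internal IDs (subject, predicate, object).
using Triple = std::array<std::uint64_t, 3>;

// Illustrative overlay, NOT the DeltaTriples class from this PR: the base
// index is never modified, updates are recorded in two side sets.
class DeltaOverlay {
 public:
  explicit DeltaOverlay(std::set<Triple> base) : base_(std::move(base)) {}

  // INSERT: undo a pending deletion; record the triple only if it is new.
  void insert(const Triple& t) {
    deleted_.erase(t);
    if (base_.count(t) == 0) inserted_.insert(t);
  }

  // DELETE: undo a pending insertion; mark the triple only if the base has it.
  void erase(const Triple& t) {
    inserted_.erase(t);
    if (base_.count(t) > 0) deleted_.insert(t);
  }

  // A query sees: inserted triples, plus base triples that are not deleted.
  bool contains(const Triple& t) const {
    return inserted_.count(t) > 0 ||
           (base_.count(t) > 0 && deleted_.count(t) == 0);
  }

  // Valid because deleted_ is a subset of base_ and inserted_ is disjoint from it.
  std::size_t size() const {
    return base_.size() - deleted_.size() + inserted_.size();
  }

 private:
  std::set<Triple> base_;
  std::set<Triple> inserted_;
  std::set<Triple> deleted_;
};
```

In this sketch, inserting a triple that was previously deleted simply cancels the deletion, so the side sets never grow beyond the net difference to the base index.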

TODOs:

  • Unit Tests
    • CompressedRelationReader
    • DeltaTriples
  • Split up into smaller chunks
    • LocatedTriple
    • DeltaTriple
    • integration

Purpose

The purpose of this PR is to let us discuss the current state. The plan is to split it into smaller, more manageable PRs that are then reviewed individually.

Open design decision

  • An update will report different or additional time measurements compared to a normal query. Which fields should be chosen (for consistency), and how should they be displayed in the UI?

Hannah Bast and others added 22 commits June 9, 2023 16:47
This is the first part of a series of PRs split off from the large
proof-of-concept PR ad-freiburg#916,
which realizes SPARQL 1.1 Update.
This PR is from June 2023. Since then, quite a bit of code that is
relevant for this PR has been refactored. In particular: the
`Permutation` class, the `CompressedRelationWriter` and the index
building functions in `IndexImpl`. Mostly, this refactoring has made the
code in this PR simpler. There is still some awkwardness in
`LocatedTriplesTest.cpp` because we want to build an index from `Id`
triples there (and not from Turtle input).

Now everything compiles and runs through again. Various tests in
`LocatedTriplesTest` fail; that's what I will look at next.
This makes the actual changes hard to spot
Both cases would have done a full scan anyway, so we can do that directly without retrieving the relation metadata.
# Conflicts:
#	test/ValueIdTest.cpp
@Qup42
Member Author

Qup42 commented May 20, 2024

Some data:

  • Built with Release CMake Profile against cfc581e
  • Executed on my laptop (AMD Ryzen 7840U, 32GB RAM, SSD)
  • Olympics dataset
    • ~400k and ~8M triples are to be located, once each for deletion and insertion
    • times are averaged over all permutations

~400k

| | IdTable, random-access | IdTable, pre-sorting | std::vector, random-access | std::vector, pre-sorting |
|---|---|---|---|---|
| Result materialisation | 70ms | 70ms | 70ms | 70ms |
| Sorting | N/A | 60ms-160ms | N/A | 20ms |
| Location | 15ms-20ms | 10ms-20ms | 15ms-20ms | 10ms-20ms |

~8M

| | IdTable, random-access | IdTable, pre-sorting | std::vector, random-access | std::vector, pre-sorting |
|---|---|---|---|---|
| Result materialisation | 1330ms | 1330ms | 1330ms | 1330ms |
| Sorting | N/A | 1500ms-4000ms | N/A | 460ms-510ms |
| Location | 330ms-450ms | 240ms-360ms | 330ms-460ms | 220ms-350ms |
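
To make the two strategies in the tables concrete, here is a rough C++ sketch of what "random-access" and "pre-sorting" location could look like. It reflects my reading of the column names, not the actual implementation: the function names, the `Triple` alias, and the representation of a permutation as a sorted list of per-block last triples (`blockLast`) are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a triple of internal IDs in one permutation.
using Triple = std::array<std::uint64_t, 3>;

// Assumption: blockLast[i] is the largest triple in block i, and blockLast is
// sorted. A triple is "located" at the first block whose last triple is not
// smaller than it; blockLast.size() means "after the last block".

// Random-access: one independent binary search per triple, in input order.
std::vector<std::size_t> locateRandomAccess(
    const std::vector<Triple>& blockLast, const std::vector<Triple>& triples) {
  std::vector<std::size_t> result;
  result.reserve(triples.size());
  for (const Triple& t : triples) {
    auto it = std::lower_bound(blockLast.begin(), blockLast.end(), t);
    result.push_back(static_cast<std::size_t>(it - blockLast.begin()));
  }
  return result;
}

// Pre-sorting: sort the triples in place, then walk the blocks once.
// The result is ordered like the sorted triples, not like the input.
std::vector<std::size_t> locatePreSorted(const std::vector<Triple>& blockLast,
                                         std::vector<Triple>& triples) {
  std::sort(triples.begin(), triples.end());
  std::vector<std::size_t> result;
  result.reserve(triples.size());
  std::size_t block = 0;
  for (const Triple& t : triples) {
    // The block index only moves forward because the triples are sorted.
    while (block < blockLast.size() && blockLast[block] < t) ++block;
    result.push_back(block);
  }
  return result;
}
```

Under these assumptions, pre-sorting trades an up-front sort for a single sequential pass over the blocks. In the measurements above, the sorting step costs more than the location step saves.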

@Qup42 Qup42 mentioned this pull request May 22, 2024

codecov bot commented Jun 14, 2024

Codecov Report

Attention: Patch coverage is 70.41667% with 142 lines in your changes missing coverage. Please review.

Project coverage is 88.55%. Comparing base (f9e730c) to head (57364ec).

| Files | Patch % | Lines |
|---|---|---|
| src/index/DeltaTriples.cpp | 12.12% | 115 Missing and 1 partial ⚠️ |
| src/index/CompressedRelation.h | 12.50% | 7 Missing ⚠️ |
| src/index/Index.cpp | 14.28% | 6 Missing ⚠️ |
| src/index/DeltaTriples.h | 16.66% | 5 Missing ⚠️ |
| src/index/LocatedTriples.cpp | 97.51% | 0 Missing and 4 partials ⚠️ |
| src/index/CompressedRelation.cpp | 97.95% | 1 Missing and 1 partial ⚠️ |
| src/index/IndexImpl.h | 0.00% | 2 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1351      +/-   ##
==========================================
- Coverage   88.89%   88.55%   -0.34%     
==========================================
  Files         327      332       +5     
  Lines       28974    29378     +404     
  Branches     3210     3277      +67     
==========================================
+ Hits        25756    26016     +260     
- Misses       2066     2205     +139     
- Partials     1152     1157       +5     

sonarcloud bot commented Jun 19, 2024
