Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implemented function translateAlgebra. This functions takes a SPARQL … #1322

Merged
merged 4 commits into from Jun 29, 2021
Merged

Implemented function translateAlgebra. This functions takes a SPARQL … #1322

merged 4 commits into from Jun 29, 2021

Conversation

GreenfishK
Copy link

@GreenfishK GreenfishK commented May 18, 2021

…query algebra as an input and returns the query text. It should cover the whole SPARQL 1.1 specification for Select-Queries. Tests are in test/translate_algebra. I did not follow any specific test framework like Nose but wrote my own one which can simply be executed by running test/translate_algebra/main.py file.

The specifications I followed from here: https://www.w3.org/TR/sparql11-query/
Especially following sections:
17.3 Operator Mapping
17.4 Function Definitions
18.2 Translation to the SPARQL Algebra

Tests:
I tried to cover all possible constructs.
Two tests did not pass due to two potential bugs:

  1. n3() function of NegatedPath seems not to work properly (see test_property_path__negated_property_set)
  2. There is a problem in the parseQuery function with SERVICE. (see test_other__service1)

Reason for not covering ASK queries:
The query form "ASK" is currently not supported as its algebra tree that comes from translateQuery() might be wrongly implemented. ASK-queries do not project any variables in theory but the query algebra shows a "project" node with "vars". Therefore, it would require a special treatment in the new function I am proposing.

…query algebra as an input and returns the query text. Tests are in test/translate_algebra. I did not follow any specific test framework like Nose but wrote my own one.
@GreenfishK GreenfishK closed this May 18, 2021
@GreenfishK GreenfishK reopened this May 18, 2021
@coveralls
Copy link

coveralls commented May 18, 2021

Coverage Status

Coverage decreased (-1.8%) to 73.928% when pulling be00c8e on GreenfishK:translate_algebra into 58b23fd on RDFLib:master.

@GreenfishK
Copy link
Author

Automated tests passed for python 3.6 and 3.8 but not for 3.7, which is strange.

@rchateauneu
Copy link
Contributor

Automated tests passed for python 3.6 and 3.8 but not for 3.7, which is strange.

This is interesting because the same situation happened yesterday with #1321

@aucampia
Copy link
Member

Automated tests passed for python 3.6 and 3.8 but not for 3.7, which is strange.

This is interesting because the same situation happened yesterday with #1321

This error (OSError: [Errno 99] Cannot assign requested address) is something wrong with the test environment and not with the PR I think, just amend and push again to re-trigger the runner.

…query algebra as an input and returns the query text. Tests are in test/translate_algebra. I did not follow any specific test framework like Nose but wrote my own one.

Automated tests did not pass for python 3.7. Trying again
@aucampia
Copy link
Member

You can do an amend without actually changing anything, if I do git commit --amend that updates my commit ID without actually changing the description or changes. But this works also :)

@GreenfishK
Copy link
Author

You can do an amend without actually changing anything, if I do git commit --amend that updates my commit ID without actually changing the description or changes. But this works also :)

ooohh ok I did not know that. But very useful, thanks!

GreenfishK added 2 commits May 20, 2021 12:04
…atterns and solution modifiers. Added test (see test_integration__complex_query1.txt)

A potential bug has been found which occurs when non-aggregation variables are translated into Aggregate-Sample by the translateQuery function. A back-translation of the algebra will get rid of the "sample aggregation function" but only in cases when variables are not bound (see test_integration__complex_query1.txt) and the code under "elif node.name == "AggregateJoin"" where I include a treatment for SAMPLE)
@nicholascar
Copy link
Member

@GreenfishK apologies for not addressing this PR sooner - project work is busy towards the end of the Financial Year in June!

Can you please provide a bit more background, specifically what the motivation for this work is? I see a special test rig here and a series of SPARQL queries testing many aspects of conversions to/from the algebra, but what is the ultimate purpose of the work? Is it that, until now, there hasn't been any comprehensive test suite for rdflib's SPARQL handling and you want one due to bugs you've noticed? If this is the case, what can we add to documentation to show what parts of SPARQL are tested and what parts are not?

@GreenfishK
Copy link
Author

Hi @nicholascar,

thank you for the question. This project is not about testing conversions, it is about creating a single function that let's you provide a SPARQL algebra and in return get a SPARQL query. Currently, in rdflib only the other way around is possible, so, converting a SPARQL algebra into a query tree/query algebra.

The motivation behind this is Data Citation. Read more about it here "Data Citation WG | RDA" https://www.rd-alliance.org/groups/data-citation-wg.html

There is one recommendation to normalize queries in order to tell whether two queries are equal, so basically what query containment solvers do. My idea was to do the normalization not on the query but on the query's algebra, which is easier to work with. Once you have done this, you need to turn back the SPARQL algebra into a SOARQL query but there is currently no offered function by rdflib and that is what I am providing now.

@nicholascar
Copy link
Member

OK, so this is essentially a stand-alone new functionality then. Well that's fine!

I do know the RDA well, and even that WG. I was the co-chair of the Provenance Patterns WG when that WG was in operations and I reviewed their outputs, including the qurey re-writing stuff. I worked with Andreas Rauber on some elements of query normalisation. So happy to see work on this toolkit that supports those aims!

@GreenfishK
Copy link
Author

OK, so this is essentially a stand-alone new functionality then. Well that's fine!

I do know the RDA well, and even that WG. I was the co-chair of the Provenance Patterns WG when that WG was in operations and I reviewed their outputs, including the qurey re-writing stuff. I worked with Andreas Rauber on some elements of query normalisation. So happy to see work on this toolkit that supports those aims!

Wow that's such a coincidence! Thank you for approving it. Such a great feeling.

@nicholascar nicholascar merged commit 871f797 into RDFLib:master Jun 29, 2021
@nicholascar
Copy link
Member

@GreenfishK I've merged but would you consider adding a documetation section to one of the existing pages, perhaps https://rdflib.readthedocs.io/en/stable/intro_to_sparql.html - just a few lines at the end of that page about the capability you've introduced? The reason is that without a note in there, people won't know this capability's there.

@GreenfishK
Copy link
Author

@GreenfishK I've merged but would you consider adding a documetation section to one of the existing pages, perhaps https://rdflib.readthedocs.io/en/stable/intro_to_sparql.html - just a few lines at the end of that page about the capability you've introduced? The reason is that without a note in there, people won't know this capability's there.

Than you! Sure, i will document it.

@GreenfishK GreenfishK deleted the translate_algebra branch June 30, 2021 06:03
pass


def translateAlgebra(query_algebra: Query = None):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure under what circumstance it makes sense for this to be null, because if it is null traverse(query_algebra.algebra, visitPre=sparql_query_text) will fail.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. I will correct it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not too serious, just saw the error in mypy and thought I would let you know in case I was missing something.

aucampia added a commit to aucampia/rdflib that referenced this pull request Sep 10, 2021
Other changes:

- fix return value for `Graph.serialize` (refs RDFLib#1394)
- remove default value for
  `rdflib.plugins.sparql.algebra.translateAlgebra` (refs RDFLib#1322)
- add .dockerignore to reduce context size and make docker quicker
  to run.
- add .flake8 config to ignore line length as black is managing
  formatting.
- add mypy to docker-compose, makefile and tox.ini
- fix the way csv2rdf is invoked to ensure that the right code gets
  executed.
aucampia added a commit to aucampia/rdflib that referenced this pull request Sep 10, 2021
Other changes:

- fix return value for `Graph.serialize` (refs RDFLib#1394)
- remove default value for
  `rdflib.plugins.sparql.algebra.translateAlgebra` (refs RDFLib#1322)
- add .dockerignore to reduce context size and make docker quicker
  to run.
- add .flake8 config to ignore line length as black is managing
  formatting.
- add mypy to docker-compose, makefile and tox.ini
- fix the way csv2rdf is invoked to ensure that the right code gets
  executed.
aucampia added a commit to aucampia/rdflib that referenced this pull request Sep 10, 2021
Other changes:

- fix return value for `Graph.serialize` (refs RDFLib#1394)
- remove default value for
  `rdflib.plugins.sparql.algebra.translateAlgebra` (refs RDFLib#1322)
- add .dockerignore to reduce context size and make docker quicker
  to run.
- add .flake8 config to ignore line length as black is managing
  formatting.
- add mypy to docker-compose, makefile and tox.ini
- fix the way csv2rdf is invoked to ensure that the right code gets
  executed.
aucampia added a commit to aucampia/rdflib that referenced this pull request Sep 10, 2021
Fixes RDFLib#1311

Add `mypy` to .drone.yml and fix type errors that come up.

Not type checking examples or tests.

Other changes:

- fix return value for `Graph.serialize` (refs RDFLib#1394)
- remove default value for
  `rdflib.plugins.sparql.algebra.translateAlgebra` (refs RDFLib#1322)
- add .dockerignore to reduce context size and make docker quicker
  to run.
- add .flake8 config to ignore line length as black is managing
  formatting.
- add mypy to docker-compose, makefile and tox.ini
- fix the way csv2rdf is invoked to ensure that the right code gets
  executed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants