TSV Results parse skips empty bindings in result #390

Merged (2 commits) on May 7, 2014

gweis (Member) commented May 5, 2014

Here is a failing test case, where a tsv result with possible optional bindings is being parsed, plus a proposed quick fix for it.
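
To make the symptom concrete, here is a minimal sketch in plain Python. It does not use the rdflib parser; the sample row and the helper names `row_dropping_empties` and `row_keeping_empties` are made up for illustration. It only shows why skipping an empty (unbound) cell shifts later values onto the wrong variables.

```python
# Hypothetical SPARQL TSV result: ?nick is unbound in the only data row.
TSV = '?name\t?nick\t?mbox\n"Bob"\t\t<mailto:bob@example.org>\n'


def row_dropping_empties(line, variables):
    """Mimic a parser that silently skips empty cells."""
    cells = [cell for cell in line.split("\t") if cell != ""]
    return dict(zip(variables, cells))


def row_keeping_empties(line, variables):
    """Keep one cell per column; an empty cell becomes None (unbound)."""
    cells = [cell if cell != "" else None for cell in line.split("\t")]
    return dict(zip(variables, cells))


header, *rows = TSV.splitlines()
variables = header.split("\t")

# The mailbox ends up bound to ?nick, and ?mbox is lost entirely.
print(row_dropping_empties(rows[0], variables))
# Every variable keeps its own column; ?nick is explicitly unbound.
print(row_keeping_empties(rows[0], variables))
```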

-      ) + Param('datatype', IRIREF).leaveWhitespace()))
+ ) + Param('datatype', IRIREF).leaveWhitespace()))

+NONE_VALUE = object()
joernhees (Member) commented on this line:

I'm not that deep into pyparsing, but I would have expected None instead of object() here. Then again, this way may cause fewer problems.

gweis (Member, Author) replied:

Yes, that’s what I thought as well, but pyparsing skips results that return None.
As far as I can tell, that is part of an internal protocol to signal that a term didn’t match.
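
The sentinel idea can be sketched without pyparsing. In the snippet below, `NONE_VALUE` is the name used in the patch, while `collect` and `convert` are hypothetical helpers: `collect` is a toy stand-in for any framework that treats None as "nothing produced", not a model of pyparsing's actual internals.

```python
# Unique sentinel: distinguishable from every real value, including None.
NONE_VALUE = object()


def collect(actions):
    """Toy stand-in for a framework that drops results equal to None."""
    results = []
    for action in actions:
        value = action()
        if value is not None:  # None is read as "nothing produced"
            results.append(value)
    return results


def convert(value):
    """Map the sentinel back to None once the framework is out of the way."""
    return None if value is NONE_VALUE else value


# Returning None loses the position of the empty binding.
with_none = collect([lambda: "a", lambda: None, lambda: "c"])

# Returning the sentinel keeps the position; None is restored afterwards.
with_sentinel = [
    convert(value)
    for value in collect([lambda: "a", lambda: NONE_VALUE, lambda: "c"])
]
```

The identity check (`is NONE_VALUE`) is what makes this safe: no parsed term can ever be the same object as the sentinel.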

In general I think TSV parsing might be faster with less pyparsing, e.g. by splitting each line on tabs and parsing each binding separately with something like from_n3().
That would require a bit more work to make it robust again, though. (I guess that’s something to discuss in another thread.)
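
A rough sketch of that alternative follows. It is not the rdflib implementation: `parse_tsv` and `parse_term` are hypothetical names, and `parse_term` is only a placeholder that returns the raw text where a real term parser (possibly something like from_n3()) would go. The sketch assumes that tabs and newlines inside literals arrive escaped, so a raw tab always separates columns, and that lines end with "\n".

```python
def parse_term(text):
    """Hypothetical placeholder for a real term parser; returns the raw text."""
    return text


def parse_tsv(data, term_parser=parse_term):
    """Split a TSV result on newlines and tabs, one binding per cell."""
    lines = data.split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # drop the empty piece after the final newline
    variables = [name.lstrip("?") for name in lines[0].split("\t")]
    rows = []
    for number, line in enumerate(lines[1:], start=2):
        cells = line.split("\t")
        if len(cells) != len(variables):
            raise ValueError(
                "line %d has %d cells, expected %d"
                % (number, len(cells), len(variables))
            )
        # An empty cell is an unbound variable and is kept as None.
        rows.append(
            {
                var: (term_parser(cell) if cell else None)
                for var, cell in zip(variables, cells)
            }
        )
    return variables, rows
```

Because `str.split("\t")` keeps empty strings, empty bindings keep their column position without any sentinel.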

The patch here is designed to have as little impact as possible on how the parser currently works.

thanks,

Gerhard

On 5 May 2014, at 12:29, Jörn Hees notifications@github.com wrote:

In rdflib/plugins/sparql/results/tsvresults.py:

@@ -29,10 +30,16 @@
 RDFLITERAL = Comp('literal', Param('string', String) + Optional(
 Param('lang', LANGTAG.leaveWhitespace()
 ) | Literal('^^').leaveWhitespace(
-      ) + Param('datatype', IRIREF).leaveWhitespace()))
+ ) + Param('datatype', IRIREF).leaveWhitespace()))

+NONE_VALUE = object()

joernhees (Member) commented

Thanks, seems good to me.

gromgull added a commit that referenced this pull request May 7, 2014
TSV Results parse skips empty bindings in result
gromgull merged commit b39e1f2 into RDFLib:master May 7, 2014

gromgull (Member) commented May 7, 2014

Thanks! Pyparsing is indeed slow - you can parse TSV much more quickly.

I would not give any guarantees for the existing from_n3 method - fixing it up would be a nice pull request :)

gweis (Member, Author) commented May 7, 2014

Agreed :)

from_n3 definitely needs some sort of validation, or it should return errors when a term is not valid. At the moment it accepts pretty much anything you throw at it, and I believe it returns at least an empty “string” when it cannot match anything.
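
As an illustration of the kind of validation meant here, the following sketch rejects input that does not look like a term instead of accepting it silently. It is not a proposal for from_n3 itself: `check_term` is a hypothetical helper, and the patterns are deliberately simplified and cover only a small part of the N3/Turtle term grammar.

```python
import re

# Deliberately simplified patterns; the real grammar is much richer.
IRI = re.compile(r'<[^<>"{}|^`\\\s]*>')
BNODE = re.compile(r'_:[A-Za-z0-9_]+')
LITERAL = re.compile(
    r'"(?:[^"\\]|\\.)*"'                  # quoted string with escapes
    r'(?:@[A-Za-z]+(?:-[A-Za-z0-9]+)*'    # optional language tag
    r'|\^\^<[^<>\s]*>)?'                  # or optional datatype IRI
)


def check_term(text):
    """Return (kind, text) for a recognised term, else raise ValueError."""
    for kind, pattern in (("iri", IRI), ("bnode", BNODE), ("literal", LITERAL)):
        if pattern.fullmatch(text):
            return kind, text
    raise ValueError("not a valid term: %r" % (text,))
```

With this shape, an empty or malformed cell raises an error at the point of parsing rather than turning into an empty string further down the line.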

I might look into updating from_n3 in the next two weeks unless someone else takes that over. I guess issue #192 is the place to discuss this further.

cheers


Labels: bug, serialization, SPARQL
3 participants