New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove built-in N-Triples parser/serializer #31
Comments
One usability issue: the built-in N-Triples parser would skip invalid lines. Many RDF dumps are invalid, and SERD just stops when this happens (without clear error message at the moment; would be useful to know which line is invalid). This means you'd have to correct the error and restart generation all over again. That's not a real reason to keep the built-in parser though, because its definition of "invalid" was too narrow: it would happily parse some invalid IRIs, resulting in an invalid HDT file. At least with SERD, we know that the generated HDT file will be valid. |
One minor bump in the road of removing the built-in N-Triples parser/serializer is that there are actually call sites in the code base that are hardcoded to use them, instead of cleanly going through a factory function that would encapsulate the multiple parser/serializer implementations. For example, header parsing in |
This is overall a little bit of a larger task than we might have thought, since the varying reader/writer implementations aren't cleanly abstracted nor dispatched through a registry. I think what I'll do is lift and adapt existing known-good code (that already abstracts over Serd and Raptor, and more) from our datagraph/librdf++ library (used in Dydra), and replace the I/O layer with that. That library also works on |
Seems like a good idea, especially given the |
Removed the built-in parser with bc7a258 and 09a1e6b because people kept hitting this issue (LinkedDataFragments/Client.js#26, LinkedDataFragments/Server.js#37, …). |
As discussed in #12, @RubenVerborgh and I would propose to remove the buggy built-in N-Triples parser and serializer implementation and rely on Serd instead.
This does mean that Serd would be required in order for the command-line utilities to be of any use (and it might hence make sense to make Serd a required dependency, instead of optional), but that's practically the case right now as well for anything nontrivial, given that the built-in parser is buggy.
Also, Serd is a compact enough library that potentially (if really needed) we could import it into our source tree, at which point there would be no external dependency to install.
Thoughts?
The text was updated successfully, but these errors were encountered: