Preserve Pretty Turtle syntax after PATCH #1438
Comments
Related: Use relative URIs #1366 |
This will require changes in the serialization dependencies Pinging @rubensworks and @RubenVerborgh for input on the serializers. |
Perhaps a (non-streaming) postprocessing mode could be added to N3.js indeed to convert lists to their compact representation. But perhaps this should be an optional or profile-based mode, as this is probably not something that's desired by default. Alternatively, it may even be easier to implement support for the expanded list syntax in rdflib? |
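As a rough illustration of what such a non-streaming postprocessing step could look like, here is a minimal Python sketch that finds rdf:first/rdf:rest chains in an in-memory set of triples and groups them back into lists. It is not N3.js code: the function name `collapse_lists` and the tuple-of-strings triple representation are assumptions made for this example, and it assumes well-formed, non-cyclic lists.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_FIRST, RDF_REST, RDF_NIL = RDF + "first", RDF + "rest", RDF + "nil"

def collapse_lists(triples):
    """Return (remaining_triples, lists); lists maps each list head to its members.

    Hypothetical helper: needs every triple in memory, so it cannot run
    as a streaming step. Assumes well-formed lists ending in rdf:nil.
    """
    first = {s: o for s, p, o in triples if p == RDF_FIRST}
    rest = {s: o for s, p, o in triples if p == RDF_REST}
    # A head is a list node that is not the rdf:rest of another node.
    tails = set(rest.values())
    lists = {}
    for head in first:
        if head in tails:
            continue
        members, node = [], head
        while node != RDF_NIL:
            members.append(first[node])
            node = rest[node]
        lists[head] = members
    # Everything that is not list plumbing is passed through unchanged.
    remaining = [t for t in triples if t[1] not in (RDF_FIRST, RDF_REST)]
    return remaining, lists
```

A serializer could then write each collected list in the compact ( ... ) form wherever its head node appears as an object. A production version would also have to check that list nodes are blank nodes referenced only once, which this sketch skips.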
@timbl In practice, this could be solved easily. CSS allows plugging in different parsers and serializers and even patchers, so we could just plug in rdflib. However, the bigger picture, as @joachimvh mentions, is the streaming architecture that is key to high-performance RDF. Plugging in rdflib as default (or any parser/serializer algorithm that requires everything to be in memory) constrains the supported file size. And people have tried to handle multi-gigabyte files, successfully. We could try to relax this requirement from a necessity (= mandated by Solid spec) to a nice-to-have (= developer usability):
Can we fix this in Mashlib too? Since the Turtle list syntax does not have RDF-level model semantics, and some syntaxes do not have special list support, we would expect Mashlib to be syntax-invariant. In any case, CSS implementing this would not be enough for Mashlib to rely on it; it would need to be a spec requirement. We can consider it for developer usability. |
If we only consider PATCH, I understood that it is memory-constrained. Is that correct? |
Is this related to linkeddata/rdflib.js#567? |
At the moment; but streaming patch is possible in some cases.
No, there is no relation. |
@joachimvh says "One issue will be that all data in CSS is handled as a stream, meaning that the serializer can't know what the contents of a list are while it is serializing." I don't see that the |
But on a high level, consider @RubenVerborgh's comment: "However, the bigger picture, as @joachimvh mentions, is the streaming architecture that is key to high-performance RDF." My to-do list broke when I tried to tweak the config with CSS. The original pretty file from NSS (1.8k) became the ugly file stored by CSS (4.8k). It changed from something quite readable to something quite unreadable. I don't like that CSS is incrementally making my pod uglier and more verbose. But much more importantly, I don't like the idea that developers who start to use their own to-do lists as examples won't be able to see clearly what is going on. |
But we should be clear: the root cause is that Mashlib makes assumptions that are not afforded by the Solid Protocol. CSS is fulfilling a role of devil's advocate here: we implement the spec to the letter, and tend to not make accommodations to individual apps, because then we'd give everyone a false feeling of safety.
I don't like it either; but we really need to separate incorrect app breakage from developer usability.
But serializing isn't. With streaming, we mean arbitrarily sized streams.
The comparison is incomplete: NSS will ugly crash on large files or lots of small files. A single 8GB file will definitely do, but try things like CSS, in contrast, does not crash on any of these because it serializes with constant memory.
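To make the constant-memory point concrete, here is a minimal Python sketch of a streaming serializer. It is not the CSS or N3.js implementation: `serialize_stream` is a hypothetical name, and terms are assumed to arrive as already-formatted strings. Each triple is written out and then forgotten, which is why memory use stays flat regardless of input size, and also why the serializer cannot look ahead to gather the members of a list.

```python
def serialize_stream(triples):
    """Yield one N-Triples-style line per incoming triple.

    Hypothetical sketch: no earlier triple is retained, so memory use does
    not grow with the input, but list members cannot be grouped either.
    """
    for s, p, o in triples:
        yield f"{s} {p} {o} .\n"
```

Because the function is a generator, it can consume an arbitrarily long iterator of triples and pass lines straight to an output stream.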
For reference, @TallTed has made a similar argument here: solid/specification#342 The feature request in the current issue is even stronger: to also support PATCH with minimal syntax changes. |
Agreed mashlib -- rdflib -- should recognize incoming first and rest. |
They do now. That was a distraction. The issue is the ugliness to the developer, sometimes the user, and the inconsistency for the source code management (SCM) system. Prompted mainly by this I wrote a new article. I am concerned that if and when we switch solidcommunity.net to CSS, the illegibility of the files after patch will be a massive hurdle for developers. We lose the view source effect. We change RDF from a simple understandable language to an incomprehensible mess. From that point of view this issue should be classed as a bug, not an enhancement. If the data is all in memory then the speed argument doesn't hold; the sort can be fast. Let's add a serializer with the same algorithm as rdflib to the CSS stack. The output will still be a stream; just the input will be random access to the store, or a place to sort in memory. |
@timbl I'm afraid this would lead us onto a rather slippery slope. I'm in favor of pretty printing; it has a place when it comes to developer usability. The question is where that place is, and your suggestion is to make that place the Solid HTTP interface and the underlying storage, and mandate this as if it were a spec. While I understand this point of view, the caveats I discuss below make me conclude that an app is a better place.

1. No definition of syntax-preserving

First, there is a lack of proper definition. What exactly does preserving syntax mean? Exact tabs, spaces, comments, escape sequences for literals? rdflib.js isn't syntax-preserving: it similarly re-serializes its own output, but it wouldn't preserve the syntax of a file I wrote by hand or via N3.js. There's no standard or specification for Turtle syntax preservation, which means that we can't commit without disappointing a lot of people. Committing to this would open the door to loads of bug reports where someone's specific syntax feature isn't preserved, and they would all be right.

2. On-disk serialization is beyond the Solid Protocol

Second, your specific use case concerns the on-disk syntax and its presumed connection to HTTP. This is a very specific case, for which I have not encountered other users yet. The specific request here is to preserve the on-disk serialization such that it works well with version control systems. This implicitly assumes that the on-disk and over-HTTP representations are the same, which does not need to be the case, given that the Solid Protocol only governs the HTTP representation (and does not put any syntactical restrictions on it). Furthermore, there might not even be an on-disk file version, if the back-end is an RDF database. If the goal is minimal changes across on-disk versions, writing canonical RDF will yield even better results. Independently, we could then apply a Turtle pretty printer over the HTTP interface to reach the goal of being "view source" friendly, which also works with database backends.

3. No consistent experience

Third, CSS implementing this on disk would not help with a consistent developer experience for the entire ecosystem. CSS deliberately tries to do the minimum so as not to create false expectations. If we were somehow to promise syntax-preserving

Conclusion: should it be an app?

Summarizing, haphazardly committing to syntax constraints in CSS would introduce several additional assumptions into its corner of the Solid ecosystem (defining syntax preservation, linking on-disk storage and HTTP syntax…) that do not generalize to other more typical use cases. Unless we spec them, which would lead to other issues. It seems much more in the Solid spirit of app/data separation to implement pretty printing of data as an app. That way, developers can use the app to see pretty Turtle of their preference with any Solid server or backend, with neither the app nor the server going beyond the mandate of the Solid Protocol. |
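The remark about canonical RDF and minimal on-disk changes can be illustrated with a small Python sketch. It only sorts N-Triples-style lines, which is a simplification: real RDF canonicalization also has to assign stable labels to blank nodes, and that step is not attempted here. The function names `canonical_lines` and `changed_lines` are hypothetical and exist only for this example.

```python
def canonical_lines(triples):
    """Return de-duplicated N-Triples-style lines in sorted order.

    Simplified stand-in for canonical RDF: blank node relabeling is omitted.
    """
    return sorted({f"{s} {p} {o} ." for s, p, o in triples})

def changed_lines(old, new):
    """Return (removed, added) lines between two canonical serializations."""
    old_set, new_set = set(old), set(new)
    return sorted(old_set - new_set), sorted(new_set - old_set)
```

Since the output order depends only on the triples themselves, adding one triple changes exactly one line on disk, whatever order the triples were written in, which is the property a version control system benefits from.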
Consider... View Source of HTML documents preserves line breaks, whitespace, etc., of the original document on the server. View Rendering of HTML documents may drop line breaks, fold whitespace, etc. This is the kind of difference I've been talking about with regard to Turtle (and other RDF serialization) documents. If I view SOURCE, I expect to see the indentations, line breaks, comments, etc., preserved as uploaded. If I view RDF, I expect to see varying levels and kinds of pretty-printing of that data, without concern for the original document's indentations, line breaks, comments. Now, it might be acceptable for a Solid server to take an uploaded document, parse it for RDF content, and put that RDF into a triple/quad store. Wherever possible, as a user, I would want that Solid server to give me the option to preserve the original document, whether or not it were (also) parsed for RDF content. Optimally, it would also be possible to choose whether to download/view the original document and/or whatever RDF prettification (or uglification) the server was built to deliver. Yes, some limits are necessary to assure interop. Generally, the user should be given the option of what to get and/or store, complete with alerts like "getting/saving this serialization may lose inline formatting and/or comments that were in original data documents". Informed consent should be a guiding principle, even more than enforcing interop by limiting user choices. |
@TallTed Your comment seems to pertain to a different issue (solid/specification#342). This issue is about |
This issue is titled Preserve Pretty Turtle syntax after To my mind, one goal of I don't demand (though I would certainly prefer) that such a single-object change itself be adjusted to maintain the pretty-print surrounding it; I do feel quite strongly that the line(s) preceding and following the line(s) containing that object not be un-prettified. |
Environment
X-Powered-By: Community Solid Server
community-solid-server --version for a global or
npx community-solid-server --version for a local installation: 4.0.1
node -v
npm -v
Description
When I am using CSS to host, say, a SolidOS tracker, the user can
edit the configuration using the form system. The configuration includes
things like ordered lists of states in Turtle ( list syntax).
When the user edits the file, with code which sends a PATCH command to change the list from one value to another,
the file ends up encoded using low-level rdf:first and rdf:rest syntax.
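For readers unfamiliar with the two encodings, the following Python sketch shows how a compact Turtle list corresponds to a chain of rdf:first and rdf:rest triples, which is the low-level form the file ends up in. The function name `expand_list`, the blank node labels, and the example state names in the usage note are all hypothetical; this is not the code path CSS uses.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_FIRST, RDF_REST, RDF_NIL = RDF + "first", RDF + "rest", RDF + "nil"

def expand_list(members):
    """Encode members as an rdf:first/rdf:rest chain; return (head, triples).

    Hypothetical helper: blank node labels _:b0, _:b1, ... are made up here.
    """
    if not members:
        return RDF_NIL, []  # the empty list is just rdf:nil
    nodes = [f"_:b{i}" for i in range(len(members))]
    triples = []
    for i, member in enumerate(members):
        # The last node points at rdf:nil to terminate the list.
        rest = nodes[i + 1] if i + 1 < len(nodes) else RDF_NIL
        triples.append((nodes[i], RDF_FIRST, member))
        triples.append((nodes[i], RDF_REST, rest))
    return nodes[0], triples
```

A two-state list such as ( ex:Open ex:Closed ) becomes four triples spread over two blank nodes, which gives a sense of why the re-serialized file is longer and harder to read.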