Algorithm specification updates by editors #218
Markus' review (including Dave's and Gregg's replies).

2. Features
2.1 Expansion
Markus: You are right, "name" is missing. But it's very difficult to see because other things are highlighted, and even if "name" were there, it wouldn't make a difference.

2.3 Flattening
Unlabeled blank nodes are not labeled in expansion. Only labeled blank nodes get relabeled. So I think it's important to highlight that straight in the introduction. I think in the API spec we can work under the assumption that the data model is clear.

5. Algorithms

5.2 Remote Context Resolution
The API call itself is asynchronous; whether the algorithms it calls are doesn't really matter. In some languages you would spawn a thread, in others you rely on an event loop, ... So I agree with Gregg that this is, IMO, clearly an implementation-specific optimization. Whether you have one async call before expansion or a number of them during expansion doesn't really matter. Furthermore, the current pre-processing step directly modifies the input document, which you tried to avoid in all other operations. Passing around a map of external contexts just makes it more complex to explain.

5.3 Context Processing
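For illustration, the collect-then-resolve approach for external contexts could be sketched as follows (all names here are hypothetical, not from the spec): walk the document once, gather the URLs of remote contexts, and resolve them into a map that is passed alongside the document instead of mutating it.

```python
def collect_context_urls(node, urls):
    """Recursively gather the URLs of remote contexts referenced via @context."""
    if isinstance(node, dict):
        ctx = node.get("@context")
        # @context may be a string, a map, or an array of those.
        entries = ctx if isinstance(ctx, list) else [ctx]
        for entry in entries:
            if isinstance(entry, str):
                urls.add(entry)  # a string entry is a remote context reference
        for value in node.values():
            collect_context_urls(value, urls)
    elif isinstance(node, list):
        for item in node:
            collect_context_urls(item, urls)
    return urls
```

The collected URLs would then be dereferenced (asynchronously, in one batch or several) into a map from URL to parsed context before expansion starts.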
.. and yet you modify your input document when you dereference external contexts. As already said above, I believe it's much simpler to pre-process each local context and then pass the pre-processed result (i.e., a modified copy) to the context processing algorithm. You can then modify it as you like, which would simplify the algorithm.

5.4 Create Term Definition Subalgorithm
+1, everything on the LHS is a term; it's a (more or less) opaque string. The only time you really look into the string on the LHS is when @id is missing in its definition, to check if you can nevertheless expand it to an IRI.
And that's exactly what confuses me. Just by looking at the LHS you don't know if it's a dependency. It's a dependency if a compact IRI is used in @id or if @id is missing and the LHS is a compact IRI.
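To illustrate where a prefix dependency can hide, here is a hedged sketch (the helper name and its simplifications are mine, not the spec's):

```python
def term_dependency(key, definition):
    """Return the prefix a term definition may depend on, or None.
    Simplified sketch: ignores keywords, absolute IRIs, and blank nodes."""
    # The dependency comes from @id if present; otherwise from the key
    # itself when the key is a compact IRI and @id is missing.
    iri = definition.get("@id", key if ":" in key else None)
    if isinstance(iri, str) and ":" in iri and not iri.startswith("_:"):
        return iri.split(":", 1)[0]
    return None
```

Both cases below depend on "foaf", but only the second is visible from the LHS alone, which is the confusion being described.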
Markus: That's exactly the point. Just the IRI mapping is overwritten. The algorithm might return in step 8.3 without ever resetting the type, language, or container mapping.
There are two issues: one is that vocabRelative must be set to true because all the IRIs in the context are vocabRelative. The other is that the result of the IRI expansion operation might be a keyword, but the description of the algorithm doesn't handle it correctly. Your implementation does it - and it also supports keyword aliases (expand-0051).

Regarding keyword aliasing: we agreed that you cannot use keyword aliases as keys of term definitions; they have to be @id/@type/@language/@container. You can, however, use keyword aliases as values, just as you can use other terms or compact IRIs. I think the whole separation of keyword aliases from term definitions causes more confusion than it helps. The algorithms shouldn't care (in most cases) and thus shouldn't separate those mappings IMO.
No, you don't inherit anything. Quote from the syntax spec:
The only thing that's special when you use a compact IRI at the LHS is that you might omit the IRI mapping since it can be calculated automatically.
Why a special case for @type? @graph is disallowed as well, just as dlfkj if it doesn't expand to an IRI.
Talked about keywords above. What about vocabRelative IRIs? See expand-0052.
5.6 IRI Expansion
As said, I don't have a strong opinion about this. If both of you agree it's simpler, let's keep it.
Algorithm
OK, but then the note should be understandable. I wouldn't understand it. We could change it to something like the following sentence (maybe someone has an idea how to make it a bit less clumsy): "If value has a null mapping in active context, return null as it has been explicitly marked to be ignored."
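The proposed wording could be illustrated with a minimal sketch (the flat map below is an illustrative stand-in for the active context, not the spec's actual data structure):

```python
# Active context as a flat term-to-IRI map; None marks an explicit
# null mapping ("this term has been marked to be ignored").
active_context = {"name": "http://xmlns.com/foaf/0.1/name", "shoeSize": None}

def expand_term(active_context, term):
    # If term has a null mapping in active context, return null (None)
    # as it has been explicitly marked to be ignored.
    if term in active_context:
        return active_context[term]
    return term
```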
5.5 Expansion
Or we create an array of the keys, sort it and then iterate over that array as we do in the compaction algorithm.
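A sketch of that approach (names are mine), which makes iteration deterministic regardless of the member order of the input map:

```python
def process_members(element, handler):
    # Create an array of the keys, sort it lexicographically, and iterate
    # over that array, so member order in the input cannot affect the output.
    results = []
    for key in sorted(element.keys()):
        results.append(handler(key, element[key]))
    return results
```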
That's all we really do. We check for a colon. We should say it, because saying "if it's an absolute IRI" is a bit of a stretch if we never validate IRIs.
Would be fine with that as well. But what do we call it? A string containing a colon? :-P
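Whatever it ends up being called, the actual check is trivial; a hedged sketch (the helper name is invented here):

```python
def contains_colon(value):
    # This is all implementations really test: the presence of a colon.
    # No attempt is made to validate that the string is a well-formed IRI.
    return isinstance(value, str) and ":" in value
```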
I just find it easier to do it in one place. When you read the algorithms and come to a point where it says "If expanded property is @type..." it seems natural to handle it completely there so that I can then forget about that case. Otherwise I have to find my way through the algorithm to see how it's really handled. That's the reason why in index.html all keyword processing is consolidated in one place (steps 4.3.2.x). After that, you don't have to think about keywords anymore.
I would also argue that it would be much simpler to recognize variables by giving them names such as languageValue and formatting them in monospace (not orange though) instead of italics.
5.7 Value Expansion

As already said above, I think it would be simpler to handle @id and @type expansion directly in the expansion algorithm. Fewer jumps, fewer branches, shorter algorithm(s) – not a deal breaker though.
5.8 Label Blank Nodes Subalgorithm
5.9 Generate Blank Node Identifier
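A common shape for this algorithm is a counter plus an identifier map (the class name here is illustrative; the "_:b" prefix is an assumption based on the discussion in this thread):

```python
class BlankNodeIssuer:
    """Sketch of Generate Blank Node Identifier: issues fresh identifiers
    and remembers the mapping for identifiers that were already labeled."""

    def __init__(self, prefix="_:b"):
        self.prefix = prefix
        self.counter = 0
        self.issued = {}

    def issue(self, old_identifier=None):
        # Reuse the identifier previously issued for this input, if any.
        if old_identifier in self.issued:
            return self.issued[old_identifier]
        identifier = f"{self.prefix}{self.counter}"
        self.counter += 1
        if old_identifier is not None:
            self.issued[old_identifier] = identifier
        return identifier
```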
5.10 Compaction Algorithm
Does it really have to be two levels deep? I don't think so. You just need the copy because you remove a property or property values (references in the case of objects) from shallow. So you create a copy of the array, but not of the elements it contains.
I think I wasn't clear. I meant it would be clearer if we said we "initialize keys to an array containing all of the keys in shallow, ordered lexicographically" and mention that we need to do this because keys from shallow can be removed within the loop - that's the reason why you create the array containing all of them but have the check in 2.5.1.
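The wording being proposed can be sketched like this (function and variable names are mine; `should_remove` stands in for whatever the real algorithm removes):

```python
def drain(shallow, should_remove):
    # Initialize keys to an array containing all of the keys in shallow,
    # ordered lexicographically. The snapshot is needed because entries
    # are removed from shallow inside the loop.
    keys = sorted(shallow.keys())
    for key in keys:
        # The membership check mirrors step 2.5.1: the key may already
        # have been removed by an earlier iteration.
        if key in shallow and should_remove(key):
            del shallow[key]
    return shallow
```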
But you don't check for property generator duplicates. So how do you know if it's the right property generator? Just because the properties exist doesn't mean the values exist as well.
5.12 Inverse Context Creation Subalgorithm
This is definitely not an implementation detail because the algorithm wouldn't work at all if it's not a reference.
5.11 IRI Compaction Algorithm
5.13 Term Selection Subalgorithm
I don't think that's the case here. If I said nothing, the key will be "@null". If I said it is null, the value of the corresponding entry will be "@null".
5.14 Value Compaction
But the compaction algorithm recurses into that object and compacts the value of @id (the place where you just validate its value).

5.15 Find and Remove Property Generator Duplicates Subalgorithm

In contrast to the algorithm in index.html, the algorithm accepts just a single property generator. So they do something different. The algorithm in index.html does the following:
The algorithm in alternate2.html does the following:
Since the algorithm is just invoked in the compaction algorithm, the remaining work must be done there, but it isn't. Here's what's done there:
…algorithm Fix prefix to "_:b" which is easier to remember than "_:t". Improved explanation slightly. Remove note about keeping it in active context, which probably caused more confusion than it helped. This addresses #218.
This illustrates a bug in the current algorithms. See also #218.
@lanthaler This is a brilliant way of dealing with feedback on the spec, especially when it's complicated. Great job organizing the info in this way!
To RDF algorithm updated in 6d8e825 using RDF Concepts (triples and datasets, not quads) and after Node mapping.
Here's a potential re-ordering of the algorithms:
Something else: I think we should get rid of the "Purpose" subsections. Most of the time they just restate what has already been said in the introduction of the specific section. In some sections the general intro is missing. E.g., Create Term Definition Subalgorithm: "This algorithm is called from the Context Processing algorithm to create term definitions in a new active context." I will spend a couple of hours editing the specs tomorrow. So if you have a couple of minutes to comment on this and the previous comment (or anything else in this thread, for that matter), it would be much appreciated. Thanks!
-1 on removing the "Purpose" subsections. It's editorial, and we can do it later. I still find them helpful, if only for re-stating what's in the introduction sections in a more concise way. If the group agrees to remove them in the future, we can do so without risking another Last Call.
I guess leave "Purpose" in there for now (that would be my view). I feel like it adds some necessary structure to help people follow what's going on and to tie into the general solution; I could possibly be persuaded otherwise, but let's skip removing it for now. As for the reordering, I think it's good other than moving the subalgorithms that are called from various places (not just context processing) into their own sort of "Basic Utility Algorithms" group (poor name, but I don't have the time to think of a better one at the moment). I agree that they should be mentioned early (I believe you made this point, Markus)... I would just separate them into their own group so people don't think they are only used during context processing.
This is probably controversial, thus the separate commit so that we can easily revert it. I think almost all algorithms are "subalgorithms". Just marking a few as such doesn't improve clarity. The approach I'm taking here is to add the word "Algorithm" to the main algorithms and drop it (as well as "Subalgorithm") from all others. //cc @dlongley @gkellogg This addresses #218.
Using a span with a class is a lot of work when typing. I think the i-tag does the job. Since it's not used for anything else, it is also easy to replace it with something else if needed. For the time being I didn't change the style of variables - I couldn't find anything that looks good :-) This addresses #218.
.. to "For each key-value pair language-language value in value". Same for index maps. This addresses #218.
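As a hedged sketch of that phrasing applied to language maps (the lowercasing and ordering choices here are illustrative assumptions, not quoted from the spec):

```python
def expand_language_map(value):
    result = []
    # For each key-value pair language-language value in value,
    # ordered lexicographically by language.
    for language in sorted(value.keys()):
        language_value = value[language]
        if not isinstance(language_value, list):
            language_value = [language_value]
        for item in language_value:
            # Each entry becomes a language-tagged value object.
            result.append({"@value": item, "@language": language.lower()})
    return result
```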
I've had a look at all the algorithms today. Here are some concrete proposals of what I would like to change (since I believe some parts may be controversial):

PROPOSAL 1: Stop processing as soon as an error is encountered. This might imply that we remove a number of error conditions (haven't checked) and replace them with some automatic recovery. Everything that has an error constant should stop processing IMO.

PROPOSAL 2: Do not handle keyword aliases separately. Currently the algorithms are written in a way that suggests that keyword aliases are treated differently from other terms. I would like to consolidate that (while ensuring that the fact that the result might be a keyword alias is mentioned).

PROPOSAL 3: Do not use value expansion to handle the value of the following keywords:

PROPOSAL 4: Do not use

PROPOSAL 5: Add a small section explaining which algorithms the various API calls invoke (and how). I'm not sure we really need this but I thought I'd bring it up nevertheless.
PROPOSAL 1: No opinion on this yet, as most other RDF processors attempt to continue to produce triples even after encountering a problem. Mine only stops if run in validation mode. I could support this if it resulted in some substantial simplification of the algorithms.

PROPOSAL 2: +1, I think keyword aliases should be handled like any other terms where possible.

PROPOSAL 3: +1

PROPOSAL 4: +0.5 In general, fewer keywords is better. If the results are equivalent, and we can do with just @null, then I'd say go for it.

PROPOSAL 5: +0 I'm not sure we need this.
PROPOSAL 1: +1
... by processing keywords directly. This addresses #218.
The difference is that Value Compaction doesn't create new objects but just tries to compact values to scalars. Keyword aliases are thus handled directly in the Compaction algorithm. This addresses #218.
Previously, the algorithm didn't have a dedicated parameter but used language like "if array compaction has been requested" without specifying how that might be done. I could have used a variable, but I chose to directly link to the corresponding option in JsonLdOptions instead. @gkellogg, let me know if you think that couples the algorithms and the API too much. This addresses #218.
Both algorithms are short and there are no recursions. It's easier to read them when they are together instead of being in separate sections. I've also clarified Object to RDF Conversion and List to RDF Conversion a bit. Perhaps the List to RDF Conversion should also be folded into the Convert to RDF Algorithm!? /cc @gkellogg This addresses #218.
Here's what I think still needs to be done for the API spec: Intro, Purpose and General Solution are missing for
I'm happy with almost all algorithms, the only thing that we should discuss is
We should probably also
.. so that Term Selection is really just the loop. This addresses #218.
and remove "JSON-LD output" which is never used. This addresses #218.
... also link to API calls in the feature descriptions. See http://lists.w3.org/Archives/Public/public-linked-json/2013Feb/0005.html This addresses #218.
RESOLUTION: Remove blank node re-labeling during expansion since it is no longer required. The flattening algorithm must still re-label blank nodes.
I'm closing this issue now. I don't think we need it anymore. The only thing that hasn't been addressed yet is the "folksy language". Feel free to reopen it if something needs to be discussed.
This issue is an attempt to streamline the editing of the API spec. I'll start by creating a todo-list with the feedback Gregg, David L., and I gave so that we can tick off the things that have already been addressed.