Paths and other updates #78

Merged
merged 22 commits into from
Apr 29, 2021

Conversation

gkellogg
Member

@gkellogg gkellogg commented Apr 16, 2021

Includes a number of changes outside of the paths section to better use ReSpec and remove some archaic concepts.


Preview | Diff

Member

@TallTed TallTed left a comment

Doubtless, I've missed some things ... and I only looked at the changed lines here.

@gkellogg
Member Author

Work continues. This has turned into a larger update than I had originally intended (see commit comments); I hope I'm not overstepping my mandate.

We need to describe verbs more generally, and have a resolution section in the EBNF Grammar section. My thought is to describe things that are beyond Turtle as transformations to an equivalent Turtle structure (where feasible). We'll need more on parsing Graphs, and need to decide if we're changing from the use of "Formulae" to "Formulas"; the Team Submission used "Formulae" as the plural of "Formula".

@gkellogg gkellogg changed the title WIP: Paths and other updates Paths and other updates Apr 20, 2021
The prior path algorithm didn't seem wholly correct to me (or perhaps I misunderstood it). Some terms were also unclear IMO (such as |pred|). 

E.g., the result from recursively calling the algorithm wouldn't be used as subject (`!`) / object (`^`) for an emitted statement, rather, it would be either the initial |pathItem| or the previously created |B|
Collaborator

@william-vw william-vw left a comment

See commit e06a0b6 (i.e. on how I think the path algorithm should work)

Co-authored-by: Ted Thibodeau Jr <tthibodeau@openlinksw.com>
@gkellogg
Member Author

See commit e06a0b6 (i.e. on how I think the path algorithm should work)

I have to say, I don't find your version of the algorithm clearer; the recursive nature of what I proposed I find easier to understand, but perhaps there's a middle ground.

Stylistically, it's good form to reduce line lengths (for the reasons @TallTed stated) and use "Otherwise" for the else-side of a clause.

If the concern is for correctness, then we should also re-consider the "Paths" section, which informally defines the same transitions as re-writing the input, so that :joe!:mother becomes [is :mother of :joe] or :joe :mother B. I'll examine this further, but it does seem that the result will always be a blank node, and not obj in the case of my step 4.

My own processor does not process paths in reverse, and implements something closer to what you state, but as a description, reverse processing seems to be easier to follow, and the intent is the results, not the method. Describing it as a forward process becomes more cumbersome, even if this is the AST generated by a parser.

@TallTed
Member

TallTed commented Apr 22, 2021

@william-vw --

See commit e06a0b6 (i.e. on how I think the path algorithm should work)

Would you please make a complete PR from your fork, so that the preview of your fork can be compared to the preview of @gkellogg's fork? Or make a change suggestion on @gkellogg's PR, so that it's (somewhat) more easily seen where the differences in the source are?

@william-vw
Collaborator

@gkellogg It's not an issue of clarity but rather correctness. (But, as said, I could be wrong since some terms are unclear to me, e.g., |pred|, which does not occur in the grammar.)

As I understood it, given the following:

:john!:father a :Person .

Then I believe the recursive call would return the last :father, and the algorithm would use it as the subject of the statement

:father :father _:bn0

Instead of

:john :father _:bn0

Honestly I don't think a recursive process is the best representation here, since the next bnode would never be utilized in the prior emitted statement. E.g., for

:john!:father!:father a :Person .

Should yield

:john :father _:bn0 . _:bn0 :father _:bn1 . _:bn1 a :Person .

And not

:john :father _:bn1 . _:bn0 :father _:bn1 . _:bn1 a :Person .

Note that the first item in the path should also be treated differently: :john should act as the subject of the first emitted statement, whereas the other path items should act as predicates.

@gkellogg
Member Author

gkellogg commented Apr 22, 2021

As I understood it, given the following:

:john!:father a :Person .

Then I believe the recursive call would return the last :father, and the algorithm would use it as the subject of the statement

:father :father _:bn0

Instead of

:john :father _:bn0

Following my algorithm, you would get the following:

  • Step 1 does not apply, as :john!:father is not a pathItem.
  • Step 2 separates :john!:father into path=:john, pred=:father, and dir=!.
  • Step 3 invokes the algorithm recursively using :john:
    • Step 1 returns :john as it matches a pathItem.
    • obj=:john.
  • Step 4 matches because dir=!. The triple (:john :father B) is emitted, and :john is returned as a result.

    As I said before, I believe this is wrong, and B should be returned.

As the path was replaced with B, the remaining statement is (B a :Person). In fact, my original algorithm would have resulted in (:john a :Person), which is incorrect.

Honestly I don't think a recursive process is the best representation here, since the next bnode would never be utilized in the prior emitted statement. E.g., for

:john!:father!:father a :Person .

This is not a characteristic of the recursive algorithm, but of the mistaken use of obj, rather than B, as the result. Following the corrected flow of my algorithm you'd get the following steps:

  • Step 1 does not apply, as :john!:father!:father is not a pathItem.
  • Step 2 separates :john!:father!:father into path=:john!:father, pred=:father, and dir=!.
  • Step 3 invokes the algorithm recursively using :john!:father:
    • Step 1 does not apply, as :john!:father is not a pathItem.
    • Step 2 separates :john!:father into path=:john, pred=:father, and dir=!.
    • Step 3 invokes the algorithm recursively using :john:
      • Step 1 returns :john as it matches a pathItem.
    • obj=:john.
    • Step 4 matches because dir=!. The triple (:john :father B0) is emitted, and B0 is returned as a result.
  • obj=B0
  • Step 4 matches because dir=!. The triple (B0 :father B1) is emitted, and B1 is returned as a result.

As the path was replaced with B1, the remaining statement is (B1 a :Person).

So, the resulting triples are: :john :father [:father [a :Person]] ., which I believe is correct.

@TallTed said:

Would you please make a complete PR from your fork, so that the preview of your fork can be compared to the preview of @gkellogg's fork?

This would be useful: you can actually make a PR to the branch that this PR is based on. Doing it through the suggestion mechanism is probably possible, but challenging. I'll just lay them out here in Markdown.

Gregg's (corrected) algorithm

  1. If path |p| matches the pathItem production, then |p| can be reduced no further; return |p| as the result.
  2. Otherwise, separate |p| into two components |path| and |pred| at the last occurrence of the directional indicator |dir|.
  3. Create |obj| by invoking this algorithm recursively using |path| for |p|.
  4. If |dir| is "!", emit a new N3 triple (|obj| |pred| B) where B is a novel blank node. Return B as the result.
  5. Otherwise, |dir| is "^"; emit a new N3 triple (B |pred| |obj|) where B is a novel blank node. Return B as the result.

(Given the commonality in steps 4. and 5., the creation of B could be promoted to an earlier step, and returning B could be the last step, which just changes the logic about emitting the triple.)
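
A minimal Python sketch of the corrected recursive algorithm above, for illustration only. It assumes a path is given as a plain string such as `:john!:father!:father` whose items contain no `!` or `^` characters; `reduce_path`, `emit`, `fresh`, and the `_:bN` labels are hypothetical names, not part of the spec. As the parenthetical note suggests, B is created once, before the direction test.

```python
from itertools import count


def make_fresh():
    """Return a function that yields novel blank node labels (hypothetical _:bN form)."""
    counter = count()
    return lambda: f"_:b{next(counter)}"


def reduce_path(p, emit, fresh):
    """Right-to-left (recursive) reduction of a path string.

    Assumes items contain no "!" or "^", so the last indicator can be
    located with a plain string search.
    """
    split = max(p.rfind("!"), p.rfind("^"))
    if split == -1:
        # Step 1: p is a bare pathItem and cannot be reduced further.
        return p
    # Step 2: split at the last directional indicator.
    path, direction, pred = p[:split], p[split], p[split + 1:]
    # Step 3: reduce the leading sub-path first.
    obj = reduce_path(path, emit, fresh)
    # Steps 4 and 5: emit one triple and return the new blank node.
    b = fresh()
    if direction == "!":
        emit((obj, pred, b))
    else:  # direction == "^"
        emit((b, pred, obj))
    return b


triples = []
result = reduce_path(":john!:father!:father", triples.append, make_fresh())
```

For `:john!:father!:father` this emits (:john :father _:b0) and (_:b0 :father _:b1) and returns `_:b1`, which matches the trace given earlier in this comment.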

William's algorithm:

  1. If path |p| matches the pathItem production, and no prior reduction took place, then return |p| as the result.
  2. If path |p| matches the pathItem production, and some prior reduction took place, then return the last created blank node B as the result.
  3. Otherwise, separate p into two components |pathItem| and |path|, based on the next occurrence of the directional indicator |dir|.
  4. If this is the first reduction of path |p|, |dir| is "!", and |pathItem2| stands for the next item within the |path|, emit a new N3 triple (|pathItem| |pathItem2| B) where B is a novel blank node.
  5. If this is the first reduction of path |p|, |dir| is "^", and |pathItem2| stands for the next item within the |path|, emit a new N3 triple (B |pathItem2| |pathItem|) where B is a novel blank node.
  6. If this is a subsequent reduction of path |p|, |dir| is "!", and |B_(n-1)| stands for the previously created blank node, emit a new N3 triple (|B_(n-1)| |pathItem| B) where B is a novel blank node.
  7. If this is a subsequent reduction of path |p|, |dir| is "^", and |B_(n-1)| stands for the previously created blank node, emit a new N3 triple (B |pathItem| |B_(n-1)|) where B is a novel blank node.

(Also, note that |B_(n-1)| is beyond ReSpec's ability to translate. I also tried |B<sub>n-1</sub>|, so we're left with more conventional means, or go to using <var>B<sub>n-1</sub></var>, which should render correctly.)
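
The left-to-right reading can be sketched as a loop in Python. This is not a transcription of the seven steps above: it folds the "first reduction" and "subsequent reduction" cases into a single `current` variable that starts as the first path item and is then replaced by each new blank node. The function name, the string representation of paths, and the `_:bN` labels are assumptions made for illustration.

```python
import re
from itertools import count


def reduce_path_iterative(p):
    """Left-to-right (iterative) reduction of a path string.

    Returns (result_node, triples). Assumes items contain no "!" or "^".
    """
    # Split on the indicators but keep them:
    # ":a!:b^:c" -> [":a", "!", ":b", "^", ":c"]
    tokens = re.split(r"([!^])", p)
    current = tokens[0]  # the first item acts as subject/object, not predicate
    triples = []
    ids = count()
    # Walk the (direction, item) pairs from left to right.
    for direction, item in zip(tokens[1::2], tokens[2::2]):
        b = f"_:b{next(ids)}"
        if direction == "!":
            triples.append((current, item, b))
        else:  # direction == "^"
            triples.append((b, item, current))
        current = b  # the new blank node feeds the next step
    return current, triples
```

Because each triple is emitted as soon as its predicate is seen, this shape fits an event-based parser that does not have the whole path available up front.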

@william-vw
Collaborator

@gkellogg Ahhh .. I understand how your algorithm works now. (This is why I've learned to add disclaimers such as "as far as I understood"!) Honestly it wasn't clear to me until you gave the example.

I think this is the main thing that tripped me up:

separate |p| into two components |path| and |pred| from the last occurrence of the directional indicator |dir|.

As I mentioned, |pred| wasn't clear to me; and I thought |path| referred to the last element in the production rule. So IIUC, |path| comprises the resource path with length n-1 (not including the last path item) and |pred| represents the last path item (?) The hint "from the last occurrence of the directional indicator |dir|" was insufficient for me, it seems. I only considered recursion in a "forward" direction, not "backward", hence my comment about appropriateness. (Unsure whether the fixed issue with |obj| would have made a difference in my understanding, really.)

I do appreciate the elegance of the algorithm, but it's a bit unfortunate that it doesn't translate as easily to an event-based parser (i.e., a listener vs. a visitor) since the whole AST is not (yet) available.

I think the following would be clearer to me (no ReSpec formatting at this point):

  1. If path |path_n| matches the pathItem production, then |path_n| can be reduced no further. Return |path_n| as the result.

  2. Otherwise, separate |path_n| into two components:

    • |path_(n-1)|: sub-path of path_n with length (n-1) that does not include the last pathItem in path_n.
    • |pathItem_n|: the last pathItem split off from |path_n| at the last occurrence of the directional indicator |dir|.
  3. Create |result_(n-1)| by invoking this algorithm recursively using |path_(n-1)|.

  4. If |dir| is "!", emit a new N3 triple ( |result_(n-1)| |pathItem_n| B ) where B is a novel blank node. Return B as the result.

  5. Otherwise, if |dir| is "^", emit a new N3 triple ( B |pathItem_n| |result_(n-1)| ) where B is a novel blank node. Return B as the result.

Perhaps both algorithms can be given - with a toggle "recursive" vs. "iterative" :-)

@TallTed Hope that Gregg's markdown version clarifies the differences better.

@TallTed
Member

TallTed commented Apr 23, 2021

The algorithms are definitely much more understandable with the examples. I think such should be included in the doc!

I don't have a strong feeling for/against either iterative or recursive processing. If both are viable (which I cannot yet discern), I would suggest they both be included, each with appropriate examples -- optimally with the same example data being handled by both iterative and recursive algorithms -- as it seems likely to me that different implementations/environments would be better served by each.

@gkellogg
Member Author

I think the following would be clearer to me (no ReSpec formatting at this point):

Good suggestion.

Perhaps both algorithms can be given - with a toggle "recursive" vs. "iterative" :-)

I think it would be useful to specify alternates to the algorithm, as that would make it clear that there is no one right way to implement it and only the result matters.

It did occur to me, in the middle of the night, that we've underspecified how these algorithms relate to being in subject, predicate, or object positions, since the necessary resource to use varies. I had it in mind to work more closely with the Turtle curSubject and curObject descriptions, but that looks like pulling on a thread that would have no end.

I'll make an update that includes both algorithms, along with examples of execution for our sample paths, including in at least subject, object and maybe list member contexts, as appropriate.
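
One way to picture the subject/object question is the Python sketch below. It rests on an assumption the thread leaves open: that the node returned by reducing a path simply replaces the path in whatever position it occupied, and that the path's triples are emitted before the containing statement. `expand_statement` and the `_:bN` labels are hypothetical, and paths are again plain strings whose items contain no `!` or `^`.

```python
import re
from itertools import count


def expand_statement(subject, predicate, obj):
    """Expand one statement whose subject and/or object may be a path."""
    ids = count()
    triples = []

    def reduce(p):
        # Left-to-right reduction; a bare item is returned unchanged.
        tokens = re.split(r"([!^])", p)
        node = tokens[0]
        for direction, item in zip(tokens[1::2], tokens[2::2]):
            b = f"_:b{next(ids)}"
            triples.append((node, item, b) if direction == "!" else (b, item, node))
            node = b
        return node

    s = reduce(subject)
    o = reduce(obj)
    # The reduced nodes stand in for the original paths (assumption).
    triples.append((s, predicate, o))
    return triples
```

With a path in subject position, `expand_statement(":john!:father", "a", ":Person")` gives (:john :father _:b0) followed by (_:b0 a :Person). With the same path in object position, `expand_statement(":mary", ":knows", ":john!:father")` gives (:john :father _:b0) followed by (:mary :knows _:b0).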

@gkellogg
Member Author

@william-vw I took some liberties with your algorithm, which I believe are isomorphic and, IMHO, easier to follow. I also included evaluation examples for each. See what you think.

@TallTed
Member

TallTed commented Apr 27, 2021

decide if we're changing from the use of "Formulae" to "Formulas"; the Team Submission used "Formulae" as the plural of "Formula".

I don't know if you want(ed) to do this in this PR, but in any case — I would have said Formulae, but Grammar Monster and Google's Ngram Viewer provide strong justification for "Formulas".

@gkellogg
Member Author

I don't know if you want(ed) to do this in this PR, but in any case — I would have said Formulae, but Grammar Monster and Google's Ngram Viewer provide strong justification for "Formulas".

We discussed this on a call the other week. It wasn't intentional to move from "Formulae" to "Formulas", although both are correct. "Formulae" is, IMHO, more prosaic, but it is what was used in the original Team Submission, and changing it would seem gratuitous at this point. For example, documentation in my implementation has long used "formulae".

We'll do a more comprehensive update to normalize usage later.

Collaborator

@william-vw william-vw left a comment

I think the updates are great. I suggested some minor changes on the right-to-left algorithm, and some larger ones on the left-to-right one (see comments for details).

(<var>B<sub>n</sub></var> <var>item<sub>n</sub></var> <var>B<sub>n-1</sub></var>).</li>
</ol>

<aside class="example"
Collaborator

to be edited if we reach a consensus about updated iterative algorithm

gkellogg and others added 3 commits April 28, 2021 13:47
Co-authored-by: William Van Woensel <william.van.woensel@gmail.com>
Co-authored-by: William Van Woensel <william.van.woensel@gmail.com>
Collaborator

@william-vw william-vw left a comment

some minor suggestions - looks good

spec/index.html Outdated
<ul>
<li>Step 1 does not apply,
as <var>dir<sub>0</sub></var> is not `null`.</li>
Collaborator

Suggested change
as <var>dir<sub>0</sub></var> is not `null`.</li>
as |n| is `0`.</li>

Co-authored-by: William Van Woensel <william.van.woensel@gmail.com>
@william-vw william-vw merged commit 0f5b76f into w3c:master Apr 29, 2021
@gkellogg gkellogg deleted the paths-and-other-updates branch April 29, 2021 00:41
3 participants