Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Path clarifications #47

Merged
merged 8 commits into from Feb 27, 2020
Merged

Path clarifications #47

merged 8 commits into from Feb 27, 2020

Conversation

warpfork
Copy link
Collaborator

Several significant clarifications to Path docs.

Primarily, being committal and direct about how special values, erm... aren't -- namely, "/" is a valid segment, and so is empty string -- and confessing how much tricky work that makes.

Several methods are now more upfront about how much they do not do good things on such values. I'd like to fix these methods, but to do that, we should formally specify an escaping system and canonical string encoding for paths. That work should start in the IPLD Specs repo, and involve fixtures. Once that is tackled, I will (very!) gladly start fixing these methods to operate accordingly, and get rid of all the sharp-edges warnings.

Several methods also now include comments about whether they will enforce string character encoding or normalization on their function boundaries. They won't. (This is certainly the norm in Go -- the stance of "string is really just bytes" is ubiquitous in the gopher community -- so it may seem not worthy of wasting docs on to other gopher readers. However, since IPLD is all about cross-language consistency, and we may have people from other programming language ecosystems reading our docs, it seems good to be as clear and explicit as possible.) I suspect this will remain the case (it's turned out tricky and work-makey to try to specify anything else; subtracting the countably-infinite domain of non-normalized strings from the countably-infinite domain of all possible byte sequences makes for some very nasty and pernicious expressibility and round-tripping issues, long story short).

Thanks to @ribasushi for the kick in the shins that these docs needed work and clarifications. The previous docs were written a fair while ago when we knew less things less confidently about how the system and specs would evolve, so these clarifications were badly needed.

Primarily, being commital and direct about how special values, erm...
*aren't* -- namely, "/" is a valid segment, and so is empty string --
and confessing how much tricky work that makes.

Several methods are now more upfront about how much they *do not do
good things* on such values.

Added some new constructors for Path which *do* work for all possible
values.  (You could've handled such tricky values with a series of
Join calls, previously, but that would be ergonomically grueling.)

These facts are all defacto 'true' to the best of my knowledge now.
However, the IPLD specs repo is a bit on the quiet side about them
at present (precisely because it's such an irritating little nest of
edge cases and fun stuff)... and so the comments are also littered
with "this may change" warnings.  Still, it's better to be accurate
about what the code does and does not do in its current state.

I'd *like* to formally specify an escaping system and canonical
string encoding for paths.  That should start in the IPLD Specs
repo, though, and involve fixtures, so I won't start it here now.

Thanks to @ribasushi for the kick in the shins that these docs needed
work and clarifications.
It's not very often we care about these things in Go, and when we do,
this stance of "string is really just bytes; guard it yourself" is
the community norm... so, these docs should *almost* go without saying.
However, since IPLD is all about cross-language consistency, and
we may have people from other programming language ecosystems reading
our docs, it seems good to be as clear and explicit as possible.
I... honestly hadn't even realized this before; I just noticed it
when making a refactor that would produce this end state, and thought
"oh, that's silly".  Evidently... it's fine, since this has already
been the case, and not been problematic!  Alright then.
Apparently it's not too silly after all.
Yep, it makes a lot of creation by struct uglier, as you can see in the
diff to the test file.  Fortunately?  Nobody outside the package is
allowed to do that anyway, so, we can pretty much ignore the ergonomic.
(It'll still look uglier if someone's looking at stuff with spew or
go-cmp or some other library like that which peeks naughtily into
unexported fields, but that's hard to do much about.)

We could also address this by adding a discriminant field to the
PathSegment struct.  (If we wanted to use uint internally, we'd
probably have to do that, in fact.  Either that or start using
some other alarmingly magical number like UINT_MAX.  Ygh.)  I didn't.
I don't think we handle lists larger than 2 billion elements
particularly well anyway... so, adding more words of memory to
PathSegment to support that case seems like an entirely losing trade.
@codecov
Copy link

codecov bot commented Feb 24, 2020

Codecov Report

Merging #47 into master will increase coverage by 0.41%.
The diff coverage is 50%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #47      +/-   ##
==========================================
+ Coverage   72.21%   72.61%   +0.41%     
==========================================
  Files          50       50              
  Lines        3162     3209      +47     
==========================================
+ Hits         2283     2330      +47     
+ Misses        789      788       -1     
- Partials       90       91       +1
Impacted Files Coverage Δ
path.go 12% <0%> (+12%) ⬆️
pathSegment.go 46.16% <66.67%> (+38.47%) ⬆️
traversal/common.go 80.77% <0%> (-6.73%) ⬇️
schema/gen/go/gen.go 100% <0%> (ø) ⬆️
schema/gen/go/genKindLinkNode.go 100% <0%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 907fb14...8093be7. Read the comment docs.

path.go Show resolved Hide resolved
Copy link
Contributor

@ribasushi ribasushi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

warpfork and others added 2 commits February 27, 2020 12:43
There's already explicit sections for empty string and slash cases,
on the basis of how likely those are to provoke questions
(even though the content is essentially to highlight how *not*
exceptional they are); it makes just as much sense to do the same
call out for null bytes.

Original text started in
https://github.com/ipld/go-ipld-prime/pull/47/files#r384127483 .

Co-Authored-By: Peter Rabbitson <ribasushi@leporine.io>
Previous commit with suggestions from reviewer made me look at this
line again and realize it was *too* specific.
@warpfork warpfork merged commit 4612357 into master Feb 27, 2020
@warpfork warpfork deleted the path-clarifications branch February 27, 2020 12:02
@aschmahmann aschmahmann mentioned this pull request Sep 22, 2020
72 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants