Implemented Range parser with cats-parse #4017
Fwiw, I am using backtrack quite a bit in #4004 because I couldn't figure out how else to avoid the parser partially matching on a piece of the input and then failing out because it wouldn't reconsider that same part of the input again on a different branch of the union. Maybe I am missing something there though.
Thanks. This is looking about right.
I've used `.soft` and not `.backtrack`, but I don't think `.soft` helps before an `orElse1`. I think you're doing this right, but I'm new at the library as well.
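To make the partial-match problem concrete, here is a minimal sketch. It assumes a recent cats-parse release, where `orElse1` from this thread is named `orElse`; the parsers `fooBar` and `fooBaz` are hypothetical and not part of this PR.

```scala
import cats.parse.Parser

object BacktrackSketch {
  // Hypothetical parsers: both branches begin with "foo".
  val fooBar: Parser[Unit] = Parser.string("foo") *> Parser.string("bar")
  val fooBaz: Parser[Unit] = Parser.string("foo") *> Parser.string("baz")

  // On "foobaz", fooBar consumes "foo" and then fails on "baz".
  // Because input was consumed, orElse does not try fooBaz.
  val withoutBacktrack: Parser[Unit] = fooBar.orElse(fooBaz)

  // backtrack rewinds to the starting offset on failure,
  // so the second branch gets to see the same input again.
  val withBacktrack: Parser[Unit] = fooBar.backtrack.orElse(fooBaz)
}
```

With these definitions, `withoutBacktrack.parseAll("foobaz")` should fail while `withBacktrack.parseAll("foobaz")` should succeed.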
> Range accepts any range unit but requires subranges so something like foo=bar fails, this is probably an explicit choice that deserves a comment in the parser

Are you referring to `other-ranges-specifier` from the grammar?
val nonNegativeLong = Numbers.digits1
  .map { ds =>
    val l = BigInt(ds)
    if (l < Long.MaxValue) l.toLong else Long.MaxValue
  }
How about a `mapFilter`? (I'm doing this in GitHub. Hopefully it's right...)
val nonNegativeLong = Numbers.digits1
  .mapFilter { ds =>
    try Some(ds.toLong)
    catch { case _: NumberFormatException => None }
  }
Thanks, that looks better. I did not know mapFilter existed. I'll update both parsers.
It's new in 0.2.0, and is based on the selective primitives, which makes it more efficient than a monadic parser.
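For reference, a self-contained sketch of the `mapFilter` approach. It assumes a recent cats-parse release, where `Numbers.digits1` from this thread is named `Numbers.digits`; the object name is only for illustration.

```scala
import cats.parse.{Numbers, Parser}

object NonNegativeLongSketch {
  // One or more ASCII digits, converted to Long.
  // A value that overflows Long becomes a parse failure instead of being clamped.
  val nonNegativeLong: Parser[Long] =
    Numbers.digits.mapFilter { ds =>
      try Some(ds.toLong)
      catch { case _: NumberFormatException => None }
    }
}
```

Unlike the `BigInt` version that clamps to `Long.MaxValue`, this rejects out-of-range input, so `parseAll("9223372036854775808")` returns a `Left`.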
@@ -27,14 +28,34 @@ class RangeParserSpec extends Http4sSpec {
   Range(RangeUnit.Bytes, SubRange(0, 500)),
   Range(RangeUnit.Bytes, SubRange(0, 499), SubRange(500, 999), SubRange(1000, 1500)),
   Range(RangeUnit("page"), SubRange(0, 100)),
-  Range(10),
+  Range(10), // renderValue implementation is incorrect according to rfc7233 so this fails
I would rather fix the bug than preserve bug compat. Nobody should be depending on these incorrect semantics.
Should I just fix `renderValue` then? I can run the tests but it's hard for me to be sure I'm not breaking stuff.
Yes, I would go ahead and fix the bug, and if tests break because they were testing broken renderings, fix those, too. We'll be thoughtful about tests we change, but we shouldn't continue making any buggy assertions.
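As a rough illustration of the rendering RFC 7233 asks for, here is a standalone sketch. `Sub`, `render`, and `renderRange` are hypothetical stand-ins and not the http4s implementation; the only assumption carried over from this PR is that a suffix range is modelled as a negative first position with no last position.

```scala
object RenderSketch {
  // Hypothetical stand-in for SubRange.
  // A negative `first` with no `second` models a suffix range such as "-500".
  final case class Sub(first: Long, second: Option[Long])

  // RFC 7233 grammar:
  //   byte-range-spec        = first-byte-pos "-" [ last-byte-pos ]
  //   suffix-byte-range-spec = "-" suffix-length
  def render(s: Sub): String = s match {
    case Sub(first, Some(last))        => s"$first-$last"
    case Sub(first, None) if first < 0 => first.toString // already starts with "-"
    case Sub(first, None)              => s"$first-"     // open-ended range keeps the final "-"
  }

  // Renders e.g. "bytes=0-499,500-".
  def renderRange(unit: String, subs: List[Sub]): String =
    subs.map(render).mkString(s"$unit=", ",", "")
}
```

Under this sketch, a range starting at 10 with no end renders as `10-` rather than `10`.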
Yes, I just wanted to leave a comment in the parser to justify the difference between the spec and the implementation. Can I just say "this library only supports byte ranges"?
Yes, I've read that spec I don't know how many times, and wasn't aware of anything but byte ranges. I'd like to support the full spec, but I'd personally rather do the rest of the work that gets us on Dotty and Cats Effect 3 first. I think a comment is fine.
Latest commit should fix mentioned issues. All tests pass except for those for the parser of
Also implemented
@@ -78,7 +78,7 @@ object Range extends HeaderKey.Internal[Range] with HeaderKey.Singleton {
   val suffixByteRangeSpec = negativeLong.map(SubRange(_))

   // byte-range-set = 1#( byte-range-spec / suffix-byte-range-spec )
-  val byteRangeSet = Rfc7230.headerRep1(byteRangeSpec.backtrack.orElse1(suffixByteRangeSpec))
+  val byteRangeSet = Rfc7230.headerRep1(byteRangeSpec.orElse1(suffixByteRangeSpec))

I tried this without backtracking and it worked. I think I know why:
In an earlier implementation of `byteRangeSpec` I was incorrectly accepting `-` as the first character, and `.backtrack` made `byteRangeSet` work. Now the first character of `byteRangeSpec` must be a digit and the first character of `suffixByteRangeSpec` must be `-`, so presumably the correct parser can be chosen without backtracking.
@rossabaker is my conclusion correct?
That sounds correct to me.
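The reasoning above can be checked with a simplified sketch. It assumes a recent cats-parse release (`orElse` and `Numbers.digits` in place of `orElse1` and `Numbers.digits1`), and it uses a plain tuple as a hypothetical result type instead of `SubRange`.

```scala
import cats.parse.{Numbers, Parser}
import scala.util.Try

object ByteRangeSketch {
  // Digits converted to Long; overflow is a parse failure.
  val long: Parser[Long] =
    Numbers.digits.mapFilter(ds => Try(ds.toLong).toOption)

  // byte-range-spec = first-byte-pos "-" [ last-byte-pos ]
  // Must start with a digit.
  val byteRangeSpec: Parser[(Long, Option[Long])] =
    (long <* Parser.char('-')) ~ long.?

  // suffix-byte-range-spec = "-" suffix-length
  // Must start with '-'; modelled here as a negative first position.
  val suffixByteRangeSpec: Parser[(Long, Option[Long])] =
    (Parser.char('-') *> long).map(n => (-n, Option.empty[Long]))

  // The first characters are disjoint, so a failing byteRangeSpec
  // has consumed nothing and no backtrack is needed.
  val spec: Parser[(Long, Option[Long])] =
    byteRangeSpec.orElse(suffixByteRangeSpec)
}
```

Here `spec` should accept `0-499`, `500-`, and `-500` alike, because on `-500` the first branch fails at offset 0 and the alternative is still tried.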
Looks good after we resolve the merge conflict.
Great! Is there something I should do? Should I rebase?
I usually merge from the conflicting branch (in this case, dotty) instead of rebasing. It makes it clearer what changed since the last review, instead of looking the whole thing over again. If you've got time, that's great. If not, one of us can clean it up. The main part of the PR looks great.
I'm not sure how to do that, wouldn't I need write access?
Oops, sorry, I missed your reply. No, you don't need write access. You just need to merge the dotty branch into your local checkout and then push. Something approximately like:
Where "origin" is a remote pointing at http4s, and "yourfork" is a remote pointing at your fork. Your names may be different. That will update this PR with a merge commit that resolves the conflict.
@@ -140,6 +142,7 @@ lazy val core = libraryProject("core")
   ProblemFilters.exclude[DirectMissingMethodProblem]("org.http4s.parser.HttpHeaderParser.ACCEPT_ENCODING"),
   ProblemFilters.exclude[DirectMissingMethodProblem]("org.http4s.parser.HttpHeaderParser.ACCEPT_CHARSET"),
   ProblemFilters.exclude[DirectMissingMethodProblem]("org.http4s.ContentCoding.org$http4s$ContentCoding$$<init>$default$2"),
+  ProblemFilters.exclude[DirectMissingMethodProblem]("org.http4s.headers.Content-Range.apply"),
I've only removed one of the `apply` methods, is there a way to specify which one?
I don't think it understands overloads. This is probably the best we can do.
I've merged and added MiMa stuff I had forgotten.
It needs a
My bad, incoming
Several projects have a
CI is still a ways from working on this branch, but it all looks good now. Thanks again!
The big news here is that `Range.renderValue` doesn't match the spec: `byte-range-spec` dictates a final `-` if the last position is missing. That change to `renderValue` seemed off-topic, so I just implemented a parser matching the spec. I'm obviously fine with replicating the previous parser if that's preferable.

Some other issues:
- `Long` 😬
- `.backtrack` in `byteRangeSpec`, is that good?
- `Range` accepts any range unit but requires subranges, so something like foo=bar fails. This is probably an explicit choice that deserves a comment in the parser.