Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implemented Range parser with cats-parse #4017

Merged
merged 6 commits into from
Dec 28, 2020
Merged

Implemented Range parser with cats-parse #4017

merged 6 commits into from
Dec 28, 2020

Conversation

fredshonorio
Copy link
Contributor

@fredshonorio fredshonorio commented Dec 16, 2020

The big news here is that Range.renderValue doesn't match spec, byte-range-spec dictates a final - if the last position is missing. That change to renderValue seemed off-topic, so I just implemented parser matching the spec. I'm obviously fine with replicating the previous parser if that's preferable.
Some other issues:

  • the parsers for Long 😬
  • I'm using .backtrack in byteRangeSpec, is that good?
  • Range accepts any range unit but requires subranges so something like foo=bar fails, this is probably an explicit choice that deserves a comment in the parser

@fredshonorio fredshonorio mentioned this pull request Dec 16, 2020
34 tasks
@lewisjkl
Copy link
Contributor

I'm using .backtrack in byteRangeSpec, is that good?

Fwiw, I am using backtrack quite a bit in #4004 because I couldn't figure out how else to avoid the parser partially matching on a piece of the input and then failing out because it wouldn't reconsider that same part of the input again on a different branch of the union. Maybe I am missing something there though.

Copy link
Member

@rossabaker rossabaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. This is looking about right.

I've used .soft and not .backtrack, but I don't think .soft helps before an orElse1. I think you're doing this right, but I'm new at the library as well.

Range accepts any range unit but requires subranges so something like foo=bar fails, this is probably an explicit choice that deserves a comment in the parser

Are you referring to other-ranges-specifier from the grammar?

Comment on lines 60 to 64
val nonNegativeLong = Numbers.digits1
.map { ds =>
val l = BigInt(ds)
if (l < Long.MaxValue) l.toLong else Long.MaxValue
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about a mapFilter? (I'm doing this in GitHub. Hopefully it's right...)

Suggested change
val nonNegativeLong = Numbers.digits1
.map { ds =>
val l = BigInt(ds)
if (l < Long.MaxValue) l.toLong else Long.MaxValue
}
val nonNegativeLong = Numbers.digits1
.mapFilter { ds =>
try Some(ds.toLong)
case { _: NumberFormatException => None }
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, that looks better. I did not know mapFilter existed. I'll update both parsers.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's new in 0.2.0, and is based on the selective primitives, which makes it more efficient than a monadic parser.

@@ -27,14 +28,34 @@ class RangeParserSpec extends Http4sSpec {
Range(RangeUnit.Bytes, SubRange(0, 500)),
Range(RangeUnit.Bytes, SubRange(0, 499), SubRange(500, 999), SubRange(1000, 1500)),
Range(RangeUnit("page"), SubRange(0, 100)),
Range(10),
Range(10), // renderValue implementation is incorrect according to rfc7233 so this fails
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather fix the bug than preserve bug compat. Nobody should be depending on these incorrect semantics.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I just fix renderValue then? I can run the tests but it's hard for me to be sure I'm not breaking stuff.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I would go ahead and fix the bug, and if tests break because they were testing broken renderings, fix those, too. We'll be thoughtful about tests we change, but we shouldn't continue making any buggy assertions.

@fredshonorio
Copy link
Contributor Author

fredshonorio commented Dec 18, 2020

Are you referring to other-ranges-specifier from the grammar?

Yes, I just wanted to leave a comment in the parser to justify the difference between the spec and the implementation. Can I just say "this library only supports byte ranges" ?

@rossabaker
Copy link
Member

Yes, I just wanted to leave a comment in the parser to justify the difference between the spec and the implementation. Can I just say "this library only supports byte ranges" ?

Yes, I've read that spec I don't know how many times, and wasn't aware of anything but byte ranges. I'd like to support the full spec, but I'd personally rather do the rest of the work that gets us on Dotty and Cats Effect 3 first. I think a comment is fine.

@fredshonorio
Copy link
Contributor Author

fredshonorio commented Dec 19, 2020

Latest commit should fix mentioned issues. All tests pass except for those for the parser of Content-Range, because of the changes to Subrange.render. I'm working on the Content-Range parser now.

@fredshonorio
Copy link
Contributor Author

Also implemented Content-Range, which means RangeParser.scala is gone 🥳
I removed one of the constructors because this header requires both byte positions. This maybe puts the usage of SubRange in question, because it's possible to create an invalid value with Content-Range(SubRange(0, None), None)
There's also the unsatisfied-range part of the spec that is not modeled with Content-Range, which I did not change.

@@ -78,7 +78,7 @@ object Range extends HeaderKey.Internal[Range] with HeaderKey.Singleton {
val suffixByteRangeSpec = negativeLong.map(SubRange(_))

// byte-range-set = 1#( byte-range-spec / suffix-byte-range-spec )
val byteRangeSet = Rfc7230.headerRep1(byteRangeSpec.backtrack.orElse1(suffixByteRangeSpec))
val byteRangeSet = Rfc7230.headerRep1(byteRangeSpec.orElse1(suffixByteRangeSpec))

Copy link
Contributor Author

@fredshonorio fredshonorio Dec 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried this without backtracking and it worked, I think I know why:
In an earlier implementation of byteRangeSpec I was incorrectly accepting - as the first character and .backtrack made byteRangeSet work. Now the first character of byteRangeSpec must be a digit and for suffixByteRangeSpec must be - so presumably the correct parser can be chosen without backtracking.
@rossabaker is my conclusion correct?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds correct to me.

@rossabaker
Copy link
Member

Looks good after we resolve the merge conflict.

@fredshonorio
Copy link
Contributor Author

Looks good after we resolve the merge conflict.

Great! Is there something I should do? Should I rebase?

@rossabaker
Copy link
Member

I usually merge from the conflicting branch (in this case, dotty) instead of rebase, It makes it clearer what changed since the last review, instead of looking the whole thing over again. If you've got time, that's great. If not, one of us can clean it up. The main part of the PR looks great.

@fredshonorio
Copy link
Contributor Author

fredshonorio commented Dec 24, 2020

I usually merge from the conflicting branch (in this case, dotty) instead of rebase,

I'm not sure how to do that, wouldn't I need write access?

@rossabaker
Copy link
Member

Oops, sorry, I missed your reply. No, you don't need write access. You just need to merge the dotty branch into your local checkout and then push. Something approximately like:

git fetch -a origin
git merge origin/dotty
[resolve conflict]
git push yourfork

Where "origin" is a remote pointing at http4s, and "yourfork" is a remote pointing at your fork. Your names may be different.

That will update this PR with a merge commit that resolves the conflict.

@@ -140,6 +142,7 @@ lazy val core = libraryProject("core")
ProblemFilters.exclude[DirectMissingMethodProblem]("org.http4s.parser.HttpHeaderParser.ACCEPT_ENCODING"),
ProblemFilters.exclude[DirectMissingMethodProblem]("org.http4s.parser.HttpHeaderParser.ACCEPT_CHARSET"),
ProblemFilters.exclude[DirectMissingMethodProblem]("org.http4s.ContentCoding.org$http4s$ContentCoding$$<init>$default$2"),
ProblemFilters.exclude[DirectMissingMethodProblem]("org.http4s.headers.Content-Range.apply"),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've only removed one of the apply methods, is there a way to specify which one?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it understands overloads. This is probably the best we can do.

@fredshonorio
Copy link
Contributor Author

I've merged and added MiMa stuff I had forgotten.

@fredshonorio
Copy link
Contributor Author

fredshonorio commented Dec 27, 2020

I hope I haven't borked anything because some tests now fail to compile. Could it be that I'm running jdk 1.8? That was exactly it.

@rossabaker
Copy link
Member

It needs a scalafmtAll run on it. (I sometimes forget that one, too.)

@fredshonorio
Copy link
Contributor Author

My bad, incoming

@rossabaker
Copy link
Member

Several projects have a prePR so people can test locally in a way similar to CI. Formatting and headers are two that often trip us up. We should probably consider that. There's an alias built into sbt-spiewak, but our steps are subtly different.

@rossabaker
Copy link
Member

CI is still a ways from working on this branch, but it all looks good now. Thanks again!

@rossabaker rossabaker merged commit 68d6689 into http4s:dotty Dec 28, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants