Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Law-driven-design of Path#concat and Path#splitAt #5794

Conversation

RafalSumislawski
Copy link
Member

While working on #5793 I realised that often Uri.Path's behaviour doesn't line up with my intuition. The problems can be summarised as lack of symmetry, path"/".concat(path"a") == path"/a" but path"a".concat(path"/") == path"a" , and lack of preservation-of-slashes, for example path"a".splitAt(0)._2 == path"/a" summons a slash out of thin air.

I did not know how exactly Path should behave, so I started writing down my intuition in the form of simple laws / properties and then followed up with changes in implementation to make it satisfy the laws while not breaking any other tests. This PR is what I came up with.

The CORSSuite change is an interesting albeit insignificant discovery. The request with path "foo" is not a valid request as HTTP requires the path to be absolute.

@mergify mergify bot added the series/0.22 PRs targeting 0.22.x label Dec 27, 2021
@RafalSumislawski RafalSumislawski added enhancement Feature requests and improvements module:core labels Dec 27, 2021
Copy link
Member

@armanbilge armanbilge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 IDK anything about RFCs etc., but the laws expressed here seem perfectly reasonable. Just a couple questions.

test("simple concat 3")(assertEquals(path"".concat(path"/a"), path"/a"))
test("simple concat 4")(assertEquals(path"/".concat(path"a"), path"/a"))

test("When left side of concat is absolute then the result is absolute") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if it makes a difference?
https://scalameta.org/munit/docs/integrations/scalacheck.html

Suggested change
test("When left side of concat is absolute then the result is absolute") {
property("When left side of concat is absolute then the result is absolute") {

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I didn't know about the property.

forAll((path: Path) => assertEquals(path.splitAt(-1), path.splitAt(0)))
}

test("splitAt(segments.size + 1)._1 is identity") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about splitAt(segments.size)._1?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

splitAt(segments.size)._1 is not an identity. But let me explain this topic.

When we split a path in the middle (between some segments) then we split it at a slash and we need to decide what will happen with that slash after splitting. Let's say we split path"/a/b/" at 1. The result could be:

  • (path"/a", path"b/") - none of the sides receives the slash
  • (path"/a/", path"b/") - left side receives the slash
  • (path"/a", path"/b/") - right side receives the slash
  • (path"/a/", path"/b/") - both sides receive the slash
    As I said before "I did not know how exactly Path should behave" and this applies here as well. From my perspective the first option is wrong (some laws would break in some cases). The remaining 3 options are fine. The "both" sound best as it is symmetric. But other than that, "left" and "right" will work. And what we used to have was the "right" option, and I stayed with that, as it minimised potential for breaking something.

So what happens if we extrapolate this behaviour to splitAt(segments.size)? We end up with path"/a/b/".splitAt(2) == (path"/a/b", path"/"), so splitAt(segments.size)._1 is not an identity.

I added example-based tests which will document and freeze this specific behaviour. Mentally I put these tests in a different category than the properties, in the sense that the properties document something that fundamentally makes sense, whereas these examples (and the lack of a "splitAt(segments.size)._1 is identity") document a design decision.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, this is great. Sorry, I have one more question.

(path"/a", path"/b/") - right side receives the slash

In this case, the right path becomes an absolute path, correct? I don't disagree with preserving this behavior for compatibility, but personally this strikes me as very odd.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed. It's confusing because, after all the /b/ is relative to /a.

One of the reasons why it was design like this, is that http4s's DSL expects all paths to be absolute (case Root ... => ...). That makes sense at first sight, because Requests are required to use absolute paths, but then you realise things like Routers, servletContext etc exist.

You can see that in order to have a splitAt that behaves in a reasonable way, and at the same time to satisfy the DSL, I had to do a (l.toAbsolute, r.toAbsolute) in Message.

I'm open to discussing changing this behaviour.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that's right, the DSL. Thanks for the excellent explanations. I think what you've done here is very good. At the least, because it specifies, documents, and freezes the behavior.

I'll open an issue about the absolute path thing, we can consider for 1.0 maybe.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to be clear about it. I think it would be possible to have the

(path"/a/", path"b/") - left side receives the slash

or the

(path"/a/", path"/b/") - both sides receive the slash

behaviour, without changing how the DSL works (because we make the DSL happy with the (l.toAbsolute, r.toAbsolute) anyway). It's actually the Message#scriptName (the left side of the split) receiveing a slash at the end that will cause more hard-to-foresee problems.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I meant to say is that the original (current seriess/0.22) design gave the slash to the right-hand-side (pathInfo) because that's what the DSL needs. In this PR this is no longer needed because the r.toAbsolute makes the DSL happy. But I kept this design-decision to avoid breaking something elsewhere (other than the DSL) by giving the slash to the left-hand-side (scriptName).

Copy link
Member

@rossabaker rossabaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the rigor.

@rossabaker rossabaker merged commit c5c0871 into http4s:series/0.22 Jan 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Feature requests and improvements module:core series/0.22 PRs targeting 0.22.x
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants