Merged
10 changes: 3 additions & 7 deletions ext/uri/uri_parser_rfc3986.c
@@ -341,19 +341,15 @@ void *php_uri_parser_rfc3986_parse(const char *uri_str, size_t uri_str_len, cons
	return php_uri_parser_rfc3986_parse_ex(uri_str, uri_str_len, base_url, silent);
}

/* When calling a wither successfully, the normalized URI is surely invalidated, therefore
* it doesn't make sense to copy it. In case of failure, an exception is thrown, and the URI object
* is discarded altogether. */
ZEND_ATTRIBUTE_NONNULL static void *php_uri_parser_rfc3986_clone(void *uri)
{
	const php_uri_parser_rfc3986_uris *uriparser_uris = uri;

	php_uri_parser_rfc3986_uris *new_uriparser_uris = uriparser_create_uris();
	copy_uri(&new_uriparser_uris->uri, &uriparser_uris->uri);
	if (uriparser_uris->normalized_uri_initialized) {
Member

I'm not sure this change is correct. The clone_uri handler is used exclusively by the clone object handler, which is needed for two things:

  • URI modification: the change is correct in this case
  • regular cloning: the change seems incorrect

TBH I used to have a TODO comment somewhere mentioning that the two use-cases should be disambiguated somehow so that the normalized member doesn't need to be cloned if not necessary. I removed this at last before I merged my last MR.

Member Author

> regular cloning: the change seems incorrect

My understanding is that it is not incorrect, it's just less efficient than it could be. URI normalization should be “deterministic”, thus if we clone the URI and then retrieve a normalized field on the clone it will simply renormalize the same URI.

Given that non-with-er cloning should be a comparatively rare occurrence it seems reasonable to not optimize for this case to keep the with-er cloning simple.
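The determinism argument can be illustrated with a minimal sketch. Everything here is hypothetical: `toy_uris`, `toy_get_normalized` and `toy_clone` are stand-ins invented for illustration, not the ext/uri types, and lowercasing the whole string is only a toy substitute for real RFC 3986 normalization. The point is just that a clone without the cached value recomputes the same result on first access.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the parser's URI pair: a raw URI plus a lazily
 * computed normalized form. Not the real ext/uri structure. */
typedef struct {
	char uri[64];
	char normalized[64];
	bool normalized_initialized;
} toy_uris;

/* Toy "normalization": lowercase every byte. Real normalization does much
 * more; the only property that matters here is that it is deterministic. */
static const char *toy_get_normalized(toy_uris *u)
{
	if (!u->normalized_initialized) {
		size_t i;
		for (i = 0; u->uri[i] != '\0'; i++) {
			u->normalized[i] = (char) tolower((unsigned char) u->uri[i]);
		}
		u->normalized[i] = '\0';
		u->normalized_initialized = true;
	}
	return u->normalized;
}

/* Clone only the raw URI and leave the cache empty. */
static toy_uris *toy_clone(const toy_uris *src)
{
	toy_uris *dst = calloc(1, sizeof(*dst));
	if (dst == NULL) {
		return NULL;
	}
	memcpy(dst->uri, src->uri, sizeof(dst->uri));
	/* normalized_initialized stays false: the clone renormalizes on demand. */
	return dst;
}
```

Under these assumptions, a regular clone costs one extra normalization pass when a normalized field is read, but it cannot observe a different value than the original.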

Member

Aha, I see it now. I think the best option would still be to be able to handle both cases. The clone_uri handler could accept a bool flag, which could be passed by the clone obj handler. And the obj handler could determine the value by checking the current opcode I guess.
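A rough sketch of the flag idea, under stated assumptions: `sketch_uris`, `sketch_clone` and the `copy_normalized` parameter are hypothetical names made up for this illustration, and the struct uses fixed buffers instead of uriparser's types. How the object handler would decide the flag's value (for example from the current opcode) is not shown, and this is not necessarily how the follow-up proposal implements it.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical simplified version of the parser's URI pair. */
typedef struct {
	char uri[64];
	char normalized_uri[64];
	bool normalized_uri_initialized;
} sketch_uris;

/* Hypothetical clone handler: the caller states whether the cached
 * normalized URI is worth copying. A plain `clone` would pass true,
 * a with-er (which invalidates the cache anyway) would pass false. */
static sketch_uris *sketch_clone(const sketch_uris *src, bool copy_normalized)
{
	sketch_uris *dst = calloc(1, sizeof(*dst));
	if (dst == NULL) {
		return NULL;
	}
	memcpy(dst->uri, src->uri, sizeof(dst->uri));
	if (copy_normalized && src->normalized_uri_initialized) {
		memcpy(dst->normalized_uri, src->normalized_uri, sizeof(dst->normalized_uri));
		dst->normalized_uri_initialized = true;
	}
	return dst;
}
```

With this shape, both use cases go through one handler, and the with-er path skips the copy that a later modification would invalidate.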

Member Author

> I think the best option would still be to be able to handle both cases.

Sure. But for now with the current API, not copying the normalized URI seems safer / less prone to bugs. And the comment is definitely out of sync with the implementation.

Should we take this for now and then with the implementation of the with-ers take another look? Alternatively we can close and you can propose something yourself.

Member

> But for now with the current API, not copying the normalized URI seems safer / less prone to bugs.

I don't really get this. What kind of bugs can happen if we also copy the normalized URI during cloning?

> Should we take this for now and then with the implementation of the with-ers take another look?

I don't see much point in removing these lines and then re-adding them later, unless I'm missing a fundamental issue with normalization. So I'd say that either I can try to implement the idea, or feel free to do something similar yourself if you are interested in it.

Member

here is the other approach I mentioned: #19744

		copy_uri(&new_uriparser_uris->normalized_uri, &uriparser_uris->normalized_uri);
		new_uriparser_uris->normalized_uri_initialized = true;
	}
	/* Do not copy the normalized URI: The expected action after cloning is
	 * modifying the cloned URI (which will invalidate the cached normalized
	 * URI). */

	return new_uriparser_uris;
}