Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding a PHP implementation of URLSearchParams #118

Merged
merged 15 commits into from
Sep 5, 2023

Conversation

nyamsprod
Copy link
Member

@nyamsprod nyamsprod commented Sep 2, 2023

For documentation here's the specification:

And the test suites can be found in this folder:

Some adaptations in properties, names, behaviour are done to make the class behave in a "PHP" way.

Here's the class signature.

<?php
final class URLSearchParams implements Countable, IteratorAggregate, UriComponentInterface
{
    /*==================================
     * WHATWG URL Living Standard specification
     *=================================*/
    public function __construct(object|array|string|null $query = '');
    public function keys(): iterable;
    public function values(): iterable;
    public function has(?string $name, <optional> Stringable|string|float|int|bool|null  $value): bool;
    public function get(?string $name): ?string;
    public function getAll(?string $name): array;
    public function size(): int;
    public function entries(): Iterator;
    public function each(Closure $callback): void;
    public function set(?string $name, Stringable|string|float|int|bool|null $value): void;
    public function append(?string $name, Stringable|string|float|int|bool|null $value): void;
    public function delete(?string $name, <optional> Stringable|string|float|int|bool|null  $value): void;
    public function sort(): void;

    /*==================================
     * Named constructors
     *=================================*/
    public static function new(Stringable|string|null $value): self;
    public static function fromUri(Stringable|string $uri): self;
    public static function fromPairs(iterable $pairs): self;
    public static function fromAssociative(object|iterable $associative): self;
    public static function fromParameters(iterable $parameters): self;

    /*==================================
     * PHP's interface
     *=================================*/
    public function count(): int;
    public function getIterator(): Iterator;
    public function jsonSerialize(): string;
    public function __toString(): string;

    /*==================================
     * League URI Component interface
     *=================================*/
    public function value(): string;
    public function toString(): string;
    public function getUriComponent(): string;

    /*==================================
     * Syntactic sugar methods
     *=================================*/
    public function isNotEmpty(): bool;
    public function isEmpty(): bool;
}

@nyamsprod nyamsprod self-assigned this Sep 2, 2023
@nyamsprod
Copy link
Member Author

@trowski @kelunik

Another issue also related to unicode is that in PHP full decoding happens while in Javascript the decoding converts the unicode codepoint.

test(function() {
    var params = new URLSearchParams('a=b\u2384');
    assert_equals(params.get('a'), 'b\u2384');
    params = new URLSearchParams('a\u2384b=c');
    assert_equals(params.get('a\u2384b'), 'c');
}, 'Parse \u2384'); 

Currently in PHP you will get the full decoded unicode characters which is fine IMHO. 🤔

@nyamsprod nyamsprod added the enhancement New feature or request label Sep 2, 2023
@kelunik
Copy link
Contributor

kelunik commented Sep 2, 2023

I don't see why such escape sequences should be decoded there. Doesn't seem fine to me.

@nyamsprod
Copy link
Member Author

nyamsprod commented Sep 2, 2023

@kelunik sorry I took the wrong test my bad 😭, the example above behave as expected with my implementation. What changes is the following example

test(function() {
    var params = new URLSearchParams('a=b%f0%9f%92%a9c');
    assert_equals(params.get('a'), 'b\uD83D\uDCA9c');
    params = new URLSearchParams('a%f0%9f%92%a9b=c');
    assert_equals(params.get('a\uD83D\uDCA9b'), 'c');
}, 'Parse %f0%9f%92%a9');  // Unicode Character 'PILE OF POO' (U+1F4A9)

In PHP because we use rawurldecode we end up with

public function it_parse_pile_of_poo(): void
{
  $params = new URLSearchParams('a%f0%9f%92%a9b=c');
  self::assertSame("a💩b", $params->get('a'));
  $params = new URLSearchParams('a%f0%9f%92%a9b=c');
  self::assertSame("c", $params->get("a💩b"));
}

So to me there's nothing wrong with the PHP version even if it differs from the Javascript counterpart.

I could maybe add support for $params->get('a\uD83D\uDCA9b') returning the same value as $params->get("a💩b") but I am a bit hesitant in adding that

@kelunik
Copy link
Contributor

kelunik commented Sep 2, 2023

Still, these escape sequences are a PHP syntax within double quoted strings, not a Unicode or URL standard, no?

@nyamsprod
Copy link
Member Author

nyamsprod commented Sep 2, 2023

Still, these escape sequences are a PHP syntax within double quoted strings, not a Unicode or URL standard, no?

I would say it's a standard as per the specification see
https://url.spec.whatwg.org/#example-percent-encode-operations

Also

In Javascript, the identifiers and string literals can be expressed in Unicode via a Unicode escape sequence. The general syntax is \uXXXX , where X denotes four hexadecimal digits.

hence why I believe the tests written for javascript expects the escape sequence, which is not the case in PHP.

@kelunik
Copy link
Contributor

kelunik commented Sep 2, 2023

Sorry, you're absolutely right, I should have looked at the actual test input, there's no escape sequence there, they're just in the JavaScript expectations.

The behavior is correct that way then. 😊

@@ -558,7 +558,7 @@ public static function sameQueryAfterSortingProvider(): array
return [
'same already sorted' => ['a=3&a=1&b=2'],
'empty query' => [null],
'contains each pair key only once' => ['batman=robin&aquaman=aqualad&foo=bar&bar=baz'],
'contains each pair key only once' => ['aquaman=aqualad&bar=baz&batman=robin&foo=bar'],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did this test change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Query::sort was supposed to follow WHATWG specification. Since now URLSearchParams does just that the sorting was bugfixes hence the change in the tests.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also sorting was wrong in this case its not because the keys appear only once that they should not be sorted.

* A literal sequence of name-value string pairs, or any object with an iterator that produces a sequence of string pairs.
* A record of string keys and string values. Note that nesting is not supported.
*/
public function __construct(object|array|string|null $query = '')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the type really be object here or be more specific like Stringable|\Traversable|\QueryInterface|...?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it should because according to the specification any object that can be iterable should be acceptable which translate to me in PHP by traversable , stdClass or any object with public properties as stated by the example in the specification and the fact that it works for instance for DateInterval. Hence why I added more named constructors in the PHP implementation to restrict in code what can and can not be use

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Example from MDN URLSearchParams:

// Retrieve params via url.search, passed into constructor
const url = new URL("https://example.com?foo=1&bar=2");
const params1 = new URLSearchParams(url.search);

// Get the URLSearchParams object directly from a URL object
const params1a = url.searchParams;

// Pass in a string literal
const params2 = new URLSearchParams("foo=1&bar=2");
const params2a = new URLSearchParams("?foo=1&bar=2");

// Pass in a sequence of pairs
const params3 = new URLSearchParams([
  ["foo", "1"],
  ["bar", "2"],
]);

// Pass in a record
const params4 = new URLSearchParams({ foo: "1", bar: "2" });

Which I would translate in PHP as the following:

// Retrieve params via url.search, passed into constructor
$params1 = URLSearchParams::fromUri("https://example.com?foo=1&bar=2");

// Pass in a string literal
$params2 = new URLSearchParams("foo=1&bar=2");
$params2a = new URLSearchParams("?foo=1&bar=2");

// Pass in a sequence of pairs
$params3 = new URLSearchParams([
  ["foo", "1"],
  ["bar", "2"],
]);

// Pass in a record
$params4 = new URLSearchParams((object)["foo" => "1", "bar" => "2" });
//which means
$params5 = new URLSearchParams(new DateInterval('P3MT12M5S')); is also possible.

Hence why to reduce subtle errors I've added

  • URLSearchParams::new for string query with or without delimiter.
  • URLSearchParams::fromUri for uri string or object.
  • URLSearchParams::fromPairs for iterable of pairs elements.
  • URLSearchParams::fromRecords for records. I translate records as objects with public properties or generic iterator of key/value.
  • The URLSearchParams::fromParameters for array produces by parse_str and/or supported by http_build_query

@nyamsprod nyamsprod merged commit 43ad313 into master Sep 5, 2023
8 checks passed
@nyamsprod nyamsprod deleted the feature/urlsearchparams branch September 12, 2023 05:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants