Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add extended trim #104

Closed
olegbaturin opened this issue Jul 21, 2023 · 16 comments
Closed

add extended trim #104

olegbaturin opened this issue Jul 21, 2023 · 16 comments
Labels
status:ready for adoption Feel free to implement this issue. type:feature New feature

Comments

@olegbaturin
Copy link
Contributor

olegbaturin commented Jul 21, 2023

public static function trim(string $str): string
{
    return trim($str, " \t\n\r\0\x0B\xC2\xA0\xEF\xBB\xBF");
}

@vjik
Copy link
Member

vjik commented Jul 21, 2023

In what cases is this necessary?

@olegbaturin
Copy link
Contributor Author

In what cases is this necessary?

I'm using it to sanitize user input data.

@samdark
Copy link
Member

samdark commented Jul 24, 2023

Sanitize it from what exactly? Why exactly this characters set but not, for example, x0c?

@olegbaturin
Copy link
Contributor Author

Updated method for advanced trimming ascii and utf8 strings

public static function trim(string $str): string
{
    return preg_replace("#^[\pC\pZ]+|[\pC\pZ]+$#u", '', $str);
}

List of unicode whitespace characters https://en.wikipedia.org/wiki/Whitespace_character#Unicode

@vjik
Copy link
Member

vjik commented Aug 1, 2023

  1. Which is faster: trim() with full symbols list or preg_replace() ?

  2. Should this method has name trim or more specific?

@olegbaturin
Copy link
Contributor Author

  1. trim() removes only single bytes specified in $characters parameter and can't safely remove multibyte characters.
  2. When PCRE modifier u(PCRE_UTF8) is set, strings are treated as UTF-8, so it's a utf8 trim.

Generalized version

public static function trim(string $str, string $pattern = "\pC\pZ"): string
{
    return preg_replace("#^[$pattern]+|[$pattern]+$#u", '', $str);
}

@vjik
Copy link
Member

vjik commented Aug 3, 2023

I think need use another name... May be:

  • regexpTrim() (I like it)
  • pregTrim()
  • reTrim()

?

But I not sure. May be trim better name =) Need more opinions.

@olegbaturin
Copy link
Contributor Author

It can be applied to utf8 and ascii strings, not only to utf8.

@olegbaturin
Copy link
Contributor Author

I think need use another name... May be:

* `regexpTrim()` (I like it)

* `pregTrim()`

* `reTrim()`

?

But I not sure. May be trim better name =) Need more opinions.

trim() is well known function name for string trimming. Does the implementation details matter?

@vjik vjik added status:ready for adoption Feel free to implement this issue. type:feature New feature and removed status:under discussion labels Aug 7, 2023
@vjik
Copy link
Member

vjik commented Aug 7, 2023

@olegbaturin Will you undertake to do PR?

@olegbaturin
Copy link
Contributor Author

What about ltrim()/rtrim()?

@xepozz
Copy link
Contributor

xepozz commented Aug 19, 2023

I think need use another name... May be:

* `regexpTrim()` (I like it)

* `pregTrim()`

* `reTrim()`

?
But I not sure. May be trim better name =) Need more opinions.

trim() is well known function name for string trimming. Does the implementation details matter?

The implementations matter when performance is a goal, but otherwise developers may use regular trim, so it's good to have kind of boosted trim

@olegbaturin
Copy link
Contributor Author

developers may use regular trim

Regular trim can't be used to remove unicode symbols, unfortunately.

@xepozz
Copy link
Contributor

xepozz commented Aug 19, 2023

So would you like to implement it?

@xepozz
Copy link
Contributor

xepozz commented Aug 19, 2023

I think all trim, ltrim, rtrim can be here.

@vjik
Copy link
Member

vjik commented Sep 17, 2023

Done by #109

@vjik vjik closed this as completed Sep 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status:ready for adoption Feel free to implement this issue. type:feature New feature
Projects
None yet
Development

No branches or pull requests

4 participants