Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[String] Made AbstractString::width() follow POSIX.1-2001 #35156

Open
wants to merge 1 commit into
base: master
from

Conversation

@fancyweb
Copy link
Contributor

fancyweb commented Jan 1, 2020

Q A
Branch? master
Bug fix? no
New feature? yes
Deprecations? no
Tickets -
License MIT
Doc PR -

This PR ports the wcswidth() function (see http://man7.org/linux/man-pages/man3/wcwidth.3.html and https://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c) into the String component. This new method will be useful in the Console component to determine how many columns a character takes.

I kind of copied the Intl data import strategy.

@fancyweb fancyweb force-pushed the fancyweb:string-wcswidth branch from dd7eb72 to eba29c0 Jan 1, 2020
$tableWide = require __DIR__.'/Resources/data/wcswidth_table_wide.php';
}

if ($codePoint >= $tableWide[0][0] && $codePoint <= $tableWide[$ubound = \count($tableWide) - 1][1]) {

This comment has been minimized.

Copy link
@fancyweb

fancyweb Jan 1, 2020

Author Contributor

Should this search algorithm be extracted since it is used two times?

return array_merge(
parent::provideWcswidth(),
[
[2, '⭐'],

This comment has been minimized.

Copy link
@fancyweb

fancyweb Jan 1, 2020

Author Contributor

Most test data were taken from the Python implementation https://github.com/jquast/wcwidth/blob/master/wcwidth/tests/test_core.py.

@fancyweb fancyweb force-pushed the fancyweb:string-wcswidth branch from eba29c0 to 274edfc Jan 1, 2020
@nicolas-grekas nicolas-grekas added this to the next milestone Jan 4, 2020
@nicolas-grekas nicolas-grekas force-pushed the fancyweb:string-wcswidth branch from a7fa164 to 3c6e292 Jan 13, 2020
Copy link
Member

nicolas-grekas left a comment

Thanks, that looks good to me.
I pushed a second commit that makes the most important changes I think we should do.
Namely, we don't want to add a new method, but to improve the existing one.
That's why I made wcwidth() private, and use it in width().
Of course, I didn't finish the refacto so tests won't pass anymore :P

@nicolas-grekas nicolas-grekas changed the title [String] Add the wcswdith() method [String] Made AbstractString::width() follow POSIX.1-2001 Jan 13, 2020
@fancyweb fancyweb force-pushed the fancyweb:string-wcswidth branch 3 times, most recently from c94cef8 to 6ecc13e Jan 14, 2020
Co-authored-by: Nicolas Grekas <nicolas.grekas@gmail.com>
@fancyweb fancyweb force-pushed the fancyweb:string-wcswidth branch from 6ecc13e to 495d959 Jan 15, 2020
/**
* Returns the printable length on a terminal.
*
* If the string contains a non-printable character, -1 is returned.

This comment has been minimized.

Copy link
@nicolas-grekas

nicolas-grekas Jan 15, 2020

Member

That's the only thing I'm unsure about. While this matches wcswidth(), is this the most useful behavior?
Should be drop any control chars before calling wcswidth()?
$s = preg_replace('/\p{Cc}++/u', '', $s);

This comment has been minimized.

Copy link
@fancyweb

fancyweb Jan 16, 2020

Author Contributor

I think we can change it. width() is already not 100% wcswidth compliant.

@fancyweb

This comment has been minimized.

Copy link
Contributor Author

fancyweb commented Jan 16, 2020

@nicolas-grekas Could we however somehow have a public accessible pure implementation of wcswidth in the String component? With moving the wcswidth() method logic somewhere else for example. Symfony wcswidth implementation could become the most robust tested PHP implementation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.