[Console]Improve formatter for double-width character #10714

Closed
wants to merge 7 commits into
from

Conversation

Projects
None yet
4 participants
@denkiryokuhatsuden
Contributor

denkiryokuhatsuden commented Apr 15, 2014

Q A
Bug fix? yes
New feature? no
BC breaks? yes for one that expecting current broken output
Deprecations? no
Fixed tickets
Tests pass? yes
License MIT
Doc PR

EDIT: fixed the table above not to remove irrelevant line

As mb_strlen just returns "how many chars in the string",
formatting with double-width character is bit broken.

The test I add is skipped when mbstring extension is not loaded.
I'm afraid if some of you cannot properly display japanese string.
(表示するテキスト just means "Some text to display")

denkiryokuhatsuden added some commits Apr 15, 2014

[Console]Add failing test for double-width char
`$formatter->formatBlock()` formats a broken block
when some double-width character is set.
[Console]use mb_strwidth instead of mb_strlen
Some character has doubled-width so `mb_strlen` does not return actual string width (just returns count of chars).
By using `mb_strwidth` (http://www.php.net/manual/en/function.mb-strwidth.php),
it can property measure the actual width of string.
[Console]Add similar test for Application#renderException
When `Application` renderes Exception, it also writes broken block when
message contains double-width (more generally, multi-byted) character.

This commit fixes above.
@denkiryokuhatsuden

This comment has been minimized.

Show comment Hide comment
@denkiryokuhatsuden

denkiryokuhatsuden Apr 17, 2014

Contributor

I found that rendering error uses logic independent of FormatterHelper, and it has similar problem.
15e136d is commit to fix them.

Contributor

denkiryokuhatsuden commented Apr 17, 2014

I found that rendering error uses logic independent of FormatterHelper, and it has similar problem.
15e136d is commit to fix them.

+ // str_split is not suitable for multi-byte characters, we should use preg_split to get char array properly.
+ // additionally, array_slice() is not enough as some character has doubled width.
+ // we need a function to split string not by character count but by string width
+ $str_split_width = function ($string, $width) {

This comment has been minimized.

Show comment Hide comment
@stof

stof Apr 17, 2014

Member

please make it a private method rather than a closure which need to be recreated by PHP for each call to renderException

@stof

stof Apr 17, 2014

Member

please make it a private method rather than a closure which need to be recreated by PHP for each call to renderException

This comment has been minimized.

Show comment Hide comment
@denkiryokuhatsuden

denkiryokuhatsuden Apr 17, 2014

Contributor

above $str_split_width, $strlen also exists as closure. Shall I move $strlen as private method either?

@denkiryokuhatsuden

denkiryokuhatsuden Apr 17, 2014

Contributor

above $str_split_width, $strlen also exists as closure. Shall I move $strlen as private method either?

This comment has been minimized.

Show comment Hide comment
@denkiryokuhatsuden

denkiryokuhatsuden Apr 17, 2014

Contributor

and could you please give advice for method naming,

  • $strlen
    -> private strlen for consistency (Helper has a function with same name)
    -> private strwidth as this does not return length of string but width
  • $str_split_width
    -> how about private splitStringByWidth()?
@denkiryokuhatsuden

denkiryokuhatsuden Apr 17, 2014

Contributor

and could you please give advice for method naming,

  • $strlen
    -> private strlen for consistency (Helper has a function with same name)
    -> private strwidth as this does not return length of string but width
  • $str_split_width
    -> how about private splitStringByWidth()?

This comment has been minimized.

Show comment Hide comment
@stof

stof Apr 17, 2014

Member

agreed about splitStringByWidth rather than $str_split_width.

and for $strlen, it could indeed be renamed to a private method stringWidth as it does not compute the length anymore after your change

@stof

stof Apr 17, 2014

Member

agreed about splitStringByWidth rather than $str_split_width.

and for $strlen, it could indeed be renamed to a private method stringWidth as it does not compute the length anymore after your change

This comment has been minimized.

Show comment Hide comment
@denkiryokuhatsuden

denkiryokuhatsuden Apr 17, 2014

Contributor

Thanks for advice.
Shall I leave the name of Helper#strlen unchanged (which I changed its behavior, too) unlike Application's, as changing that name introduces BC break?

@denkiryokuhatsuden

denkiryokuhatsuden Apr 17, 2014

Contributor

Thanks for advice.
Shall I leave the name of Helper#strlen unchanged (which I changed its behavior, too) unlike Application's, as changing that name introduces BC break?

This comment has been minimized.

Show comment Hide comment
@stof

stof Apr 17, 2014

Member

yeah, it cannot be renamed as it is protected

@stof

stof Apr 17, 2014

Member

yeah, it cannot be renamed as it is protected

+ if (!function_exists('mb_strwidth')) {
+ return str_split($string, $width);
+ }
+ $originalEncoding = mb_detect_encoding($string);

This comment has been minimized.

Show comment Hide comment
@stof

stof Apr 17, 2014

Member

you need to handle the case of returning false (i.e. not able to detect the encoding)

@stof

stof Apr 17, 2014

Member

you need to handle the case of returning false (i.e. not able to detect the encoding)

This comment has been minimized.

Show comment Hide comment
@denkiryokuhatsuden

denkiryokuhatsuden Apr 17, 2014

Contributor

I changed the code not to do anything about converting encode if mb_detect_encoding returns false.

@denkiryokuhatsuden

denkiryokuhatsuden Apr 17, 2014

Contributor

I changed the code not to do anything about converting encode if mb_detect_encoding returns false.

+ }
+ $originalEncoding = mb_detect_encoding($string);
+
+ $convertedString = false === $originalEncoding ? $string : mb_convert_encoding($string, 'utf8', $originalEncoding);

This comment has been minimized.

Show comment Hide comment
@stof

stof Apr 17, 2014

Member

To be consistent with stringWidth, you should fallback to str_split if mb_string does not support the encoding

@stof

stof Apr 17, 2014

Member

To be consistent with stringWidth, you should fallback to str_split if mb_string does not support the encoding

This comment has been minimized.

Show comment Hide comment
@denkiryokuhatsuden

denkiryokuhatsuden Apr 17, 2014

Contributor

Thanks, I've overlooked the code in stringWidth. I've changed variable's name, too.

@denkiryokuhatsuden

denkiryokuhatsuden Apr 17, 2014

Contributor

Thanks, I've overlooked the code in stringWidth. I've changed variable's name, too.

@fabpot

This comment has been minimized.

Show comment Hide comment
@fabpot

fabpot May 12, 2014

Member

Thank you @denkiryokuhatsuden.

Member

fabpot commented May 12, 2014

Thank you @denkiryokuhatsuden.

fabpot added a commit that referenced this pull request May 12, 2014

bug #10714 [Console]Improve formatter for double-width character (den…
…kiryokuhatsuden)

This PR was squashed before being merged into the 2.3 branch (closes #10714).

Discussion
----------

[Console]Improve formatter for double-width character

| Q             | A
| ------------- | ---
| Bug fix?      | yes
| New feature?  | no
| BC breaks?    | yes for one that expecting current broken output
| Deprecations? | no
| Fixed tickets |
| Tests pass?   | yes
| License       | MIT
| Doc PR        |

EDIT: fixed the table above not to remove irrelevant line

As mb_strlen just returns "how many chars in the string",
formatting with double-width character is bit broken.

The test I add is skipped when mbstring extension is not loaded.
I'm afraid if some of you cannot properly display japanese string.
(表示するテキスト just means "Some text to display")

Commits
-------

a52f41d [Console]Improve formatter for double-width character

@fabpot fabpot closed this May 12, 2014

@jakzal

This comment has been minimized.

Show comment Hide comment
@jakzal

jakzal May 13, 2014

Member

@denkiryokuhatsuden on which PHP version have you been testing this?

Member

jakzal commented May 13, 2014

@denkiryokuhatsuden on which PHP version have you been testing this?

@denkiryokuhatsuden

This comment has been minimized.

Show comment Hide comment
@denkiryokuhatsuden

denkiryokuhatsuden May 14, 2014

Contributor

@jakzal tested on php5.5. Are there some tests failing?

Contributor

denkiryokuhatsuden commented May 14, 2014

@jakzal tested on php5.5. Are there some tests failing?

@jakzal

This comment has been minimized.

Show comment Hide comment
@jakzal

jakzal May 14, 2014

Member

@denkiryokuhatsuden yes, they were never passing on PHP 5.5. See #10897.

Member

jakzal commented May 14, 2014

@denkiryokuhatsuden yes, they were never passing on PHP 5.5. See #10897.

fabpot added a commit that referenced this pull request May 14, 2014

bug #10897 [Console] Fix a console test (jakzal)
This PR was merged into the 2.3 branch.

Discussion
----------

[Console] Fix a console test

| Q             | A
| ------------- | ---
| Bug fix?      | yes
| New feature?  | no
| BC breaks?    | no
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets | -
| License       | MIT
| Doc PR        | -

re #10714

Every wide character in the fixture file is actually 3 ansi-characters long.

Commits
-------

61108b9 Disable 5.6 until it is stable again.
8cadb49 Update the fixtures.

fabpot added a commit that referenced this pull request May 14, 2014

bug #10899 Explicitly define the encoding. (jakzal)
This PR was merged into the 2.3 branch.

Discussion
----------

Explicitly define the encoding.

| Q             | A
| ------------- | ---
| Bug fix?      | yes
| New feature?  | no
| BC breaks?    | no
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets | -
| License       | MIT
| Doc PR        | -

Credits for discovering it go to @nicolas-grekas. Cheers!

Travis for PHP 5.6 cannot be enabled yet as there's one more test failing.

re #10714 #10714

Commits
-------

619ff58 Explicitly define the encoding.

@jakzal jakzal referenced this pull request May 15, 2014

Closed

Console test is failing #10905

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment