Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PSR2 Adjustments for the UTF-8 functions #2780

Merged
merged 6 commits into from
Jul 14, 2019
Merged

PSR2 Adjustments for the UTF-8 functions #2780

merged 6 commits into from
Jul 14, 2019

Conversation

splitbrain
Copy link
Collaborator

This moves all our actual code for handling UTF-8 (with or without mbstring) to separate classes in their own namespace. The methods are all static and alias functions for backward compatibility point to them.

The various lookup tables have been moved to their own files and reformatted to have one entry per line (for better diffs).

One question that remains is if we should deprecate the old utf8_* methods in favor of their new class method counter parts. Realistically speaking we will probably never be able to remove the alias functions because the utf8 stuff is too widely used.

These will be loaded via include, later on and should be cachable by
Op-Cache.

The formatting has been adjusted to have one entry per line to make
diffing much easier in the future.

For now duplicate keys and commented code from the originals have been
kept. But this should probably be cleaned up in the future.

For now these tables are not used, yet.
This doesn't really change much since the old functions are still needed
for compatibility reasons. We may be able to reduce the number of
functions by checking which ones we really need.
Docblocks, imports, etc...
@splitbrain splitbrain requested review from phy25 and micgro42 May 20, 2019 08:21
Copy link
Collaborator

@micgro42 micgro42 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm strongly in favor of deprecating the old utf8.

  1. this makes it much less confusing for new and returning maintainers that would be otherwise be presented with two sets utility methods that seem to do the same thing
  2. This encourages people to maintain their code (it does of me at least, because I don't want deprecated methods in my code)
  3. In two or three years we might think differently about "we can never remove it". Maybe all the old, not updated extensions will by then be incompatible with Dokuwiki anyway?

@micgro42
Copy link
Collaborator

micgro42 commented Jun 8, 2019

There are still two issues that are not auto-fixable:

FILE: /home/michael/temp/2019-02-02/dokuwiki/inc/Utf8/PhpString.php
 147 | WARNING | [ ] Line exceeds 120 characters; contains 121 characters (Generic.Files.LineLength.TooLong)
 188 | ERROR   | [ ] Method name "PhpString::substr_replace" is not in camel caps format (PSR1.Methods.CamelCapsMethodName.NotCamelCaps)

Actually replacing all calls is still to come.
also avoids overlong line
@splitbrain
Copy link
Collaborator Author

I am not sure about substr_replace. Since it's the replacement for the php function of the same name I would vote for keeping it in snake case to match the original.

For now this uses full qualified namespaces, sensible imports may come
later.
Copy link
Collaborator

@phy25 phy25 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I look through the code and the idea looks good to me. Maybe the correctness of this can be confirmed by either fully-covered tests or pre-release?

0xff1d, 0xff1e, 0xff1f, 0xff20, 0xff3b, 0xff3c, 0xff3d, 0xff3e, 0xff40, 0xff5b,
0xff5c, 0xff5d, 0xff5e, 0xff5f, 0xff60, 0xff61, 0xff62, 0xff63, 0xff64, 0xff65,
0xffe0, 0xffe1, 0xffe2, 0xffe3, 0xffe4, 0xffe5, 0xffe6, 0xffe8, 0xffe9, 0xffea,
0xffeb, 0xffec, 0xffed,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I searched all table constants around here in plugin codebase, looks like the only plugin making use of this $UTF8_SPECIAL_CHARS2 is plugin:siteexport. Deprecating these constants should be noted somewhere as well.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when writing https://www.dokuwiki.org/devel:releases:refactor#utf-8_functions, I notices too that $UTF8_SPECIAL_CHARS2 has no replacement. Is this variable not needed anymore? or is it included in \dokuwiki\Utf8\Table::specialChars()?

@splitbrain splitbrain merged commit 8cbc5ee into psr2 Jul 14, 2019
@micgro42 micgro42 deleted the utf8refactor branch July 17, 2019 06:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants