Request #72831: Add encoding option for chr and ord #2083

Closed
wants to merge 4 commits into from

masakielastic
Contributor

This PR proposes an encoding option for chr() and ord(), based on #2081.

PHPAPI zend_string *php_escape_html_entities(unsigned char *old, size_t oldlen, int all, int flags, char *hint_charset);
PHPAPI zend_string *php_escape_html_entities_ex(unsigned char *old, size_t oldlen, int all, int flags, char *hint_charset, zend_bool double_encode);
PHPAPI zend_string *php_unescape_html_entities(unsigned char *old, size_t oldlen, int all, int flags, char *hint_charset);
PHPAPI unsigned int php_next_utf8_char(const unsigned char *str, size_t str_len, size_t *cursor, int *status);
PHPAPI unsigned int get_next_char(enum entity_charset charset, const unsigned char *str, size_t str_len, size_t *cursor, int *status);
PHPAPI enum entity_charset determine_charset(char *charset_hint);
Member

I think that if these were to be exported, they should at least have the php_ prefix, like the other exposed functions.

@smalyshev smalyshev added the RFC label Sep 5, 2016
c = 0xfffd;
}

if (c < 0x80) {
Contributor

Why does this reimplement UTF-8 encoding? We already have enough implementations of this scattered around the php-src codebase (one of which is my fault, I must admit).

Also, did you copy this code from somewhere? If so, why couldn't it be reused without copying?
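For readers following the review, the kind of routine being questioned can be sketched as below. This is a minimal, hypothetical illustration of encoding one code point as UTF-8, not the PR's code and not an existing php-src function; the name `sketch_utf8_encode` and the exact condition for substituting U+FFFD are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of php-src): encodes one code point as UTF-8.
 * Writes 1 to 4 bytes into out and returns the number of bytes written. */
size_t sketch_utf8_encode(uint32_t c, unsigned char out[4])
{
    assert(out != NULL);

    /* Assumption: surrogates and values above U+10FFFF are replaced by
     * U+FFFD, mirroring the "c = 0xfffd" substitution seen in the diff. */
    if (c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF)) {
        c = 0xFFFD;
    }

    if (c < 0x80) {                 /* 1 byte: 0xxxxxxx */
        out[0] = (unsigned char)c;
        return 1;
    }
    if (c < 0x800) {                /* 2 bytes: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (c >> 6));
        out[1] = (unsigned char)(0x80 | (c & 0x3F));
        return 2;
    }
    if (c < 0x10000) {              /* 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xE0 | (c >> 12));
        out[1] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (c & 0x3F));
        return 3;
    }
    /* 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
    out[0] = (unsigned char)(0xF0 | (c >> 18));
    out[1] = (unsigned char)(0x80 | ((c >> 12) & 0x3F));
    out[2] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
    out[3] = (unsigned char)(0x80 | (c & 0x3F));
    return 4;
}
```

The logic is short, which is probably why it keeps getting rewritten; the reviewer's point is that a shared helper would avoid yet another copy.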

@hikari-no-yume
Contributor

I was going to question whether chr() and ord() are the right place for this, because most of the string functions ignore Unicode. But we also have some that don't (htmlspecialchars() and the like, thus the Unicode stuff in ext/standard/html.c), and I myself moved utf8_encode() and utf8_decode() to ext/standard, so I can't really object.

@sgolemon
Contributor

If UTF-8 is your target encoding, consider IntlChar::chr() and IntlChar::ord(), which have been available in PHP since 7.0.
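As a rough illustration of what a UTF-8-aware ord() has to compute, here is a minimal C sketch that decodes the first code point of a byte string. It is a hypothetical stand-in, neither the IntlChar implementation nor the PR's code; the name `sketch_utf8_ord` and the choice to return -1 for empty or malformed input are assumptions for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of php-src or ICU): decodes the first UTF-8
 * sequence in s[0..len). Returns the code point, or -1 if the input is empty,
 * truncated, overlong, a surrogate, or otherwise malformed. */
long sketch_utf8_ord(const unsigned char *s, size_t len)
{
    size_t need, i;
    uint32_t c, min;

    assert(s != NULL || len == 0);
    if (len == 0) {
        return -1;
    }

    if (s[0] < 0x80) {                     /* single-byte (ASCII) */
        return s[0];
    } else if ((s[0] & 0xE0) == 0xC0) {    /* lead byte of a 2-byte sequence */
        need = 1; c = s[0] & 0x1F; min = 0x80;
    } else if ((s[0] & 0xF0) == 0xE0) {    /* lead byte of a 3-byte sequence */
        need = 2; c = s[0] & 0x0F; min = 0x800;
    } else if ((s[0] & 0xF8) == 0xF0) {    /* lead byte of a 4-byte sequence */
        need = 3; c = s[0] & 0x07; min = 0x10000;
    } else {
        return -1;                         /* stray continuation or invalid lead */
    }

    if (len < need + 1) {
        return -1;                         /* truncated sequence */
    }
    for (i = 1; i <= need; i++) {
        if ((s[i] & 0xC0) != 0x80) {
            return -1;                     /* not a continuation byte */
        }
        c = (c << 6) | (uint32_t)(s[i] & 0x3F);
    }

    /* Reject overlong forms, surrogates, and values beyond U+10FFFF. */
    if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF)) {
        return -1;
    }
    return (long)c;
}
```

Under these assumptions, the bytes `E2 82 AC` decode to 0x20AC (the euro sign), while an overlong sequence such as `C0 80` is rejected.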

@cmb69
Member

cmb69 commented Aug 8, 2018

This PR has been tagged with the RFC label, so it would be nice to start the RFC process, or to close this PR.

@KalleZ
Member

KalleZ commented Mar 2, 2019

Gonna close this due to inactivity. Please start the RFC process if you wish to pick this up again.

@KalleZ KalleZ closed this Mar 2, 2019
6 participants