Emoji in titles or URLs cause tracking to fail #7766

Closed
ethitter opened this Issue Apr 25, 2015 · 8 comments

ethitter commented Apr 25, 2015

WordPress 4.2 was released this week, and it includes full support for emoji, including in post titles and URLs. To take advantage of that, I published a post that used the 💥 emoji (https://s.w.org/images/core/emoji/72x72/1f4a5.png, in case it gets stripped out) in the title and URL. However, Piwik failed to track any views of the post because piwik.php returns a 400 Bad Request status code. I confirmed against two other tracking systems that there were views of the post that Piwik should have captured.

Is this a problem with the DB encoding (utf8 vs utf8mb4) or an issue in the PHP handling of the title and URL inputs when they include extended UTF-8 characters?
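
For context on the utf8 vs utf8mb4 question: MySQL's legacy utf8 charset stores at most 3 bytes per character, while the emoji above needs 4 bytes in UTF-8. The following is a minimal Python sketch (not Piwik code; the function names are made up for illustration) that checks whether a string contains characters that would not fit in a 3-byte charset.

```python
def utf8_byte_lengths(text):
    """Return (character, byte length) pairs for the UTF-8 encoding of text."""
    return [(ch, len(ch.encode("utf-8"))) for ch in text]


def needs_utf8mb4(text):
    """True if any character in text takes more than 3 bytes in UTF-8.

    Such characters do not fit in MySQL's legacy utf8 (utf8mb3) charset.
    """
    return any(length > 3 for _, length in utf8_byte_lengths(text))


collision = "\U0001F4A5"  # the emoji from the post title (code point 1f4a5)

# ASCII, accented Latin, the euro sign and the emoji take 1, 2, 3 and 4 bytes.
print(utf8_byte_lengths("a\u00e9\u20ac" + collision))
print(needs_utf8mb4("Hello " + collision))
```

A title made only of characters in the Basic Multilingual Plane (code points up to U+FFFF) passes this check, which would explain why ordinary non-ASCII titles are tracked while emoji titles are not.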

Member

mattab commented May 22, 2015

Hi @ethitter
Thanks for the report. I can confirm that URLs with emoji are not tracked. This is likely because we would have to change the MySQL tables from utf8 to utf8mb4. Note: the WordPress devs blogged about this change in: https://make.wordpress.org/core/2015/04/02/the-utf8mb4-upgrade/
It looks non-trivial, so unfortunately we can't do it soon.

@mattab mattab added the Bug label May 22, 2015

@mattab mattab added this to the Mid term milestone May 22, 2015

Member

sgiehl commented May 22, 2015

Shouldn't we do a 'quickfix' so that those URLs will still be tracked? Maybe with the emoji cut off?

ethitter commented May 23, 2015

Also relevant is Andrew Nacin's discussion of the security issues around these changes: https://www.youtube.com/watch?v=yQaRUEwEKxE. Simply updating the table encodings may not be sufficient; it wasn't for WordPress.

I like the idea of a quickfix to just drop the emoji, but that would likely break the URLs being tracked, as the emoji would need to be stripped from there too.

Member

mattab commented Sep 11, 2015

> Shouldn't we do a 'quickfix' so that those URLs will still be tracked? Maybe with the emoji cut off?

@sgiehl makes sense; it would be better to track partially incorrect data than no data at all. Maybe instead of removing the emoji, we could replace them with *** or so. Feel free to investigate if you have time :)

Member

mattab commented Sep 15, 2015

FYI: Piwik now tracks URLs with emoji, but each emoji (and every other 4-byte UTF-8 character) will be replaced by the � character. This was done in #8765.
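
To illustrate the kind of replacement described here, below is a rough Python sketch. It is not the code from #8765 (Piwik is written in PHP); the function name and regular expression are assumptions chosen for illustration. It replaces every character above U+FFFF, which is exactly the set that needs 4 bytes in UTF-8, with U+FFFD.

```python
import re

# Characters outside the Basic Multilingual Plane (U+10000 and above)
# are exactly the ones that take 4 bytes in UTF-8.
FOUR_BYTE_CHARS = re.compile("[\U00010000-\U0010FFFF]")


def replace_four_byte_chars(text, replacement="\uFFFD"):
    """Replace each 4-byte UTF-8 character with the replacement character."""
    return FOUR_BYTE_CHARS.sub(replacement, text)


title = "\U0001F4A5 Emoji in titles"
cleaned = replace_four_byte_chars(title)

# After replacement, no character needs more than 3 bytes,
# so the string fits in a 3-byte-per-character column.
print(max(len(ch.encode("utf-8")) for ch in cleaned))
```

U+FFFD itself encodes to 3 bytes, so the cleaned string stays within what a utf8 (utf8mb3) column accepts, at the cost of losing the original character.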

Member

mattab commented Sep 15, 2015

This is fixed. Created: #8790 Tracking API: track Emoji correctly in page URLs and others

gmariani commented May 9, 2018

Still having this issue with 3.4.0 on PHP 7.2

[09-May-2018 14:11:33 UTC] Error in Matomo: Your Matomo version 3.4.0 is up to date.
[09-May-2018 14:11:43 UTC] Error in Matomo (tracker): Error query: SQLSTATE[HY000]: General error: 1366 Incorrect string value: '\xF0\x9F\x8F\xA1 C...' for column 'name' at row 1 In query: INSERT INTO piwik_log_action (name, hash, type, url_prefix) VALUES (?,CRC32(?),?,?) Parameters: array ( 0 => '🏡 Chandler Arizona Luxury Homes | [John Cunningham 2018]', 1 => '🏡 Chandler Arizona Luxury Homes | [John Cunningham 2018]', 2 => 4, 3 => NULL, )
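
The byte sequence quoted in the error message can be decoded to see which character the column rejected. This is a small diagnostic sketch in Python; the helper name is hypothetical and it only uses the standard library.

```python
import unicodedata


def describe_utf8_bytes(hex_bytes):
    """Decode a hex string of UTF-8 bytes and describe each character.

    Returns a list of (code point, Unicode name, UTF-8 byte length) tuples.
    """
    text = bytes.fromhex(hex_bytes).decode("utf-8")
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"), len(ch.encode("utf-8")))
        for ch in text
    ]


# The bytes \xF0\x9F\x8F\xA1 from the "Incorrect string value" error above.
print(describe_utf8_bytes("F09F8FA1"))
```

The sequence is a single 4-byte character, which is consistent with a 4-byte character reaching a column that only accepts 3-byte utf8.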

Member

mattab commented May 9, 2018

@gmariani, as this is not supposed to trigger an error, could you please open a new issue (this one is already closed) and paste the piwik.php?.... request that creates this error? We will make sure to address this. Thanks