Fix special character support. #111

Merged 1 commit on Dec 20, 2012

Gipetto (Contributor) commented Dec 20, 2012

The current version of the Twilio PHP library runs its input through PHP's htmlentities function, which produces "named" character entities in the output. Named character entities and XML don't get along: XML predefines only the 'quot', 'amp', 'apos', 'lt', and 'gt' entities, so any other name is an undeclared reference.

This change replaces the htmlentities call with a process that decodes the input and then runs it through htmlspecialchars to produce "numeric" entities instead.

This should accommodate most western character sets. Multi-byte character sets that htmlspecialchars does not support will fail silently in Twilio (i.e., say nothing) instead of throwing an application error.
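
A minimal sketch of the decode-then-encode step described above, not the library's actual code. The helper name normalize_for_twiml is hypothetical, and the file is assumed to be saved as UTF-8. Note that htmlspecialchars itself only escapes &, <, > and double quotes, so this sketch leaves é as a raw UTF-8 character; the numeric form shown in the response further down is presumably produced when the XML is serialized, which is not covered here.

```php
<?php

// Hypothetical helper for illustration; the name is not taken from twilio-php.
function normalize_for_twiml($text)
{
    // Step 1: turn named and numeric HTML entities back into UTF-8 characters.
    $decoded = html_entity_decode($text, ENT_COMPAT, 'UTF-8');

    // Step 2: escape only the XML-significant characters.
    // The 4th argument (double_encode = false) leaves existing entities alone.
    return htmlspecialchars($decoded, ENT_COMPAT, 'UTF-8', false);
}

$inputs = array(
    'é tü & må',                        // raw UTF-8
    '&eacute; t&uuml; &amp; m&aring;',  // HTML named entities
    '&#xE9; t&#xFC; &amp; m&#xE5;',     // HTML numeric entities
);

// All three inputs should print the same normalized string.
foreach ($inputs as $input) {
    echo normalize_for_twiml($input), "\n";
}
```

Decoding first is what lets all three input styles converge on one representation before anything is escaped.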

Desired text

é tü & må

Old and busted

Twilio returns "An application error has occurred" with the actual logged error being "Error on line 2 of document : The entity "Atilde" was referenced, but not declared. Please ensure that the response body is a valid XML document."

<?xml version="1.0" encoding="UTF-8"?>
<Response><Say>&Atilde;&copy; t&Atilde;&frac14; &amp; m&Atilde;&yen;</Say></Response>
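
The entity names in this response are what PHP produces when the UTF-8 bytes of the desired text are encoded as if they were ISO-8859-1, which was the default charset of htmlentities before PHP 5.4. The sketch below reproduces that, under the assumption that the old call relied on that default; the library's actual call is not shown in this description.

```php
<?php

// "é tü & må" written as explicit UTF-8 bytes so the file encoding doesn't matter.
$text = "\xC3\xA9 t\xC3\xBC & m\xC3\xA5";

// Treating the UTF-8 bytes as ISO-8859-1 turns each 2-byte character
// into two unrelated named entities (0xC3 => Atilde, 0xA9 => copy, ...).
$broken = htmlentities($text, ENT_COMPAT, 'ISO-8859-1');

echo $broken, "\n";
// &Atilde;&copy; t&Atilde;&frac14; &amp; m&Atilde;&yen;
```

None of Atilde, copy, frac14 or yen is among the five entities XML predefines, which is consistent with the "referenced, but not declared" error quoted above.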

New Hotness

Twilio properly pronounces the characters.

<?xml version="1.0" encoding="UTF-8"?>
<Response><Say>&#xE9; t&#xFC; &amp; m&#xE5;</Say></Response>

Test

The unit tests have been updated to test for desired output, but here's a quick test that can be run to observe the output:

<?php

require('/path/to/twilio-php/Services/Twilio.php');

$lines = array(
    'é tü & må', // raw UTF8
    '&eacute; t&uuml; &amp; m&aring;', // html named entities
    '&#xE9; t&#xFC; &amp; m&#xE5;', // html numeric entities
);

$response = new Services_Twilio_Twiml;

foreach ($lines as $line) {
    $response->say($line);
}

# All 3 say verbs in response should match
echo $response;

New Requirements

The library now requires PHP 5.2.3 (formerly 5.2.1), the first version in which the PHP function htmlspecialchars supports the double_encode flag that turns off double encoding.
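
A small illustration of what that flag changes, using a made-up input string that mixes an already-encoded entity with a bare ampersand:

```php
<?php

// Illustrative input: one existing entity and one bare ampersand.
$text = 'Tom &amp; Jerry & friends';

// Default (double_encode = true): the existing entity is encoded again.
$twice = htmlspecialchars($text, ENT_COMPAT, 'UTF-8');

// double_encode = false: existing entities are left untouched.
$once = htmlspecialchars($text, ENT_COMPAT, 'UTF-8', false);

echo $twice, "\n"; // Tom &amp;amp; Jerry &amp; friends
echo $once, "\n";  // Tom &amp; Jerry &amp; friends
```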

walker commented Dec 20, 2012

I second this. beautiful.

kevinburke (Contributor) commented

Ugh, that's annoying. Thanks Shawn.

Gipetto (Contributor, Author) commented Dec 20, 2012

Thank @walker - this is based on his work digging in to OpenVBX.

@kevinburke kevinburke merged commit 914253d into twilio:master Dec 20, 2012