Add test for some PHP special chars #91
|
Here's the output of
msgid "plain"
msgstr ""
msgid "DATE \\a\\t TIME"
msgstr ""
msgid "FIELD\tFIELD"
msgstr "" |
|
I know you may have some reticence about what I'm going to write, but the best approach IMHO would be to replace https://github.com/oscarotero/Gettext/blob/e93f74356280016bd50c41c62cc858c3005791e7/src/Utils/PhpFunctionsScanner.php#L42-L50 with:
$bufferFunctions[0][2][] = eval('return '.$value[1].';');
Contrary to what 99% of PHP developers think, The only alternative I can see is to manually unescape the string, differentiating the behaviour if the string is single quoted (we only need to look for |
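A minimal sketch of why the quoting style matters when unescaping, using literals like the ones in this pull request's test. The variable names are illustrative only and are not part of the library.

```php
<?php
// Single quotes: only \\ and \' are escape sequences, so \a and \t stay literal.
$single = 'DATE \a\t TIME';
// Double quotes: \\ collapses to one backslash; \a is not a PHP escape,
// so its backslash is kept as well.
$mixed = "DATE \a\\t TIME";
$escaped = "DATE \\a\\t TIME";
// Double quotes: \t is a real tab character.
$tab = "FIELD\tFIELD";

var_dump($single === $mixed);   // bool(true)
var_dump($mixed === $escaped);  // bool(true)
var_dump(strlen($tab));         // int(11)
```

All three `DATE` literals denote the same string, so an extractor that copies the raw source text instead of decoding it would report them as different messages.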
|
Yes, you're right: I'm reticent :D If the string had other evaluable elements, like a variable: |
|
So, shall we parse the string char by char? |
|
What about something like this:
function decodeString($value)
{
$result = '';
if ($value[0] === "'" || strpos($value, '$') === false) {
$result = eval("return $value;");
} else {
// Manual (and slower) parsing of $value
}
return $result;
} |
|
Here's the full code that should work:
function decodeString($value)
{
static $simpleDictionary;
$result = '';
if ($value[0] === "'" || strpos($value, '$') === false) {
$result = eval("return $value;");
} else {
$value = substr($value, 1, -1);
if (!isset($simpleDictionary)) {
// Escapes that map to a single character; built once and cached in the static variable.
$simpleDictionary = array(
'\\' => "\\",
'$' => '$',
'"' => '"',
);
}
while (($p = strpos($value, '\\')) !== false) {
if (!isset($value[$p + 1])) {
break;
}
if ($p > 0) {
$result .= substr($value, 0, $p);
}
$value = substr($value, $p + 1);
if (isset($simpleDictionary[$value[0]])) {
$result .= $simpleDictionary[$value[0]];
$value = substr($value, 1);
} elseif (preg_match('/^([a-z0-9{}]+)/', $value, $m)) {
$result .= eval('return "\\'.$m[1].'";');
$value = substr($value, strlen($m[1]));
} else {
$result .= '\\';
}
}
$result .= $value;
}
return $result;
}
EDIT: I updated the preg_match, to avoid future unsupported escape sequences (like |
|
If it's ok I can update this pull request accordingly |
|
Even simpler:
function decodeString($value)
{
$result = '';
if ($value[0] === "'" || strpos($value, '$') === false) {
$result = eval("return @$value;");
} else {
$value = substr($value, 1, -1);
while (($p = strpos($value, '\\')) !== false) {
if (!isset($value[$p + 1])) {
break;
}
if ($p > 0) {
$result .= substr($value, 0, $p);
}
$value = substr($value, $p + 1);
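// The character class must also accept a backslash, so that an escaped backslash is decoded as one character.
if (preg_match('/^([\\\\$"]|[a-z0-9{}]+)/', $value, $m)) {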
$result .= eval('return "\\'.$m[1].'";');
$value = substr($value, strlen($m[1]));
} else {
$result .= '\\';
}
}
$result .= $value;
}
return $result;
} |
|
...and without preg_match, for lightning fast execution:
function decodeString($value)
{
$result = '';
if ($value[0] === "'" || strpos($value, '$') === false) {
$result = eval("return @$value;");
} else {
$value = substr($value, 1, -1);
while (($p = strpos($value, '\\')) !== false) {
if (!isset($value[$p + 1])) {
break;
}
if ($p > 0) {
$result .= substr($value, 0, $p);
}
$value = substr($value, $p + 1);
$p = strpos($value, '$');
if ($p === false) {
$result .= eval('return "\\'.$value.'";');
$value = '';
break;
}
if ($p === 0) {
$result .= '$';
$value = substr($value, 1);
} else {
$result .= eval('return "\\'.substr($value, 0, $p).'";');
$value = substr($value, $p);
}
}
$result .= $value;
}
return $result;
} |
|
I think it's a bit overcomplicated. Why not use a single solution for all cases? For example (not tested):
function fix ($value) {
if (strpos($value, '\\') !== false) {
if ($value[0] === '"') {
$value = preg_replace('/[^\\\]\\\([^nrtvefxu0-7\$\\"])/', '\\\\$1', $value);
} else {
$value = preg_replace("/[^\\\]\\\([^'])/", '\\\\\1', $value);
}
}
return substr($value, 1, -1);
}
Eval is going to be used in very few cases, so I don't see any value in using two solutions for the same problem. |
It's broken: see https://3v4l.org/WqGos
Well, I'd use it for all the strings not enclosed in |
|
See the results from the approach of mine: https://3v4l.org/tQgip |
|
Ok, it's a bit hard to fix this with regular expressions (at least for me). |
|
Done... let's see what Travis says 😉 |
|
If this pull request gets merged, I already have another fix ready (the po export removes special chars like "\t" and could mess up multibyte strings) |
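For context, a rough sketch of what escaping a message for a PO file could look like. `escapePoString` is a hypothetical helper written for illustration; it is not the library's exporter and not the fix referred to above. It assumes UTF-8 input and only handles backslash, double quote, newline, carriage return and tab.

```php
<?php
// Hypothetical helper (assumption): escape a decoded message for use inside
// a PO msgid/msgstr. strtr() works on bytes, and none of these ASCII bytes
// can occur inside a UTF-8 multibyte sequence, so multibyte text is untouched.
function escapePoString(string $text): string
{
    return strtr($text, array(
        '\\' => '\\\\',
        '"'  => '\\"',
        "\n" => '\\n',
        "\r" => '\\r',
        "\t" => '\\t',
    ));
}

echo 'msgid "' . escapePoString('DATE \a\t TIME') . '"', PHP_EOL;
echo 'msgid "' . escapePoString("FIELD\tFIELD") . '"', PHP_EOL;
```

With these inputs the two lines printed have the same form as the `msgid` entries quoted at the top of this conversation.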
|
Ah, ok, thank you. (Sorry for the delay, I'm a bit busy at the moment.) |
Add test for some PHP special chars
|
See #92 |
|
Seems like SensioLabs does not like eval: https://insight.sensiolabs.com/projects/496dc2a6-43be-4046-a283-f8370239dd47/analyses/23 😞 |
|
I'm open to any valid alternatives... As I said, 99% of developers consider eval=evil, but I really don't understand why... |
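One possible eval-free alternative, sketched under stated assumptions: `decodePhpStringLiteral` is a hypothetical name, the literal is assumed to be well formed, and variable interpolation and `\u{...}` escapes are deliberately left untouched. It is not the code that was proposed or merged in this pull request.

```php
<?php
/**
 * Hypothetical eval-free decoder for a PHP string literal (quotes included).
 * Assumptions: well-formed literal; no variable expansion; no \u{...} support.
 */
function decodePhpStringLiteral(string $literal): string
{
    $body = substr($literal, 1, -1);

    if ($literal[0] === "'") {
        // Single-quoted: only \\ and \' are escape sequences.
        return strtr($body, array('\\\\' => '\\', "\\'" => "'"));
    }

    $simple = array(
        'n' => "\n", 'r' => "\r", 't' => "\t", 'v' => "\v", 'e' => "\e",
        'f' => "\f", '\\' => '\\', '$' => '$', '"' => '"',
    );

    // Double-quoted: decode known escapes. Unknown ones (e.g. \a) never match
    // and therefore keep their backslash, as PHP itself does.
    return preg_replace_callback(
        '/\\\\(?:([nrtvef\\\\$"])|([0-7]{1,3})|x([0-9A-Fa-f]{1,2}))/',
        function (array $m) use ($simple) {
            if ($m[1] !== '') {
                return $simple[$m[1]];           // single-character escape
            }
            if ($m[2] !== '') {
                return chr(octdec($m[2]) & 0xFF); // octal escape
            }
            return chr(hexdec($m[3]));            // hexadecimal escape
        },
        $body
    );
}

// Nowdoc keeps the source text exactly as it would appear in a template.
$source = <<<'SRC'
"DATE \a\\t TIME"
SRC;

var_dump(decodePhpStringLiteral($source) === 'DATE \a\t TIME'); // bool(true)
```

Because the regular expression consumes each escape sequence in a single left-to-right pass, an escaped backslash followed by `t` is not mistaken for a tab.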
|
Yes, in this case I agree that it's not dangerous. But having those critical security errors is not nice. |
Let's consider this source code:
<div>
<p><?php __('plain'); ?></p>
<p><?php __('DATE \a\t TIME'); ?></p>
<p><?php __("DATE \a\\t TIME"); ?></p>
<p><?php __("DATE \\a\\t TIME"); ?></p>
<p><?php __("FIELD\tFIELD"); ?></p>
</div>
Here we have only 3 different strings, but the PHP extractor finds these strings: