Support Locales #136

Closed
wants to merge 1 commit into from

4 participants

@klederson

FLOAT now should support the current given locale once there's nowhere where you can specify it.

@klederson klederson Support Locales
FLOAT now should support the current given locale once there's nowhere where you can specify it.
2c60e4f
@augustohp
Respect member

Great one!
Do you think you can provide some tests for other locales?!

@nickl-
Respect member

Tests

@augustohp is right we need tests! Tests to prove the implementation works and stays working but also as a vital source of documentation and insight. With adequate tests it may have been obvious that @klederson is attempting to fix a perceivable bug instead of adding new functionality. Please correct me if I am wrong...

The problem is that filter_var is using the default decimal delimiter . instead of the locale based one.

The problem:

filter_var('1.5', FILTER_VALIDATE_FLOAT);
// 1.5 - float value
filter_var('1,5', FILTER_VALIDATE_FLOAT);
// false - boolean
filter_var('1,5', FILTER_VALIDATE_FLOAT, [ 'options' => [ 'decimal' => ',' ] ]);
// true - \o/

Hold on! Before we go out to celebrate!

The pudding's fan trajectory

I am all for i18n but we are limited to the confines of the PHP environment and even though the suggested fix may solve the above something really bad has happened. The function filter_var is not the only one that acts this way, in actual fact it's not a functional choice perse but gets imposed at the PHP data type level.

The period is as a rule an identifier for floating point values contained in strings:

If the string does not contain any of the characters '.', 'e', or 'E' and the numeric value fits into integer type limits (as defined by PHP_INT_MAX), the string will be evaluated as an integer. In all other cases it will be evaluated as a float.

Quote from the manual on string conversions

Eminent Doom

There are very few definitive truths for sentient beings, but as a machine facts are easy to come by and blindly relied upon, without a question. If I have a string that validates as a float I use it as a float and if a value does not validate as a float it's not a float.

With this fix all locales with delimiters other than a period will at best result in int values with all fractions discarded.

$val = '1,5';
is_float($val);
// false
!!filter_var($val, FILTER_VALIDATE_FLOAT, [ 'options' => [ 'decimal' => ',' ] ]);
// true
$calc = 2 * $val;
// $calc == 2 and not 3 - bad
is_float($calc);
// false
$val = $calc / 3
// $val == 0.6666667 and not 1
is_float($val);
// true
!!filter_var($val, FILTER_VALIDATE_FLOAT, [ 'options' => [ 'decimal' => ',' ] ]);
// false - bad

As you can see, this renders a "valid float" unreliable and defunctional.

the Locale Predicament

As @augustohp points out we should also consider other locales? This is not only for testing but you will very likely want to use something other than the system locale, perhaps that of the end user. This becomes even more of a problem as locales need to be installed first and all locales are not necessarily available. To make matters worse the only way to retrieve these values (unless you know of another) is with localeconv which in turn relies on setlocale which governs the locale for the running PHP instance not per request.

Changing the system locale to represent anything other than the system locale is a bad idea and should go without saying.

Conclusion

@klederson I hope this doesn't discourage you and well done for trying. Unfortunately we cannot merge the suggested changes. I would suggest to rather implement this as another function or perhaps a collection of validators against the system locale. For the purposes of the Float validator we will have to continue to Respect PHP =)


Something you can fix

Now that you mention it and brought it out for attention there is something bugging me. I find it very strange that there is both an is_float and a filter_var because the latter will already fail or return a float value. If we want to return a boolean then we should rather cast it or apply a double negation to the result. I have a feeling the cast will be less expensive but is is as awesome? =) Which do you prefer?

Perhaps we should rather let PHP do the conversion and we only check the result, using is_float on a 0 + $input expression is much less code and should perform better as well for exactly the same result.

Proof is in the pudding

Lets do some benchmarks testing both true and false conditions over 10 000 000 repetitions for 3 iterations each:

is_float(filter_var($val, FILTER_VALIDATE_FLOAT));
$val = '3.5'
─────────────────────────────────
took 20.521008968353s to complete 10000000 repititions result true
took 20.364706039429s to complete 10000000 repititions result true
took 20.182450056076s to complete 10000000 repititions result true

$val = '3,5'
─────────────────────────────────
took 18.645578145981s to complete 10000000 repititions result false
took 18.549484968185s to complete 10000000 repititions result false
took 18.853252887726s to complete 10000000 repititions result false
!!filter_var($val, FILTER_VALIDATE_FLOAT);
$val = '3.5'
─────────────────────────────────
took 13.590573072433s to complete 10000000 repititions result true
took 13.544196844101s to complete 10000000 repititions result true
took 13.317713022232s to complete 10000000 repititions result true

$val = '3,5'
─────────────────────────────────
took 12.163883924484s to complete 10000000 repititions result false
took 11.742944002151s to complete 10000000 repititions result false
took 12.185375928879s to complete 10000000 repititions result false
(bool)filter_var($val, FILTER_VALIDATE_FLOAT);
$val = '3.5'
─────────────────────────────────
took 12.532752990723s to complete 10000000 repititions result true
took 12.35497879982s to complete 10000000 repititions result true
took 12.407600879669s to complete 10000000 repititions result true

$val = '3,5'
─────────────────────────────────
took 10.982671976089s to complete 10000000 repititions result false
took 10.937168121338s to complete 10000000 repititions result false
took 11.02613401413s to complete 10000000 repititions result false
is_float(0 + $val);
$val = '3.5'
─────────────────────────────────
took 10.79625415802s to complete 10000000 repititions result true
took 10.499306201935s to complete 10000000 repititions result true
took 10.448944091797s to complete 10000000 repititions result true

$val = '3,5'
─────────────────────────────────
took 10.567763090134s to complete 10000000 repititions result false
took 10.54952287674s to complete 10000000 repititions result false
took 10.211115121841s to complete 10000000 repititions result false

results from execution in PHP 5.5.7 on Mac OS X 10.8.

Results

The results speak for themselves, a 50% performance improvement and reduction of code is a win-win on all sides. @klederson if you can update the PR I don't see any reason we can't see this committed. Don't forget the unit tests!!! =)

I am pleasantly surprised with the double negation which comes at a marginal sacrifice even compared to casting a value which is already a boolean.


Embrace php 5.4+ syntax

@Respect do you think it is time already for us to drop 5.3 support in favour of the 5.4+ syntax enhancements. With 5.4+ we can access the array directly returned from localeconv romeving the need for the variable reference and the array syntax greatly improves readability, do you agree? With 5.5 at release 7 and 5.4 already with 23 releases I believe it is long overdue.

Example:

return !!filter_var($input, FILTER_VALIDATE_FLOAT, , [ 'options' => [ 'decimal' => localeconv()['decimal_point'] ] ]);

/** compared to **/

$localeData = localeconv();
return is_float(filter_var($input, FILTER_VALIDATE_FLOAT, array("options"=>array('decimal' => $localeData['decimal_point']))));

What do you say?

@nickl-
Respect member

tl;dr Danger do not merge!!!

Remains open for an opportunity to improve the function.

@klederson
@nickl-
Respect member

@klederson Did you see this section?

Something you can fix

Now that you mention it and brought it out for attention there is something bugging me. I find it very strange that there is both an is_float and a filter_var because the latter will already fail or return a float value. If we want to return a boolean then we should rather cast it or apply a double negation to the result. I have a feeling the cast will be less expensive but is is as awesome? =) Which do you prefer?

Perhaps we should rather let PHP do the conversion and we only check the result, using is_float on a 0 + $input expression is much less code and should perform better as well for exactly the same result.

If you still have this development environment handy and submit these changes instead we can happily merge that. =)

@augustohp augustohp added Xtra Credit and removed Xtra Credit labels Feb 15, 2014
@augustohp augustohp modified the milestone: 0.7.0, 0.6.1 Feb 20, 2014
@augustohp augustohp modified the milestone: Backlog, 0.7.0 May 9, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment