[IMP] web: more lenient number parser #115227

caburj · 2023-03-14T16:00:33Z

PURPOSE

Avoid faulty inputs caused by mixing up the thousands separator and the decimal
separator during parsing. We must let go of the (wrong) assumption that people will paste
numbers in the format defined by their locale. Number parsing should depend on the
locale as little as possible.

The locale format will still be used for formatting of course, so this config has
its use - but parsing should be much more agnostic.

HOW

The purpose is achieved by doing the following parsing steps:

1. Parse the input with the original, stricter parser, which uses the locale.
2. If the first step fails, parse again using a more lenient heuristic.
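
A minimal sketch of this two-stage flow, assuming the strict parser signals failure by throwing. `parseWithFallback` and `strictParse` are hypothetical names used for illustration, not the functions touched by this PR, and the strict parser here is deliberately simplified (it does not validate digit grouping). The lenient stage is passed in as a callback.

```javascript
// Hypothetical strict parser: accepts only the locale's own separators.
// Simplified for illustration; it does not check three-digit grouping.
function strictParse(input, thousandsSep, decimalPoint) {
    const [intPart, fracPart, ...rest] = input.trim().split(decimalPoint);
    const digits = intPart.split(thousandsSep).join("");
    const badInt = !/^-?\d+$/.test(digits);
    const badFrac = fracPart !== undefined && !/^\d+$/.test(fracPart);
    if (rest.length > 0 || badInt || badFrac) {
        throw new Error(`"${input}" does not match the locale format`);
    }
    return Number(fracPart === undefined ? digits : `${digits}.${fracPart}`);
}

// Try the strict, locale-based parser first; fall back to the lenient
// heuristic only when the strict parser rejects the input.
function parseWithFallback(input, strict, lenient) {
    try {
        return strict(input);
    } catch (_error) {
        return lenient(input);
    }
}
```

With a locale using "," for thousands and "." for decimals, an input such as "1,234.56" is handled by the strict stage, while "1.234,56" is rejected there and handed to the lenient stage.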

The lenient parsing heuristic follows these steps:

- Remove all whitespace.
- Treating "dot" and "comma" as separators, collect them from the input in order of appearance.
- If the number of separators is one:
  - Check if it's the thousands separator from the locale.
    - If so, remove it from the input.
    - Otherwise, replace it with "dot" (as decimal point).
- If the number of separators is two:
  - Check if they're the same.
    - If so, remove them because they're thousands separators.
    - Otherwise, the first is a thousands separator and the second is the decimal point.
      - Remove the thousands separator and replace the decimal point with "dot".
- If the number of separators is more than two:
  - All separators before the last should be thousands separators, while the last one is a decimal point.
  - Check if the separators before the last are all the same.
    - If not, throw an error: the input can't be a number.
  - Check if the separators before the last are the same as the last.
    - If so, remove them all from the input (they're just thousands separators).
    - Otherwise, remove the thousands separators and replace the decimal point with "dot".
- Convert the resulting input to a number using Number.
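
The heuristic above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from this PR: `lenientParseNumber` and its `localeThousandsSep` parameter are hypothetical names. The two-separator case is the "more than two" rule with a single leading separator, so the sketch handles both in one branch.

```javascript
// Illustrative sketch only: `lenientParseNumber` and `localeThousandsSep`
// are hypothetical names, not the implementation from this PR.
function lenientParseNumber(input, localeThousandsSep = ",") {
    // Remove all whitespace.
    const value = input.replace(/\s+/g, "");
    // Collect every "." and "," in order of appearance.
    const separators = value.match(/[.,]/g) || [];
    const removeAll = (text, sep) => text.split(sep).join("");

    let normalized = value;
    if (separators.length === 1) {
        // A lone separator is ambiguous, so the locale decides.
        const [sep] = separators;
        normalized = sep === localeThousandsSep
            ? removeAll(value, sep)
            : value.replace(sep, ".");
    } else if (separators.length >= 2) {
        const last = separators[separators.length - 1];
        const leading = separators.slice(0, -1);
        // Mixed leading separators cannot be thousands separators.
        if (leading.some((sep) => sep !== leading[0])) {
            throw new Error(`"${input}" cannot be parsed as a number`);
        }
        normalized = leading[0] === last
            ? removeAll(value, last) // all identical: thousands only
            : removeAll(value, leading[0]).replace(last, ".");
    }
    // Non-numeric leftovers yield NaN; callers decide how to report that.
    return Number(normalized);
}
```

Under these assumptions, both "1.234,56" and "1,234.56" parse to 1234.56 regardless of locale, while "1,234" parses to 1234 when the locale's thousands separator is "," and to 1.234 when it is ".".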

TASK-ID: 3092583

I confirm I have signed the CLA and read the PR guidelines at www.odoo.com/submit-pr

robodoo · 2023-03-14T16:00:59Z

Pull request status dashboard.


caburj force-pushed the master-web-float-input-parsing-jcb branch from 9b3e875 to 52e0fc0 Compare March 14, 2023 16:13

C3POdoo added the RD research & development, internal work label Mar 14, 2023

caburj force-pushed the master-web-float-input-parsing-jcb branch 3 times, most recently from 0adddea to a8b5476 Compare March 15, 2023 13:41

caburj changed the title ~~WIP: working~~ [IMP] web: more lenient number parser Mar 15, 2023

caburj force-pushed the master-web-float-input-parsing-jcb branch 3 times, most recently from 41cb0d3 to 56885fb Compare March 15, 2023 14:10

caburj force-pushed the master-web-float-input-parsing-jcb branch from 56885fb to dfd3a4c Compare March 16, 2023 12:43
