Field type to handle currency amounts in GTFS? #254

Closed
scmcca opened this issue Nov 24, 2020 · 7 comments

Comments

@scmcca
Contributor

scmcca commented Nov 24, 2020

Hi everyone,

An issue was raised in the GTFS Fares v2 extension project around describing currency amounts for fare prices. The proposal currently describes currency amounts using float or non-negative float field types. General best practice is to never use floats for money, because rounding errors can gain or lose money during calculations.

We (MobilityData) see a few options:

  • (a) Keep currency amounts as float, and leave correct calculations up to consumers (i.e., defining currency amounts in a preferred type, rounding calculations to two decimal places).
  • (b) Describe currency amounts as integer increments of their smallest unit (i.e., 125 cents to describe 1.25 dollars).
  • (c) Add a decimal field type to GTFS.

What are your thoughts on describing currency amounts in GTFS? We are interested in hearing different perspectives: producers, consumers, and other stakeholders.

For general discussions/needs related to Fares v2, see #252.

Thanks!

@barbeau
Collaborator

barbeau commented Nov 24, 2020

FWIW, "data types" in GTFS are really a semantic definition - the actual data container is a string in a CSV file encoded in UTF-8.

The problem with currency calculations comes in after the String is read by consumer software, converted into a float data type in some programming language, and then used in financial calculations.
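As a rough illustration of that failure mode, here is a minimal Python sketch. The price strings are made up for the example, and `decimal.Decimal` is just one standard-library decimal type a consumer might pick.

```python
from decimal import Decimal

# Hypothetical fare prices, as they would appear as strings in a GTFS CSV file.
raw_prices = ["0.10", "0.20"]

# Reading the strings into binary floats introduces representation error.
float_total = sum(float(p) for p in raw_prices)

# Reading the same strings into Decimal keeps the exact decimal values.
decimal_total = sum(Decimal(p) for p in raw_prices)

print(float_total == 0.3)                # False
print(decimal_total == Decimal("0.30"))  # True
```

The string in the file is identical in both cases; only the type chosen by the consumer differs.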

No matter what representation is in the spec, IMHO we should explicitly call out in the spec the issue with doing financial calculations using floating point values (including with the current price field). It's possible that we engineer the spec with special data types and consumers still read the value into a float without thinking. We could point to specific programming language fields that should be used such as:

Some thoughts on the options:

(a) Keep currency amounts as float, and leave correct calculations up to consumers (i.e., defining currency amounts in a preferred type, rounding calculations to two decimal places).

This could work, although as I mentioned above we should still call out why you shouldn't use float in software for financial calculations in the spec.

(b) Describe currency amounts as integer increments of their smallest unit (i.e., 125 cents to describe 1.25 dollars).

This could work, but I don't think it solves the underlying problem of consumer software importing it as a float. The issue isn't in the GTFS data container, it's in how consumers read it. It would be easy for someone to just write this without thinking - float myPrice = gtfsPrice / 100.
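A small Python sketch of the same trap, and of one way around it. The value 125 and the assumption of 100 minor units per major unit are illustrative only.

```python
# Hypothetical fare stored as an integer count of minor units (cents), per option (b).
gtfs_price = 125

# Tempting but lossy: dividing by 100 moves the value into binary floating point.
lossy_price = gtfs_price / 100

# Safer: keep arithmetic on the integer and split into units only for display.
total_cents = gtfs_price * 3  # three fares, still an exact integer
major, minor = divmod(total_cents, 100)
formatted = f"{major}.{minor:02d}"

print(type(lossy_price).__name__)  # float
print(formatted)                   # 3.75
```

The integer representation only helps if the consumer keeps it as an integer until the last moment.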

(c) Add a decimal field type to GTFS.

Again, the difference here between calling it a "decimal" vs "float" is just semantics in GTFS data types, but I do like that this explicitly calls out that you should be using a decimal type in your programming language rather than float. FWIW, I believe we could also change the existing GTFS fare price field data type from "non-negative float" to "decimal" without any backwards compatibility issues (but someone correct me if I'm wrong).

@timMillet
Contributor

timMillet commented Nov 24, 2020

There are some normalization concerns regarding option B:

AFAIK, there is no international standard for currency subdivisions. The ISO 4217 standard for currencies only specifies the number of digits after the decimal separator (from 0 to 4 digits).

The GTFS Fares v2 field definition of currency could evolve so that any fare amount is expressed in the currency subdivision implied by the ISO 4217 number of digits after the decimal separator (e.g., EUR has 2 digits after the decimal separator, so amount=83 with currency EUR would mean EUR 0.83 instead of EUR 83). Even so, two issues would arise:

  • In order to display the fare amount in the actual currency (e.g. EUR instead of 1/100 EUR), data consumers will have to interpret a value from the dataset according to information outside of it. This is feasible as it already exists in GTFS for currency codes and languages, but remains a source of confusion.
  • Some currencies do not use decimal subdivisions, and ISO 4217 does not capture this well. E.g., 1 MGA (the currency of Madagascar) is subdivided into fifths: 1 ariary = 5 iraimbilanja. ISO 4217 shows 2 as the number of digits after the decimal separator for MGA, exactly the same number of digits as for EUR, even though EUR is subdivided into 1/100 rather than 1/5. So amount=83 with currency EUR would represent EUR 0.83, but should amount=83 with currency MGA represent MGA 0.83 or MGA 16.6? This creates another source of confusion that could be prevented by expressing amounts in currencies instead of currency subdivisions.
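The MGA ambiguity can be sketched in a few lines of Python. The amount 83 comes from the example above; the variable names are hypothetical.

```python
from decimal import Decimal

amount = 83  # integer amount as it might appear in a feed

# Reading 1: shift by the ISO 4217 digit count (2 for both EUR and MGA, as noted above).
by_exponent = Decimal(amount).scaleb(-2)

# Reading 2: divide by the real number of subunits per unit (5 iraimbilanja per ariary).
by_subunit = Decimal(amount) / Decimal(5)

print(by_exponent)  # 0.83
print(by_subunit)   # 16.6
```

Both readings are defensible for MGA, which is why the same integer could yield two different fares.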

@ghost

ghost commented Nov 24, 2020

  • In order to display the fare amount in the actual currency (e.g. EUR instead of 1/100 EUR), data consumers will have to interpret a value from the dataset according to information outside of it. This is feasible as it already exists in GTFS for currency codes and languages, but remains a source of confusion.

Could that be fixed by putting some semantics in the data itself (like language codes) and/or metadata? Just musing

@timMillet
Contributor

Could that be fixed by putting some semantics in the data itself (like language codes) and/or metadata? Just musing

Yes, indeed. However, is it within the scope of GTFS to maintain lists of language and currency codes?
Also, I'd think the issue of interpreting a value from information outside of the dataset is less important for language and currency codes, as no calculation is made directly from them.

@paulswartz
Contributor

https://en.wikipedia.org/wiki/ISO_4217#Unofficial_codes_for_minor_units_of_currency suggests using USX / USc for the US cent, or EUX / EUc for the Euro cent. However, they're non-standard.

I still think the better solution is to define the currency values as decimals and encourage clients to process them as decimals as much as possible.

@scmcca
Contributor Author

scmcca commented Dec 1, 2020

From the conversation above, Options A and C seem to be the more viable paths.

Option A or C:
I very much agree with @barbeau and @paulswartz that the handling of financial calculations needs to be called out specifically, whether by using decimal or money field types or by pointing to specific programming language fields as suggested.

Option C:
If we define a new field type for currency amounts, we would likely want to define additional rules, such as the number of decimal places (depending on the currency) and a requirement that the number of decimal places be retained throughout calculations (i.e., that the value behave like a decimal in calculations). This would be more specific than Decimal and could be called Currency Amount.

Here is my proposal following Option C (updated on 2020-12-02):

  • Currency Amount - A decimal value indicating a currency amount. The number of decimal places is specified by ISO 4217 for the accompanying Currency Code. All financial calculations must be processed as decimal, currency, or another equivalent type. Processing Currency Amounts as float is discouraged due to gains or losses of money during calculations.
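For consumers, one possible reading of this definition is sketched below in Python. The `parse_currency_amount` helper and the `MINOR_UNIT_DIGITS` table are hypothetical and not part of the proposal. The sketch also assumes that a value with fewer decimal places than ISO 4217 lists is acceptable and gets padded, which the definition does not state.

```python
from decimal import Decimal

# Assumed subset of ISO 4217 minor-unit digit counts; a real consumer would load the full table.
MINOR_UNIT_DIGITS = {"USD": 2, "EUR": 2, "JPY": 0}

def parse_currency_amount(text: str, currency: str) -> Decimal:
    """Parse a Currency Amount string, rejecting more decimal places than the currency allows."""
    amount = Decimal(text)
    if not amount.is_finite():
        raise ValueError(f"{text!r} is not a finite amount")
    digits = MINOR_UNIT_DIGITS[currency]
    places = -amount.as_tuple().exponent
    if places > digits:
        raise ValueError(f"{text!r} has more than {digits} decimal places for {currency}")
    # Pad to the currency's number of decimal places without changing the value.
    return amount.quantize(Decimal(1).scaleb(-digits))

fare = parse_currency_amount("1.25", "USD")
total = fare * 3  # stays a Decimal throughout the calculation
print(total)  # 3.75
```

Keeping the value in `Decimal` from parsing through to the final total is what "processed as decimal" would mean under this reading.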

@scmcca
Contributor Author

scmcca commented Dec 11, 2020

This seems to be resolved. The proposal above for Currency Amount has been implemented in the Fares v2 proposal document. If any additional thoughts on expressing currency amounts in GTFS emerge we can reopen this issue. Thanks, everyone!

@scmcca scmcca closed this as completed Dec 11, 2020