Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve plural operand handling #1091

Open
3 tasks
echeran opened this issue Sep 23, 2021 · 12 comments
Open
3 tasks

Improve plural operand handling #1091

echeran opened this issue Sep 23, 2021 · 12 comments
Assignees
Labels
C-pluralrules Component: Plural rules S-medium Size: Less than a week (larger bug fix or enhancement) T-core Type: Required functionality

Comments

@echeran
Copy link
Contributor

echeran commented Sep 23, 2021

In case we're not already doing it, we should make sure that when parsing a string representing a number and/or performing PluralRules::select():

  • we parse both compact notation (1.2c3) and scientific notation (1.2e)
  • distinguish the c operand from the e operand, to future-proof when those behave differently
  • make sure we compute plural operands correctly for numbers with exponents (the examples list from the latest spec makes for good test cases)
@sffc sffc added C-pluralrules Component: Plural rules S-medium Size: Less than a week (larger bug fix or enhancement) T-core Type: Required functionality labels Oct 21, 2021
@sffc sffc added this to the ICU4X 0.5 milestone Oct 21, 2021
@sffc sffc modified the milestones: ICU4X 0.5, ICU4X 0.6 Jan 27, 2022
@sffc sffc assigned eggrobin and unassigned echeran May 25, 2022
@sffc sffc modified the milestones: ICU4X 0.6, ICU4X 1.0 (Features) May 25, 2022
@sffc
Copy link
Member

sffc commented May 25, 2022

@eggrobin Could you take a look at the bullet points in this issue and determine what needs to be done?

@sffc
Copy link
Member

sffc commented Jul 28, 2022

@eggrobin Please take a look at this; thanks!

@echeran
Copy link
Contributor Author

echeran commented Sep 15, 2022

Work to be done here:

  • Update PluralOperands struct
    • Create field for scientific notation exponent (just like c, but called e; see UTS 35 on Plural Rules for background)
  • Update FixedDecimal struct
    • Add field for compact notation exponent (corresponds to PluralOperands::c)
    • Add field for scientific exponent (to match field for compact exponent)
    • Add method to allow adjustment of compact exponent (for when constructing anew; can skip scientific for now)
  • Update parsing & formatting for FixedDecial
    • Can refer to corresponding ICU PR for parse/format methods for compact notation strings
    • The FromStr and Display impls in FixedDecimal should support parsing the 1.23c4 compact notation. (Also check on Debug and other trait impls too)
  • Update FixedDecimal's formatter in icu_decimal
    • Use formatting symbols compact notation
    • Update the data struct to make such symbols data available to the formatter
  • Update PluralRules, etc. functionality
    • Update conversion to recognize compact and scientific notation exponent fields in impl From<&'_ FixedDecimal> for PluralOperands
    • Update PluralRules::category_for() method to use compact notation exponent field

@sffc
Copy link
Member

sffc commented Sep 15, 2022

Discussion:

  • We're not fully sure whether we need both c and e in PluralOperands
  • To punt this problem, make PluralOperands be non_exhaustive and add a function PluralOperands::from_ivwftc() for 1.0. Eat the ergonomics hit for now. We can make it exhaustive again in the future once we are more confident.
  • Punt the rest of this ticket to 1.x

@zbraniecki
Copy link
Member

PluralOperands::from_ivwftc()

Can we make sure to have this as experimental? This is an ugly API and I'd like to get rid of it ASAP

Alternatively, could we have a struct called IVWFTC that PluralOperands can be From? No other methods needed, the new struct would be marked as experimental.

@sffc
Copy link
Member

sffc commented Sep 15, 2022

The other option is we just change all call sites to do Default::default() and set the fields they need. Would that be more palatable?

@sffc sffc added the needs-approval One or more stakeholders need to approve proposal label Sep 15, 2022
@zbraniecki
Copy link
Member

The other option is we just change all call sites to do Default::default() and set the fields they need. Would that be more palatable?

let po = PluralOperands::from(IVWFTC {
    i: 2,
    v: 1,
    f: 1,
    t: 2,
    c: 0,
});

vs.

let mut po = PluralOperands::default();
po.i = 2;
po.v = 1;
po.f = 1;
po.t = 2;

vs.

let po = PluralOperands::from_ivwftc(2, 1, 1, 2, 0);

My vote is on the first.

@sffc
Copy link
Member

sffc commented Sep 15, 2022

The advantage of option 2 is that we don't need any more APIs or experimental features.

Between 1 and 3, I slightly prefer 3 because it doesn't involve importing an extra type and calling the less-discoverable From/Into impls; you just invoke the function.

@sffc sffc added discuss-priority Discuss at the next ICU4X meeting and removed needs-approval One or more stakeholders need to approve proposal labels Sep 20, 2022
@sffc
Copy link
Member

sffc commented Sep 20, 2022

After discussing with @markusicu and @echeran, the point was brought up that these fields should really not generally be things that users of ICU4X should know about or mutate.

I'm therefore heavily inclined to recommend the following:

  1. Make all of the fields of PluralOperands be private (pub(crate))
  2. Add from_ivwftc as a doc(hidden) constructor
  3. Revisit this in the future

@zbraniecki
Copy link
Member

My vote is 1 > 3 >> 2 for reasons stated above.

@sffc
Copy link
Member

sffc commented Sep 22, 2022

  • @zbraniecki - It seems useful to allow people to mutate their plural operands.
  • @sffc - Some of the operands, like v, w, f, and t, are not independent. It doesn't make sense to change one without changing the others.
  • @zbraniecki - My intuition is to just make it non_exhaustive.
  • @markusicu - It needs to at least be non_exhaustive. It needs to be able to evolve as the spec evolves. But I've been bitten in the past by making public APIs that were not actually useful. I'd rather users have constructors to make these things according to CLDR rules rather than fiddling with the details. But if there's a use case, it's fine to make it public.
  • @zbraniecki - With the AST and parser being public, we allow people to customize their plurals engine. It's weird if we make some parts public and others not.
  • @markusicu - I didn't know that all the other parts were public. Do they need to be?
  • @zbraniecki - They are doc(hidden). They don't need to be public in ICU4X, but it seems to make sense to have a separate crate with different stability rules. There are customers who may need this.
  • @markusicu - My instinct, again, is to just not make public what users don't need.
  • @echeran - What I put in the PR is a public constructor to generate the plural operands, which gives room for, e.g., asserting invariants. In the case when we evolve the PluralOperands struct to, e.g., say that c and e are different operands, we may want to assert that they aren't nonzero at the same time.
  • @sffc - I've always been a bit concerned with the reference AST being part of our public API
  • @zbraniecki - I want to have a discussion on the long-term solution so that we don't preclude that possibilty in the future.
  • @sffc - This sounds like a good use case for experimental in 1.0

@sffc
Copy link
Member

sffc commented Sep 22, 2022

What to do next:

  • Make icu_plurals::rules experimental
  • Add a type icu_plurals::rules::RawPluralOperands, which is exhaustive with public fields
  • Add conversions between PluralOperands and RawPluralOperands
  • Update code that needs to build PluralOperands directly to use RawPluralOperands instead
  • Drop the PluralOperands::n() method; perhaps move to RawPluralOperands

LGTM: @sffc @zbraniecki @markusicu @echeran

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-pluralrules Component: Plural rules S-medium Size: Less than a week (larger bug fix or enhancement) T-core Type: Required functionality
Projects
None yet
Development

No branches or pull requests

4 participants