Design Doc: Vertical Fallback in Data Loading #1462

Closed
wants to merge 4 commits into from

sffc
Member

@sffc sffc commented Dec 31, 2021

Fixes #1203

I've been promising a design doc for the vertical fallback implementation, a follow-up to Mihai's doc from earlier in the quarter about the theory.

The doc assumes that the reader is already familiar with the subject matter; I do not spend much time clarifying basic concepts.

In the doc, I state some claims and assumptions. At the end, I lay out four possible implementation approaches:

  1. Complete Runtime Vertical Fallback
  2. Simple Vertical Fallback with Minimized Locales
  3. Simple Vertical Fallback with Manually Maximized Locales
  4. Simple Vertical Fallback with Automatically Maximized Locales

I claim that all of these approaches achieve the same end result, each with its own caveats. I'd appreciate feedback on these four approaches, but please keep it focused on the technical merits.

@sffc sffc requested a review from a team as a code owner December 31, 2021 07:22
@sffc
Member Author

sffc commented Dec 31, 2021

CC @richgillam

@richgillam richgillam left a comment

I commented on this in the ICU-TC meeting yesterday, and I don't really have anything new to say. I wanted to reread this document to make sure I didn't miss anything, and I think I understand what you're proposing a lot better now that I've read it a second time, and I think I like what you're doing here.

As I mentioned, I'm interested in seeing what you do about horizontal inheritance, but I know that's a separate problem...


Vertical fallbacking should follow a tree: every locale should have exactly one parent, except the root locale, which has no parent.

In the absence of overrides, a locale's parent is the locale with one subtag removed from the end of a locale: `pr-Latn-PT` → `pt-Latn` → `pt` → `und`.

Typo: "pt-Latn-PT".
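The default parent rule quoted above (drop the last subtag, ending at `und`) can be illustrated with a minimal sketch. This is not the implementation proposed in the doc: the function names `default_parent` and `fallback_chain` are hypothetical, locales are treated as plain hyphen-separated strings, and parent overrides are deliberately left out.

```python
from typing import List, Optional

ROOT = "und"  # the root locale, which has no parent


def default_parent(locale: str) -> Optional[str]:
    """Return the parent obtained by dropping the last subtag.

    Hypothetical helper: ignores any parent overrides and treats the
    locale as a plain hyphen-separated string.
    """
    if locale == ROOT:
        return None
    head, sep, _last = locale.rpartition("-")
    # A single-subtag locale such as "pt" falls back to the root.
    return head if sep else ROOT


def fallback_chain(locale: str) -> List[str]:
    """Walk from the requested locale up to the root, one parent at a time."""
    chain = []
    current: Optional[str] = locale
    while current is not None:
        chain.append(current)
        current = default_parent(current)
    return chain


print(fallback_chain("pt-Latn-PT"))
```

Because every locale has exactly one parent under this rule, the walk always terminates at `und`, which is the tree property the excerpt asks for.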

**Cons:**

- Requires shipping the likely subtags code and data
- All lookups incur a small but nonempty penalty when maximization occurs

I said my piece in the ICU-TC meeting the other day, but I think I vote for Option 3, with a provision to fall back to Option 2 for implementations that know their locale data doesn't need it and don't want to incur the space penalty for the likely-subtags tables.

Comment on lines +130 to +132
- Footgun: If a non-minimized locale is passed at runtime, it could have the incorrect behavior. For example, `sr-Latn-ME` or `de-Latn-LI` may not find the correct data.
- Struggles to scale when there are many data keys

These cons sound disqualifying to me.
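The footgun in the quoted con can be sketched as follows, assuming data is stored under minimized locale keys and lookup uses simple subtag-removal fallback. Everything here is illustrative: the `DATA` and `MINIMIZED` tables are hand-written stand-ins (not real CLDR data or an ICU4X API), and the function names are hypothetical.

```python
from typing import Dict, List

ROOT = "und"

# Hypothetical data store keyed by minimized locales.
DATA: Dict[str, str] = {
    "und": "root data",
    "de": "German data",
    "de-LI": "German (Liechtenstein) data",
    "sr": "Serbian data",
    "sr-ME": "Serbian (Montenegro) data",
}

# Hand-written stand-in for a likely-subtags minimization step.
MINIMIZED: Dict[str, str] = {
    "de-Latn-LI": "de-LI",
    "sr-Latn-ME": "sr-ME",
}


def fallback_chain(locale: str) -> List[str]:
    """Simple vertical fallback: drop the last subtag until reaching the root."""
    chain = [locale]
    while chain[-1] != ROOT:
        head, sep, _last = chain[-1].rpartition("-")
        chain.append(head if sep else ROOT)
    return chain


def simple_lookup(locale: str, data: Dict[str, str]) -> str:
    """Return the first key in the fallback chain that has data."""
    for candidate in fallback_chain(locale):
        if candidate in data:
            return candidate
    raise KeyError(locale)


def minimized_lookup(locale: str, data: Dict[str, str]) -> str:
    """Minimize the requested locale first, then fall back."""
    return simple_lookup(MINIMIZED.get(locale, locale), data)


print(simple_lookup("de-Latn-LI", DATA))     # chain never visits "de-LI"
print(minimized_lookup("de-Latn-LI", DATA))  # request is normalized first
```

In this sketch the non-minimized request `de-Latn-LI` walks `de-Latn-LI` → `de-Latn` → `de` and never visits the `de-LI` key, so it silently resolves to the generic `de` entry. Normalizing the request before lookup avoids that, at the cost of needing some form of likely-subtags data at runtime.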

@sffc
Member Author

sffc commented Jan 11, 2022

Thank you @richgillam for your review!

This is a friendly ping to all my other reviewers:

  • @dminor as the likely subtags expert
  • @zbraniecki as the author of the locale class
  • @mihnita as the author of the language negotiation doc
  • @nciric for visibility into size-constrained use cases
  • @markusicu as the primary implementer of this logic in ICU4C

Contributor

@dminor dminor left a comment

This all seems reasonable to me. I agree with Rich's comment: option 3, with a way to disable maximization for users that don't want it, seems like a good way ahead.

@sffc
Member Author

sffc commented Jan 27, 2022

I will prepare an update to this PR within the next week and then re-request review.

@sffc sffc added the waiting-on-author label (PRs waiting for action from the author for >7 days) Apr 15, 2022
@sffc
Member Author

sffc commented Apr 15, 2022

This PR is obsolete so I will close it and replace it with a new one later.

@sffc sffc closed this Apr 15, 2022
Development

Successfully merging this pull request may close these issues.

Document design choices for vertical fallback and deduplication in data loading
3 participants