-
Notifications
You must be signed in to change notification settings - Fork 97
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ICU4x number formatting #303
base: main
Are you sure you want to change the base?
ICU4x number formatting #303
Conversation
753f77b
to
43e798c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thought: I'm not done reviewing this, but figured I would leave some early feedback. The implementation is looking pretty good in my initial review. I still need to complete a more thorough one.
This is going to take some careful consideration to merge since it can impact binary sizes. In addition this requires the users to have a data provider, which could be pretty significant. I'm wondering if this feature might need to start out as a trait.
Firefox is really sensitive to binary size increases, so I wouldn't feel comfortable merging until I knew we could manage the dependency changes and the additional data requirements. If Firefox has these issues, then it's possible other implementations will as well.
There are essentially 2 places new binary size can increase here, the dependency executable code, and in the data in the data provider. It would be good to measure both of these. I wonder if you just shipped with an en-001
locale (or equivalent) as a single locale, if it would be the equivalent behavior as what's going on now in Firefox and the fluent-rs project.
(I'm not the original author of this code, so I'd have to read through the code to fully know the answer here.)
Do you have any thoughts in this space @JasperDeSutter?
Yeah I agree the increased binary size is the probably the main concern for incorporating ICU components. The way I've implemented it currently allows for optionally providing the data as a user, falling back to basic number formatting when no provider is available. As such there shouldn't be a size increase from this part for consumers that don't want it. The formatting code and supporting ICU libraries are unconditionally included though. I'm sure they have a sizable impact on binary size and compile time seeing the number of crates pulled in, I'll take a look into profiling binary sizes. We could use the same feature flag for conditionally depending on these libraries as for including the data. Note that if we would want to also use ICU4X for plural rules or locale fallback, the story changes. Since these are required for the working of fluent message formatting we couldn't just put icu libraries behind a feature flag anymore. |
I compared the average binary size of all examples in fluent-bundle: (stripped release build) 605KB - without decimal formatting (main branch): 100% To me it looks like a good idea to not include data, but I'm not sure whether a feature flag is warranted for the formatting code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To me it looks like a good idea to not include data, but I'm not sure whether a feature flag is warranted for the formatting code.
I agree. 38kb is not too bad in terms of binary size increase. Thanks for quantifying it! That's really helpful. Firefox plans on eventually integrating ICU4X, but there's not a concrete written plan yet. The binary increase seems acceptable, and eventually Firefox will have its own data provider, and will control that data cost.
I'd like to get some feedback with other Mozillians in this space, and we're hitting the holidays where people are out.
I think the next steps here would be to add the trait in just for the data, and make sure the ICU4X bits are well documented with suggestions on the proper way to integrate the data. I might run this by for feedback from the ICU4X team after soliciting feedback from other Mozillians.
I'm marking request changes since there's actionable feedback already, feel free to re-request review if you would like a more thorough code review at this point before adding the trait in.
Also to be clear here, I'd like to wait until after the holidays at least before merging since people are probably out since this affects the architecture of this project and Firefox. I think any code updates and reviews can happen before then, but I just wanted to set expectations. |
I'm closing as stale since there were requested changes. Feel free re-open. |
43e798c
to
55f6483
Compare
any chance we could get a review on this? |
@fee1-dead Sure, go ahead and review. Test results welcome. Only half jesting, if you want to help move things along the best way to do that is contribute by reviewing and testing yourself and reporting results, not nagging other volunteers. Also be aware that because this introduces a bunch of new dependencies we'll probably have to hold off on merging until we can base it on or at least compare it with what is happening in #335. This work should jive pretty well with that, but it would be hard to give it a blank yes and merge before we have a solid idea how other ICU4X integration is going to work. |
15f5d13
to
5812fda
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
@fee1-dead Did you review this in context with other ICU4x related work such as in #335 and the discussions in a number of related issues. We could use help reviewing and sorting out all the moving pieces, but just saying this one PR "lgtm" in isolation doesn't move the ball forward much. |
if I actually do get time I can review the other PR, but at a first glance this PR is isolated enough (only uses icu4x for number formatting) I don't think it would be very probable for a conflict with that effort to use icu4x in more places. |
Adds basic decimal formatting for FluentNumber.as_string(). Many features are still missing like currency symbols and percentage formatting. #269
Since data providers need to be managed explicitly, it needs to be handed to the memoizer to be able to construct a formatter. This required a new argument + associated type on Memoizable.
Something to figure out is where the data comes from. I've went with a simple FluentBundle.set_number_format_provider, which seems most flexible to me. I've added a test that uses icu_testdata for this, but other providers should work as well.
https://icu4x.unicode.org/doc/icu/#data-management