Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reducing memory footprint by reusing Intl.DateTimeFormat instances #7

Closed
fer22f opened this issue Jul 8, 2021 · 4 comments · Fixed by #10
Closed

Reducing memory footprint by reusing Intl.DateTimeFormat instances #7

fer22f opened this issue Jul 8, 2021 · 4 comments · Fixed by #10

Comments

@fer22f
Copy link
Contributor

fer22f commented Jul 8, 2021

Hello. I was going to post this on the original proposal-temporal repo, but since the goal there wasn't about making a polyfill for production use, I decided not to. However, since this polyfill has that goal, it seems like a relevant issue.

When using the polyfill, I ran into a deno memory issue (which I reported in denoland/deno#10721). The root problem is that in the polyfill, there are a lot of new Intl.DateTimeFormat calls, objects which V8 apparently doesn't clean up as well as it could. After some investigation, I found out node is affected too, albeit some orders of magnitude less.

The issue can be demonstrated in Temporal as follows:

import { Temporal } from '@js-temporal/polyfill';
const { ZonedDateTime } = Temporal;

setInterval(() => {
  var a;
  for (let i = 0; i < 100; i++) {
    a = ZonedDateTime.from("2021-01-01T00:00:00-03:00[America/Sao_Paulo]");
    a = Temporal.now.zonedDateTimeISO().toString();
  }
}, 1);

Which ranks at ~567M resident memory in node, and ~1033M in (newer) deno.

My solution to this problem originally, was to fork the repo for my personal use and implement a cache for these objects. A Map caches IntlDateTimeFormats at every site they are required, as such:

// ecmascript.mjs
const IntlDateTimeFormatEnUsCache = new Map();

// ...
  GetCanonicalTimeZoneIdentifier: (timeZoneIdentifier) => {
    // ...
    if (!IntlDateTimeFormatEnUsCache.has(String(timeZoneIdentifier))) {
      IntlDateTimeFormatEnUsCache.set(String(timeZoneIdentifier), new IntlDateTimeFormat('en-us', {
        timeZone: String(timeZoneIdentifier), hour12: false, era: 'short',
        year: 'numeric', month: 'numeric', day: 'numeric',
        hour: 'numeric', minute: 'numeric', second: 'numeric'
      }));
    }
    const formatter = IntlDateTimeFormatEnUsCache.get(String(timeZoneIdentifier));

Node's memory footprint in that same code above went to ~100K after this change. Deno scored at ~239M (which is about the floor memory usage of deno currently).

This is an issue that V8 could fix, but it can also be done at the library level.

justingrant added a commit to justingrant/proposal-temporal that referenced this issue Jul 9, 2021
I was wrong about what was making non-ISO calendars so slow. I
thought the problem was `formatToParts()`, but it turns out that the
`DateTimeFormat` constructor is really slow and also allocates
ridiculous amounts of RAM. See more details here:
https://bugs.chromium.org/p/v8/issues/detail?id=6528

@littledan in https://bugs.chromium.org/p/v8/issues/detail?id=6528#c4
recommended to cache DateTimeFormat instances, so that's what this
commit does.

The result is a 6x speedup in non-ISO calendar tests.
Before: 6398.83ms
After: 1062.26ms

A similar speedup is likely for `ES.GetCanonicalTimeZoneIdentifier`.
Caching time zone canonicalization (in a separate PR) should have
a big positive impact on ZonedDateTIme and TimeZone perf.

Many thanks to @fer22f for uncovering this optimization in
js-temporal/temporal-polyfill#7.
@justingrant
Copy link
Contributor

justingrant commented Jul 9, 2021

@fer22f - Really good detective work to isolate this problem to Intl.DateTimeFormat! Thanks so much for doing this investigation. The underlying issue looks like https://bugs.chromium.org/p/v8/issues/detail?id=6528 which was resolved Won't Fix 4 years ago. The TL;DR from that 2017 issue is that new DateTimeFormat() is a really expensive operation in both time and memory. @sffc, @ryzokuken, @littledan - is this expected behavior? Seems pretty excessive.

@fer22f - The workaround that @littledan proposed is exactly what you're suggesting doing above: cache instances of DateTimeFormat.

The same optimization also has a huge positive impact on non-ISO calendar performance: those tests run about 10x faster. See #8.

intl.mjs also uses DateTimeFormat, but making that more efficient probably requires a different approach. See #9.

@fer22f - Would you be interested in making a PR to optimize GetCanonicalTimeZoneIdentifier? PRs are always welcome! Note that GetFormatterParts should probably share the same cache. EDIT: note that SystemTimeZone cannot be optimized, because that would break the ability to recognize dynamic changes in system time zone, as you'd see on a mobile device when an airplane lands in a new country.

@fer22f
Copy link
Contributor Author

fer22f commented Jul 9, 2021

Sure, I have some code ready. Mostly https://github.com/fer22f/proposal-temporal/commit/5d3b36e6d1d6fb7e7eb0a83d75467dfd2f3beb44 and https://github.com/fer22f/proposal-temporal/commit/64611a1cda8073e251cd686e5876c165466c26c1. I think adding era: short to both is probably fine, right?

The fact that we can't optimize SystemTimeZone is pretty sad, since it's something that is called every time you call Temporal.now.zonedDateTimeISO()...

But I think it's probably okay, and perhaps something that can be documented (?). I just did a search in my codebase for it, and I have 0 calls to Temporal.now.zonedDateTimeISO(). I mostly use Instant and TimeZone.getInstantFor, so if I use a specific TimeZone to construct all the dates - which I have to do anyway in server-side code since the timezone there is usually wrong - I think the optimization still applies.

@justingrant
Copy link
Contributor

BTW, I ran some quick tests and it looks like this optimization will at least double the performance of Temporal's test cases, so the improvement in ZonedDateTime and TimeZone perf must be pretty massive given that those are only a small fraction of our tests. Nice!

adding era: short to both is probably fine, right?

Yep!

Mostly fer22f/proposal-temporal@5d3b36e

Looks OK. Consider refactoring common code into one method. Minor thing: I'd suggest optimizing for the already-cached case: calling get and caching if undefined is returned, instead of calling has every time when that call may not be needed. (Unless there's some advantage of calling has that I"m not aware of.)

if I use a specific TimeZone to construct all the dates - which I have to do anyway in server-side code since the timezone there is usually wrong - I think the optimization still applies.

This is correct.

ptomato pushed a commit to justingrant/proposal-temporal that referenced this issue Jul 9, 2021
I was wrong about what was making non-ISO calendars so slow. I
thought the problem was `formatToParts()`, but it turns out that the
`DateTimeFormat` constructor is really slow and also allocates
ridiculous amounts of RAM. See more details here:
https://bugs.chromium.org/p/v8/issues/detail?id=6528

@littledan in https://bugs.chromium.org/p/v8/issues/detail?id=6528#c4
recommended to cache DateTimeFormat instances, so that's what this
commit does.

The result is a 6x speedup in non-ISO calendar tests.
Before: 6398.83ms
After: 1062.26ms

A similar speedup is likely for `ES.GetCanonicalTimeZoneIdentifier`.
Caching time zone canonicalization (in a separate PR) should have
a big positive impact on ZonedDateTIme and TimeZone perf.

Many thanks to @fer22f for uncovering this optimization in
js-temporal/temporal-polyfill#7.
ptomato pushed a commit to tc39/proposal-temporal that referenced this issue Jul 9, 2021
I was wrong about what was making non-ISO calendars so slow. I
thought the problem was `formatToParts()`, but it turns out that the
`DateTimeFormat` constructor is really slow and also allocates
ridiculous amounts of RAM. See more details here:
https://bugs.chromium.org/p/v8/issues/detail?id=6528

@littledan in https://bugs.chromium.org/p/v8/issues/detail?id=6528#c4
recommended to cache DateTimeFormat instances, so that's what this
commit does.

The result is a 6x speedup in non-ISO calendar tests.
Before: 6398.83ms
After: 1062.26ms

A similar speedup is likely for `ES.GetCanonicalTimeZoneIdentifier`.
Caching time zone canonicalization (in a separate PR) should have
a big positive impact on ZonedDateTIme and TimeZone perf.

Many thanks to @fer22f for uncovering this optimization in
js-temporal/temporal-polyfill#7.
@sffc
Copy link
Contributor

sffc commented Jul 14, 2021

@fer22f - Really good detective work to isolate this problem to Intl.DateTimeFormat! Thanks so much for doing this investigation. The underlying issue looks like https://bugs.chromium.org/p/v8/issues/detail?id=6528 which was resolved Won't Fix 4 years ago. The TL;DR from that 2017 issue is that new DateTimeFormat() is a really expensive operation in both time and memory. @sffc, @ryzokuken, @littledan - is this expected behavior? Seems pretty excessive.

new DateTimeFormat() being an expensive operation is a long-time known issue in ICU that just hasn't been urgent enough to fix, since people tend to all build caches around ICU. (ICU4X intends to actually fix the issue.)

https://unicode-org.atlassian.net/browse/ICU-20115

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants