Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: reduce the latency of downloading pdf.worker.min.js #39

Closed
Tracked by #38
xiaohanyu opened this issue Apr 24, 2024 · 2 comments
Closed
Tracked by #38

perf: reduce the latency of downloading pdf.worker.min.js #39

xiaohanyu opened this issue Apr 24, 2024 · 2 comments
Assignees
Labels
frontend Frontend related perf Performance related
Milestone

Comments

@xiaohanyu
Copy link
Member

xiaohanyu commented Apr 24, 2024

Description

PPResume use react-pdf-viewer for viewing PDF, however, it has a serious bug, when the encoded PDF (by base64) changed, it will re-init its worker and one of its dependency pdf.worker.min.js would be re-downloaded.

In the whole session of resume composing, pdf.worker.min.js would be re-downloaded tens of hundreds of times, which is both bad for performance and bandwidth.

Evidence:

image

[Optional] Possible Solutions

  • cache pdf.worker.min.js in browser cache and avoid re-download because it doesn't change

Acceptance Criteria

  • pdf.worker.min.js should be downloaded for only once from remote CDN and later download request should be cached from browser local cache

Todo list

NA

@xiaohanyu xiaohanyu self-assigned this Apr 24, 2024
@xiaohanyu xiaohanyu added frontend Frontend related perf Performance related labels Apr 24, 2024
@xiaohanyu xiaohanyu modified the milestones: M-0.4, M-0.5 Apr 24, 2024
@xiaohanyu
Copy link
Member Author

We did a spike today and found that PWA service worker can mitigate this issue. By caching pdf.worker.js with PWA's cache, the download time for pdf.worker.js could be reduced from several hundreds ms to several ms.

Evidence:

image
image

Code skeleton looks something like this:

// eslint-disable-next-line no-undef
self.addEventListener('install', () => {
  console.debug('service worker installed')
})

// eslint-disable-next-line no-undef
self.addEventListener('activate', () => {
  console.debug('service worker activated')
})

// eslint-disable-next-line no-undef
self.addEventListener('fetch', function (e) {
  const CACHE_NAME = 'cache-v1'
  console.log('fetch', e.request.url)
  e.respondWith(
    (async function () {
      // eslint-disable-next-line no-undef
      const cachedResponse = await caches.match(e.request)
      if (cachedResponse) {
        console.log(
          'return from cache for response with request.url: ',
          e.request.url
        )
        return cachedResponse
      }

      const networkResponse = await fetch(e.request)

      const hosts = ['https://unpkg.com']

      if (hosts.some((host) => e.request.url.startsWith(host))) {
        // Cloning the response is necessary because request and response
        // streams can only be read once. In order to return the response to the
        // browser and put it in the cache we have to clone it. So the original
        // gets returned to the browser and the clone gets sent to the cache.
        // They are each read once.
        //
        // This clone() happens before `return networkResponse`
        const clonedResponse = networkResponse.clone()

        e.waitUntil(
          (async function () {
            // eslint-disable-next-line no-undef
            const cache = await caches.open(CACHE_NAME)
            // This will be called after `return networkResponse`
            // so make sure you already have the clone!
            console.log('set new cache')
            await cache.put(e.request, clonedResponse)
          })()
        )
      }

      return networkResponse
    })()
  )
})

Ref implementation:

@xiaohanyu
Copy link
Member Author

xiaohanyu commented May 7, 2024

As mentioned in above, we can mitigate the pain by adopting service worker to cache pdf.worker.min.js and reduce the latency from several hundreds of ms to less than 10 ms, which could drastically improve the performance.

In the end I didn't use code shown in this comment, however, the problem of the original code is, by default it is pretty dangerous to cache a opaque response since it may lead to corrupted PWA app permanently.

Instead, I adopted serwist for generating a service worker file, and self hosted pdf.js (in order to make the request to pdf.worker.min.js same origin, ah, non-opaque response to be service worker cache friendly).

So after all the optimization, the latency to download pdf.worker.min.js is reduced to less than 5 ms, evidence:

image image

For now, this should be enough for us.

Ideally, react-pdf-viewer's <Worker> should not reload many times when fileUrl changed, I guess this need a fix to react-pdf-viewer, and react's memoize won't work here.

@xiaohanyu xiaohanyu changed the title perf: avoid download pdf.worker.min.js for multiple times perf: reduce the latency of downloading pdf.worker.min.js May 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
frontend Frontend related perf Performance related
Projects
None yet
Development

No branches or pull requests

1 participant