Import workers from url #453

Merged
merged 4 commits into from
Oct 1, 2019

Conversation

Pessimistress
Collaborator

@Pessimistress Pessimistress commented Sep 27, 2019

Currently the *WorkerLoaders embed worker source as strings. This approach introduced the following issues:

  • Re-distributing external modules - some loaders contain large external components such as Draco and LAZ. These components do not change with each loaders.gl version but we need to re-bundle them as part of the worker.
  • Bundle size - large workers inflate application bundles significantly. Users do not have the choice to split the worker from the app bundle.
  • Duplication - When a loader module's standalone bundle is created, the large components are included twice, once in the main thread version, once in the worker as an inline string.

This PR is the first step towards addressing these issues:

  • The worker source is separated from the *WorkerLoaders and by default loaded from CDN. It significantly reduces the footprint of the worker loaders.
  • It provides the option for users to host the worker as a static file on their own server.
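
A minimal sketch of the resulting shape, assuming a loader object that carries a default `workerUrl` which a caller can override. Only the `worker`, `defaultOptions` and `workerUrl` fields and the unpkg URL pattern come from the diff in this PR; the `VERSION` constant, the `resolveWorkerUrl` helper and the `example.com` URL are hypothetical and only illustrate the idea.

```javascript
// Hypothetical stand-in for the version string injected at build time.
const VERSION = '1.0.0';

// The loader no longer embeds the worker source as a string;
// it only carries a URL that points at the worker script.
const ArrowWorkerLoader = {
  worker: true,
  defaultOptions: {
    workerUrl: `https://unpkg.com/@loaders.gl/arrow@${VERSION}/dist/arrow-loader.worker.js`
  }
};

// Hypothetical helper: a caller-supplied workerUrl wins over the CDN default.
function resolveWorkerUrl(loader, options = {}) {
  return options.workerUrl || loader.defaultOptions.workerUrl;
}

// Default: load the worker from the CDN.
console.log(resolveWorkerUrl(ArrowWorkerLoader));

// Override: load the worker from a self-hosted static file (placeholder URL).
console.log(
  resolveWorkerUrl(ArrowWorkerLoader, {
    workerUrl: 'https://example.com/static/arrow-loader.worker.js'
  })
);
```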

@coveralls

coveralls commented Sep 27, 2019

Coverage Status

Coverage remained the same at 56.741% when pulling a5dcfc0 on x/worker-url into 5a426ff on master.

@Pessimistress Pessimistress changed the title [POC] import worker from url Import workers from url Sep 27, 2019
Collaborator

@ibgreen ibgreen left a comment

This looks great.

I am wondering if we should skip the Loader/WorkerLoader separation in v2?

A related thing that bothers me:

It would be great if we could support bundling the workers via an optional import path:

import {dracoWorker} from '@loaders.gl/draco/draco-worker-loader';
import {DracoLoader} from '@loaders.gl/draco';
import {registerWorkers} from '@loaders.gl/core';

registerWorkers({
  DracoLoader: dracoWorker
});

To handle dist/es* we could add a 'src/draco-worker-loader':

import '../draco-worker-loader';

But that doesn't work with src: because we have two-level dist folders (dist/esm) and one-level src folders, it is not easy to resolve @loaders.gl/draco/draco-worker-loader in all cases.
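
One way the proposed registration mechanism could behave is sketched below. `registerWorkers` is the name suggested in this comment, not an existing API; the registry, the `getWorkerSource` helper, the placeholder worker string and the `example.com` URL are all assumptions made for illustration.

```javascript
// Hypothetical registry mapping a loader name to a bundled worker source.
const workerRegistry = new Map();

// Sketch of the proposed registerWorkers(): keys are loader names,
// values are whatever the optional worker import would export.
function registerWorkers(workers) {
  for (const [loaderName, workerSource] of Object.entries(workers)) {
    workerRegistry.set(loaderName, workerSource);
  }
}

// A registered (bundled) worker takes precedence;
// otherwise fall back to the URL carried by the loader.
function getWorkerSource(loaderName, loader) {
  if (workerRegistry.has(loaderName)) {
    return {type: 'bundled', source: workerRegistry.get(loaderName)};
  }
  return {type: 'url', source: loader.defaultOptions.workerUrl};
}

// Usage with placeholder values.
const DracoLoader = {
  defaultOptions: {workerUrl: 'https://example.com/draco-loader.worker.js'}
};

console.log(getWorkerSource('DracoLoader', DracoLoader).type); // prints "url"

registerWorkers({DracoLoader: 'self.onmessage = () => {};'});
console.log(getWorkerSource('DracoLoader', DracoLoader).type); // prints "bundled"
```

With this shape, applications that never call `registerWorkers` keep the URL-based default, and applications that do call it avoid the CDN.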

@@ -56,7 +56,9 @@ test('ArrowLoader#parse (WORKER)', async t => {
     return;
   }

-  const data = await parse(fetchFile(ARROW_SIMPLE), ArrowWorkerLoader);
+  const data = await parse(fetchFile(ARROW_SIMPLE), ArrowWorkerLoader, {
+    workerUrl: 'modules/arrow/dist/arrow-loader.worker.js'
Collaborator

Nit: Add comment explaining why we override workerUrl. (I assume it is because we don't want to be CDN dependent in unit tests).

Collaborator Author

If we fix a bug in the loader we need to test the local version, not the one already published to cdn.

worker
worker: true,
defaultOptions: {
workerUrl: `https://unpkg.com/@loaders.gl/arrow@${__VERSION__}/dist/arrow-loader.worker.js`
Collaborator

Idea: Looks like these strings could be auto-deduced inside core based on a loader.id or loader.package field?

That way we could contain the proliferation of __VERSION__ and just handle it in core.
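
A rough sketch of what such auto-deduction might look like, assuming each loader exposes `id` and `package` fields as floated above. The `deduceWorkerUrl` helper is hypothetical, and the URL pattern is simply copied from the arrow example in the diff; other modules may not follow it.

```javascript
// Hypothetical: build the CDN URL in one place from loader metadata,
// so that individual loaders do not each need to reference the version.
function deduceWorkerUrl(loader, version, options = {}) {
  // An explicit workerUrl still wins, so self-hosting remains possible.
  if (options.workerUrl) {
    return options.workerUrl;
  }
  return `https://unpkg.com/@loaders.gl/${loader.package}@${version}/dist/${loader.id}-loader.worker.js`;
}

// Illustrative metadata and version.
const arrowLoader = {id: 'arrow', package: 'arrow'};
console.log(deduceWorkerUrl(arrowLoader, '1.0.0'));
```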

Collaborator Author

This way it is possible for users to pass in a custom url pointing to their own server. I actually think this logic is easier to read than putting some complex resolution logic into the core.

Collaborator

👍 completely agree that code should be optimized for clarity, and I do like the fact that this makes things so explicit.

However, still worth a discussion:

  • I do find injected constants confusing (it can take a reader quite a while to understand where __VERSION__ is coming from). A one-line comment above each injection could solve that.
  • I usually end up having to deal with side effects from such injections in certain configurations. You already handle the test case via explicit overrides, but how about the start-local case?
  • What will __VERSION__ be in that case, and is there a way for us to still override it in core without having more structural understanding of how these urls are formed?

Collaborator Author

Added comment and fixed start-local (inject __VERSION__: 'latest').

We could come up with smarter ways to deal with the local test scenario in the test harness. I prefer not to do it inside the core to avoid unintended implicit behavior.
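
For illustration only, the effect of injecting `'latest'` can be approximated with a runtime fallback. This PR injects the value through the start-local build configuration instead; the guard below is an assumption, not the code that was merged.

```javascript
// __VERSION__ is normally replaced with a string literal by the build tooling.
// If nothing injected it (e.g. an unbundled local run), fall back to 'latest'.
const VERSION = typeof __VERSION__ !== 'undefined' ? __VERSION__ : 'latest';

// unpkg resolves dist-tags such as "latest" in place of an exact version.
const workerUrl = `https://unpkg.com/@loaders.gl/arrow@${VERSION}/dist/arrow-loader.worker.js`;

console.log(workerUrl);
```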

@ibgreen
Collaborator

ibgreen commented Sep 28, 2019

edited my comment for readability

@xintongxia

What is the benefit to import worker from url?

@ibgreen
Collaborator

ibgreen commented Sep 28, 2019

@georgios-uber This approach should allow you to load basis from a CDN.

@Pessimistress
Collaborator Author

What is the benefit to import worker from url?

Updated PR summary.

@Pessimistress
Collaborator Author

It would be great if we could support bundling the workers via an optional import path

@ibgreen After this PR the worker loader is so light that I don't think it matters even if tree shaking does not work...

@ibgreen
Collaborator

ibgreen commented Sep 29, 2019

It would be great if we could support bundling the workers via an optional import path

After this PR the worker loader is so light that I don't think it matters even if tree shaking does not work...

Yes absolutely, once workers are dynamically loaded they should be available by default. I even think we should include the workerUrl field by default in the main ...Loaders, and just remove the WorkerLoader exports.

I made the comment based on the assumption that we still want to support applications that do not want to rely on the potentially flaky `https://unpkg.com` CDN, and would like to bundle their loaders.

By providing an optionally importable worker and a registration mechanism to attach it to the Loader (replacing the CDN url), we'd be able to offer the best of both worlds.

Are you of a mind that CDN support is sufficient?

@Pessimistress
Collaborator Author

Are you of a mind that CDN support is sufficient?

If a user does not want to use the CDN, the recommendation would be to copy the worker source (included in dist/) to their own server as a static asset. Either way the worker would not be bundled with the main app (not desirable anyway for the large loaders).

@ibgreen
Collaborator

ibgreen commented Sep 30, 2019

If a user does not want to use the CDN, the recommendation would be to copy the worker source (included in dist/) to their own server as a static asset. Either way the worker would not be bundled with the main app (not desirable anyway for the large loaders).

I feel that we are pushing some burden onto app developers by not having an intermediary option, but I agree that we can take that approach, at least for larger workers:

  • Assuming solid documentation is provided around this.
  • It might also require some version checking to make sure we can detect when the user's manually copied loaders have gone stale.
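
A version check of this kind could be sketched as follows. It assumes the worker script can report the version it was built from and that matching major and minor numbers count as compatible; both the `checkWorkerVersion` helper and that policy are hypothetical.

```javascript
// Hypothetical compatibility rule: worker and loader must share major.minor.
// 'latest' (used for local development) is always accepted.
function checkWorkerVersion(loaderVersion, workerVersion) {
  if (loaderVersion === 'latest' || workerVersion === 'latest') {
    return true;
  }
  const [loaderMajor, loaderMinor] = loaderVersion.split('.').map(Number);
  const [workerMajor, workerMinor] = workerVersion.split('.').map(Number);
  return loaderMajor === workerMajor && loaderMinor === workerMinor;
}

// A manually copied worker that is one minor release behind is flagged as stale.
if (!checkWorkerVersion('1.3.0', '1.2.4')) {
  console.warn('Worker script is out of date; re-copy it from dist/.');
}
```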

In addition, the worker loaders are only half the story. We have large amounts of code in our non-worker loaders, and as long as we are including those we still have an issue with loader size:

  • We can rely on tree-shaking to some extent,
  • but we'll still slow down debug builds by importing huge loaders.
  • We'll have to make sure the scripts don't bundle the non-workers.
  • etc.

Option: Dropping non-worker loaders completely

  • We'd need to implement worker support under Node.js
  • I still think the non-worker loaders have a value, especially for debugging.

@Pessimistress
Collaborator Author

Discussed offline - merging for now, will address in follow-up PRs:

  • Documentation of different worker use cases
  • Worker version compatibility check

@Pessimistress Pessimistress merged commit b34f968 into master Oct 1, 2019
@Pessimistress Pessimistress deleted the x/worker-url branch October 1, 2019 21:57
4 participants