Module resolution #19

Closed
Tracked by #11
josephjclark opened this issue Sep 21, 2022 · 2 comments
Labels
design Design issues or discussion enhancement New feature or request

Comments

@josephjclark
Collaborator

A design ticket to help me straighten out some spaghetti in my head.

We have a requirement to dynamically load the dependencies of a job (ie, the adaptors it relies on, but maybe also more generally any other npm packages it may import).

For example, a job will often include the line import { fn } from '@openfn/language-common'. But where do we actually load @openfn/language-common from?

Because the same job may run in various places, and because different tools need to analyse this code, the resolution of this path is quite complicated.

Clients

So who needs to resolve the module @openfn/language-common?

  • To auto-insert import statements into legacy (and modern?) jobs, the compiler needs to look up the package.json and type declaration file
  • (Actually the lookup needs to run before the compiler runs, so the CLI needs to load the module's types and pass them into the compiler)
  • To execute the job, the runtime's linker needs to load the module's actual javascript.
  • To develop the runtime framework, a language adaptor or a job file, a developer may then need to override the runtime path to the module, typically to use a local version.
  • The runtime manager may also have to download/install the module from a central repository (probably unpkg). I suppose in this case we only need the specifier and version.

Also note that Lightning's code editing tools need to know what modules are imported to provide code assist and intelligence. This in practice builds on the compiler, but Lightning itself may want to pre-load module definitions or documentation for its own tooling.

So at a high level, a runtime manager (devtools or lightning) needs to be able to specify the path to a module, and the runtime and compiler both need to respect whatever path is provided.

Oh, yeah, and we need to worry about versioning as well.

Module loading rules

We can potentially offer a number of ways to map a specifier to a module location, most of them oriented around the devtools CLI.

Here are the options to resolve @openfn/x (without versioning):

  • Explicit mapping: the module may be explicitly mapped to a location on the local system
  • Modules home dir: pass a directory to use like node_modules. Modules will be resolved here first. Ie, load @openfn/x from ${MODULES_HOME}/@openfn/x. The idea here is that a developer creates a local folder with their own adaptors in it, sets this as an env var, and so automatically loads local language adaptors. Difficulty: if the module is @openfn/x, then in the modules home folder you need to have an @openfn dir and then an x dir inside it. Which is fine but quite an awkward setup.
  • Monorepo home: set the path to the language adaptor monorepo (when it exists) as an env var and we'll load adaptors straight out of there. The loader logic should understand the monorepo structure
  • Otherwise we load from the actual specifier itself (without a version) - ie, import('@openfn/x')

If monorepo and modules_home are both set, we probably use modules_home as it's more specific.

Since there's so much complexity in module resolution, the tooling needs to be really clear about where it's loading modules from.

MODULES_HOME may have another use. If a job has arbitrary node dependencies (ie axios or lodash), those need to be installed somewhere. They're not dependencies of the CLI and shouldn't be served by the CLI's node_modules. Nor the runtime's. So arbitrary

What do we need?

There's a lot of mapping rules here and I don't want to have to write and test them in several places (the CLI, runtime and compiler may each need to do this!).

So we need to provide a single function, which accepts options from a runtime manager. Something like:

function resolveModulePath(specifier, { modulesHome, moduleMap, monorepoHome }) {
  let path; // resolve the specifier into a path
  return path || specifier;
}
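
One way the placeholder above might be filled in, following the loading rules listed earlier (explicit mapping, then modules home, then monorepo, then the bare specifier). This is a minimal sketch, not the project's implementation. The `exists` option, the `join` helper and, in particular, the monorepo layout (`<monorepoHome>/packages/<short name>`) are assumptions made for illustration; the issue does not say what the monorepo looks like.

```typescript
// Options mirror the sketch above. `exists` is a hypothetical addition that
// keeps the function pure and testable (in Node it could be fs.existsSync).
type ResolveOptions = {
  moduleMap?: Record<string, string>;
  modulesHome?: string;
  monorepoHome?: string;
  exists?: (path: string) => boolean;
};

// Hypothetical helper: join segments with "/" and collapse repeated slashes.
const join = (...parts: string[]): string =>
  parts.join("/").replace(/\/{2,}/g, "/");

const ADAPTOR_PREFIX = "@openfn/language-";

function resolveModulePath(specifier: string, options: ResolveOptions = {}): string {
  const { moduleMap = {}, modulesHome, monorepoHome, exists = () => true } = options;

  // 1. An explicit mapping always wins.
  if (Object.prototype.hasOwnProperty.call(moduleMap, specifier)) {
    return moduleMap[specifier];
  }

  // 2. modulesHome is treated like node_modules: <modulesHome>/@openfn/x
  if (modulesHome) {
    const candidate = join(modulesHome, specifier);
    if (exists(candidate)) return candidate;
  }

  // 3. ASSUMED monorepo layout: "@openfn/language-common" -> packages/common
  if (monorepoHome && specifier.startsWith(ADAPTOR_PREFIX)) {
    const shortName = specifier.slice(ADAPTOR_PREFIX.length);
    const candidate = join(monorepoHome, "packages", shortName);
    if (exists(candidate)) return candidate;
  }

  // 4. Fall back to the bare specifier and let the host resolve it.
  return specifier;
}
```

Checking modules home before the monorepo matches the "more specific wins" preference stated above. Note that without an `exists` check, a configured modules home always matches, so the monorepo rule would never be reached.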

Where does this live? It's a dependency of compiler and runtime, but not really part of either. The CLI needs it to preload module exports. It could go in a generic utils package but it would be a bit lost there. So maybe a module-resolver?

I think it's literally just one function, I'm not sure what else it needs to do. I suppose we could push more logic into it?

loadDTs()
loadModule()
loadSyntheticModules()

Once you start doing that there's actually quite a close relationship to describe-package. Maybe that needs to evolve into like a module-helper or module-manager or something. This, I think, is part of the answer.

@josephjclark josephjclark mentioned this issue Sep 21, 2022
41 tasks
@josephjclark josephjclark added the enhancement New feature or request label Sep 21, 2022
@josephjclark
Collaborator Author

josephjclark commented Sep 21, 2022

I'm thinking about a (dramatic) rework of describe-package to be some kind of general module helper.

module-manager
module-loader
module-analyser
package-helper
analyser

Its top-level exports would look like this:

// Context or Project style setup, providing module resolution rules at runtime
export type ModuleContext = {
  modulesHome: string;               // a general folder to use like a node_modules to load from
  monorepoHome: string;              // when language adaptors are in the same monorepo, this can be used to load them
  moduleMap: Record<string, string>; // explicit module: path mappings
  logger: Logger;                    // a log function to use to provide feedback about module resolution
}

// Describe the exports of a particular package
// Used by the CLI (passed into the compiler)
export declare function describePackage(specifier: string): Promise<FunctionDescription[]>;

// This is really what describePackage is doing
export declare function describeExportedOperations(specifier: string, context: ModuleContext): Promise<FunctionDescription[]>;

// Load an actual module into memory
// Used by the runtime linker
export declare function loadModule(specifier: string, context: ModuleContext): Promise<any>;

// Fetch an arbitrary file from a package
export declare function fetchFile(specifier: string, filename: string, context: ModuleContext): Promise<unknown>;

// Work out the local path or moduleName, based on the provided options
// This allows modules to be loaded from different directories
// This is probably internal and private actually
function resolveModulePath(specifier: string, { modulesHome, moduleMap, monorepoHome }: Partial<ModuleContext> = {}): string {
  let path: string | undefined; // resolve the specifier into a path
  return path || specifier;
}

It will of course evolve a bit as eg FunctionDescription becomes a bigger deal.

Internally, it caches in-memory, so that nothing is loaded twice. I don't think any of the project-level abstractions should be exposed.
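
A minimal sketch of what "nothing is loaded twice" could mean in practice. The names `Loader` and `withCache` are hypothetical, and keying the cache by specifier alone is an assumption: a real version might also need to key on the resolved path or the context.

```typescript
// Any async function that turns a specifier into a loaded module.
type Loader = (specifier: string) => Promise<unknown>;

// Wrap a loader so each specifier is loaded at most once.
// Caching the promise (not the result) also de-duplicates concurrent calls.
function withCache(load: Loader): Loader {
  const cache = new Map<string, Promise<unknown>>();
  return (specifier) => {
    let pending = cache.get(specifier);
    if (!pending) {
      pending = load(specifier).catch((err) => {
        cache.delete(specifier); // do not cache failures, allow a retry
        throw err;
      });
      cache.set(specifier, pending);
    }
    return pending;
  };
}
```

Because the cache lives in a closure, none of it needs to be exposed from the package. The wrapped loader could be a dynamic import, a file read or a network fetch.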

A lot of the analysis done here will probably end up being used by the compiler: this thing needs to build and describe the runtime environment of a running job.

I am still not really sure how module versions fit into this. Specifiers can include a version number when speaking to unpkg. But I feel like mapped paths should be handling versioning? If pointing to MODULES_HOME or even an explicit path, the version is irrelevant, I just want to use that thing.
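
To make the version question concrete, here is a small sketch of splitting a versioned specifier so that the version is used only when talking to unpkg and ignored for mapped paths. `parseSpecifier` and `unpkgUrl` are hypothetical names, not existing functions in the codebase. The URL shape is unpkg's documented `unpkg.com/<package>@<version>/<file>` form.

```typescript
type ParsedSpecifier = { name: string; version?: string };

// Split "name@version" while keeping the leading "@" of a scoped package.
function parseSpecifier(specifier: string): ParsedSpecifier {
  const at = specifier.lastIndexOf("@");
  // at <= 0 means there is no "@" at all, or only the scope marker at index 0
  if (at <= 0) return { name: specifier };
  return { name: specifier.slice(0, at), version: specifier.slice(at + 1) };
}

// Build an unpkg URL for one file; the version is left out when not given.
function unpkgUrl(specifier: string, file: string): string {
  const { name, version } = parseSpecifier(specifier);
  const pkg = version ? `${name}@${version}` : name;
  return `https://unpkg.com/${pkg}/${file}`;
}
```

With this split, local lookups (explicit mapping, MODULES_HOME) would use only `name`, which matches the idea that the version is irrelevant once a path has been mapped.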

Changes

What changes because of this?

  • The CLI no longer handles module resolution for the add-imports transformer. It just calls describe-package with the specifier and context, which does all the heavy lifting.
  • The CLI's stripVersionSpecifier probably moves into this new package
  • The compiler's preloadAdaptorExports moves into the new package
  • The runtime linker will call out to loadModule, instead of doing loading itself

These all seem like positive changes.

@josephjclark josephjclark added the design Design issues or discussion label Sep 27, 2022
@josephjclark
Collaborator Author

Closing as a lot of these ideas have found their way into the code through the repo or ongoing describe-package restructuring.
