Runtime and Compiler v2 #11

josephjclark · 2022-08-24T15:40:41Z

The next generation runtime engine.

To make it so:

npm install -g @openfn/cli
openfn --help
openfn --test

Merge Checklist

What do we need to do before merging this to main?

Update Lightning to use the new describe-package (instead of compiler)
Check the workflow-diagram tests and un-skip (I need help with this!)
~~Port describe-package tests over to ava ?~~ Nah
Tests passing
Update the top level readme
Work out what to do with the top level package.json and examples folder
Spin out unfinished items below into new issues (anything spun out has been struck through here)

Likely blockers with real-world jobs

Immutable state (we should make state mutable for now, or at least hide immutability behind a flag)

Runtime

Create a new runtime which accepts a job as a pipeline of functions (as a string representation of an esm module which exports an array of functions), and runs them through the execute reducer.

import { readFile } from 'node:fs/promises';
import run from '@openfn/runtime';

// Load the compiled expression (an ESM module) as a string
const job = await readFile('expression.js', 'utf8');
const initialState = {};
const { data } = await run(job, initialState);
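
As a rough illustration of the pipeline idea, the sketch below reduces an array of operations over a state object, awaiting each one in turn. The `execute` function here is a hypothetical stand-in written for this example; it is not the implementation in @openfn/runtime, and it skips the module loading and sandboxing described in this PR.

```javascript
// Hypothetical stand-in for the execute reducer (not the @openfn/runtime API).
// Each operation takes a state object and returns the next state, possibly asynchronously.
async function execute(operations, initialState) {
  let state = initialState;
  for (const operation of operations) {
    state = await operation(state);
  }
  return state;
}

// Example operations, invented for illustration
const addOne = (state) => ({ ...state, data: state.data + 1 });
const double = async (state) => ({ ...state, data: state.data * 2 });

execute([addOne, double], { data: 1 }).then((finalState) => {
  console.log(finalState.data); // logs 4
});
```

Awaiting every operation means synchronous and asynchronous operations can be mixed freely in the same job.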

The job is loaded into a sandbox environment using the experimental node:vm module loaders. This is mainly to load the source properly and control the execution environment; it is not a security measure. It seems to work well.

The runtime manager may want to run the whole runtime in a vm2 context for added security.

Load the job ESM code as an actual executable module (and run it)
Allow environment globals like console to be overridden inside the job
Don't import execute, use a native implementation in this package (even if it's identical)
Automated Logging and log level controls
~~run should return a promise which is also an event emitter~~
~~Emit events as operations start and return (current status: running operation 4: fetch from salesforce)~~
State is mutable by default, but can be made immutable if a flag is passed
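
The last item could look something like the sketch below: by default every operation receives the same state object, and when a flag is set each operation receives a clone instead. The `stateFor` helper and the `immutable` option name are assumptions made for this example, and `structuredClone` (a global in Node 17+) will throw on state that contains functions.

```javascript
// Hypothetical helper: choose the state object handed to the next operation.
// `immutable` is an assumed option name, not a confirmed runtime flag.
function stateFor(state, { immutable = false } = {}) {
  // Default: pass the same object through, so operations may mutate it.
  // With the flag: pass a deep copy, so mutations cannot leak back to the caller.
  return immutable ? structuredClone(state) : state;
}

const original = { data: { count: 1 } };

const shared = stateFor(original);
shared.data.count = 2; // also changes original.data.count

const copy = stateFor(original, { immutable: true });
copy.data.count = 99; // original is untouched
```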

Compiler

The new compiler will be quite different from the old because it won't need to do stuff like inject the execute reducer or ensure there's an environment for language adaptors. The runtime handles all that stuff now.

I am still using recast, acorn and ast-types to do the heavy lifting (although tbh I'm not sure why we're favouring acorn over esprima!)

Set up some patterns for visitors and unit tests
Create a transformer to ensure a valid export default []
Create a transformer to move top-level call expressions into the default exports
Get the top-level compile function working and tested (with examples)
~~Create a validator to catch naughties in code (like errant import/export statements)~~
~~Provide a validation API with error reporting (closely related to the validator I hope!)~~
~~Compile from typescript (!)~~
~~Add wrappers for global state - State globals in the new runtime #17~~
Add imports for a given adaptor
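
To make the two transformer items more concrete, here is a minimal sketch that works on plain ESTree-shaped objects (the general format that parsers like acorn produce), with no parser involved. The function name `moveTopLevelOperations` is made up for this example. The compiler in this PR uses recast, acorn and ast-types, so treat this as an outline of the idea rather than the compiler's code.

```javascript
// Hypothetical transform: ensure there is an `export default [...]` and move
// every top-level call expression into that array. Returns a new Program node.
function moveTopLevelOperations(program) {
  const isOperation = (node) =>
    node.type === 'ExpressionStatement' &&
    node.expression.type === 'CallExpression';

  const isDefaultArrayExport = (node) =>
    node.type === 'ExportDefaultDeclaration' &&
    node.declaration.type === 'ArrayExpression';

  const existing = program.body.find(isDefaultArrayExport);
  const operations = program.body
    .filter(isOperation)
    .map((node) => node.expression);

  // Build a fresh export node so the input AST is not mutated
  const exportNode = {
    type: 'ExportDefaultDeclaration',
    declaration: {
      type: 'ArrayExpression',
      elements: [...(existing ? existing.declaration.elements : []), ...operations],
    },
  };

  // Keep everything else (imports, declarations) in its original order
  const rest = program.body.filter(
    (node) => !isOperation(node) && node !== existing
  );
  return { ...program, body: [...rest, exportNode] };
}

// A hand-written AST for the source `fn();`
const program = {
  type: 'Program',
  body: [
    {
      type: 'ExpressionStatement',
      expression: {
        type: 'CallExpression',
        callee: { type: 'Identifier', name: 'fn' },
        arguments: [],
      },
    },
  ],
};

const result = moveTopLevelOperations(program);
```

After the transform, `result` is the AST for `export default [fn()];`.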

Devtools / CLI

Probably installed as a global to work on adaptors, or run straight out of kit when working on the runtime and compiler

Runtime Manager

The runtime manager/service provides an API to run jobs in worker threads.

There's an API to support and report on thread management, and there's a little web server which stays alive and receives post requests to run jobs. The server's just a demo client of the manager API really (but it'll be a fun toy to build out!).

Use a worker pool to execute jobs in thread
Accept jobs as strings and compile them (with a cache)
Track which threads are running which jobs, plus some usage stats
Work out a way to unit test the thread pool
(remaining steps moved to New runtime manager service #52)
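
The "compile them (with a cache)" step might be as simple as memoising the compiler by source string. In the sketch below, `compile` is whatever function turns a job string into runnable code; both it and `createCachedCompiler` are placeholder names for this example, not the manager's API.

```javascript
// Hypothetical: wrap a compile function so each distinct source is compiled once.
function createCachedCompiler(compile) {
  const cache = new Map(); // source string -> compiled output
  return function cachedCompile(source) {
    if (!cache.has(source)) {
      cache.set(source, compile(source));
    }
    return cache.get(source);
  };
}

// Usage with a fake compiler that counts how often it is called
let calls = 0;
const fakeCompile = (source) => {
  calls += 1;
  return `export default [${source}];`;
};

const compileJob = createCachedCompiler(fakeCompile);
compileJob('fn()');
compileJob('fn()'); // served from the cache, fakeCompile is not called again
```

Keying on the raw source keeps the sketch short; a hash of the source would serve the same purpose with smaller keys.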

Ava Notes

I've disabled type checking in ava's unit tests because:

It slows them down
There's absolutely no reason to type check here
If ava does encounter a type error anywhere in its test or source compilation, it'll hang and timeout without reporting anything (I have lost hours to this).
I for one am happy to relax typing rules while developing against tests (it's often useful to have unreferenced variables and imports, for example!)

Type errors will still get caught by the build process in CI. We could consider adding a test:tsc script as well if we want a separate CLI/local test runner for typings. At the moment I have no appetite for this.

Also worth noting that ava works nicely in monorepos - just drop a config in the top level and ava will find it from down in the packages. Now it's easy for all packages to share a standard test definition. Neat!

Naming Jobs

When running a job, we ideally want to give it a useful name, probably taken from the preceding comments. But how will the runtime know the name?

I think we need something like an Operation function, which explicitly declares an operation with a name and a handler.

export default [
  Operation("do the thing", fn(() => { ... }))
]
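
One possible shape for such a function, purely as a sketch: wrap the handler and attach the name as a property the runtime can read when reporting progress. Both `Operation` as written here and the `operationName` property are assumptions for illustration.

```javascript
// Hypothetical: declare an operation with a human-readable name and a handler.
function Operation(name, handler) {
  const operation = (state) => handler(state);
  // Use a custom property: a function's built-in `name` cannot be set by assignment
  operation.operationName = name;
  return operation;
}

// A runtime could then describe each step as it runs
function describeStep(operation, index) {
  const label = operation.operationName ?? 'anonymous operation';
  return `running operation ${index + 1}: ${label}`;
}

const job = [Operation('do the thing', (state) => state)];
console.log(describeStep(job[0], 0)); // running operation 1: do the thing
```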

Alternatively we could add arguments to every adaptor function to do this stuff.

Breaking changes

the state global

I want to remove the state global. The current core.execute function loads state into the 'global' context. This doesn't work very well with my new vision of "immutable" state, where we clone the state object and pass it into each operation.

I would like to:

Remove global state from the runtime environment
In the compiler, wrap every operation in a (state) => x wrapper. Maybe only if a flag is provided, maybe only if we detect a global state reference inside that operation.

This means old code can be made to work if it's compiled properly, while new code can be encouraged to adopt better practice.
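
To illustrate what that wrapper would change, the sketch below contrasts an operation whose argument reads the state global with a wrapped version that defers the read until the runtime passes state in. The `log` adaptor function is invented for this example, and the exact output of the compiler may differ.

```javascript
// Invented adaptor function: takes an argument and returns an operation (state => state)
const log = (message) => (state) => ({
  ...state,
  logs: [...(state.logs ?? []), message],
});

// Old style, which only works if `state` exists as a global when the job is loaded:
//   log(state.data.name)

// Wrapped style: `state` is now a parameter, resolved when the operation actually runs
const wrapped = (state) => log(state.data.name)(state);

const result = wrapped({ data: { name: 'example' } });
console.log(result.logs); // [ 'example' ]
```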

General TODOs

Think a bit more about source maps. The runtime needs to be able to report errors against source map positions. Source maps could be inline; disk space isn't a priority.
Create a logger service (TODO create an issue to track this)
Module resolution #19

Not really happy about this but at the moment it's needed for unit tests. vm.SourceTextModule doesn't seem to be available from inside the ava worker

We can use this in unit tests. Instead of calling out to the actual runtime (which throws errors reading vm.SourceTextModule, apparently because the --experimental-vm-modules flag is not passed to the ava thread), we create a worker which calls our simple mock function. All the worker lifecycle code is abstracted into a helper function which is shared by the actual and mock workers, which gives us a really realistic mock simulation.

stuartc · 2022-09-09T08:05:20Z

@josephjclark I just went through this PR, this is really great - it's feeling like a very solid foundation.

The CLI appears to expect a folder, or a file called job.js. For a first pass, would it not be easier and clearer to expect the exact file rather than a path containing files that follow a naming pattern?

TODO what if stdout and output path are set?

I think in this case console.log and other output are redirected to STDERR, and the resulting state is sent to both STDOUT and a file. It's a good question because what I just proposed is a pretty niche interface.

Perhaps we make the options mutually exclusive for now: you pick just one.
But this does lead me to think we might be better off logging to STDERR if you have --stdout enabled.
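
A sketch of that idea, assuming a boolean `stdout` option that mirrors the --stdout flag: when the final state goes to STDOUT, log lines are routed to STDERR so the two never mix. `chooseStreams` is a hypothetical helper, not part of the CLI.

```javascript
// Hypothetical helper: pick output streams based on whether --stdout is enabled.
function chooseStreams({ stdout = false } = {}) {
  return {
    // Logs move to STDERR when STDOUT is reserved for the final state
    logs: stdout ? process.stderr : process.stdout,
    // The final state is only written to STDOUT when asked for
    state: stdout ? process.stdout : null,
  };
}

const streams = chooseStreams({ stdout: true });
streams.logs.write('job started\n');
if (streams.state) {
  streams.state.write(JSON.stringify({ data: {} }) + '\n');
}
```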

Security thoughts: the process inherits the node command arguments
(it has to for experimental modules to work)
Is this a concern? If secrets are passed in they could be visible
The sandbox should help

I wasn't immediately concerned about this; I couldn't think of what might be passed in. What I think is super important is code accessing process, including importing it from node:process. We're far more concerned with someone accessing process.env or adding a callback to process.on.

We will need to test how the sandboxing can deal with this. I think that's the only handle we really have, I don't know how we could 'knee-cap' the worker without making life very hard for everyone - we should focus on the sandboxing and test that.

As an aside, it wouldn't be unreasonable to pass the process a 'cookie' (Erlang terminology) or key/secret used to trust other nodes in the cluster. Basically I do think there could/would be a requirement to have workers carry some kind of temporary and constant trust mechanism for communicating with the backend cluster.

You'll have to walk me through that module resolution issue you outlined above, I'd like to understand it better.

josephjclark · 2022-11-08T16:54:56Z

So I've been through and updated everything that I think needs doing. The examples now work, the readme is updated.

We do need to update Lightning to use the new describe-package, as after merging we can't update the old package (at least not easily). But it should be OK to merge before we do this. Perhaps it needs an issue over on the Lightning repo.

@stuartc I think you need to give this a quick once-over - check the top level readme and package json, plus check my previous comment re describe-package tests. It's probably fine and we could merge and fix later.

Subject to that, I think we can merge...

josephjclark marked this pull request as draft August 24, 2022 15:41

josephjclark added 2 commits August 25, 2022 17:49

Start a basic server which runs a job on POST requests

5d410d4

Update tsconfig and update tests

519d22b

josephjclark force-pushed the v2 branch from b9394eb to 519d22b Compare August 25, 2022 17:06

josephjclark added 21 commits August 26, 2022 10:07

Light refactoring

a0fdfe5

Trying (and failing) to send messages through piscina

9d48d79

Use workerpools for proper messaging between threads

2c207d2

Update ts config (again)

9247166

Disable type checking in ava

61f71f2

Allow simple job queues to be pre-parsed rather than loaded as modules

5be0607

Not really happy about this but at the moment it's needed for unit tests. vm.SourceTextModule doesn't seem to be available from inside the ava worker

Moved @openfn/compiler -> @openfn/describe-package

7f4f575

Added new compiler project with simple parse function

3be12a8

Use recast for parsing

4a3118f

Start working out the transform infrastructure

567f92a

Get an ensure exports transformer working

9b5c382

Little update to tests

5903308

Update CLI (and docs); add top-level-operation transforms and tests

386e6b3

fix tsconfig in describe-package

16743e3

Update and fix build

f62572d

Integrate compiler into runtime manager, update tests

e2fd620

Let compiler detect whether incoming string is a path or source. Update typings (but badly)

9f3e351

Randomise time of slow job

8c4d22c

Start setting up new devtools

0770f32

Docs and tidying in runtime

5735819

This comment was marked as resolved.

josephjclark added 3 commits September 1, 2022 15:56

Round out cli functionality a bit

d4699e8

Documentation and tidyups

17524ae

Merge branch 'main' into v2

a50b32e

stuartc and others added 4 commits November 7, 2022 11:50

Update node to 18 on ci

ae20d77

Fix default repo location typo

73d8199

Fix typo

2bcbed7

Merge branch 'main' into v2

049bf66

josephjclark commented Nov 8, 2022

packages/workflow-diagram/test/layout.test.ts

workflow-diagram: fix merge stuff

d1cd9c1

josephjclark commented Nov 8, 2022

packages/workflow-diagram/test/fixtures/single-workflow-nodeedges.json

josephjclark added 16 commits November 8, 2022 14:23

workflow-diagram:Restore missing file

75fa8ec

????

workflow-diagram: remove unused file

63a79b9

Update top readme

92ea7d8

Update docs

15797a5

Docs fix

077b849

describe-package: restore worker bundle

81adb51

compiler-worker: fixed build

909e031

examples: compiler-worker -> dts-inspector

7f7a1ab

tweak readme

14c2c5e

tweak readme again

1473b78

update package lock

56d33d9

skip failing test

07fda1d

I know it's failing, I'm not sure why it's not skipped here...

describe-package: update failing test

3222eee

I thought I'd fixed this one too...

describe-package: remove dead code

f6c9771

changeset

6c57303

Merge pull request #61 from OpenFn/tidy-project

eb52b0d

Tidy project

josephjclark marked this pull request as ready for review November 8, 2022 16:54

josephjclark requested a review from stuartc November 8, 2022 16:55

stuartc merged commit 6e52d10 into main Nov 9, 2022

stuartc deleted the v2 branch November 9, 2022 10:02

stuartc removed their request for review November 9, 2022 10:03

taylordowns2000 mentioned this pull request Mar 24, 2023

New runtime manager service #52

Closed
