# Support module scrapers #12

The dream here is to let other users maintain scrapers in a community repo, or in their own GitHub repos, and let developers simply install them via npm:

```bash
npm i scrape-pages @community-scrapers/twitter-feed @community-scrapers/twitter-login
```

ConfigInit:

```yaml
scrape:
  module: '@community-scrapers/twitter-feed'
```

yields Config:

```yaml
input:
  - '@community-scrapers/twitter-feed:username'
define:
  '@community-scrapers/twitter-feed:feedpage': ...
  '@community-scrapers/twitter-feed:post': ...
  '@community-scrapers/twitter-feed:post-media': ...
scrape:
  module: '@community-scrapers/twitter-feed'
```

Defs in the local `define` block can override those inside the module's `define` block.
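
A minimal sketch of how that expansion could work, assuming a hypothetical `loadModuleConfig` helper and simplified `Config` shapes (none of these names are the real implementation):

```ts
interface Config {
  input: string[]
  define: Record<string, object>
  scrape: object
}

// illustrative stub: a real implementation might require() the module's config
const loadModuleConfig = (name: string): Config => require(name).config

function expandModuleConfig(moduleName: string, localDefine: Record<string, object>): Config {
  const moduleConfig = loadModuleConfig(moduleName)

  // namespace the module's inputs and define keys with the module name
  const input = moduleConfig.input.map((key) => `${moduleName}:${key}`)
  const define: Record<string, object> = {}
  for (const [key, def] of Object.entries(moduleConfig.define)) {
    define[`${moduleName}:${key}`] = def
  }

  // local defs (keyed by the fully qualified names) win over the module's
  return { input, define: { ...define, ...localDefine }, scrape: moduleConfig.scrape }
}
```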

## How to wire this stuff up?

### inputs

Create an object in each ScrapeStep that came from a module. The object should map fully qualified input keys to the module's internal keys. The internal keys are the ones actually used in the handlebar templates, e.g.

```js
{
  '@community-scrapers/twitter-feed:username': 'username'
}
```
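
As a sketch (the function and variable names here are assumptions), that map would be applied to localize inputs before the module's templates render:

```ts
const inputKeyMap: Record<string, string> = {
  '@community-scrapers/twitter-feed:username': 'username'
}

// translate fully qualified input keys into the module's internal keys,
// so a template like '{{ username }}' sees the right value
function localizeInputs(globalInputs: Record<string, string>): Record<string, string> {
  const localized: Record<string, string> = {}
  for (const [fullKey, internalKey] of Object.entries(inputKeyMap)) {
    if (fullKey in globalInputs) localized[internalKey] = globalInputs[fullKey]
  }
  return localized
}

// localizeInputs({ '@community-scrapers/twitter-feed:username': 'jack' })
// => { username: 'jack' }
```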

scrape

Two options:

1. Create a separate flow.ts instance for the module and hook it up to whatever is above/below it.
2. Crawl through the module's scraper structure, find all empty scrapeEach arrays, and reattach the rest of the structure there (see the sketch below).
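
A rough sketch of option 2, over a simplified recursive node shape (the field names are assumptions, not the library's actual types):

```ts
interface ScrapeNode {
  scraper: string
  scrapeEach: ScrapeNode[]
}

// walk the module's structure and attach the surrounding config's
// continuation wherever a scrapeEach array bottoms out empty
function attachToLeaves(node: ScrapeNode, continuation: ScrapeNode[]): void {
  if (node.scrapeEach.length === 0) {
    node.scrapeEach = continuation
  } else {
    for (const child of node.scrapeEach) attachToLeaves(child, continuation)
  }
}
```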

### stateful values

There may be times when a local or module scraper produces a value that you want available for the rest of the run. Most often this will be an auth/access token.

```yaml
define:
  'user-likes-page':
    download:
      urlTemplate: 'https://twitter.com/likes'
      headerTemplates:
        'x-twitter-access-token': '{{ accessToken }}'
    parse:
      selector: '.post a'
      attribute: 'href'
scrape:
  module: '@community-scrapers/twitter-login'
  valueAsInput: 'accessToken'
  forEach:
    - scraper: 'user-likes-page'
```

This is essentially global state: whenever '@community-scrapers/twitter-login' gives us a value, we update the input value for 'accessToken' and replace the passed-down value with ''.
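
In pseudo-implementation terms (all names here are illustrative):

```ts
// global state shared across the run
const globalInputs: Record<string, string> = {}

// called whenever a scraper configured with `valueAsInput` emits a value
function handleEmittedValue(valueAsInput: string, value: string): string {
  globalInputs[valueAsInput] = value // e.g. globalInputs['accessToken'] = '<token>'
  return '' // downstream scrapers receive an empty passed-down value
}
```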

### organizing dependencies

Using worker_threads, it is possible to keep module scrapers in a separate directory with their own dependencies:

```bash
mkdir scrape-pages-runners
cd scrape-pages-runners
npm init
npm i scrape-pages @community-scrapers/twitter-feed @community-scrapers/twitter-login
```

Your main Node.js process can then run something like:

```js
const { Worker } = require('worker_threads')

const worker = new Worker('./scrape-pages-runners/worker.js', { workerData: { config, options } })
worker.on('message', ([event, data]) => console.log(event, data)) // wire up scraper events here
worker.on('exit', () => console.log('complete.'))
```
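
The worker side could look something like the sketch below. Only the `[event, data]` message shape comes from the snippet above; the scrape-pages entry point and event names are assumptions:

```js
// scrape-pages-runners/worker.js (sketch)
const { parentPort, workerData } = require('worker_threads')
const { scrape } = require('scrape-pages') // hypothetical entry point

const { config, options } = workerData
const emitter = scrape(config, options) // assumed to return an EventEmitter

// forward scraper events to the main thread as [event, data] tuples
for (const event of ['queued', 'progress', 'complete']) { // assumed event names
  emitter.on(event, (data) => parentPort.postMessage([event, data]))
}
```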
