cli2: add sketch of plugin loading #1810
Conversation
Summary: This patch creates a new binary, `./bin/sc2`, which will be the home for a rewrite of the CLI intended to implement an instance system. See: <https://discourse.sourcecred.io/t/sourcecred-instance-system/244> Paired with @decentralion. Test Plan: Run `yarn backend` and `node ./bin/sc2.js`, which should nicely fail with a “not yet implemented” message. wchargin-branch: cli2-skeleton wchargin-source: a76bc7625c5508b876e471f6c8a8f82363f76a12
Summary: This adds a `CliPlugin` interface and a basic implementation for the GitHub plugin. Paired with @decentralion. Test Plan: Create a new directory `/tmp/test-instance`, with: ``` // sourcecred.json {"bundledPlugins": ["sourcecred/github"]} // config/sourcecred/github/config.json {"repositories": ["sourcecred/example-github"]} ``` Then, run ``` yarn backend && (cd /tmp/test-instance && NODE_PATH="$OLDPWD/node_modules/" node "$OLDPWD/bin/sc2.js") ``` and observe that the new instance has a cache directory containing a GitHub database. wchargin-branch: cli2-load wchargin-source: fca19a521cac5f28b52622c3e6e2be8100f76378
wchargin-branch: cli2-load wchargin-source: 0a48febe709d8c9e855a1ff74ae3f2429dcc136d
Force-pushed from fddc669 to 3db9b91.
I'm missing all context for this prototype. So consider my comments as thinking out loud.
I'll defer review to @decentralion
```javascript
// Make a directory, if it doesn't exist.
function mkdirx(path: string) {
  try {
    fs.mkdirSync(path);
  } catch (e) {
    if (e.code !== "EEXIST") {
      throw e;
    }
  }
}
```
Seems like fs-extra's `mkdirp()` would be easier to use here.
That does something else. We don’t want to make all the transitive
parent directories.
```javascript
const pathComponents = [...prefix, pluginOwner, pluginName];
let path = baseDir;
for (const pc of pathComponents) {
  path = pathJoin(path, pc);
  mkdirx(path);
}
```
Seems like fs-extra's `mkdirp()` would be easier to use here.
As above.
```javascript
type JsonObject =
  | string
  | number
  | boolean
  | null
  | JsonObject[]
  | {[string]: JsonObject};
```
Having read your linked article <https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/>: this is pretty nice. I've previously been irked by the `any` that JSON parsing returns. I have tested https://gajus.github.io/flow-runtime for expected server response models and found some issues doing that.
Not sure if I like this library too much, due to fairly opaque babel magic. On the other hand, not having to declare the same type twice (once as a static Flow type, again as a parser) is really handy.
Have you tried any generators? Thoughts on them?
We (@decentralion and I) chatted about this a bit. We chose to write the
correct-style parser here (perhaps because I was driving ;-) ) because
it really does give us much stronger guarantees than the `any`-downcast,
though we definitely aren’t big fans of the verbosity. I’m not super
inclined to reflect Flow types into parsers, but I’m up for abstracting
this with an attoparsec-like or optics-based approach if we think that
that would be helpful. But later, not now; let’s find out how we
actually use these first.
```javascript
): Promise<InstanceConfig> {
  const projectFilePath = pathJoin(baseDir, "sourcecred.json");
  const contents = await fs.readFile(projectFilePath);
  return Promise.resolve(parseConfig(JSON.parse(contents)));
```
Nit: in an async function, `Promise.resolve` is redundant.
Yes, but only because of a JavaScript design flaw that has really
aged like milk with the prevalence of TypeScript and Flow. I prefer to
write the code in the sensible, principled style, even though it’s not
strictly required by the language.
Certainly won't argue with you that Promises have a bunch of design flaws. I've built several client projects with https://github.com/fluture-js/Fluture instead of Promises, and there are a lot of great properties there that I'm sorely missing having gone back to Promises.
On the other hand, the language integration definitely yields productivity benefits over going against the unfortunate standard, and reduces the learning curve.
In this case, `return Promise.resolve()` is a pattern that makes use of one of those Promise design choices: silently unpacking Promises at every opportunity.
In other words:

```javascript
Promise.resolve(Promise.resolve(123))
// Promise { <state>: "fulfilled", <value>: 123 }
```

We get a `Promise<number>`, not a `Promise<Promise<number>>`. A surprising and undesirable property, if you ask me.
My taste preference here is to avoid that surprise, so I generally encourage others not to `return Promise.resolve()`.
Anyhow, if you feel like there's a benefit to going the other way, curious to hear about it. But for this PR, I'm fine with either :]
I prefer the non-redundant style, although I haven't invested the effort to really understand why William avoids it. Since we don't have a lint rule to automatically enforce it, I say it's author's pick on which to use.
That `resolve(resolve(123))` is equivalent to `resolve(123)` is
precisely the design flaw that I’m trying to avoid. If the standard
didn’t have that flaw, then you would have to `return resolve(123)`
here, so returning `123` is surfacing the flaw, not hiding it. When
I read `async function foo() { return 123; }`, which shouldn’t compile,
I have to remind myself that it works because of this flaw.
I’m happy to discuss in more detail, and I can provide a specific
example of code that this design flaw makes it impossible to write
cleanly (forcing ugly contortions in the form of boxing) as well as
point to some of the complexity that the flaw forces in real code. I’ll
keep it as is for now, and am fine with it being author’s pick, just as
so many other things are.
```javascript
const loadCommand: Command = async (args, std) => {
  if (args.length !== 0) {
    die(std, "usage: sourcecred load");
  }
  const baseDir = process.cwd();
  const config = await loadInstanceConfig(baseDir);
  for (const [name, plugin] of config.bundledPlugins) {
    const dirContext = pluginDirectoryContext(baseDir, name);
    plugin.load(dirContext);
  }
  return 0;
};
```
Want to suggest that there's some complecting going on here.
One part is the transport implementation we just so happen to be using (the CLI), and the other is a use-case.
Since we're taking a greenfield approach to the CLI now, it's a distinction I'd love to see at some point. Doing so would allow exposing a first-class JavaScript API, similar to the likes of Babel and webpack, while keeping the CLI transport implementation simple.
In this case the distinction would look like:
```javascript
const loadCommand: Command = async (args, std) => {
  // Positional arguments: transport detail.
  if (args.length !== 0) {
    die(std, "usage: sourcecred load");
  }
  // Environment data extraction: transport detail.
  const baseDir = process.cwd();
  // Call the use-case.
  await loadUseCase(baseDir);
  // Exit codes: transport detail.
  return 0;
};

// Suitable to expose as a JavaScript API.
async function loadUseCase(baseDir: string): Promise<void> {
  const config = await loadInstanceConfig(baseDir);
  for (const [name, plugin] of config.bundledPlugins) {
    const dirContext = pluginDirectoryContext(baseDir, name);
    plugin.load(dirContext);
  }
}
```
Based on the dev call today: if we do want to run a local daemon, we can send commands to it over HTTP, and I think this structure will help with that.
IPFS and Docker are examples that invested in this structure, as they expose their commands through multiple transports (library, Unix sockets, and HTTP).
I agree with Beanow that we'll want to expose a functional style for implementing this logic that isn't hard-wired into the CLI. I don't need that to happen in this commit--I think it will be easy to factor out in the future when we need it. Though it could also be nice to have. I approve either way.
Yes, I of course also agree that it’d be great to have a JS API that
backs the CLI and is also exposed. This is basically what I describe at
the end of #945. But I don’t think that defining such an API should
block developing this CLI.
```javascript
// Shim to interface with `fetchGithubRepo`; TODO: refactor that to just
// take a directory.
class CacheProviderImpl implements CacheProvider {
```
Actually, I'm a huge fan of using these interfaces. It decouples the rest of the code from the implementation details of using filesystems. I would recommend applying the same idea to the configuration and tokens too, like a `ConfigProvider` and a `SecretsProvider`.
On the flip side, I would rather not see `fs.readFile` or `process.env` in a use-case.
To give a common example where this practice pays off: many applications use ENV for secrets. However, when using Docker Swarm, ENV is not suitable, as it will be passed around in plaintext. Instead, applications should use "Docker Secrets", which most applications implement by accepting a `*_FILE` suffix on secret names and reading the secret from that file path.
With a central `SecretsProvider` implementation, adding this in would be easy and something the core can implement once. With `process.env` scattered across the plugins, every plugin maintainer has to figure this out.
> On the flipside, I would rather not see fs.readFile or process.env in a use-case.

I think we're in agreement that stuff like `fs.readFile` and `process.env` should be kept at the outermost layer possible, and decoupled from implementing the logic.
IMO, losing the CacheProvider is not a big deal because we aren't actually supporting other filesystems, so as developers we aren't getting feedback on whether we're (for example) using CacheProvider consistently across the codebase. Without that feedback, there's no particular reason to believe that we will actually be using it consistently when we need that flexibility, or even that this is giving us the right flexibility compared to what we'll actually need when we need it. So we don't benefit much from it, and it does add extra indirection and cognitive overhead.
I’m not really familiar with the way in which you’re using the term
use-case. Perhaps you could clarify?
Currently, `fetchGithubRepo` takes a simple `token: GithubToken`. It’s
up to the caller to fetch that, directly from an environment variable or
from a file or from a cast5-decryption or whatever. It’s beyond the
scope of `fetchGithubRepo` to care where that token came from. I don’t
see how introducing a `secretProvider: () => GithubToken` (presumably?)
makes that any simpler.
```javascript
load(PluginDirectoryContext): Promise<void>;
graph(PluginDirectoryContext, ReferenceDetector): Promise<WeightedGraph>;
referenceDetector(PluginDirectoryContext): Promise<ReferenceDetector>;
```
Let's provide a `TaskReporter` to each of these methods.
```javascript
const config = await loadInstanceConfig(baseDir);
for (const [name, plugin] of config.bundledPlugins) {
  const dirContext = pluginDirectoryContext(baseDir, name);
  plugin.load(dirContext);
```
Let's have the task reporter report that load started for the plugin--that way we have consistent measurements for all the plugins, and the plugin can use the provided task reporter to give more detailed information (e.g. timing on sub-portions)
wchargin-branch: cli2-load wchargin-source: 529877b5799df838fc42af78c5d56977d79111d8 # Conflicts: # src/cli2/sourcecred.js
wchargin-branch: cli2-load wchargin-source: 529877b5799df838fc42af78c5d56977d79111d8
Summary: Paired with @decentralion. Test Plan: Follow the test plan for #1810, then additionally run ``` (cd /tmp/test-instance && node "$OLDPWD/bin/sc2.js" graph) ``` and note that the `output/graphs/...` directory has a graph JSON file. wchargin-branch: cli2-graph
Summary: This adds a `CliPlugin` interface and a basic implementation for the GitHub plugin. Paired with @decentralion. Test Plan: Create a new directory `/tmp/test-instance`, with: ``` // sourcecred.json {"bundledPlugins": ["sourcecred/github"]} // config/sourcecred/github/config.json {"repositories": ["sourcecred/example-github"]} ``` Then, run ``` yarn backend && (cd /tmp/test-instance && node "$OLDPWD/bin/sc2.js" load) ``` and observe that the new instance has a cache directory containing a GitHub database. wchargin-branch: cli2-load