
cli2: add sketch of plugin loading #1810

Merged
merged 5 commits into from May 29, 2020

Conversation

@wchargin wchargin (Member) commented May 27, 2020

Summary:
This adds a CliPlugin interface and a basic implementation for the
GitHub plugin.

Paired with @decentralion.

Test Plan:
Create a new directory `/tmp/test-instance`, with:

```
// sourcecred.json
{"bundledPlugins": ["sourcecred/github"]}

// config/sourcecred/github/config.json
{"repositories": ["sourcecred/example-github"]}
```

Then, run

```
yarn backend &&
(cd /tmp/test-instance && node "$OLDPWD/bin/sc2.js" load)
```

and observe that the new instance has a cache directory containing a
GitHub database.

wchargin-branch: cli2-load

Summary:
This patch creates a new binary, `./bin/sc2`, which will be the home for
a rewrite of the CLI intended to implement an instance system. See:
<https://discourse.sourcecred.io/t/sourcecred-instance-system/244>

Paired with @decentralion.

Test Plan:
Run `yarn backend` and `node ./bin/sc2.js`, which should nicely fail
with a “not yet implemented” message.

wchargin-branch: cli2-skeleton
wchargin-source: a76bc7625c5508b876e471f6c8a8f82363f76a12
Summary:
This adds a `CliPlugin` interface and a basic implementation for the
GitHub plugin.

Paired with @decentralion.

Test Plan:
Create a new directory `/tmp/test-instance`, with:

```
// sourcecred.json
{"bundledPlugins": ["sourcecred/github"]}

// config/sourcecred/github/config.json
{"repositories": ["sourcecred/example-github"]}
```

Then, run

```
yarn backend &&
(cd /tmp/test-instance &&
    NODE_PATH="$OLDPWD/node_modules/" node "$OLDPWD/bin/sc2.js")
```

and observe that the new instance has a cache directory containing a
GitHub database.

wchargin-branch: cli2-load
wchargin-source: fca19a521cac5f28b52622c3e6e2be8100f76378
wchargin-branch: cli2-load
wchargin-source: 0a48febe709d8c9e855a1ff74ae3f2429dcc136d
@Beanow Beanow (Contributor) left a comment

I'm missing all the context for this prototype, so consider my comments as thinking out loud.
I'll defer the review to @decentralion.

Comment on lines +17 to +26
```
// Make a directory, if it doesn't exist.
function mkdirx(path: string) {
  try {
    fs.mkdirSync(path);
  } catch (e) {
    if (e.code !== "EEXIST") {
      throw e;
    }
  }
}
```
Contributor:

Seems like fs-extra's mkdirp() would be easier to use here.

Member Author:

That does something else. We don’t want to make all the transitive
parent directories.

Comment on lines +38 to +43
```
const pathComponents = [...prefix, pluginOwner, pluginName];
let path = baseDir;
for (const pc of pathComponents) {
  path = pathJoin(path, pc);
  mkdirx(path);
}
```
Contributor:

Seems like fs-extra's mkdirp() would be easier to use here.

Member Author:

As above.

Comment on lines +22 to +28
```
type JsonObject =
  | string
  | number
  | boolean
  | null
  | JsonObject[]
  | {[string]: JsonObject};
```
Contributor:

Having read the post you linked (https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/):

This is pretty nice. I've previously been irked by the `any` that a JSON parse returns. I've tested https://gajus.github.io/flow-runtime for expected server-response models and found some issues doing that.

I'm not sure I like that library too much, due to its fairly opaque Babel magic. On the other hand, not having to declare the same type twice (once as a static Flow type, again as a parser) is really handy.

Have you tried any generators? Thoughts on them?

Member Author:

We (@decentralion and I) chatted about this a bit. We chose to write the
correct-style parser here (perhaps because I was driving ;-) ) because
it really does give us much stronger guarantees than the any-downcast,
though we definitely aren’t big fans of the verbosity. I’m not super
inclined to reflect Flow types into parsers, but I’m up for abstracting
this with an attoparsec-like or optics-based approach if we think that
that would be helpful. But later, not now; let’s find out how we
actually use these first.
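The hand-written "correct-style parser" discussed above can be sketched as follows. This is illustrative, mirroring but not copying the PR's code: `parseInstanceConfig` and the exact config shape are assumptions. Instead of downcasting `JSON.parse`'s result, the function inspects the raw value and either returns a well-typed object or throws with a precise error.

```javascript
// Hypothetical parse-don't-validate sketch (plain JS; the PR uses Flow).
function parseInstanceConfig(raw) {
  if (typeof raw !== "object" || raw == null || Array.isArray(raw)) {
    throw new Error("config: expected object");
  }
  const plugins = raw.bundledPlugins;
  if (!Array.isArray(plugins)) {
    throw new Error("config: bundledPlugins: expected array");
  }
  const bundledPlugins = plugins.map((p, i) => {
    if (typeof p !== "string") {
      throw new Error(`config: bundledPlugins[${i}]: expected string`);
    }
    return p;
  });
  // Only validated, well-typed data escapes this function.
  return {bundledPlugins};
}

const config = parseInstanceConfig(
  JSON.parse('{"bundledPlugins": ["sourcecred/github"]}')
);
console.log(config.bundledPlugins);
```

The verbosity the reply mentions is visible even at this size, but so is the payoff: downstream code never sees an unvalidated `any`.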

```
): Promise<InstanceConfig> {
  const projectFilePath = pathJoin(baseDir, "sourcecred.json");
  const contents = await fs.readFile(projectFilePath);
  return Promise.resolve(parseConfig(JSON.parse(contents)));
```
Contributor:

Nit: in an async function Promise.resolve is redundant.

Member Author:

Yes, but only because of a JavaScript design flaw that has really
aged like milk with the prevalence of TypeScript and Flow. I prefer to
write the code in the sensible, principled style, even though it’s not
strictly required by the language.

Contributor:

Certainly won't argue with you that Promises have a bunch of design flaws. I've built several client projects with https://github.com/fluture-js/Fluture instead of Promises, and there are a lot of great properties there that I sorely miss having gone back to Promises.

On the other hand, the language integration definitely yields productivity benefits over going against the unfortunate standard, and it reduces the learning curve.

In this case, `return Promise.resolve()` is a pattern that makes use of one of those Promise design choices: silently unpacking Promises at every opportunity.

In other words:

```
Promise.resolve(Promise.resolve(123))
// => Promise { <state>: "fulfilled", <value>: 123 }
```

We get a `Promise<number>`, not a `Promise<Promise<number>>`.
A surprising and undesirable property, if you ask me.

My taste preference here is to avoid that surprise, so I generally encourage others not to return Promise.resolve().

Anyhow, if you feel like there's a benefit to going the other way, curious to hear about it. But for this PR, I'm fine with either :]
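The flattening behavior being debated here can be demonstrated directly; this is a standalone sketch of standard Promise semantics, independent of the PR's code:

```javascript
// Both declared results are Promise<number>, even though one body
// returns a plain value and the other returns a Promise.
async function viaValue() {
  return 123;
}
async function viaResolve() {
  return Promise.resolve(123); // flattened, not nested
}

Promise.all([viaValue(), viaResolve()]).then(([a, b]) => {
  console.log(a, b); // 123 123
});

// Direct demonstration: resolve(resolve(x)) collapses to resolve(x).
Promise.resolve(Promise.resolve(123)).then((v) => {
  console.log(v); // 123
});
```

Both camps agree on what the code does; the disagreement is purely about which style best surfaces the language's flattening behavior.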

Contributor:

I prefer the non-redundant style, although I haven't invested the effort to really understand why William avoids it. Since we don't have a lint rule to automatically enforce it, I say it's author's pick on which to use.

Member Author:

That `resolve(resolve(123))` is equivalent to `resolve(123)` is
precisely the design flaw that I’m trying to avoid. If the standard
didn’t have that flaw, then you would have to `return resolve(123)`
here, so returning `123` is surfacing the flaw, not hiding it. When
I read `async function foo() { return 123; }`, which shouldn’t compile,
I have to remind myself that it works because of this flaw.

I’m happy to discuss in more detail, and I can provide a specific
example of code that this design flaw makes it impossible to write
cleanly (forcing ugly contortions in the form of boxing) as well as
point to some of the complexity that the flaw forces in real code. I’ll
keep it as is for now, and am fine with it being author’s pick, just as
so many other things are.

Comment on lines +11 to +22
```
const loadCommand: Command = async (args, std) => {
  if (args.length !== 0) {
    die(std, "usage: sourcecred load");
  }
  const baseDir = process.cwd();
  const config = await loadInstanceConfig(baseDir);
  for (const [name, plugin] of config.bundledPlugins) {
    const dirContext = pluginDirectoryContext(baseDir, name);
    plugin.load(dirContext);
  }
  return 0;
};
```
Contributor:

Want to suggest that there's some complecting going on here.

One part is the transport implementation we just so happen to be using (the CLI); the other is a use-case.

Since we're taking a greenfield approach to the CLI now, it's a distinction I'd love to see at some point. Doing so would allow exposing a first-class JavaScript API, similar to the likes of Babel and webpack, while keeping the CLI transport implementation simple.

In this case the distinction would look like:

```
const loadCommand: Command = async (args, std) => {
  // Positional arguments: transport detail.
  if (args.length !== 0) {
    die(std, "usage: sourcecred load");
  }

  // Environment data extraction: transport detail.
  const baseDir = process.cwd();

  // Call the use-case.
  await loadUseCase(baseDir);

  // Exit codes: transport detail.
  return 0;
};

// Suitable to expose as a JavaScript API.
async function loadUseCase(baseDir: string): Promise<void> {
  const config = await loadInstanceConfig(baseDir);
  for (const [name, plugin] of config.bundledPlugins) {
    const dirContext = pluginDirectoryContext(baseDir, name);
    plugin.load(dirContext);
  }
}
```

Contributor:

Based on the dev call today. If we do want to run a local daemon we can send commands over HTTP, I think this will help out.

IPFS and Docker are examples that invested in this structure, as they expose their commands through multiple transports (library, unix sockets and HTTP)

Contributor:

I agree with Beanow that we'll want to expose a functional style for implementing this logic that isn't hard-wired into the CLI. I don't need that to happen in this commit; I think it will be easy to factor out in the future when we need it. Though it could also be nice to have. I approve either way.

Member Author:

Yes, I of course also agree that it’d be great to have a JS API that
backs the CLI and is also exposed. This is basically what I describe at
the end of #945. But I don’t think that defining such an API should
block developing this CLI.

Comment on lines +33 to +35
```
// Shim to interface with `fetchGithubRepo`; TODO: refactor that to just
// take a directory.
class CacheProviderImpl implements CacheProvider {
```
Contributor:

Actually, I'm a huge fan of using these interfaces. It decouples the rest of the code from the implementation details of using filesystems. I'd recommend applying the same idea to configuration and tokens too, with something like a `ConfigProvider` and a `SecretsProvider`.

On the flipside, I would rather not see `fs.readFile` or `process.env` in a use-case.

To give a common example where this practice pays off: many applications use ENV for secrets. However, when using Docker Swarm, ENV is not suitable, as it will be passed around in plaintext. Instead, applications should use "Docker secrets", which most implement by accepting a `*_FILE` suffix on secret names and reading the value from that file path.

With a central `SecretsProvider` implementation, adding this in would be easy and something the core can implement once. With `process.env` scattered across the plugins, every plugin maintainer would have to figure this out.

Contributor:

On the flipside, I would rather not see fs.readFile or process.env in a use-case.

I think we're in agreement that stuff like fs.readFile and process.env should be kept at the outermost layer possible, and decoupled from implementing the logic.

IMO, losing the CacheProvider is not a big deal because we aren't actually supporting other filesystems, so as developers we aren't getting feedback on whether we're (for example) using CacheProvider consistently across the codebase. Without that feedback, there's no particular reason to believe that we will actually be using it consistently when we need that flexibility, or even that this is giving us the right flexibility compared to what we'll actually need when we need it. So we don't benefit much from it, and it does add extra indirection and cognitive overhead.

Member Author:

I’m not really familiar with the way in which you’re using the term
“use-case”. Perhaps you could clarify?

Currently, `fetchGithubRepo` takes a simple `token: GithubToken`. It’s
up to the caller to fetch that, directly from an environment variable or
from a file or from a cast5-decryption or whatever. It’s beyond the
scope of `fetchGithubRepo` to care where that token came from. I don’t
see how introducing a `secretProvider: () => GithubToken` (presumably?)
makes that any simpler.

Base automatically changed from wchargin-cli2-skeleton to master May 29, 2020 01:33
Comment on lines +9 to +11
```
load(PluginDirectoryContext): Promise<void>;
graph(PluginDirectoryContext, ReferenceDetector): Promise<WeightedGraph>;
referenceDetector(PluginDirectoryContext): Promise<ReferenceDetector>;
```
Contributor:

Let's provide a TaskReporter to each of these methods.

```
const config = await loadInstanceConfig(baseDir);
for (const [name, plugin] of config.bundledPlugins) {
  const dirContext = pluginDirectoryContext(baseDir, name);
  plugin.load(dirContext);
```
Contributor:

Let's have the task reporter report that load started for the plugin; that way we have consistent measurements for all the plugins, and the plugin can use the provided task reporter to give more detailed information (e.g., timing on sub-portions).
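A hedged sketch of what such reporting might look like; the `start`/`finish` method names and the `ConsoleTaskReporter` class here are assumptions for illustration, not the project's actual TaskReporter API:

```javascript
// Hypothetical reporter: logs task starts/finishes with elapsed time.
class ConsoleTaskReporter {
  constructor() {
    this._starts = new Map();
  }
  start(task) {
    this._starts.set(task, Date.now());
    console.log(`  GO   ${task}`);
    return this;
  }
  finish(task) {
    const t0 = this._starts.get(task);
    console.log(`  DONE ${task}: ${Date.now() - t0}ms`);
    this._starts.delete(task);
    return this;
  }
}

// The CLI wraps each plugin's load so timing is measured uniformly,
// while the plugin receives the reporter for sub-task detail.
async function loadAll(bundledPlugins, dirContextFor, reporter) {
  for (const [name, plugin] of bundledPlugins) {
    reporter.start(`load/${name}`);
    await plugin.load(dirContextFor(name), reporter);
    reporter.finish(`load/${name}`);
  }
}

// Usage with a stub plugin and a stub directory context:
const stub = {load: async () => {}};
loadAll(
  new Map([["sourcecred/github", stub]]),
  () => ({cacheDirectory: () => "/tmp/cache"}),
  new ConsoleTaskReporter()
);
```

Keeping the `start`/`finish` calls in the CLI loop, rather than inside each plugin, is what makes the per-plugin measurements consistent.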

wchargin-branch: cli2-load
wchargin-source: 529877b5799df838fc42af78c5d56977d79111d8

# Conflicts:
#	src/cli2/sourcecred.js
wchargin-branch: cli2-load
wchargin-source: 529877b5799df838fc42af78c5d56977d79111d8
@wchargin wchargin merged commit 80c3c38 into master May 29, 2020
@wchargin wchargin deleted the wchargin-cli2-load branch May 29, 2020 01:57
wchargin added a commit that referenced this pull request May 29, 2020
Summary:
Paired with @decentralion.

Test Plan:
Follow the test plan for #1810, then additionally run

```
(cd /tmp/test-instance && node "$OLDPWD/bin/sc2.js" graph)
```

and note that the `output/graphs/...` directory has a graph JSON file.

wchargin-branch: cli2-graph
META-DREAMER pushed a commit to META-DREAMER/sourcecred that referenced this pull request Jun 18, 2020
Summary:
This adds a `CliPlugin` interface and a basic implementation for the
GitHub plugin.

Paired with @decentralion.

Test Plan:
Create a new directory `/tmp/test-instance`, with:

```
// sourcecred.json
{"bundledPlugins": ["sourcecred/github"]}

// config/sourcecred/github/config.json
{"repositories": ["sourcecred/example-github"]}
```

Then, run

```
yarn backend &&
(cd /tmp/test-instance && node "$OLDPWD/bin/sc2.js" load)
```

and observe that the new instance has a cache directory containing a
GitHub database.

wchargin-branch: cli2-load
META-DREAMER pushed a commit to META-DREAMER/sourcecred that referenced this pull request Jun 18, 2020
Summary:
Paired with @decentralion.

Test Plan:
Follow the test plan for sourcecred#1810, then additionally run

```
(cd /tmp/test-instance && node "$OLDPWD/bin/sc2.js" graph)
```

and note that the `output/graphs/...` directory has a graph JSON file.

wchargin-branch: cli2-graph