cli2: add sketch of plugin loading #1810
Conversation
Summary: This patch creates a new binary, `./bin/sc2`, which will be the home for a rewrite of the CLI intended to implement an instance system. See: <https://discourse.sourcecred.io/t/sourcecred-instance-system/244> Paired with @decentralion. Test Plan: Run `yarn backend` and `node ./bin/sc2.js`, which should nicely fail with a “not yet implemented” message. wchargin-branch: cli2-skeleton wchargin-source: a76bc7625c5508b876e471f6c8a8f82363f76a12
Summary: This adds a `CliPlugin` interface and a basic implementation for the GitHub plugin. Paired with @decentralion. Test Plan: Create a new directory `/tmp/test-instance`, with: ``` // sourcecred.json {"bundledPlugins": ["sourcecred/github"]} // config/sourcecred/github/config.json {"repositories": ["sourcecred/example-github"]} ``` Then, run ``` yarn backend && (cd /tmp/test-instance && NODE_PATH="$OLDPWD/node_modules/" node "$OLDPWD/bin/sc2.js") ``` and observe that the new instance has a cache directory containing a GitHub database. wchargin-branch: cli2-load wchargin-source: fca19a521cac5f28b52622c3e6e2be8100f76378
wchargin-branch: cli2-load wchargin-source: 0a48febe709d8c9e855a1ff74ae3f2429dcc136d
Force-pushed from fddc669 to 3db9b91.
I'm missing all context for this prototype. So consider my comments as thinking out loud.
I'll defer review to @decentralion
```javascript
// Make a directory, if it doesn't exist.
function mkdirx(path: string) {
  try {
    fs.mkdirSync(path);
  } catch (e) {
    if (e.code !== "EEXIST") {
      throw e;
    }
  }
}
```
Seems like fs-extra's `mkdirp()` would be easier to use here.
That does something else. We don’t want to make all the transitive
parent directories.
```javascript
const pathComponents = [...prefix, pluginOwner, pluginName];
let path = baseDir;
for (const pc of pathComponents) {
  path = pathJoin(path, pc);
  mkdirx(path);
}
```
Seems like fs-extra's `mkdirp()` would be easier to use here.
As above.
```javascript
type JsonObject =
  | string
  | number
  | boolean
  | null
  | JsonObject[]
  | {[string]: JsonObject};
```
Having read your linked article <https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/>: this is pretty nice. I've previously been irked by the `any` that JSON parsing returns. I have tested https://gajus.github.io/flow-runtime for expected server response models and found some issues doing that.
Not sure if I like this library too much, due to fairly opaque babel magic. On the other hand, not having to declare the same type twice (once as a static Flow type, again as a parser) is really handy.
Have you tried any generators? Thoughts on them?
We (@decentralion and I) chatted about this a bit. We chose to write the
correct-style parser here (perhaps because I was driving ;-) ) because
it really does give us much stronger guarantees than the `any`-downcast,
though we definitely aren’t big fans of the verbosity. I’m not super
inclined to reflect Flow types into parsers, but I’m up for abstracting
this with an attoparsec-like or optics-based approach if we think that
that would be helpful. But later, not now; let’s find out how we
actually use these first.
```javascript
): Promise<InstanceConfig> {
  const projectFilePath = pathJoin(baseDir, "sourcecred.json");
  const contents = await fs.readFile(projectFilePath);
  return Promise.resolve(parseConfig(JSON.parse(contents)));
```
Nit: in an async function, `Promise.resolve` is redundant.
Yes, but only because of a JavaScript design flaw that has really
aged like milk with the prevalence of TypeScript and Flow. I prefer to
write the code in the sensible, principled style, even though it’s not
strictly required by the language.
Certainly won't argue with you that Promises have a bunch of design flaws. I've built several client projects with https://github.com/fluture-js/Fluture instead of Promises, and there are a lot of great properties there that I'm sorely missing having gone back to Promises.
On the other hand, the language integration definitely yields productivity benefits over going against the unfortunate standard, and reduces the learning curve.
In this case, `return Promise.resolve()` is a pattern that makes use of one of those Promise design choices: silently unpacking Promises at every opportunity.
In other words:

```javascript
Promise.resolve(Promise.resolve(123))
// Promise { <state>: "fulfilled", <value>: 123 }
```

We get a `Promise<number>`, not a `Promise<Promise<number>>`. A surprising and undesirable property, if you ask me.
My taste preference here is to avoid that surprise, so I generally encourage others not to `return Promise.resolve()`.
Anyhow, if you feel like there's a benefit to going the other way, curious to hear about it. But for this PR, I'm fine with either :]
I prefer the non-redundant style, although I haven't invested the effort to really understand why William avoids it. Since we don't have a lint rule to automatically enforce it, I say it's author's pick on which to use.
That `resolve(resolve(123))` is equivalent to `resolve(123)` is
precisely the design flaw that I’m trying to avoid. If the standard
didn’t have that flaw, then you would have to `return resolve(123)`
here, so returning `123` is surfacing the flaw, not hiding it. When
I read `async function foo() { return 123; }`, which shouldn’t compile,
I have to remind myself that it works because of this flaw.
I’m happy to discuss in more detail, and I can provide a specific
example of code that this design flaw makes it impossible to write
cleanly (forcing ugly contortions in the form of boxing) as well as
point to some of the complexity that the flaw forces in real code. I’ll
keep it as is for now, and am fine with it being author’s pick, just as
so many other things are.
```javascript
const loadCommand: Command = async (args, std) => {
  if (args.length !== 0) {
    die(std, "usage: sourcecred load");
  }
  const baseDir = process.cwd();
  const config = await loadInstanceConfig(baseDir);
  for (const [name, plugin] of config.bundledPlugins) {
    const dirContext = pluginDirectoryContext(baseDir, name);
    plugin.load(dirContext);
  }
  return 0;
};
```
Want to suggest that there's some complecting going on here.
One part is the transport implementation we just so happen to be using (the CLI), and the other is a use-case.
Since we're taking a greenfield approach to the CLI now, it's a distinction I'd love to see at some point. Doing so would allow exposing a first-class JavaScript API, similar to the likes of Babel and webpack, while keeping the CLI transport implementation simple.
In this case the distinction would look like:
```javascript
const loadCommand: Command = async (args, std) => {
  // Positional arguments: transport detail.
  if (args.length !== 0) {
    die(std, "usage: sourcecred load");
  }
  // Environment data extraction: transport detail.
  const baseDir = process.cwd();
  // Call the use-case.
  await loadUseCase(baseDir);
  // Exit codes: transport detail.
  return 0;
};

// Suitable to expose as a JavaScript API.
async function loadUseCase(baseDir: string): Promise<void> {
  const config = await loadInstanceConfig(baseDir);
  for (const [name, plugin] of config.bundledPlugins) {
    const dirContext = pluginDirectoryContext(baseDir, name);
    plugin.load(dirContext);
  }
}
```
Based on the dev call today: if we do want to run a local daemon, we can send commands to it over HTTP, and I think this structure will help with that.
IPFS and Docker are examples that invested in this structure, as they expose their commands through multiple transports (library, Unix sockets, and HTTP).
I agree with Beanow that we'll want to expose a functional style for implementing this logic that isn't hard-wired into the CLI. I don't need that to happen in this commit--I think it will be easy to factor out in the future when we need it. Though it could also be nice to have. I approve either way.
Yes, I of course also agree that it’d be great to have a JS API that
backs the CLI and is also exposed. This is basically what I describe at
the end of #945. But I don’t think that defining such an API should
block developing this CLI.
```javascript
// Shim to interface with `fetchGithubRepo`; TODO: refactor that to just
// take a directory.
class CacheProviderImpl implements CacheProvider {
```
Actually, I'm a huge fan of using these interfaces. It decouples the rest of the code from the implementation details of using filesystems. I would recommend applying the same idea to the configuration and tokens too, like a `ConfigProvider` and a `SecretsProvider`.
On the flip side, I would rather not see `fs.readFile` or `process.env` in a use-case.
To give a common example where this practice pays off: many applications use ENV for secrets. However, when using Docker Swarm, ENV is not suitable, as it will be passed around in plaintext. Instead, applications should use "Docker Secrets", which most applications implement by accepting a `*_FILE` suffix on secret names and reading the secret from that file path.
With a central `SecretsProvider` implementation, adding this in would be easy and something the core can implement once. With `process.env` scattered across the plugins, every plugin maintainer has to figure this out.
> On the flipside, I would rather not see fs.readFile or process.env in a use-case.

I think we're in agreement that stuff like `fs.readFile` and `process.env` should be kept at the outermost layer possible, and decoupled from implementing the logic.
IMO, losing the CacheProvider is not a big deal because we aren't actually supporting other filesystems, so as developers we aren't getting feedback on whether we're (for example) using CacheProvider consistently across the codebase. Without that feedback, there's no particular reason to believe that we will actually be using it consistently when we need that flexibility, or even that this is giving us the right flexibility compared to what we'll actually need when we need it. So we don't benefit much from it, and it does add extra indirection and cognitive overhead.
I’m not really familiar with the way in which you’re using the term
use-case. Perhaps you could clarify?
Currently, `fetchGithubRepo` takes a simple `token: GithubToken`. It’s
up to the caller to fetch that, directly from an environment variable or
from a file or from a cast5-decryption or whatever. It’s beyond the
scope of `fetchGithubRepo` to care where that token came from. I don’t
see how introducing a `secretProvider: () => GithubToken` (presumably?)
makes that any simpler.
```javascript
load(PluginDirectoryContext): Promise<void>;
graph(PluginDirectoryContext, ReferenceDetector): Promise<WeightedGraph>;
referenceDetector(PluginDirectoryContext): Promise<ReferenceDetector>;
```
Let's provide a `TaskReporter` to each of these methods.
```javascript
const config = await loadInstanceConfig(baseDir);
for (const [name, plugin] of config.bundledPlugins) {
  const dirContext = pluginDirectoryContext(baseDir, name);
  plugin.load(dirContext);
```
Let's have the task reporter report that load started for the plugin--that way we have consistent measurements for all the plugins, and the plugin can use the provided task reporter to give more detailed information (e.g. timing on sub-portions)
wchargin-branch: cli2-load wchargin-source: 529877b5799df838fc42af78c5d56977d79111d8 # Conflicts: # src/cli2/sourcecred.js
wchargin-branch: cli2-load wchargin-source: 529877b5799df838fc42af78c5d56977d79111d8
Summary: Paired with @decentralion. Test Plan: Follow the test plan for #1810, then additionally run ``` (cd /tmp/test-instance && node "$OLDPWD/bin/sc2.js" graph) ``` and note that the `output/graphs/...` directory has a graph JSON file. wchargin-branch: cli2-graph
Summary: This adds a `CliPlugin` interface and a basic implementation for the GitHub plugin. Paired with @decentralion. Test Plan: Create a new directory `/tmp/test-instance`, with: ``` // sourcecred.json {"bundledPlugins": ["sourcecred/github"]} // config/sourcecred/github/config.json {"repositories": ["sourcecred/example-github"]} ``` Then, run ``` yarn backend && (cd /tmp/test-instance && node "$OLDPWD/bin/sc2.js" load) ``` and observe that the new instance has a cache directory containing a GitHub database. wchargin-branch: cli2-load