Persistent subgraphs #513

Merged
merged 12 commits into from
Oct 31, 2018

Contributor

@timmclean timmclean commented Oct 23, 2018

I believe with this PR we can leave ethsf behind?

Resolves #404


Major changes:

  • Schemas, SubgraphInstances, BlockStreams, etc. no longer have a subgraph name associated with them, just a SubgraphId.
  • Store has 7 new methods related to reading/writing/deleting subgraph name->ID mappings, access tokens, etc. This information was previously stored in-memory in graph-node.
  • Components that used to track subgraph name->ID mappings in-memory now have an Arc<Store> and have been rewritten to use that (e.g. JSON RPC service, SubgraphProvider deploy/remove)
  • SubgraphProvider::new is now SubgraphProvider::init. It retrieves a list of subgraph names from the Store and starts those subgraphs.
  • MockStore has been extended with mock implementations for some of the new methods. MockStore is used in a bunch of new places now that more components use a Store.

SubgraphProvider deploy and remove are hopefully pretty readable. deploy is essentially:

  • Resolve the subgraph manifest
  • Call SubgraphProvider::remove to undeploy the subgraph name if it already exists
  • Add subgraph ID directives
  • Write this subgraph name->ID mapping to the Store
  • Look for other names in the Store pointing at this ID. If there are none, start processing this subgraph; otherwise, done (the subgraph is already running).

remove is essentially:

  • Read the subgraph name->ID mapping from the Store
  • Delete the name->ID mapping in the Store
  • Look for other names in the Store pointing at this ID. If there are none, shut down processing on this subgraph; otherwise, done (leave the subgraph running).
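
For readers skimming the thread, the deploy and remove flows described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `NameStore`, `Event`, and the free functions are hypothetical names, the Store is replaced by an in-memory `HashMap`, and manifest resolution and the subgraph ID directives are left out.

```rust
use std::collections::HashMap;

/// Hypothetical events telling the caller what to do with a subgraph ID.
#[derive(Debug, PartialEq)]
enum Event {
    Start(String),
    Stop(String),
}

/// Hypothetical in-memory stand-in for the name->ID table kept in the Store.
#[derive(Default)]
struct NameStore {
    names: HashMap<String, String>, // subgraph name -> subgraph ID
}

impl NameStore {
    /// True if at least one name currently points at `id`.
    fn is_referenced(&self, id: &str) -> bool {
        self.names.values().any(|other| other == id)
    }
}

/// Delete the mapping for `name`; stop the subgraph only if no other name
/// still points at the same ID.
fn remove(store: &mut NameStore, name: &str) -> Vec<Event> {
    let id = match store.names.remove(name) {
        Some(id) => id,
        None => return Vec::new(), // unknown name: nothing to do in this sketch
    };
    if store.is_referenced(&id) {
        Vec::new()
    } else {
        vec![Event::Stop(id)]
    }
}

/// Point `name` at `id`; start the subgraph only if no other name already
/// points at the same ID.
fn deploy(store: &mut NameStore, name: &str, id: &str) -> Vec<Event> {
    // Undeploy the name first if it already exists.
    let mut events = remove(store, name);
    let already_running = store.is_referenced(id);
    store.names.insert(name.to_string(), id.to_string());
    if !already_running {
        events.push(Event::Start(id.to_string()));
    }
    events
}
```

In this sketch, deploying a second name for an ID that is already running produces no events, and removing a name only produces `Stop` once the last name pointing at that ID is gone.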

@timmclean
Contributor Author

timmclean commented Oct 23, 2018

Opening this up for code review now since there's a lot here, despite the fact that it's not done. Most of the work from the last few days has been trying to remove the assumption that each subgraph has one and only one name... would be interested in some feedback on whether that's the right approach. @Jannis

Update: done now!

@timmclean timmclean force-pushed the tim/persistent-subgraphs branch 2 times, most recently from ce01583 to 4141d40 Compare October 27, 2018 00:32
@timmclean timmclean changed the title WIP persistent subgraphs Persistent subgraphs Oct 27, 2018
@timmclean timmclean added this to the Hosted Service milestone Oct 29, 2018
@leoyvens
Collaborator

In an architecture with a dedicated name service the node would not be managing names at all, but we still want built-in name management for standalone nodes. To make this mode switching easier, what do you think of extracting the name logic in SubgraphProvider to a SubgraphNames component (or whatever name you prefer)? SubgraphNames would have methods analogous to deploy and remove that would return whether or not the subgraph should be deployed or removed. When running with a dummy SubgraphNames, the name is completely ignored and the subgraph is always deployed or removed. @timmclean makes sense? What do you think?
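
One possible shape for that suggestion, as a rough sketch only: the trait signature, `DummyNames`, and `LocalNames` below are assumptions made for illustration and are not taken from the PR. The in-memory map stands in for the Store, and re-pointing an existing name at a different ID is not handled.

```rust
use std::collections::HashMap;

/// Hypothetical trait: each method reports whether the caller should go on
/// to actually deploy or remove the underlying subgraph.
trait SubgraphNames {
    fn deploy(&mut self, name: &str, id: &str) -> bool;
    fn remove(&mut self, name: &str) -> bool;
}

/// Dummy implementation for nodes that delegate naming to an external
/// service: names are ignored and the answer is always "yes".
struct DummyNames;

impl SubgraphNames for DummyNames {
    fn deploy(&mut self, _name: &str, _id: &str) -> bool {
        true
    }
    fn remove(&mut self, _name: &str) -> bool {
        true
    }
}

/// Built-in name management for standalone nodes.
#[derive(Default)]
struct LocalNames {
    names: HashMap<String, String>, // subgraph name -> subgraph ID
}

impl SubgraphNames for LocalNames {
    fn deploy(&mut self, name: &str, id: &str) -> bool {
        // Deploy only if no name pointed at this ID before the write.
        let already_running = self.names.values().any(|other| other == id);
        self.names.insert(name.to_string(), id.to_string());
        !already_running
    }

    fn remove(&mut self, name: &str) -> bool {
        match self.names.remove(name) {
            // Remove the subgraph only if no other name still points at it.
            Some(id) => !self.names.values().any(|other| *other == id),
            None => false,
        }
    }
}
```

Switching modes would then come down to which implementation is handed to the provider.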

@timmclean
Contributor Author

Ah interesting. That could work. What would this look like externally? Right now I believe the JSON RPC service has deploy/remove/authorize/list. Should we use the existing verbs, revise them to make them more flexible, or add a separate JSON RPC service with a new set?

@leoyvens
Collaborator

@timmclean I wouldn't touch that layer at all yet, we're gonna need to figure that out later (hopefully it will be graphql mutations rather than json-rpc) but for now I just wanna make sure it will be clean and easy to turn off the code that writes and removes names.

@timmclean
Contributor Author

So how would deploy and remove work in SubgraphProvider in that case? They both have name arguments... would we pass an empty string? None?

@timmclean
Contributor Author

timmclean commented Oct 29, 2018

Planned refactor for separating name functionality from SubgraphProvider:

  • In SubgraphProvider:
    • Does not depend on Store.
    • Track a HashSet of SubgraphIds that are currently running in this graph node instance.
    • deploy(name, link) will become start(link). Error if subgraph ID already in the HashSet. Add ID to HashSet.
    • remove(name) will become stop(id). Error if subgraph ID not in the HashSet. Remove ID from HashSet.
    • Emits events just like current SubgraphProvider.
  • New component: SubgraphProviderWithNames
    • Does depend on Store.
    • Used by JSON RPC service.
    • deploy(name, link): same functionality as current SubgraphProvider::deploy(name, link)
    • remove(name): same functionality as current SubgraphProvider::remove(name)
    • deploy and remove call start and stop on the SubgraphProvider as needed.
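
The name-free half of this plan could look roughly like the sketch below. It is an illustration under assumptions: `Provider` and `ProviderError` are hypothetical names, `start` takes an ID directly instead of resolving a link to a manifest, and event emission is omitted.

```rust
use std::collections::HashSet;

/// Hypothetical error type for the two failure cases named in the plan.
#[derive(Debug, PartialEq)]
enum ProviderError {
    AlreadyRunning(String),
    NotRunning(String),
}

/// Tracks which subgraph IDs are running in this node; no Store dependency.
#[derive(Default)]
struct Provider {
    running: HashSet<String>,
}

impl Provider {
    /// Error if the ID is already in the set, otherwise add it.
    fn start(&mut self, id: &str) -> Result<(), ProviderError> {
        // HashSet::insert returns false when the value was already present.
        if self.running.insert(id.to_string()) {
            Ok(())
        } else {
            Err(ProviderError::AlreadyRunning(id.to_string()))
        }
    }

    /// Error if the ID is not in the set, otherwise remove it.
    fn stop(&mut self, id: &str) -> Result<(), ProviderError> {
        // HashSet::remove returns false when the value was not present.
        if self.running.remove(id) {
            Ok(())
        } else {
            Err(ProviderError::NotRunning(id.to_string()))
        }
    }
}
```

A name-aware wrapper would sit on top of this and call `start` and `stop` only when its name bookkeeping says the set of running IDs should change.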

@leodasvacas

@leoyvens
Collaborator

leoyvens commented Oct 29, 2018

@timmclean thanks for writing out the plan, looks solid!

@timmclean
Contributor Author

Refactor done!

Collaborator

@leoyvens leoyvens left a comment

Overall this is great! Left some comments.

core/src/subgraph/provider_with_names.rs (outdated, resolved)
graph/src/components/subgraph/provider.rs (outdated, resolved)
core/src/subgraph/provider.rs (outdated, resolved)
core/src/subgraph/provider.rs (outdated, resolved)
core/src/subgraph/provider_with_names.rs (outdated, resolved)
graph/src/components/store.rs (resolved)
server/http/src/service.rs (outdated, resolved)
graph/src/components/store.rs (resolved)
@timmclean
Contributor Author

timmclean commented Oct 30, 2018

@leodasvacas I pushed some commits, how does this look?

@timmclean
Contributor Author

Latest commit adds separate endpoints to server/http and server/websocket for accessing subgraphs by ID or name. E.g. http://localhost:8000/by-name/adchain or http://localhost:8000/by-id/QmaMUYb8GBZBrK6664P822xYyfN2nqR5VVfpkWFyvtwyD9
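
As a rough illustration of the URL scheme, a path could be classified along these lines. `SubgraphRef` and `parse_subgraph_path` are hypothetical names, not the routing code in server/http, and the sketch assumes exactly two path segments.

```rust
/// Hypothetical result of classifying a request path.
#[derive(Debug, PartialEq)]
enum SubgraphRef<'a> {
    ByName(&'a str),
    ById(&'a str),
}

/// Map "/by-name/<name>" or "/by-id/<id>" to a SubgraphRef.
/// Any other shape returns None.
fn parse_subgraph_path(path: &str) -> Option<SubgraphRef<'_>> {
    let mut segments = path.trim_matches('/').split('/');
    let kind = segments.next()?;
    let value = segments.next()?;
    // Reject empty values and any extra path segments.
    if value.is_empty() || segments.next().is_some() {
        return None;
    }
    match kind {
        "by-name" => Some(SubgraphRef::ByName(value)),
        "by-id" => Some(SubgraphRef::ById(value)),
        _ => None,
    }
}
```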

Collaborator

@leoyvens leoyvens left a comment

I like the new url scheme!

service.handle_not_found()
},
),
)
Collaborator

We now print urls on deploy but I guess it's still nice to have this back. Could you also bring back the comment that explains what this does?

Contributor Author

Oh true, yeah, I don't feel too strongly about having a redirect or not! I was tempted to put in a quick plaintext list of subgraph names...

Comment added though

server/websocket/src/server.rs (outdated, resolved)
leoyvens previously approved these changes Oct 31, 2018
Contributor

@Jannis Jannis left a comment

Great! A few small comments, some just notes, a couple changes I'd like to see.

.event_sink
.clone()
.send(SubgraphProviderEvent::SubgraphStop(id))
.map_err(|e| panic!("failed to forward subgraph removal: {}", e))
Contributor

Is removal still the best word here?

logger.clone(),
Arc::new(subgraph_provider),
store.clone(),
).wait()
Contributor

This wait... can it not cause freezing (e.g. if something inside init were to spawn something)?

Contributor Author

Based on my current level of tokio understanding, I think it's fine... async_main is already running inside the context of a tokio runtime, and tokio can use other threads

@leodasvacas ?

.header(
header::LOCATION,
header::HeaderValue::from_str(destination)
.expect("invalid redirect destination"),
Contributor

Is this something that should make the node panic?


return Err(json_rpc_error(
JSON_RPC_UNAUTHORIZED_ERROR,
"API key is invalid".to_owned(),
Contributor

This should probably be "Invalid access token", since "API key" is not what we call this any more.

Contributor Author

makes sense!

@@ -160,36 +196,62 @@ impl<T: SubgraphProvider> JsonRpcServer<T> {
) -> Result<Value, jsonrpc_core::Error> {
info!(self.logger, "Received subgraph_authorize request"; "params" => params.to_string());
Self::require_master_token(auth)?;
*self.subgraph_api_keys.write().unwrap() = params.subgraph_api_keys;

for (subgraph_name, access_token) in params.subgraph_api_keys {
Contributor

Let's rename subgraph_api_keys to subgraph_access_tokens as well.

Contributor

This is a breaking change in the JSON-RPC API but that's fine IMHO.

server/websocket/src/server.rs (outdated, resolved)
@@ -434,6 +434,104 @@ impl Store {
}

impl StoreTrait for Store {
fn authorize_subgraph_name(&self, name: String, new_access_token: String) -> Result<(), Error> {
Contributor

We're putting a bit much into the Store. At some point it will be worth thinking about splitting this up into separate traits and separate implementations. But not right now.

Contributor Author

Agreed. We should have a separate store component for subgraph name info

None => {
debug!(
self.logger,
"Subgraph name {:?} has no associated access token. Access denied by default.",
Contributor

This should use structured logging:

debug!(
  self.logger,
  "Subgraph has no associated access token. Access denied by default.";
  "subgraph_name" => &subgraph_name
);

Also: you're passing in access_token instead of subgraph_name anyway. 😉
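
For context, the deny-by-default behaviour in the snippet under review could be sketched like this. `AccessError` and `check_access` are hypothetical names, the token table is a plain `HashMap` rather than the Store, and the comparison is a simple equality check with no constant-time guarantee. The error for a missing token carries the subgraph name, which is the value the log line is meant to report.

```rust
use std::collections::HashMap;

/// Hypothetical error type for the two ways a check can fail.
#[derive(Debug, PartialEq)]
enum AccessError {
    /// No token is stored for this subgraph name: access denied by default.
    NoTokenConfigured { subgraph_name: String },
    /// A token is stored but does not match the one presented.
    InvalidToken,
}

/// Check `presented` against the token stored for `subgraph_name`.
fn check_access(
    tokens: &HashMap<String, String>, // subgraph name -> access token
    subgraph_name: &str,
    presented: &str,
) -> Result<(), AccessError> {
    match tokens.get(subgraph_name) {
        Some(expected) if expected == presented => Ok(()),
        Some(_) => Err(AccessError::InvalidToken),
        None => Err(AccessError::NoTokenConfigured {
            subgraph_name: subgraph_name.to_string(),
        }),
    }
}
```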

@timmclean
Contributor Author

I think I got everything!

@timmclean timmclean merged commit cf98685 into master Oct 31, 2018
@timmclean timmclean deleted the tim/persistent-subgraphs branch October 31, 2018 22:59