gnd: Support multiple subgraphs, grafting, subgraph composition in dev mode #6000

Merged
merged 19 commits into krishna/gnd-base from krishna/graph-dev-composition-2
Jun 10, 2025

Conversation

incrypto32
Member

@incrypto32 incrypto32 commented May 12, 2025

Supporting multiple subgraphs in dev mode is a bit tricky because FileLinkResolver needs a base directory. This works well for a single subgraph, since we can scope the resolver to that subgraph's build directory, but with multiple subgraphs we need to set the base_dir of the FileLinkResolver dynamically. This PR implements the mechanisms for that.

This is done with a link_resolver_override that is passed up to the subgraph runner, which switches to it when available.

To support subgraph data sources, aliases are used: users declare their subgraph data sources in the manifest as usual, as in the example below.

dataSources:
  - kind: subgraph
    name: Factory
    network: base
    source:
      address: 'QmSource'
      startBlock: 1759510

When they run gnd, they have to pass the flag

--source "QmSource:<PATH_TO_MANIFEST_OF_SOURCE_SUBGRAPH>"
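As a rough illustration of how such `ALIAS:PATH` values could be turned into an alias map, here is a minimal sketch. `parse_source` and `parse_sources` are hypothetical names used only for this example; they are not taken from the gnd code, and the actual CLI parsing may differ.

```rust
use std::collections::HashMap;
use std::path::PathBuf;

/// Hypothetical helper: parse one `--source` value of the form `ALIAS:PATH`.
/// Splits on the first ':' only, so the path part may itself contain colons.
fn parse_source(arg: &str) -> Result<(String, PathBuf), String> {
    match arg.split_once(':') {
        Some((alias, path)) if !alias.is_empty() && !path.is_empty() => {
            Ok((alias.to_string(), PathBuf::from(path)))
        }
        _ => Err(format!("expected ALIAS:PATH, got `{arg}`")),
    }
}

/// Hypothetical helper: collect all `--source` values into an
/// alias -> manifest path map, failing on the first malformed value.
fn parse_sources(args: &[&str]) -> Result<HashMap<String, PathBuf>, String> {
    args.iter().map(|a| parse_source(a)).collect()
}
```

With this sketch, `parse_source("QmSource:../source/subgraph.yaml")` yields the alias `QmSource` paired with the relative manifest path, and a value without a colon is rejected.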

@incrypto32 incrypto32 changed the base branch from master to krishna/graph-dev May 12, 2025 12:12
@incrypto32 incrypto32 changed the title gnd: Support subgraph composition in dev mode gnd: Support multiple subgraphs, grafting, subgraph composition in dev mode May 12, 2025
@incrypto32 incrypto32 force-pushed the krishna/graph-dev-composition-2 branch from 0826b2a to b5bbf93 Compare May 12, 2025 14:54
@incrypto32 incrypto32 requested a review from lutter May 13, 2025 05:28
@incrypto32 incrypto32 mentioned this pull request May 16, 2025
@incrypto32 incrypto32 self-assigned this May 19, 2025
Collaborator

@lutter lutter left a comment

This is nice; I feel that we need a more general concept around finding related subgraphs in the code, maybe some kind of LinkMapper that is pretty much a noop for IPFS, and knows where different subgraphs are in the filesystem for dev mode. If we had that, I think we could set these up in the respective main methods, and wouldn't have to guess as much in the link resolver.

// The manifest path is the path of the subgraph manifest file in the build directory
// We use the parent directory as the base directory for the new resolver
let base_dir = canonical_manifest_path
.parent()
Collaborator

I am a bit confused now, isn't this basically base_dir/deployment_str/.., i.e. base_dir ?

Member Author

@incrypto32 incrypto32 May 21, 2025

deployment_str can be something like "../subgraph2/subgraph.yaml". In that case the new base_dir is the parent of "base_dir/../subgraph2/subgraph.yaml", which is "../subgraph2" relative to base_dir.

When deployment_str is an absolute path, it is simply the directory that contains the subgraph.yaml.
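The two cases described above can be sketched with plain `std::path` operations. `base_dir_for_manifest` is a hypothetical name for this example, and unlike the snippet under review it does not canonicalize the path, so `..` components are left in place.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: derive the base directory for a subgraph's resolver
/// from the current base directory and a manifest path string.
fn base_dir_for_manifest(current_base: &Path, manifest: &str) -> Option<PathBuf> {
    // `Path::join` discards `current_base` when `manifest` is absolute,
    // which covers the absolute-path case without a separate branch.
    let manifest_path = current_base.join(manifest);
    // The directory containing the manifest becomes the new base directory.
    manifest_path.parent().map(Path::to_path_buf)
}
```

For a relative manifest such as `../subgraph2/subgraph.yaml` the result is `<current_base>/../subgraph2`; for an absolute manifest path it is just the directory containing that file.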

Collaborator

Isn't deployment_str just the deployment hash because of the line let deployment_str = deployment.to_string(); ?

/// For other resolvers, this method will simply return the current resolver
/// This is required because paths mentioned in the subgraph manifest are relative paths
/// and we need a new resolver with the right base directory for the specific subgraph
fn for_deployment(&self, deployment: DeploymentHash) -> Result<Box<dyn LinkResolver>, Error>;
Collaborator

Reading the comment, it might make sense to rather have a method for_manifest(manifest_path: PathBuf). By using the deployment here, we are still encoding a convention about where deployments live relative to each other. It would be nicer if that convention was only used to find manifests and everything else follows from there. Worst case, we could even store the path from which a manifest was read in the manifest itself to make it easier to get the path wherever it is needed. In a way, that would nicely express what changes with this PR: we used to know implicitly where manifests live because they were all on IPFS under their hash. Now, a manifest can live there, but also in the filesystem, and I think we should just express that explicitly.

Member Author

I changed the method signature to fn for_manifest(&self, manifest: DeploymentHash). I had to keep the type as DeploymentHash since it can also be an alias name; it's not strictly a file path.

Member Author

fn for_manifest(&self, manifest_path: &str)
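To make the intent of this signature concrete, here is a simplified sketch of how the two kinds of resolver might implement it. The trait below is a cut-down stand-in, not the real `LinkResolver` trait from graph-node: the `String` error type, the `base_dir` accessor, and the struct names `IpfsResolver` and `FileResolver` are assumptions made for the example.

```rust
use std::path::{Path, PathBuf};

/// Cut-down stand-in for the real trait; only the method under discussion.
trait LinkResolver {
    fn for_manifest(&self, manifest_path: &str) -> Result<Box<dyn LinkResolver>, String>;
    /// Hypothetical accessor, present only so the example can be inspected.
    fn base_dir(&self) -> Option<&Path>;
}

/// IPFS content is addressed by hash, so no re-scoping is needed.
struct IpfsResolver;

impl LinkResolver for IpfsResolver {
    fn for_manifest(&self, _manifest_path: &str) -> Result<Box<dyn LinkResolver>, String> {
        Ok(Box::new(IpfsResolver))
    }
    fn base_dir(&self) -> Option<&Path> {
        None
    }
}

/// File-based resolver scoped to a base directory.
struct FileResolver {
    base_dir: Option<PathBuf>,
}

impl LinkResolver for FileResolver {
    fn for_manifest(&self, manifest_path: &str) -> Result<Box<dyn LinkResolver>, String> {
        // Resolve the manifest against the current base directory, if any.
        let full = match &self.base_dir {
            Some(base) => base.join(manifest_path),
            None => PathBuf::from(manifest_path),
        };
        // Relative paths in that manifest are resolved from its directory.
        let parent = full
            .parent()
            .ok_or_else(|| format!("no parent directory for `{manifest_path}`"))?;
        Ok(Box::new(FileResolver {
            base_dir: Some(parent.to_path_buf()),
        }))
    }
    fn base_dir(&self) -> Option<&Path> {
        self.base_dir.as_deref()
    }
}
```

Callers only see `Box<dyn LinkResolver>`, so the subgraph runner does not need to know whether it is in dev mode; the file-based variant simply hands back a resolver scoped to the manifest's directory.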

}

impl FileLinkResolver {
/// Create a new FileLinkResolver
///
/// All paths are treated as absolute paths.
pub fn new() -> Self {
pub fn new(base_dir: Option<PathBuf>, aliases: HashMap<String, PathBuf>) -> Self {
Collaborator

It would be clearer if aliases was a HashMap<DeploymentHash, PathBuf> which is really what it expresses: where in the filesystem one would find the files for a given deployment.

Member Author

I find it a bit strange to do that, because aliases is a mapping from the String aliases that the user provided as CLI arguments. Yes, they do get converted into DeploymentHash, but with the recent change to

fn for_manifest(&self, manifest_path: &str) from fn for_deployment(&self, deployment: DeploymentHash)

the resolver works with plain strings anyway.

Also, in the cat method we would need to convert the Link back to a DeploymentHash, which is a bit weird, since the Link was created by calling to_ipfs_link on DeploymentHash. In cat we use this to resolve aliases properly.
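For readers following along, alias resolution with string keys could look roughly like the sketch below. It is an assumption-laden illustration, not the code in this PR: the struct layout mirrors the `new(base_dir, aliases)` constructor shown above, while `resolve_path` here is a simplified stand-in and the prefix handling is inferred from the "/ipfs/" commit messages.

```rust
use std::collections::HashMap;
use std::path::PathBuf;

const IPFS_PREFIX: &str = "/ipfs/";

/// Simplified stand-in for the file-based resolver.
struct FileResolver {
    base_dir: Option<PathBuf>,
    aliases: HashMap<String, PathBuf>,
}

impl FileResolver {
    /// Map a link string to a filesystem path (illustrative only).
    fn resolve_path(&self, link: &str) -> PathBuf {
        // Links built from a deployment hash carry an "/ipfs/" prefix.
        let name = link.strip_prefix(IPFS_PREFIX).unwrap_or(link);
        // An alias wins over base-directory resolution.
        if let Some(path) = self.aliases.get(name) {
            return path.clone();
        }
        // Otherwise resolve relative to the base directory, if one is set.
        match &self.base_dir {
            Some(base) => base.join(name),
            None => PathBuf::from(name),
        }
    }
}
```

Because the lookup key is the string left after stripping the prefix, no round trip through `DeploymentHash` is needed in this version.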

@incrypto32 incrypto32 force-pushed the krishna/graph-dev-composition-2 branch from a8a21a3 to bc4b351 Compare June 10, 2025 11:54
@incrypto32 incrypto32 force-pushed the krishna/graph-dev-composition-2 branch from bc4b351 to b349d60 Compare June 10, 2025 12:08
@incrypto32 incrypto32 changed the base branch from krishna/graph-dev to krishna/gnd-base June 10, 2025 12:09
@incrypto32 incrypto32 merged commit 655764d into krishna/gnd-base Jun 10, 2025
6 checks passed
incrypto32 added a commit that referenced this pull request Jul 16, 2025
…v mode (#6000)

* graph: Add clone_for_deployment to FileLinkResolver to create  FileLinkResolver with the right base dir for a subgraph

* graph: Add for_deployment to LinkResolverTrait

* core, graph: use for_deployment to get properly scoped resolver

* graph: Implement aliases for file link resolver

* node: Make gnd work with multiple subgraphs

* node: Support subgraph datasource in gnd

* node: correct the default value for manfiest

* core, node, graph: Ignore graft base in dev mode

* node: Allow providing a postgres url for gnd

* node: Do not use pgtemp in windows

* store: enable `vendored` feature for openssl crate

* chain/ethereum: Return error when ipc is used in non unix platform

* node: Refactor launcher

* node/dev : Better error message when database directory doesn't exist

* node: refactor watcher

* core, node, graph: Manipulate raw manifest instead of passing
ignore_graft_base

This reverts commit b5bbf93.

* node: Correct comments on `redeploy_all_subgraphs`

* node/gnd: Deploy all subgraphs first before wathcing files

* core, graph : Refactor LinkResolver trait
incrypto32 added a commit that referenced this pull request Jul 18, 2025
* Refactor main function (#5980)

* node: Refactor main execution flow and introduce launcher module

* node/launcher: extract setup_configuration helper  from run

* node/launcher: extract setup_metrics helper  from run

* node/launcher: extract setup_store helper  from run

* node/launcher: extract build_blockchain_map helper  from run

* node/launcher: extract cleanup_ethereum_shallow_blocks helper  from run

* node/launcher: extract spawn_block_ingestor helper  from run

* node/launcher: extract deploy_subgraph_from_flag helper  from run

* node/launcher: extract spawn_contention_checker helper  from run

* node/launcher: extract build_graphql_server helper  from run

* node/launcher: extract build_subgraph_registrar helper  from run

* Implement a File Link Resolver (#5981)

* graph: Add a new FIleLinkResolver

* graph: remove `/ipfs/` prefix when using file link resolver

* graph: Implement custom deserialise logic for Link to enable file link resolver

* tests: Add runner test that uses file link resolver

* graph: Conditionally disable deployment hash validation based on env var

* graph: use constant for "/ipfs/" prefix in `remove_prefix`

* graph: Simplify resolve_path by removing redundant path.is_absolute() check

* graph: Remove leftover println from file_resolver tests

* tests: Refactor runner tests extract test utils into recipe.rs

* tests: Add a test for file_link_resolver

* Graph node dev mode (#5982)

* node: Create a new binary for graph node dev mode

* graph, store: Add unassign_subgraph method to SubgraphStore

* node: Add helpers for graph node dev for  subgraph management

* node: Add helper functions for watching files in dev mode

* node: Wire file watching in dev mode to redeploy subgraphs

* node: fix formatting

* gnd: Support multiple subgraphs, grafting, subgraph composition in dev mode (#6000)

* graph: Add clone_for_deployment to FileLinkResolver to create  FileLinkResolver with the right base dir for a subgraph

* graph: Add for_deployment to LinkResolverTrait

* core, graph: use for_deployment to get properly scoped resolver

* graph: Implement aliases for file link resolver

* node: Make gnd work with multiple subgraphs

* node: Support subgraph datasource in gnd

* node: correct the default value for manfiest

* core, node, graph: Ignore graft base in dev mode

* node: Allow providing a postgres url for gnd

* node: Do not use pgtemp in windows

* store: enable `vendored` feature for openssl crate

* chain/ethereum: Return error when ipc is used in non unix platform

* node: Refactor launcher

* node/dev : Better error message when database directory doesn't exist

* node: refactor watcher

* core, node, graph: Manipulate raw manifest instead of passing
ignore_graft_base

This reverts commit b5bbf93.

* node: Correct comments on `redeploy_all_subgraphs`

* node/gnd: Deploy all subgraphs first before wathcing files

* core, graph : Refactor LinkResolver trait

* Workflow to build the gnd binary (#6013)

* .github: Create a workflow for building gnd binaries

* .github: Codesign gnd binary for macOs

* .github: notarize gnd binary for macOs

* gnd: Integration tests (#6035)

* node/gnd: Make ports configurable

* node/gnd: Deploy all subgraphs on startup

* tests: Refactor subgraph datasources in TestCase

* tests: refactor Testcase method for source subgraphs

* tests: Add integration tests for gnd

* store: Use bundled pq-sys

* gnd: remove temp database directory on exit

* gnd: use pgtemp from graphprotocol org

* gnd: add alias for pgtemp db for windows

* gnd: use deep codesigning for macos binaries

* update workflow to add entitlements.plist