Visualize the dependencies as a graph #13283

Open
yoav-orca opened this issue Oct 17, 2021 · 17 comments

@yoav-orca
Contributor

Is your feature request related to a problem? Please describe.
It's very useful to understand the Pants dependency structure, specifically when looking at transitive and 3rd-party dependencies. Currently, it's not possible to generate a graph view of this information.

Describe the solution you'd like
I would like to be able to generate a Graphviz graph of the dependency graph.

Describe alternatives you've considered
I have not considered alternatives

@benjyw
Sponsor Contributor

benjyw commented Oct 17, 2021

To clarify - you can look at internal dependencies (including transitively) today, and you can see your immediate (non-transitive) 3rdparty dependencies today. So it sounds like what you're asking for is to be able to see transitive 3rdparty?

@benjyw benjyw self-assigned this Oct 17, 2021
@benjyw
Sponsor Contributor

benjyw commented Oct 17, 2021

And, separately, you're asking for dot output?

@yoav-orca
Contributor Author

How can I see internal dependencies as a graph (dot output)? I would also like to see this extended to 3rd-party dependencies.

@benjyw
Sponsor Contributor

benjyw commented Oct 18, 2021

Today you would take the JSON output of ./pants peek :: and have a script convert it to the dot format you want. Admittedly this isn't ideal; we could look into adding a dot output format to peek.

We are also looking into visualization using d3, will announce more when we have something.
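A minimal sketch of the kind of conversion script described above. It assumes the peek JSON is a list of objects that each carry an `address` string and an optional `dependencies` list of address strings; the exact shape of the peek output may differ between Pants versions, so treat the field names as assumptions. The function name `peek_json_to_dot` is made up for this example.

```python
import json


def quote(text: str) -> str:
    """Wrap text in double quotes so it is a valid DOT identifier."""
    return '"' + text.replace('"', '\\"') + '"'


def peek_json_to_dot(peek_output: str) -> str:
    """Convert peek-style JSON (assumed: a list of target objects) to DOT text."""
    targets = json.loads(peek_output)
    lines = ["digraph dependencies {"]
    for target in targets:
        address = target["address"]  # assumed field name
        # Declare the node so targets without dependencies still show up.
        lines.append(f"  {quote(address)};")
        # "dependencies" is assumed to be a list of addresses, or missing/null.
        for dep in target.get("dependencies") or []:
            lines.append(f"  {quote(address)} -> {quote(dep)};")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = json.dumps(
        [
            {"address": "src/app:lib", "dependencies": ["src/util:util"]},
            {"address": "src/util:util", "dependencies": None},
        ]
    )
    print(peek_json_to_dot(sample))
```

The resulting text could then be saved to a file and rendered with Graphviz, for example with `dot -Tsvg deps.dot -o deps.svg`.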

@benjyw
Sponsor Contributor

benjyw commented Oct 18, 2021

@stuhood @Eric-Arellano Thoughts? Emitting a dot graph seems like a reasonable ask. This is complicated by the question of which targets to show (generators vs generated) and at which granularity.

@benjyw
Sponsor Contributor

benjyw commented Oct 18, 2021

I remain convinced that ./pants graph is a reasonable thing to have. It can deal with things like granularity (e.g., show me a graph at the package level) in a way that peek cannot.

@benjyw
Sponsor Contributor

benjyw commented Oct 18, 2021

There is also the question of how transitive 3rdparty deps fit into this. It is helpful to show them to users (since we know them), but they aren't targets, so the dependencies goal probably should not be the thing that emits them. So I'm not sure what should.

@Eric-Arellano
Contributor

I think that makes sense now. Would the graph only have the target address and target type for each node? You can use peek if you need to create a more custom graph?

@benjyw
Sponsor Contributor

benjyw commented Oct 18, 2021

> I think that makes sense now. Would the graph only have the target address and target type for each node? You can use peek if you need to create a more custom graph?

Well, the nodes might not have addresses at all! They might be transitive 3rdparty deps, or "packages" (multiple targets rolled up to the directory level). And yeah I guess we'd put just a descriptive string on each node, and maybe edges would have a bit saying whether they were explicit or inferred.
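As a rough illustration of that idea, the sketch below renders nodes that carry only a descriptive string and edges that carry an explicit-versus-inferred flag. Everything here (the `Edge` class, `render_dot`, and the choice of dashed lines for inferred edges) is hypothetical and not an existing Pants API; `style=dashed` and `style=solid` are standard Graphviz edge styles.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    inferred: bool  # True if inferred, False if declared explicitly


def render_dot(labels: Dict[str, str], edges: List[Edge]) -> str:
    """Render nodes keyed by an opaque id, each with a descriptive label."""
    lines = ["digraph deps {"]
    for node_id, label in labels.items():
        # The node id need not be a target address; the label is free text.
        lines.append(f'  "{node_id}" [label="{label}"];')
    for edge in edges:
        # Hypothetical convention: dashed for inferred, solid for explicit.
        style = "dashed" if edge.inferred else "solid"
        lines.append(f'  "{edge.source}" -> "{edge.target}" [style={style}];')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    labels = {"n1": "src/app (package)", "n2": "requests (transitive 3rdparty)"}
    print(render_dot(labels, [Edge("n1", "n2", inferred=True)]))
```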

@benjyw
Sponsor Contributor

benjyw commented Oct 25, 2021

@stuhood thoughts on this? The dependencies and peek goals, even together, do not appear to be sufficient for this use case, for at least two reasons:

  1. They are very address/target-centric, but we need to be able to generate graphs whose nodes are not targets (e.g., "packages", or "transitive third-party deps")
  2. They don't make sense as a place to output dot directives for graphviz

@stuhood
Sponsor Member

stuhood commented Oct 25, 2021

Transitive thirdparty makes this challenging, because the only language for which we have transitive thirdparty deps as targets is go, and that's expensive (see #13152 (comment)), or requires tailor. Since we don't have them as targets we'd need a different in-memory datastructure for them, which would likely be language specific (or untyped as strings, maybe).

I'm not suggesting it (I really don't know what the best solution is here), but: peek could almost certainly support rendering as dot. PEX does so by adding node information to the mouseover tooltip of nodes in the graph: pex-tool/pex#1132, as does Bazel.
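For illustration, Graphviz has a `tooltip` node attribute that SVG output shows on mouseover. The sketch below packs a target's fields into that attribute; whether PEX or Bazel do it exactly this way is not confirmed by this thread, and the helper names and sample field values are made up.

```python
from typing import Dict


def dot_escape(text: str) -> str:
    """Escape backslashes and double quotes for use inside a DOT quoted string."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def node_with_tooltip(address: str, fields: Dict[str, str]) -> str:
    """Return one DOT node statement whose tooltip lists the target's fields."""
    # Sort the fields so the output is stable between runs.
    tooltip = ", ".join(f"{key}={value}" for key, value in sorted(fields.items()))
    return f'"{dot_escape(address)}" [tooltip="{dot_escape(tooltip)}"];'


if __name__ == "__main__":
    print(
        node_with_tooltip(
            "src/app:lib",
            {"target_type": "python_library", "sources": "*.py"},
        )
    )
```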


It's maybe interesting to note that because Bazel forces you to declare transitive dependencies (such that you need to re-run something like tailor whenever thirdparty deps change), its graph introspection goals do end up rendering thirdparty deps. That's a pretty high cost if it's manual, but it's generally accomplished similarly to lockfile generation: when your root dependencies change, you have to re-generate your lockfile (which adjusts the targets generated by your lockfile).

@stuhood
Sponsor Member

stuhood commented Oct 25, 2021

...all that to say: I wonder if we should generate the graph of transitive thirdparty deps from your named resolve lockfile. That might impact the design of named resolves a bit, because we don't have a target namespace per resolve currently.

@benjyw
Sponsor Contributor

benjyw commented Oct 25, 2021

Transitive third-party is definitely challenging if we try to model them as targets. My opinion is that we should not do that. If we don't, then we do have ~easy access to them for Python at least.

As for rendering peek output as dot - that can be done, but you might not get the results you expect, since dot requires you to pre-declare all the vertices that the edges reference, so the best peek can do is render the subgraph induced on its input targets. Maybe that makes sense though.

But that is still target-centric, and doesn't allow rolling up to things the user actually cares about, e.g., packages.
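A sketch of what rolling up to packages could look like, under the simplifying assumption that a target address has the form `directory:name` and that the part before the colon identifies the package. File-level generated targets and other address forms would need extra handling. `package_of` and `roll_up` are hypothetical helpers, not Pants APIs.

```python
from typing import Iterable, List, Tuple


def package_of(address: str) -> str:
    """Return the directory part of an address (assumed form: 'dir:name')."""
    return address.split(":", 1)[0]


def roll_up(edges: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Collapse target-level edges into unique package-level edges."""
    rolled = set()
    for source, target in edges:
        source_pkg, target_pkg = package_of(source), package_of(target)
        # Drop edges between targets that live in the same package.
        if source_pkg != target_pkg:
            rolled.add((source_pkg, target_pkg))
    return sorted(rolled)


if __name__ == "__main__":
    target_edges = [
        ("src/app:lib", "src/util:strings"),
        ("src/app:bin", "src/util:paths"),
        ("src/app:bin", "src/app:lib"),
    ]
    for source_pkg, target_pkg in roll_up(target_edges):
        print(f'"{source_pkg}" -> "{target_pkg}";')
```

Here three target-level edges collapse into a single package-level edge, which is the kind of granularity a target-centric goal cannot express directly.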

@stuhood
Sponsor Member

stuhood commented Oct 25, 2021

> As for rendering peek output as dot - that can be done, but you might not get the results you expect, since dot requires you to pre-declare all the vertices that the edges reference, so the best peek can do is render the subgraph induced on its input targets. Maybe that makes sense though.

It doesn't require that, afaik. It will still render the node(s) for an edge if they are missing, just without any labels. See https://en.wikipedia.org/wiki/DOT_(graph_description_language)#Directed_graphs
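To make this concrete: in DOT, a node is created the first time its name is mentioned, including inside an edge statement, and it gets Graphviz's default attributes. The sketch below declares only the selected targets with custom labels while still emitting every edge. The function name and sample addresses are made up for the example.

```python
from typing import Dict, List, Tuple


def render_selected(selected: Dict[str, str], edges: List[Tuple[str, str]]) -> str:
    """Declare only the selected nodes; let edges introduce the rest implicitly."""
    lines = ["digraph deps {"]
    for node_id, label in selected.items():
        lines.append(f'  "{node_id}" [label="{label}"];')
    for source, target in edges:
        # The target may never be declared above; Graphviz creates it on
        # first mention with default attributes.
        lines.append(f'  "{source}" -> "{target}";')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(
        render_selected(
            {"src/app:lib": "src/app:lib (python_library)"},
            [("src/app:lib", "src/util:util")],
        )
    )
```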

@cognifloyd
Member

> I remain convinced that ./pants graph is a reasonable thing to have. It can deal with things like granularity (e.g., show me a graph at the package level) in a way that peek cannot.

The goal name graph sounds problematic to me.
Data science projects are often built in Python - I think it might be confusing to have a graph goal that doesn't do something with the project's data. Maybe that's fine, but my first reaction to the goal name is confusion.

@stuhood
Sponsor Member

stuhood commented Dec 2, 2021

Relatedly: @yoav-orca also had the good suggestion to revive the paths goal; I have opened #13774 about that.

@huonw
Contributor

huonw commented Feb 25, 2023

Similar to #12733, the transitive 3rd-party deps issues with this might've become easier due to the newish concept of synthetic targets (#16979), when using lock files at least.
