Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[issue-558] add visualization feature #562

Merged
merged 1 commit into from Apr 6, 2023

Conversation

meretp
Copy link
Collaborator

@meretp meretp commented Apr 3, 2023

This PR adds some functionality to visualize the structure of a SPDX document. The visualization includes the SPDXIDs of all elements in the document (Package, File, Snippet and any external SPDXIDs) and their connection based on the provided relationships. The new logic converts a Document to a pygraphviz AGraph which can either be stored in a .dot file or rendered to an image e.g. a .png.

Copy link
Collaborator

@armintaenzertng armintaenzertng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this! :)
I've left a few change requests and comments below.

src/common/visualization/visualization.py Outdated Show resolved Hide resolved
src/common/visualization/visualization.py Outdated Show resolved Hide resolved
README.md Show resolved Hide resolved
src/common/visualization/visualization.py Outdated Show resolved Hide resolved
src/common/visualization/visualization.py Outdated Show resolved Hide resolved
src/common/visualization/visualization.py Outdated Show resolved Hide resolved
src/common/visualization/visualization.py Outdated Show resolved Hide resolved
src/common/visualization/visualization.py Outdated Show resolved Hide resolved
tests/common/test_visualization.py Outdated Show resolved Hide resolved
tests/common/test_visualization.py Outdated Show resolved Hide resolved
@zvr
Copy link
Member

zvr commented Apr 3, 2023

Ehmmm... a more general question: why did you choose to use pygraphviz and networkx instead of the much simpler graphviz module?

I mean, it's nice to have everything inside your code, but you then will have to eventually expose all functions somehow (e.g., produce SVG, produce PDF, scale factor, use different layout engine, etc., etc.)

The alternative would be:

import graphviz
g = graphviz.Digraph('SPDX diagram', filename=fname_out) 
for e in element_list:
    g.node(e.id, label=e.name)
for r in relationship_list:
    g.edge(r.from, r.to)
g.save()
# g.view() if you want

@maxhbr
Copy link
Member

maxhbr commented Apr 3, 2023

We also want to expose functions to convert to networkx. This is helpful to work with the complex data.

@zvr
Copy link
Member

zvr commented Apr 3, 2023

Ah, ok. Then this is not about "visualization", the requirement is to have the graph in networkx.

@meretp
Copy link
Collaborator Author

meretp commented Apr 4, 2023

According to the comments I restructured and renamed the methods and the description in the markdown. This PR now adds functionality to create a relationship graph from the document which can be rendered to a picture for visualization but doesn't have to be. I hope this fits the comments.
I have one open question:
How to include this feature in the cli? The current implementation uses the argument --graph_generation and expects a file path as input where the generated graph - or rendered picture- will be stored to. An alternative approach would be to use --graph_generation only as a flag and use --outfile for the file path which would be on the one hand more consistent, on the other hand could also be confusing as --outfile is reserved for the conversion of documents. What do you think?

@meretp meretp force-pushed the issue_558_add_graph_visualization branch from 9e01519 to aafb5e7 Compare April 4, 2023 09:44
README.md Outdated Show resolved Hide resolved
@meretp meretp force-pushed the issue_558_add_graph_visualization branch from f04cbce to e3228a5 Compare April 4, 2023 13:38
Copy link
Collaborator

@armintaenzertng armintaenzertng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few more comments regarding wording.

src/spdx/graph_generation.py Outdated Show resolved Hide resolved
src/spdx/graph_generation.py Outdated Show resolved Hide resolved
src/spdx/graph_generation.py Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
@armintaenzertng
Copy link
Collaborator

How to include this feature in the cli? The current implementation uses the argument --graph_generation and expects a file path as input where the generated graph - or rendered picture- will be stored to. An alternative approach would be to use --graph_generation only as a flag and use --outfile for the file path which would be on the one hand more consistent, on the other hand could also be confusing as --outfile is reserved for the conversion of documents. What do you think?

One "advantage" of the current approach is that you can do both conversion and graph generation in one. I'm not sure, though, if this is actually something to be avoided. I don't have strong feelings either way.

armintaenzertng
armintaenzertng previously approved these changes Apr 5, 2023
Copy link
Collaborator

@armintaenzertng armintaenzertng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! :)

@zvr
Copy link
Member

zvr commented Apr 5, 2023

My vote is for

use --graph_generation only as a flag and use --outfile for the file path

It's better to be consistent with treating outfile as the output of the process. Yes, it limits that you cannot convert and generate the graph on the same step, but I don't think that would be a common use case.

Addiotnally, please rethink the --generate_graph CLI flag and try to avoid the underscore (isn't --graph enough? else, use -).

Ehhmmm, wait a minute: is it an option or a subcommand? Apologies for not checking, but is the rest of functionality invoked as --validate , --convert, etc. or as commands (like git foo).

@meretp
Copy link
Collaborator Author

meretp commented Apr 5, 2023

Ehhmmm, wait a minute: is it an option or a subcommand? Apologies for not checking, but is the rest of functionality invoked as --validate , --convert, etc. or as commands (like git foo).

The rest is invoked as options, e.g. --infile (-i as shortform). According to this I will add an option --graph which generates the graph and writes it to the file provided with --outfile.

Comment on lines +52 to +54
related_spdx_element_id = str(relationship.related_spdx_element_id)

if related_spdx_element_id not in graph.nodes():
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the intended behavior of NOASSERTION/NONE Nodes? The way I understand this, there will be only a single one in the whole graph that all relationships that have NOASSERTION/NONE in them will point to. Is that the way it should be?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your interpretation for the current implementation is correct. I don't have an opinion on how it should be. In the current use case we would transform the generated graph to a tree, so I didn't worry about it. But probably it would be better (clearer, in relation to a picture) to have individual NOASSERTION nodes for each relationship.

armintaenzertng
armintaenzertng previously approved these changes Apr 6, 2023
Copy link
Collaborator

@armintaenzertng armintaenzertng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me now, only a minor remark


elif graph:
try:
export_graph_from_document(document, outfile)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe insert a catch that outfile must be set, i.e. not None, and inform the user about it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds like a nice to have. Would merge it now

@meretp meretp force-pushed the issue_558_add_graph_visualization branch from d596d5b to e34d047 Compare April 6, 2023 08:22
Signed-off-by: Meret Behrens <meret.behrens@tngtech.com>
@meretp meretp merged commit 91bb744 into spdx:main Apr 6, 2023
32 checks passed
@meretp meretp deleted the issue_558_add_graph_visualization branch April 6, 2023 08:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants