Extract dependency graph for package in stack #2784

Open
wants to merge 10 commits into
base: develop

Conversation

ocaisa
Member

@ocaisa ocaisa commented Feb 22, 2019

This is required for building a depgraph for an entire installation tree.

@ocaisa
Member Author

ocaisa commented Feb 22, 2019

As discussed in #2783, this forms the basis for an alternative approach where we map all the easyconfigs in the software directory to a graph with (something like):

mkdir temp
cd temp
cp $EASYBUILD_INSTALLPATH/software/*/*/easybuild/*.eb .
eb --robot=./ --dep-graph=depgraph.dot *.eb

This results in a depgraph for the entire installed software stack. The resulting file is too large to work with directly, so I wrote a script that extracts all the software that depends (directly or indirectly) on a given module:

find_children () {
 # The fact that we put the semicolon straight after means we exclude anywhere the module
 # is only required as a build dependency (since these cases have extra formatting the
 # semicolon comes later)
 grep " \"$3\";" "$1" >> "$2"
 grep " \"$3\";" "$1" | awk '{print $1}' | xargs -I {} bash -c "map_dep $1 $2 {}"
}

map_dep () {
 # Once we support coloring of modules to indicate whether or not something is installed
 # this will need to be updated to be conditional on whether the module *is* installed
 echo \"$3\"\; >> $2
 find_children $1 $2 $3
}

export -f find_children
export -f map_dep

echo digraph graphname \{ > $2
map_dep $1 $2 $3
echo \} >> $2

It writes a dot file specific to that software. You call it with:

./<script> <input dot file> <output dot file> "Compiler/GCCcore/7.3.0/h5py/2.8.0-serial-Python-3.6.6"
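
To illustrate the matching rule the script relies on, here is a minimal sketch on a made-up dot file. The module names, the helper names `sample_dot` and `direct_dependents`, and the `[style=dashed]` attribute on the build-dependency edge are assumptions for illustration only; the exact formatting EasyBuild emits may differ. The sketch uses a fixed-string match in awk that mirrors the script's `grep` pattern (space, quoted name, semicolon).

```shell
#!/usr/bin/env bash
# Hypothetical sample graph; names and the edge attribute are illustrative only.
sample_dot() {
    cat <<'EOF'
digraph graphname {
"h5py/2.8.0";
"HDF5/1.10.2";
"CMake/3.12.1";
"h5py/2.8.0" -> "HDF5/1.10.2";
"HDF5/1.10.2" -> "CMake/3.12.1" [style=dashed];
}
EOF
}

# Print the modules with an attribute-free edge pointing at module $1.
# A line contains ' "<name>";' only when the semicolon directly follows the
# edge target, so node lines and edges carrying attributes are skipped.
direct_dependents() {
    sample_dot | awk -v pat=" \"$1\";" 'index($0, pat) { gsub(/"/, "", $1); print $1 }'
}

direct_dependents "HDF5/1.10.2"    # prints h5py/2.8.0
direct_dependents "CMake/3.12.1"   # prints nothing: only an attributed edge points at it
```

The second call shows why a module that is only a build dependency ends up with no dependents in the extracted graph.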

@ocaisa
Member Author

ocaisa commented Feb 26, 2019

I made a set of updates to the script that leverages this:

find_children () {
 # The fact that we put the semicolon straight after means we exclude anywhere the module
 # is only required as a build dependency (since these cases have extra formatting the
 # semicolon comes later);
 # the sed command makes sure we ignore whether the module is hidden or not
 grep " \"$3\";" "$1" | sed 's#/\.#/#g' >> "$2"
 grep " \"$3\";" "$1" | awk '{print $1}' | xargs -I {} bash -c "map_dep $1 $2 {}"
}

map_dep () {
 # Search for a node that matches the string and add a comment marker at the end
 # to leverage with grep later
 # (if non-installed software had additional formatting they would be ignored)
 # the sed command makes sure we ignore whether the module is hidden or not
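 local node
 node=$(grep "^\"$3\";" "$1" | awk '{print $1" // xxnodexx"}' | sed 's#/\.#/#g')
 # Only follow the children if the node was actually found in the input
 if [ -n "$node" ]
 then
   echo "$node" >> "$2"
   find_children "$1" "$2" "$3"
 fi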
}

export -f find_children
export -f map_dep

# Check command line
if [ "$#" -ne 3 ]; then
    echo -e "Expected command line:\n\t<script> <input dot file> <output dot file> <node to search for>"
    exit 1
fi

# Begin digraph in output file
echo digraph graphname \{ > $2

# Use a temporary file to store nodes and edges
> temp.dot
# Gather nodes and edges related to $3
map_dep $1 temp.dot $3

# There is potential duplication due to nodes being followed multiple times so let's
# remove it:
# put nodes first (we used a marker to identify them), make sure they are unique
sort -u temp.dot | grep xxnodexx >> "$2"
# Then edges, also make sure they are unique
sort -u temp.dot | grep -v xxnodexx >> "$2"

# Clean up
rm temp.dot
echo \} >> $2
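
The duplication mentioned in the comments arises because a module reachable through several paths is visited once per path. One possible alternative, sketched here and not part of the PR, is to read the edges once and walk them with a visited set. It assumes edges look like `"A" -> "B";`, that module names contain no spaces, and that edges carrying attributes should be ignored; the function name `dependents_of` is hypothetical. It only prints module names and does not write a dot file or handle hidden modules.

```shell
#!/usr/bin/env bash
# Sketch: print every module that depends, directly or indirectly, on module $2,
# reading edges of the form '"A" -> "B";' from dot file $1.
# Each module is visited once, so no de-duplication pass is needed.
dependents_of() {
    awk -v start="$2" '
        # Keep only attribute-free edges: exactly three fields, target ends in ";
        NF == 3 && $2 == "->" && $3 ~ /";$/ {
            src = $1; dst = $3
            gsub(/"/, "", src)
            gsub(/[";]/, "", dst)
            # parents[dst] is a newline-separated list of modules that need dst
            parents[dst] = parents[dst] src "\n"
        }
        END {
            # Breadth-first walk from the start module along reversed edges
            queue[1] = start; head = 1; tail = 1; seen[start] = 1
            while (head <= tail) {
                cur = queue[head++]
                n = split(parents[cur], p, "\n")
                for (i = 1; i <= n; i++) {
                    if (p[i] != "" && !(p[i] in seen)) {
                        seen[p[i]] = 1
                        queue[++tail] = p[i]
                        print p[i]
                    }
                }
            }
        }
    ' "$1"
}

# Example call (path and module name as used earlier in this thread):
# dependents_of depgraph.dot "Compiler/GCCcore/7.3.0/h5py/2.8.0-serial-Python-3.6.6"
```

Because the whole walk happens inside one awk process, this also avoids spawning a new shell per visited node, which may matter for a graph covering several hundred easyconfigs.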

@ocaisa
Member Author

ocaisa commented Feb 26, 2019

Here are the timings to generate the graph for the current set of 2018b easyconfigs:

alanc@alanc-VirtualBox:~$ time eb  --dep-graph=depgraph.dot easybuild-easyconfigs/easybuild/easyconfigs/*/*/*2018b*.eb
== temporary log file in case of crash /tmp/eb-YPRS0G/easybuild-_A6fNF.log
Wrote dependency graph for 533 easyconfigs to depgraph.dot

real	0m29.663s
user	0m28.644s
sys	0m1.003s

@damianam
Member

LGTM, but I think the script is where the real usefulness lies. Would it make sense to include it in https://github.com/easybuilders/easybuild-framework/tree/master/easybuild/scripts ?

@boegel boegel added this to the 3.x milestone Jun 3, 2019
easybuild/framework/easyconfig/tools.py (outdated, resolved)
easybuild/framework/easyconfig/tools.py (outdated, resolved)
easybuild/scripts/extract_node_from_dotfile.sh (outdated, resolved)
easybuild/scripts/extract_node_from_dotfile.sh (outdated, resolved)
@easybuilders easybuilders deleted a comment from boegelbot Jun 13, 2019
@ocaisa ocaisa closed this Aug 30, 2019
@ocaisa ocaisa reopened this Aug 30, 2019
@ocaisa ocaisa closed this Sep 10, 2019
@ocaisa ocaisa reopened this Sep 10, 2019
@boegel boegel modified the milestones: 3.x, 4.x Sep 10, 2019
@boegel
Member

boegel commented Sep 10, 2019

@ocaisa Can we come up with a test that verifies that we allow pre-existing edges now?

@ocaisa
Copy link
Member Author

ocaisa commented Sep 13, 2019

I tried, but I can't trigger this (at least not with the test easyconfigs). I'll take another look soon.

@ocaisa ocaisa changed the title Tolerate pre-existing edges in depgraph Extract dependency graph for package in stack Apr 6, 2022
3 participants