[WIP] [AMLS] shortest path #1254
mboehm7 left a comment
Thanks for getting started on this builtin function. Please find below a few detailed comments. I think we could simplify the algorithms by taking inspiration from the existing components() builtin.
| # | ||
| # Documentation; "Pregel: A System for Large-Scale Graph Processing" | ||
| # Grzegorz Malewicz, Matthew H. Austern, Aart J. C. Bik, | ||
| # James C. Dehnert, Ilan Horn, Naty Leiser and Grzegorz Czajkowski |
Please use a reference style similar to that of the slicefinder() builtin function.
| # | ||
| # - Adjacency matrix of the labeled graph | ||
| # (also considered directed labeled graphs) | ||
| # |
Could you please reformat that to a compact input/output documentation like the other scripts?
Also, you might want to specify whether the graph is a 0/1 representation or holds the vertex distances.
| } | ||
| matrixSize = nrow(G) | ||
| infValue = sum(rowSums(G)) + 1 # value representing infinity, i.e. the nodes are not connected |
Why not use a real Inf (available as a constant)?
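A minimal sketch of what this suggestion could look like. The 3-vertex graph and the choice of vertex 1 as source are made up for illustration and are not taken from the PR:

```dml
# Hypothetical weighted adjacency matrix; 0 is assumed to mean "no edge".
G = matrix("0 3 0 3 0 2 0 2 0", rows=3, cols=3)
n = nrow(G)

# Inf is a built-in constant, so no graph-dependent sentinel
# such as sum(rowSums(G)) + 1 has to be computed.
minDist = matrix(Inf, rows=n, cols=1)
minDist[1, 1] = 0  # distance from the assumed source vertex 1 to itself

print(toString(minDist))
```

With a real Inf, entries of unreachable vertices simply stay at Inf, and Inf plus any edge weight is still Inf, so no special cases are needed later.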
| return (Matrix[Double] C) | ||
| { | ||
| print("SHORTEST PATH CALCULATION"); |
Maybe add a verbose flag to guard such prints.
| # | ||
| s_shortestPath = function(Matrix[Double] G) |
The function should take the graph, the vertex distances, and the id of the single source vertex, and compute the shortest paths from that source to all other nodes.
| # initialize the matrix of minimum distances with "infinity" values: | ||
| minDistMatrix = matrix(infValue,rows=matrixSize,cols=matrixSize) | ||
| for (sourceNode in 1:matrixSize){ |
The single source node should be given as a parameter.
| # find the neighbours of the sourceNode and fill in the neighboursList: | ||
| nodeIdx = 1 | ||
| for(ineighbour in 1:matrixSize){ |
You don't need to materialize these neighbors because the graph already holds this information.
| # define the distance between the not connected nodes as -1: | ||
| for (irow in 1:matrixSize){ |
This can be dropped: simply initialize the vector of min distances as Inf, and if a node is not reachable, its entry will never be updated.
| } | ||
| while( as.integer(as.scalar(neighboursList[1,1])) > 0 ){ # loop of supersteps (see documentation) |
Try to replace this complex loop with the loop from components() and modify the algorithm accordingly. Instead of propagating the max id as in connected components, we can simply take the current min distance of the neighbors plus the distance to each node. This way, the entire algorithm becomes a small vectorized loop of a few lines.
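The loop described above could be sketched roughly as follows. This is a Bellman-Ford-style fixed-point iteration, not the merged implementation. The function name `sssp`, its argument names and defaults, and the convention that 0 means "no edge" are all assumptions, and edge weights are assumed to be non-negative:

```dml
# Sketch only: single-source shortest paths as a vectorized fixed-point loop.
# Name, arguments, and defaults are hypothetical.
sssp = function(Matrix[Double] G, Integer sourceNode, Integer maxi = 0, Boolean verbose = FALSE)
  return (Matrix[Double] C)
{
  n = nrow(G)

  # Assumption: 0 encodes "no edge"; map it to Inf so it never shortens a path.
  W = replace(target=G, pattern=0, replacement=Inf)

  # All vertices start as unreachable; only the source has distance 0.
  d = matrix(Inf, rows=n, cols=1)
  d[sourceNode, 1] = 0

  diff = Inf
  iter = 1
  while( diff > 0 & (maxi == 0 | iter <= maxi) ) {
    # (W + d)[i,j] = d[i] + W[i,j]; the column minimum is the best
    # distance to vertex j through any neighbor i.
    u = min(t(colMins(W + d)), d)
    diff = sum(u != d)  # number of distances that improved
    d = u
    if( verbose ) {
      print("Shortest path: iter = " + iter + ", #diff = " + diff)
    }
    iter = iter + 1
  }
  C = d
}

# Usage on a hypothetical path graph 1 -(3)- 2 -(2)- 3, starting at vertex 1.
G = matrix("0 3 0 3 0 2 0 2 0", rows=3, cols=3)
C = sssp(G, 1, 0, FALSE)
print(toString(C))
```

For this small graph the distances from vertex 1 converge to 0, 3, and 5. The loop stops once an iteration changes no entry, the same termination test that components() uses for its label propagation.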
| } | ||
| C=minDistMatrix | ||
| print("SHORTEST PATH CALCULATION FINISHED, CHECK OUTPUT MATRIX OF MINIMUM DISTANCES"); |
Yes, we should add the respective unit test. It can be copied from components() too and should encode a small graph with node distances and expected shortest paths. In general, always try to start with the test, which then also allows you to consider and design the appropriate function API (arguments).
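As an illustration of starting from the test, a DML-side check might look like the sketch below. The graph, the expected values, and especially the call `shortestPath(G=G, sourceNode=1)` are assumptions: this review does not confirm the builtin's name or argument list, and the project's actual tests are driven from Java.

```dml
# Hypothetical test graph: edges 1-2 (weight 1), 2-3 (weight 2), 1-3 (weight 5);
# vertex 4 is isolated. 0 is assumed to encode "no edge".
G = matrix("0 1 5 0 1 0 2 0 5 2 0 0 0 0 0 0", rows=4, cols=4)

# Hand-computed distances from vertex 1: the route 1-2-3 (cost 3) beats the
# direct edge 1-3 (cost 5), and the isolated vertex stays at Inf.
expected = matrix(0, rows=4, cols=1)
expected[2, 1] = 1
expected[3, 1] = 3
expected[4, 1] = Inf

# Assumed API: builtin name and named arguments are not confirmed here.
C = shortestPath(G=G, sourceNode=1)

if( sum(C != expected) > 0 ) {
  stop("shortestPath: result differs from the expected distances")
}
print("shortestPath test passed")
```

Writing the expected vector first makes two API decisions explicit: the source vertex has to be an argument, and unreachable vertices need a defined representation.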
I can see the tests are working. Do you think this is ready for review?
@Baunsgaard, hello. Yes, it is ready for review, but I am still open to any suggestions you may have. Thank you.
Baunsgaard left a comment
LGTM, only minor syntax errors.
| { | ||
| if(verbose) { | ||
| print("SHORTEST PATH CALCULATION"); |
We use two spaces for indentation in DML, and tabs in Java.
| # | ||
| #C Output matrix (double) of minimum distances (shortest-path) between vertices: | ||
| # - The value of the ith row and the jth column of the output matrix is | ||
| # the minimum distance shortest-path from vertex i to vertex j. |
Maybe use spaces to align all of these descriptions.
| private void runShortestPathNodeTest(int node, double [][] Res) | ||
| { | ||
| loadTestConfiguration(getTestConfiguration(TEST_NAME)); |
LGTM - during the merge I fixed the remaining warnings and formatting issues (e.g., input/output documentation), and resolved the merge conflicts due to recent name changes of other tests (capitalization). The git author of the commits in this PR does not seem to match, though. @clarapueyoballarin, please add the email you used to your GitHub account to link the final commit to it.
AMLS project SS2021. Closes apache#1254.
edit builtin function
add test