Skip to content

[WIP] [AMLS] shortest path#1254

Closed
clarapueyoballarin wants to merge 7 commits into
apache:masterfrom
clarapueyoballarin:master
Closed

[WIP] [AMLS] shortest path#1254
clarapueyoballarin wants to merge 7 commits into
apache:masterfrom
clarapueyoballarin:master

Conversation

@clarapueyoballarin
Copy link
Copy Markdown
Contributor

@clarapueyoballarin clarapueyoballarin commented May 4, 2021

edit builtin function
add test

Copy link
Copy Markdown
Contributor

@mboehm7 mboehm7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for getting started on this builtin function. Please find below a few detailed comments. I think we could simplify the algorithms by taking inspiration from the existing components() builtin.

Comment thread scripts/builtin/shortestPath.dml Outdated
#
# Documentation; "Pregel: A System for Large-Scale Graph Processing"
# Grzegorz Malewicz, Matthew H. Austern, Aart J. C. Bilk,
# James C. Dehnert, Ikkan Horn, Naty Leiser and Grzegorz Czajkowski
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use a similar reference style like the slicefinder() builtin function.

Comment thread scripts/builtin/shortestPath.dml Outdated
#
# - Adjacency matrix of the labeled graph
# (also considered directed labeled graphs)
#
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please reformat that to a compact input/output documentation like the other scripts?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also you might want to specify if the graph is a 0/1 representation or holds the vertex distances.

Comment thread scripts/builtin/shortestPath.dml Outdated
}

matrixSize = nrow(G)
infValue = sum(rowSums(G)) + 1 # value representing infinity, i.e. the nodes are not connected
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not use a real Inf (available as constant)?

Comment thread scripts/builtin/shortestPath.dml Outdated
return (Matrix[Double] C)
{

print("SHORTEST PATH CALCULATION");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a verbose flag to guard such prints.

Comment thread scripts/builtin/shortestPath.dml Outdated
#


s_shortestPath = function(Matrix[Double] G)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should take the graph, vertex distances, and vertex id of the single source id for comparing the shortest paths to all other nodes.

Comment thread scripts/builtin/shortestPath.dml Outdated
# initialize the matrix of minimum distances with "infinity" values:
minDistMatrix = matrix(infValue,rows=matrixSize,cols=matrixSize)

for (sourceNode in 1:matrixSize){
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The single source node should be given as a parameter.

Comment thread scripts/builtin/shortestPath.dml Outdated

# find the neighbours of the sourceNode and fill in the neighboursList:
nodeIdx = 1
for(ineighbour in 1:matrixSize){
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need to materialize these neighbors because the graph already holds this information.

Comment thread scripts/builtin/shortestPath.dml Outdated

# define the distance between the not connected nodes as -1:

for (irow in 1:matrixSize){
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can be dropped - simply initialize the vector of min distances as inf and if a node is not reachable this entry will never be updated.

Comment thread scripts/builtin/shortestPath.dml Outdated
}


while( as.integer(as.scalar(neighboursList[1,1])) > 0 ){ # loop of supersteps (see documentation)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

try to replace this complex loop with the loop from components() and modify this algorithm accordingly. Instead of propagating the max id in connected components we simply can take the current min distance of neighbors plus the distance of each node. This way the entire algorithms because a small few line vectorized loop.

Comment thread scripts/builtin/shortestPath.dml Outdated
}

C=minDistMatrix
print("SHORTEST PATH CALCULATION FINISHED, CHECK OUTPUT MATRIX OF MINIMUM DISTANCES");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, we should add the respective unit test - this can be copied from components() too and encode a small graph with node distances and expected shortest paths. In general, always try to start with the test, which also then allows you consider and design the appropriate function API (arguments)

@Baunsgaard
Copy link
Copy Markdown
Contributor

Hi, @clarapueyoballarin

I can see the tests are working, do you think this is ready for review?

@clarapueyoballarin
Copy link
Copy Markdown
Contributor Author

Hi, @clarapueyoballarin

I can see the tests are working, do you think this is ready for review?

@Baunsgaard, Hello, yes, it is ready for review, but I am still open to suggestions that you may have. Thank you.

Copy link
Copy Markdown
Contributor

@Baunsgaard Baunsgaard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, only minor syntax errors.

Comment thread scripts/builtin/shortestPath.dml Outdated
{

if(verbose) {
print("SHORTEST PATH CALCULATION");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we use two spaces in dml for indentation, and tabs in java.

Comment thread scripts/builtin/shortestPath.dml Outdated
#
#C Output matrix (double) of minimum distances (shortest-path) between vertices:
# - The value of the ith row and the jth column of the output matrix is
# the minimum distance shortest-path from vertex i to vertex j.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe use spaces for these descriptions. to align all descriptions


private void runShortestPathNodeTest(int node, double [][] Res)
{
loadTestConfiguration(getTestConfiguration(TEST_NAME));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indentation

@mboehm7
Copy link
Copy Markdown
Contributor

mboehm7 commented Jul 4, 2021

LGTM - during the merge I fixed the remaining warnings, formatting issues (e.g., input/output documentation), and resolved the merge conflicts due to recent name changes of other tests (capitalization). The git author of the commits in this PR does not seem to match though - @clarapueyoballarin please add the used email to your github handle to link the final commit to your account.

@asfgit asfgit closed this in 1dc278a Jul 4, 2021
ilovemesomeramen pushed a commit to ilovemesomeramen/systemds that referenced this pull request Jul 21, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants