[MRESOLVER-93] PathRecordingDependencyVisitor to handle 3 cycles #855

@jira-importer

Tomo Suzuki opened MRESOLVER-93 and commented

PathRecordingDependencyVisitor cannot handle dependency graphs that have 3 or more cycles such as below:
 

gid:a:1 (1)
+- gid:b:0
|  \- ^1
+- gid:b:1
|  \- ^1
\- gid:b:2
   \- ^1

It fails with StackOverflowError or OutOfMemoryError. Test case.

 

Solutions

I came up with three solutions. I picked solution #1 for simplicity.

1. Use "parents" to check for the cycle, rather than the visited set

This is the simplest. Checking membership by scanning an array is usually discouraged, especially for large data sets, so the implementation should measure the overhead of this solution.
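A minimal sketch of solution #1, assuming a simplified model rather than the real Maven Resolver classes: `Node` is a hypothetical stand-in for DependencyNode, and `ParentsCheckModel` is an illustrative name. A node is rejected when it is already on the current path ("parents"), compared by reference.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Simplified model; Node is a hypothetical stand-in for DependencyNode.
final class ParentsCheckModel {
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    private final Deque<Node> parents = new ArrayDeque<>();
    int entered = 0; // number of nodes whose children were traversed

    boolean visitEnter(Node node) {
        // Linear scan of the current path; this is the overhead to measure.
        boolean onPath = false;
        for (Node parent : parents) {
            if (parent == node) {
                onPath = true;
                break;
            }
        }
        parents.push(node);
        if (!onPath) {
            entered++;
        }
        return !onPath;
    }

    boolean visitLeave(Node node) {
        // Popping is safe even when visitEnter returned false,
        // because visitEnter always pushes.
        parents.pop();
        return true;
    }

    void traverse(Node node) {
        if (visitEnter(node)) {
            for (Node child : node.children) {
                traverse(child);
            }
        }
        visitLeave(node);
    }

    // Builds root "a" with children b0..b(n-1), each pointing back to "a".
    static Node cycleGraph(int cycles) {
        Node a = new Node("a");
        for (int i = 0; i < cycles; i++) {
            Node b = new Node("b" + i);
            a.children.add(b);
            b.children.add(a);
        }
        return a;
    }

    static int enteredCount(int cycles) {
        ParentsCheckModel model = new ParentsCheckModel();
        model.traverse(cycleGraph(cycles));
        return model.entered;
    }

    public static void main(String[] args) {
        System.out.println("nodes entered with 3 cycles: " + enteredCount(3));
    }
}
```

In this model the traversal of the three-cycle graph terminates after entering the root and each of b0, b1 and b2 once.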

2. Use AbstractMapBag/Multiset for visited set

Create a new class that extends AbstractMapBag and leverages IdentityHashMap. Although this solution would be theoretically more efficient than solution #1, I felt it is overkill to create a class just for this purpose.

AbstractMapBag(new IdentityHashMap<DependencyNode, AbstractMapBag.MutableInteger>())

 

https://commons.apache.org/proper/commons-collections/apidocs/org/apache/commons/collections4/bag/AbstractMapBag.html

 

Alternatively, an IdentityHashMap<DependencyNode, Integer> would work as a multiset.
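A sketch of the IdentityHashMap-as-multiset idea, again on a simplified model (`Node` and `MultisetVisitedModel` are hypothetical names, not Resolver classes). Each visitEnter increments a per-node count and each visitLeave decrements it, so a visitLeave for a rejected node no longer erases the entry that an ancestor on the path added.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Simplified model; Node is a hypothetical stand-in for DependencyNode.
final class MultisetVisitedModel {
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    // Maps each node (by identity) to the number of unmatched visitEnter calls.
    private final Map<Node, Integer> visited = new IdentityHashMap<>();
    int entered = 0; // number of nodes whose children were traversed

    boolean visitEnter(Node node) {
        // merge returns the new count; 1 means the node was not present before.
        boolean first = visited.merge(node, 1, Integer::sum) == 1;
        if (first) {
            entered++;
        }
        return first;
    }

    boolean visitLeave(Node node) {
        // Decrement the count and drop the entry when it reaches zero.
        visited.computeIfPresent(node, (key, count) -> count == 1 ? null : count - 1);
        return true;
    }

    void traverse(Node node) {
        if (visitEnter(node)) {
            for (Node child : node.children) {
                traverse(child);
            }
        }
        visitLeave(node);
    }

    // Builds root "a" with children b0..b(n-1), each pointing back to "a".
    static Node cycleGraph(int cycles) {
        Node a = new Node("a");
        for (int i = 0; i < cycles; i++) {
            Node b = new Node("b" + i);
            a.children.add(b);
            b.children.add(a);
        }
        return a;
    }

    static int enteredCount(int cycles) {
        MultisetVisitedModel model = new MultisetVisitedModel();
        model.traverse(cycleGraph(cycles));
        return model.entered;
    }

    public static void main(String[] args) {
        System.out.println("nodes entered with 3 cycles: " + enteredCount(3));
    }
}
```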

3. Call visitLeave only when visitEnter is true

The cause of this bug is DefaultDependencyNode calling visitLeave regardless of visitEnter result.

I'm not sure how many other visitors rely on visitLeave being called regardless of visitEnter result.
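A sketch of what solution #3 could look like on a simplified model. `Node` and `LeaveOnlyIfEnteredModel` are hypothetical names, and the traversal here only approximates the accept logic described above; it is not the DefaultDependencyNode source. The visitor keeps its plain visited set, and the only change is that visitLeave moves inside the branch guarded by visitEnter.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Simplified model; Node is a hypothetical stand-in for DependencyNode.
final class LeaveOnlyIfEnteredModel {
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    // Identity-based set, so nodes are compared by reference.
    final Set<Node> visited = Collections.newSetFromMap(new IdentityHashMap<>());
    int entered = 0; // number of nodes whose children were traversed

    boolean visitEnter(Node node) {
        boolean added = visited.add(node);
        if (added) {
            entered++;
        }
        return added;
    }

    boolean visitLeave(Node node) {
        visited.remove(node);
        return true;
    }

    // Changed traversal: visitLeave runs only when visitEnter returned true,
    // so a rejected node cannot remove its ancestor's entry.
    void traverse(Node node) {
        if (visitEnter(node)) {
            for (Node child : node.children) {
                traverse(child);
            }
            visitLeave(node);
        }
    }

    // Builds root "a" with children b0..b(n-1), each pointing back to "a".
    static Node cycleGraph(int cycles) {
        Node a = new Node("a");
        for (int i = 0; i < cycles; i++) {
            Node b = new Node("b" + i);
            a.children.add(b);
            b.children.add(a);
        }
        return a;
    }

    static LeaveOnlyIfEnteredModel run(int cycles) {
        LeaveOnlyIfEnteredModel model = new LeaveOnlyIfEnteredModel();
        model.traverse(cycleGraph(cycles));
        return model;
    }

    public static void main(String[] args) {
        System.out.println("nodes entered with 3 cycles: " + run(3).entered);
    }
}
```

Any visitor that relies on visitLeave being called after a false visitEnter would behave differently under this change, which is the compatibility concern raised above.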

Illustration of why the existing algorithm does not catch the cycle

The following illustration shows how the current algorithm traverses the nodes of the test case above. It tracks the dependency node graph and the "visited" set maintained by the visitor.

  • visited set: an internal data structure in PathRecordingDependencyVisitor used to avoid cycles (link).
  • visitEnter(node): PathRecordingDependencyVisitor's function (link). When it returns true, the algorithm traverses the node's children. This function adds the node to the visited set.
  • visitLeave(node): PathRecordingDependencyVisitor's function (link). This function removes the node from the visited set.
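The bookkeeping in these bullets can be modelled in a few lines. This is a simplified sketch, not the Resolver source: `Node`, `VisitedSetModel` and the `depthLimit` guard are assumptions introduced for illustration, and the guard exists only so that the demo stops instead of overflowing the stack.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Simplified model; Node is a hypothetical stand-in for DependencyNode.
final class VisitedSetModel {
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    // Identity-based set, so nodes are compared by reference.
    private final Set<Node> visited = Collections.newSetFromMap(new IdentityHashMap<>());

    boolean visitEnter(Node node) {
        // add() returns false when the node is already in the set.
        return visited.add(node);
    }

    boolean visitLeave(Node node) {
        // Runs even for a node whose visitEnter returned false, so it can
        // remove an entry that an ancestor on the current path added.
        visited.remove(node);
        return true;
    }

    // visitLeave is always called, regardless of the visitEnter result.
    void traverse(Node node, int depth, int depthLimit) {
        if (depth > depthLimit) {
            throw new IllegalStateException("depth limit exceeded at " + node.name);
        }
        if (visitEnter(node)) {
            for (Node child : node.children) {
                traverse(child, depth + 1, depthLimit);
            }
        }
        visitLeave(node);
    }

    // Builds root "a" with children b0..b(n-1), each pointing back to "a".
    static Node cycleGraph(int cycles) {
        Node a = new Node("a");
        for (int i = 0; i < cycles; i++) {
            Node b = new Node("b" + i);
            a.children.add(b);
            b.children.add(a);
        }
        return a;
    }

    // Returns true when the traversal goes deeper than depthLimit.
    static boolean exceedsDepth(int cycles, int depthLimit) {
        try {
            new VisitedSetModel().traverse(cycleGraph(cycles), 0, depthLimit);
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("2 cycles exceed depth 200: " + exceedsDepth(2, 200));
        System.out.println("3 cycles exceed depth 200: " + exceedsDepth(3, 200));
    }
}
```

In this model the two-cycle graph finishes within a few levels, while the three-cycle graph keeps descending until it hits the guard.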

  

The initial state starts with node "a" and visited set {a}.

!IMG_0234.jpg|width=334,height=252!

The first child of "a" is b0. Because the visited set does not contain b0, visitEnter(b0) returns true, meaning that the algorithm traverses b0's children next. b0 is added to the visited set.

!IMG_0235.jpg|width=359,height=191!

b0's only child is "a". Because the visited set contains "a", visitEnter(a) returns false, so the algorithm does not traverse this "a"'s children. "a" is added to the visited set again, which has no effect because it is already there.

  !IMG_0236.jpg|width=438,height=197!

Since it does not traverse this "a"'s children, the algorithm calls visitLeave(a) right away. This removes "a" from the visited set, even though the root "a" is still on the current path.

!IMG_0237.jpg|width=434,height=165!

b0's children have all been traversed, so the algorithm calls visitLeave(b0). This removes b0 from the visited set.

!IMG_0238.jpg|width=459,height=197!

Now the visited set is empty.

The next child of the root "a" is b1. b1 is not in the visited set, so visitEnter(b1) returns true. This means the algorithm traverses the children of b1.

!IMG_0240.jpg|width=445,height=270!

b1's only child is "a". "a" is not in the visited set, so visitEnter(a) returns true and the algorithm traverses "a"'s children.

!IMG_0241.jpg|width=418,height=262!

The first child of "a" is b0. b0 is not in the visited set, so visitEnter(b0) returns true and the algorithm traverses the children of this b0.

!IMG_0242.jpg|width=422,height=208!  
(img 0242)

The only child of b0 is "a". The visited set contains "a", so visitEnter(a) returns false and its children are not traversed.

!IMG_0243.jpg|width=491,height=191!

visitLeave(a) removes "a" from the visited set.

!IMG_0244.jpg|width=481,height=189!

b0's children have all been traversed. visitLeave(b0) removes b0 from the visited set.

!IMG_0245.jpg|width=498,height=182!

The next child of this "a" is b1. b1 is in the visited set, so visitEnter(b1) returns false. This node's children are not traversed.

!IMG_0255.jpg|width=545,height=245!

(img 0255)

visitLeave(b1) removes b1 from the visited set. Now the visited set is empty.

!IMG_0256.jpg|width=528,height=294!

The last child of "a" is b2. visitEnter(b2) returns true, so its children are to be traversed. b2 is now in the visited set.

!IMG_0257.jpg|width=502,height=309!

b2's only child is "a". "a" is not in the visited set, so visitEnter(a) returns true. The algorithm traverses this "a"'s children.

!IMG_0258.jpg|width=485,height=299!

(img 0258)

 

(...omit...)

 

IMG_0266 shows the step where I decided to give up. The algorithm does not seem to stop, and the test confirms that. The path from the root to the furthest "a" includes 5 "a" nodes. I concluded that the visited set is not working as expected to avoid cycles.

!IMG_0266.jpg|width=656,height=252!

 

 


Affects: 1.3.3, 1.4.0, 1.4.1

Issue Links:

  • MNG-6737 StackOverflowError when version ranges are unsolvable and graph contains a cycle

  • MRESOLVER-38 SOE/OOME in DefaultDependencyNode.accept
    ("is depended upon by")

1 votes, 4 watchers
