Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RealMaskRealInterval OR and AND hang when combining too many masks #60

Closed
tischi opened this issue Apr 22, 2022 · 28 comments
Closed

RealMaskRealInterval OR and AND hang when combining too many masks #60

tischi opened this issue Apr 22, 2022 · 28 comments

Comments

@tischi
Copy link
Contributor

tischi commented Apr 22, 2022

@ctrueden @tpietzsch

This code starts to hang for me around iteration 10:

		final RealMaskRealInterval mask = GeomMasks.closedBox( new double[]{ 0, 0, 0 }, new double[]{ 1, 1, 1 } );
		RealMaskRealInterval test = mask;
		for ( int i = 0; i < 20; i++ )
		{
			test = test.or( mask );
			System.out.println( i );
			System.out.println( test.realMin( 0 ));
		}

Am I doing something wrong or is it an issue with the code?

(I know that this specific example does not make sense as I am always combining the same mask, but it also hangs for me when doing it with a similar amount of different masks).

@ctrueden
Copy link
Member

ctrueden commented Apr 24, 2022

What is the code doing while the hang is happening? https://imagej.net/learn/troubleshooting#if-imagej-freezes-or-hangs

In an IDE you can simply hit the pause button and inspect the stack trace. Or use the IDE or VisualVM to profile the code to find out where it's spending its time.

@tischi
Copy link
Contributor Author

tischi commented Apr 24, 2022

I am running this in IntelliJ...

VisualVM gives this:

image

Can I do more to help the debugging?

@ctrueden
Copy link
Member

Hmm, it's a lot of recursion, yeah. But I'm still surprised; it's not exponential growth, but rather a linear increase.

I cannot spend time digging on this right now. Main thought off the top of my head is: can we introduce some memoization to avoid repeated recursion on unchanging intervals? The realMin, realMax functions of UnionRealInterval would be good candidates for that. Here is a quick patch (untested) for you to try:

diff --git a/src/main/java/net/imglib2/roi/Bounds.java b/src/main/java/net/imglib2/roi/Bounds.java
index 3f9e1ab..7c5145b 100644
--- a/src/main/java/net/imglib2/roi/Bounds.java
+++ b/src/main/java/net/imglib2/roi/Bounds.java
@@ -535,40 +535,60 @@ public abstract class Bounds< I extends RealInterval, B extends Bounds< I, B > >
 
 		private final RealInterval i2;
 
+		private final double[] min;
+
+		private final double[] max;
+
 		public UnionRealInterval( final RealInterval i1, final RealInterval i2 )
 		{
 			super( i1.numDimensions() );
 			this.i1 = i1;
 			this.i2 = i2;
 			assert ( i1.numDimensions() == i2.numDimensions() );
+			min = new double[ i1.numDimensions() ];
+			max = new double[ i1.numDimensions() ];
+			Arrays.fill( min, Double.NaN );
+			Arrays.fill( max, Double.NaN );
 		}
 
 		@Override
 		public double realMin( final int d )
 		{
-			if ( Intervals.isEmpty( i1 ) )
+			if ( Double.isNaN( min[ d ] ) )
 			{
-				if ( Intervals.isEmpty( i2 ) )
-					return Double.POSITIVE_INFINITY;
-				return i2.realMin( d );
+				if ( Intervals.isEmpty( i1 ) )
+				{
+					if ( Intervals.isEmpty( i2 ) )
+						min[ d ] = Double.POSITIVE_INFINITY;
+					else
+						min[ d ] = i2.realMin( d );
+				}
+				else if ( Intervals.isEmpty( i2 ) )
+					min[ d] = i1.realMin( d );
+				else
+					min[ d ] = Math.min( i1.realMin( d ), i2.realMin( d ) );
 			}
-			if ( Intervals.isEmpty( i2 ) )
-				return i1.realMin( d );
-			return Math.min( i1.realMin( d ), i2.realMin( d ) );
+			return min[ d ];
 		}
 
 		@Override
 		public double realMax( final int d )
 		{
-			if ( Intervals.isEmpty( i1 ) )
+			if ( Double.isNaN( max[d] )
 			{
-				if ( Intervals.isEmpty( i2 ) )
-					return Double.NEGATIVE_INFINITY;
-				return i2.realMax( d );
+				if ( Intervals.isEmpty( i1 ) )
+				{
+					if ( Intervals.isEmpty( i2 ) )
+						max[ d ] = Double.NEGATIVE_INFINITY;
+					else
+						max[ d ] = i2.realMax( d );
+				}
+				else if ( Intervals.isEmpty( i2 ) )
+					max[ d ] = i1.realMax( d );
+				else
+					max[ d ] = Math.max( i1.realMax( d ), i2.realMax( d ) );
 			}
-			if ( Intervals.isEmpty( i2 ) )
-				return i1.realMax( d );
-			return Math.max( i1.realMax( d ), i2.realMax( d ) );
+			return max[ d ];
 		}
 	}

@tischi
Copy link
Contributor Author

tischi commented Apr 25, 2022

Thanks! I will try to find time to test it!

tischi added a commit to tischi/imglib2-roi that referenced this issue Apr 25, 2022
@constantinpape
Copy link

Here's a link for how to apply the patch, @tischi: https://stackoverflow.com/questions/2249852/how-to-apply-a-patch-generated-with-git-format-patch
(Sent to you via mattermost earlier, but not sure if you have seen it in between the other messages.)

@tischi
Copy link
Contributor Author

tischi commented Apr 26, 2022

@ctrueden

Some new finding: it also hangs when combining the masks with and, and your path does not help in this case.

If you have any pointers for me where the look for the culprit, let me know.

@tischi
Copy link
Contributor Author

tischi commented Apr 26, 2022

I may have found something (sorry for potential spamming, using this to keep notes in case I cannot fully fix it).

This works fine (when commenting out the isEmpty calls).

		public IntersectionRealInterval( final RealInterval i1, final RealInterval i2 )
		{
			super( i1.numDimensions() );
			this.i1 = i1;
			this.i2 = i2;
			assert ( i1.numDimensions() == i2.numDimensions() );
		}

		@Override
		public double realMin( final int d )
		{
			//if ( Intervals.isEmpty( i1 ) || Intervals.isEmpty( i2 ) )
			//	return Double.POSITIVE_INFINITY;
			return Math.max( i1.realMin( d ), i2.realMin( d ) );
		}

		@Override
		public double realMax( final int d )
		{
			//if ( Intervals.isEmpty( i1 ) || Intervals.isEmpty( i2 ) )
			//	return Double.NEGATIVE_INFINITY;
			return Math.min( i1.realMax( d ), i2.realMax( d ) );
		}

@tischi
Copy link
Contributor Author

tischi commented Apr 26, 2022

This is fast as well, indicating that the recursion is in i1.

		@Override
		public double realMin( final int d )
		{
			//if ( Intervals.isEmpty( i1 ) || Intervals.isEmpty( i2 ) )
			if ( Intervals.isEmpty( i2 ) )
					return Double.POSITIVE_INFINITY;
			return Math.max( i1.realMin( d ), i2.realMin( d ) );
		}

		@Override
		public double realMax( final int d )
		{
			//if ( Intervals.isEmpty( i1 ) || Intervals.isEmpty( i2 ) )
			if ( Intervals.isEmpty( i2 ) )
				return Double.NEGATIVE_INFINITY;
			return Math.min( i1.realMax( d ), i2.realMax( d ) );
		}

@tischi
Copy link
Contributor Author

tischi commented Apr 26, 2022

I still don't get why there are sooo many recursions, but this is not very efficient:

if ( Intervals.isEmpty( i1 ) || Intervals.isEmpty( i2 ) )
	return Double.POSITIVE_INFINITY;
return Math.max( i1.realMin( d ), i2.realMin( d ) );

Because Intervals.isEmpty( i1 ) internally will call i1.realMin( d ), which is then called again in return Math.max( i1.realMin( d ), i2.realMin( d ) ); (the same holds true for i2). Thus, one could add explicit code for the Intervals.isEmpty( ... ) calls and then keep the i1.realMin( d ), i2.realMin( d ) values for computing the maximum.

@tischi
Copy link
Contributor Author

tischi commented Apr 26, 2022

Another place for optimization is that Intervals.isEmpty() contains this line of code: if ( interval.realMin( d ) > interval.realMax( d ) ) which will cause interval.isEmpty() to be evaluated twice (both within realMin() and realMax()).

@tischi
Copy link
Contributor Author

tischi commented Apr 26, 2022

Because Intervals.isEmpty( i1 ) internally will call i1.realMin( d ), which is then called again in return Math.max( i1.realMin( d ), i2.realMin( d ) ); (the same holds true for i2). Thus, one could add explicit code for the Intervals.isEmpty( ... ) calls and then keep the i1.realMin( d ), i2.realMin( d ) values for computing the maximum.

Adding the mentioned explicit code seems to help a bit but only gives a factor of 2 or so in speed up.

Right now, it feels to me that some (if only short-term, if that is possible) caching of at least whether the contained Intervals i1 and i2 isEmpty might be needed, but I am not sure.

@tischi
Copy link
Contributor Author

tischi commented Apr 26, 2022

Some new finding: it also hangs when combining the masks with "and", and your patch does not help in this case.

This is of course the case because there are UnionRealInterval and IntersectionRealInterval (for and and or), which both suffer independently from the same issue.

Another place for optimization is that Intervals.isEmpty() contains this line of code: if ( interval.realMin( d ) > interval.realMax( d ) ) which will cause interval.isEmpty() to be evaluated twice (both within realMin() and realMax()).

Maybe this is the reason for the explosion? Maybe that this leads to a 2 * 2 * 2 ... type of behavior?

In other words just to get one, e.g., realMin( 0 ) it currently accesses both realMin and realMax for all dimensions (for checking for emptiness), which in turn do the same again. That means realMin( d ) will cause (for a 3-D interval), 6 calls, where each of them again causes 6 calls; so maybe it is rather 6 * 6 * 6 ...?

Would it be possible to change the code (e.g., for interval instanceOf AbstractAdaptingRealInterval) such that there would be a realMinMax( double[][] ) which would get all dimensions in one go with checking only once for emptiness?

Or implement a special isEmpty() method...

@tischi
Copy link
Contributor Author

tischi commented Apr 26, 2022

@ctrueden @tpietzsch

I think I may have found a solution 🥳

All test run and it is fast ( only 8 ms even for a 1000 combined AND intervals ) .

	/**
	 * The intersection of two intervals. Adapts to changes of the source
	 * intervals.
	 */
	public static class IntersectionRealInterval extends AbstractAdaptingRealInterval
	{
		private final RealInterval i1;

		private final RealInterval i2;

		public IntersectionRealInterval( final RealInterval i1, final RealInterval i2 )
		{
			super( i1.numDimensions() );
			this.i1 = i1;
			this.i2 = i2;
			assert ( i1.numDimensions() == i2.numDimensions() );
		}

		@Override
		public double realMin( final int d )
		{
			if ( isEmpty() )
				return Double.POSITIVE_INFINITY;
			return Math.max( i1.realMin( d ), i2.realMin( d )  );
		}

		@Override
		public double realMax( final int d )
		{
			if ( isEmpty() )
				return Double.NEGATIVE_INFINITY;
			return Math.min( i1.realMax( d ), i2.realMax( d )  );
		}

                // Efficiently test emptiness, avoiding insane recursions            
	        public boolean isEmpty()
		{
			return isEmpty( i1 ) || isEmpty( i2 );
		}

                // Efficiently test emptiness, avoiding insane recursions
                // TODO: move somewhere "higher up" and add a case for UnionRealInterval
		public static boolean isEmpty( RealInterval interval )
		{
			if ( interval instanceof AbstractWrappedRealInterval )
			{
				final RealInterval source = ( ( AbstractWrappedRealInterval ) interval ).getSource();
				if ( source instanceof IntersectionRealInterval )
					return ( ( IntersectionRealInterval ) source ).isEmpty();
			}

			return Intervals.isEmpty( interval );
		}
	}

What do you think?

If you approve, I will continue to work on a PR tomorrow (getting late here).

@tpietzsch
Copy link
Member

 // Efficiently test emptiness, avoiding insane recursions            
public boolean isEmpty()
{
	return isEmpty( i1 ) || isEmpty( i2 );
}

This is most likely wrong.
The condition is sufficient but not necessary: the intersection of i1 and i2 may still be empty if neither i1 nor i2 are empty.

@tischi
Copy link
Contributor Author

tischi commented Apr 27, 2022

@tpietzsch In fact, I agree, I had the same thought, but in the original code it looked like this:

@Override
public double realMax( final int d )
{
	if ( Intervals.isEmpty( i1 ) || Intervals.isEmpty( i2 ) )
		return Double.NEGATIVE_INFINITY;
	return Math.min( i1.realMax( d ), i2.realMax( d )  );
}

The point is to replace the Intervals.isEmpty( i1 ) || Intervals.isEmpty( i2 ) by something computationally more efficient. I managed to do this with the code above.

Do you think that

a) already the original code is wrong, or
b) if I rename the new isEmpty() method to something more appropriate would be fine?

@tischi
Copy link
Contributor Author

tischi commented Apr 27, 2022

My feeling would be that it is wrong in the original code.

Maybe we should replace:

Intervals.isEmpty( i1 ) || Intervals.isEmpty( i2 )

with

Intervals.isEmpty( Intervals.intersect( i1, i2 ) )

?

@tischi
Copy link
Contributor Author

tischi commented Apr 27, 2022

I am confused.

In OperatorsTest.testAndMovingOperands() there are these two lines of code directly below each other:

		assertTrue( rm.isEmpty() );
		assertEquals( rm.realMax( 0 ), 13.5, 0 );

To me this indicates that we apparently do not want rm.realMax( 0 ) to return Double.NEGATIVE_INFINITY if the interval is empty.

I don't know what would be correct here, to continue I would need some help/explanation about what should happen here and why.

@tischi
Copy link
Contributor Author

tischi commented Apr 27, 2022

The behavior with just standard imglib2 is in fact such that it does return some finite value for the interval bounds even if it is empty:

		final FinalInterval i01 = FinalInterval.createMinMax( 0, 1 );
		final FinalInterval i23 = FinalInterval.createMinMax( 2, 3 );
		final FinalInterval empty = Intervals.intersect( i01, i23 );
		final boolean isEmpty = Intervals.isEmpty( empty );
		final double realMin = empty.realMin( 0 );
		System.out.println( isEmpty ); // yields: true
		System.out.println( realMin ); // yields: 2

Then I would suggest to proceed with my refactoring approach, but reconsider the name of the new isEmpty() method that I introduced.

@tischi
Copy link
Contributor Author

tischi commented Apr 27, 2022

No, confused again... ;-)

I don't get why we do these isEmpty checks in IntersectionRealInterval at all:

@Override
public double realMax( final int d )
{
	if ( Intervals.isEmpty( i1 ) || Intervals.isEmpty( i2 ) )
		return Double.NEGATIVE_INFINITY;
	return Math.min( i1.realMax( d ), i2.realMax( d )  );
}

In fact, If I just comment it out all test pass... 😕

@Override
public double realMax( final int d )
{
	//if ( Intervals.isEmpty( i1 ) || Intervals.isEmpty( i2 ) )
	//	return Double.NEGATIVE_INFINITY;
	return Math.min( i1.realMax( d ), i2.realMax( d )  );
}

So, maybe it should just be removed?
It would solve the recursion issue 😄

@ctrueden any opinions?

@tischi
Copy link
Contributor Author

tischi commented Apr 28, 2022

Could IntersectionRealInterval be considered a lazy version of Intervals.intersect?

If yes, then I think they should both behave similar and the currently existing isEmpty checks of the intersecting intervals within IntersectionRealInterval should be removed, because they do not exist in Intervals.intersect.

The same argumentation applies to UnionRealInterval and Intervals.union.

@tpietzsch
Copy link
Member

Could IntersectionRealInterval be considered a lazy version of Intervals.intersect?

If yes, then I think they should both behave similar and the currently existing isEmpty checks of the intersecting intervals within IntersectionRealInterval should be removed, because they do not exist in Intervals.intersect.

I think: Yes to both...

The same argumentation applies to UnionRealInterval and Intervals.union.

No.
Please see discussion here: imglib/imglib2#300

Intervals.union is broken for empty intervals. (Broken with respect to the expectation that union of two empty intervals should be empty.) In this case, I think the Intervals.union should be fixed, not the UnionRealInterval made also broken.

@bogovicj
Copy link
Contributor

A fix for Intervals.union is now here imglib/imglib2#301

@tischi tischi changed the title RealMaskRealInterval OR hangs when combining too many masks RealMaskRealInterval OR and AND hang when combining too many masks May 10, 2022
@tpietzsch
Copy link
Member

I merged imglib/imglib2#301. See imglib/imglib2#315 for follow-up discussion.

@tpietzsch
Copy link
Member

Some new finding: it also hangs when combining the masks with "and", and your patch does not help in this case.

This is of course the case because there are UnionRealInterval and IntersectionRealInterval (for and and or), which both suffer independently from the same issue.

Another place for optimization is that Intervals.isEmpty() contains this line of code: if ( interval.realMin( d ) > interval.realMax( d ) ) which will cause interval.isEmpty() to be evaluated twice (both within realMin() and realMax()).

Maybe this is the reason for the explosion? Maybe that this leads to a 2 * 2 * 2 ... type of behavior?

Yes, that's the problem exactly.

Would it be possible to change the code (e.g., for interval instanceOf AbstractAdaptingRealInterval) such that there would be a realMinMax( double[][] ) which would get all dimensions in one go with checking only once for emptiness?

I think this is the right idea.
I prototyped it here, and it fixes the performance issue. (union is now as fast as intersection in a similar test),
56bab03

What needs to be done:

  • Do the same for UnionInterval (I only fixed the UnionRealInterval).
  • Pull up the realMinMax(double[], double[]) into AbstractAdaptingRealInterval as an abstract method, and implement all other realMin/realMax in AbstractAdaptingRealInterval methods to use it. Then AbstractAdaptingRealInterval subclasses (only) need to implement realMinMax(double[], double[]). (I'm not 100% sure this is a good idea, but looks like the right direction. I think otherwise we may still get the 2 * 2 * 2 blow-up if we for example have a chain of union-transform-union-transform-...)

If somebody wants to work making this into a PR that would be great!

@tischi
Copy link
Contributor Author

tischi commented May 25, 2022

Just double checking: should we still do all of this even if we would have a isIntervalEmpty() method as discussed in imglib/imglib2#315?

If so, I could try to give this a shot next week.

@tischi
Copy link
Contributor Author

tischi commented May 25, 2022

@tpietzsch
I started working on this here: #63
I will ping you once it's ready.

@tpietzsch
Copy link
Member

Just double checking: should we still do all of this even if we would have a isIntervalEmpty() method as discussed in imglib/imglib2#315?

Yes, I think isIntervalEmpty() alone would not solve it.

@imagejan
Copy link
Member

Can this be closed, now that #63 is merged?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants