Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sim pauses when large numbers of bunnies are born #60

Closed
pixelzoom opened this issue Apr 21, 2020 · 73 comments
Closed

sim pauses when large numbers of bunnies are born #60

pixelzoom opened this issue Apr 21, 2020 · 73 comments

Comments

@pixelzoom
Copy link
Contributor

pixelzoom commented Apr 21, 2020

Related to phetsims/tandem#173 (PhetioGroup is not scalable).

As the number of bunnies increases, we start to experience problems with responsiveness and performance. This is even before starting to collect the large amounts of data mentioned in #46.

When large numbers of bunnies mate at 12:00 on the generation clock, that results in many calls to PhetioGroup.createNextElement. For example, if I have 800 bunnies, that's 400 pairs x 4 per litter = 1600 bunnies that will be created, and 1600 calls to PhetioGroup.createNextElement. This results in a very noticeable pause in animation - seconds in some cases.

Is this pause in animation acceptable?

@pixelzoom
Copy link
Contributor Author

pixelzoom commented Apr 21, 2020

This problem can be demonstrated with

  1. Run https://phet-dev.colorado.edu/html/natural-selection/1.0.0-dev.31/phet/natural-selection_en_phet.html?secondsPerGeneration=1
  2. Go to the Lab screen
  3. Press the "Add a Mate" button
  4. Observe how the pause in animation increases with each successive generation.

@pixelzoom
Copy link
Contributor Author

pixelzoom commented Apr 23, 2020

Discussed at 4/23/20 design meeting. This pause is not desirable.

@samreid determined that 58% of time is in allocating BunnyNode, 31% is in allocating the PressListener for each BunnyNode (used to select bunnies). See profiler screenshot below. He suggested pre-allocating 3000 BunnyNode instances. Since we don't know what a BunnyNode will look like until a Bunny is born, BunnyNode would need to be fully mutable.

image

@samreid
Copy link
Member

samreid commented Apr 23, 2020

I also mentioned on slack:

If you preallocate all Bunny and BunnyNode now before studio is optimized, it would probably make it unusable.???

@pixelzoom
Copy link
Contributor Author

pixelzoom commented Apr 23, 2020

Notes on computations of how many Bunny instances are needed.

Live bunnies: Bunnies take over the world when the population is >= 1000. So the maximum number of live bunnies occurs when the number of bunnies that mate is as close to 1000 as possible. That would be 999 bunnies. But only 998 of those can mate because they are paired. 998 bunnies + 998/2 pairs * 4 bunnies per litter = 2994 live bunnies.

Dead bunnies: For the selected bunny, the Pedigree graph will display 5 generations of ancestors, including the selected bunny and 4 generations of ancestors. That is 31 bunnies, possibly all dead. (The selected bunny itself could die while it's being displayed in the Pedigree, and must remain displayed.) Bunnies share ancestors, so we don't need 2994 * 31 dead bunnies. Bunnies live for 2 generations, then die of old age. So disposing of all bunnies who were born >= 7 generations ago might be a reasonable strategy. More generally, (number of generations shown in Pedigree, including select bunny) + (max bunny age). In any case, I'm not sure how to accurately compute the maximum number of dead bunnies.

@pixelzoom
Copy link
Contributor Author

More discussion on total number of bunnies with @samreid on Slack. We think the total number of Bunny instances maybe be ~4000 per screen.

Sam Reid 4:06 PM
So a loose upper bound for number would be 4000, since you need to keep dead bunnies <=4 generations away?

Chris Malley 4:07 PM
I don't follow. 4 generations is 30 ancestors per bunny.

Sam Reid 4:28 PM
Right but generation N-1 has a max of 1000 bunnies, N-2 has max 1000 bunnies, N-3 has max 1000 bunnies, N-4 has max 1000 bunnies, and you said “I need to keep all dead bunnies who are <= 4 generations away from a living bunny.” So the max amount of dead bunnies you need to keep is 4000, right?

Chris Malley 4:28 PM
I see. Yes, that would probably work.

@pixelzoom
Copy link
Contributor Author

pixelzoom commented Apr 23, 2020

More thoughts on pre-allocation, after discussing with @samreid via Zoom ...

Pre-allocating would mean that PhetioGroup is unnecessary. That represents a lot of work here, but at least it has informed PhetioGroup evolution.

Pre-allocating BunnyNode instances seems doable. It's currently a matter of associating a new Bunny instance with the BunyNode, and changing the BunnyNode's image based on the phenotype of the Bunny instance. It will become a bit more complicated for the BunnyNode instances shown in the Pedigree graph, since they show more info, but maybe they should be a subclass of BunnyNode (with no PressListener) and created on demand.

Pre-allocating Bunny instances seems potentially problematic, because:

  • Studio (and the sim) would need to be able to start up with 4000 Bunny instances per screen allocated.
  • Bunny is a complicated data structure (see outline below), so mutation would not be trivial.
  • Bunny mutation will modify instrumented Properties, and that will result in data stream messages. Is that OK? Or do we need a way to turn off data stream while an element from a pool is being mutated?
  • Bunny mutation would involve the need to reinstantiate or mutate instances of other classes, specifically Genotype, GenePair, and Phenotype.
Bunny

  number generation
  Bunny father
  Bunny mother

  // instrumented Properties
  Property.<Vector3> positionProperty
  EnumerationProperty.<SpriteDirection> directionProperty
  NumberProperty ageProperty
  BooleanProperty isAliveProperty

  // genetic blueprint, inherited from parents
  Genotype genotype 
    GenePair furPair
      Allele fatherAllele
      Allele motherAllele
    GenePair earsPair
      Allele fatherAllele
      Allele motherAllele
    GenePair teethPair
      Allele fatherAllele
      Allele motherAllele

  // appearance, based on genotype and allele dominance
  Phenotype phenotype 
    Allele furAllele
    Allele earsAllele
    Allele teethAllele

@pixelzoom
Copy link
Contributor Author

pixelzoom commented Apr 24, 2020

In #60 (comment), @samreid noted that PressListener allocation for each BunnyNode was a significant cost. In the above commit, I'm now using a single PressListener for the entire collection of BunnyNodes, rather than one per BunnyNode.

The pause is now much shorter. But when bunnies mate to produce a total population of 1434, the time is still ~900 ms (unbuilt, ?ea) and ~400 ms (built and unbuilt, no ?ea)

@pixelzoom
Copy link
Contributor Author

@samreid and I looked at this briefly today via Zoom. ~10% is spent creating the selection Rectangle in BunnyNode. It's currently part of BunnyNode because the selection Rectangle needs to move and scale with BunnyNode. But perhaps I can find a way to use a single selection Rectangle for the entire BunnyNode collection.

@pixelzoom
Copy link
Contributor Author

pixelzoom commented May 5, 2020

I tried creating the selection Rectangle on demand, here's the patch:

Patch
Index: js/common/model/BunnyCollection.js
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
--- js/common/model/BunnyCollection.js	(revision 507a91db0f1d6261bac156e8a712ecf9a2a80d01)
+++ js/common/model/BunnyCollection.js	(date 1588718689922)
@@ -268,27 +268,29 @@
     }
 
     // Mate adjacent pairs from the collection, applying mutations where appropriate.
-    for ( let i = 1; i < bunnies.length; i = i + 2 ) {
+    NaturalSelectionUtils.logTime( 'mate', () => {
+      for ( let i = 1; i < bunnies.length; i = i + 2 ) {
 
-      const father = bunnies[ i ];
-      const mother = bunnies[ i - 1 ];
+        const father = bunnies[ i ];
+        const mother = bunnies[ i - 1 ];
 
-      for ( let j = 0; j < NaturalSelectionConstants.LITTER_SIZE; j++ ) {
+        for ( let j = 0; j < NaturalSelectionConstants.LITTER_SIZE; j++ ) {
 
-        this.createBunny( {
-          father: father,
-          mother: mother,
-          generation: generation,
-          genotypeOptions: {
-            furMutation: ( furIndices && furIndices.indexOf( bornCount ) >= 0 ) ? furMutation : null,
-            earsMutation: ( earsIndices && earsIndices.indexOf( bornCount ) >= 0 ) ? earsMutation : null,
-            teethMutation: ( teethIndices && teethIndices.indexOf( bornCount ) >= 0 ) ? teethMutation : null
-          }
-        } );
+          this.createBunny( {
+            father: father,
+            mother: mother,
+            generation: generation,
+            genotypeOptions: {
+              furMutation: ( furIndices && furIndices.indexOf( bornCount ) >= 0 ) ? furMutation : null,
+              earsMutation: ( earsIndices && earsIndices.indexOf( bornCount ) >= 0 ) ? earsMutation : null,
+              teethMutation: ( teethIndices && teethIndices.indexOf( bornCount ) >= 0 ) ? teethMutation : null
+            }
+          } );
 
-        bornCount++;
-      }
-    }
+          bornCount++;
+        }
+      }
+    } );
 
     assert && assert( bornCount === numberToBeBorn, 'unexpected number of bunnies were born' );
     phet.log && phet.log( `${bornCount} bunnies born` );
Index: js/common/view/BunnyNode.js
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
--- js/common/view/BunnyNode.js	(revision 507a91db0f1d6261bac156e8a712ecf9a2a80d01)
+++ js/common/view/BunnyNode.js	(date 1588719237750)
@@ -50,18 +50,8 @@
       bottom: 0
     } );
 
-    // Rectangle that appears around this Node when bunny is selected
-    const selectionRectangle = new Rectangle( wrappedImage.bounds.dilated( 5 ), {
-      fill: 'rgba( 0, 0, 0, 0.25 )',
-      stroke: 'blue',
-      lineWidth: 2.5,
-      cornerRadius: NaturalSelectionConstants.CORNER_RADIUS,
-      center: wrappedImage.center,
-      pickable: false
-    } );
-
     assert && assert( !options.children, 'BunnyNode sets children' );
-    options.children = [ selectionRectangle, wrappedImage ];
+    options.children = [ wrappedImage ];
 
     // Red dot at the origin
     if ( NaturalSelectionConstants.SHOW_ORIGIN ) {
@@ -70,11 +60,25 @@
 
     super( bunny, options );
 
+    // Rectangle that appears around this Node when bunny is selected. Created on demand.
+    let selectionRectangle = null;
+
     // Indicate that this bunny is selected
     const selectedBunnyListener = someBunny => {
+      if ( !selectionRectangle ) {
+        selectionRectangle = new Rectangle( wrappedImage.bounds.dilated( 5 ), {
+          fill: 'rgba( 0, 0, 0, 0.25 )',
+          stroke: 'blue',
+          lineWidth: 2.5,
+          cornerRadius: NaturalSelectionConstants.CORNER_RADIUS,
+          center: wrappedImage.center,
+          pickable: false
+        } );
+        this.insertChild( 0, selectionRectangle );
+      }
       selectionRectangle.visible = ( someBunny === bunny );
     };
-    selectedBunnyProperty.link( selectedBunnyListener );
+    selectedBunnyProperty.lazyLink( selectedBunnyListener );
 
     // @private
     this.disposeBunnyNode = () => {

Here's the performance comparison before and after, measuring the for loop in BunnyNode.mateAllBunnies using NaturalSelectionUtils.logTime:

generation bunnies born before (ms) after (ms)
1 4 8 7
2 12 21 17
3 36 60 51
4 108 144 119
5 320 360 402
6 952 924 967

Note that the patch actually takes more time for larger numbers of bunnies. (I did several runs of this test to confirm.) My guess is that creating and lazyLink'ing the input listener is taking more time than is saved by deferring creation of the selection Rectangle.

Deleting the selection Rectangle and selection listener entirely, generation 5 & 6 times dropped to 305ms and 817ms respectively -- still not acceptable performance (< 500 ms).

@pixelzoom
Copy link
Contributor Author

Labeling for design meeting, to get input from team on what acceptable responsiveness is.

@pixelzoom
Copy link
Contributor Author

pixelzoom commented May 7, 2020

5/7/2020 design meeting @amanda-phet @kathy-phet @ariel-phet @arouinfar

The current pause would be acceptable to publish, but still worth investigating if there are opportunities to make it more efficient.

@ariel-phet suggested having a "standard minimum pause" when bunnies are born, but consensus was that we don't need that.

We will need to revisit this when more of the sim is fleshed out. Deferring until then.

@pixelzoom
Copy link
Contributor Author

pixelzoom commented Jun 30, 2020

Most of the sim is now fleshed out, so this can be revisited.

For this scenario:

  1. start sim, go to Lab screen
  2. press "Add a Mate" button
  3. wait for bunnies to take over the world

... here's the performance of the unbuilt version (with assertions enabled) vs the built version on a 2019 MacBook Pro + macOS 11.15.5 + Chrome 83:

generation bunnies born unbuilt + ?ea (ms) built (ms)
1 4 9.6 4.7
2 12 20.9 11.4
3 36 58.3 30.4
4 108 134.8 68.6
5 320 430.9 163.4
6 952 892.4 410.5

The unbuilt version with assertions enabled is about 2x slower, because of the time required to check assertion expressions. The unbuilt version with assertions disabled (not shown in the above table) is about the same as the built version, just a tad slower.

Less than 500 ms seems acceptable to me, and #60 (comment) indicated that we were willing to publish with as much as 1 second pause. So I'm inclined to close this issue, but I'll leave that up to @amanda-phet. If the current performance is OK, feel free to close this issue. If you'd like me to try speeding things up more, assign back to me, let me know what an acceptable pause time is, and I'll be happy to investigate.

@pixelzoom
Copy link
Contributor Author

Labeled for design meeting to get feedback on what an acceptable delay is.

@samreid
Copy link
Member

samreid commented Aug 21, 2020

In https://github.com/phetsims/phet-io/issues/1695 we added caching to phetioEngine.getAllRelatedTypes which speeds up the generation 6 start time by about 5.5%.

@pixelzoom
Copy link
Contributor Author

Excellent. I'll take my iPad6 for another performance test sometime early next week.

@pixelzoom
Copy link
Contributor Author

pixelzoom commented Aug 24, 2020

Testing the 5.5% improvement to phetioEngine.getAllRelatedTypes mentioned by @samreid in #60 (comment) ...

Following the testing steps in #60 (comment) with 1.1.0-dev.14 ...

For brand=phet, https://phet-dev.colorado.edu/html/natural-selection/1.1.0-dev.14/phet/natural-selection_en_phet.html?showTimes&secondsPerGeneration=1

test device OS browser times (ms) average (ms) previous (ms)
PixelZoom - iPad6 iOS 13.6.1 Safari 13 236, 173, 159 189 191
PixelZoom - 2019 MacBook Pro macOS 10.15.6 Chrome 84.0.4147.135 166, 122, 101 130 139

For brand=phet-io, https://phet-dev.colorado.edu/html/natural-selection/1.1.0-dev.14/phet-io/natural-selection_all_phet-io.html?postMessageOnError&phetioStandalone&showTimes&secondsPerGeneration=1

test device OS browser times (ms) average (ms) previous (ms)
PixelZoom - iPad6 iOS 13.6.1 Safari 13 1215, 1403, 691 1103 542
PixelZoom - 2019 MacBook Pro macOS 10.15.6 Chrome 84.0.4147.135 232, 186, 166 195 202

Performance looks the same (maybe a slight improvement?) on Chrome. Performance of brand=phet is about the same on iPad6. And performance of brand=phet-io on iPad6 is significantly worse - almost 2x slower (I ran those tests 3x to be sure.)

@samreid any idea why the performance on iPad has suddenly taken a turn for the worse? Is phetioEngine.getAllRelatedTypes actually relevant to this issue, or is it relevant to #140? ("Start Over" button).

@pixelzoom pixelzoom assigned samreid and unassigned samreid Aug 24, 2020
@pixelzoom
Copy link
Contributor Author

pixelzoom commented Aug 24, 2020

Over in #140 (comment), @samreid said:

I think getAllRelatedTypes is called during object creation, so more likely related to #60 which is about bunny creation.

So a 5.5% performance increase is indeed expected for this issue. I don't see that increase, and in fact see a huge decrease in performance (2x slower) on iPad6. Any idea why?

@samreid
Copy link
Member

samreid commented Aug 24, 2020

Some notes from recent runs:

Sam iPad 6
OS 13.3
Time to mate: 396ms, 391ms, 392ms, 372ms
Start over: 175ms, 141ms
Time to mate: 332ms, 331ms

Chris 13.6.1
590, 757, 757, 452, 348

@pixelzoom
Copy link
Contributor Author

pixelzoom commented Aug 24, 2020

@samreid and I collaborated on this via Zoom. During our session, he reported the runs in the previous comment.

Testing 1.1.0-dev.16 brand=phet-io again on my iPad6, ~15 runs were all in the 450-600 ms range. I'm not sure what was going on with the iPad6 results in #60 (comment), but I guess I won't worry about it.

On a Lenovo 300e Chromebook, @samreid reports for 1.1.0-dev.16 brand=phet-io:

time to mate = 1131
time to Start Over = 621 ms

That seems reasonable to me.

@pixelzoom
Copy link
Contributor Author

@samreid says there are no additional PhET-iO performance improvements planned.

Performance now seems acceptable to me, including on lower end devices like Chromebooks. So I think we're ready to move to the performance "sign-off" phase in #191.

@KatieWoe
Copy link
Contributor

@ariel-phet asked for the #60 (comment) tests to be done on Lovelace since it is a slower system. Results:

test device OS test times (ms) average (ms)
Lovelace ChromeOS 85.0.4183.59 phet brand time to mate 1073, 656, 545 758
Lovelace ChromeOS 85.0.4183.59 phet brand time to start over 427, 407, 327 387
Lovelace ChromeOS 85.0.4183.59 phet io time to mate 1541, 1250, 1092 1294
Lovelace ChromeOS 85.0.4183.59 phet io time to start over 860, 710, 859 810

Note that the bug where clicking "Start Over" twice skips the button to add a friend. This seems like it would be relatively easy to do on even a faster moving device.

@pixelzoom
Copy link
Contributor Author

pixelzoom commented Aug 25, 2020

Performance for Lovelace looks reasonable to me. "Start Over" is < 1 second, and time to mate is not huge.

@KatieWoe said:

Note that the bug where clicking "Start Over" twice skips the button to add a friend. This seems like it would be relatively easy to do on even a faster moving device.

I brought this usability issue up in #166. @amanda-phet didn't see evidence of this in interviews, so she decided it's not a problem. I'm going to re-open that issue, add @KatieWoe's comment there, and let @amanda-phet revisit.

@amanda-phet
Copy link
Contributor

@pixelzoom said

Performance for Lovelace looks reasonable to me. "Start Over" is < 1 second, and time to mate is not huge.

It looks like time to mate with PhET-iO is kind of huge (up to 1541 ms), but that seems like the worst case scenario. I am totally fine with the performance in all other areas, and since I can't really experience the 1.5s pause I'll just have to defer to you all if that seems acceptable.

I'll respond to #166 directly. Closing this issue.

@pixelzoom
Copy link
Contributor Author

pixelzoom commented Aug 26, 2020

I agree that 1541 ms is "kind of huge". I think we could live with it, but it's not the best UX.

I have a couple of new ideas about this issue that I'd like to bounce off of @amanda-phet. Reopening and assigning to me so that I remember.

@pixelzoom pixelzoom reopened this Aug 26, 2020
@pixelzoom pixelzoom assigned pixelzoom and unassigned amanda-phet Aug 26, 2020
@pixelzoom
Copy link
Contributor Author

@amanda-phet and I discussed a couple of other idea that I had:

(1) Create bunnies in small batches, say 50 at a time. This would still cause the clock to pause at 12:00 (and maybe even longer), but you'd see something happening during that pause. The downside is that this isn't a good match for the "single data point" that mating produces.

(2) Pool (reuse) bunny SpriteInstances. Unlikely to have a big performance impact, likely to make the code more complicated (and it's already complicated).

So we decided to leave this where it is, and live with the brief pause at 12:00.

@pixelzoom
Copy link
Contributor Author

pixelzoom commented Aug 26, 2020

A few more notes about:

(2) Pool (reuse) bunny SpriteInstances. ....

I investigated this approach briefly with @jonathanolson, and it was pretty easy to get pooling working. But pooling turned out to be a wash (didn't change the performance) and was more complicated, so I opted to keep things as is.

But this exercise did result in improvements. @jonathanolson identified some inefficiencies (use of merge, object allocations,...) in my SpriteInstances class hierarchy. I improved those things in the above commit, and I'm seeing ~35% performance improvement for "time to mate" on my MacBook Pro (unbuilt brand=phet version in Chrome). Time to mate for the 6th generation dropped from 130ms to 85ms.

@pixelzoom pixelzoom mentioned this issue Aug 27, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

10 participants