Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possible ANR fix #6160

Merged
merged 5 commits into from Nov 3, 2021
Merged

Possible ANR fix #6160

merged 5 commits into from Nov 3, 2021

Conversation

AGulev
Copy link
Contributor

@AGulev AGulev commented Nov 2, 2021

No description provided.

@AGulev AGulev requested review from britzl and JCash November 2, 2021 13:16
if (_glfwWin.iconified) {
timeout = 1000 * 3000;
timeoutMillis = 1000 * 3;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

according to Android documentation the timeout value is in ms not nanoseconds: https://developer.android.com/ndk/reference/group/looper#alooper_pollall

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, ok, that's interesting!

Copy link
Contributor Author

@AGulev AGulev Nov 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! This maybe the main reason of ANRs we see, actually. We have 5 sec top before ANR because of input (and 10 sec for ANR because of BroadcastReceiver) and here we had a long 3000sec timeout.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, using microseconds is common, but I guess someone didn't read the docs properly.

Q: Given that we are investigating ANR's which have a threshold of 500ms, should we really wait 3000ms then?

Copy link
Contributor Author

@AGulev AGulev Nov 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's a good question
I see messages like this:

Input dispatching timed out (Waiting to send non-key event because the touched window has not finished processing certain input events that were delivered to it over 500.0ms ago. Wait queue length: 54. Wait queue head age: 8508.7ms.)

But in docs:

No response to an input event (such as key press or screen touch events) within 5 seconds.

I don't see any ANRs where Wait queue head age is less that 5000ms. But I'm not sure if ANR happens after 500ms because App has not finished processing certain input events that were delivered to it over 500.0ms ago or it's just an info message and it waits up to 5000ms like in docs

}

if (app->destroyRequested != 0) {
return;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If app destroy requested we can exit the loop

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, and this is set from android_native_app_glue.c?

Copy link
Contributor Author

@AGulev AGulev Nov 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, android_native_app_glue.c rise this flag and then waits when app destroys (exit from android_main() in our case this is engine_main())

But I'm not sure If I should directly call return (and exit loop) here or maybe it would be better to set _glfwWin.opened = 0; here and make sure we exit app (even if we didn't get APP_CMD_DESTROY yet. Because if we exit this loop it's possible we will not get it)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we've had issues with OpenAL and shutting down, won't it cause more issues if we simply exit here? If not, then I guess it's safe to exit here.

If we should wait for the game/sound threads, then I think shutting down sooner is preferred. So somehow make sure the engine starts the shutdown sequence?

britzl
britzl previously approved these changes Nov 2, 2021
Copy link
Contributor

@britzl britzl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good I think. Might be worth to get some feedback from @JCash as well.

if (_glfwWin.iconified) {
timeout = 1000 * 3000;
timeoutMillis = 1000 * 3;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, ok, that's interesting!

}

if (app->destroyRequested != 0) {
return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, and this is set from android_native_app_glue.c?

JCash
JCash previously approved these changes Nov 3, 2021
@AGulev AGulev dismissed stale reviews from JCash and britzl via b25e311 November 3, 2021 09:58
@AGulev AGulev requested a review from britzl November 3, 2021 10:22
@AGulev AGulev merged commit 357ebbd into dev Nov 3, 2021
@AGulev AGulev deleted the possible_anr_fix branch November 3, 2021 12:46
AGulev added a commit that referenced this pull request Nov 3, 2021
* timeout is in milliseconds

* exit ALooper_pollAll loop if app destroy requested

* fix wrong structure name

* Revert "fix wrong structure name"

This reverts commit efc2686.

* Revert "exit ALooper_pollAll loop if app destroy requested"

This reverts commit 653b17f.
britzl added a commit that referenced this pull request Dec 10, 2021
…#6230)

* Possible ANR fix (#6160)

* timeout is in milliseconds

* exit ALooper_pollAll loop if app destroy requested

* fix wrong structure name

* Revert "fix wrong structure name"

This reverts commit efc2686.

* Revert "exit ALooper_pollAll loop if app destroy requested"

This reverts commit 653b17f.

* Restrict PRs to Defold org

* Update main-ci.yml

* Changed encoding (#6163)

* Break out of loop when printing stack trace (#6156)

* Check for resource key in both local and remote cache (#6148)

* Check for key in both local and remote cache

* Forgot to throw exception

* Issue-6151 Template override with layouts bug (#6166)

* Apply Default layout before applying another layout overridings

* Test

* fix naming

* Tests polishment after review

* New onCreate hook for Android in dmSDK (and breaking change for RegisterAndroidOnActivityResultListener) (#6171)

* onCreate hook for android

* Review fixes

* fix types

* Added OP Games logo

skip-ci

* Add support for DualSense Wireless Controller (mac)

* Ignore some project options when calculating cache key (#6159)

* Ignore some project options when calculating cache key

* Associate bob options with how the option impacts the resource cache

* Changed from ignoreOption to includeOption

* Update Bob.java

* Fixed resource cache key tests

* Update ResourceCacheKeyTest.java

* Code cleanup

* Bumped version to 1.2.190

* Make sure to create vertex declarations first time a custom buffer is set (#6178)

* Added the generated texture set protobuf output to the AtlasNode (#6182)

* Added the generated texture set protobuf output to the AtlasNode

* review fix

* Added release notes for console platforms (#6187)

* Added support for ABGR in texc, in order to skip a java data copy loop (#6192)

* Added support for ABGR in texc, in order to skip a java data copy loop

Saved ~20-25% in the case of the spineboy atlas

* Update README_SETUP_LINUX.md

* Issue 6199: Fixed issue of getting/setting render constants on label (#6200)

* Update README_SETUP_LINUX.md

* Avoid using memset on hashtable/array members (#6197)

* Issue 6189: Added dmSpinlock to dmSDK (#6196)

* Issue 6189: Added dmSpinlock to dmSDK

* typo

* Updated NE plugin build rules to work for macOS/Win32/Linux (#6202)

* Updated to jna-5.10.0 for improved error messages

* Added jna+jna-platform jar files

* moved files

* Updated linkCmdCXXSh to support start/end groups for different platforms

* Fix implicit cast of jobject to ANativeActivity (#6209)

* fix implicit cast of jobject to ANativeActivity

* Revert "fix implicit cast of jobject to ANativeActivity"

This reverts commit e9d8c83.

* use Activity jobject instead of ANativeActivity

* revert `jobject thiz` changes

* Issue 6210: Fix when setting more than 4 render constants (#6211)

* Added documentation example for gui.get_font_resource() (#6212)

* Remove mobile dev app (#6208)

* Don't build the mobile dev app

* Update waf_dynamo.py

* Cleaned up script a bit more

* Removed apkc

* Removed go

* Change to apksigner

* Delete apkc

* Use correct string length and number format (#6204)

* Use correct string length

Also make sure to use same number format

* Compare tostring and concat result

* Improved calculation of max buffer length for vmath userdata

* Issue 3775 collision sticking - II (#6205)

* More flexible box2d velocityThreshold setting - ref #3775

Updatable from a respective game.project property

* Fixes for box2d velocity threshold overrides - ref #3775

- The feature works when editing game.projectc directly
- The role of the physics 'scale'property seems to affect when the threshold value
gets effective.

* 'velocityThreshold' project property for the editor and 'scale' compensation - ref #3775

* WIP: Testing: Added scale and velocityThreshold properties to test GamesysTest base class

* Testing: (WIP) VelocityThreshold test in gamesys - ref #3775

* Issue 3775: Wrapped testing code and final code review - ref #3775

* Initialize velocity threshold in physics context. Fixes test behavior - ref #3775

* Removed WebP remnants (#6213)

* Possible fix for ANR #2 (#6172)

* Possible fix for ANR #2

* typo fixes

* Exit the main loop

* change place where we call `androidDestroyWindow();`

* add missed include

* Revert "add missed include"

This reverts commit ea8f07c.

* Revert "change place where we call `androidDestroyWindow();`"

This reverts commit dc12019.

* immediate app exit when exit requested

* Added comments

* Issue-6089 Set go.property() for textures and fonts in `gui` component

* sent font (without deleting for now)

* Release resources

* Revert "sent font (without deleting for now)"

This reverts commit 2731c37.

* use key for hashtable parameters

* use dmArray for resorce pointers

* Revert "Revert "sent font (without deleting for now)""

This reverts commit af482bb.

* use `hash` instead of `char*` all the way as hashtable key

* documentation fix

* Review fixes

* review fixes

* go.get() fonts

* fix error reporting

* fix names

* better error handling

* better error handling

* get textures from gui component

* Add go.set() for textures in GUI component

* Documentation

* initial test

* cleanup tests

* fix uniform test

* fix test done counter

* fix test_done counter

* fix typo

* more tests for fonts

* go.set/go.get "textures" test

* remove unused array size

* fix operators order

* make sure property_options.m_HasKey == 0

* make sure property_options.m_HasKey == 0 for get function as well

* review fixes

* review fixes

* review fixes

* review fixes

* README.md: HTTP => HTTPS (#6219)

* Add function `gui.set_alpha()` and `gui.get_alpha()` (#6218)

* Added missing documentation for indices/keys in go.get/set (#6220)

* Added missing documentation for indices/keys in go.get/set

* review fixes

* Issue-5791 `gui.animate` doesn't set value if duration = 0 and delay > 0 (#6216)

* take into account delay of animation in gui

* remove comment

* make negative values consistent with how `go.animate` works

* test animate itself

* Add callback testing

* review fixes

* Split spinlock types into multiple files for easier integration to console repos (#6221)

* Added VSCode + Calva readme (#6226)

* Compile fix

* Update to windows-latest machines (#6225)

* Updated to windows-latest for CI

Use local python2 on macOS

* Updated to new version of setup-python action

* Ensure editor builds can see changes from pre-build hooks

Co-authored-by: Alexey Gulev <me@agulev.com>
Co-authored-by: Björn Ritzl <bjorn.ritzl@gmail.com>
Co-authored-by: JCash <mwesterdahl76@gmail.com>
Co-authored-by: Orestis Tsakiridis <otsakiridis@freemail.gr>
Co-authored-by: Kai <Schweinepriester@users.noreply.github.com>
AGulev added a commit that referenced this pull request Dec 13, 2021
* timeout is in milliseconds

* exit ALooper_pollAll loop if app destroy requested

* fix wrong structure name

* Revert "fix wrong structure name"

This reverts commit efc2686.

* Revert "exit ALooper_pollAll loop if app destroy requested"

This reverts commit 653b17f.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants