[INFO] URSACHE: 9002, Etwas ist schiefgelaufen. Timeout #998

Closed
Jo-Achim opened this issue Aug 11, 2020 · 34 comments · Fixed by #1138
Labels: bug (Something isn't working), mirrored-to-jira (This item is also tracked internally in JIRA)

Comments

@Jo-Achim

Jo-Achim commented Aug 11, 2020

[INFO] CAUSE: 9002, Something went wrong. Time-out

Android 10 with
Android security update: July 1, 2020
Nokia 7.1, Build: 00WW_4_15C_SP02
Internet: WLAN (only)
CWA-Version: 1.1.1

The problem occurred for the first time today and was repeatable.

Here is the documentation:

  1. Check-after-Start: [screenshot]
  2. Error-Message: [screenshot]
  3. Error-Message-Details: [screenshot]
  4. Display-after-Error (normal): [screenshot]
  5. Additional-Info: [screenshot]

Note: These screenshots were taken on the 3rd or 4th launch of the CWA; each launch showed the same sequence and error messages.


Update from this afternoon:

Additional information: Tapping the arrow in figure "4. Display-after-Error (normal)" triggers the same timeout again (I could not compare the exact details); "Prüfung läuft..." ("Check in progress...") is shown; see "6. Test-in-progress".

When I tried to capture this process including the details, the error did not occur again.
What is striking, however, is that the update shown as "Updated: Today, 16:12" took place exactly 23 hours after the previous update; compare the "Updated" timestamps in "7. Your-Risk-Status" and "4. Display-after-Error (normal)".
That is, the error documented above could be related to an attempt to update within 23 hours of the previous update. Once 23 hours had passed (today: 16:12) since the previous update (yesterday: 17:12), the error no longer occurred.
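
As a quick check of the interval described above, here is a minimal sketch. The calendar dates are an assumption inferred from the issue date (August 11, 2020), and the variable names are illustrative only.

```python
from datetime import datetime, timedelta

# Timestamps from the "Updated" lines described in the report.
# The calendar dates are assumed from the issue date (August 11, 2020).
previous_update = datetime(2020, 8, 10, 17, 12)  # yesterday, 17:12
latest_update = datetime(2020, 8, 11, 16, 12)    # today, 16:12

elapsed = latest_update - previous_update
print(elapsed)  # 23:00:00
```

This only confirms the arithmetic in the report; it says nothing about how the app itself schedules its updates.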

  6. Test-in-progress: [screenshot]

After the check completed, the above error "Cause: 9002 ... Timeout" was displayed before 4:12 pm. (The screenshot was taken while trying to document this timeout as well, but by then (4:12 pm) the error no longer occurred.)

  7. Your-Risk-Status (Updated): [screenshot]

Internal Tracking ID: EXPOSUREAPP-2190

@Jo-Achim Jo-Achim added the bug Something isn't working label Aug 11, 2020
@ghost ghost added the mirrored-to-jira This item is also tracked internally in JIRA label Aug 12, 2020
@ghost ghost assigned JoachimFritsch Aug 12, 2020
@ghost

ghost commented Aug 12, 2020

Hello @Jo-Achim,

thanks for reaching out. I have forwarded it to our development team.

Thanks,
LMM

Corona-Warn-App Open Source Team

@vaubaehn
Contributor

vaubaehn commented Aug 12, 2020

Hi @Jo-Achim, in the screenshots there is a shield visible in the notification bar. Do you use an antivirus app, or what is the shield related to?
Also, did you have a power-saving mode enabled at any time (Android's built-in battery saver or a Nokia-specific one), or anything else that might have reduced your phone's CPU speed?

@Gladdi

Gladdi commented Aug 12, 2020

Same issue here.

No energy-saving mode active. I use Kaspersky Anti-Virus.

@vaubaehn
Contributor

vaubaehn commented Aug 12, 2020

@Gladdi thanks for reporting!

I overlooked in the screenshots above that the timeout occurred during a web request (OkHttp).

@Gladdi @Jo-Achim do you use any firewall on your phone, or does your anti virus app / Kaspersky contain a firewall module?

Are you able to reproduce the issue

  • when CWA is closed and running in the background, and you turn off Kaspersky / your antivirus app / firewall BEFORE CWA's scheduled exposure check starts? For example, CWA usually checks around 4 am: turn off Kaspersky / antivirus / firewall before you go to sleep, wake up, open CWA (hopefully everything worked), and then turn Kaspersky / antivirus / firewall on again.
  • when you turn off Kaspersky / antivirus / firewall before you open CWA at any time during the day?

@Jo-Achim
Author

Jo-Achim commented Aug 12, 2020

@vaubaehn: the shield in the notification bar comes from: "NetGuard Pro", v2.285 (https://netguard.me/).
And I have installed "Norton Mobile Security", v4.8.0.4542.

Corona-Warn-Configuration under NetGuard: Allow WLAN when the screen is switched on; not cellular.

Both "NetGuard Pro" and "Norton Mobile Security" were installed long before the first installation of CWA (June 16th), and no (configuration) changes were made here (between the last error-free use of CWA and the problem described)!

Battery usage / optimization: Corona-Warn-App: Not optimized.
Prioritized background activity: on.

Reproducing the issue is hardly possible:

  1. After 4:12 pm (August 11th, 2020) the problem no longer occurred (CWA 1.1.1).
  2. I updated to CWA 1.2.0 on August 12th, 2020. Sorry.

After the above error message, the problem has not reappeared; neither under CWA 1.1.1 nor under CWA 1.2.0.

Best regards.

@Jo-Achim
Author

Jo-Achim commented Aug 13, 2020

Android 10 with
Android security update: July 1, 2020
Nokia 7.1, Build: 00WW_4_15C_SP02
Internet: WLAN (only)
CWA-Version: 1.2.0

Hello,
same problem this morning - but only temporarily; again close to the 24-hour frame.
Today's screenshots 2a, 3a and 4a show the corresponding error (the screenshots are marked accordingly with "Xa"; "X" corresponds to the images from above).
Screenshot 7a - taken only 4 minutes after 4a - shows: everything is ok.

Screenshot 2a ("Error-Message"):
2a  Screenshot_20200813-062740

Screenshot 3a ("Error-Message-Details"):
3a  Screenshot_20200813-062754

Screenshot 4a ("Display-after-Error (normal)"):
4a  Screenshot_20200813-062859

Screenshot 7a ("Your-Risk-Status (Updated)"):
7a  Screenshot_20200813-063746

@vaubaehn
Contributor

vaubaehn commented Aug 13, 2020

hi @Jo-Achim , thanks for all information and for reproducing the issue!

As stated

the shield in the notification bar comes from: "NetGuard Pro", v2.285 (https://netguard.me/).
And I have installed "Norton Mobile Security", v4.8.0.4542.

I think we found one suspect: NetGuard Pro.

Corona-Warn-Configuration under NetGuard: Allow WLAN when the screen is switched on; not cellular.

This might explain why the error occurs: while the screen is switched off, the network is disabled.
When you turn on your screen and immediately open CWA, the network is only just being re-enabled. However, CWA will most likely try to download the diagnosis keys from the Telekom server right away, and there is a chance that the network connection is not yet established at that point. Hence, the web request may time out, and the 9002 'time out' exception message may show up.
When your screen is already on and a stable network connection is established before you open CWA, you don't run into that exception (everything works fine).
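
The race described above can be illustrated with a minimal sketch. This is not CWA's code: `FakeNetwork` and `download_with_retry` are hypothetical names, and retrying with backoff is just one common way to tolerate a connection that is still being established.

```python
import time


class FakeNetwork:
    """Hypothetical stand-in for a connection that becomes usable only after a few attempts."""

    def __init__(self, ready_after_attempts):
        self.ready_after_attempts = ready_after_attempts
        self.attempts = 0

    def download_keys(self):
        self.attempts += 1
        if self.attempts <= self.ready_after_attempts:
            # Models the request timing out while the network is still coming up.
            raise TimeoutError("network not yet established")
        return "diagnosis-keys"


def download_with_retry(network, max_attempts=4, base_delay_s=0.01):
    """Retry a timed-out download with exponential backoff; re-raise after the last attempt."""
    for attempt in range(1, max_attempts + 1):
        try:
            return network.download_keys()
        except TimeoutError:
            if attempt == max_attempts:
                raise
            time.sleep(base_delay_s * 2 ** (attempt - 1))


# The first two attempts time out, the third succeeds.
result = download_with_retry(FakeNetwork(ready_after_attempts=2))
print(result)
```

A single attempt made immediately after the screen turns on would fail in this model, while a few spaced-out attempts succeed once the connection is up.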

Both "NetGuard Pro" and "Norton Mobile Security" were installed long before the first installation of CWA (June 16th), and no (configuration) changes were made here (between the last error-free use of CWA and the problem described)!

This doesn't change the explanation above.

Battery usage / optimization: Corona-Warn-App: Not optimized.
Prioritized background activity: on.

Good.

What I saw today in the screenshot of your notification bar: the 'key symbol' is active. Are you using a VPN? With a low probability, this may also be a source of error, or it may add to the severity of the error seen with NetGuard.

To help ultimately find the source of the error, and to give users with similar issues a workaround in the future, would you be willing to change the setting 'Allow WLAN when the screen is switched on; not cellular.' for CWA in NetGuard, so that network access is always allowed for CWA for testing purposes?
If you can't reproduce the issue with that changed setting, we can pin down the error and spread the news.
Thank you :)

@Gladdi

Gladdi commented Aug 13, 2020

I tried it with Kaspersky disabled, and after that I uninstalled Kaspersky. Nothing helped.

With WLAN and with 4G: same issue.

Screenshot_20200813-151130
Screenshot_20200813-151104
Screenshot_20200813-151056

@vaubaehn
Contributor

vaubaehn commented Aug 13, 2020

@Gladdi The stack traces look a little different from @Jo-Achim's. To me, yours look more like your connection is blocked.

Did you restart your phone after uninstalling Kaspersky? Maybe a firewall is still active...

@Gladdi

Gladdi commented Aug 13, 2020

I found the problem. All sliders in the app settings were deactivated.

Screenshot_20200813-153127

@vaubaehn
Contributor

vaubaehn commented Aug 13, 2020

@Gladdi Do you think Kaspersky set the sliders to deactivated?

@Jo-Achim
Author

Jo-Achim commented Aug 13, 2020

Hi @vaubaehn,

This might explain why the error occurs: while the screen is switched off, the network is disabled.
When you turn on your screen and immediately open CWA, the network is only just being re-enabled. However, CWA will most likely try to download the diagnosis keys from the Telekom server right away, and there is a chance that the network connection is not yet established at that point. Hence, the web request may time out, and the 9002 'time out' exception message may show up.
When your screen is already on and a stable network connection is established before you open CWA, you don't run into that exception (everything works fine).

Ok, I understand; see NetGuard "Allow in restricted access mode" below.

What I saw today in the screenshot of your notification bar: the 'key symbol' is active. Are you using a VPN? With a low probability, this may also be a source of error, or it may add to the severity of the error seen with NetGuard.

NetGuard uses a VPN to provide its functionality. Take a look at https://www.kuketz-blog.de/netguard-firewall-android-unter-kontrolle-teil4/, section "2. NetGuard", keyword: "VPN-Schnittstelle" ("VPN interface").

To help ultimately find the source of the error, and to give users with similar issues a workaround in the future, would you be willing to change the setting 'Allow WLAN when the screen is switched on; not cellular.' for CWA in NetGuard, so that network access is always allowed for CWA for testing purposes?
If you can't reproduce the issue with that changed setting, we can pin down the error and spread the news.

Yes, sorry, my mistake. The NetGuard option "Allow WLAN when the screen is switched on" ("WLAN erlauben, wenn Bildschirm eingeschaltet ist") is grayed out here! And I assumed it would be unrestricted.
But now I have explicitly activated the "Allow in restricted access mode" ("Im Zugriffsbeschränkungsmodus erlauben") option. I think that could be the right configuration - and should allow CWA to work with the network in the background as well (although I suspected the cause of the timeouts was more on the server side).
If necessary, I will deactivate NetGuard as a test and then see further.

However... I had been using NetGuard (including Norton) for longer than CWA has been installed, and the 9002 timeout error had never occurred before ;-). So I'm curious.

Thank you for the information. I will report back here as soon as I know more.

Best regards.

@Jo-Achim
Author

Jo-Achim commented Aug 15, 2020

CWA: 1.2.1
NetGuard: 2.285

Even if it is far too early for a final assessment, there is a strong indication that @vaubaehn was probably correct in his assumption that "Ursache 9002 ... Timeout" had something to do with 'my NetGuard configuration'.

I have activated the already mentioned function "Im Zugriffsbeschränkungsmodus erlauben" ("Allow in restricted access mode") in NetGuard.

The result is that CWA now apparently starts faster and, more importantly, completes the "Prüfung läuft ..." ("Check in progress ...") step much faster.

My CWA configuration in NetGuard now looks like this:
Screenshot_20200815-102251

NetGuard Pro, v.2.285 (https://netguard.me/). Basic configuration: forbid everything and then specific activation of functionalities for each app individually.

Thanks and best regards.

@vaubaehn
Contributor

vaubaehn commented Aug 15, 2020

Hi @Jo-Achim, thanks for your helpful feedback. Happy to hear it has worked for you so far.
Do you think the middle of the coming week would be a good point for a final assessment? It would be nice to get your final statement then.

Hello @JoachimFritsch: please leave the issue open until then. Thank you.

Note to myself: Race condition. Anti virus -> FAQ

@Jo-Achim
Author

Jo-Achim commented Aug 15, 2020

Ok, I think more towards the end of next week (week 34).

@Jo-Achim
Author

Jo-Achim commented Aug 20, 2020

CWA: 1.2.1
NetGuard Pro: 2.285 / 2.286 / 2.287.

Hello @vaubaehn, hello everybody,

as reported on August 15, 2020, the error "CAUSE: 9002, Etwas ist schiefgelaufen. Timeout" / "CAUSE: 9002, Something went wrong. Timeout" has not occurred again to date with the NetGuard setting "Im Zugriffsbeschränkungsmodus erlauben" / "Allow in restricted access mode" activated.

Since, as also mentioned there, both the CWA start and "Prüfung läuft…" / "Check in progress..." run significantly faster, it can be assumed that the cause of 'my error' is now resolved.

Thanks and best regards.

@vaubaehn
Contributor

vaubaehn commented Aug 20, 2020

Hi @Jo-Achim , that's great news! Thank you very much!
Kind regards, V.

@vaubaehn
Contributor

vaubaehn commented Aug 20, 2020

Hi @GPclips and @JoachimFritsch ,

we have now located two different causes of '9002: timeout' and how they can be resolved.
Do you think it would be helpful for similarly affected users if an FAQ entry were created for that error, with a more general walkthrough to resolve it?
If yes, would you like some support with that? I would volunteer to draft the texts as a starting point, which those responsible can then adjust as they see fit. If you want any assistance, just leave a note.

One general question: Are there already any concrete plans by product management / the developer team to improve error handling and UX? Or to make CWA more resilient against race conditions (in this example, CWA trying to fetch data from the server while the network connection is still being established)?

Kind regards

@ghost

ghost commented Aug 21, 2020

Hello @vaubaehn ,

thank you for driving this thread!
I talked to our development team regarding the error description as well as the provided solutions:

  1. [INFO] URSACHE: 9002, Etwas ist schiefgelaufen. Timeout #998 (comment)
  2. [INFO] URSACHE: 9002, Etwas ist schiefgelaufen. Timeout #998 (comment)

Like you, I see the need for an FAQ entry for this issue. I will discuss this in today's meeting and suggest that you write the article. We need to make sure we mention that 9002 is the error code for an unknown error, so readers really have to rely on the error description, in our case the timeout (probably also including the exception, at least in part).
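
To illustrate why the description matters more than the code, here is a minimal sketch. The function `describe_error` and its return values are hypothetical and not part of CWA; only the code 9002 and the message texts come from this thread.

```python
def describe_error(code, description):
    """Return a rough category; the category names are illustrative only."""
    text = description.lower()
    if code != 9002:
        return "not covered by this sketch"
    # 9002 alone is generic, so the description decides the category.
    if "timeout" in text or "timed out" in text:
        return "timeout"
    return "unknown"


print(describe_error(9002, "Etwas ist schiefgelaufen. Timeout"))  # timeout
print(describe_error(9002, "Timed out waiting for 180000 ms"))    # timeout
```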

I will come back to you and let you know when we will be ready to add the FAQ entry to our FAQ-page.

Thank you for your support @vaubaehn, @Gladdi and @Jo-Achim .

Best regards,
LMM

Corona-Warn-App Open Source Team

@ghost

ghost commented Aug 21, 2020

Hello @vaubaehn,

I just checked with the team and am happy to ask: could you please create a PR in the website repository? Please add your FAQ entry to https://github.com/corona-warn-app/cwa-website/blob/master/src/data/faq.json (I think the best section would be "id": "notification"). Afterwards, as with every PR, it will be reviewed and we will translate it into German so that the FAQ versions stay in sync.

Thanks,
LMM

Corona-Warn-App Open Source Team

@vaubaehn
Contributor

vaubaehn commented Aug 21, 2020

Hi @GPclips, thank you very much for clarifying with the team and responding to this issue.
I'm happy to contribute a small piece.

[...] could you please create a PR in the website repository? Please add your FAQ entry to https://github.com/corona-warn-app/cwa-website/blob/master/src/data/faq.json (I think the best section would be "id": "notification").

Will do. I should be finished by Monday night. If you don't hear anything from me by Tuesday, then 'something went wrong', and please feel free to ping me.

Afterwards, as with every PR, it will be reviewed and we will translate it into German so that the FAQ versions stay in sync.

Please feel free to change anything according to your needs. I will note that also in the related PR.

We need to make sure we mention that 9002 is the error code for an unknown error, so readers really have to rely on the error description, in our case the timeout (probably also including the exception, at least in part).

I'll write it as brief and as precise as possible.

Thank you!

@vaubaehn
Contributor

vaubaehn commented Aug 23, 2020

Hi @GPclips , @Jo-Achim and @Gladdi , a PR to add an entry to the FAQ is on the way. Thanks again!

@Marco2907
Member

Marco2907 commented Aug 26, 2020

Hello @vaubaehn, thanks for providing your PR. We only had to make some minor corrections, as mentioned in the PR's conversation history. Thank you for your awesome contribution. I will keep this thread open for our further conversations.

Thank you,
MP

Corona-Warn-App Open Source Team

@d4rken
Member

d4rken commented Aug 27, 2020

FYI
We'll increase the related timeouts in a future update. This should fix the error for devices that run into it because they simply take too long due to a slow CPU or network.
I think that's the main reason for this issue; whether there are other underlying issues will become clearer after that.

@vaubaehn
Contributor

vaubaehn commented Aug 27, 2020

Hi @d4rken, thank you very much for your update on this timeout issue. Increasing the value sounds good, and let's hope no other underlying issues come up.
In general, are there already plans to change UX/UI related to handling of (already well-examined) errors/exceptions, i. e. providing in-app information on errors, contexts and solutions?

@daimpi

daimpi commented Aug 27, 2020

@vaubaehn

In general, are there already plans to change UX/UI related to handling of (already well-examined) errors/exceptions, i. e. providing in-app information on errors, contexts and solutions?

@Ein-Tim has recently opened an issue in the wishlist repo for this: corona-warn-app/cwa-wishlist#165 🙂

@d4rken
Member

d4rken commented Aug 27, 2020

In general, are there already plans to change UX/UI related to handling of (already well-examined) errors/exceptions, i. e. providing in-app information on errors, contexts and solutions?

Nothing official yet, but I'm pushing discussion on something like that.
Lots of things to consider, let's see where it goes. We definitely want to make the collaboration easier, including nicer UI/UX for errors.

Checking whether something is a known issue (rel corona-warn-app/cwa-wishlist#165) is a step further, but I think getting rid of the need for users to have to screenshot popups with stacktraces would already be a huge gain 😅 .

@vaubaehn
Contributor

vaubaehn commented Aug 27, 2020

@d4rken

In general, are there already plans to change UX/UI related to handling of (already well-examined) errors/exceptions, i. e. providing in-app information on errors, contexts and solutions?

Nothing official yet, but I'm pushing discussion on something like that.
Lots of things to consider, let's see where it goes. We definitely want to make the collaboration easier, including nicer UI/UX for errors.

Yes, cool, thanks for having an eye on it!

Checking whether something is a known issue (rel corona-warn-app/cwa-wishlist#165) is a step further, but I think getting rid of the need for users to have to screenshot popups with stacktraces would already be a huge gain 😅 .

Definitely 😅

@Ein-Tim and @daimpi : Yes, my idea was going into a quite similar direction, I drew a primitive sketch about it -> here, but didn't take time yet for polishing and putting it into the wishlist repo.
In short, I describe a Concept On Viewing important Reliability Information of Device = COVirD (to be pronounced: co-weird) ;)
Sounds like a big thing to implement, but if some very small basics are set up first, it can be extended release by release (quite modular). So, if an error / exception / incompatibility were detected, a card would show up and link to in-app information or an external FAQ. The 'root' card on the main screen could provide general information on whether 'everything works fine' with CWA, and in case of problems it could be the starting point for finding information and solutions.

@Zaelnorth

Zaelnorth commented Sep 12, 2020

I keep having this issue on mobile data; the moment I switch to Wi-Fi, there is absolutely no problem. I have no NetGuard/Kaspersky or similar tool installed.

@BlueFire-

BlueFire- commented Sep 13, 2020

I had the same issue as Zaelnorth mentioned in #1122. It happened only on mobile network. I'm using a Vodafone SIM. After enabling a VPN or using WiFi it worked without issues.

@ghost

ghost commented Sep 29, 2020

Hello community,

this thread has been closed by mistake, therefore I will go on and reopen it.
We will close the issue when we have confirmation of the fix for the next version.
If you have additional questions, please feel free to reach out.

Thanks,
LMM

Corona-Warn-App Open Source Team

@maethes

maethes commented Oct 24, 2020

Hi, I also get a 9002 timeout exception when updating the data. I'm not sure whether this is the same exception mentioned above in this ticket, because my phone shows a different stack trace. Please let me know if I should create a separate issue.

error message: URSACHE 9002 Etwas ist schief gelaufen. (CAUSE 9002 Something went wrong.)
Timed out waiting for 180000 ms

image

Device: Huawei Honor 6
Android 6
Corona-Warn Version 1.5.0

The error is reproducible on my device when starting the app. After some hours of trying to update the data, I receive a different error message: 39508

image

After waiting some hours until the next morning, I get the 9002 timeout exception again.

I am not able to use the CWA because it has been showing unknown risk ("Unbekanntes Risiko") for several months now. It actually worked for a few days in May or June (not sure which version it was).

Prior to the error messages mentioned in this ticket I was facing the issues mentioned in #933 and #1053.
After changing several settings I am now facing the timeout issue mentioned here.

This is what I tried so far:

  • Activated mobile data, WLAN and background update for CWA app
  • allowed background activity for the app
  • protected the app in power saving settings
  • activated prioritized background activity
  • changed power savings of my phone to full power
  • freed internal disk space
  • disabled Huawei Mobile Services in "apps having access to usage data" ("Apps mit Nutzungsdatenzugriff")
  • uninstalled system apps using following commands:
    adb shell pm uninstall -k --user 0 com.huawei.powergenie
    adb shell pm uninstall -k --user 0 com.huawei.android.hwaps

COVID-19 exposure notifications are activated in the Google settings, showing 290 checks in the past 14 days.

Please let me know if there is anything I can do to help solve this issue.

Thanks, Matthias

@daimpi

daimpi commented Oct 24, 2020

@maethes the issue you're experiencing is tracked over here: #1187 (originally the timeout limit was 60s, with CWA 1.5 it's 180s that's why you're seeing the message "timed out after 180000 ms" even though the issue says 60000 ms).
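
For readers comparing the two messages, here is a minimal sketch that converts the millisecond value in such a message to seconds. The helper `timeout_seconds` is hypothetical; only the message text "Timed out waiting for 180000 ms" is taken from the report above.

```python
import re


def timeout_seconds(message):
    """Extract the limit from a message like 'Timed out waiting for 180000 ms' as seconds."""
    match = re.search(r"(\d+)\s*ms", message)
    if match is None:
        return None
    return int(match.group(1)) / 1000


print(timeout_seconds("Timed out waiting for 180000 ms"))  # 180.0
print(timeout_seconds("Timed out waiting for 60000 ms"))   # 60.0
```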

The 39508 message you're experiencing after a while is now tracked over here: #1459

@heinezen
Member

heinezen commented May 9, 2021

Hello everyone,

The specific error reported here was fixed by increasing the timeout limits in CWA release 1.4. Other 9002 errors should get their own GitHub issues. We will close this ticket.


Corona-Warn-App Open Source Team
