Client crash (every 3-4 games) #1080

Closed
tomaskir opened this issue Dec 29, 2018 · 53 comments · Fixed by #1133
Labels
bug S1 critical severity 1 - critical - crashes, loss of data, severe memory leak

@tomaskir

Since 0.9.x, the client crashes on me quite often.
The game stays up, but because the client crashes, I lose the connection to all players.

Here is the Java crash log:
https://gist.github.com/tomaskir/c88e45eea3acdb3a17d80ae10084bbcd

micheljung added the labels bug and S1 critical (severity 1 - critical - crashes, loss of data, severe memory leak) on Dec 29, 2018
@germanicianus
Contributor

germanicianus commented Dec 29, 2018

Until this issue is resolved, you could go back to 0.8.x if the client crashes too often. In that case you may have to change the "language" setting (e.g. AUTO) in your client preferences file C:\Users\<username>\AppData\Roaming\Forged Alliance Forever\client.prefs to either lower or upper case, or just delete the file.

@germanicianus
Contributor

The following OpenJDK bugs seem to be similar to this issue, but have been closed as Incomplete:

@tomaskir
Author

tomaskir commented Dec 31, 2018

Is there anything I can provide / do so this can be properly tracked to the cause and fixed?
Or is this a JRE / JDK issue?

Just had another crash today.

@germanicianus
Contributor

germanicianus commented Dec 31, 2018

Yes, you can provide all hs_err_pid<number>.log files for the crashes you had. They should still be somewhere on your hard disk, but currently I cannot tell you where exactly. This would help us find out whether the crash is always caused by the same code.
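
For anyone who needs to track these files down: a minimal sketch, assuming the logs kept HotSpot's default hs_err_pid<number>.log name. By default the JVM writes them to the working directory of the process and falls back to the temp directory, so those are reasonable roots to try. The class name CrashLogFinder and the default root are made up for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of the client.
final class CrashLogFinder {
    // Default HotSpot fatal error log name: hs_err_pid<number>.log
    private static final Pattern CRASH_LOG = Pattern.compile("hs_err_pid\\d+\\.log");

    static boolean isCrashLog(String fileName) {
        return CRASH_LOG.matcher(fileName).matches();
    }

    /** Walks the tree under root and returns every crash log found, sorted by path. */
    static List<Path> find(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                .filter(Files::isRegularFile)
                .filter(p -> isCrashLog(p.getFileName().toString()))
                .sorted()
                .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: search the current directory unless a root is passed in.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        find(root).forEach(System.out::println);
    }
}
```

Walking a whole drive this way can fail on directories the user may not read, so pointing it at the client directory or the temp directory first is probably the more practical choice.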

As I said it seems to be some Java runtime issue related to JavaScript calls. Extract from your log:

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
J 10629  com.sun.webkit.dom.JSObject.callImpl(JILjava/lang/String;[Ljava/lang/Object;Ljava/security/AccessControlContext;)Ljava/lang/Object; javafx.web@10.0.2 (0 bytes) @ 0x00000000123f2052 [0x00000000123f1fc0+0x0000000000000092]
J 19038 c2 com.sun.webkit.dom.JSObject.call(Ljava/lang/String;[Ljava/lang/Object;)Ljava/lang/Object; javafx.web@10.0.2 (23 bytes) @ 0x000000001307dc54 [0x000000001307d5e0+0x0000000000000674]
J 10943 c1 com.faforever.client.chat.ChannelTabController.lambda$removeUserMessageClass$13(Lcom/faforever/client/chat/ChatChannelUser;Ljava/lang/String;)V (40 bytes) @ 0x000000000c0301dc [0x000000000c02f1e0+0x0000000000000ffc]

@tomaskir
Author

tomaskir commented Dec 31, 2018

@1-alex98
Member

1-alex98 commented Jan 2, 2019

I wonder if that happens more often

@1-alex98
Member

1-alex98 commented Jan 2, 2019

Hmm, this is a really bad bug for us because we most probably won't be able to do much about it.

@tomaskir
Author

tomaskir commented Jan 2, 2019

Well, the fact that the lobby crashes would not be a problem if it didn't bring down the ICE adapter with it.

I suppose for players with public IPs and port forwards this is not critical, since it doesn't influence their game. However for anyone running NAT-T over ICE, client crash brings down ICE, and then the game is dead. (at least according to my current understanding of the FAF architecture)

Is there any chance to decouple ICE adapter runtime from the lobby?
(so in the event of a lobby crash, at least I can finish the game?)

@Geosearchef
Member

Without the ICE adapter, a client crash means you're disconnected, as all traffic is relayed via the client. With the adapter we could let it stay open, but as there is no client, ICE messages won't be forwarded anymore and you couldn't reconnect. Reporting scores and maybe disconnecting from players may stop working as well.

@germanicianus
Contributor

germanicianus commented Jan 2, 2019

I just had a look at the crash logs. All of the crashes are caused by JavaScript calls (in JSObject) which are needed for the chat functionality. If I am correct and on-demand tab unloading/loading is already in the currently released version, I assume you can prevent these calls by just switching to the "Play" tab and staying on it while you are in a game. To verify that, I would need to have a close look at the related code.

@micheljung
Member

micheljung commented Jan 3, 2019

Chat tabs currently don't unload, and the crash in the reported case happened when calling removeUserMessageClass, which is called as users join/leave.

More logs would be appreciated, in order to see whether it's always the same method that fails.
But of course such crashes are unacceptable :-(
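
To check whether it is always the same method across several logs, something like the following could pull the first application frame out of the "Java frames" section. This is only a sketch: the regular expression is inferred from the frame lines quoted in this thread, the class name CrashFrameExtractor is made up, and the sample lines are abbreviated copies of the extract above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the client.
final class CrashFrameExtractor {
    // Matches compiled ("J") and interpreted ("j") frame lines and captures the
    // fully qualified method name in front of the opening parenthesis.
    private static final Pattern FRAME =
        Pattern.compile("^[Jj]\\s+(?:\\d+\\s+)?(?:c[12]\\s+)?([\\w.$]+)\\(");

    /** Returns the first frame whose method name starts with the given package prefix. */
    static Optional<String> firstFrameIn(List<String> logLines, String packagePrefix) {
        for (String line : logLines) {
            Matcher m = FRAME.matcher(line);
            if (m.find() && m.group(1).startsWith(packagePrefix)) {
                return Optional.of(m.group(1));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Abbreviated frame lines; signatures and addresses are cut short.
        List<String> frames = Arrays.asList(
            "J 10629  com.sun.webkit.dom.JSObject.callImpl(...)",
            "J 19038 c2 com.sun.webkit.dom.JSObject.call(...)",
            "J 10943 c1 com.faforever.client.chat.ChannelTabController.lambda$removeUserMessageClass$13(...)");
        System.out.println(firstFrameIn(frames, "com.faforever.").orElse("no application frame"));
    }
}
```

Running it over every collected log and comparing the printed names would show quickly whether removeUserMessageClass is the only entry point involved.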

@tomaskir
Author

tomaskir commented Jan 3, 2019

There is a link for 5 log files in my 2nd comment on this issue.

Is that sufficient or should I provide more?

@micheljung
Member

micheljung commented Jan 3, 2019

Sorry I just saw it :-) thanks a lot

As a first step, I'd like to avoid the nested calls to Platform.runLater() by replacing such calls with JavaFxUtil.runLater, if anyone has time to do that?
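
The idea behind avoiding nested runLater calls can be sketched without JavaFX. The assumption here (not confirmed in this thread) is that JavaFxUtil.runLater runs the task directly when the caller is already on the UI thread and only queues it otherwise. UiDispatcher below is a made-up stand-in that uses a single-thread executor in place of the JavaFX Application Thread.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a UI toolkit's event thread; not the client's code.
final class UiDispatcher {
    private final ExecutorService executor;
    private volatile Thread uiThread;

    UiDispatcher() {
        executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "ui-thread");
            uiThread = t; // remember the thread so callers can be compared against it
            return t;
        });
    }

    boolean isUiThread() {
        return Thread.currentThread() == uiThread;
    }

    /** Runs the task inline when already on the UI thread; otherwise queues it. */
    void runLater(Runnable task) {
        if (isUiThread()) {
            task.run();
        } else {
            executor.execute(task);
        }
    }

    void shutdownAndWait() throws InterruptedException {
        executor.shutdown();
        executor.awaitTermination(5, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        UiDispatcher ui = new UiDispatcher();
        List<String> order = Collections.synchronizedList(new ArrayList<>());
        ui.runLater(() -> {
            order.add("outer start");
            ui.runLater(() -> order.add("inner")); // runs inline, is not re-queued
            order.add("outer end");
        });
        ui.shutdownAndWait();
        System.out.println(order); // [outer start, inner, outer end]
    }
}
```

With plain queueing the inner task would run only after the outer one had returned, which is the kind of deferred, nested execution the suggestion above tries to remove.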

@germanicianus
Contributor

Please create a batch file with the following content in the directory where you installed the client (downlords-faf-client.exe is located there) and always use it for launching the client. If the JVM then crashes, a core dump will be created.

set faf_program_dir=%~dp0\lib
set java_binary=%~dp0\jre\bin\java
set java_class_path=downlords-faf-client-0.9.3-beta.jar;commons-compress-1.9.jar;.\*;.\
cd "%faf_program_dir%"
%java_binary% -cp %java_class_path% -DnativeDir=%faf_program_dir% -Dprism.dirtyopts=false -Xms128m -Xmx512m -XX:+CreateCoredumpOnCrash -XX:MinHeapFreeRatio=15 -XX:MaxHeapFreeRatio=33 -XX:+HeapDumpOnOutOfMemoryError -XX:+UseStringDeduplication -javaagent:%faf_program_dir%\webview-patch.jar -Dsun.java2d.opengl=true com.faforever.client.FafClientApplication
pause

@tomaskir
Author

tomaskir commented Jan 4, 2019

Script would not launch as provided.
Had to adjust line 5 for %java_binary% to be enclosed in " (since the path contains spaces).

After that fix, I now get:

Error: Could not find or load main class FAF
Caused by: java.lang.ClassNotFoundException: FAF

@germanicianus
Contributor

germanicianus commented Jan 4, 2019

I adapted the batch script from my bash script and had to remove all the double quotes which I had added to prevent issues with spaces. I had to remove them because Windows 7 couldn't cope with them, and I usually have no spaces in my paths. The error you now get is exactly because of spaces in your path(s) and double quotes in the wrong place. The easiest (temporary) solution for you is to copy the complete client directory to a path with no spaces and launch it there. Later you can try to fix the placement of the quotes.

I guess the problem is the path C:\Games\Downlord's FAF Client\. Depending on quotes placing the command is split. With your current quotes placing, Java thinks FAF is the main class.
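
A rough illustration of why FAF ends up as the main class: if the argument string is split on whitespace, the unquoted -DnativeDir value is cut at the first space and the next token is the first one that does not look like an option. The sketch below is a deliberately simplified stand-in for what the shell and the java launcher do, and the class name MainClassGuess is made up.

```java
import java.util.Arrays;
import java.util.List;

// Simplified illustration only; real command-line parsing has more rules.
final class MainClassGuess {
    /** Returns the first token that is neither an option nor the value of -cp/-classpath. */
    static String guessMainClass(String argumentString) {
        List<String> tokens = Arrays.asList(argumentString.trim().split("\\s+"));
        for (int i = 0; i < tokens.size(); i++) {
            String token = tokens.get(i);
            if (token.equals("-cp") || token.equals("-classpath")) {
                i++; // skip the class path value that follows
                continue;
            }
            if (!token.startsWith("-")) {
                return token;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String dir = "C:\\Games\\Downlord's FAF Client\\lib";
        String unquoted = "-cp .\\*;.\\ -DnativeDir=" + dir
            + " -Xmx512m com.faforever.client.FafClientApplication";
        System.out.println(guessMainClass(unquoted)); // FAF
    }
}
```

With a directory that has no spaces, the same function returns com.faforever.client.FafClientApplication, which matches the observation that the script works from a space-free path.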

@germanicianus
Contributor

germanicianus commented Jan 5, 2019

A better solution is to use relative paths - see below (as yet untested). I hope that works, otherwise I will fix it asap. I fixed the quotes. Use the following as the batch file content.

set faf_program_dir=%~dp0\lib
set java_binary=%~dp0\jre\bin\java
set java_class_path=downlords-faf-client-0.9.3-beta.jar;commons-compress-1.9.jar;.\*;.\
cd "%faf_program_dir%"
"%java_binary%" -cp %java_class_path% -DnativeDir="%faf_program_dir%" -Dprism.dirtyopts=false -Xms128m -Xmx512m -XX:+CreateMinidumpOnCrash -XX:MinHeapFreeRatio=15 -XX:MaxHeapFreeRatio=33 -XX:+HeapDumpOnOutOfMemoryError -XX:+UseStringDeduplication -javaagent:"%faf_program_dir%\webview-patch.jar" -Dsun.java2d.opengl=true com.faforever.client.FafClientApplication
pause

@germanicianus
Contributor

germanicianus commented Jan 5, 2019

I fixed the quotes of the batch content. See my changed comment above. Please give feedback when you successfully ran it.

@tomaskir
Author

tomaskir commented Jan 18, 2019

So I have been running the lobby using the script for the past 2 weeks (since you posted the original script).

I am still running the original script which worked only with a non-spaced path.
(so current path to the directory with the script and the client does not contain a space)

I have not had a single crash since.
Everything has been working fine for about 20 games over those past 2 weeks.

Is there anything different about the way the script launches the lobby as compared to just running the .exe?

EDIT: As an additional comment - a friend I play with (he is also on 0.9.3 Beta) has also experienced 2 crashes in this time-frame, so it seems this bug is not isolated to me only.

@1-alex98
Member

There should not be anything different, @germanicianus, right?

@germanicianus
Contributor

Of course there is a big difference.

This is the command which my current script generates on Linux (manually formatted):

<my-java-path>/jdk-10/bin/java

-cp
downlords-faf-client-0.9.3-beta.jar:
commons-compress-1.9.jar:
./*:
./

-DnativeDir=<dir-in-user-home>/downlords_faf_client/downlords-faf-client-0.9.3-beta/lib
-Dprism.dirtyopts=false
-Xms128m
-Xmx512m
-XX:MinHeapFreeRatio=15
-XX:MaxHeapFreeRatio=33
-XX:+HeapDumpOnOutOfMemoryError
-XX:+UseStringDeduplication
-javaagent:<dir-in-user-home>/downlords_faf_client/downlords-faf-client-0.9.3-beta/lib/webview-patch.jar
-Dsun.java2d.opengl=true
com.faforever.client.FafClientApplication

These are the JVM arguments from his crash log (jvm_args, manually formatted):

-Dexe4j.semaphoreName=Local\c:_games_downlord's_faf_client_downlords-faf-client.exe
-Dexe4j.isInstall4j=true
-Dexe4j.moduleName=C:\Games\Downlord's FAF Client\downlords-faf-client.exe
-Dexe4j.tempDir=
-Dexe4j.unextractedPosition=0

-Djava.library.path=
C:\Games\Downlord's FAF Client\.\lib;
C:\Program Files (x86)\Common Files\Oracle\Java\javapath;
C:\Program Files (x86)\Razer Chroma SDK\bin;
C:\Program Files\Razer Chroma SDK\bin;
C:\Windows\system32;
C:\Windows;
C:\Windows\System32\Wbem;
C:\Windows\System32\WindowsPowerShell\v1.0\;
C:\ProgramData\Oracle\Java\javapath;
C:\Software\Maven\apache-maven-3.3.9\bin;
C:\WINDOWS\system32;
C:\WINDOWS;
C:\WINDOWS\System32\Wbem;
C:\WINDOWS\System32\WindowsPowerShell\v1.0\;
C:\WINDOWS\SysWOW64\WindowsPowerShell\v1.0\Modules\TShell\TShell\;
C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common;
C:\Program Files (x86)\Skype\Phone\;
C:\Users\Tomas\AppData\Local\Microsoft\WindowsApps;
C:\WINDOWS\system32;
C:\WINDOWS;
C:\WINDOWS\System32\Wbem;
C:\WINDOWS\System32\WindowsPowerShell\v1.0\;
C:\Users\Tomas\AppData\Local\Atlassian\SourceTree\git_local\bin;
C:\Program Files (x86)\nodejs\;
C:\WINDOWS\System32\OpenSSH\;
C:\Users\Tomas\AppData\Local\Microsoft\WindowsApps;
C:\Users\Tomas\AppData\Local\atom\bin;
C:\Users\Tomas\AppData\Local\Microsoft\WindowsApps;
C:\Users\Tomas\AppData\Roaming\npm;
;
c:\games\downlord's faf client\jre\bin

-Dexe4j.consoleCodepage=cp0
-DnativeDir=lib
-Dprism.dirtyopts=false
-XX:+HeapDumpOnOutOfMemoryError
-javaagent:lib/webview-patch.jar
-Dinstall4j.launcherId=25
-Dinstall4j.swt=false
-Dsun.java2d.dpiaware=true
-Xmx512m

@tomaskir
Author

Any idea why this would fix the crashes?

@germanicianus
Contributor

Not exactly. My command is a minimal one and leaves out all unnecessary parameters which means it doesn’t use any path outside of the client directory. Furthermore it doesn’t use the included exe but directly the Java binary. Therefore it is likely that the original command causes loading of a component/library/dll which conflicts with another one.

Verifying that takes quite some effort, because it means comparing the components loaded by my command with the components listed as loaded in your crash log. Even that will most likely leave a lot of room for doubt, because there will probably be a lot of differences.
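
The comparison itself could be sketched as a set difference over library file names. Everything in the example is an assumption for illustration: the class name LibraryDiff, the idea of feeding it two plain lists of paths, and the sample paths, which are invented and not taken from any crash log.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Locale;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper, not part of the client.
final class LibraryDiff {
    /** Reduces a path to its lower-cased file name (Windows file names are case-insensitive). */
    static String fileName(String path) {
        int cut = Math.max(path.lastIndexOf('\\'), path.lastIndexOf('/'));
        return path.substring(cut + 1).toLowerCase(Locale.ROOT);
    }

    /** Returns the file names that occur in the first list but not in the second. */
    static SortedSet<String> onlyInFirst(Collection<String> first, Collection<String> second) {
        SortedSet<String> result = new TreeSet<>();
        for (String path : first) {
            result.add(fileName(path));
        }
        for (String path : second) {
            result.remove(fileName(path));
        }
        return result;
    }

    public static void main(String[] args) {
        // Invented sample data: libraries seen with the exe launcher vs. the script.
        List<String> exeLaunch = Arrays.asList(
            "C:\\client\\jre\\bin\\jvm.dll", "C:\\vendor\\overlay.dll");
        List<String> scriptLaunch = Arrays.asList("C:\\client\\jre\\bin\\JVM.DLL");
        System.out.println(onlyInFirst(exeLaunch, scriptLaunch)); // [overlay.dll]
    }
}
```

Comparing by file name only is a design shortcut: it hides the case where both launches load a library of the same name from different directories, so a second pass over full paths would still be needed for those.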

For the time being, just continue using the script. For new client versions, just adapt the version number string inside it.

@micheljung
Member

micheljung commented Jan 23, 2019

@tomaskir which java version are you using? Would be interesting to know if you have the same issue if you use the one shipped in c:\games\downlord's faf client\jre\bin

@tomaskir
Author

tomaskir commented Jan 23, 2019

C:\Users\Tomas> java -version
java version "1.8.0_181"
Java(TM) SE Runtime Environment (build 1.8.0_181-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)

It seems the client ships with Java 10 - so it is possible this is only an issue in Java 10.

Any chance for a modification to the script to run with the packaged JRE?
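
Note that java -version at a prompt reports whichever java is found first on the PATH, which need not be the runtime the client actually runs on. A minimal sketch for checking this from inside a JVM, using only standard system properties (the class name RuntimeInfo is made up):

```java
// Hypothetical helper, not part of the client.
final class RuntimeInfo {
    /** Describes the JVM this code is actually running on. */
    static String describe() {
        return "java.version=" + System.getProperty("java.version")
            + ", java.home=" + System.getProperty("java.home");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Compiled once and then started with the java binary in question, it prints that runtime's version and installation directory rather than the system default.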

@germanicianus
Contributor

germanicianus commented Jan 23, 2019

My script already uses the JRE packaged with the client:

Furthermore it doesn’t use the included exe but directly the [included] Java binary.

%~dp0 is a variable holding the path where the batch file is located. All paths are built using that base path. That is why I wrote:

Please create a batch file [...] in the directory where you installed the client (downlords-faf-client.exe is located there)

@1-alex98
Member

What if we use engine.executeScript and call the function inside the script as a workaround?
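
One way this could look: instead of calling the function through JSObject.call, build the call as a string and hand it to WebEngine.executeScript(String). The sketch below covers only the string-building part, so it runs without JavaFX. The class name JsCallBuilder and the JavaScript function name used in the example are assumptions, not the client's actual code.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the client.
final class JsCallBuilder {
    private static final Pattern IDENTIFIER = Pattern.compile("[A-Za-z_$][A-Za-z0-9_$]*");

    /** Wraps a value in double quotes and escapes it as a JavaScript string literal. */
    static String quote(String value) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : value.toCharArray()) {
            switch (c) {
                case '\\': sb.append("\\\\"); break;
                case '"':  sb.append("\\\""); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                default:
                    if (c < 0x20 || c == 0x2028 || c == 0x2029) {
                        sb.append(String.format("\\u%04x", (int) c)); // control and line separators
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.append('"').toString();
    }

    /** Builds a statement such as name("a", "b"); for use with executeScript. */
    static String call(String functionName, String... args) {
        if (!IDENTIFIER.matcher(functionName).matches()) {
            throw new IllegalArgumentException("not a plain identifier: " + functionName);
        }
        StringBuilder sb = new StringBuilder(functionName).append('(');
        for (int i = 0; i < args.length; i++) {
            if (i > 0) sb.append(", ");
            sb.append(quote(args[i]));
        }
        return sb.append(");").toString();
    }

    public static void main(String[] args) {
        // Assumed JavaScript function name, chosen to mirror the Java method in the crash log.
        System.out.println(call("removeUserMessageClass", "user-1", "moderator"));
    }
}
```

Escaping matters here because user names and other chat data would end up inside script source; the resulting string would then be passed to engine.executeScript on the JavaFX Application Thread.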

@1-alex98
Member

@tomaskir I can build you a version to try out if you want.

@tomaskir
Author

Sure, I can test.

@1-alex98
Member

The file was too big for GitHub, so I uploaded it to Google Drive:
@tomaskir https://drive.google.com/file/d/1sk8yYXY-azKpQf6swI5OOhUYL313qZT2/view?usp=sharing

@1-alex98
Member

1-alex98 commented Mar 22, 2019

@tomaskir But it happened pretty much all the time, right? And with different versions, so if it works now, this most likely fixes it, right?

@tomaskir
Author

It happened about once every 5-8 games since 0.9.0.
So quite often.

I will let you know how it goes after playing with the build you provided.

@tomaskir
Author

tomaskir commented Mar 22, 2019

The build you provided sadly doesn't work.

I can't host a game:

You are already in a game or haven't run the connectivity test yet

Also a bunch of exceptions when running the debug mode script:
https://gist.github.com/tomaskir/49c3c844b967f35de8281de82fb395da

@1-alex98
Member

1-alex98 commented Mar 23, 2019

Oh wait :D I think I built an ICE version ... oops.
ICE is the new protocol and is already in the main branch, however it is not yet deployed at FAF.
So I need to rebase an extra branch on top for it to work.

Gonna do this now then.

@tomaskir sorry for the inconvenience

@1-alex98
Member

There are some other exceptions in there related to a change I made lately. Thanks very much; it seems that change was not done correctly. I opened a separate issue for that.

1-alex98 pushed a commit that referenced this issue Mar 24, 2019
@1-alex98
Member

@1-alex98
Member

@tomaskir tell me if you have tried it; we would be really keen on fixing this problem :D There are also others who face this problem, however they are mostly not reachable after they report it ;)

@tomaskir
Author

Testing the build today/tomorrow.

Need multiple games to know if this is fixed or not.

@1-alex98
Member

Actually you probably don't need to play games; having it open for a long time will probably also do. But yeah, play a few games as well. It might be connected, though I don't think so.

@tomaskir
Author

tomaskir commented Mar 31, 2019

I have been running the build for 3 days straight.
No crashes.

However, due to work and no free time, I have not played a game until today.
Still can't host a game with the build you provided.

Telling me I have not run a connectivity test or am not connected.

However due to no crash in 3 days, I think this might be a valid fix for this issue.

@micheljung
Member

This may have been fixed by #1144

You can test the fix at https://drive.google.com/file/d/1-QKQK85UvaaQewHJrrH01BrZAe_dGPa0/view?usp=drivesdk

@tomaskir
Author

Sad to report that running 0.10.4, the crashes are still present...
Installer was used to install.

The ICE client also crashed when the client crashed, so I was yet again disconnected from the game :(

---------------  T H R E A D  ---------------

Current thread (0x000000003635a800):  JavaThread "JavaFX Application Thread" [_thread_in_native, id=39212, stack(0x00000000375e0000,0x00000000376e0000)]

Stack: [0x00000000375e0000,0x00000000376e0000],  sp=0x00000000376dc860,  free space=1010k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  0x00007ffa5723ed1a

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
J 10202  com.sun.webkit.dom.JSObject.callImpl(JILjava/lang/String;[Ljava/lang/Object;Ljava/security/AccessControlContext;)Ljava/lang/Object; javafx.web@10.0.2 (0 bytes) @ 0x0000000012bf2b52 [0x0000000012bf2ac0+0x0000000000000092]
J 10201 c1 com.sun.webkit.dom.JSObject.call(Ljava/lang/String;[Ljava/lang/Object;)Ljava/lang/Object; javafx.web@10.0.2 (23 bytes) @ 0x000000000c5e0b6c [0x000000000c5e0720+0x000000000000044c]
J 10394 c1 com.faforever.client.chat.ChannelTabController.lambda$removeUserMessageClass$15(Lcom/faforever/client/chat/ChatChannelUser;Ljava/lang/String;)V (40 bytes) @ 0x000000000c64589c [0x000000000c6452c0+0x00000000000005dc]
J 10393 c1 com.faforever.client.chat.ChannelTabController$$Lambda$1876.run()V (16 bytes) @ 0x000000000c644e2c [0x000000000c644dc0+0x000000000000006c]
J 14103 c2 com.sun.javafx.application.PlatformImpl$$Lambda$224.run()Ljava/lang/Object; javafx.graphics@10.0.2 (8 bytes) @ 0x0000000012750664 [0x0000000012750620+0x0000000000000044]
v  ~StubRoutines::call_stub
J 7227  java.security.AccessController.doPrivileged(Ljava/security/PrivilegedAction;Ljava/security/AccessControlContext;)Ljava/lang/Object; java.base@10.0.2 (0 bytes) @ 0x0000000012accd26 [0x0000000012acccc0+0x0000000000000066]
J 13749 c2 com.sun.glass.ui.InvokeLaterDispatcher$Future.run()V javafx.graphics@10.0.2 (91 bytes) @ 0x0000000012fd6508 [0x0000000012fd63e0+0x0000000000000128]
v  ~StubRoutines::call_stub
j  com.sun.glass.ui.win.WinApplication._runLoop(Ljava/lang/Runnable;)V+0 javafx.graphics@10.0.2
j  com.sun.glass.ui.win.WinApplication.lambda$runLoop$3(ILjava/lang/Runnable;)V+8 javafx.graphics@10.0.2
j  com.sun.glass.ui.win.WinApplication$$Lambda$212.run()V+12 javafx.graphics@10.0.2
j  java.lang.Thread.run()V+11 java.base@10.0.2
v  ~StubRoutines::call_stub

siginfo: EXCEPTION_ACCESS_VIOLATION (0xc0000005), reading address 0x0000000300b43e50

@1-alex98
Member

Hmm, what now? We could still try my workaround.

@1-alex98
Member

Well, I mean, we tested it, so maybe we should implement it then.

@1-alex98
Member

Any reactions?

@1-alex98
Member

Did not know it was only me doing this project... 🤔

1-alex98 added a commit that referenced this issue Jun 11, 2019
@1-alex98
Member

So I reopened my old PR.


6 participants