runtime: go program crach, it seems fall into infinite loop #35733

Closed
mmli519 opened this issue Nov 21, 2019 · 5 comments

Comments

@mmli519 mmli519 commented Nov 21, 2019

What version of Go are you using (go version)?

go version go1.12.9 linux/arm64

Does this issue reproduce with the latest release?

What operating system and processor architecture are you using (go env)?

go env Output
aarch64

What did you do?

Nothing special; we just ran the Go program.

What did you expect to see?

The program runs normally.

What did you see instead?

The program crashed.

After loading the core file in gdb, we see that one thread appears to have fallen into an infinite loop; see the details below.

(gdb) bt
#0  runtime.futex () at /opt/tools/go/src/runtime/sys_linux_arm64.s:417
#1  0x000000000042cb68 in runtime.futexsleep (addr=0x14f23b0 <runtime.sched+272>, val=0, ns=60000000000) at /opt/tools/go/src/runtime/os_linux.go:63
#2  0x000000000040be7c in runtime.notetsleep_internal (n=0x14f23b0 <runtime.sched+272>, ns=60000000000, ~r2=<optimized out>) at /opt/tools/go/src/runtime/lock_futex.go:193
#3  0x000000000040bf50 in runtime.notetsleep (n=0x14f23b0 <runtime.sched+272>, ns=60000000000, ~r2=<optimized out>) at /opt/tools/go/src/runtime/lock_futex.go:216
#4  0x000000000043b614 in runtime.sysmon () at /opt/tools/go/src/runtime/proc.go:4305
#5  0x000000000043350c in runtime.mstart1 () at /opt/tools/go/src/runtime/proc.go:1206
#6  0x000000000043342c in runtime.mstart () at /opt/tools/go/src/runtime/proc.go:1172
#7  0x0000000000b683b8 in crosscall1 () at gcc_arm64.S:40
...
#38643 0x0000000000b683b8 in crosscall1 () at gcc_arm64.S:40
#38644 0x0000000000b683b8 in crosscall1 () at gcc_arm64.S:40
#38645 0x0000000000b683b8 in crosscall1 () at gcc_arm64.S:40
#38646 0x0000000000b683b8 in crosscall1 () at gcc_arm64.S:40
#38647 0x0000000000b683b8 in crosscall1 () at gcc_arm64.S:40
#38648 0x0000000000b683b8 in crosscall1 () at gcc_arm64.S:40
#38649 0x0000000000b683b8 in crosscall1 () at gcc_arm64.S:40
#38650 0x0000000000b683b8 in crosscall1 () at gcc_arm64.S:40
#38651 0x0000000000b683b8 in crosscall1 () at gcc_arm64.S:40
#38652 0x0000000000b683b8 in crosscall1 () at gcc_arm64.S:40

The register values are:

(gdb) i r
x0             0x4000874848        274886772808
x1             0x80                128
x2             0x0                 0
x3             0x0                 0
x4             0x0                 0
x5             0x0                 0
x6             0x45                69
x7             0x1                 1
x8             0x62                98
x9             0x92865ea           153642474
x10            0x5dcee731          1573840689
x11            0x18                24
x12            0xffffffffa235c94b  -1573533365
x13            0x0                 0
x14            0xfffe75ffa5c0      281468366464448
x15            0x0                 0
x16            0x4000874738        274886772536
x17            0xfffe75ffa870      281468366465136
x18            0x1                 1
x19            0x8                 8
x20            0x40009c2f20        274888142624
x21            0x4000874700        274886772480
x22            0x65f4              26100
x23            0x0                 0
x24            0x4000c673d8        274890912728
x25            0x6c1a1cae73a3c66a  7789570041080694378
x26            0xd5a720            14001952
x27            0x14b0380           21693312
x28            0x4000688180        274884755840
x29            0xfffe75ffa6f8      281468366464760
x30            0x42caf4            4377332
sp             0xfffe75ffa700      0xfffe75ffa700
pc             0x45be04            0x45be04 <runtime.futex+28>
cpsr           0x60000000          [ EL=0 C Z ]
fpsr           0x10                16
fpcr           0x0                 0

gcc_arm64.S code snapshot:

25 .globl EXT(crosscall1)
26 EXT(crosscall1):
27 stp x19, x20, [sp, #-16]!
28 stp x21, x22, [sp, #-16]!
29 stp x23, x24, [sp, #-16]!
30 stp x25, x26, [sp, #-16]!
31 stp x27, x28, [sp, #-16]!
32 stp x29, x30, [sp, #-16]!
33 mov x29, sp
34
35 mov x19, x0
36 mov x20, x1
37 mov x0, x2
38
39 blr x20
40 blr x19
41
42 ldp x29, x30, [sp], #16
43 ldp x27, x28, [sp], #16
44 ldp x25, x26, [sp], #16
45 ldp x23, x24, [sp], #16
46 ldp x21, x22, [sp], #16
47 ldp x19, x20, [sp], #16
48 ret

As seen above, the value of x19 looks strange.

@mmli519 mmli519 commented Nov 21, 2019

On Oct 14, I submitted a similar issue: #34886

@ALTree ALTree changed the title from "go program crach, it seems fall into infinite loop" to "runtime: go program crach, it seems fall into infinite loop" Nov 21, 2019

@ALTree ALTree commented Nov 21, 2019

Thanks for reporting this.

A few questions:

  • can you provide a reproducer? It'll be harder to debug the issue without a way to reproduce it
  • how often does this happen? Is it easily reproducible? Does it happen every time? Or once in a while?
  • does your program use cgo and/or unsafe?
  • if you run it with -race, does the race detector print any warning?
  • have you tried running the latest released version (1.13.4)? Does your program crash with 1.13?

@mmli519 mmli519 commented Nov 21, 2019

Thanks for reporting this.

A few questions:

  • can you provide a reproducer? It'll be harder to debug the issue without a way to reproduce it

A: It happens occasionally in our production environment; we still don't know how to reproduce it.

  • how often does this happen? Is it easily reproducible? Does it happen every time? Or once in a while?

A: It is not easily reproducible. It happens occasionally.

  • does your program use cgo and/or unsafe?

A: No, neither is used.

  • if you run it with -race, does the race detector print any warning?

A: Yes, it reports some data race warnings.

  • have you tried running the latest released version (1.13.4)? Does your program crash with 1.13?

A: Not yet. The version in our production environment is 1.12.9, and we plan to upgrade to 1.13. Since the crash is not easy to reproduce, we are trying to reproduce it in our dev environment.

Race warning example:
==============================
WARNING: DATA RACE
Write at 0x00000127b680 by goroutine 47:
  service.KafkaClientProducer()
      /home/ossadm/smpagent/src/Apollo_EyeHand_go/src/service/kafkaDataUpload.go:28 +0x1f0

Previous read at 0x00000127b680 by goroutine 50:
  service.uploadSaaSBasicInfoToKafka()
      /home/ossadm/smpagent/src/Apollo_EyeHand_go/src/service/initSaaSConf.go:316 +0x4f

Goroutine 47 (running) created at:
  main.main()
      /home/ossadm/smpagent/src/Apollo_EyeHand_go/src/main.go:23 +0xc4

Goroutine 50 (finished) created at:
  service.uploadSaaSBasicInfo()
      /home/ossadm/smpagent/src/Apollo_EyeHand_go/src/service/initSaaSConf.go:59 +0x42
  service.InitUploadSaaSConf()
      /home/ossadm/smpagent/src/Apollo_EyeHand_go/src/service/initSaaSConf.go:18 +0x5c

@agnivade agnivade commented Nov 21, 2019

A program with races is not a valid program. Please fix the races first and if you still see the issue, then open an issue.

@ALTree ALTree commented Nov 21, 2019

@mmli519 Thanks.

What @agnivade said. The data races are probably corrupting random memory and causing the crash. You have to fix every data race in your program, or it'll keep crashing.
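
As a rough illustration of what fixing such a race can look like: the report above shows one goroutine writing and another reading the same address, which is typical of an unsynchronized shared variable. The reporter's actual code is not shown in this thread, so the sketch below is only an assumption about its shape; `producerConfig`, `Set`, `Get`, and the broker address are hypothetical names invented for the example. It guards the shared state with a `sync.RWMutex` so that the write and the read can no longer race.

```go
package main

import (
	"fmt"
	"sync"
)

// producerConfig is a hypothetical stand-in for state shared between
// goroutines (for example, between a producer and an uploader).
// It is not taken from the reporter's program.
type producerConfig struct {
	mu      sync.RWMutex
	brokers []string
}

// Set replaces the broker list while holding the write lock.
// The slice is copied so callers cannot mutate the stored data later.
func (c *producerConfig) Set(b []string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.brokers = append([]string(nil), b...)
}

// Get returns a copy of the broker list while holding the read lock.
func (c *producerConfig) Get() []string {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return append([]string(nil), c.brokers...)
}

func main() {
	var cfg producerConfig
	var wg sync.WaitGroup
	wg.Add(2)

	// One goroutine writes the shared state...
	go func() {
		defer wg.Done()
		cfg.Set([]string{"broker-1:9092"})
	}()

	// ...while another reads it. Without the mutex, the race detector
	// would flag this pair of accesses.
	go func() {
		defer wg.Done()
		_ = cfg.Get()
	}()

	wg.Wait()
	fmt.Println(cfg.Get())
}
```

Running the program with `go run -race` is a reasonable way to check that the detector no longer reports these accesses. A channel or `sync/atomic` could serve equally well, depending on how the real code shares its data.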

I'm closing here, since this is caused by an issue in the program.

@ALTree ALTree closed this Nov 21, 2019