Segmentation fault when terminating upf #16

Closed
sumichaaan opened this issue Dec 7, 2020 · 4 comments

sumichaaan commented Dec 7, 2020

I built the gtp5g kernel module v0.2.0 on Fedora CoreOS 32 with kernel 5.8.x.
In my environment, I confirmed that the gtp5g kernel module loads correctly and that the free5gc upf starts.
However, a segmentation fault occurs when terminating upf.

The log at that time is as follows.

<...snip…>

2020-11-25T16:24:57Z [INFO][UPF][Util] Removing DNN routes
2020-11-25T16:24:57Z [DEBU][UPF][Util] Pool Free successful, total capacity[1024], available[1023]
[ 1109.305515] stack segment: 0000 [#1] SMP PTI
[ 1109.306562] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G           OE     5.8.17-200.fc32.x86_64 #1
[ 1109.308537] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[ 1109.310568] RIP: 0010:pdr_context_free+0x90/0x120 [gtp5g]
[ 1109.311819] Code: 48 8b 7d 08 e8 21 4e 9a f7 48 8b bb 58 ff ff ff e8 15 4e 9a f7 48 8b bb 60 ff ff ff e8 09 4e 9a f7 48 8b 6d 10 48 85 ed 74 4c <48> 8b 45 00 48 85 c0 74 1f 48 8b 78 18 e8 ee 4d 9a f7 48 8b 45 00
[ 1109.315911] RSP: 0018:ffffb518800a8ef0 EFLAGS: 00010286
[ 1109.317118] RAX: ffff9ec6f87a2b01 RBX: ffff9ec6f95970f8 RCX: 0000000000003674
[ 1109.318729] RDX: 0000000000003673 RSI: e8df0a46b05900e6 RDI: 000000000002f040
[ 1109.320345] RBP: 90b4338f06190d33 R08: ffff9ec6f3683b00 R09: 0000000000000000
[ 1109.321959] R10: ffff9ec6f87a2e30 R11: 000000000000b801 R12: ffff9ec6f9597000
[ 1109.323571] R13: 0000000000000000 R14: ffff9ec6fd63a6c0 R15: ffff9ec6fdd2b090
[ 1109.325199] FS:  0000000000000000(0000) GS:ffff9ec6fdd00000(0000) knlGS:0000000000000000
[ 1109.327029] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1109.328352] CR2: 000000c0003f4000 CR3: 00000000307f8005 CR4: 00000000003606e0
[ 1109.329970] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1109.331583] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1109.333218] Call Trace:
[ 1109.333840]  <IRQ>
[ 1109.334381]  rcu_do_batch+0x197/0x3e0
[ 1109.335272]  rcu_core+0x189/0x2e0
[ 1109.336089]  __do_softirq+0xd9/0x2c4
[ 1109.336960]  asm_call_irq_on_stack+0x12/0x20
[ 1109.337975]  </IRQ>
[ 1109.338524]  do_softirq_own_stack+0x37/0x40
[ 1109.339518]  irq_exit_rcu+0xc2/0x100
[ 1109.340391]  sysvec_apic_timer_interrupt+0x34/0x80
[ 1109.341520]  asm_sysvec_apic_timer_interrupt+0x12/0x20
[ 1109.342718] RIP: 0010:native_safe_halt+0xe/0x10
[ 1109.343793] Code: 02 20 48 8b 00 a8 08 75 c4 e9 7b ff ff ff cc cc cc cc cc cc cc cc cc cc cc cc cc cc e9 07 00 00 00 0f 00 2d 06 5a 49 00 fb f4 <c3> 90 e9 07 00 00 00 0f 00 2d f6 59 49 00 f4 c3 cc cc 0f 1f 44 00
[ 1109.347925] RSP: 0018:ffffb51880073ed0 EFLAGS: 00000246
[ 1109.349139] RAX: ffffffffb8b73450 RBX: 0000000000000001 RCX: 0000000000000000
[ 1109.350763] RDX: 0000000000000001 RSI: ffffb51880073ea0 RDI: 000001025d329955
[ 1109.352398] RBP: 0000000000000001 R08: 0000000000000001 R09: ffff9ec6f89af200
[ 1109.354021] R10: 00000000000003ec R11: 0000000000000000 R12: 0000000000000000
[ 1109.355646] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 1109.357274]  ? __sched_text_end+0x3/0x3
[ 1109.358194]  default_idle+0x1a/0x140
[ 1109.359057]  do_idle+0x1f3/0x2a0
[ 1109.359851]  ? arch_cpu_idle_exit+0x40/0x40
[ 1109.360846]  cpu_startup_entry+0x19/0x20

I suspect that kfree(pdr->pdi) in pdr_context_free() (line 1097, gtp5g.c) is called too early.
I think it should be moved after the lines that free the SDF fields (lines 1100 to 1110).
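
The ordering concern can be illustrated with a minimal user-space sketch. It is not the gtp5g code: the struct layout and the names `sdf_filter`, `pdr_alloc_demo`, and `pdr_free_children_first` are hypothetical stand-ins, and `free()` is used in place of `kfree()`. It only shows the general rule that nested fields are released before the object that holds the pointers to them.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the structures discussed above. */
struct sdf_filter {
    char *rule;                 /* e.g. a flow description string */
};

struct pdi {
    struct sdf_filter *sdf;     /* reachable only through the pdi */
};

struct pdr {
    struct pdi *pdi;
};

/*
 * Release a PDR with children first and parents last.
 * Returns the number of allocations released so callers can sanity-check.
 */
static int pdr_free_children_first(struct pdr *pdr)
{
    int freed = 0;
    struct pdi *pdi;
    struct sdf_filter *sdf;

    if (!pdr)
        return 0;

    pdi = pdr->pdi;
    if (pdi) {
        sdf = pdi->sdf;
        if (sdf) {
            if (sdf->rule) {
                free(sdf->rule);
                freed++;
            }
            free(sdf);
            freed++;
        }
        /* Only now is it safe to free pdi: nothing reads pdi->sdf afterwards. */
        free(pdi);
        freed++;
    }

    free(pdr);
    freed++;
    return freed;
}

/* Build a small demo object; returns NULL if any allocation fails. */
static struct pdr *pdr_alloc_demo(const char *rule)
{
    struct pdr *pdr;
    size_t len;

    assert(rule != NULL);
    len = strlen(rule) + 1;

    pdr = calloc(1, sizeof(*pdr));
    if (!pdr)
        return NULL;
    pdr->pdi = calloc(1, sizeof(*pdr->pdi));
    if (!pdr->pdi)
        goto fail;
    pdr->pdi->sdf = calloc(1, sizeof(*pdr->pdi->sdf));
    if (!pdr->pdi->sdf)
        goto fail;
    pdr->pdi->sdf->rule = malloc(len);
    if (!pdr->pdi->sdf->rule)
        goto fail;
    memcpy(pdr->pdi->sdf->rule, rule, len);
    return pdr;

fail:
    /* The free routine tolerates partially built objects. */
    pdr_free_children_first(pdr);
    return NULL;
}
```

If `free(pdi)` were moved above the `sdf` handling, the later read of `pdi->sdf` would be a use-after-free, which is the kind of bug suspected here.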

What do you think?

Best Regards.


NAYANSEN90 commented Apr 29, 2021

I do not get a kernel crash, but my entire VM hangs. I have also isolated the problem to "pdr_context_delete". I see the problem even during PDU session release. When I commented out the "pdr_context_delete" call in "gtp5g_del_pdr", the VM no longer hung.

Perhaps the RCU calls are not being used correctly?

I use Ubuntu 20.04

I reviewed the code as well: the kfree of pdi should come at the end, since pdi is still used after kfree is called.

@muthuramanecs03g
Collaborator

@sumichaaan/@NAYANSEN90,

There is a problem with releasing a GTP interface.

We will fix it soon.


muthuramanecs03g commented May 20, 2021

Hi @sumichaaan,

One more update: this repo is part of the free5GC project. It has now been forked into free5GC and is maintained only there.

https://github.com/free5gc/gtp5g

@muthuramanecs03g
Collaborator

Closing this issue because this repo is no longer supported.

Please consider opening issues at https://github.com/free5gc/gtp5g

coolshou pushed a commit to coolshou/gtp5g that referenced this issue Nov 30, 2023
…zOwO#16)

* modify makefile to decide to match IP address(in F-TEID) or not

* update DRV_VERSION

Co-authored-by: ycchen <chen042531.cs03@nctu.edu.tw>