AMD64 FreeBSD Non-Debug 3.x: out of swap space (test process killed by signal 9) #83502

Closed
vstinner opened this issue Jan 13, 2020 · 18 comments

@vstinner
Member

BPO 39321
Nosy @vstinner, @koobs, @pablogsal

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:

assignee = 'https://github.com/koobs'
closed_at = <Date 2020-05-26.06:48:17.424>
created_at = <Date 2020-01-13.14:18:07.367>
labels = ['tests', '3.9']
title = 'AMD64 FreeBSD Non-Debug 3.x: out of swap space (test process killed by signal 9)'
updated_at = <Date 2020-05-26.13:33:10.196>
user = 'https://github.com/vstinner'

bugs.python.org fields:

activity = <Date 2020-05-26.13:33:10.196>
actor = 'vstinner'
assignee = 'koobs'
closed = True
closed_date = <Date 2020-05-26.06:48:17.424>
closer = 'koobs'
components = ['Tests']
creation = <Date 2020-01-13.14:18:07.367>
creator = 'vstinner'
dependencies = []
files = []
hgrepos = []
issue_num = 39321
keywords = []
message_count = 18.0
messages = ['359904', '359905', '359907', '359908', '359943', '359944', '359950', '359953', '363733', '363877', '364627', '365024', '366354', '367782', '367821', '369949', '369963', '369979']
nosy_count = 3.0
nosy_names = ['vstinner', 'koobs', 'pablogsal']
pr_nums = []
priority = 'normal'
resolution = 'fixed'
stage = 'resolved'
status = 'closed'
superseder = None
type = None
url = 'https://bugs.python.org/issue39321'
versions = ['Python 3.9']

@vstinner
Member Author

https://buildbot.python.org/all/#/builders/214/builds/152

...
0:08:21 load avg: 3.66 [240/420] test_wait3 passed -- running: test_multiprocessing_forkserver (1 min 51 sec)
0:08:22 load avg: 3.66 [241/420] test_uuid passed -- running: test_multiprocessing_forkserver (1 min 53 sec)
0:08:25 load avg: 3.53 [242/420] test_tuple passed -- running: test_multiprocessing_forkserver (1 min 55 sec)
0:08:32 load avg: 3.56 [243/420] test___all__ passed -- running: test_multiprocessing_forkserver (2 min 3 sec)
*** Signal 9
Stop.
make: stopped in /usr/home/buildbot/python/3.x.koobs-freebsd-9e36.nondebug/build
program finished with exit code 1
elapsedTime=519.823452

@vstinner vstinner added the "3.9 (only security fixes)" and "tests (Tests in the Lib/test dir)" labels Jan 13, 2020
@vstinner
Member Author

It seems like Signal 9 is SIGKILL.
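
The number-to-name mapping can be checked from Python itself. This is a minimal sketch, assuming a POSIX platform such as FreeBSD or Linux; on Windows the lookup raises `ValueError` because signal 9 is not defined there.

```python
import signal

# Look up the symbolic name for signal number 9 on this platform.
# Assumption: a POSIX system, where 9 is SIGKILL.
sig = signal.Signals(9)
print(sig.name)
```

SIGKILL cannot be caught or ignored by the target process, which would explain why the log ends abruptly with no Python traceback.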

@vstinner vstinner changed the title from "AMD64 FreeBSD Non-Debug 3.x: main regrtest process killed by Signal 9" to "AMD64 FreeBSD Non-Debug 3.x: main regrtest process killed by SIGKILL (Signal 9)" Jan 13, 2020
@vstinner
Member Author

@vstinner
Member Author

@koobs

koobs commented Jan 14, 2020

Identified a kernel/userland mismatch which may have caused this. Have restarted the server and worker, and will rebuild https://buildbot.python.org/all/#/builders/214/builds/152

@koobs

koobs commented Jan 14, 2020

Rebuilding now

@koobs

koobs commented Jan 14, 2020

Looks OK now: https://buildbot.python.org/all/#/builders/214

If it fails again in the same manner, please re-open

@vstinner
Member Author

> Identified a kernel/userland mismatch which may have caused this. Have restarted the server and worker, and will rebuild https://buildbot.python.org/all/#/builders/214/builds/152

Aha, interesting bug. Thanks for fixing it ;-)

@vstinner
Member Author

vstinner commented Mar 9, 2020

> If it fails again in the same manner, please re-open

The issue is back. Two examples.

--

Today: https://buildbot.python.org/all/#/builders/214/builds/405

(...)
0:15:40 load avg: 3.11 [366/420] test_pickletools passed -- running: test_multiprocessing_forkserver (2 min 43 sec), test_multiprocessing_fork (1 min 17 sec)
0:15:40 load avg: 3.11 [367/420] test_webbrowser passed -- running: test_multiprocessing_forkserver (2 min 43 sec), test_multiprocessing_fork (1 min 18 sec)
0:15:42 load avg: 3.11 [368/420] test_codecmaps_hk passed -- running: test_multiprocessing_forkserver (2 min 45 sec), test_multiprocessing_fork (1 min 19 sec)
fetching http://www.pythontest.net/unicode/BIG5HKSCS-2004.TXT ...
0:15:43 load avg: 3.11 [369/420] test_pprint passed -- running: test_multiprocessing_forkserver (2 min 45 sec), test_multiprocessing_fork (1 min 20 sec)
*** Signal 9

--

1 day ago: https://buildbot.python.org/all/#/builders/214/builds/395

(...)
0:14:53 load avg: 3.29 [269/420/1] test_keywordonlyarg passed -- running: test_multiprocessing_forkserver (2 min 25 sec)
0:15:00 load avg: 2.94 [270/420/1] test_pprint passed -- running: test_multiprocessing_forkserver (2 min 31 sec)
0:15:00 load avg: 2.94 [271/420/2] test_io crashed (Exit code -9) -- running: test_multiprocessing_forkserver (2 min 31 sec)
0:15:05 load avg: 2.87 [272/420/2] test_positional_only_arg passed -- running: test_multiprocessing_forkserver (2 min 37 sec)
*** Signal 9
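
The `Exit code -9` reported for test_io in the log above can be decoded the same way. The sketch below assumes the value follows the `subprocess` convention, where a negative return code `-N` means the child was terminated by signal `N`; `describe_exit_code` is a hypothetical helper, not part of regrtest.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Describe a subprocess-style return code (hypothetical helper)."""
    if code < 0:
        # Assumed convention: -N means "terminated by signal N".
        try:
            name = signal.Signals(-code).name
        except ValueError:
            # Signal number unknown on this platform.
            name = f"signal {-code}"
        return f"killed by {name}"
    return f"exited with status {code}"


print(describe_exit_code(-9))
```

Under that assumption, the test_io worker and the main process were both ended by the same signal.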

@vstinner vstinner reopened this Mar 9, 2020
@koobs

koobs commented Mar 11, 2020

Investigating

@vstinner
Member Author

The bug still occurs from time to time. AMD64 FreeBSD Non-Debug 3.x:
https://buildbot.python.org/all/#/builders/214/builds/475

@vstinner
Member Author

New failure: https://buildbot.python.org/all/#/builders/214/builds/512

test.pythoninfo says:

datetime.datetime.now: 2020-03-25 18:59:08.424147
socket.hostname: 121-RELEASE-p2-amd64

/var/log/messages says:

Mar 25 18:41:13 121-RELEASE-p2-amd64 kernel: pid 65447 (python), jid 0, uid 1002, was killed: out of swap space
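
A log line like this one can be picked apart with a regular expression when scanning `/var/log/messages` for similar kills. The pattern below is a sketch written only against the single line quoted here; the field layout on other FreeBSD versions is an assumption and may differ.

```python
import re

# Pattern derived from the one log line quoted in this issue.
# Assumption: other kernel kill messages use the same field order.
KILL_RE = re.compile(
    r"kernel: pid (?P<pid>\d+) \((?P<comm>[^)]+)\), "
    r"jid (?P<jid>\d+), uid (?P<uid>\d+), was killed: (?P<reason>.+)$"
)

line = (
    "Mar 25 18:41:13 121-RELEASE-p2-amd64 kernel: pid 65447 (python), "
    "jid 0, uid 1002, was killed: out of swap space"
)

match = KILL_RE.search(line)
if match:
    print(match.group("pid"), match.group("comm"), match.group("reason"))
```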

121-RELEASE-p2-amd64% sysctl hw | egrep 'hw.(phys|user|real)'
hw.physmem: 1033416704
hw.usermem: 745279488
hw.realmem: 1073676288

=> 985.5 MB of memory
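
The 985.5 figure is `hw.physmem` divided by 1024 * 1024, so it is in MiB. A small sketch of that conversion, using the sysctl output above as literal input (`parse_sysctl` is a hypothetical helper):

```python
SYSCTL_OUTPUT = """\
hw.physmem: 1033416704
hw.usermem: 745279488
hw.realmem: 1073676288
"""

MIB = 1024 * 1024


def parse_sysctl(text: str) -> dict:
    """Parse 'name: value' lines, keeping only integer values."""
    values = {}
    for line in text.splitlines():
        name, _, raw = line.partition(":")
        raw = raw.strip()
        if raw.isdigit():
            values[name.strip()] = int(raw)
    return values


hw = parse_sysctl(SYSCTL_OUTPUT)
for name, value in hw.items():
    print(f"{name}: {value / MIB:.1f} MiB")
```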

121-RELEASE-p2-amd64% sysctl vm|grep swap
vm.swap_enabled: 1
vm.domain.0.stats.unswappable: 0
vm.swap_idle_threshold2: 10
vm.swap_idle_threshold1: 2
vm.swap_idle_enabled: 0
vm.disable_swapspace_pageouts: 0
vm.stats.vm.v_swappgsout: 5793651
vm.stats.vm.v_swappgsin: 3322252
vm.stats.vm.v_swapout: 1390626
vm.stats.vm.v_swapin: 875591
vm.nswapdev: 1
vm.swap_fragmentation:
vm.swap_async_max: 4
vm.swap_maxpages: 1964112
vm.swap_total: 4294864896
vm.swap_reserved: 7942307840

=> 4095.9 MB of swap (total)
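
The same conversion applied to the swap counters, as a sketch using the values printed above. It also shows that `vm.swap_reserved` is larger than `vm.swap_total` in this output; whether that is related to the kill is not established in this issue.

```python
MIB = 1024 * 1024

# Values copied from the sysctl output above.
swap_total = 4294864896     # vm.swap_total, in bytes
swap_reserved = 7942307840  # vm.swap_reserved, in bytes

print(f"swap total:    {swap_total / MIB:.1f} MiB")
print(f"swap reserved: {swap_reserved / MIB:.1f} MiB")
print("reserved exceeds total:", swap_reserved > swap_total)
```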

121-RELEASE-p2-amd64% swapinfo -h
Device 1K-blocks Used Avail Capacity
/dev/da0p3 4194204 70M 3.9G 2%

121-RELEASE-p2-amd64% swapinfo
Device 1K-blocks Used Avail Capacity
/dev/da0p3 4194204 72164 4122040 2%
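
For monitoring swap usage over a test run, the non-`-h` output is the easier one to parse. A minimal sketch, assuming the five-column layout shown above with sizes in 1K blocks (`parse_swapinfo` is a hypothetical helper):

```python
SWAPINFO_OUTPUT = """\
Device 1K-blocks Used Avail Capacity
/dev/da0p3 4194204 72164 4122040 2%
"""


def parse_swapinfo(text: str) -> list:
    """Parse plain `swapinfo` output into one dict per device."""
    rows = []
    for line in text.splitlines()[1:]:  # skip the header line
        device, blocks, used, avail, _capacity = line.split()
        rows.append({
            "device": device,
            "blocks_kib": int(blocks),
            "used_kib": int(used),
            "avail_kib": int(avail),
        })
    return rows


rows = parse_swapinfo(SWAPINFO_OUTPUT)
for row in rows:
    pct = 100 * row["used_kib"] / row["blocks_kib"]
    print(f"{row['device']}: {pct:.1f}% used")
```

This snapshot was taken after the failed build, so it shows swap usage at rest rather than at the moment of the kill.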

@vstinner vstinner changed the title from "AMD64 FreeBSD Non-Debug 3.x: main regrtest process killed by SIGKILL (Signal 9)" to "AMD64 FreeBSD Non-Debug 3.x: out of swap space (test process killed by signal 9)" Mar 25, 2020
@vstinner
Member Author

koobs: Any update on this swap issue?

@vstinner
Member Author

The worker still has the same issue:
https://buildbot.python.org/all/#/builders/214/builds/674
0:19:08 load avg: 3.11 running: test_multiprocessing_forkserver (1 min 38 sec)
*** Signal 9

Any update on this issue, koobs?

@koobs

koobs commented May 1, 2020

Provisioning new/additional swap to both of the FreeBSD BB workers in the next few days. Apologies for the delay.

@koobs

koobs commented May 26, 2020

Added 8gb swap disk to each BB worker

@koobs koobs closed this as completed May 26, 2020
@vstinner
Member Author

> Added 8gb swap disk to each BB worker

Great! Thank you very much!

@vstinner
Member Author

Oh. I'm not sure that it works as expected :-(

AMD64 FreeBSD Non-Debug 3.x build 804, Finished 18 minutes ago:

https://buildbot.python.org/all/#/builders/214/builds/804
0:33:28 load avg: 4.64 [329/425/1] test_code_module passed -- running: test_multiprocessing_forkserver (2 min 13 sec)
*** Signal 9

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022