ARPL 106 total crash #94

Open
nemesis122 opened this issue Dec 5, 2022 · 47 comments

@nemesis122

problem arpl106.txt

@nemesis122
Author

With 1.03 everything works fine. I don't know what the issue is; please let me know if you need more information.

@nemesis122 nemesis122 changed the title from "ARPL 106 toal crash" to "ARPL 106 total crash" Dec 5, 2022
@nemesis122
Author

Is the ixgbe driver the issue here as well?
Modules linked in: ixgbe(OE) vxlan etxhci_hcd ip6_udp_tunnel udp_tunnel dca button(E) usb_storage xhci_pci xhci_hcd ehci_pci ehci_hcd usbcore usb_common mv14xx(O) redpill(OE) [last unloaded: broadwellnk_synobios]
[ 104.918219] CPU: 1 PID: 4881 Comm: scsi_id Tainted: P OEL 4.4.180+ #42962
[ 104.955211] Hardware name: HP ProLiant MicroServer Gen8, BIOS J06 04/04/2019
[ 104.989043] task: ffff88040729b300 ti: ffff880407d40000 task.ti: ffff880407d40000
[ 105.025071] RIP: 0010:[] [] queued_spin_lock_slowpath+0xe2/0x150
[ 105.068681] RSP: 0018:ffff880407d43de0 EFLAGS: 00000202
[ 105.094652] RAX: 0000000000080001 RBX: ffff8803f2c103c0 RCX: 0000000000080000
[ 105.128590] RDX: ffff880409656e00 RSI: 0000000000000000 RDI: ffffffffa001f020
[ 105.163416] RBP: ffff880407d43de0 R08: 0000000000000001 R09: 0000000000002285
[ 105.197503] R10: fffffffffffff12b R11: 0000000000000202 R12: 000000000002005d
[ 105.231455] R13: 0000000000002285 R14: 00007ffc920b3210 R15: 00007ffc920b3210
[ 105.265739] FS: 00007f806ebf8c00(0000) GS:ffff880409640000(0000) knlGS:0000000000000000
[ 105.304953] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 105.332612] CR2: 00007f806e21ca30 CR3: 0000000407706000 CR4: 00000000001606f0
[ 105.367133] Stack:
[ 105.376966] ffff880407d43df0 ffffffff81566a7c ffff880407d43e20 ffffffffa0002293
[ 105.412219] 00000000ffffffe7 00007ffc920b3210 00007ffc920b3210 ffff8803f2c103c0
[ 105.447825] ffff880407d43e30 ffffffffa00021a9 ffff880407d43e80 ffffffff812c91ca
[ 105.483634] Call Trace:
[ 105.495364] [] _raw_spin_lock+0x1c/0x20
[ 105.521835] [] sd_ioctl_canary+0x23/0x60 [redpill]
[ 105.553114] [] sd_ioctl_smart_shim+0x29/0x40 [redpill]
[ 105.585937] [] blkdev_ioctl+0x30a/0x9d0
[ 105.612113] [] block_ioctl+0x38/0x40
[ 105.637114] [] do_vfs_ioctl+0x7ea/0xa80
[ 105.663379] [] SyS_ioctl+0xa1/0xb0
[ 105.687413] [] entry_SYSCALL_64_fastpath+0x1e/0x8e
[ 105.718368] Code: 81 48 89 10 8b 42 08 85 c0 75 1e f3 90 8b 42 08 85 c0 74 f7 eb 13 f3 90 8b 37 81 fe 00 01 00 00 74 f4 e9 32 ff ff ff f3 90 8b 07 <66> 85 c0 75 f7 39 c1 75 0f 89 c8 be 01 00 00 00 f0 0f b1 37 39
[ 105.809621] Sending NMI to other CPUs:
[ 105.827730] NMI backtrace for cpu 0
[ 105.844387] CPU: 0 PID: 0 Comm: swapper/0 Tainted: P OEL 4.4.180+ #42962
[ 105.881254] Hardware name: HP ProLiant MicroServer Gen8, BIOS J06 04/04/2019
[ 105.914940] task: ffffffff818114c0 ti: ffffffff81800000 task.ti: ffffffff81800000
[ 105.951172] RIP: 0010:[] [] mwait_idle+0xb6/0x190
[ 105.989694] RSP: 0018:ffffffff81803ef0 EFLAGS: 00000246
[ 106.015155] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 106.049602] RDX: 0000000000000000 RSI: ffffffff81804000 RDI: 0000000000000096
[ 106.084377] RBP: ffffffff81803f08 R08: 0000000000000002 R09: 0000000000000000
[ 106.118946] R10: 000000000000049a R11: 00000000000075e1 R12: 0000000000000000
[ 106.153540] R13: ffffffff81804000 R14: 00000000fffffff0 R15: 0000000000000000
[ 106.188268] FS: 0000000000000000(0000) GS:ffff880409600000(0000) knlGS:0000000000000000
[ 106.227962] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 106.255565] CR2: 000000000075a334 CR3: 000000040727f000 CR4: 00000000001606f0
[ 106.290410] Stack:
thank you very much
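
As a reading aid for the trace above, the "Modules linked in:" line can be split into module names and their taint letters. This is only a minimal sketch: `parse_modules` and `TAINT_FLAGS` are hypothetical names, not part of ARPL or redpill, and the letter meanings follow the kernel's module taint flags as I understand them.

```python
import re

# Taint letters the kernel may append to a module name, e.g. "ixgbe(OE)".
# Assumption: meanings taken from the kernel's tainted-kernels documentation.
TAINT_FLAGS = {
    "P": "proprietary module",
    "O": "out-of-tree module",
    "E": "unsigned module",
    "F": "force-loaded module",
    "C": "staging driver",
}

# A module name, optionally followed by taint letters in parentheses.
MODULE_RE = re.compile(r"([A-Za-z0-9_]+)(?:\(([A-Z]+)\))?")


def parse_modules(line):
    """Return {module_name: [flag descriptions]} for a 'Modules linked in:' line."""
    _, _, rest = line.partition("Modules linked in:")
    # Drop the trailing "[last unloaded: ...]" note; it is not a loaded module.
    rest = rest.split("[last unloaded", 1)[0]
    modules = {}
    for name, flags in MODULE_RE.findall(rest):
        modules[name] = [TAINT_FLAGS.get(f, f"unknown flag {f}") for f in flags]
    return modules


if __name__ == "__main__":
    sample = ("Modules linked in: ixgbe(OE) vxlan button(E) mv14xx(O) redpill(OE) "
              "[last unloaded: broadwellnk_synobios]")
    for name, flags in parse_modules(sample).items():
        print(name, flags)
```

Read this way, ixgbe in the trace is marked (OE), meaning it was loaded as an out-of-tree, unsigned module rather than one built into the running kernel.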

@fbelavenuto
Owner

Hi, does your PC have any network cards that depend on the 'ixgbe' driver? Please test again by disabling this module "Main Menu / Modules / Select Modules / (deselect ixgbe)"

@nemesis122
Author

Hi Fabio
Yes, it is an Intel X520 SFP+ card with one port. Maybe this is related to the earlier case: when you switched to the ixgbe driver from pocopico, I had the same problem.
Then you switched back to the driver taken from the source and the error was gone; I think that was in version 1.01. In any case, with 1.03 everything works perfectly. Maybe version 1.06 has the same problem?
Thank you for all your hard work for the community :-)

@nemesis122
Author

I will try with a Mellanox card instead of this Intel one and let you know.
Thanks
I will also try again with ixgbe deselected. Have a nice time, see ya

@nemesis122
Author

nemesis122 commented Dec 7, 2022

Hi Fabio
I have tried ARPL 107 with all modules updated and no addons added:
without ixgbe nothing happens; the server is not found on the local network,
and with ixgbe added there is a kernel panic and a full crash.

Please find both outputs attached.
Have you changed back to the ixgbe driver from pocopico? (It crashes the loader every time, also with tinycore.)
Before, when the driver was taken from the source, everything worked fine.
Thank you
Michael
ARPL107 without ixgbeallmodulesupdated.txt
ARPL107withixgbeallmodulesupdated.txt

@fbelavenuto
Owner

Please update modules, rebuild the loader and test it

@nemesis122
Author

OK, I saw the change. I hope the driver from pocopico will work now, because with the driver from the source everything was working. I will try it and let you know. Thank you very much.

@fbelavenuto
Owner

Okay, I'm confused as I don't remember what worked and I may have misinterpreted the texts as I don't understand the English language well.

@nemesis122
Author

nemesis122 commented Dec 8, 2022

Hi Fabio, no problem. I mean:
ixgbe driver from the source, as in 1.03: everything works fine
ixgbe driver from pocopico: NOT working, also not with the tinycore loader
So switch to the driver from the source for good, everything will work, and Fabio has less stress :-)

@fbelavenuto fbelavenuto transferred this issue from fbelavenuto/arpl Dec 8, 2022
@fbelavenuto
Owner

Please re-test after updating the modules.

@nemesis122
Author

Hi, sorry to say it is not working. I updated the USB stick to 1.08, chose 3622xs, updated all modules, and also tried without any module.
1.08 with all Modules not working.txt

Second, I created a new USB stick with 1.08. Without modules, with modules, etc., this issue happens while patching the zimage:
error creating the loader 3622xs

With 1.03b everything works perfectly.

Thank you very much and have a nice day

@nemesis122
Author

I will test some other things this evening and also try with my Mellanox 10 GbE SFP+ card.

@nemesis122
Author

nemesis122 commented Dec 13, 2022

I think the reason is here:
[ 12.790287] sd 6:0:0:0: [synoboot] Attached SCSI removable disk
udevadm settle failed
[ 56.685321] NMI watchdog: BUG: soft lockup - CPU#2 stuck for 41s! [scsi_id:4844]
[ 56.720695] Modules linked in: usb_storage ixgbe(O)

So in this case the ixgbe driver is broken again when it is not taken from the source as it was in 1.03b.
Could we add this module the way it is in 1.03b? That version of ARPL works great.
thank you very much
see ya
Michael

@nemesis122
Author

I think this is also related to this:
pocopico/rp-ext#103
The ixgbe driver is included in DSM and vanilla and causes the problem.
Anyway, with 1.03 everything was perfect. What is different about the ixgbe driver between 1.03b and 1.04 - 1.08b?
thank you
Michael

@nemesis122
Author

I also found this in the log file:
[ 127.490716] <redpill/bios_shim.c:100> broadwellnk_synobios BIOS went away - you may get a kernel panic if YOU unloaded it

@nemesis122
Author

So I think I can disable the Intel X520 PCI device in the BIOS, create the loader with ARPL 1.08, then enable the device again and change the MAC address, and it should work :-)

But with ARPL 1.03b and 0.5 it works great. Why? What is different about the ixgbe module in 1.03b / 0.5 compared with 1.04 - 1.08b?

@fbelavenuto
Owner

Second, I created a new USB stick with 1.08. Without modules, with modules, etc., this issue happens while patching the zimage

Okay, I don't know what could have happened here!

@fbelavenuto
Owner

But with ARPL 1.03b and 0.5 it works great. Why? What is different about the ixgbe module in 1.03b / 0.5 compared with 1.04 - 1.08b?

I don't know!! I took the same modules from version 1.03 and put them in version 1.07 and it didn't work!
If you can test it, get the files inside /mnt/p3/modules from the 1.03 image, put them on your current USB flash drive, rebuild the loader and test it.
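
One way to approach the "what is different" question before swapping files is to hash both copies of the modules directory and list what changed. The sketch below assumes the two directories have already been copied somewhere readable; `hash_tree`, `diff_trees` and the example paths are hypothetical and not part of ARPL.

```python
import hashlib
from pathlib import Path


def hash_tree(root):
    """Map each file's path (relative to root) to the SHA-256 of its content."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def diff_trees(old_root, new_root):
    """Report files that exist on only one side or whose content differs."""
    old, new = hash_tree(old_root), hash_tree(new_root)
    return {
        "only_in_old": sorted(old.keys() - new.keys()),
        "only_in_new": sorted(new.keys() - old.keys()),
        "changed": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }


if __name__ == "__main__":
    import sys

    # Example call (hypothetical paths): python diff_modules.py ./modules-1.03 ./modules-1.08
    if len(sys.argv) == 3:
        for kind, names in diff_trees(sys.argv[1], sys.argv[2]).items():
            print(kind, names)
```

If the module files turn out to be byte-identical between the two versions, the difference is more likely in how the loader uses them than in the driver itself.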

@nemesis122
Author

Hi, thank you for your feedback. In 1.03b the ixgbe driver was taken from the source (does this mean DSM?). Maybe there is another loader config, or the way the module is used has changed?
Thanks

@fbelavenuto
Owner

Now I have removed the ixgbe from all the models that already have it!
I'm going to look for a network card that uses this driver for further testing.
Thanks.

@nemesis122
Author

nemesis122 commented Dec 13, 2022

Which network card do you have in your environment? Then I will buy the same one, so we have the same hardware and it will always work :-)

@fbelavenuto
Owner

I use a Xeon server with proxmox, so most of the time I use virtio_net. I have access to some machines at work so I test with some Realtek models and Intel i211 network cards (igb driver).

@nemesis122
Author

So it is working with the ARPL 1.09 loader, created with these steps:
1 Update ARPL to 1.09
2 Update all modules and addons
3 Reboot
4 Create 3622xs with all modules included
5 This loader version 1.09 goes into my safe :-)
6 What was the problem?
Thank you very much, Master
ARPL 1.09 all Modules included version 3622xs 42962 working like a dream.txt

@nemesis122
Author

This drives me crazy: I created the loader again a second time and now there is a kernel panic :-/

@nemesis122
Author

nemesis122 commented Dec 13, 2022

So I found it, but I don't know why.
First I create the loader and install DSM; all is fine. I create the storage etc. in DSM; all is fine...
Then I reboot the server and get a kernel panic, having changed nothing on the loader USB.
second reboot kernel panic.txt

@nemesis122
Author

This is the strange behavior at the second boot
[ 127.490716] <redpill/bios_shim.c:100> broadwellnk_synobios BIOS went away - you may get a kernel panic if YOU unloaded it

@nemesis122
Author

nemesis122 commented Dec 13, 2022

With 1.03b everything works great and very fast; I tested it again.
Maybe this kernel panic is not related to the ixgbe driver?

Possibly it is related to another issue, something that changed in the loader between 1.03b and 1.09b, because unloading the BIOS is strange:
[ 127.490716] <redpill/bios_shim.c:100> broadwellnk_synobios BIOS went away - you may get a kernel panic if YOU unloaded it

@nemesis122
Author

I will test with the Mellanox card and let you know.

@nemesis122
Author

I have now tested ARPL 1.03b with all addons and modules from version 1.08b (the loader itself is still 1.03b), and the server crashed. Please find the log attached.
1.03b with updated Modules abd Addons to 1.08 but arpl is still 1.03b crash.txt

@fbelavenuto
Owner

Hi nemesis122, this last log does not have a kernel panic!
Please do some tests: use v1.0-beta3, update only the modules, build and test.

@nemesis122
Author

nemesis122 commented Dec 14, 2022

ok i will try

@nemesis122
Author

nemesis122 commented Dec 14, 2022

Please find attached the logs
ARPL 1.03b
workflow:
update only the modules
Select 3622xs Version 42962
Deselect all Modules
Build the Loader
and Booting --> Crash

Second
Reboot edit the Loader
Add all Modules
Build the loader
Booting
--> Crash
3622xs 1.03b only update the Modules and booting without modules_error synboot removed in the log.txt
3622xs 1.03b only update the Modules and booting with all modules.txt

[ 90.971546] Module [broadwellnk_synobios] is removed.
[ 90.996338] synobios: unload !

@fbelavenuto
Owner

The error is this:

[   11.766378] sd 6:0:0:0: [synoboot] Attached SCSI removable disk
udevadm settle failed
[   56.122881] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 41s! [scsi_id:4909]

[ 52.119368] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 41s! [scsi_id:4830]
Some trouble with SCSI!!

Try one thing: go to the advanced menu, enable "direct boot" and boot.
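
To check the attached logs quickly for the same pattern, the soft-lockup lines quoted above can be pulled out with a small script. This is a sketch that assumes the watchdog message keeps exactly the format shown in this thread; `find_soft_lockups` and `stuck_processes` are hypothetical helper names.

```python
import re
from collections import Counter

# Matches lines such as:
# [   56.122881] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 41s! [scsi_id:4909]
LOCKUP_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+NMI watchdog: BUG: soft lockup - "
    r"CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)


def find_soft_lockups(log_text):
    """Return one dict per soft-lockup line found in the log text."""
    events = []
    for m in LOCKUP_RE.finditer(log_text):
        events.append({
            "timestamp": float(m["ts"]),
            "cpu": int(m["cpu"]),
            "seconds": int(m["secs"]),
            "process": m["comm"],
            "pid": int(m["pid"]),
        })
    return events


def stuck_processes(events):
    """Count how often each process name appears in the lockup events."""
    return Counter(e["process"] for e in events)


if __name__ == "__main__":
    import sys

    # Usage: python lockups.py <log file>
    if len(sys.argv) == 2:
        with open(sys.argv[1], errors="replace") as fh:
            found = find_soft_lockups(fh.read())
        print(stuck_processes(found))
```

In the excerpts posted here the stuck process is scsi_id each time, which is consistent with the SCSI suspicion above.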

@nemesis122
Author

with direct boot it is working but why ?

@nemesis122
Author

The 920+ is working great. Is there any disadvantage compared to the 3622xs?

@fbelavenuto
Owner

with direct boot it is working but why ?

We do not know! It was pocopico who noticed this problem with the HP Proliant Microservers

@fbelavenuto
Owner

The 920+ is working great. Is there any disadvantage compared to the 3622xs?

No, the 920 is device-tree, it only works with native SATA. If it's working fine for you I recommend using the 920.

@nemesis122
Author

nemesis122 commented Dec 21, 2022

It has been really strange with this loader since 1.06.
I changed the network adapter to a Mellanox 3 and it crashes again.
Installation works fine, for example with the 3622, but after the reboot it crashes, the same as with the Intel network adapter.
With 1.03 everything runs perfectly.

@nemesis122
Author

Hi Fabio
Christian has changed some things in the modules and now it works with the Gen8. Could you have a look? I don't know if it's only related to the ixgbe driver.
https://github.com/AuxXxilium/arc-modules/tree/main/broadwellnk-4.4.180
Thanks

@nemesis122
Author

This version also crashes on the second boot after an install or after a reset to factory settings.

@fbelavenuto
Owner

Hi Fabio, Christian has changed some things in the modules and now it works with the Gen8. Could you have a look? I don't know if it's only related to the ixgbe driver. https://github.com/AuxXxilium/arc-modules/tree/main/broadwellnk-4.4.180 Thanks

Ok, thanks.

@fbelavenuto
Owner

Please update ARPL, reboot, update modules, addons and lkm, rebuild the loader and test it.

@nemesis122
Author

nemesis122 commented Jan 14, 2023

i will test and let you know thanks anyway :-)

@nemesis122
Author

nemesis122 commented Jan 15, 2023

Hi Fabio :-)
Thank you very much for this masterpiece of a loader. It works in every case (restore / switch to another model / fresh install), and the CPU frequencies now work perfectly too.
Tested and working perfectly 👍
3617
3622
4021
restore / switch / fresh install: all working perfectly!

There is a small issue with the CPU clock shown in the CPU info, but it is only cosmetic.
https://xpenology.com/forum/topic/65408-automated-redpill-loader-arpl/?do=findComment&comment=436739

@fbelavenuto
Owner

Is the ixgbe working?

@nemesis122
Author

nemesis122 commented Jan 18, 2023

Hi Fabio
Yes, ixgbe is working. I have tested with 👍
I3 3240 (2 cores)
16 GB RAM
1x SSD Volume1
4x 6 TB RAID 0 Volume2
and Intel X520 SFP+ <---
It works for a new install, an update install, a switch to another model, etc. The performance is also fine and the CPU governor command is not needed anymore. The only small issue is a cosmetic one with the clock speed shown in the CPU info.
https://xpenology.com/forum/topic/65408-automated-redpill-loader-arpl/?do=findComment&comment=436739

BR
Michael
Have a nice time see ya
