@jim3ma, Can GeminiLake, v1000, and r1000 be used as HBAs in lkm4? #14

Open
PeterSuh-Q3 opened this issue Aug 31, 2023 · 23 comments

@PeterSuh-Q3

d7e0766

@jim3ma
I tried applying this new module you added for lkm5 to lkm4 yesterday.

https://xpenology.com/forum/topic/65408-automated-redpill-loader-arpl/?do=findComment&comment=449219

This resolved the issue where the disk serial number, which is part of the S.M.A.R.T. information, could not be displayed when using an HBA on DS918+ (Apollo Lake).
Thank you so much for adding such a great feature.

I have one question, or rather an additional request.
The SA6400 is a Device-Tree based model, so can an HBA be used with it as well?
I haven't been able to test the SA6400 yet because I was concentrating on the DS918+ with an HBA yesterday.
If it is possible, I hope HBA support can also be enabled in lkm4 for the existing Device-Tree based platforms such as Gemini Lake, v1000, and r1000.
Is that possible?

@jim3ma
Contributor

jim3ma commented Sep 1, 2023

I have compiled many HBA modules for the SA6400. I'm sure the SA6400 with DSM 7.2 works well with many HBA cards, such as LSI and HBA 1000.
I think the other platforms are the same.

@PeterSuh-Q3
Author

PeterSuh-Q3 commented Sep 3, 2023

@jim3ma

Which type of HBA did you test?
I am testing on a 4th-generation Intel system with a Dell PERC H200 or H310, which is based on the SAS2008 and uses the mpt3sas module.

With Direct Boot enabled on ARPL-i18n, the disks add-on could not be added directly.
However, ARPL-i18n seems to handle internally the disk mapping that the disks add-on normally does.
(I checked the script, and it is already added internally.)
https://github.com/wjz304/arpl-i18n/blob/main/files/board/arpl/overlayfs/opt/arpl/ramdisk-patch.sh#L165

I checked the serial port log below.
The disk is listed under the HBA, but it is not mapped in model.dtb.
Do you think that storage under an HBA in DT-based systems such as the SA6400 should also be mapped in model.dtb?

serial_port_log.txt

Screenshot 2023-09-03 9:56:10 AM

@jim3ma
Contributor

jim3ma commented Sep 3, 2023

> Do you think that storage under HBA in DT-based systems such as SA6400 should also be mapped to model.dtb?

Yes, I have modified the HBA drivers to match sd_probe on the SA6400.

The kernel source for the SA6400 and the other machines is not different, but I think HBAs will work on those machines if you update the HBA drivers to match the Synology kernel.

@PeterSuh-Q3
Author

So my understanding is that the HBA operates independently of model.dtb.

Also, kernel 4 already seems to contain code related to sd_probe.

https://github.com/search?q=repo%3APeterSuh-Q3%2Farpl-modules%20sd_probe&type=code

This repo also contains the kernel 5 sources, but sd_probe is not implemented there.
Could you please tell me which repo holds your sources?
I would like to refer to them.

@jim3ma
Contributor

jim3ma commented Sep 3, 2023

The HBA driver does not touch model.dtb; only the disks add-on does that, after all disks have been recognized.

sd_probe in drivers/scsi/sd.c allocates the disk name and detects the disk type for all disks, including disks behind an HBA.

@PeterSuh-Q3
Author

I wanted to reference the code in your repo,
but searching for the keyword below returned nothing.
":= sd.o"

Screenshot 2023-09-04 9:09:39 AM

Is it right that you are only giving me guidance?
Did you make any special changes to the disks add-on for the SA6400?
It's the same as ARC's code, right?

I am a beginner who has compiled modules on Ubuntu only a few times.
I don't know if it's possible, but I'll give it a try following your instructions.
Looking at the Linux 4.4.302 kernel source,
I think sd.c, sd.h, and sd_dif.c need to be added.
I am not sure about sd_dif.c.

To resolve compilation errors, I may also get help from ChatGPT.
I would also appreciate help from this repo's collaborators.
Let's give it a try.

@PeterSuh-Q3
Author

Compilation succeeded with the new sd.c / sd.h / sd_dif.c added, and sd_mod.ko was produced as a result.
Is this the expected outcome?

Screenshot 2023-09-04 1:37:13 PM

@jim3ma
Contributor

jim3ma commented Sep 4, 2023

You should compile the HBA drivers so that they meet the requirements of sd_probe (drivers/scsi/sd.c): add syno_disk_type to the HBA drivers. You can find syno_disk_type in the Synology NAS GPL Source: https://archive.synology.com/download/ToolChain/Synology%20NAS%20GPL%20Source

@PeterSuh-Q3
Author

PeterSuh-Q3 commented Sep 4, 2023

As a model, I am referring to the repo below, which holds kernel 4.4.x source from four years ago.

I think it would be difficult to selectively refer to only the syno_disk_type you mentioned.
It is very difficult to get all the relevant parts without missing anything.
Are there any good tips?

https://github.com/wellfrogliu/Synology-MT7601u/tree/90ee0df49755fac621e2dc967adcf9fea189bfe2/kernel/linux-4.4.x/drivers/scsi

I imported the entire scsi directory and commented out some functions that caused compilation errors.
Presumably the .ko modules will be built only for the parts that are defined.

@jim3ma
Contributor

jim3ma commented Sep 5, 2023

https://global.synologydownload.com/download/ToolChain/Synology%20NAS%20GPL%20Source/7.1.1-42962/purley/linux-4.4.x.txz
Download this kernel source, and search for syno_port_type in drivers/scsi/mpt3sas.

@PeterSuh-Q3
Author

PeterSuh-Q3 commented Sep 5, 2023

Thank you. Referring to the toolchain source you provided and to
the syno_port_type usage that is already included in my code,
I will add it to drivers/scsi/mpt3sas.

https://github.com/PeterSuh-Q3/arpl-modules/blob/main/src/4.x/drivers/scsi/virtio_scsi.c#L811

I'm testing Gemini Lake.
I have already completed the first compilation using the guide you provided yesterday.
I don't know if it will work.

And, I found a declaration that is already in use in kernel 5.

https://github.com/PeterSuh-Q3/arpl-modules/blob/main/src/5.x/drivers/scsi/mpt3sas/mpt3sas_scsih.c#L10715

I think the same thing can be applied to kernel 4 code.
Is that right?

PeterSuh-Q3/arpl-modules@d910009

@jim3ma
Contributor

jim3ma commented Sep 5, 2023

Different platforms use different mechanisms; for example, the SA6400 only supports SYNO_PORT_TYPE_SATA. You must verify this yourself.

@PeterSuh-Q3
Author

I have modified the code under /drivers/scsi as per your instructions, as shown below.

Screenshot 2023-09-06 10:25:24 AM

Screenshot 2023-09-06 10:26:08 AM

Screenshot 2023-09-06 10:26:57 AM

Screenshot 2023-09-06 4:05:29 PM

The platform I am currently testing is Gemini Lake DS920+.
For the toolchain code, refer to the link below.

https://global.synologydownload.com/download/ToolChain/Synology%20NAS%20GPL%20Source/7.1.1-42962/geminilake/linux-4.4.x.txz

I am currently testing one Intel SSD on a Dell PERC H200 in IT mode.
I used the disks add-on from TCRP FRIEND as per your instructions, and
the log results are as follows.
(It looks like you used direct boot with ARPL. TCRP can switch to that as well, but for now I used the FRIEND kernel, which goes through the GNU kernel.)

messages.txt

linuxrc.syno.log

The disk does not appear in /sys/block.
What's the problem?

SynologyNAS> ll /sys/block
drwxr-xr-x 2 root root 0 Sep 6 06:55 .
dr-xr-xr-x 12 root root 0 Sep 6 06:55 ..
lrwxrwxrwx 1 root root 0 Sep 6 06:55 ram0 -> ../devices/virtual/block/ram0
lrwxrwxrwx 1 root root 0 Sep 6 06:55 ram1 -> ../devices/virtual/block/ram1
lrwxrwxrwx 1 root root 0 Sep 6 06:55 ram10 -> ../devices/virtual/block/ram10
lrwxrwxrwx 1 root root 0 Sep 6 06:55 ram11 -> ../devices/virtual/block/ram11
lrwxrwxrwx 1 root root 0 Sep 6 06:55 ram12 -> ../devices/virtual/block/ram12
lrwxrwxrwx 1 root root 0 Sep 6 06:55 ram13 -> ../devices/virtual/block/ram13
lrwxrwxrwx 1 root root 0 Sep 6 06:55 ram14 -> ../devices/virtual/block/ram14
lrwxrwxrwx 1 root root 0 Sep 6 06:55 ram15 -> ../devices/virtual/block/ram15
lrwxrwxrwx 1 root root 0 Sep 6 06:55 ram2 -> ../devices/virtual/block/ram2
lrwxrwxrwx 1 root root 0 Sep 6 06:55 ram3 -> ../devices/virtual/block/ram3
lrwxrwxrwx 1 root root 0 Sep 6 06:55 ram4 -> ../devices/virtual/block/ram4
lrwxrwxrwx 1 root root 0 Sep 6 06:55 ram5 -> ../devices/virtual/block/ram5
lrwxrwxrwx 1 root root 0 Sep 6 06:55 ram6 -> ../devices/virtual/block/ram6
lrwxrwxrwx 1 root root 0 Sep 6 06:55 ram7 -> ../devices/virtual/block/ram7
lrwxrwxrwx 1 root root 0 Sep 6 06:55 ram8 -> ../devices/virtual/block/ram8
lrwxrwxrwx 1 root root 0 Sep 6 06:55 ram9 -> ../devices/virtual/block/ram9

@jim3ma
Contributor

jim3ma commented Sep 6, 2023

Your messages.txt is missing some logs.
The current messages.txt contains an error: a general protection fault in mpt3sas.

@PeterSuh-Q3
Author

To see whether it would improve anything, I replaced the /drivers/scsi/mpt3sas subdirectory with geminilake's original, older version 09.102.00.00 and compiled it.
The result seems to be the same: a general protection fault occurs.

[ 50.618684] mpt3sas version 09.102.00.00 loaded
[ 51.023252] general protection fault: 0000 [#1] SMP

I did some Googling, and this kind of problem seems to be very tricky to analyze.
There are many such questions on Red Hat's site and elsewhere, but it is hard to find answers that include an actual analysis.
I also lack the skills to analyze the debugging output shown here and determine whether there is a memory problem.

Should we stop here?

I am attaching the full dmesg log again.

dmesg.txt

@jim3ma
Contributor

jim3ma commented Sep 7, 2023

The most likely cause is a difference in struct memory layout due to different CONFIG_* options.

sd_probe uses a pointer provided by the mpt3sas module. If the memory layout is not the same on both sides, sd_probe will access an invalid memory address and panic.
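To illustrate the layout argument, here is a minimal user-space sketch using Python's `ctypes`. It is not Synology kernel code: `make_struct`, `vendor_field`, and `data` are hypothetical names, and the extra member merely stands in for a field guarded by a CONFIG_* option. It shows that code built with the option and code built without it disagree on where the same member lives.

```python
import ctypes

def make_struct(name, with_vendor_field):
    """Build a toy struct; `vendor_field` stands in for a member guarded by a CONFIG_* option."""
    fields = [("id", ctypes.c_uint32)]
    if with_vendor_field:
        # Hypothetical guarded member, present only when the option is enabled.
        fields.append(("vendor_field", ctypes.c_uint64))
    # Pointer that the other side will later dereference.
    fields.append(("data", ctypes.c_void_p))
    return type(name, (ctypes.Structure,), {"_fields_": fields})

KernelView = make_struct("KernelView", with_vendor_field=True)   # option enabled
ModuleView = make_struct("ModuleView", with_vendor_field=False)  # option disabled

kernel_offset = KernelView.data.offset
module_offset = ModuleView.data.offset
print("offset of data (kernel build):", kernel_offset)
print("offset of data (module build):", module_offset)
print("sizes:", ctypes.sizeof(KernelView), ctypes.sizeof(ModuleView))
```

If the module fills in `data` at its own offset and the kernel reads it at the other offset, the kernel picks up whatever bytes happen to be there, which is the failure mode described above.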

[   51.023252] general protection fault: 0000 [#1] SMP
[   51.028243] Modules linked in: mpt3sas(OE+) raid_class(OE) scsi_transport_sas(OE) e1000e(OE) button(OE) fb fbdev i2c_algo_bit usb_storage xhci_pci xhci_hcd usbcore usb_common
[   51.044018] CPU: 2 PID: 5154 Comm: modprobe Tainted: G           OE   4.4.302+ #64570
[   51.051836] Hardware name: Gigabyte Technology Co., Ltd. Z87N-WIFI/Z87N-WIFI, BIOS F6 08/12/2014
[   51.060606] task: ffff88040aa7a700 ti: ffff8804056d4000 task.ti: ffff8804056d4000
[   51.068078] RIP: 0010:[<ffffffff813efb8d>]  [<ffffffff813efb8d>] syno_libata_info_enum.constprop.0+0x6d/0x100
[   51.078000] RSP: 0018:ffff8804056d7390  EFLAGS: 00010246
[   51.083305] RAX: 0000000000000000 RBX: ffff880407508000 RCX: 0000000000000000
[   51.090429] RDX: 0000000000000076 RSI: 0000000000000000 RDI: ffffffffa0594290
[   51.097553] RBP: ffff8804056d75b0 R08: 000000000001d760 R09: ffffffff812d4fcc
[   51.104678] R10: ffff88040b003540 R11: 0000000000000000 R12: ffffffffa0594290
[   51.111802] R13: ffff88040750878d R14: 7065725f6873756c R15: ffff880407ba400c
[   51.118926] FS:  00007fc6b3551740(0000) GS:ffff88041d300000(0000) knlGS:0000000000000000
[   51.127003] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   51.132741] CR2: 00007f1cc015cad8 CR3: 00000004076cd000 CR4: 00000000001606f0
[   51.139865] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   51.146989] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   51.154113] Stack:
[   51.156123]  0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   51.163576]  0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   51.171029]  0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   51.178484] Call Trace:
[   51.180930]  [<ffffffff813f0252>] sd_probe+0x632/0x670
[   51.186069]  [<ffffffff8106b7e6>] ? blocking_notifier_call_chain+0x86/0x90
[   51.192934]  [<ffffffffa00006c2>] sd_probe_shim+0x52/0x80 [redpill]
[   51.199200]  [<ffffffff813b17cb>] driver_probe_device+0x19b/0x280
[   51.205291]  [<ffffffff813b19b0>] __device_attach_driver+0x80/0xa0
[   51.211461]  [<ffffffff813b1930>] ? __driver_attach+0x80/0x80
[   51.217200]  [<ffffffff813af504>] bus_for_each_drv+0x64/0xa0
[   51.222858]  [<ffffffff813b13a2>] __device_attach+0xa2/0x120
[   51.228509]  [<ffffffff813b1a0e>] device_initial_probe+0xe/0x10
[   51.234419]  [<ffffffff813b0acd>] bus_probe_device+0x8d/0xa0
[   51.240069]  [<ffffffff813ae4df>] device_add+0x3ff/0x620
[   51.245377]  [<ffffffff813e3058>] scsi_sysfs_add_sdev+0x88/0x280
[   51.251381]  [<ffffffff813e058b>] scsi_probe_and_add_lun+0xdcb/0x10b0
[   51.257812]  [<ffffffff813e0bd3>] __scsi_scan_target+0xa3/0x5a0
[   51.263722]  [<ffffffff813ae95c>] ? device_create+0x3c/0x40
[   51.269288]  [<ffffffff813bc2a7>] ? __pm_runtime_resume+0x47/0x60
[   51.275380]  [<ffffffff813e1185>] scsi_scan_target+0xb5/0xc0
[   51.281031]  [<ffffffffa055d1d7>] sas_rphy_add+0x107/0x150 [scsi_transport_sas]
[   51.288336]  [<ffffffffa057c095>] mpt3sas_transport_port_add+0x255/0x910 [mpt3sas]
[   51.295894]  [<ffffffff810943fa>] ? vprintk_default+0x1a/0x20
[   51.301632]  [<ffffffff810a2363>] ? del_timer_sync+0x43/0x50
[   51.307286]  [<ffffffffa0585370>] scsih_scan_finished.cold+0x214/0x252 [mpt3sas]
[   51.314675]  [<ffffffff813e142f>] do_scsi_scan_host+0x6f/0xa0
[   51.320411]  [<ffffffff813e15cd>] scsi_scan_host+0x16d/0x190
[   51.326065]  [<ffffffffa0579c9a>] _scsih_probe+0x42a/0x560 [mpt3sas]
[   51.332418]  [<ffffffff8131a260>] pci_device_probe+0x90/0xf0
[   51.338076]  [<ffffffff813b17cb>] driver_probe_device+0x19b/0x280
[   51.344160]  [<ffffffff813b1929>] __driver_attach+0x79/0x80
[   51.349722]  [<ffffffff813b18b0>] ? driver_probe_device+0x280/0x280
[   51.355980]  [<ffffffff813af469>] bus_for_each_dev+0x69/0xa0
[   51.361632]  [<ffffffff813b1109>] driver_attach+0x19/0x20
[   51.367021]  [<ffffffff813b0d56>] bus_add_driver+0x116/0x1d0
[   51.372673]  [<ffffffffa059b000>] ? 0xffffffffa059b000
[   51.377805]  [<ffffffff813b216a>] driver_register+0x8a/0xe0
[   51.383368]  [<ffffffff81318d21>] __pci_register_driver+0x41/0x50
[   51.389455]  [<ffffffffa059b0c1>] _mpt3sas_init+0xc1/0xd1 [mpt3sas]
[   51.395718]  [<ffffffff81000347>] do_one_initcall+0x87/0x130
[   51.401369]  [<ffffffff810bac8b>] do_init_module+0x5b/0x200
[   51.406934]  [<ffffffff810bcc0a>] load_module+0x1d9a/0x2280
[   51.412497]  [<ffffffff810b9660>] ? symbol_put_addr+0x40/0x40
[   51.418236]  [<ffffffff8118120c>] ? kernel_read+0x3c/0x50
[   51.423635]  [<ffffffff810bd2a3>] SYSC_finit_module+0x73/0x90
[   51.429371]  [<ffffffff810bd2d9>] SyS_finit_module+0x9/0x10
[   51.434936]  [<ffffffff8158464a>] entry_SYSCALL_64_fastpath+0x1e/0x93
[   51.441365] Code: 00 00 4c 89 e7 49 89 f5 e8 21 f9 02 00 84 c0 75 0b 41 c7 85 34 04 00 00 01 00 00 00 4d 8b b4 24 c0 52 00 00 4c 8d ab 8d 07 00 00 <49> 8b 86 88 00 00 00 48 85 c0 74 13 48 8b 30 48 c7 c7 5c 3a 74
[   51.461316] RIP  [<ffffffff813efb8d>] syno_libata_info_enum.constprop.0+0x6d/0x100
[   51.468891]  RSP <ffff8804056d7390>
[   51.472383] ---[ end trace 20daaee2ec75c3f1 ]---

You can disassemble vmlinux and find this code in sd_probe. The message indicates that when the general protection fault occurred, the CPU was executing the instruction that starts at <49> 8b 86.
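One way to read the register dump in the log, offered as a sketch rather than a confirmed diagnosis: by my reading of the x86-64 encoding, the marked bytes `<49> 8b 86 88 00 00 00` decode to `mov rax, [r14+0x88]`, so the value in R14 is worth a look. The snippet below checks which registers hold a non-canonical address (assuming 48-bit virtual addressing) and whether such a value reads as ASCII text, which would suggest that string data was loaded where a pointer was expected. The helper names `is_canonical` and `as_ascii` are made up for this illustration; the register values are copied from the log.

```python
import re

def is_canonical(addr):
    """True if bits 63..47 are all equal (canonical form, assuming 48-bit virtual addresses)."""
    top = addr >> 47
    return top == 0 or top == 0x1FFFF

def as_ascii(addr):
    """Return the 8 register bytes (little-endian) as text, or None if any byte is not printable."""
    raw = addr.to_bytes(8, "little")
    if all(0x20 <= b < 0x7F for b in raw):
        return raw.decode("ascii")
    return None

# Register lines copied from the oops log.
oops_registers = """
RAX: 0000000000000000 RBX: ffff880407508000 RCX: 0000000000000000
RDX: 0000000000000076 RSI: 0000000000000000 RDI: ffffffffa0594290
RBP: ffff8804056d75b0 R08: 000000000001d760 R09: ffffffff812d4fcc
R10: ffff88040b003540 R11: 0000000000000000 R12: ffffffffa0594290
R13: ffff88040750878d R14: 7065725f6873756c R15: ffff880407ba400c
"""

regs = {name: int(value, 16)
        for name, value in re.findall(r"(R\w+): ([0-9a-f]{16})", oops_registers)}
suspects = {name: as_ascii(value)
            for name, value in regs.items() if not is_canonical(value)}
print(suspects)
```

For this log, the only register flagged is R14, and its bytes spell `lush_rep`. A non-canonical address in a memory operand raises a general protection fault on x86-64, so this is at least consistent with the layout-mismatch explanation.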

@PeterSuh-Q3
Author

https://github.com/PeterSuh-Q3/arpl-modules/blame/main/src/4.x/drivers/scsi/sd.c#L4666

In that line of code: CONFIG_SYNO_MULTIPATH_NATIVE_SAS_DEVICE_PREFIX
Do you think the definition is wrong?

@jim3ma
Contributor

jim3ma commented Sep 7, 2023

> https://github.com/PeterSuh-Q3/arpl-modules/blame/main/src/4.x/drivers/scsi/sd.c#L4666
>
> In that line of code: CONFIG_SYNO_MULTIPATH_NATIVE_SAS_DEVICE_PREFIX
> Do you think the definition is wrong?

I'm not sure.

scsi/sd.c is built into vmlinux. You should build your HBA modules so that they satisfy sd_probe in scsi/sd.c.
There are many MY_ABC_HERE and MY_DEF_HERE macros, which were originally CONFIG_SYNO_xxx. You should not enable all of the MY_ABC_HERE and MY_DEF_HERE macros when compiling mpt3sas.
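Since each guarded block has to be reviewed individually, it may help to first list where the placeholder macros occur in a driver file. The following is a small sketch under assumptions: `count_guards` is a hypothetical helper, and the sample text is invented for the demonstration, not taken from the Synology sources.

```python
import re
from collections import Counter

# Matches "#ifdef X", "#ifndef X", and "#if defined(X)" for the two placeholder macros.
GUARD = re.compile(
    r"^\s*#\s*(?:ifdef|ifndef|if\s+defined\s*\(?)\s*(MY_ABC_HERE|MY_DEF_HERE)\b",
    re.M,
)

def count_guards(source_text):
    """Count preprocessor guards per placeholder macro in a C source string."""
    return Counter(GUARD.findall(source_text))

# Invented sample; a real run would read a file such as one under drivers/scsi/mpt3sas.
sample = """\
#ifdef MY_ABC_HERE
	/* vendor-specific code */
#endif /* MY_ABC_HERE */
#if defined(MY_DEF_HERE)
	/* vendor-specific code */
#endif
#ifdef MY_ABC_HERE
#endif
"""

counts = count_guards(sample)
print(dict(counts))
```

The counts only show how many blocks need a decision; which ones to enable still has to be checked against what sd_probe in the shipped vmlinux expects.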

@PeterSuh-Q3
Author

And I've never done disassembly before, so I'll give it a try.

objcopy -O binary -R .note -R .comment -S vmlinux vmlinux.bin

objdump -D -b binary -m i386:x86-64 vmlinux.bin

@jim3ma
Contributor

jim3ma commented Sep 7, 2023

> And I've never done disassembly before, so I'll give it a try.
>
> objcopy -O binary -R .note -R .comment -S vmlinux vmlinux.bin
>
> objdump -D -b binary -m i386:x86-64 vmlinux.bin

target=/path/to/work
# bzImage-to-vmlinux.sh is in arpl
/path/to/arpl/files/board/arpl/overlayfs/opt/arpl/bzImage-to-vmlinux.sh \
  "$target/zImage" \
  "$target/vmlinux"

# https://github.com/marin-m/vmlinux-to-elf
/path/to/vmlinux-to-elf/vmlinux-to-elf "$target/vmlinux" "$target/vmlinux.elf"

After the conversion, you can use IDA Pro or another disassembler to analyze vmlinux.elf.

@PeterSuh-Q3
Author

PeterSuh-Q3 commented Sep 7, 2023

For the latest mpt3sas.ko build,
I recompiled by switching to MY_DEF_HERE instead of MY_ABC_HERE.

PeterSuh-Q3/arpl-modules@b7b4acd

I made the change because I think SYNO_DISK_SAS should be used in sd_probe.

I installed IDA Free 8.3 and opened the ELF file as shown below.
Do you think I can analyze it? ^^

vmlinux.elf.zip

Screenshot 2023-09-07 8:49:36 PM

@PeterSuh-Q3
Author

I compared it with the dmesg log of ds918+, which is operating normally.

Screenshot 2023-09-08 12:16:41 AM

@jim3ma
Contributor

jim3ma commented Sep 8, 2023

> I recompiled by switching to MY_DEF_HERE instead of MY_ABC_HERE.

I don't think that will work.

You should check the sizes of the structs that sd_probe uses, such as:

	struct device
	struct scsi_device
	struct scsi_disk
	struct gendisk

  1. Print sizeof() for each struct from your module build.
  2. Disassemble vmlinux and search for the code that allocates each struct, to find the size it passes.
  3. Confirm that the sizes are the same.
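The comparison in step 3 can be kept honest with a small script once both sets of numbers have been collected. This is only a sketch: `compare_sizes` is a hypothetical helper, and the sizes below are placeholders for illustration, NOT measured values from any kernel or module.

```python
def compare_sizes(module_sizes, kernel_sizes):
    """Return {struct: (module_size, kernel_size)} for every struct whose sizes differ or are missing."""
    mismatches = {}
    for name in sorted(set(module_sizes) | set(kernel_sizes)):
        module_size = module_sizes.get(name)
        kernel_size = kernel_sizes.get(name)
        if module_size != kernel_size:
            mismatches[name] = (module_size, kernel_size)
    return mismatches

# Placeholder numbers only: fill in sizeof() output from the module build (step 1)
# and the allocation sizes recovered from the vmlinux disassembly (step 2).
module_sizes = {"struct device": 100, "struct scsi_device": 200,
                "struct scsi_disk": 300, "struct gendisk": 400}
kernel_sizes = {"struct device": 100, "struct scsi_device": 208,
                "struct scsi_disk": 300, "struct gendisk": 400}

mismatches = compare_sizes(module_sizes, kernel_sizes)
for name, (module_size, kernel_size) in mismatches.items():
    print(f"{name}: module={module_size} kernel={kernel_size}")
```

Any struct reported here points at a CONFIG_* or MY_*_HERE difference worth tracking down. Equal sizes do not prove identical layouts, so member offsets may still need checking.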
