NAS-137494 / None / Update to 6.18 LTS #232
This commit adds TrueNAS build customization required for building Debian packages for TrueNAS SCALE kernel. The original commit ported from v6.6 is 0c5b36a. Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Support for alternate datastreams over the SMB protocol has been historically enabled in such a way that Samba writes them as filesystem extended attributes in the user namespace. FreeBSD has no practical limit on xattr size, and so clients (often MacOS) may write ones that exceed the 64 KiB limit imposed by the Linux kernel. Since XATTR_SIZE_MAX is used in many places in the kernel, and not all filesystems support large xattrs, introduce a new constant XATTR_LARGE_SIZE_MAX that is used as an alternate value if the filesystem sb_flags has SB_LARGEXATTR. There will be a corresponding commit in ZFS to set this flag when it is defined and xattrs are enabled on the ZFS dataset. This commit also introduces the flag SB_NFSV4ACL, which will be used to indicate and enable NFSv4-specific behavior in the kernel with regard to permissions. These new features / alternate behaviors are controlled by the compile-time kernel configuration flag CONFIG_TRUENAS, which defaults to n (off). In principle, TrueNAS-specific changes that deviate from a vanilla Linux kernel can be removed for testing purposes by setting CONFIG_TRUENAS=n in the relevant build scripts. Signed-off-by: Andrew Walker <awalker@ixsystems.com>
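The limit selection described in this commit message can be pictured with a small Python model. This is only a sketch: `XATTR_SIZE_MAX` is the stock 64 KiB limit, while the bit used for `SB_LARGEXATTR` and the value of `XATTR_LARGE_SIZE_MAX` are placeholders, since the commit message does not state them.

```python
XATTR_SIZE_MAX = 65536  # 64 KiB, the stock Linux limit

# Placeholder values: neither the flag bit nor the large limit is given
# in the commit message, so these are assumptions for illustration.
SB_LARGEXATTR = 1 << 0
XATTR_LARGE_SIZE_MAX = 2 * 1024 * 1024


def xattr_size_limit(sb_flags: int, config_truenas: bool = False) -> int:
    """Return the largest xattr value size accepted for a superblock."""
    # The larger limit applies only when the kernel was built with
    # CONFIG_TRUENAS and the filesystem opted in via its sb_flags.
    if config_truenas and (sb_flags & SB_LARGEXATTR):
        return XATTR_LARGE_SIZE_MAX
    return XATTR_SIZE_MAX


def xattr_size_ok(size: int, sb_flags: int, config_truenas: bool = False) -> bool:
    """Check a proposed xattr value size against the applicable limit."""
    return size <= xattr_size_limit(sb_flags, config_truenas)
```

With `config_truenas=False` the model always falls back to `XATTR_SIZE_MAX`, which mirrors the statement that the TrueNAS-specific behavior disappears when `CONFIG_TRUENAS=n`.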
There are various places in which evaluation of permissions in the presence of an NFSv4 ACL is more nuanced than what is typical when evaluating traditional POSIX permissions. For example, a user may be permitted to delete a file if he has DELETE permissions on the file or DELETE_CHILD permissions on the parent directory. Traditional POSIX permissions will only check for MAY_WRITE | MAY_EXEC on the parent directory. Several new inode permission masks have been added to facilitate these NFSv4-specific checks, corresponding to different NFSv4 permissions that grant abilities to make changes to files. For the purpose of this commit and the goal of providing a rough approximation of NFSv4 access checks, only write (and not read) access checks have been implemented. This is selectively done in a way to grant minimal compliance with permissions as defined in RFC-5661. The new permissions-related behavior is only applied when the inode sb_flag SB_NFS4ACL is present. In this case, the onus of full implementation of requisite features to satisfy the ACL behavior specified in RFC-5661 is delegated to the filesystem's inode permissions interface (i_op->permission). If possible we try to check for the conventional POSIX permission first before trying the NFSv4 equivalent. For example, when writing an xattr, we check for WRITE_DATA before WRITE_NAMED_ATTRS because in the case of the former with a trivial ACL we can avoid having to evaluate the full ACL, and instead merely look at the POSIX mode.
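The delete example above can be sketched as a toy Python model. It is not the kernel code: `MAY_DELETE` and `MAY_DELETE_CHILD` are hypothetical mask names and values, deny entries are ignored, and the real decision is delegated to the filesystem's `i_op->permission`.

```python
from dataclasses import dataclass

MAY_EXEC = 0x1
MAY_WRITE = 0x2
# Hypothetical mask bits: the commit message does not give the names or
# values of the new NFSv4-specific inode permission masks.
MAY_DELETE = 0x100
MAY_DELETE_CHILD = 0x200


@dataclass
class Inode:
    granted: int  # permission bits the caller holds on this inode


def has(inode: Inode, mask: int) -> bool:
    """True if every bit in mask is granted on the inode."""
    return (inode.granted & mask) == mask


def may_delete(parent: Inode, victim: Inode, nfs4acl: bool) -> bool:
    # Conventional POSIX check first: write + search on the parent directory.
    if has(parent, MAY_WRITE | MAY_EXEC):
        return True
    if not nfs4acl:
        return False
    # NFSv4 alternative: DELETE on the file or DELETE_CHILD on the parent.
    return has(victim, MAY_DELETE) or has(parent, MAY_DELETE_CHILD)
```

The ordering follows the stated preference for trying the cheaper POSIX-style check before the NFSv4-specific one.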
csiostor seems to cause Chelsio T6 firmware to crash. Jira: NAS-110910
When anything is written to it, the write waits for all device probes to complete before returning. After that, `udevadm settle`, as used by the ZFS scripts, can really provide the system with all disks detected for boot pool import. Ticket: NAS-108200
Enable NTB and NTB tools in the TrueNAS config. In addition, enable the Intel NTB driver, so that we have at least one NTB driver available.
Added initial support for PLX Non-Transparent Bridge.
Before this change it was impossible to load client modules before NTB hardware is probed. This change removes the limitation. New NTB transports will get their child devices as they come in.
This fixes interrupt storms on hardware using legacy level-triggered interrupts, since doorbell processing could take time after interrupt handler completion, that triggered extra interrupts in a loop.
If a previous successful run is present, then skip re-run for its pull request. Signed-off-by: Umer Saleem <usaleem@ixsystems.com>
Previously for TrueNAS, the Debian Linux kernel configuration was used and TrueNAS config options were added on top of that. Because of that, the TrueNAS kernel config in 'scripts/package/truenas/tn.config' has grown very large, and the TrueNAS-only options have become difficult to manage. The Debian Linux kernel configuration for version 6.1.55 has been added as 'debian_amd64.config' to keep the options from Debian separate from TrueNAS options. TrueNAS-only config options are stored in 'truenas.config'. Kernel configuration can now be generated as:
1) make ARCH=x86_64 defconfig
2) ./scripts/kconfig/merge_config.sh .config \
   ./scripts/package/truenas/debian_amd64.config
3) ./scripts/kconfig/merge_config.sh .config \
   ./scripts/package/truenas/truenas.config
4) ./scripts/kconfig/merge_config.sh .config \
   ./scripts/package/truenas/debug.config
Signed-off-by: Umer Saleem <usaleem@ixsystems.com>
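The layering in the steps above relies on later fragments overriding earlier ones. A minimal Python model of that last-wins behavior is sketched below; it only parses `CONFIG_X=value` and `# CONFIG_X is not set` lines and does not model the Kconfig dependency resolution that the real `merge_config.sh` run performs. The option values in the usage example are made up.

```python
NOT_SET_SUFFIX = " is not set"


def parse_fragment(text: str) -> dict:
    """Parse Kconfig fragment text into a {symbol: value} mapping."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(NOT_SET_SUFFIX):
            # "# CONFIG_FOO is not set" explicitly disables CONFIG_FOO.
            options[line[2:-len(NOT_SET_SUFFIX)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            symbol, value = line.split("=", 1)
            options[symbol] = value
    return options


def merge_fragments(*fragments: str) -> dict:
    """Merge fragments in order; a later fragment overrides an earlier one."""
    merged = {}
    for fragment in fragments:
        merged.update(parse_fragment(fragment))
    return merged


# Made-up fragment contents, for illustration only.
debian = "CONFIG_NTB=m\nCONFIG_WLAN=y\n"
truenas = "CONFIG_NTB=y\n# CONFIG_WLAN is not set\nCONFIG_TRUENAS=y\n"
merged = merge_fragments(debian, truenas)
```

Keeping the Debian baseline and the TrueNAS overrides in separate fragments means the second argument stays small and reviewable, which is the motivation the commit message gives.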
Without this it was impossible to use multiple NTB consumer drivers, since the kernel just attached the first one to all NTBs. This replicates the driver_override device attribute from PCI, plus adds a module parameter to ease default setting. Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Previously ntb_transport required at least 6 scratchpad registers, plus 2 more for each additional memory window. That is too much for some configurations, where several drivers have to share resources of the same NTB hardware. This patch introduces a new compact version of the protocol, requiring only 3 scratchpad registers, plus one more for each additional memory window. The optimization is based on the fact that none of the version, the number of windows, or the number of queue pairs really needs more than one byte, and window sizes of 4GB are not very useful now. The new protocol is activated automatically when the configuration is low on scratchpad registers, or it can be activated explicitly with a module parameter.
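The register arithmetic above, and the idea of packing three one-byte fields into a single scratchpad word, can be sketched in Python. The counts come straight from the commit message; the bit layout in `pack_header` is a hypothetical illustration, not the driver's actual register format.

```python
def spads_legacy(num_mws: int) -> int:
    """Scratchpads for the original protocol: 6, plus 2 per extra window."""
    if num_mws < 1:
        raise ValueError("at least one memory window is required")
    return 6 + 2 * (num_mws - 1)


def spads_compact(num_mws: int) -> int:
    """Scratchpads for the compact protocol: 3, plus 1 per extra window."""
    if num_mws < 1:
        raise ValueError("at least one memory window is required")
    return 3 + (num_mws - 1)


def pack_header(version: int, num_mws: int, num_qps: int) -> int:
    """Pack three one-byte fields into one 32-bit word (assumed layout)."""
    for field in (version, num_mws, num_qps):
        if not 0 <= field <= 0xFF:
            raise ValueError("each field must fit in one byte")
    return (version << 16) | (num_mws << 8) | num_qps


def unpack_header(word: int) -> tuple:
    """Inverse of pack_header: return (version, num_mws, num_qps)."""
    return (word >> 16) & 0xFF, (word >> 8) & 0xFF, word & 0xFF
```

For three memory windows this gives 10 scratchpads under the original protocol and 5 under the compact one.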
It attaches as a client to some NTB device and provides several other NTB devices, splitting NTB resources according to the config module parameter. For example, one BAR can be dedicated to remote memory access, while other resources can be used for packet transport for a virtual Ethernet interface.
Mostly complete. AHCI controllers supporting enclosure management will create a virtual SES logical unit on a dedicated SCSI host, allowing control/status of EM LEDs. A link to the device in a given slot is created for mapping the slot to a device:
```
root@trunasmini:~# strings /sys/class/enclosure/*/Slot*/device/inquiry
ATA Samsung SSD 860 1B6Q
ATA Samsung SSD 840 6B0Q
ATA SanDisk SD6SB1M0600
ATA SanDisk SDSSDHII00RL
root@trunasmini:~# ls /sys/class/enclosure/*/Slot*/device/block/
'/sys/class/enclosure/4:0:0:0/Slot 00/device/block/':
sda
'/sys/class/enclosure/4:0:0:0/Slot 01/device/block/':
sdb
'/sys/class/enclosure/4:0:0:0/Slot 02/device/block/':
sdc
'/sys/class/enclosure/4:0:0:0/Slot 03/device/block/':
sdd
root@trunasmini:~# ls /sys/block/*/device/enclosure_device:*
'/sys/block/sda/device/enclosure_device:Slot 00':
active device fault locate power power_status slot status type uevent
'/sys/block/sdb/device/enclosure_device:Slot 01':
active device fault locate power power_status slot status type uevent
'/sys/block/sdc/device/enclosure_device:Slot 02':
active device fault locate power power_status slot status type uevent
'/sys/block/sdd/device/enclosure_device:Slot 03':
active device fault locate power power_status slot status type uevent
```
Tasks left for future work:
* avoid the layering violation in the SES driver
* provide an option for the tunable to force SES emulation without LED support
* make use of ALLOCATION LENGTH
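The slot-to-disk mapping shown in the listing can also be read programmatically. The following Python sketch walks the same `/sys/class/enclosure/*/Slot*/device/block/` layout; the function name and the `sysfs_root` parameter are assumptions added here so the walk can be pointed at a test directory.

```python
from pathlib import Path


def slot_to_block_devices(sysfs_root: str = "/sys") -> dict:
    """Map (enclosure, slot) to the block device names found in that slot."""
    mapping = {}
    enclosure_dir = Path(sysfs_root) / "class" / "enclosure"
    for block_dir in sorted(enclosure_dir.glob("*/Slot*/device/block")):
        slot_dir = block_dir.parent.parent   # .../<enclosure>/<slot>
        enclosure = slot_dir.parent.name     # e.g. "4:0:0:0"
        # Each entry under block/ is named after a block device, e.g. "sda".
        mapping[(enclosure, slot_dir.name)] = sorted(
            entry.name for entry in block_dir.iterdir()
        )
    return mapping
```

On the machine from the listing this would be expected to pair `("4:0:0:0", "Slot 00")` with `["sda"]`, and so on for the other three slots.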
Currently, CI runs for Pull Requests are skipped only if a successful Push CI run is found. This implies that the Push CI run has to complete before the Pull Request is opened; otherwise the Pull Request CI will also start in parallel with the Push CI run. If a Push CI run is already in progress when a Pull Request is opened, concurrent skipping allows the Pull Request CI run to be skipped completely. 'same_content_newer' ensures that at least one of Push or Pull Request is run. If a new commit is added after opening a Pull Request, or in case of a force push, at least one of the Pull Request or Push CI runs will be performed. Signed-off-by: Umer Saleem <usaleem@ixsystems.com>
Debian set this to med_performance_with_dipm, which causes issues with some hardware. Set it to max_performance instead. Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
First, Linux expects the NVDIMM _DSM method with an unknown UUID to return zero, meaning no functions supported. Supermicro, though, returns 1, meaning an error has happened. It makes Linux think that the NVDIMM supports all command sets, selecting Intel as the first on the list. Work around it by checking actual function bits instead of bit 1, same as we are doing on FreeBSD. Second, Linux checks for the presence of the _LSI/_LSR/_LSW methods to detect an NVDIMM's support for labels. Supermicro does have those methods, but they just always return an error status. It makes Linux disable the NVDIMM after a failed attempt to read its label during the probe. Work around it by actually calling the _LSI method once and checking the returned status. Together those fix `ndctl list -D --health` for NVDIMMs implementing the Microsoft command set on the specified motherboards.
Children drivers must call their ntb_transport_free_queue() before we clean up what's left (nothing should). Otherwise we get double free.
Otherwise on the next attach we get a warning message that the directory already exists.
Disable init_on_alloc to improve ZFS performance. ZFS allocates and frees pages frequently, and init_on_alloc zeroes out the pages during allocation. Disabling init_on_alloc should improve ZFS performance. Signed-off-by: Umer Saleem <usaleem@ixsystems.com>
* scsi: core: sd: Add genhd_hidden flag. This flag will permit SCSI low level drivers to set the gendisk GENHD_FL_HIDDEN flag, resulting in a gendisk that can be used for I/O submission inside the kernel, but which is not registered as a user facing block device.
* scsi: iscsi_tcp: Hide logged-in HA peer node iSCSI targets from local access
This commit adds the support for production and debug kernels for SCALE.
Following is a summary of changes:
- tn-production.config has been added. This fragment will be used to
disable unnecessary debug options that are not required in production.
- tn-debug.config has been added. This fragment will enable extra
debug options to aid debugging during boot and runtime.
- Unnecessary hardware options like WiFi, Bluetooth, NFC etc. have
been disabled for both production and debug kernels.
- The packaging bits in kernel have been updated to include the extra
version string in package name so that the packages for debug and
production kernels can be uniquely identified.
- A new changelog entry will be generated based on the updated package
name for production and debug kernels.
- EXTRAVERSION variable can be configured from the environment.
`export EXTRAVERSION=-production` would build the packages for
production kernel. This should be the first step to build the kernel
packages moving forward.
- GitHub Actions are updated to build both production and debug
kernels.
A minor update has been made here while porting from 6.1: the EXTRAVERSION
variable is set to '-production'. So by default, the production kernel will
be built and we don't need to export the EXTRAVERSION variable while
building the production kernel.
Signed-off-by: Umer Saleem <usaleem@ixsystems.com>
The SES device of the enclosure reports the disk's SAS address in Additional Element Status a few seconds after the disk is detected by the SES driver. The periodic polling of enclosures allows an enclosure to be added in case of lost events.
Allow userspace (root) access to all IO memory. /dev/mem allows userspace access to memory ranges that may or may not be actively used by a driver. Required for the plx_eeprom utility. Signed-off-by: Umer Saleem <usaleem@ixsystems.com>
Unlike with SAS disks, the Linux kernel does not update the size of SATA disks immediately unless a power cycle is done or the disk is revalidated (due to some error condition). In this patch, we refresh the sector count by revalidating the disk in response to the ATA sector update command. Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Change concurrent_skipping from 'same_content_newer' to 'outdated_runs'. Previously, when pushing multiple commits in quick succession, the newer workflow run would be skipped while the older one continued. This is backwards for development, i.e., we want to test the latest code and cancel the outdated runs. Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Add more aggressive cleanup of unused system directories to free up additional disk space. The 6.18 kernel build was requiring more space due to significantly larger build artifacts compared to 6.12. Added cleanup of Android SDK, Swift, GHCup directories, Docker images, and APT cache to provide several additional gigabytes of free space. Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Debian disabled this in 6.18 as deprecated, breaking API test test_300_nfs test_state_directory which requires the /proc/fs/nfsd/nfsv4recoverydir interface. Re-enable for API test compatibility. Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Kernel 9ba817f deprecated automounting tracefs in debugfs, causing "Automounting of tracing to debugfs is deprecated and will be removed in 2030" warnings whenever /sys/kernel/debug/tracing is accessed. This warning spams kernel logs during normal operation. Tools should use /sys/kernel/tracing instead of /sys/kernel/debug/tracing. Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Import configuration from Debian's linux-config-6.18_6.18~rc7-1~exp1_amd64.deb package. At the time of the merge, 6.18-rc7 is the most recent kernel config available in Debian's package repository (https://ftp.debian.org/debian/pool/main/l/linux/). Will update to the final 6.18 release config once available. Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
yocalebo
left a comment
I trust you've done as much testing as humanly possible
mgrimesix
left a comment
I've opened https://ixsystems.atlassian.net/browse/NAS-138841 for fixing the CONFIG_NFSD_LEGACY_CLIENT_TRACKING=y config addition.
Ran the sharing protocols tests with the build. Results here.
iSCSI and NVMe-oF look fine, so I will approve. (Also spun up a VM to ensure HA internode iSCSI disk suppression was functional).
However there seem to be some SMB client regressions.
Thanks @bmeagherix for running the sharing protocol tests. All the SMB client tests were affected by the 6.18 merge. The issue was reliably reproducible on the 6.18 client side: we get a 28-byte entry instead of the full 48-88 byte entry, depending on SID mapping. After digging deeper, I found that ad9364a caused the regression. I have added a fix according to my understanding of the issue: f1a88a6. But it needs deeper eyes from @anodos325 to make sure this is the right approach for handling the upstream change.
Triggered the following test runs. Will merge once the test runs complete with expected results:
Upstream Commit ad9364a changed get_acl() to request only the security information specified in extra_info parameter, instead of always including OWNER|GROUP|DACL. The TrueNAS system.nfs4_acl_xdr handler was passing 0 as extra_info, causing the server to return an empty Security Descriptor with dacloffset == 0. This triggered the null ACL generation path, returning a single everyone@ entry (28 bytes) instead of the actual ACL entries from the server. Fix by explicitly requesting OWNER_SECINFO | GROUP_SECINFO | DACL_SECINFO in the XATTR_ZFSACL case, ensuring the full Security Descriptor is retrieved. Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
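The failure mode described in this commit message can be illustrated with a toy Python model. It is not the cifs client code: the functions and the sample descriptor are hypothetical, and the bit values are the standard SMB security-information bits as I understand them.

```python
OWNER_SECINFO = 0x1
GROUP_SECINFO = 0x2
DACL_SECINFO = 0x4

# Hypothetical descriptor held by the toy "server".
FULL_SD = {
    "owner": "owner-sid",
    "group": "group-sid",
    "dacl": ["owner@", "group@", "everyone@"],
}


def query_security_descriptor(full_sd: dict, extra_info: int) -> dict:
    """Return only the parts of the descriptor that were requested."""
    sd = {}
    if extra_info & OWNER_SECINFO:
        sd["owner"] = full_sd["owner"]
    if extra_info & GROUP_SECINFO:
        sd["group"] = full_sd["group"]
    if extra_info & DACL_SECINFO:
        sd["dacl"] = full_sd["dacl"]
    return sd


def acl_entries(sd: dict) -> list:
    """Without a DACL, fall back to a single everyone@ entry (null ACL path)."""
    if "dacl" not in sd:
        return ["everyone@"]
    return sd["dacl"]


# Passing 0 requests nothing, so the null-ACL fallback is taken.
before_fix = acl_entries(query_security_descriptor(FULL_SD, 0))
# Requesting owner, group and DACL explicitly returns the real entries.
after_fix = acl_entries(
    query_security_descriptor(FULL_SD, OWNER_SECINFO | GROUP_SECINFO | DACL_SECINFO)
)
```

In this model `before_fix` collapses to the single fallback entry while `after_fix` carries all three entries, which parallels the 28-byte versus full-size result reported in the thread.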
Fix mount tree root detection to handle kernel 6.14+ parent_id changes discovered during the 6.18 kernel port.

The middleware hardcoded a check for `parent_id==1` to identify the root filesystem in `/proc/self/mountinfo`. This breaks with kernel 6.14+, which changed mount ID allocation from `IDA` (starting at 0) to xarray with `XA_LIMIT(1,INT_MAX)` (starting at 1), shifting early boot mount IDs:
- `shmem_init`: 0 → 1
- `init_mount_tree` (initial ramfs): 1 → 2

This causes the root filesystem's parent_id to reference mount ID 2 instead of 1, resulting in `KeyError` when parsing the mount tree. Per [proc_pid_mountinfo(5)](https://man7.org/linux/man-pages/man5/proc_pid_mountinfo.5.html), the root mount may have a parent that lies outside the visible tree and won't appear in mountinfo.

The fix identifies root as the mount with the smallest `parent_id` (referencing the earliest boot mount) rather than hardcoding `parent_id==1`. This resolves SMB share creation failures and failures in other services that validate mount information during operation.

Kernel commit (Regression): torvalds/linux@7f9bfafc5f49
**Kernel Update PR:** truenas/linux#232
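The root-detection approach described above can be sketched in a few lines of Python. This is not the middleware's actual code: the function names are hypothetical and the sample `mountinfo` text is made up, but the field positions follow proc_pid_mountinfo(5) (mount ID, parent ID, major:minor, root, mount point).

```python
def parse_mountinfo(text: str) -> dict:
    """Parse mountinfo text into {mount_id: mount record}."""
    mounts = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        mount_id, parent_id = int(fields[0]), int(fields[1])
        mounts[mount_id] = {
            "mount_id": mount_id,
            "parent_id": parent_id,
            "mountpoint": fields[4],
        }
    return mounts


def find_root(mounts: dict) -> dict:
    """Pick the mount with the smallest parent_id instead of assuming 1."""
    return min(mounts.values(), key=lambda mount: mount["parent_id"])


# Made-up sample in which the root's parent (ID 2) is not listed.
SAMPLE = """\
30 2 0:27 / / rw,relatime shared:1 - zfs boot-pool/ROOT/default rw
31 30 0:5 / /dev rw,nosuid shared:2 - devtmpfs devtmpfs rw
32 30 0:22 / /proc rw,nosuid shared:3 - proc proc rw
"""

mounts = parse_mountinfo(SAMPLE)
root = find_root(mounts)
```

Looking up `mounts[root["parent_id"]]` on this sample would raise `KeyError`, which is the failure the hardcoded check ran into; selecting by smallest `parent_id` avoids needing the parent record at all.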
Not updating JIRA ticket https://ixsystems.atlassian.net/browse/NAS-137494 target versions as no JIRA version corresponds to this PR
This PR has been merged and conversations have been locked.
Description
Rebased all TrueNAS patches from truenas/linux-6.12 onto Linux v6.18 release tag. Discarded merge commits, changelog updates, and kernel version bump commits. Several patches were already upstreamed and excluded from this rebase.
Patches with merge conflicts resolved:
New commits for 6.18 compatibility:
- libdw-dev)
- `CONFIG_TRUENAS` macro usage to `#ifdef` throughout codebase
- `CONFIG_NFSD_LEGACY_CLIENT_TRACKING` for API test compatibility (Debian disabled in 6.18)
- `CONFIG_TRACEFS_AUTOMOUNT_DEPRECATED` to suppress kernel warning spam
Related PRs
Testing
`test_300_nfs` `test_state_directory` failure resolved by enabling `CONFIG_NFSD_LEGACY_CLIENT_TRACKING`