@ixhamza ixhamza commented Dec 8, 2025

Description

Rebased all TrueNAS patches from truenas/linux-6.12 onto Linux v6.18 release tag. Discarded merge commits, changelog updates, and kernel version bump commits. Several patches were already upstreamed and excluded from this rebase.

Patches with merge conflicts resolved:

| Commit | Description | Conflicts |
| --- | --- | --- |
| 3ac7142 | efi: Add an EFI_SECURE_BOOT flag to indicate secure boot mode | include/linux/efi.h |
| 57ac602 | block: Introduce queue limits and sysfs for copy-offload support | block/blk-settings.c, block/blk-sysfs.c |
| 35c569e | disable multiple device (md) driver | scripts/package/truenas/truenas.config |
| eba7974 | NVME Encapsulation support for Trimode HBA | drivers/scsi/mpt3sas/mpt3sas_ctl.h |
| 7d500ea | fs/cifs - add ZFS ACL support to SMB client | fs/smb/client/cifs_debug.c |
| 6d72bc4 | Add DACL support to nfsd (v4.1+) | fs/nfsd/nfs4xdr.c |
| 8a3c5b3 | nvme: skip optional id ctrl csi for versions less than 2.0.0 | drivers/nvme/host/core.c |
| ea6a273 | ixgbe: Print fw warning message only once | drivers/net/ethernet/intel/ixgbe/ |
| 49dc703 | Build production and debug kernels for SCALE | scripts/package/mkdebian |
| 998ce51 | truenas-scale: Add NTB to default tn.config | arch/x86/Kconfig |
| e1969f6 | Add initial support for large xattrs | fs/xattr.c |
| 46e77e3 | Add TrueNAS Debian build customizations | Makefile, scripts/package/ |

New commits for 6.18 compatibility:

  • cbca239: Imported Debian v6.18-rc7 configuration (will update to v6.18 release config once published by Debian)
  • CI improvements:
    • 3708feb: Added aggressive disk cleanup for larger 6.18 build artifacts
    • afd59a7: Updated build dependencies (libdw-dev)
    • c39b563: Enabled workflow cancellation for outdated runs
  • 11b7437: Standardized CONFIG_TRUENAS macro usage to #ifdef throughout codebase
  • eb7dc66: Enabled CONFIG_NFSD_LEGACY_CLIENT_TRACKING for API test compatibility (Debian disabled in 6.18)
  • 8d5020f: Disabled CONFIG_TRACEFS_AUTOMOUNT_DEPRECATED to suppress kernel warning spam
  • 6a0b876: Fix Samba Client regression with kernel 6.18+

Related PRs

  • scale-build:
    • Switch kernel branch from truenas/linux-6.12 to truenas/linux-6.18
    • Upgrade NVIDIA driver from 580 to 590 for 6.18 compatibility
    • Add zstd package for kernel compression
  • middleware:
    • Fix mount tree parsing for kernel 6.18 mount ID allocation changes
  • ZFS:
    • Update META to enable 6.18 kernel compatibility

Testing

  • Scale Build
  • API Tests
  • CI Testing: Production and debug kernel builds complete successfully
  • Boot testing: Confirmed successful boot on KVM with no kernel regressions
  • NVIDIA driver 590.44.01 compiles successfully on 6.18
  • Middleware mount tree parsing tested on both 6.12 and 6.18
  • API test test_300_nfs test_state_directory failure resolved by enabling CONFIG_NFSD_LEGACY_CLIENT_TRACKING

ixhamza and others added 30 commits December 1, 2025 15:42
This commit adds TrueNAS build customization required for building
Debian packages for TrueNAS SCALE kernel.

The original commit ported from v6.6 is
0c5b36a.

Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Support for alternate datastreams over the SMB protocol has been
historically enabled in such a way that Samba writes them as
filesystem extended attributes in the user namespace. FreeBSD has no
practical limit on xattr size, and so clients (often MacOS) may write
ones that exceed the 64 KiB limit imposed by the Linux kernel. Since
XATTR_SIZE_MAX is used in many places in the kernel, and not all
filesystems support large xattrs, introduce a new constant
XATTR_LARGE_SIZE_MAX that is used as an alternate value if the
filesystem sb_flags has SB_LARGEXATTR. There will be a corresponding
commit in ZFS to set this flag when it is defined and xattrs are
enabled on the ZFS dataset.
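
The limit selection described above can be modelled roughly as follows. This is only an illustrative Python sketch of kernel-side C logic: the value of `XATTR_LARGE_SIZE_MAX` and the bit used for `SB_LARGEXATTR` are placeholders chosen for the example, not the values from the patch, and the helper names are hypothetical.

```python
# Illustrative model only; the real check lives in kernel C code.
XATTR_SIZE_MAX = 65536                   # stock Linux limit (64 KiB)
XATTR_LARGE_SIZE_MAX = 2 * 1024 * 1024   # placeholder value, not from the patch
SB_LARGEXATTR = 1 << 0                   # placeholder bit, not from the patch


def xattr_size_limit(sb_flags: int) -> int:
    """Pick the xattr size limit that applies to a superblock."""
    if sb_flags & SB_LARGEXATTR:
        return XATTR_LARGE_SIZE_MAX
    return XATTR_SIZE_MAX


def xattr_write_allowed(sb_flags: int, size: int) -> bool:
    """True if a value of `size` bytes fits under the applicable limit."""
    return size <= xattr_size_limit(sb_flags)


if __name__ == "__main__":
    # A 100000-byte stream is rejected unless the filesystem opts in.
    print(xattr_write_allowed(0, 100_000))
    print(xattr_write_allowed(SB_LARGEXATTR, 100_000))
```

Filesystems that never set the flag keep the stock 64 KiB behavior, which is why the existing `XATTR_SIZE_MAX` users do not need to change.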

This commit also introduces flag SB_NFSV4ACL which will be used
to indicate and enable NFSv4-specific behavior in kernel with regard
to permissions.

These new features / alternate behavior are controlled by the
compile-time kernel compilation flag CONFIG_TRUENAS, which defaults
to n (off). In principle, TrueNAS-specific changes that deviate from
a vanilla Linux kernel can be removed for testing purposes by changing
CONFIG_TRUENAS=n in the relevant build scripts.

Signed-off-by: Andrew Walker <awalker@ixsystems.com>
There are various places in which evaluation of permissions
in the presence of an NFSv4 ACL is more nuanced than what is
typical when evaluating traditional POSIX permissions. For
example, a user may be permitted to delete a file if he
has DELETE permissions on the file or DELETE_CHILD permissions
on the parent directory. Traditional POSIX permissions will
only check for MAY_WRITE | MAY_EXEC on parent directory.
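
A small Python model of the two delete checks contrasted above may help. It is a sketch, not the kernel implementation: `MAY_EXEC` and `MAY_WRITE` use the usual kernel mask values, while the `DELETE` and `DELETE_CHILD` bits and both function names are hypothetical.

```python
from enum import IntFlag


class Perm(IntFlag):
    MAY_EXEC = 0x1
    MAY_WRITE = 0x2
    DELETE = 0x100         # hypothetical bit for illustration
    DELETE_CHILD = 0x200   # hypothetical bit for illustration


def posix_may_delete(parent_perms: Perm) -> bool:
    """POSIX: needs both write and search permission on the parent."""
    needed = Perm.MAY_WRITE | Perm.MAY_EXEC
    return (parent_perms & needed) == needed


def nfs4_may_delete(file_perms: Perm, parent_perms: Perm) -> bool:
    """NFSv4: DELETE on the file or DELETE_CHILD on the parent suffices."""
    return bool(file_perms & Perm.DELETE) or bool(parent_perms & Perm.DELETE_CHILD)


if __name__ == "__main__":
    no_perms = Perm(0)
    # DELETE on the file alone is enough under the NFSv4 rule.
    print(nfs4_may_delete(Perm.DELETE, no_perms))
    # The POSIX rule ignores the file and looks only at the parent.
    print(posix_may_delete(Perm.MAY_WRITE))
```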

Several new inode permissions masks have been added to facilitate these
NFSv4-specific checks corresponding to different NFSv4 permissions
that grant abilities to make changes to files. For the purpose of
this commit and the goal of providing a rough approximation of
NFSv4 access checks, only write (and not read) access checks have
been implemented. This is selectively done in a way to grant
minimal compliance with permissions as defined in RFC-5661.

The new permissions-related behavior is only applied when the
inode sb_flag SB_NFS4ACL is present. In this case, the onus of full
implementation of requisite features to satisfy the ACL behavior
specified in RFC-5661 is delegated to the filesystem's inode
permissions interface (i_op->permission). If possible, we try to
check for the conventional POSIX permission first before trying
the NFSv4 equivalent. For example, when writing an xattr, we
check for WRITE_DATA before WRITE_NAMED_ATTRS because in the case
of the former with a trivial ACL we can avoid having to evaluate the
full ACL, and instead merely look at the POSIX mode.
csiostor seems to cause Chelsio T6 firmware to crash.

Jira: NAS-110910
When anything is written to it, it waits for all device probes to complete
before returning.  After that, `udevadm settle` as used by the ZFS scripts
really can ensure that all disks are detected before boot pool import.

Ticket:	NAS-108200
Enable NTB and NTB tools in the TrueNAS config.  In addition, enable
the Intel NTB driver, so that we have at least one NTB driver available.
Added initial support for PLX Non-Transparent Bridge.
Before this change it was impossible to load client modules before
NTB hardware is probed.  This change removes the limitation.  New
NTB transports will get their children devices as they come in.
This fixes interrupt storms on hardware using legacy level-triggered
interrupts, since doorbell processing could take time after interrupt
handler completion, that triggered extra interrupts in a loop.
If a previous successful run is
present, then skip re-run for its pull request.

Signed-off-by: Umer Saleem <usaleem@ixsystems.com>
Previously for TrueNAS, Debian Linux kernel configuration was used and
TrueNAS config options were added on top of that. Because of that,
TrueNAS kernel config in 'scripts/package/truenas/tn.config' has grown
very large and difficult to manage for TrueNAS only options.

Debian Linux kernel configuration for version 6.1.55 has been added
as 'debian_amd64.config' to keep the options from Debian separate from
TrueNAS options. TrueNAS only config options are stored in
'truenas.config'.

Kernel configuration can now be generated as:

	1) make ARCH=x86_64 defconfig
	2) ./scripts/kconfig/merge_config.sh .config \
			./scripts/package/truenas/debian_amd64.config
	3) ./scripts/kconfig/merge_config.sh .config \
			./scripts/package/truenas/truenas.config
	4) ./scripts/kconfig/merge_config.sh .config \
			./scripts/package/truenas/debug.config

Signed-off-by: Umer Saleem <usaleem@ixsystems.com>
Without this it was impossible to use multiple NTB consumer drivers,
since kernel just attached the first one to all NTBs.  This replicates
driver_override device attribute from PCI, plus adds module parameter
to ease default setting.

Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Previously ntb_transport required at least 6 scratchpad registers, plus
2 more for each additional memory window.  That is too much for some
configurations, where several drivers have to share resources of the same
NTB hardware.  This patch introduces a new compact version of the protocol,
requiring only 3 scratchpad registers, plus one more for each additional
memory window.  The optimization is based on the fact that none of the version,
number of windows, or number of queue pairs really needs more than one byte
each, and window sizes of 4GB are not very useful now.  The new protocol
is activated automatically when the configuration is low on scratchpad
registers, or it can be activated explicitly with module parameter.
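
As a worked example of the register counts quoted above, the following Python sketch computes the scratchpad requirement of both protocol versions from the number of memory windows. The function names are made up for illustration and do not come from the driver.

```python
def legacy_scratchpads(memory_windows: int) -> int:
    """Original protocol: 6 registers, plus 2 per additional window."""
    if memory_windows < 1:
        raise ValueError("at least one memory window is required")
    return 6 + 2 * (memory_windows - 1)


def compact_scratchpads(memory_windows: int) -> int:
    """Compact protocol: 3 registers, plus 1 per additional window."""
    if memory_windows < 1:
        raise ValueError("at least one memory window is required")
    return 3 + (memory_windows - 1)


if __name__ == "__main__":
    for windows in (1, 2, 4):
        print(windows, legacy_scratchpads(windows), compact_scratchpads(windows))
```

With four memory windows, for instance, the requirement drops from 12 registers to 6, which is what makes sharing one NTB between several drivers feasible.
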
It attaches as a client to some NTB device and provides several other NTB
devices, splitting NTB resources according to the config module parameter.
For example, one BAR can be dedicated for remote memory access, while other
resources can be used for packet transport for virtual Ethernet interface.
Mostly complete.  AHCI controllers supporting enclosure management will
create a virtual SES logical unit on a dedicated SCSI host, allowing
control/status of EM LEDs.  A link to the device in a given slot is created for
mapping the slot to a device:

```
root@trunasmini:~# strings /sys/class/enclosure/*/Slot*/device/inquiry
ATA     Samsung SSD 860 1B6Q
ATA     Samsung SSD 840 6B0Q
ATA     SanDisk SD6SB1M0600
ATA     SanDisk SDSSDHII00RL
root@trunasmini:~# ls /sys/class/enclosure/*/Slot*/device/block/
'/sys/class/enclosure/4:0:0:0/Slot 00/device/block/':
sda

'/sys/class/enclosure/4:0:0:0/Slot 01/device/block/':
sdb

'/sys/class/enclosure/4:0:0:0/Slot 02/device/block/':
sdc

'/sys/class/enclosure/4:0:0:0/Slot 03/device/block/':
sdd
root@trunasmini:~# ls /sys/block/*/device/enclosure_device:*
'/sys/block/sda/device/enclosure_device:Slot 00':
active  device  fault  locate  power  power_status  slot  status  type  uevent

'/sys/block/sdb/device/enclosure_device:Slot 01':
active  device  fault  locate  power  power_status  slot  status  type  uevent

'/sys/block/sdc/device/enclosure_device:Slot 02':
active  device  fault  locate  power  power_status  slot  status  type  uevent

'/sys/block/sdd/device/enclosure_device:Slot 03':
active  device  fault  locate  power  power_status  slot  status  type  uevent
```

Tasks left for future work:
 * avoid the layering violation in the SES driver
 * provide an option for the tunable to force SES emulation without LED support
 * make use of ALLOCATION LENGTH
Currently, CI runs for Pull Requests are skipped only if a
successful Push CI run is found. This implies that the Push CI run has to
complete before the Pull Request is opened; otherwise the Pull Request
CI will also start in parallel with the Push CI run.

If a Push CI run is already in progress when a Pull Request is opened,
concurrent skipping allows the Pull Request CI run to be skipped completely.

'same_content_newer' ensures that at least one of Push or Pull Request
is run. If a new commit is added after opening a Pull Request, or in
case of a force push, at least one of the Pull Request or Push CI runs will
be performed.

Signed-off-by: Umer Saleem <usaleem@ixsystems.com>
Debian set this to med_performance_with_dipm, which causes issues with
some hardware.  Set it to max_performance instead.

Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
First, Linux expects an NVDIMM _DSM method with an unknown UUID to return
zero, meaning no functions are supported.  Supermicro, though, returns 1,
meaning an error has happened.  It makes Linux think that the NVDIMM supports
all command sets, selecting Intel as the first on the list.  Work around
it by checking actual function bits instead of bit 1, same as we are
doing on FreeBSD.

Second, Linux checks for the presence of _LSI/_LSR/_LSW methods to detect
an NVDIMM's support for labels.  Supermicro, though, has those methods,
but they just always return an error status.  It makes Linux disable
the NVDIMM after a failed attempt to read its label during the probe.
Work around it by actually calling the _LSI method once and checking the
returned status.

Together those fix `ndctl list -D --health` for NVDIMMs implementing
Microsoft command set on the specified motherboards.
Children drivers must call their ntb_transport_free_queue() before
we clean up what's left (nothing should).  Otherwise we get double
free.
Otherwise on the next attach we get a warning message that the directory
already exists.
Disable init_on_alloc to improve ZFS performance. ZFS allocates
and frees pages frequently, and init_on_alloc zeroes out the pages
during allocation. Disabling init_on_alloc should improve ZFS
performance.

Signed-off-by: Umer Saleem <usaleem@ixsystems.com>
* scsi: core: sd: Add genhd_hidden flag

This flag will permit SCSI low level drivers to set the gendisk GENHD_FL_HIDDEN
flag, resulting in a gendisk that can be used for I/O submission inside the
kernel, but which is not registered as a user facing block device.

* scsi: iscsi_tcp: Hide logged-in HA peer node iSCSI targets from local access
This commit adds the support for production and debug kernels for SCALE.
Following is a summary of changes:

  - tn-production.config has been added. This fragment will be used to
    disable unnecessary debug options that are not required in production.
  - tn-debug.config has been added. This fragment will enable extra
    debug options to aid debugging during boot and runtime.
  - Unnecessary hardware options like WiFi, Bluetooth, NFC etc. have
    been disabled for both production and debug kernels.
  - The packaging bits in kernel have been updated to include the extra
    version string in package name so that the packages for debug and
    production kernels can be uniquely identified.
  - A new changelog entry will be generated based on the updated package
    name for production and debug kernels.
  - EXTRAVERSION variable can be configured from the environment.
    `export EXTRAVERSION=-production` would build the packages for
    production kernel. This should be the first step to build the kernel
    packages moving forward.
  - GitHub Actions are updated to build both production and debug
    kernels.

A minor update has been made here while porting from 6.1, EXTRAVERSION
variable is set to '-production'. So by default, production kernel will
be built and we don't need to export the EXTRAVERSION variable while
building the production kernel.

Signed-off-by: Umer Saleem <usaleem@ixsystems.com>
This commit removes tn.config that was removed in v6.1 in commit
c17939b.

This commit provides additional free space for CI runners from
commit c3e03a5.

bd0cd59 is also applied here
to fix a couple of issues in CI.

Signed-off-by: Umer Saleem <usaleem@ixsystems.com>
The SES device of the enclosure reports a disk's SAS address in Additional
Element Status a few seconds after the disk is detected by the SES driver. The
periodic polling of enclosures allows an enclosure to be added in case of
lost events.
Allow userspace (root) access to all IO memory. /dev/mem allows
userspace access to memory ranges that may or may not be actively used
by a driver. Required for the plx_eeprom utility.

Signed-off-by: Umer Saleem <usaleem@ixsystems.com>
Unlike with SAS disks, the Linux kernel does not update the size of SATA
disks immediately unless a power cycle is done or the disk is
revalidated (due to some error condition). In this patch, we refresh
the sector count by revalidating the disk in response to the ATA sector
update command.

Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Change concurrent_skipping from 'same_content_newer' to 'outdated_runs'.
Previously, when pushing multiple commits in quick succession, the newer
workflow run would be skipped while the older one continued. This is
backwards for development, i.e., we want to test the latest code and
cancel the outdated runs.

Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Add more aggressive cleanup of unused system directories to free up
additional disk space. The 6.18 kernel build was requiring more space
due to significantly larger build artifacts compared to 6.12.

Added cleanup of Android SDK, Swift, GHCup directories, Docker images,
and APT cache to provide several additional gigabytes of free space.

Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Debian disabled this in 6.18 as deprecated, breaking API test
test_300_nfs test_state_directory which requires the
/proc/fs/nfsd/nfsv4recoverydir interface. Re-enable for API test
compatibility.

Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Kernel 9ba817f deprecated automounting tracefs in debugfs, causing
"Automounting of tracing to debugfs is deprecated and will be removed in
2030" warnings whenever /sys/kernel/debug/tracing is accessed. This
warning spams kernel logs during normal operation. Tools should use
/sys/kernel/tracing instead of /sys/kernel/debug/tracing.

Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Import configuration from Debian's
linux-config-6.18_6.18~rc7-1~exp1_amd64.deb package. At the time of
the merge, 6.18-rc7 is the most recent kernel config available in
Debian's package repository
(https://ftp.debian.org/debian/pool/main/l/linux/). Will update to
the final 6.18 release config once available.

Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
@bugclerk bugclerk changed the title Update to 6.18 LTS NAS-137494 / None / Update to 6.18 LTS Dec 8, 2025

bugclerk commented Dec 8, 2025

@yocalebo yocalebo left a comment

I trust you've done as much testing as humanly possible

@mgrimesix mgrimesix left a comment

I've opened https://ixsystems.atlassian.net/browse/NAS-138841 for fixing the CONFIG_NFSD_LEGACY_CLIENT_TRACKING=y config addition.

@bmeagherix bmeagherix left a comment

Ran the sharing protocols tests with the build. Results here.

iSCSI and NVMe-oF look fine, so I will approve. (Also spun up a VM to ensure HA internode iSCSI disk suppression was functional).

However there seem to be some SMB client regressions.

ixhamza commented Dec 9, 2025

> However there seem to be some SMB client regressions.

Thanks @bmeagherix for running the sharing protocol tests. All the SMB client tests were affected by the 6.18 merge. The issue was reliably reproducible on the 6.18 client side: we get a 28-byte entry instead of the full 48-88 byte entry (depending on SID mapping). After digging deeper, I found that ad9364a caused the regression. That commit changed SMB2_query_acl() to no longer include OWNER|GROUP|DACL automatically; it now uses only what is passed in extra_info. The TrueNAS code was passing 0, so the server returned an empty Security Descriptor.

I have added a fix according to my understanding of the issue: f1a88a6
It explicitly requests OWNER_SECINFO | GROUP_SECINFO | DACL_SECINFO in the XATTR_ZFSACL case. Waiting for the Scale Build to complete, and will run the Sharing Protocol Tests afterwards.

But it needs a closer look from @anodos325 to make sure this is the right approach for handling the upstream change.

ixhamza commented Dec 9, 2025

Triggered the following test runs. Will merge once the test runs complete with expected results:

Upstream Commit ad9364a changed get_acl() to request only the
security information specified in extra_info parameter, instead of
always including OWNER|GROUP|DACL. The TrueNAS system.nfs4_acl_xdr
handler was passing 0 as extra_info, causing the server to return an
empty Security Descriptor with dacloffset == 0. This triggered the null
ACL generation path, returning a single everyone@ entry (28 bytes)
instead of the actual ACL entries from the server.

Fix by explicitly requesting OWNER_SECINFO | GROUP_SECINFO |
DACL_SECINFO in the XATTR_ZFSACL case, ensuring the full Security
Descriptor is retrieved.

Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
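
To illustrate why passing 0 yields an empty Security Descriptor, here is a minimal Python sketch of the flag handling. It assumes the usual security-information bit values (owner 0x1, group 0x2, DACL 0x4); the helper `requested_parts` is hypothetical and only models which sections a caller asks for, not the actual SMB2 query path.

```python
# Assumed bit values for the security-information flags.
OWNER_SECINFO = 0x1
GROUP_SECINFO = 0x2
DACL_SECINFO = 0x4

# Before the fix the handler passed 0; after the fix it passes all three.
EXTRA_INFO_BEFORE = 0
EXTRA_INFO_AFTER = OWNER_SECINFO | GROUP_SECINFO | DACL_SECINFO


def requested_parts(extra_info: int) -> list:
    """List the descriptor sections selected by the extra_info bitmask."""
    names = {
        OWNER_SECINFO: "owner",
        GROUP_SECINFO: "group",
        DACL_SECINFO: "dacl",
    }
    return [name for bit, name in names.items() if extra_info & bit]


if __name__ == "__main__":
    print(requested_parts(EXTRA_INFO_BEFORE))
    print(requested_parts(EXTRA_INFO_AFTER))
```

With no bits set, nothing is requested, so the client sees no DACL and falls back to the null ACL path described in the commit message.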
@ixhamza ixhamza merged commit 5e0cb28 into truenas/linux-6.18 Dec 9, 2025
6 checks passed
ixhamza added a commit to truenas/middleware that referenced this pull request Dec 9, 2025
Fix mount tree root detection to handle kernel 6.14+ parent_id changes
discovered during 6.18 kernel port.

The middleware hardcoded a check for `parent_id==1` to identify the root
filesystem in `/proc/self/mountinfo`. This breaks with kernel 6.14+
which changed mount ID allocation from `IDA` (starting at 0) to xarray
with `XA_LIMIT(1,INT_MAX)` (starting at 1), shifting early boot mount
IDs:
  - `shmem_init`: 0 → 1
  - `init_mount_tree` (initial ramfs): 1 → 2

This causes the root filesystem's parent_id to reference mount ID 2
instead of 1, resulting in `KeyError` when parsing the mount tree.

Per
[proc_pid_mountinfo(5)](https://man7.org/linux/man-pages/man5/proc_pid_mountinfo.5.html),
the root mount may have a parent that lies outside the visible tree and won't appear
in mountinfo. The fix identifies root as the mount with smallest
`parent_id` (referencing the earliest boot mount) rather than hardcoding
`parent_id==1`.

This resolves SMB share creation failures and other services that
validate mount information during operation.
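
The root detection described above could be sketched as follows. This is not the middleware code: the function names and the sample `mountinfo` text are invented for illustration, and only the first, second and fifth fields of each line (mount ID, parent ID, mount point) are used.

```python
def parse_mountinfo(text: str) -> dict:
    """Map mount ID -> parent ID and mount point for each mountinfo line."""
    mounts = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = line.split()
        mount_id, parent_id = int(fields[0]), int(fields[1])
        mounts[mount_id] = {"parent_id": parent_id, "mountpoint": fields[4]}
    return mounts


def find_root(mounts: dict) -> int:
    """Return the mount ID whose parent_id is smallest (no hardcoded 1)."""
    return min(mounts, key=lambda mount_id: mounts[mount_id]["parent_id"])


# Made-up sample in the shape of /proc/self/mountinfo on a 6.18 kernel,
# where the root's parent is mount ID 2 rather than 1.
SAMPLE = """\
29 2 0:27 / / rw,relatime shared:1 - zfs boot-pool/ROOT/default rw
30 29 0:5 / /dev rw,nosuid shared:2 - devtmpfs devtmpfs rw
31 29 0:22 / /proc rw,nosuid shared:3 - proc proc rw
"""

if __name__ == "__main__":
    mounts = parse_mountinfo(SAMPLE)
    root_id = find_root(mounts)
    print(root_id, mounts[root_id]["mountpoint"])
```

Because the root's parent (ID 2 here) never appears as a key in the parsed table, a lookup keyed on a fixed parent ID fails, while the minimum-parent rule still selects the `/` entry.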

Kernel commit (Regression):
torvalds/linux@7f9bfafc5f49
**Kernel Update PR:** truenas/linux#232

bugclerk commented Dec 9, 2025

Not updating JIRA ticket https://ixsystems.atlassian.net/browse/NAS-137494 target versions as no JIRA version corresponds to this PR

bugclerk commented Dec 9, 2025

This PR has been merged and conversations have been locked.
If you would like to discuss more about this issue please use our forums or raise a Jira ticket.

@truenas truenas locked as resolved and limited conversation to collaborators Dec 9, 2025