From f34301bd1d3852ff93d7ed5a936fcbc23ad6bf58 Mon Sep 17 00:00:00 2001 From: MicJ Date: Wed, 13 Dec 2023 10:49:03 -0500 Subject: [PATCH] PD-781 Add Description Front Matter This commit adds description front matter to /reference/ articles that do not have this. It fixes tense/voice, style issues in articles that needed these updates. It makes other minor formatting changes to improve readability in a few articles. --- content/References/ACLPrimer.md | 1 + content/References/ConceptsAndTerms.md | 17 ++-- content/References/Copyrights.md | 1 + content/References/DefaultPorts.md | 5 +- content/References/IPMIFAQ.md | 77 +++++++++++---- content/References/L2ARC.md | 38 ++++--- content/References/SLOG.md | 9 +- content/References/SMB.md | 19 ++-- content/References/ZFSDeduplication.md | 132 +++++++++++++++++++------ content/References/ZFSPrimer.md | 2 +- content/References/ZILandSLOG.md | 31 +++++- content/References/_index.md | 2 +- 12 files changed, 241 insertions(+), 93 deletions(-) diff --git a/content/References/ACLPrimer.md b/content/References/ACLPrimer.md index 2f2d79e051..0fea60eddd 100644 --- a/content/References/ACLPrimer.md +++ b/content/References/ACLPrimer.md @@ -1,5 +1,6 @@ --- title: "ACL Primer" +description: "Provides general information on POSIX and NFSv4 access control lists (ACLs) in TrueNAS systems and when to use them." weight: 9 --- diff --git a/content/References/ConceptsAndTerms.md b/content/References/ConceptsAndTerms.md index 167551c742..5a7b107544 100644 --- a/content/References/ConceptsAndTerms.md +++ b/content/References/ConceptsAndTerms.md @@ -1,9 +1,10 @@ --- title: "Concepts and Terminology" +description: "Provides a glossary of terms and definitions." weight: 10 --- -TrueNAS is very complicated software that combines many different Open Source solutions into one cohesive software package. +TrueNAS is very complicated software that combines many different open-source solutions into one cohesive software package. While TrueNAS is designed for and ever-evolving towards increased user friendliness, there are many terms and concepts that can be learned to improve your ability to understand and configure the software. ## General Concepts @@ -34,9 +35,9 @@ While TrueNAS is designed for and ever-evolving towards increased user friendlin ## Accounts Terminology -* *root* User: *root* is the primary account that by default has access to all commands and files on a Linux and Unix-like operating systems. It is also referred to as the root account, root user, and/or the superuser. This is similar to the "Administrator" account on Windows. +* *root* User: *root* is the primary account that by default has access to all commands and files on a Linux and Unix-like operating systems. It is also referred to as the root account, root user, and/or the superuser. This is similar to the administrator account on Windows. -* User: A *user* account is an additional account on a Linux and Unix-like operating system that has a lower permission levels than the "root" account. +* User: A *user* account is an additional account on a Linux and Unix-like operating system that has a lower permission levels than the root account. * Group: A *group* is a collection of users. The main purpose of the groups is to easily define a set of privileges like read, write, or execute permission for a given resource that can be shared among the multiple users within the group. 
@@ -104,7 +105,7 @@ While TrueNAS is designed for and ever-evolving towards increased user friendlin * SMB: SMB, or sometimes the Common Internet File System (CIFS), is a communication protocol for providing shared access to files, printers, and serial ports between nodes on a network. It was original designed by IBM in the early 1980s. -* Active Directory: Active Directory (AD) is a directory service developed by Microsoft for Windows domain networks. Active Directory uses the Lightweight Directory Access Protocol (LDAP), Microsoft's version of Kerberos, and DNS. +* Active Directory: Active Directory (AD) is a directory service developed by Microsoft for Windows domain networks. Active Directory uses the Lightweight Directory Access Protocol (LDAP), Microsoft versions of Kerberos, and DNS. * DDNS: Dynamic DNS (DDNS) is a method to realtime update a name server in the Domain Name System (DNS) according to the active DDNS configuration of its configured hostnames, addresses or other information. @@ -114,7 +115,7 @@ While TrueNAS is designed for and ever-evolving towards increased user friendlin * LLDP: The Link Layer Discovery Protocol (LLDP) is a link layer protocol used by network devices for advertising their identity, capabilities, and neighbors on a local area network based on IEEE 802 technology, typically wired Ethernet. -* IPMI: The Intelligent Platform Management Interface (IPMI) is a set of computer interface specifications for a computer subsystem that provides management and monitoring capabilities independently of the host system's CPU, firmware (BIOS or UEFI) and operating system. +* IPMI: The Intelligent Platform Management Interface (IPMI) is a set of computer interface specifications for a computer subsystem that provides management and monitoring capabilities independently of the host system CPU, firmware (BIOS or UEFI) and operating system. * NDS: The Network Information Service (NIS) originally designed by Sun Microsystems is a client–server directory service protocol for distributing system configuration data such as user and host names between computers on a computer network. @@ -144,19 +145,19 @@ While TrueNAS is designed for and ever-evolving towards increased user friendlin * AHCI: The Advanced Host Controller Interface (AHCI) is a technical standard defined by Intel that specifies the operation of Serial ATA (SATA) host controllers in a non-implementation-specific manner in its motherboard chipsets. The specification describes a system memory structure for computer hardware to exchange data between host system memory and attached storage devices. For modern solid state drives, the interface has been superseded by NVMe. -* VirtIO: VirtIO is a virtualization standard for network and disk device drivers where just the guest's device driver "knows" it is running in a virtual environment, and cooperates with the hypervisor. VirtIO was chosen to be the main platform for IO virtualization in KVM. +* VirtIO: VirtIO is a virtualization standard for network and disk device drivers where just the guest device driver knows it is running in a virtual environment, and cooperates with the hypervisor. VirtIO was chosen to be the main platform for IO virtualization in KVM. * UEFI: The Unified Extensible Firmware Interface (UEFI) is a specification that defines a software interface between an operating system and platform firmware. UEFI replaces the legacy Basic Input/Output System (BIOS) firmware interface originally present in all IBM PC-compatible personal computers. 
* UEFI-CSM: To ensure backward compatibility, most UEFI firmware implementations on PC-class machines also support booting in legacy BIOS mode from MBR-partitioned disks through the Compatibility Support Module (CSM). In this scenario, booting is performed in the same way as on legacy BIOS-based systems: ignoring the partition table and relying on the content of a boot sector. -* GRUB: GNU GRUB stands for GNU GRand Unified Bootloader and is commonly referred to as GRUB. GRUB is a boot loader package from the GNU Project and the reference implementation of the Free Software Foundation's Multiboot Specification, which provides the choice to boot into one of multiple operating systems installed on a computer or select a specific kernel configuration available on a particular operating system's partitions. +* GRUB: GNU GRUB stands for GNU GRand Unified Bootloader and is commonly referred to as GRUB. GRUB is a boot loader package from the GNU Project and the reference implementation of the Free Software Foundation Multiboot Specification, which provides the choice to boot into one of multiple operating systems installed on a computer or select a specific kernel configuration available on a particular operating system partitions. * VNC: Virtual Network Computing (VNC) is a graphical desktop-sharing system that uses the remote frame buffer protocol to remotely control another computer. It transmits the keyboard and mouse events from one computer to another and relays back the graphical-screen updates through a network. ## iSCSI -* Portals: An iSCSI portal is a target's IP and TCP port pair. +* Portals: An iSCSI portal is a target IP and TCP port pair. * Initiator: An initiator is software or hardware that enables a host computer to send data to an external iSCSI-based storage array through a network adapter. diff --git a/content/References/Copyrights.md b/content/References/Copyrights.md index e11c278df5..1bf98120b7 100644 --- a/content/References/Copyrights.md +++ b/content/References/Copyrights.md @@ -1,5 +1,6 @@ --- title: "Copyrights and Trademarks" +description: "Provides a list of copyrights and trademarks, and related logos registered as trademarks of iXsystems." weight: 1 --- diff --git a/content/References/DefaultPorts.md b/content/References/DefaultPorts.md index 178f0b069d..5223feb3a1 100644 --- a/content/References/DefaultPorts.md +++ b/content/References/DefaultPorts.md @@ -1,5 +1,6 @@ --- title: "Default Ports" +description: "Provides lists of assigned inbound and outband port numbers used in TrueNAS systems." weight: 20 --- @@ -30,9 +31,9 @@ TCP ports and services that listen for external connections: ## Outbound Ports -Protocols that are “outbound” do not listen for or accept external connections. +Protocols that are outbound do not listen for or accept external connections. These protocols and ports are not a security risk and are usually allowed through firewalls. 
-These protocols are considered "primary" and might need to be permitted through a firewall: +These protocols are considered *primary* and might need to be permitted through a firewall: {{< truetable >}} | Outbound Port | Protocol | Service Name | Description of Service | Encrypted | Defaults | diff --git a/content/References/IPMIFAQ.md b/content/References/IPMIFAQ.md index e19024e37a..3a5b6e7f2a 100644 --- a/content/References/IPMIFAQ.md +++ b/content/References/IPMIFAQ.md @@ -1,5 +1,6 @@ --- title: "IPMI Frequently Asked Questions" +description: "Provides TrueNAS connecting modes for remote management (IPMI) features, and configuration and general use information for remote management." weight: 25 aliases: - /hardware/notices/faqs/ipmi-faq/ @@ -7,39 +8,64 @@ aliases: **How do I connect to Remote Management?** -iXsystems servers provide two modes for connecting the Remote Management (IPMI) features to the network. The first method is a dedicated connection via a separate Ethernet jack on the back panel of the system. On most servers, this jack is located above the USB ports near the keyboard and mouse port on the rear I/O panel. This port runs at 100Mb/s and has a link status and speed/traffic lights. +iXsystems servers provide two modes for connecting the remote management (IPMI) features to the network. The first method is a dedicated connection via a separate Ethernet jack on the back panel of the system. +On most servers, this jack is located above the USB ports near the keyboard and mouse port on the rear I/O panel. +This port runs at 100Mb/s and has a link status and speed/traffic lights. -The second method is a shared connection with the first LAN port, which is the port on the lower left on the rear I/O panel. By default, Remote Management chooses which port to use by searching for a link when the server is initially plugged in. If there is an Ethernet link on the dedicated port, it will choose dedicated mode. If no link is detected, shared mode is chosen. The connection mode does not change while the server continues to receive power unless the LAN port mode is changed to force one port or the other in the web interface. +The second method is a shared connection with the first LAN port, which is the port on the lower left on the rear I/O panel. By default, remote management chooses which port to use by searching for a link when the server is initially plugged in. +If there is an Ethernet link on the dedicated port, it chooses dedicated mode. If no link is detected, shared mode is chosen. +The connection mode does not change while the server continues to receive power unless the LAN port mode is changed to force one port or the other in the web interface. -Remote Management uses an Ethernet MAC address for communications, regardless of which port it uses, and by default will obtain its own IP address using DHCP. +Remote management uses an Ethernet MAC address for communications, regardless of which port it uses, and by default obtains its own IP address using DHCP. **What are the advantages and disadvantages of dedicated mode?** -One advantage of dedicated mode is that remote management traffic is kept separate from host traffic. Since heavy traffic on a shared mode connection can cause performance degradation, dedicated mode connections do not interfere with host traffic. Dedicated mode allows for separate data and control planes (networks), but requires more physical connections – one wire for the dedicated port and one for the host network connection. 
To avoid accidental switches to shared mode, use the web interface to configure the LAN port mode to dedicated. +One advantage of dedicated mode is that remote management traffic is kept separate from host traffic. +Since heavy traffic on a shared mode connection can cause performance degradation, dedicated mode connections do not interfere with host traffic. +Dedicated mode allows for separate data and control planes (networks), but requires more physical connections – one wire for the dedicated port and one for the host network connection. +To avoid accidental switches to shared mode, use the web interface to configure the LAN port mode to dedicated. **What are the advantages and disadvantages of shared mode?** -Shared mode is the easiest way to get connected and requires no additional cabling beyond a single connection. Remote Management can be configured to use an alternate VLAN ID, allowing communication on a separate VLAN from host traffic. However, Remote Management and host traffic will compete for physical network access and could degrade each other in high traffic scenarios. The switch port for the shared connection must be set to auto-negotiate speed and duplex for Remote Management to be accessible when the system is powered down. +Shared mode is the easiest way to get connected and requires no additional cabling beyond a single connection. +Remote management can be configured to use an alternate VLAN ID, allowing communication on a separate VLAN from host traffic. +However, Remote Management and host traffic competes for physical network access and could degrade each other in high traffic scenarios. +The switch port for the shared connection must be set to auto-negotiate speed and duplex for remote management to be accessible when the system is powered down. **How do I configure IP of Remote Management?** -Network configuration for Remote Management can happen in a number of ways. By default, Remote Management obtains an IP address using DHCP. The obtained address is available to standard IPMI tools or in the BIOS Setup under **Advanced/IPMI Configuration/IPMI LAN Configuration**. If a static address is desired, the following configuration methods are available: +Network configuration for remote management can happen in a number of ways. By default, remote management obtains an IP address using DHCP. +The obtained address is available to standard IPMI tools or in the BIOS setup under **Advanced/IPMI Configuration/IPMI LAN Configuration**. +If a static address is desired, the following configuration methods are available: -1. The easiest method is to set the IP in the BIOS Setup at server deployment time. The IPMI LAN configuration is found under **Advanced/IPMI Configuration/IPMI LAN Configuration**. +1. The easiest method is to set the IP in the BIOS Setup at server deployment time. + The IPMI LAN configuration is found under **Advanced/IPMI Configuration/IPMI LAN Configuration**. -2. If an operating system is already installed, an IPMI-compliant local configuration tool, such as `ipmitool` for UNIX-style platforms, can be used to configure the network parameters. Some operating systems may require IPMI drivers to be loaded before using tools like `ipmitool` and the tool might need to be configured to use the system interface instead of a network connection. When using `ipmitool`, the ipmitool lan feature set provides subcommands to set and view the current configuration. +2. 
If an operating system is already installed, an IPMI-compliant local configuration tool, such as `ipmitool` for UNIX-style platforms, can be used to configure the network parameters. + Some operating systems might require loading IPMI drivers before using tools like `ipmitool` and the tool might need to be configured to use the system interface instead of a network connection. + When using `ipmitool`, the ipmitool lan feature set provides subcommands to set and view the current configuration. **How do I perform firmware updates?** -Remote Management’s firmware can be upgraded using two methods: the Firmware Update in the Maintenance menu of the web interface or a DOS utility run on the host. The host must be rebooted after the update for the sensors to begin recording data again. In most cases the update should be run without saving the existing configuration, though it can be saved if there is extensive network and user configuration already present that would take time to reconfigure. +Remote management firmware can be upgraded using two methods: the firmware update in the maintenance menu of the web interface or a DOS utility run on the host. +The host must be rebooted after the update for the sensors to begin recording data again. +In most cases the update should be run without saving the existing configuration, though it can be saved if there is extensive network and user configuration already present that would take time to reconfigure. **What is the ipmitool and how do I get it?** -ipmitool is the industry standard Open Source CLI tool for viewing and configuring IPMI systems. It was originally written by Sun Microsystems, but has grown independently to support any IPMI-compliant system. iXsystems Remote Management is IPMI v2.0 compliant and `ipmitool` can make use of most of its capabilities. +ipmitool is the industry standard open source CLI tool for viewing and configuring IPMI systems. +Originally written by Sun Microsystems, it has grown independently to support any IPMI-compliant system. +iXsystems remote management is IPMI v2.0 compliant and `ipmitool` can make use of most of its capabilities. -For most operating systems, `ipmitool` is generally available in third-party package repositories. It can also be downloaded from its [GitHub Repository](https://github.com/ipmitool/ipmitool). For local use on Linux operating systems, ipmitool requires the ipmi-si and ipmi-devintf kernel modules. On FreeBSD, the ipmi kernel module is required to provide access to the Remote Management hardware. No kernel modules are necessary for access to a remote server. +For most operating systems, `ipmitool` is generally available in third-party package repositories. +It can also be downloaded from its [GitHub Repository](https://github.com/ipmitool/ipmitool). +For local use on Linux operating systems, ipmitool requires the ipmi-si and ipmi-devintf kernel modules. +On FreeBSD, the ipmi kernel module is required to provide access to the remote management hardware. No kernel modules are necessary for access to a remote server. -`ipmitool` is designed for CLI environments where simple management commands and a scriptable interface are needed. It supports both local (System Interface) connections and LAN connections via the lanplus interface. `ipmitool` also supports Serial over LAN (SOL) connections. `ipmitool` can be incorporated into monitoring frameworks to provide monitoring, trending, and alerting features. 
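For example, a static address could be assigned from an installed host with the `lan` subcommands (a sketch only; the channel number `1` and the example addresses are placeholders for your environment):

```
# Switch the BMC LAN channel from DHCP to a static configuration
ipmitool lan set 1 ipsrc static
ipmitool lan set 1 ipaddr 192.0.2.25
ipmitool lan set 1 netmask 255.255.255.0
ipmitool lan set 1 defgw ipaddr 192.0.2.1

# Confirm the resulting settings
ipmitool lan print 1
```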
+`ipmitool` is designed for CLI environments where simple management commands and a scriptable interface are needed. +It supports both local (System Interface) connections and LAN connections via the lanplus interface. +`ipmitool` also supports Serial over LAN (SOL) connections. +`ipmitool` can be incorporated into monitoring frameworks to provide monitoring, trending, and alerting features. Some common `ipmitool` commands are: `ipmitool lan print` – displays network configuration @@ -48,24 +74,37 @@ Some common `ipmitool` commands are: `ipmitool sensor list` – displays sensors, their readings, state, and thresholds -When opening a support ticket with iXsystems in regards to troubleshooting Remote Management exceptions, include the output of `ipmitool sel elist` and `ipmitool sensor list`. The output of these commands will help us determine the specific issue. +When opening a support ticket with iXsystems in regards to troubleshooting remote management exceptions, include the output of `ipmitool sel elist` and `ipmitool sensor list`. The output of these commands helps us determine the specific issue. **What is IPMIView?** -[IPMIView](https://www.supermicro.com/manuals/other/IPMIView20.pdf) is a no-cost Windows application providing central control and management of multiple Remote Management-enabled servers. Servers can be grouped for speedy administration of multiple servers at once and their video displays can be quickly viewed without logging into each individual server’s Remote Management interface. Servers reporting trouble can be seen at-a-glance. +[IPMIView](https://www.supermicro.com/manuals/other/IPMIView20.pdf) is a no-cost Windows application providing central control and management of multiple remote management-enabled servers. +Servers can be grouped for speedy administration of multiple servers at once and their video displays can be quickly viewed without logging into each individual server remote management interface. Servers reporting trouble can be seen at-a-glance. **What is IPMICFG?** -`IPMICFG`, for DOS, Linux, and Windows platforms, provides network configuration options as well as some hardware-specific commands, such as factory settings reset. `IPMICFG` can be downloaded from: [ftp://ftp.supermicro.com/utility/IPMICFG/](ftp://ftp.supermicro.com/utility/IPMICFG/). +`IPMICFG`, for DOS, Linux, and Windows platforms, provides network configuration options as well as some hardware-specific commands, such as factory settings reset. +`IPMICFG` can be downloaded from: [ftp://ftp.supermicro.com/utility/IPMICFG/](ftp://ftp.supermicro.com/utility/IPMICFG/). **How do I reset Remote Management?** -In rare circumstances, Remote Management can malfunction and require a reset. To reset Remote Management, log in to the web interface and select *Reset Controller* from the *Maintenance* menu. Wait approximately 2 minutes before logging in to the web interface again. It may be necessary to reboot the host to allow the sensors to repopulate afterward. +In rare circumstances, remote management can malfunction and require a reset. +To reset remote management, log in to the web interface and select **Reset Controller** from the **Maintenance** menu. +Wait approximately two minutes before logging in to the web interface again. It might be necessary to reboot the host to allow the sensors to repopulate afterward. 
-If the procedure above does not address the issue, Remote Management may need to be power-cycled by shutting down and unplugging the server for approximately 30 seconds, then reconnecting and powering up the server. This causes Remote Management to re-initialize itself upon the next boot. +If the procedure above does not address the issue, remote management might need to be power-cycled by shutting down and unplugging the server for approximately 30 seconds, then reconnecting and powering up the server. This causes remote management to re-initialize itself upon the next boot. -In some cases, the software may become corrupted and need to be re-flashed. If the web interface is operational, the firmware can be reflashed as in an upgrade. If not, a DOS tool can be used to reflash the firmware. Refer to the instructions included in the readme associated with the Remote Management firmware for exact instructions and commands to re-flash the firmware manually. +In some cases, the software might become corrupted and need to be re-flashed. +If the web interface is operational, the firmware can be reflashed as in an upgrade. If not, a DOS tool can be used to reflash the firmware. +Refer to the instructions included in the readme associated with the remote management firmware for exact instructions and commands to re-flash the firmware manually. **What is Serial Over LAN?** -Serial Over LAN (SOL) provides the redirection of a virtual serial port to Remote Management. The virtual serial port acts like a standard PC serial port and the operating system generally attaches its serial port driver automatically. In server environments, serial ports are used for the operating system’s console, which can then be logged to disk or accessed remotely via out-of-band means in the event of failure. On iXsystems servers, the SOL virtual port can appear as *COM2* or *COM3* depending on the system, as the virtual port is numbered after the physical serial ports and some systems have 1 physical port and others have 2. The virtual serial port settings can be configured in the BIOS Setup. By default, the BIOS is configured to copy its screen output to the virtual serial port so the BIOS can be manipulated via SOL. The SOL serial port can be connected to and viewed using IPMI SOL-compliant tools such as `ipmitool` or by an applet in the web GUI. Authentication is required to connect to the port and the connection can be encrypted if supported by the tool. +Serial Over LAN (SOL) provides the redirection of a virtual serial port to remote management. +The virtual serial port acts like a standard PC serial port and the operating system generally attaches its serial port driver automatically. +In server environments, serial ports are used for the operating system console, which can then be logged to disk or accessed remotely via out-of-band means in the event of failure. +On iXsystems servers, the SOL virtual port can appear as **COM2** or **COM3** depending on the system, as the virtual port is numbered after the physical serial ports and some systems have one physical port and others have two. +The virtual serial port settings can be configured in the BIOS Setup. +By default, the BIOS is configured to copy its screen output to the virtual serial port so the BIOS can be manipulated via SOL. +The SOL serial port can be connected to and viewed using IPMI SOL-compliant tools such as `ipmitool` or by an applet in the web GUI. 
+Authentication is required to connect to the port and the connection can be encrypted if supported by the tool. diff --git a/content/References/L2ARC.md b/content/References/L2ARC.md index 1dce4bfd92..696ae24f30 100644 --- a/content/References/L2ARC.md +++ b/content/References/L2ARC.md @@ -1,25 +1,34 @@ --- title: "L2ARC" +description: "Provides information on L2ARC, caches drives, and persistent L2ARC implementations in TrueNAS CORE and SCALE." weight: 30 aliases: - /core/notices/persistentl2arcin12.0/ --- -ZFS has several features to help improve performance for frequent access data read operations. One is Adaptive Replacement Cache (ARC), which uses the server memory (RAM). The other is second level adaptive replacement cache (L2ARC), which uses cache drives added to ZFS storage pools. These cache drives are multi-level cell (MLC) SSD drives and, while slower than system memory, are still much faster than standard hard drives. ZFS (including TrueNAS) uses all of the RAM installed in a system to make the ARC as large as possible, but this can be very expensive. Cache drives provide a cheaper alternative to RAM for frequently accessed data. +ZFS has several features to help improve performance for frequent access data read operations. +One is Adaptive Replacement Cache (ARC), which uses the server memory (RAM). +The other is second level adaptive replacement cache (L2ARC), which uses cache drives added to ZFS storage pools. +These cache drives are multi-level cell (MLC) SSD drives and, while slower than system memory, are still much faster than standard hard drives. +ZFS (including TrueNAS) uses all of the RAM installed in a system to make the ARC as large as possible, but this can be very expensive. +Cache drives provide a cheaper alternative to RAM for frequently accessed data. ## How Does L2ARC Work? -When a system gets read requests, ZFS uses ARC (RAM) to serve those requests. When the ARC is full and there are L2ARC drives allocated to a ZFS pool, ZFS uses the L2ARC to serve the read requests that overflowed from the ARC. This reduces the use of slower hard drives and therefore increases system performance. +When a system gets read requests, ZFS uses ARC (RAM) to serve those requests. +When the ARC is full and there are L2ARC drives allocated to a ZFS pool, ZFS uses the L2ARC to serve the read requests that overflowed from the ARC. +This reduces the use of slower hard drives and therefore increases system performance. ## Implementation in TrueNAS -TrueNAS integrates L2ARC management in the web interface **Storage** section. Specifically, adding a *Cache* vdev to a new or existing pool and allocating drives to that pool enables L2ARC for that specific storage pool. +TrueNAS integrates L2ARC management in the web interface **Storage** section. +Specifically, adding a **Cache** vdev to a new or existing pool and allocating drives to that pool enables L2ARC for that specific storage pool. Cached drives are always striped, not mirrored. To increase an existing L2ARC size, stripe another cache device with it. You cannot share dedicated L2ARC devices between ZFS pools. -A cache device failure does not affect the integrity of the pool, but it may impact read performance depending on the workload and the dataset size to cache size ratio. +A cache device failure does not affect the integrity of the pool, but it might impact read performance depending on the workload and the dataset size to cache size ratio. 
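For reference, the same cache vdev operations map onto standard `zpool` commands (a sketch only; the pool name `tank` and the device names are placeholders, and managing pools through the web interface remains the supported TrueNAS workflow):

```
# Add an SSD as an L2ARC (cache) device to an existing pool
zpool add tank cache nvme0n1

# Stripe in a second cache device later to grow the L2ARC
zpool add tank cache nvme1n1

# Watch cache device capacity and read traffic, refreshing every 5 seconds
zpool iostat -v tank 5
```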
### Persistent L2ARC in CORE and SCALE @@ -28,43 +37,46 @@ When Persistent L2ARC is enabled, a sysctl repopulates the cache device mapping Persistent L2ARC preserves L2ARC performance even after a system reboot. However, persistent L2ARC for large data pools can drastically slow the reboot process, degrading middleware and web interface performance. -Because of this, we've disabled Persistent L2ARC by default in TrueNAS CORE, but you can manually activate it. +Because of this, we have disabled persistent L2ARC by default in TrueNAS CORE, but you can manually activate it. ### Activating Persistent L2ARC {{< tabs "L2ARC" >}} {{< tab "CORE" >}} Go to **System > Tunables** and click **ADD**. -For the **Variable**, enter `vfs.zfs.l2arc.rebuild_enabled`. Set the **Value** to **1** and the **Type** to **sysctl**. -We recommend noting in the **Description** that this is the Persistent L2ARC activation. -Make sure **Enabled** is set and click **SUBMIT**. +For the **Variable**, enter **vfs.zfs.l2arc.rebuild_enabled**. Set the **Value** to **1** and the **Type** to **sysctl**. +We recommend noting in the **Description** that this is the persistent L2ARC activation. +Make sure **Enabled** is selected and click **SUBMIT**. ![PersistentL2ARCTunable](/images/CORE/System/SystemTunablesL2ARCRebuild.png "Persistent L2ARC Activation") {{< nest-expand "CLI Instructions" "v" >}} {{< hint type=important >}} -TrueNAS CORE doesn't write settings changed through the CLI to the configuration database. TrueNAS will reset them on reboot. +TrueNAS CORE does not write settings changed through the CLI to the configuration database. TrueNAS resets them on reboot. {{< /hint >}} In a command line, enter `sysctl vfs.zfs.l2arc.rebuild_enabled=1`. When successful, the output reads: `vfs.zfs.l2arc.rebuild_enabled: 0 -> 1` {{< /nest-expand >}} {{< /tab >}} {{< tab "SCALE" >}} -TrueNAS SCALE enables Persistent L2ARC by default. We do not recommend users disable it. +TrueNAS SCALE enables persistent L2ARC by default. We do not recommend users disable it. {{< /tab >}} {{< /tabs >}} ## Device Recommendations -Like all complicated features, deciding whether L2ARC is effective or not requires a strong understanding of your storage environment, performance goals, and the software you're using. +Like all complicated features, deciding whether L2ARC is effective or not requires a strong understanding of your storage environment, performance goals, and the software you are using. However, we have a few recommendations for L2ARC devices: * Using multiple L2ARC devices helps reduce latency and improve performance. -* Random Read Heavy workloads can benefit from large capacity L2ARC SSDs. L2ARC SSDs are faster than the existing data storage drives. +* Using large capacity L2ARC SSDs can benefit random Read Heavy workloads. L2ARC SSDs are faster than the existing data storage drives. + +* Using an L2ARC device that is much faster than the data storage devices makes better use of its larger capacity. + Sequential or streaming workloads need very fast, low-latency L2ARC devices. + [We recommend Enterprise-grade NVMe devices](https://www.snia.org/sites/default/files/SDC/2019/presentations/File_Systems/McKenzie_Ryan_Best_Practices_for_OpenZFS_L2ARC_in_the_Era_of_NVMe.pdf). L2ARC device capacity depends on how much faster it is than the data storage devices. -* Sequential or streaming workloads need very fast, low-latency L2ARC devices. 
[We recommend Enterprise-grade NVMe devices](https://www.snia.org/sites/default/files/SDC/2019/presentations/File_Systems/McKenzie_Ryan_Best_Practices_for_OpenZFS_L2ARC_in_the_Era_of_NVMe.pdf). L2ARC device capacity depends on how much faster it is than the data storage devices. An L2ARC device that is much faster than the data storage devices makes better use of its larger capacity. ## Resources diff --git a/content/References/SLOG.md b/content/References/SLOG.md index d058b20886..f38ab65291 100644 --- a/content/References/SLOG.md +++ b/content/References/SLOG.md @@ -1,5 +1,6 @@ --- title: "SLOG Devices" +description: "Provides general information on ZFS intent logs (ZIL) and separate intent logs (SLOG), their uses cases and implementation in TrueNAS." weight: 40 tags: - slog @@ -30,7 +31,7 @@ This can provide a large benefit due to lower latency of a SLOG on SSD(s) vs dat ## SLOG Use Case A ZIL alone does not improve performance. -Every ZFS data pool uses a ZIL that is stored on disk to log synchronous writes before *flushing* to a final location in the storage. +Every ZFS data pool uses a ZIL that is stored on disk to log synchronous writes before flushing to a final location in the storage. This means synchronous writes operate at the speed of the storage pool and must write to the pool twice or more (depending on disk redundancy). A separate high-speed SLOG device provides the performance improvements so ZIL-based writes are not limited by pool input/outputs per second (IOPS) or penalized by the RAID configuration. @@ -46,16 +47,16 @@ Combined SLOG write throughput should be higher than the planned synchronous wri The iXsystems current recommendation is a 16 GB SLOG device over-provisioned from larger SSDs to increase the write endurance and throughput of an individual SSD. This 16 GB size recommendation is based on performance characteristics of typical HDD pools with SSD SLOGs and capped by the value of the tunable vfs.zfs.dirty_data_max_max. -TrueNAS Enterprise Appliances from iXsystems might have an additional platform specific auto-tuning set and are built with SLOG devices specifically set up for the performance of that appliance. +TrueNAS Enterprise appliances from iXsystems might have an additional platform specific auto-tuning set and are built with SLOG devices specifically set up for the performance of that appliance. ## TrueNAS Implementation Add and manage SLOG devices in the **Storage > Pools** web interface area. -When creating or expanding a pool, open the **ADD VDEV** drop-down list and select the **Log**. +When creating or expanding a pool, open the **ADD VDEV** dropdown list and select the **Log**. Allocate SSDs into this vdev according to your use case. To avoid data loss from device failure or any performance degradation, arrange the **Log VDev** as a mirror. -The drives *must* be the same size. +The drives must be the same size. As stated earlier in the recommended drive size is 16 GB after over-provisioning. See the [SLOG over-provisioning guide]({{< relref "CORE/CORETutorials/Storage/Pools/SLOGOverprovision.md" >}}) for over-provisioning procedures. diff --git a/content/References/SMB.md b/content/References/SMB.md index 910710f68f..be5ca884c2 100644 --- a/content/References/SMB.md +++ b/content/References/SMB.md @@ -1,17 +1,18 @@ --- title: "Server Message Block (SMB)" +description: "Provides general information on SMB protocol and shares, shadow copies and Time Machine implementation in TrueNAS." 
weight: 50 --- Server Message Block shares, also known as Common Internet File System (CIFS) shares, are accessible by Windows, macOS, Linux, and BSD computers. SMB provides more configuration options than NFS and is a good choice on a network for Windows or Mac systems. -TrueNAS uses [Samba](https://www.samba.org/) to share pools using Microsoft's SMB protocol. +TrueNAS uses [Samba](https://www.samba.org/) to share pools using Microsoft SMB protocol. SMB is built into the Windows and macOS operating systems and most Linux and BSD systems pre-install an SMB client to provide support for the SMB protocol. The SMB protocol supports many different types of configuration scenarios, ranging from the simple to complex. -The complexity of the scenario depends upon the types and versions of the client operating systems that will connect to the share, whether the network has a Windows server, and whether Active Directory is being used. +The complexity of the scenario depends upon the types and versions of the client operating systems that connects to the share, whether the network has a Windows server, and whether Active Directory is used. Depending on the authentication requirements, it might be necessary to create or import users and groups. Samba supports server-side copy of files on the same share with clients from Windows 8 and higher. @@ -27,13 +28,13 @@ Another helpful reference is [Methods For Fine-Tuning Samba Permissions](https:/ {{< /hint >}} By default, Samba disables NTLMv1 authentication for security. -Standard configurations of Windows XP and some configurations of later clients like Windows 7 will not be able to connect with NTLMv1 disabled. +Standard configurations of Windows XP and some configurations of later clients like Windows 7 are not able to connect with NTLMv1 disabled. [Security guidance for NTLMv1 and LM network authentication](https://support.microsoft.com/en-us/help/2793313/security-guidance-for-ntlmv1-and-lm-network-authentication) has information about the security implications and ways to enable NTLMv2 on those clients. -If changing the client configuration is not possible, NTLMv1 authentication can be enabled by selecting the **NTLMv1 auth** option in the SMB service configuration screen. +If changing the client configuration is not possible, enable NTLMv1 authentication by selecting the **NTLMv1 auth** option in the SMB service configuration screen. {{< include file="/_includes/SMBShareMSDOSalert.md" >}} -To view all active SMB connections and users, enter `smbstatus` in the TrueNAS **Shell**. +To view all active SMB connections and users, enter `smbstatus` in the TrueNAS SCALE **Shell** or open an SSH or local console shell in CORE. Most configuration scenarios require each user to have their own user account and to authenticate before accessing the share. This allows the administrator to control access to data, provide appropriate permissions to that data, and to determine who accesses and modifies stored data. @@ -41,7 +42,7 @@ A Windows domain controller is not needed for authenticated SMB shares, which me However, because there is no domain controller to provide authentication for the network, each user account must be created on the TrueNAS system. This type of configuration scenario is often used in home and small networks as it does not scale well if many user accounts are needed. 
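Before adjusting share permissions or removing accounts, it can help to check who is connected. The `smbstatus` utility mentioned above accepts a few common options for narrowing its output (a sketch; all of these flags are optional):

```
# All active connections, open files, and locks
smbstatus

# Connected shares only
smbstatus -S

# Brief output
smbstatus -b
```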
-[Shadow Copies](https://en.wikipedia.org/wiki/Shadow_copy), also known as the Volume Shadow Copy Service (VSS) or Previous Versions, is a Microsoft service for creating volume snapshots. +[Shadow Copies](https://en.wikipedia.org/wiki/Shadow_copy), also known as the Volume Shadow Copy Service (VSS) or previous versions, is a Microsoft service for creating volume snapshots. Shadow copies can be used to restore previous versions of files from within Windows Explorer. By default, all ZFS snapshots for a dataset underlying an SMB share path are presented to SMB clients through the volume shadow copy service (or accessible directly with SMB if the hidden ZFS snapshot directory is located within the path of the SMB share). @@ -57,9 +58,9 @@ Before using shadow copies with TrueNAS, be aware of these caveats: * Users cannot delete shadow copies via an SMB client. Instead, the administrator can remove snapshots from the TrueNAS web interface. - Shadow copies can be disabled for an SMB share by unsetting the *Enable shadow copies* advanced option for the SMB share. + Shadow copies can be disabled for an SMB share by unsetting the **Enable shadow copies** advanced option for the SMB share. Note that this does not prevent access to the hidden .zfs/snapshot - directory for a ZFS dataset if it is located within the *Path* for an SMB share. + directory for a ZFS dataset if it is located within the **Path** for an SMB share. macOS includes the [Time Machine](https://support.apple.com/en-us/HT201250) feature which performs automatic backups. TrueNAS supports Time Machine backups for both SMB and AFP shares. @@ -67,7 +68,7 @@ TrueNAS supports Time Machine backups for both SMB and AFP shares. Configuring a quota for each Time Machine share helps prevent backups from using all available space on the TrueNAS system. Time Machine waits two minutes before creating a full backup. It then creates ongoing hourly, daily, weekly, and monthly backups. -**The oldest backups are deleted when a Time Machine share fills up, so make sure that the quota size is large enough to hold the desired number of backups.** +The oldest backups are deleted when a Time Machine share fills up, so make sure that the quota size is large enough to hold the desired number of backups. A default installation of macOS is over 20 GiB. Configure a global quota using the instructions in [Set up Time Machine for multiple machines with OSX Server-Style Quotas](https://forums.freenas.org/index.php?threads/how-to-set-up-time-machine-for-multiple-machines-with-osx-server-style-quotas.47173/) diff --git a/content/References/ZFSDeduplication.md b/content/References/ZFSDeduplication.md index e4423377ad..3035a32b58 100644 --- a/content/References/ZFSDeduplication.md +++ b/content/References/ZFSDeduplication.md @@ -1,69 +1,114 @@ --- title: "ZFS Deduplication" +description: "Provides general information on ZFS deduplication in TrueNAS,hardware recommendations, and useful deduplication CLI commands." weight: 60 --- -ZFS supports deduplication as a feature. Deduplication means that identical data is only stored once, and this can greatly reduce storage size. However deduplication is a compromise and balance between many factors, including cost, speed, and resource needs. It must be considered exceedingly carefully and the implications understood, before being used in a pool. +ZFS supports deduplication as a feature. Deduplication means that identical data is only stored once, and this can greatly reduce storage size. 
+However deduplication is a compromise and balance between many factors, including cost, speed, and resource needs.
+It must be considered exceedingly carefully and the implications understood, before being used in a pool.

## Deduplication on ZFS

-Deduplication is one technique ZFS can use to store file and other data in a pool. If several files contain the same pieces (blocks) of data, or any other pool data occurs more than once in the pool, ZFS will store just one copy of it. In effect instead of storing many copies of a book, it stores one copy and an arbitrary number of pointers to that one copy. Only when no file uses that data, is the data actually deleted. ZFS keeps a reference table which links files and pool data to the actual storage blocks containing "their" data. This is the Deduplication Table (DDT).
+Deduplication is one technique ZFS can use to store file and other data in a pool.
+If several files contain the same pieces (blocks) of data, or any other pool data occurs more than once in the pool, ZFS stores just one copy of it.
+In effect, instead of storing many copies of a book, it stores one copy and an arbitrary number of pointers to that one copy.
+Only when no file uses that data is the data actually deleted.
+ZFS keeps a reference table which links files and pool data to the actual storage blocks containing their data. This is the deduplication table (DDT).

-The DDT is a fundamental ZFS structure. It is treated as part of the pool's metadata. If a pool (or any dataset in the pool) has ever contained deduplicated data, the pool _will_ contain a DDT, and that DDT is as fundamental to the pool data as any of its other file system tables. Like any other metadata, DDT contents may temporarily be held in the ARC (RAM/memory cache) or [L2ARC]({{< relref "/content/references/L2ARC.md" >}}) (disk cache) for speed and repeated use, but the DDT is not a disk cache. It is a fundamental part of the ZFS pool structure, how ZFS organizes pool data on its disks. Therefore like any other pool data, if DDT data is lost, the pool is likely to become unreadable. So it is important it is stored on redundant devices.
+The DDT is a fundamental ZFS structure. It is treated as part of the pool metadata.
+If a pool (or any dataset in the pool) has ever contained deduplicated data, the pool contains a DDT, and that DDT is as fundamental to the pool data as any of its other file system tables.
+Like any other metadata, DDT contents might be temporarily held in the ARC (RAM/memory cache) or [L2ARC]({{< relref "/content/references/L2ARC.md" >}}) (disk cache) for speed and repeated use, but the DDT is not a disk cache.
+It is a fundamental part of the ZFS pool structure and of how ZFS organizes pool data on its disks.
+Therefore, like any other pool data, if DDT data is lost, the pool is likely to become unreadable, so it is important that the DDT is stored on redundant devices.

-A pool can contain any mix of deduplicated data and non-deduplicated data, coexisting. Data is written using the DDT if deduplication is enabled at the time of writing, and is written non-deduplicated if deduplication is not enabled at the time of writing. Subsequently, the data will remain as at the time it was written, until it is deleted.
+A pool can contain any mix of deduplicated data and non-deduplicated data, coexisting.
+Data is written using the DDT if deduplication is enabled at the time of writing, and is written non-deduplicated if deduplication is not enabled at the time of writing.
+The data then remains as it was written until it is deleted.

-The only way to convert existing current data to be all deduplicated or undeduplicated, or to change how it is deduplicated, is to create a new copy, while new settings are active. This could be done by copying the data within a file system, or to a different file system, or replicating using `zfs send` and `zfs receive` or the Web UI replication functions. Data in snapshots is fixed, and can only be changed by replicating the snapshot to a different pool with different settings (which preserves its snapshot status), or copying its contents.
+The only way to convert existing data to be all deduplicated or undeduplicated, or to change how it is deduplicated, is to create a new copy while the new settings are active.
+This can be done by copying the data within a file system, or to a different file system, or by replicating using `zfs send` and `zfs receive` or the web UI replication functions.
+Data in snapshots is fixed, and can only be changed by replicating the snapshot to a different pool with different settings (which preserves its snapshot status), or by copying its contents.

-It is possible to stipulate in a pool, that only certain datasets and volumes will be deduplicated. The DDT encompasses the entire pool, but only data in those locations will be deduplicated when written. Other data which will not deduplicate well or where deduplication is inappropriate, will not be deduplicated when written, saving resources.
+It is possible to deduplicate only certain datasets and volumes within a pool.
+The DDT encompasses the entire pool, but only data in those locations is deduplicated when written.
+Other data which does not deduplicate well, or where deduplication is inappropriate, is not deduplicated when written, saving resources.

## Benefits

-The main benefit of deduplication is that, where appropriate, it can greatly reduce the size of a pool and the disk count and cost. For example, if a server stores files with identical blocks, it could store thousands or even millions of copies for almost no extra disk space. When data is read or written, it is also possible that a large block read or write can be replaced by a smaller DDT read or write, reducing disk I/O size and quantity.
+The main benefit of deduplication is that, where appropriate, it can greatly reduce the size of a pool and the disk count and cost.
+For example, if a server stores files with identical blocks, it could store thousands or even millions of copies for almost no extra disk space.
+When data is read or written, it is also possible that a large block read or write can be replaced by a smaller DDT read or write, reducing disk I/O size and quantity.

## Costs

-The deduplication process is very demanding! There are four main costs to using deduplication: large amounts of RAM, requiring fast SSDs, CPU resources, and a general performance reduction. So the trade-off with deduplication is reduced server RAM/CPU/SSD performance and loss of "top end" I/O speeds in exchange for saving storage size and pool expenditures.
+The deduplication process is very demanding!
+There are four main costs to using deduplication: large amounts of RAM, fast SSDs, CPU resources, and a general performance reduction.
+So the trade-off with deduplication is reduced server RAM/CPU/SSD performance and loss of top-end I/O speeds in exchange for saving storage size and pool expenditures.
{{< expand "Reduced I/O" "v" >}} -Deduplication requires almost immediate access to the DDT. In a deduplicated pool, every block potentially needs DDT access. The number of small I/Os can be colossal; copying a 300 GB file could require tens, perhaps hundreds of millions of 4K I/O to the DDT. This is extremely punishing and slow. RAM must be large enough to store the entire DDT *and any other metadata* and the pool will almost always be configured using fast, high quality SSDs allocated as "special vdevs" for metadata. Data rates of *50,000-300,000* 4K I/O per second (IOPS) have been reported by the TrueNAS community for SSDs handling DDT. When the available RAM is insufficient, the pool runs extremely slowly. When the SSDs are unreliable or slow under mixed sustained loads, the pool can also slow down or even lose data if enough SSDs fail. +Deduplication requires almost immediate access to the DDT. In a deduplicated pool, every block potentially needs DDT access. +The number of small I/Os can be colossal; copying a 300 GB file could require tens, perhaps hundreds of millions of 4K I/O to the DDT. +This is extremely punishing and slow. RAM must be large enough to store the entire DDT and any other metadata and the pool almost always is configured using fast, high quality SSDs allocated as special vdevs for metadata. +Data rates of 50,000-300,000 4K I/O per second (IOPS) have been reported by the TrueNAS community for SSDs handling DDT. +When the available RAM is insufficient, the pool runs extremely slowly. +When the SSDs are unreliable or slow under mixed sustained loads, the pool can also slow down or even lose data if enough SSDs fail. {{< /expand >}} {{< expand "CPU Consumption" "v" >}} -Deduplication is extremely CPU intensive. Hashing is a complex operation and deduplication uses it on every read and write. It is possible for some operations (notably `scrub` and other intense activities) to use an entire *8 - 32* core CPU to meet the computational demand required for deduplication. +Deduplication is extremely CPU intensive. Hashing is a complex operation and deduplication uses it on every read and write. +It is possible for some operations (notably `scrub` and other intense activities) to use an entire 8 - 32 core CPU to meet the computational demand required for deduplication. {{< /expand >}} {{< expand "Reduced ZFS Performance" "v" >}} -Deduplication adds extra lookups and hashing calculations into the ZFS data pathway, which slows ZFS down significantly. A deduplicated pool does not reach the same speeds as a non-deduplicated pool. +Deduplication adds extra lookups and hashing calculations into the ZFS data pathway, which slows ZFS down significantly. +A deduplicated pool does not reach the same speeds as a non-deduplicated pool. {{< /expand >}} -When data is not sufficiently duplicated, deduplication wastes resources, slows the server down, and has no benefit. When data is already being heavily duplicated, then consider the costs, hardware demands, and impact of enabling deduplication **before** enabling on a ZFS pool. +When data is not sufficiently duplicated, deduplication wastes resources, slows the server down, and has no benefit. +When data is already being heavily duplicated, then consider the costs, hardware demands, and impact of enabling deduplication *before* enabling on a ZFS pool. 
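One way to make that decision is to measure how well the existing data would deduplicate before committing a pool to it. The sketch below uses placeholder pool and dataset names (`tank`, `tank/projects`); `zdb -S` only simulates deduplication and reports the expected table size and ratio without changing any data:

```
# Simulate deduplication on an existing pool and print a DDT histogram plus the expected dedup ratio
zdb -S tank

# If the ratio justifies it, enable deduplication for a specific dataset only
zfs set dedup=on tank/projects

# Afterwards, monitor the real table and the achieved ratio
zpool status -D tank
zpool list tank
```

As a rough planning figure from the RAM guidance below, a pool expected to hold 50 TB of deduplicated data might call for something on the order of 50-150 GB of RAM for the DDT and other metadata.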
 ## Hardware Recommendations

 ### Disks
-High quality mirrored SSDs configured as a "special vdev" for the DDT (and usually all metadata) are strongly recommended for deduplication unless the entire pool is built with high quality SSDs. Expect potentially severe issues if these are not used as described below. NVMe SSDs are recommended whenever possible. SSDs must be large enough to store all metadata.
+High quality mirrored SSDs configured as a special vdev for the DDT (and usually all metadata) are strongly recommended for deduplication unless the entire pool is built with high quality SSDs.
+Expect potentially severe issues if these are not used as described below. NVMe SSDs are recommended whenever possible. SSDs must be large enough to store all metadata.

-The deduplication table (DDT) contains small entries about *300-900* bytes in size. It is primarily accessed using 4K reads. This places extreme demand on the disks containing the DDT.
+The deduplication table (DDT) contains small entries about 300-900 bytes in size. It is primarily accessed using 4K reads.
+This places extreme demand on the disks containing the DDT.

-When choosing SSDs, remember that a deduplication-enabled server can have considerable mixed I/O and very long sustained access with deduplication. Try to find "real-world" performance data wherever possible. It is recommended to use SSDs that do not rely on a limited amount of fast cache to bolster a weak continual bandwidth performance. Most SSDs performance (latency) drops when the onboard cache is fully used and more writes occur. Always review the steady state performance for 4K random mixed read/write.
+When choosing SSDs, remember that a deduplication-enabled server can have considerable mixed I/O and very long sustained access with deduplication.
+Try to find real-world performance data wherever possible.
+It is recommended to use SSDs that do not rely on a limited amount of fast cache to bolster a weak continual bandwidth performance.
+The performance (latency) of most SSDs drops when the onboard cache is fully used and more writes occur.
+Always review the steady state performance for 4K random mixed read/write.

-[Special vdev]({{< relref "CORE/CORETutorials/Storage/Pools/FusionPool.md" >}}) SSDs receive continuous, heavy I/O. HDDs and many common SSDs are inadequate. As of 2021, some recommended SSDs for deduplicated ZFS include Intel Optane 900p, 905p, P48xx, and better devices. Lower cost solutions are high quality consumer SSDs such as the Samsung EVO and PRO models. PCIe NVMe SSDs (NVMe, M.2 "M" key, or U.2) are recommended over SATA SSDs (SATA or M.2 "B" key).
+[Special vdev]({{< relref "CORE/CORETutorials/Storage/Pools/FusionPool.md" >}}) SSDs receive continuous, heavy I/O.
+HDDs and many common SSDs are inadequate.
+As of 2021, some recommended SSDs for deduplicated ZFS include Intel Optane 900p, 905p, P48xx, and better devices.
+Lower cost solutions are high quality consumer SSDs such as the Samsung EVO and PRO models.
+PCIe NVMe SSDs (NVMe, M.2 "M" key, or U.2) are recommended over SATA SSDs (SATA or M.2 "B" key).

-When special vdevs cannot contain all the pool metadata, then metadata is silently stored on other disks in the pool. When special vdevs become too full (about *85%-90%* usage), ZFS cannot run optimally and the disks operate slower. Try to keep special vdev usage under *65%-70%* capacity whenever possible. Try to plan how much future data will be added to the pool, as this increases the amount of metadata in the pool. More special vdevs can be added to a pool when more metadata storage is needed.
+When special vdevs cannot contain all the pool metadata, then metadata is silently stored on other disks in the pool.
+When special vdevs become too full (about 85%-90% usage), ZFS cannot run optimally and the disks operate slower.
+Try to keep special vdev usage under 65%-70% capacity whenever possible.
+Try to plan how much future data you want to add to the pool, as this increases the amount of metadata in the pool.
+More special vdevs can be added to a pool when more metadata storage is needed.
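+As a rough worked example using the figures above, a pool expected to hold 20 TB of well-deduplicated data might need on the order of 20-60 GB of RAM just for DDT and metadata caching, in addition to RAM for the rest of the ARC and for services.
+The commands below, with `[POOL_NAME]` as a placeholder, sketch how to estimate the DDT and how a byte value for the tunable might be calculated:
+
+```
+# Simulate deduplication on existing data and estimate the DDT size
+# (can take many hours on a large pool)
+zdb -U /data/zfs/zpool.cache -S [POOL_NAME]
+
+# Example value for the vfs.zfs.arc.meta_min LOADER tunable:
+# reserve 16 GiB for metadata = 16 * 1024 * 1024 * 1024 = 17179869184 bytes
+```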
 ### CPU
-Deduplication consumes extensive CPU resources and it is recommended to use a high-end CPU with *4-6* cores at minimum.
+Deduplication consumes extensive CPU resources and it is recommended to use a high-end CPU with at least 4-6 cores.

 ### Identifying Inadequate Hardware

@@ -71,24 +116,39 @@ If deduplication is used in an inadequately built system, these symptoms might b
 {{< tabs "Hardware Symptoms" >}}
 {{< tab "RAM Starvation" >}}
-* **Cause**: Continuous DDT access is limiting the available RAM or RAM usage is generally very high RAM usage. This can also slow memory access if the system uses swap space on disks to compensate.
-* **Diagnose**: Open the command line and enter `top`. The header indicates ARC and other memory usage statistics. Additional commands for investigate RAM or ARC usage performance: `arc_summary` and `arcstat`.
+* **Cause**: Continuous DDT access is limiting the available RAM or overall RAM usage is very high.
+  This can also slow memory access if the system uses swap space on disks to compensate.
+* **Diagnose**: Open the command line and enter `top`.
+  The header indicates ARC and other memory usage statistics. Additional commands to investigate RAM or ARC usage performance are `arc_summary` and `arcstat`.
 * **Solutions**:
   * Install more RAM.
-  * Add a new **System > Tunable**: *vfs.zfs.arc.meta_min* with *Type*=*LOADER* and *Value*=*bytes*. This specifies the minimum RAM that is reserved for metadata use and cannot be evicted from RAM when new file data is cached.
+  * Add a new **System > Tunable**: **vfs.zfs.arc.meta_min** with **Type**=**LOADER** and **Value**=**bytes**.
+    This specifies the minimum RAM that is reserved for metadata use and cannot be evicted from RAM when new file data is cached.
 {{< /tab >}}
 {{< tab "Disk I/O Slowdown" >}}
 * **Cause**: The system must perform disk I/O to fetch DDT entries, but these are usually 4K I/O and the underlying disk hardware is unable to cope in a timely manner.
-* **Diagnose**: Open the command line and enter `gstat` to show heavy I/O traffic for either DDT or a generic pool, although DDT traffic is more often the cause. `zpool iostat` is another option that can show unexpected or very high disk latencies. When networking slowdowns are also seen, `tcpdump` or an application's TCP monitor can also show a low or zero TCP window over an extended duration.
+* **Diagnose**: Open the command line and enter `gstat` to show heavy I/O traffic for either DDT or a generic pool, although DDT traffic is more often the cause.
+  `zpool iostat` is another option that can show unexpected or very high disk latencies. When networking slowdowns are also seen, `tcpdump` or a TCP monitor for an application can also show a low or zero TCP window over an extended duration.
 * **Solutions**: Add high quality SSDs as a special vdev and either move the data or rebuild the pool to use the new storage.
 {{< /tab >}}
 {{< tab "Unexpected Disconnections of Networked Resources" >}}
-* **Cause**: This is a byproduct of the *Disk I/O Slowdown* issue. Network buffers can become congested with incomplete demands for file data and the entire ZFS I/O system is delayed by tens or hundreds of seconds because huge amounts of DDT entries have to be fetched. Timeouts occur when networking buffers can no longer handle the demand. Because all services on a network connection share the same buffers, all become blocked. This is usually seen as file activity working for a while and then unexpectedly stalling. File and networked sessions then fail too. Services can become responsive when the disk I/O backlog clears, but this can take several minutes. This problem is more likely to be seen when high speed networking is used because the network buffers fill faster.
+* **Cause**: This is a byproduct of the disk I/O slowdown issue.
+  Network buffers can become congested with incomplete demands for file data and the entire ZFS I/O system is delayed by tens or hundreds of seconds because huge amounts of DDT entries have to be fetched. Timeouts occur when networking buffers can no longer handle the demand.
+  Because all services on a network connection share the same buffers, all become blocked.
+  This is usually seen as file activity working for a while and then unexpectedly stalling. File and networked sessions then fail too.
+  Services can become responsive when the disk I/O backlog clears, but this can take several minutes.
+  This problem is more likely when high speed networking is used because the network buffers fill faster.
 {{< /tab >}}
 {{< tab "CPU Starvation" >}}
-* **Cause**: When ZFS has fast special vdev SSDs, sufficient RAM, and is not limited by disk I/O, then hash calculation becomes the next bottleneck. Most of the ZFS CPU consumption is from attempting to keep hashing up to date with disk I/O. When the CPU is overburdened, the console becomes unresponsive and the web UI fails to connect. Other tasks might not run properly because of timeouts. This is often encountered with [pool scrubs]({{< relref "CORE/CORETutorials/Tasks/CreatingScrubTasks.md" >}}) and it can be necessary to pause the scrub temporarily when other tasks are a priority.
-* **Diagnose**: An easily seen symptom is that console logins or prompts take several seconds to display. Using `top` can confirm the issue. Generally, multiple entries with command *kernel {z_rd_int_[NUMBER]}* can be seen using the CPU capacity, and the CPU is heavily (98%+) used with almost no idle.
-* **Solutions**: Changing to a higher performance CPU can help but might have limited benefit. 40 core CPUs have been observed to struggle as much as 4 or 8 core CPUs. A usual workaround is to temporarily pause scrub and other background ZFS activities that generate large amounts of hashing. It can also be possible to limit I/O using tunables that control disk queues and disk I/O ceilings, but this can impact general performance and is not recommended.
+* **Cause**: When ZFS has fast special vdev SSDs, sufficient RAM, and is not limited by disk I/O, then hash calculation becomes the next bottleneck.
+  Most of the ZFS CPU consumption is from attempting to keep hashing up to date with disk I/O.
+  When the CPU is overburdened, the console becomes unresponsive and the web UI fails to connect. Other tasks might not run properly because of timeouts.
+  This is often encountered with [pool scrubs]({{< relref "CORE/CORETutorials/Tasks/CreatingScrubTasks.md" >}}) and it can be necessary to pause the scrub temporarily when other tasks are a priority.
+* **Diagnose**: An easily seen symptom is that console logins or prompts take several seconds to display. Using `top` can confirm the issue.
+  Generally, multiple entries with the command `kernel {z_rd_int_[NUMBER]}` can be seen consuming CPU capacity, and the CPU is heavily (98%+) used with almost no idle.
+* **Solutions**: Changing to a higher performance CPU can help but might have limited benefit. 40 core CPUs have been observed to struggle as much as 4 or 8 core CPUs.
+  A usual workaround is to temporarily pause scrub and other background ZFS activities that generate large amounts of hashing.
+  It can also be possible to limit I/O using tunables that control disk queues and disk I/O ceilings, but this can impact general performance and is not recommended.
 {{< /tab >}}
 {{< /tabs >}}
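+As a quick reference, commands along these lines (with `[POOL_NAME]` as a placeholder) cover the checks mentioned in the symptoms above; the next section describes each command in more detail:
+
+```
+top                           # overall CPU, RAM, and ARC usage
+arc_summary                   # detailed ARC and metadata caching statistics
+gstat                         # per-disk I/O load
+zpool iostat [POOL_NAME] 5    # pool I/O statistics every 5 seconds
+zpool status -Dv [POOL_NAME]  # DDT summary for the pool
+```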
@@ -96,7 +156,12 @@ If deduplication is used in an inadequately built system, these symptoms might b
 {{< tabs "Deduplication CLI Commands" >}}
 {{< tab "zpool status -D or -Dv" >}}
-Shows a summary of DDT statistics for each pool, or the specified pool. Typical output will include a line like this: dedup: DDT entries 227317061, size 672B on disk, 217B in core This means that the DDT contains 227 million blocks, and each block is using 672 bytes in the pool, and 217 bytes of RAM when cached in ARC. The two values differ because ZFS uses different structures for DDT entries on disk and in RAM. There is also a table, showing how many blocks (actual and referenced) are duplicated, summarized in bands (or "buckets") of powers of 2, and their average actual and referenced sizes.
+Shows a summary of DDT statistics for each pool, or the specified pool.
+Typical output includes a line like this:
+`dedup: DDT entries 227317061, size 672B on disk, 217B in core`
+This means that the DDT contains 227 million blocks, and each block is using 672 bytes in the pool, and 217 bytes of RAM when cached in ARC.
+The two values differ because ZFS uses different structures for DDT entries on disk and in RAM.
+There is also a table, showing how many blocks (actual and referenced) are duplicated, summarized in bands (or buckets) of powers of 2, and their average actual and referenced sizes.
 {{< /tab >}}
 {{< tab "zdb -U /data/zfs/zpool.cache -S [POOL_NAME]" >}}
 Estimates the outcome and DDT table size if a pool were entirely deduplicated. Warning: this can take many hours to complete. The output table is similar to that of `zpool status -Dv`.
@@ -105,7 +170,8 @@ These show core deduplication statistics for each pool.
 The `-v` option shows disk usage for each individual vdev, which helps confirm that DDT has not overflowed into other disks in the pool.
 {{< /tab >}}
 {{< tab "zpool iostat" >}}
-Provides detailed analysis and statistics for disk I/O latency. Healthy pool latencies are generally in the nanoseconds to tens of milliseconds range. If latencies in the seconds or tens of seconds are seen, this indicates a problem with disk usage. This means that certain disks are unable to service commands at the speed needed and there is a large command backlog.
+Provides detailed analysis and statistics for disk I/O latency. Healthy pool latencies are generally in the nanoseconds to tens of milliseconds range.
+If latencies in the seconds or tens of seconds are seen, this indicates a problem with disk usage. This means that certain disks are unable to service commands at the speed needed and there is a large command backlog.
 {{< /tab >}}
 {{< tab "top, top -mio, and gstat" >}}
 These commands monitor RAM, CPU, and disk I/O.
@@ -117,7 +183,11 @@ These utilities provide much more information about RAM and memory caching syste
 {{< expand "Hashing Note" "v" >}}
 Deduplication hashes (calculates a digital signature) for the data in each block to be written to disk and checking to see if data already exists in the pool with the same hash.
-When a block exists with the same hash, then the block is not written and a new pointer is written to the DDT and saving that space. Depending how the hash is calculated, there is a possibility that two different blocks could have the same hash and cause the file system to believe the blocks are the same. When choosing a hash, choose one that is complex, like *SHA 256*, *SHA 512*, and *Skein*, to minimize this risk. A *SHA 512* checksum hash is recommended for 64-bit platforms. To manually change at the time dedup is enabled on a pool, or any dataset/volume within a pool, use zfs set checksum=sha512 .
+When a block exists with the same hash, then the block is not written and a new pointer is written to the DDT instead, saving that space.
+Depending on how the hash is calculated, there is a possibility that two different blocks could have the same hash and cause the file system to believe the blocks are the same.
+When choosing a hash, choose one that is complex, like SHA-256, SHA-512, or Skein, to minimize this risk.
+A SHA-512 checksum hash is recommended for 64-bit platforms.
+To manually set the hash when deduplication is enabled on a pool, or on any dataset or zvol within a pool, use `zfs set checksum=sha512 [POOL_or_DATASET_NAME]`.
 {{< /expand >}}

 ## Additional Resources
diff --git a/content/References/ZFSPrimer.md b/content/References/ZFSPrimer.md
index 0964582f57..44beb4f185 100644
--- a/content/References/ZFSPrimer.md
+++ b/content/References/ZFSPrimer.md
@@ -135,7 +135,7 @@ In some cases, two separate pools might be more efficient: one on SSDs for activ
 After adding an L2ARC device, monitor its effectiveness using tools such as `arcstat`.
 To increase the size of an existing L2ARC, stripe another cache device with it. The web interface always stripes L2ARC instead of mirroring it since the system recreates L2ARC contents at boot.
-If an individual L2ARC pool SSD fails, it will not affect pool integrity, but it might impact read performance depending on the workload and the ratio of dataset size to cache size.
+If an individual L2ARC pool SSD fails, it does not affect pool integrity, but it might impact read performance depending on the workload and the ratio of dataset size to cache size.
 You cannot share dedicated L2ARC devices between ZFS pools.
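+For example, shell commands similar to the following add an SSD as an L2ARC (cache) device and then check how it is being used (`[POOL_NAME]` and `[SSD_DEVICE]` are placeholders):
+
+```
+# Add a cache (L2ARC) device to an existing pool
+zpool add [POOL_NAME] cache [SSD_DEVICE]
+
+# Watch per-vdev activity, including the cache device
+zpool iostat -v [POOL_NAME] 10
+
+# Review ARC caching statistics
+arcstat
+```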

 ### ZFS Redundancy and RAID
diff --git a/content/References/ZILandSLOG.md b/content/References/ZILandSLOG.md
index 5c013c128c..8dd4f7bb9f 100644
--- a/content/References/ZILandSLOG.md
+++ b/content/References/ZILandSLOG.md
@@ -1,18 +1,39 @@
 ---
 title: "ZFS ZIL and SLOG Demystified"
+description: "Provides clarification on ZFS, ZIL, and SLOG concepts."
 weight: 80
 ---

-The ZIL and SLOG are two of the most misunderstood concepts in ZFS and hopefully this will clear things up
+The ZIL and SLOG are two of the most misunderstood concepts in ZFS and hopefully this article helps clear things up.
 As you surely know by now, ZFS is taking extensive measures to safeguard your data and it should be no surprise that these two buzzwords represent key data safeguards. What is not obvious however is that they only come into play under very specific circumstances.

-**The first thing to understand is that ZFS behaves like any other file system with regard to asynchronous and synchronous writes:** When data is written to disk, it can either be buffered in RAM by the operating system’s kernel prior to being written to disk, or it can be immediately written to disk. The buffered asynchronous behavior is often used because of the perceived speed that it provides the user, while synchronous behavior is used for the integrity it guarantees. A synchronous write is only reported as successful to the application that requested it when the underlying disk has confirmed completion of it. Synchronous write behavior is determined by either the file being opened with the `O_SYNC` flag set by the application, or the underlying file systems being explicitly mounted in “synchronous” mode. Synchronous writes are desired for consistency-critical applications such as databases and some network protocols such as NFS but come at the cost of slower write performance. In the case of ZFS, the `sync=standard` property of a pool or dataset will provide POSIX-compatible “synchronous only if requested” write behavior while `sync=always` will force synchronous write behavior akin to a traditional file system being mounted in synchronous mode.
+**The first thing to understand is that ZFS behaves like any other file system with regard to asynchronous and synchronous writes.**

-**“Asynchronous unless requested otherwise” write behavior is taken for granted in modern computing with the caveat that buffered writes are simply lost in the case of a kernel panic or power loss.**Applications and file systems vary in how they handle such interruptions and ZFS fortunately guarantees that you can only lose the few seconds worth of writes that came after the last successful transaction group. Given the choice between the performance of asynchronous writes with the integrity of synchronous writes, a compromise is achieved with the ZFS Intent Log or “ZIL”. Think of the ZIL as the street-side mailbox of a large office: it is fast to use from the postal carrier’s perspective and is secure from the office’s perspective, but the mail in the mailbox is by no means sorted for its final destinations yet. When synchronous writes are requested, the ZIL is the short-term place on disk where the data lands prior to being formally spread across the pool for long-term storage at the configured level of redundancy. There are however two special cases when the ZIL is not used despite being requested: If large blocks are used or the `logbias=throughput` property is set.
+When data is written to disk, it can either be buffered in RAM by the operating system kernel prior to being written to disk, or it can be immediately written to disk.
+The buffered asynchronous behavior is often used because of the perceived speed that it provides the user, while synchronous behavior is used for the integrity it guarantees.
+A synchronous write is only reported as successful to the application that requested it when the underlying disk has confirmed its completion.
+Synchronous write behavior is determined by either the file being opened with the `O_SYNC` flag set by the application, or the underlying file system being explicitly mounted in *synchronous* mode.
+Synchronous writes are desired for consistency-critical applications such as databases and some network protocols such as NFS but come at the cost of slower write performance.
+In the case of ZFS, the `sync=standard` property of a pool or dataset provides POSIX-compatible synchronous-only-if-requested write behavior while `sync=always` forces synchronous write behavior akin to a traditional file system being mounted in synchronous mode.
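+For example, this behavior can be inspected and changed per dataset with commands similar to the following (`[POOL_NAME]` and `[DATASET_NAME]` are placeholders):
+
+```
+# Show the current write behavior for a dataset
+zfs get sync [POOL_NAME]/[DATASET_NAME]
+
+# Honor synchronous writes only when the application requests them (default)
+zfs set sync=standard [POOL_NAME]/[DATASET_NAME]
+
+# Treat every write as synchronous
+zfs set sync=always [POOL_NAME]/[DATASET_NAME]
+```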
-**By default, the short-term ZIL storage exists on the same hard disks as the long-term pool storage at the expense of all data being written to disk twice: once to the short-term ZIL and again across the long-term pool.** Because each disk can only perform one operation at a time, the performance penalty of this duplicated effort can be alleviated by sending the ZIL writes to a Separate ZFS Intent Log or “SLOG”, or simply “log”. While using a spinning hard disk as SLOG will yield performance benefits by reducing the duplicate writes to the same disks, it is a poor use of a hard drive given the small size but high frequency of the incoming data.
+**“Asynchronous unless requested otherwise” write behavior is taken for granted in modern computing with the caveat that buffered writes are simply lost in the case of a kernel panic or power loss.**

-**The optimal SLOG device is a small, flash-based device such an SSD or NVMe card, thanks to their inherent high-performance, low latency and of course persistence in case of power loss.** You can mirror your SLOG devices as an additional precaution and will be surprised what speed improvements can be gained from only a few gigabytes of separate log storage. Your storage pool will have the write performance of an all-flash array with the capacity of a traditional spinning disk array. This is why we ship every spinning-disk TrueNAS system with a high-performance flash SLOG and make them a standard option on our FreeNAS Certified line.
+Applications and file systems vary in how they handle such interruptions and ZFS fortunately guarantees that you can only lose the few seconds worth of writes that came after the last successful transaction group.
+Given the choice between the performance of asynchronous writes and the integrity of synchronous writes, a compromise is achieved with the ZFS Intent Log or ZIL.
+Think of the ZIL as the street-side mailbox of a large office: it is fast to use from the postal carrier's perspective and is secure from the office's perspective, but the mail in the mailbox is by no means sorted for its final destinations yet.
+When synchronous writes are requested, the ZIL is the short-term place on disk where the data lands prior to being formally spread across the pool for long-term storage at the configured level of redundancy.
+There are however two special cases when the ZIL is not used despite being requested: if large blocks are used or the `logbias=throughput` property is set.
+
+**By default, the short-term ZIL storage exists on the same hard disks as the long-term pool storage at the expense of all data being written to disk twice: once to the short-term ZIL and again across the long-term pool.**
+
+Because each disk can only perform one operation at a time, the performance penalty of this duplicated effort can be alleviated by sending the ZIL writes to a separate ZFS intent log (SLOG), sometimes simply called the log.
+While using a spinning hard disk as SLOG yields performance benefits by reducing the duplicate writes to the same disks, it is a poor use of a hard drive given the small size but high frequency of the incoming data.
+
+**The optimal SLOG device is a small, flash-based device such as an SSD or NVMe card, thanks to its inherent high performance, low latency, and of course persistence in case of power loss.**
+
+You can mirror your SLOG devices as an additional precaution and might be surprised by the speed improvements gained from only a few gigabytes of separate log storage.
+Your storage pool has the write performance of an all-flash array with the capacity of a traditional spinning disk array.
+This is why we ship every spinning-disk TrueNAS system with a high-performance flash SLOG and make them a standard option on our FreeNAS Certified line.
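+For example, a mirrored SLOG can be attached to an existing pool with commands similar to the following (`[POOL_NAME]`, `[SSD_1]`, and `[SSD_2]` are placeholders for the pool and the flash devices):
+
+```
+# Attach two flash devices as a mirrored log (SLOG) vdev
+zpool add [POOL_NAME] log mirror [SSD_1] [SSD_2]
+
+# Verify that the log vdev appears in the pool layout
+zpool status [POOL_NAME]
+```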
 Thank you **Matthew Ahrens** of the OpenZFS project for reviewing this article.
diff --git a/content/References/_index.md b/content/References/_index.md
index 8b406221bb..3496574aa8 100644
--- a/content/References/_index.md
+++ b/content/References/_index.md
@@ -16,6 +16,6 @@ related: false
 The TrueNAS **References** section includes additional information on various topics helpful for a TrueNAS user.

-{{< children depth="2" >}}
+{{< children depth="2" description="true" >}}

 If you are searching for additional information on a topic and are unable to find it, you can try searching the [TrueNAS Community Forum](https://www.truenas.com/community/).