01_About_ESOS

Marc A. Smith edited this page Mar 1, 2017 · 9 revisions

Introduction

ESOS started with an internal need for a fully functioning SCST / Linux system that could easily be deployed to new servers. We also wanted this Linux "distribution" to be fully optimized for SCST and include necessary RAID controller tools/utilities so new volumes could easily be provisioned/modified from inside the OS.

After some time passed and our comfort level with ESOS/SCST grew, we realized the next logical step: highly available storage clusters. We pumped a number of new features (software projects) into ESOS, including, but not limited to, DRBD, LVM2, Linux software RAID (md), mhVTL, and a full-featured cluster stack (Pacemaker + Corosync).


Included Projects

ESOS uses the software projects listed below. For an up-to-date list, including the specific version of each package (versions vary by release), check one of the CHECKSUM files in the source tree.

  • Linux kernel
  • SCST
  • BusyBox
  • GRUB
  • SysVinit
  • GLIBC
  • vixie-cron
  • libibumad
  • libibverbs
  • srptools
  • OpenSSH
  • sSMTP
  • Perl
  • OpenSSL
  • e2fsprogs
  • zlib
  • lsscsi
  • sg3_utils
  • groff
  • ncurses
  • kexec-tools
  • GCC (only libstdc++ and libgcc are installed to image)
  • iniparser
  • CDK
  • DRBD
  • LVM2
  • QLogic Fibre Channel Binary Firmware
  • xfsprogs
  • mdadm
  • GNU Parted
  • libqb
  • Pacemaker
  • Corosync
  • nss
  • glib
  • libxml2
  • libxslt
  • libtool
  • bzip2
  • Python
  • crmsh
  • libaio
  • glue
  • readline
  • resource-agents
  • mhVTL
  • lzo
  • mhash
  • lessfs
  • tokyocabinet
  • FUSE
  • Berkeley DB
  • Google's Snappy
  • fence-agents
  • OpenSM
  • pycurl
  • curl
  • Net-Telnet (Perl module)
  • python-suds (Python module)
  • setuptools (Python module)
  • pexpect (Python module)
  • GNU Bash
  • Open-FCoE
  • Open-LLDP
  • libconfig
  • libnl
  • libpciaccess
  • Linux Firmware (package of binary blobs)
  • dlm
  • sysklogd
  • ipmitool
  • less
  • fio
  • mtx
  • mt-st
  • nagios-plugins
  • libedit
  • libmcrypt
  • nsca
  • btrfs-progs
  • attr
  • acl
  • htop
  • dmidecode
  • xmlrpc-c
  • stunnel
  • sudo
  • rsync
  • eudev
  • gperf
  • multipath-tools
  • archivemount
  • libarchive
  • mcelog
  • edac-utils
  • sysfsutils
  • sqlite-autoconf
  • iperf2
  • iperf3
  • libpcap
  • iftop
  • nmon
  • munin-c
  • open-iscsi
  • open-isns
  • coreutils
  • open-vm-tools
  • libdnet
  • libffi
  • ptyprocess
  • requests
  • openwsman
  • swig
  • userspace-rcu
  • docutils
  • tzdata
  • nvme-cli
  • freetype
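
The CHECKSUM files mentioned above can also be read programmatically to see what a given release ships. The snippet below is only a sketch: it assumes each line has the common `<digest>  <file name>` layout produced by tools such as md5sum or sha256sum, which this page does not confirm, and `list_checksum_entries` is a hypothetical helper name, not part of ESOS.

```python
def list_checksum_entries(text):
    """Return (file_name, digest) pairs from checksum-style text, sorted by file name.

    Assumption: one "<digest>  <file name>" entry per line; blank lines and
    lines starting with '#' are skipped.
    """
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split at the first space: digest on the left, file name on the right.
        digest, _, name = line.partition(" ")
        # Some checksum tools prefix the file name with '*' (binary mode).
        name = name.strip().lstrip("*")
        if name:
            entries.append((name, digest))
    return sorted(entries)
```

Since source archives usually carry the version in their file name, printing the first element of each pair is often enough to compare package versions between two releases.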

Several proprietary tools are optional and can be downloaded and included at install time:

  • storcli64 (for Broadcom/Avago/LSI MegaRAID controllers)
  • perccli64 (for Dell PERC RAID controllers)
  • arcconf (for Microsemi/Adaptec AACRAID controllers)
  • hpacucli (for HP Smart Array controllers)
  • hpssacli (for HP Gen8+ Smart Array controllers)
  • cli64 (for Areca RAID controllers)
  • tw_cli.x86_64 (for 3ware SATA/SAS RAID controllers)
  • MegaCli64 (supports MegaRAID controllers for special use, deprecated in the TUI)

How It Works

ESOS boots from a USB flash drive; all of the binaries, files, and directories are loaded into memory on boot. If the USB flash drive fails, the system keeps running normally until the failed drive can be replaced. Because ESOS is memory resident (volatile), configuration files and settings are synced to a file system on the USB flash drive. Log files are also archived to the USB drive on shutdown/restart or if the file system grows too large. This design also provides an easy and reversible upgrade procedure: you create a new, updated ESOS USB flash drive, copy your configuration to it, and boot the new drive. If you experience an issue with the new version, you can always boot back into your previous ESOS USB drive.
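
The configuration sync described above can be pictured as "copy whatever changed in memory back to persistent storage". The following is a minimal sketch of that idea, not the actual ESOS sync mechanism: the function name `sync_config`, its parameters, and the directory layout are all hypothetical.

```python
import filecmp
import shutil
from pathlib import Path


def sync_config(live_root, persist_root, rel_paths):
    """Copy config files that are new or changed; return the relative paths copied.

    live_root    -- root of the in-memory file system (hypothetical layout)
    persist_root -- mount point of the persistent copy, e.g. on the USB drive
    rel_paths    -- config files to track, relative to both roots
    """
    live_root, persist_root = Path(live_root), Path(persist_root)
    copied = []
    for rel in rel_paths:
        src = live_root / rel
        dst = persist_root / rel
        if not src.is_file():
            continue  # nothing to persist for this entry
        # shallow=False compares file contents instead of trusting stat data.
        if dst.is_file() and filecmp.cmp(src, dst, shallow=False):
            continue  # the persistent copy is already current
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also preserves mode and timestamps
        copied.append(rel)
    return copied
```

Copying only files that differ keeps writes to the flash drive to a minimum, which is one plausible reason to sync selectively rather than rewrite everything on each pass.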

Here is a high-level step-through of the ESOS boot process:

  1. The ESOS USB flash drive is used as the BIOS boot device.
  2. GRUB is loaded; the user can select between the ESOS 'Production' and 'Debug' kernel/modules.
  3. The selected kernel (and initramfs image) is loaded; the initramfs init script then takes care of various preparation tasks, initializes a tmpfs file system (RAM), and extracts the root image into the newly created tmpfs file system.
  4. Control is then passed to init and the rc/init scripts are executed.
  5. The running ESOS configuration is synchronized with the USB flash drive.
  6. Various daemons (sshd, crond, etc.) are started, and the HBA/HCA/CNA and SCST modules are loaded.
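
Step 3 is the part that makes ESOS memory resident, so it may help to see it in miniature. This is an illustrative sketch only: the real work is done by the initramfs init script, a plain directory stands in here for the freshly mounted tmpfs, and the root image is assumed to be a tar archive (the actual image format is not stated on this page). `extract_root_image` is a hypothetical name.

```python
import tarfile
from pathlib import Path


def extract_root_image(image_path, new_root):
    """Unpack a root image archive into new_root; return its top-level entries.

    new_root stands in for the mount point of the newly created tmpfs.
    """
    new_root = Path(new_root)
    new_root.mkdir(parents=True, exist_ok=True)
    # Newer Python versions expect an explicit extraction filter; the image
    # ships on our own boot drive, so it is treated as trusted here.
    kwargs = {"filter": "fully_trusted"} if hasattr(tarfile, "fully_trusted_filter") else {}
    # "r:*" lets tarfile detect the compression (gzip, bzip2, xz) by itself.
    with tarfile.open(image_path, "r:*") as archive:
        archive.extractall(new_root, **kwargs)
    return sorted(entry.name for entry in new_root.iterdir())
```

Once the image is unpacked, everything under the new root lives in RAM, which is why the system can keep running if the USB flash drive later fails.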

ESOS uses kexec to handle the hopefully rare case of a kernel panic. At boot, a crash dump kernel is loaded. If/when the kernel panics, the system boots into the crash dump kernel. The initramfs init script is run again and looks for the '/proc/vmcore' file, whose presence indicates that the crash dump kernel is running because of a kernel panic. The vmcore file is then compressed and saved onto the "esos_logs" filesystem. Finally, the system does a full reboot and boots back into the normal/production ESOS kernel. The start-up scripts look for any saved vmcore files and will email an alert. This is all fully automated: the idea is to save the kernel panic information for later diagnosis and get your ESOS storage server back into production mode as quickly as possible.
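
The "check for vmcore, compress, save" portion of this flow can be sketched in a few lines. This is a hedged illustration, not the ESOS init script: the function name `save_vmcore`, the default `/mnt/logs` path, the gzip format, and the time-stamped file name are all assumptions made for the example.

```python
import gzip
import shutil
import time
from pathlib import Path


def save_vmcore(vmcore="/proc/vmcore", logs_dir="/mnt/logs"):
    """Compress a crash dump into logs_dir; return the new path, or None.

    Returns None when no vmcore file exists, i.e. this is a normal boot.
    logs_dir is a hypothetical mount point for the "esos_logs" filesystem.
    """
    vmcore = Path(vmcore)
    if not vmcore.exists():
        return None  # no crash dump present, nothing to save
    logs_dir = Path(logs_dir)
    logs_dir.mkdir(parents=True, exist_ok=True)
    # A time stamp in the name keeps dumps from repeated panics apart.
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = logs_dir / f"vmcore-{stamp}.gz"
    # Stream the dump through gzip so it is never held in memory at once.
    with open(vmcore, "rb") as src, gzip.open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target
```

In the flow described above, a non-None result would be followed by the full reboot into the production kernel, and the start-up scripts would later find the saved file and send the email alert.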