Permalink
Browse files

Added correct copyright notice

  • Loading branch information...
1 parent 3fbcf63 commit b98903038ed7c996954ebf589bfe7da191e8075c @bbulkow bbulkow committed Nov 29, 2012
Showing with 4,394 additions and 2 deletions.
  1. +37 −0 Makefile
  2. +36 −0 Makesalt
  3. +255 −0 README
  4. +0 −2 README.md
  5. +1,164 −0 act.c
  6. +30 −0 actconfig.txt
  7. +388 −0 atomic.h
  8. +49 −0 clock.c
  9. +66 −0 clock.h
  10. +172 −0 histogram.c
  11. +43 −0 histogram.h
  12. +371 −0 latency_calc/act_latency.py
  13. +280 −0 latency_calc/cf_latency.py
  14. +85 −0 latency_calc/cf_latency.sh
  15. +4 −0 latency_calc/get_latency
  16. +737 −0 queue.c
  17. +142 −0 queue.h
  18. +6 −0 runact
  19. +529 −0 salt.c
View
@@ -0,0 +1,37 @@
+
+DIR_INCLUDE = .
+DIR_OBJECT = .
+DIR_TARGET = .
+
+HEADERS = atomic.h clock.h histogram.h queue.h
+SOURCES = act.c clock.c histogram.c queue.c
+TARGET = act
+
+CC = gcc
+#CFLAGS = -g -O3 -fno-common -std=gnu99 -Wall -D_REENTRANT
+CFLAGS = -g -fno-common -std=gnu99 -Wall -D_REENTRANT -D_FILE_OFFSET_BITS=64
+CFLAGS += -MMD
+LDFLAGS = $(CFLAGS)
+INCLUDES = $(DIR_INCLUDE:%=-I%) -I/usr/include
+LIBRARIES = -lpthread -lrt -lcrypto -lssl
+
+OBJECTS = $(SOURCES:%.c=$(DIR_OBJECT)/%.o)
+DEPENDENCIES = $(OBJECTS:%.o=%.d)
+
+default: $(TARGET)
+
+.PHONY: clean
+clean:
+ /bin/rm -f $(OBJECTS) $(DEPENDENCIES) $(TARGET)
+
+.PHONY: $(TARGET)
+$(TARGET): $(OBJECTS)
+ echo "Linking $@"
+ $(CC) $(LDFLAGS) -o $(TARGET) $(OBJECTS) $(LIBRARIES)
+
+-include $(DEPENDENCIES)
+
+$(DIR_OBJECT)/%.o: %.c
+ echo "Building $@"
+ $(CC) $(CFLAGS) -o $@ -c $(INCLUDES) $<
+
View
@@ -0,0 +1,36 @@
+
+DIR_INCLUDE = .
+DIR_OBJECT = .
+DIR_TARGET = .
+
+SOURCES = salt.c
+TARGET = salt
+
+CC = gcc
+#CFLAGS = -g -O3 -fno-common -std=gnu99 -Wall -D_REENTRANT
+CFLAGS = -g -fno-common -std=gnu99 -Wall -D_REENTRANT -D_FILE_OFFSET_BITS=64
+CFLAGS += -MMD
+LDFLAGS = $(CFLAGS)
+INCLUDES = $(DIR_INCLUDE:%=-I%) -I/usr/include
+LIBRARIES = -lpthread -lrt -lssl
+
+OBJECTS = $(SOURCES:%.c=$(DIR_OBJECT)/%.o)
+DEPENDENCIES = $(OBJECTS:%.o=%.d)
+
+default: $(TARGET)
+
+.PHONY: clean
+clean:
+ /bin/rm -f $(OBJECTS) $(DEPENDENCIES) $(TARGET)
+
+.PHONY: $(TARGET)
+$(TARGET): $(OBJECTS)
+ echo "Linking $@"
+ $(CC) $(LDFLAGS) -o $(TARGET) $(OBJECTS) $(LIBRARIES)
+
+-include $(DEPENDENCIES)
+
+$(DIR_OBJECT)/%.o: %.c
+ echo "Building $@"
+ $(CC) $(CFLAGS) -o $@ -c $(INCLUDES) $<
+
View
255 README
@@ -0,0 +1,255 @@
+Test Process Overview
+---------------------
+
+1. Clean and initialize the storage device(s).
+
+2. Run the act executable.
+
+3. Analyze act's output using the act_latency.py script.
+
+
+Caution
+-------
+
+THE TESTS DESTROY ALL DATA ON THE TEST DEVICES!
+
+When cleaning, initializing, and running tests, make sure the devices are
+specified by name correctly.
+
+Also make sure that the test devices are not mounted.
+
+Run the command:
+ $ mount
+and examine the result. e.g. the result:
+ /dev/sda1 on /boot type ext3 (rw)
+implies device /dev/sda1 is mounted.
+
+Also run the command:
+ $ sudo /sbin/pvscan
+and examine the result. e.g. the result:
+ PV /dev/sda2 VG VolGroup00 lvm2 [19.88 GB / 0 free]
+implies device /dev/sda2 is mounted.
+
+Unmount any intended test devices that are mounted.
+
+
+Cleaning and Initializing Devices
+---------------------------------
+
+For consistency, and to obtain test results that model the long-time
+equilibrium condition expected in Aerospike production servers, it is best to
+prepare storage devices by first cleaning them (writing zeros everywhere) and
+then "salting" them (writing random data everywhere).
+
+This package contains actprep, an executable that may be used to clean and salt
+a device. actprep takes a device name as its only command-line parameter. For
+a typical 240GB SSD, actprep takes a little over an hour to run.
+
+Example - to clean and salt device /dev/sdc:
+ $ sudo ./actprep /dev/sdc
+
+
+act Overview
+------------
+
+act is a program for testing storage device IO. Its primary purpose is to
+measure the latency of small read transactions while modeling the Aerospike
+server's device IO pattern as closely as practical.
+
+These IO patterns will be very similar to many databases focused on real-time
+performance. Databases of this type will constantly read and write from
+disk, without allowing the drive time to rest. This tool very easily shows
+latency responses under write load.
+
+Three types of IO operations occur during a test run:
+
+1. Small (~2 Kbyte) read operations, typically several thousand per second.
+
+2. Large-block (~128 Kbyte) read operations, typically a few tens per second.
+
+3. Large-block write operations, same size and rate as large-block reads.
+
+The small read operations model client transaction requests. They occur at a
+specified rate. Requests are added at this rate to a specified number of
+read transaction queues, each of which is serviced by a specified number of
+threads.
+
+The large-block read and write operations model the Aerospike server's
+defragmentation process. They occur at a specified rate, executed from one
+dedicated large-block read thread and one dedicated large-block write thread per
+device.
+
+
+Using act
+---------
+
+Necessary files: act (the executable), plus a configuration text file.
+
+For ease of use, this package includes five example configuration files:
+
+ - actconfig_1x.txt - run a normal load test on one device
+
+ - actconfig_3x.txt - run a 3 times normal load test on one device
+
+ - actconfig_6x.txt - run a 6 times normal load test on one device
+
+ - actconfig_1x_2d.txt - run a normal load test on two devices at a time
+
+ - actconfig_1x_4d.txt - run a normal load test test on four devices at a time
+
+These configuration files must be modified to make sure the device-names field
+(see below) specifies exactly the device(s) to be tested.
+
+The other fields in the configuration files should not be changed without good
+reasons. As they are, the files specify 24-hour tests with IO patterns and
+loads very similar to Aerospike production servers.
+
+Usage example:
+ $ sudo ./act actconfig.txt > ouput.txt
+
+act outputs to stdout, so for normal (long-duration) tests, pipe to an output
+file as above. This will be necessary to run the act_latency.py script to
+analyze the output.
+
+If running act from a remote terminal, it is best to run it as a background
+process, or within a "screen". To verify that act is running, tail the output
+text file with the -f option.
+
+Note that if the drive(s) being tested perform so badly that act's internal
+transaction queues become extremely backed-up, act will halt before the
+configured test duration has elapsed. act may also halt prematurely if it
+encounters unexpected drive I/O or system errors.
+
+
+act Configuration File
+----------------------
+
+All fields use a "name-token: value" format, and must be on a single line.
+Field order in the file is unimportant. Integer values must be in decimal. To
+add comments, use '#' at the beginning of a line. The fields are:
+
+device-names
+The value is a comma-separated list of device names (full path), such as
+/dev/sdb. Make absolutely sure the devices named are exactly the devices to be
+used in the test.
+
+queue-per-device
+The value is either yes or no. If the field is left out, the default is no.
+This flag determines act's internal read transaction queue setup -- yes means
+each device is read by a single dedicated read transaction queue, no means each
+device is read by all read transaction queues.
+
+num-queues
+The value is a non-zero integer. This is the total number of read transaction
+queues. However if queue-per-device is set to yes, this field is ignored,
+since in this case the number of queues is determined by the number of devices.
+
+threads-per-queue
+The value is a non-zero integer. This is the number of threads per read
+transaction queue that execute the read transactions.
+
+test-duration-sec
+The value is a non-zero integer. This is the duration of the test, in seconds.
+Note that it has to be a single number, e.g. use 86400, not 60*60*24.
+
+report-interval-sec
+The value is a non-zero integer. This is the interval between metric reports,
+in seconds.
+
+read-reqs-per-sec
+The value is a non-zero integer. This is the total read transaction rate. Note
+that it is not per device, or per read transaction queue.
+
+large-block-ops-per-sec
+The value is a non-zero integer. This is the total rate used for both
+large-block write and large-block read operations. Note that it is not per
+device.
+
+read-req-num-512-blocks
+The value is a non-zero integer. This is the size read in each read
+transaction, in 512-byte blocks, e.g. for 1.5-Kbyte reads, use 3.
+
+large-block-op-kbytes
+The value is a non-zero integer. This is the size written and read in each
+large-block write and large-block read operation respectively, in Kbytes.
+
+use-valloc
+The value is either yes or no. If the field is left out, the default is no.
+This flag determines act's memory allocation mechanism for read transaction
+buffers -- yes means a system memory allocation call is used, no means dynamic
+stack allocation is used.
+
+num-write-buffers
+The value is an integer. If the field is left out, the default is 0. This is
+the number of different large blocks of random data we choose from when doing a
+large-block write operation -- 0 will cause all zeros to be written every time.
+
+scheduler-mode
+The value is either noop or cfq. If the field is left out, the default is noop.
+This sets the mode in /sys/block/<device>/queue/scheduler for all the devices in
+the test run -- noop means no special scheduling is done for device IO
+operations, cfq means operations may be reordered to optimize for physical
+constraints imposed by rotating disc drives (which likely means it hurts
+performance for ssds).
+
+
+Analyzing act Output
+--------------------
+
+Run act_latency.py to process a act output file and tabulate data about
+"latencies" (small read transactions that took longer than usual).
+
+Example usage:
+ $ ./act_latency.py -l output.txt
+
+act_latency.py command-line parameters:
+
+ -l <act output file name>
+
+ -t <analysis slice interval in seconds> (default is 3600)
+
+(There are two other optional parameters for more advanced use, to control which
+latency thresholds are displayed.)
+
+The script will analyze the act output in time slices as specified, and display
+latency data above various thresholds for each slice. The script output will
+show latencies both for end-to-end transactions (which include time spent on the
+transaction queues) and for the device IO portion of transactions.
+
+Example act_latency.py output (for a act output file yielding 12 slices):
+
+ trans device
+ %>(ms) %>(ms)
+slice 1 8 64 1 8 64
+----- ------ ------ ------ ------ ------ ------
+ 1 1.67 0.00 0.00 1.63 0.00 0.00
+ 2 1.38 0.00 0.00 1.32 0.00 0.00
+ 3 1.80 0.14 0.00 1.56 0.08 0.00
+ 4 1.43 0.00 0.00 1.39 0.00 0.00
+ 5 1.68 0.00 0.00 1.65 0.00 0.00
+ 6 1.37 0.00 0.00 1.33 0.00 0.00
+ 7 1.44 0.00 0.00 1.41 0.00 0.00
+ 8 1.41 0.00 0.00 1.35 0.00 0.00
+ 9 2.70 0.73 0.00 1.91 0.08 0.00
+ 10 1.54 0.00 0.00 1.51 0.00 0.00
+ 11 1.53 0.00 0.00 1.48 0.00 0.00
+ 12 1.47 0.00 0.00 1.43 0.00 0.00
+----- ------ ------ ------ ------ ------ ------
+ avg 1.62 0.07 0.00 1.50 0.01 0.00
+ max 2.70 0.73 0.00 1.91 0.08 0.00
+
+
+Device Pass/Fail Criteria
+-------------------------
+
+To deploy a device in production, Aerospike expects it to be able to perform
+consistently as follows:
+
+In any one-hour period, we must find that:
+
+ - fewer than 5% of transactions exceed 1 ms
+ - fewer than 1% of transactions exceed 8 ms
+ - fewer than 0.1% of transactions exceed 64 ms
+
+A device which does not violate these thresholds for 48 hours is considered
+production-worthy.
View
@@ -1,2 +0,0 @@
-act
-===
Oops, something went wrong.

0 comments on commit b989030

Please sign in to comment.