block: Adding ROW scheduling algorithm

This patch adds the implementation of a new scheduling algorithm - ROW. The policy of this algorithm is to prioritize READ requests over WRITE as much as possible without starving the WRITE requests. The requests are kept in queues according to their priority. The dispatch is done in a Round Robin manner with a different slice for each queue. READ request queues get bigger dispatch quantum than the write requests. Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org> Conflicts: block/Kconfig.iosched block/Makefile + ...
voku · Jan 1, 2013 · 2f0f5cf · 2f0f5cf
1 parent 4543840
commit 2f0f5cf
Show file tree

Hide file tree

Showing 4 changed files with 850 additions and 0 deletions.
diff --git a/Documentation/block/row-iosched.txt b/Documentation/block/row-iosched.txt
@@ -0,0 +1,134 @@
+Introduction
+============
+
+The ROW scheduling algorithm will be used in mobile devices as default
+block layer IO scheduling algorithm. ROW stands for "READ Over WRITE"
+which is the main requests dispatch policy of this algorithm.
+
+The ROW IO scheduler was developed with the mobile devices needs in
+mind. In mobile devices we favor user experience upon everything else,
+thus we want to give READ IO requests as much priority as possible.
+The main idea of the ROW scheduling policy is just that:
+- If there are READ requests in pipe - dispatch them, while write
+starvation is considered.
+
+Software description
+====================
+The elevator defines a registering mechanism for different IO scheduler
+to implement. This makes implementing a new algorithm quite straight
+forward and requires almost no changes to block/elevator framework. A
+new IO scheduler just has to implement a set of callback functions
+defined by the elevator.
+These callbacks cover all the required IO operations such as
+adding/removing request to/from the scheduler, merging two requests,
+dispatching a request etc.
+
+Design
+======
+
+The requests are kept in queues according to their priority. The
+dispatching of requests is done in a Round Robin manner with a
+different slice for each queue. The dispatch quantum for a specific
+queue is set according to the queues priority. READ queues are
+given bigger dispatch quantum than the WRITE queues, within a dispatch
+cycle.
+
+At the moment there are 6 types of queues the requests are
+distributed to:
+-	High priority READ queue
+-	High priority Synchronous WRITE queue
+-	Regular priority READ queue
+-	Regular priority Synchronous WRITE queue
+-	Regular priority WRITE queue
+-	Low priority READ queue
+
+The marking of request as high/low priority will be done by the
+application adding the request and not the scheduler. See TODO section.
+If the request is not marked in any way (high/low) the scheduler
+assigns it to one of the regular priority queues:
+read/write/sync write.
+
+If in a certain dispatch cycle one of the queues was empty and didn't
+use its quantum that queue will be marked as "un-served". If we're in
+a middle of a dispatch cycle dispatching from queue Y and a request
+arrives for queue X that was un-served in the previous cycle, if X's
+priority is higher than Y's, queue X will be preempted in the favor of
+queue Y.
+
+For READ request queues ROW IO scheduler allows idling within a
+dispatch quantum in order to give the application a chance to insert
+more requests. Idling means adding some extra time for serving a
+certain queue even if the queue is empty. The idling is enabled if
+the ROW IO scheduler identifies the application is inserting requests
+in a high frequency.
+Not all queues can idle. ROW scheduler exposes an enablement struct
+for idling.
+For idling on READ queues, the ROW IO scheduler uses timer mechanism.
+When the timer expires we schedule a delayed work that will signal the
+device driver to fetch another request for dispatch.
+
+ROW scheduler will support additional services for block devices that
+supports Urgent Requests. That is, the scheduler may inform the
+device driver upon urgent requests using a newly defined callback.
+In addition it will support rescheduling of requests that were
+interrupted. For example if the device driver issues a long write
+request and a sudden urgent request is received by the scheduler.
+The scheduler will inform the device driver about the urgent request,
+so the device driver can stop the current write request and serve the
+urgent request. In such a case the device driver may also insert back
+to the scheduler the remainder of the interrupted write request, such
+that the scheduler may continue sending urgent requests without the
+need to interrupt the ongoing write again and again. The write
+remainder will be sent later on according to the scheduler policy.
+
+SMP/multi-core
+==============
+At the moment the code is accessed from 2 contexts:
+- Application context (from block/elevator layer): adding the requests.
+- device driver thread: dispatching the requests and notifying on
+  completion.
+
+One lock is used to synchronize between the two. This lock is provided
+by the block device driver along with the dispatch queue.
+
+Config options
+==============
+1. hp_read_quantum: dispatch quantum for the high priority READ queue
+   (default is 100 requests)
+2. rp_read_quantum: dispatch quantum for the regular priority READ
+   queue (default is 100 requests)
+3. hp_swrite_quantum: dispatch quantum for the high priority
+   Synchronous WRITE queue (default is 2 requests)
+4. rp_swrite_quantum: dispatch quantum for the regular priority
+   Synchronous WRITE queue (default is 1 requests)
+5. rp_write_quantum: dispatch quantum for the regular priority WRITE
+   queue (default is 1 requests)
+6. lp_read_quantum: dispatch quantum for the low priority READ queue
+   (default is 1 requests)
+7. lp_swrite_quantum: dispatch quantum for the low priority Synchronous
+   WRITE queue (default is 1 requests)
+8. read_idle: how long to idle on read queue in Msec (in case idling
+   is enabled on that queue). (default is 5 Msec)
+9. read_idle_freq: frequency of inserting READ requests that will
+   trigger idling. This is the time in Msec between inserting two READ
+   requests. (default is 8 Msec)
+
+Note: Dispatch quantum is number of requests that will be dispatched
+from a certain queue in a dispatch cycle.
+
+To do
+=====
+The ROW algorithm takes the scheduling policy one step further, making
+it a bit more "user-needs oriented", by allowing the application to
+hint on the urgency of its requests. For example: even among the READ
+requests several requests may be more urgent for completion than other.
+The former will go to the High priority READ queue, that is given the
+bigger dispatch quantum than any other queue.
+
+Still need to design the way applications will "hint" on the urgency of
+their requests. May be done by ioctl(). We need to look into concrete
+use-cases in order to determine the best solution for this.
+This will be implemented as a second phase.
+
+Design and implement additional services for block devices that
+supports High Priority Requests.
diff --git a/block/Kconfig.iosched b/block/Kconfig.iosched
@@ -64,6 +64,16 @@ config IOSCHED_SIO
 	  basic merging, trying to keep a minimum overhead. It is aimed
 	  mainly for aleatory access devices (eg: flash devices).
 
+config IOSCHED_ROW
+	tristate "ROW I/O scheduler"
+	---help---
+	  The ROW I/O scheduler gives priority to READ requests over the
+	  WRITE requests when dispatching, without starving WRITE requests.
+	  Requests are kept in priority queues. Dispatching is done in a RR
+	  manner when the dispatch quantum for each queue is calculated
+	  according to queue priority.
+	  Most suitable for mobile devices.
+
 config IOSCHED_ZEN
 	tristate "Zen I/O scheduler"
 	default y
@@ -109,6 +119,16 @@ choice
 	config DEFAULT_SIO
 		bool "SIO" if IOSCHED_SIO=y
 
+	config DEFAULT_ROW
+		bool "ROW" if IOSCHED_ROW=y
+		help
+		  The ROW I/O scheduler gives priority to READ requests
+		  over the WRITE requests when dispatching, without starving
+		  WRITE requests. Requests are kept in priority queues.
+		  Dispatching is done in a RR manner when the dispatch quantum
+		  for each queue is defined according to queue priority.
+		  Most suitable for mobile devices.
+
 	config DEFAULT_VR
 		bool "V(R)" if IOSCHED_VR=y
 
@@ -130,6 +150,7 @@ config DEFAULT_IOSCHED
 	string
 	default "deadline" if DEFAULT_DEADLINE
 	default "cfq" if DEFAULT_CFQ
+	default "row" if DEFAULT_ROW
 	default "fiops" if DEFAULT_FIOPS
 	default "sio" if DEFAULT_SIO
 	default "vr" if DEFAULT_VR

diff --git a/block/Makefile b/block/Makefile
@@ -16,6 +16,7 @@ obj-$(CONFIG_IOSCHED_DEADLINE)	+= deadline-iosched.o
 obj-$(CONFIG_IOSCHED_CFQ)	+= cfq-iosched.o
 obj-$(CONFIG_IOSCHED_FIOPS)	+= fiops-iosched.o
 obj-$(CONFIG_IOSCHED_SIO)       += sio-iosched.o
+obj-$(CONFIG_IOSCHED_ROW)	+= row-iosched.o
 obj-$(CONFIG_IOSCHED_VR)        += vr-iosched.o
 obj-$(CONFIG_IOSCHED_BFQ)	+= bfq-iosched.o
 obj-$(CONFIG_IOSCHED_ZEN)	+= zen-iosched.o