/
skiplists.tex
520 lines (438 loc) · 22.1 KB
/
skiplists.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
\chapter{Skiplists}
\chaplabel{skiplists}
In this chapter, we discuss a beautiful data structure: the skiplist,
which has a variety of applications. Using a skiplist we can implement
a #List# that is fast for all the operations #get(i)#, #set(i,x)#,
#add(i,x)#, and #remove(i)#. We can also implement an #SSet# in which
all operations run in $O(\log #n#)$ expected time.
%Finally, a skiplist can be used to implement a #Rope# in which all
%operations run in $O(\log #n#)$ time.
Skiplists rely on \emph{randomization} for their efficiency.
In particular, a skiplist uses random coin tosses when an element is
inserted to determine the height of that element. The performance
of skiplists is expressed in terms of \emph{expected} running times
and lengths of paths. This expectation is taken over the random coin
tosses used by the skiplist. In the implementation, the random coin
tosses used by a skiplist are simulated using a pseudo-random number
(or bit) generator.
\section{The Basic Structure}
Conceptually, a skiplist is a sequence of singly-linked lists
$L_0,\ldots,L_h$, where each $L_r$ contains a subset of the items
in $L_{r-1}$. We start with the input list $L_0$ that contains #n#
items and construct $L_1$ from $L_0$, $L_2$ from $L_1$, and so on.
The items in $L_r$ are obtained by tossing a coin for each element, #x#,
in $L_{r-1}$ and including #x# in $L_r$ if the coin comes up heads.
This process ends when we create a list $L_r$ that is empty. An example
of a skiplist is shown in \figref{skiplist}.
\begin{figure}
\begin{center}
\includegraphics{figs/skiplist}
\end{center}
\caption{A skiplist containing seven elements.}
\figlabel{skiplist}
\end{figure}
For an element, #x#, in a skiplist, we call the \emph{height} of #x# the
largest value $r$ such that #x# appears in $L_r$. Thus, for example,
elements that only appear in $L_0$ have height $0$. Notice that the
height of #x# corresponds to the following experiment: Toss a coin
repeatedly until the first time it comes up tails. How many times did it
come up heads? The answer, not surprisingly, is that the expected height
of a node is 1. (We expect to toss the coin twice before getting tails,
but we don't count the last toss.) The \emph{height} of a skiplist is
the height of its tallest node.
At the head of every list is a special node, called the \emph{sentinel},
that acts as a dummy node for the list. The key property of skiplists
is that there is a short path, called the \emph{search path}, from the
sentinel in $L_h$ to every node in $L_0$. Remembering how to construct
a search path for a node, #u#, is easy (see \figref{skiplist-searchpath})
: Start at the top left corner of your skiplist (the sentinel in $L_h$)
and always go right unless that would overshoot #u#, in which case you
should take a step down into the list below.
More precisely, to construct the search path for the node #u# in $L_0$
we start at the sentinel, #w#, in $L_h$. Next, we examine #w.next#.
If #w.next# contains an item that appears before #u# in $L_0$, then
we set $#w#=#w.next#$. Otherwise, we move down and continue the search
at the occurrence of #w# in the list $L_{h-1}$. We continue this way
until we reach the predecessor of #u# in $L_0$.
\begin{figure}
\begin{center}
\includegraphics{figs/skiplist-searchpath}
\end{center}
\caption{The search path for the node containing $4$ in a skiplist.}
\figlabel{skiplist-searchpath}
\end{figure}
The following result, which we will prove in \secref{skiplist-analysis},
shows that the search path is quite short:
\begin{lem}\lemlabel{skiplist-searchpath}
The expected length of the search path for any node, #u#, in $L_0$ is at
most $2\log #n# + O(1) = O(\log #n#)$.
\end{lem}
A space-efficient way to implement a #Skiplist# is to define a #Node#,
#u#, as consisting of a data value, #x#, and an array, #next#, of
pointers, where #u.next[i]# points to #u#'s successor in the list
$L_{#i#}$. In this way, the data, #x#, in a node is
\javaonly{referenced}\cpponly{stored}
only once, even though #x# may appear in several lists.
\javaimport{ods/SkiplistSSet.Node<T>}
\cppimport{ods/SkiplistSSet.Node}
The next two sections of this chapter discuss two different applications
of skiplists. In each of these applications, $L_0$ stores the main
structure (a list of elements or a sorted set of elements).
The primary difference between these structures is in how
a search path is navigated; in particular, they differ in how
they decide if a search path should go down into $L_{r-1}$ or go right
within $L_r$.
\section{#SkiplistSSet#: An Efficient #SSet# Implementation}
\seclabel{skiplistset}
A #SkiplistSSet# uses a skiplist structure to implement the #SSet#
interface. When used this way, the list $L_0$ stores the elements of
the #SSet# in sorted order. The #find(x)# method works by following
the search path for the smallest value #y# such that $#y#\ge#x#$:
\codeimport{ods/SkiplistSSet.find(x).findPredNode(x)}
Following the search path for #y# is easy: when situated at
some node, #u#, in $L_{#r#}$, we look right to #u.next[r].x#.
If $#x#>#u.next[r].x#$, then we take a step to the right in $L_{#r#}$,
otherwise we move down into $L_{#r#-1}$. Each step (right or down) in
this search takes only constant time so, by \lemref{skiplist-searchpath},
the expected running time of #find(x)# is $O(\log #n#)$.
Before we can add an element to a #SkipListSSet#, we need a method to
simulate tossing coins to determine the height, #k#, of a new node.
We do this by picking a random integer, #z#, and counting the number of
trailing $1$s in the binary representation of #z#:\footnote{This method
does not exactly replicate the coin-tossing experiment since the value of
#k# will always be less than the number of bits in an #int#. However,
this will have negligible impact unless the number of elements in the
structure is much greater than $2^{32}=4294967296$.}
\codeimport{ods/SkiplistSSet.pickHeight()}
To implement the #add(x)# method in a #SkiplistSSet# we search for #x#
and then splice #x# into a few lists $L_0$,\ldots,$L_{#k#}$, where #k#
is selected using the #pickHeight()# method. The easiest way to do this
is to use an array, #stack#, that keeps track of the nodes at which
the search path goes down from some list $L_{#r#}$ into $L_{#r#-1}$.
More precisely, #stack[r]# is the node in $L_{#r#}$ where the search path
proceeded down into $L_{#r#-1}$. The nodes that we modify to insert #x#
are precisely the nodes $#stack[0]#,\ldots,#stack[k]#$. The following
code implements this algorithm for #add(x)#:
\codeimport{ods/SkiplistSSet.add(x)}
\begin{figure}
\begin{center}
\includegraphics{figs/skiplist-add}
\end{center}
\caption[Adding to a skiplist]{Adding the node containing $3.5$ to a skiplist. The nodes stored in #stack#
are highlighted.}
\figlabel{skiplist-add}
\end{figure}
Removing an element, #x#, is done in a similar way, except that there
is no need for #stack# to keep track of the search path. The removal
can be done as we are following the search path. We search for #x#
and each time the search moves downward from a node #u#, we check if
$#u.next.x#=#x#$ and if so, we splice #u# out of the list:
\codeimport{ods/SkiplistSSet.remove(x)}
\begin{figure}
\begin{center}
\includegraphics{figs/skiplist-remove}
\end{center}
\caption{Removing the node containing $3$ from a skiplist.}
\figlabel{skiplist-remove}
\end{figure}
\subsection{Summary}
The following theorem summarizes the performance of skiplists when used to
implement sorted sets:
\begin{thm}\thmlabel{skiplist}
A #SkiplistSSet# implements the #SSet# interface. A #SkiplistSSet# supports
the operations #add(x)#, #remove(x)#, and #find(x)# in $O(\log #n#)$
expected time per operation.
\end{thm}
\section{#SkiplistList#: An Efficient Random-Access #List# Implementation}
\seclabel{skiplistlist}
A #SkiplistList# implements the #List# interface on top of a skiplist structure.
In a #SkiplistList#, $L_0$ contains the elements of the
list in the order they appear in the list. Just like with a #SkiplistSSet#,
elements can be added, removed, and accessed in $O(\log #n#)$ time.
For this to be possible, we need a way to follow the search path for
the #i#th element in $L_0$. The easiest way to do this is to define
the notion of the \emph{length} of an edge in some list, $L_{#r#}$.
We define the length of every edge in $L_{0}$ as 1. The length of an edge, #e#,
in $L_{#r#}$, $#r#>0$, is defined as the sum of the lengths of the edges below #e#
in $L_{#r#-1}$. Equivalently, the length of #e# is
the number of edges in $L_0$ below #e#. See \figref{skiplist-lengths} for
an example of a skiplist with the lengths of its edges shown. Since the
edges of skiplists are stored in arrays, the lengths can be stored the same
way:
\begin{figure}
\begin{center}
\includegraphics{figs/skiplist-lengths}
\end{center}
\caption{The lengths of the edges in a skiplist.}
\figlabel{skiplist-lengths}
\end{figure}
\codeimport{ods/SkiplistList.Node}
The useful property of this definition of length is that, if we are
currently at a node that is at position #j# in $L_0$ and we follow an
edge of length $\ell$, then we move to a node whose position, in $L_0$,
is $#j#+\ell$. In this way, while following a search path, we can keep
track of the position, #j#, of the current node in $L_0$. When at a
node, #u#, in $L_{#r#}$, we go right if #j#
plus the length of the edge #u.next[r]# is less than #i#, otherwise we go
down into $L_{#r#-1}$.
\codeimport{ods/SkiplistList.findPred(i)}
\codeimport{ods/SkiplistList.get(i).set(i,x)}
Since the hardest part of the operations #get(i)# and #set(i,x)# is
finding the #i#th node in $L_0$, these operations run in
$O(\log #n#)$ time.
Adding an element to a #SkiplistList# at a position, #i#, is fairly
straightforward. Unlike in a #SkiplistSSet#, we are sure that a new
node will actually be added, so we can do the addition at the same time
as we search for the new node's location. We first pick the height, #k#,
of the newly inserted node, #w#, and then follow the search path for #i#.
Anytime the search path moves down from $L_{#r#}$ with $#r#\le #k#$, we
splice #w# into $L_{#r#}$. The only extra care needed is to ensure that
the lengths of edges are updated properly. See \figref{skiplist-addix}.
\begin{figure}
\begin{center}
\includegraphics{figs/skiplist-addix}
\end{center}
\caption[Adding to a SkiplistList]{Adding an element to a #SkiplistList#.}
\figlabel{skiplist-addix}
\end{figure}
Note that, each time the search path goes down at a node, #u#, in $L_{#r#}$,
the length of the edge #u.next[r]# increases by one, since we are adding
an element below that edge at position #i#. Splicing the node #w# between two nodes,
#u# and #z#, works as shown in \figref{skiplist-lengths-splice}. While
following the search path we are already keeping track of the position,
#j#, of #u# in $L_0$. Therefore, we know that the length of the edge from
#u# to #w# is $#i#-#j#$. We can also deduce the length of the edge
from #w# to #z# from the length, $\ell$, of the edge from #u# to #z#.
Therefore, we can splice in #w# and update the lengths of the edges in
constant time.
\begin{figure}
\begin{center}
\includegraphics{figs/skiplist-lengths-splice}
\end{center}
\caption[Adding to a SkiplistList]{Updating the lengths of edges while splicing a node
#w# into a skiplist.}
\figlabel{skiplist-lengths-splice}
\end{figure}
This sounds more complicated than it actually is and the code is actually
quite simple:
\codeimport{ods/SkiplistList.add(i,x)}
\codeimport{ods/SkiplistList.add(i,w)}
By now, the implementation of
the #remove(i)# operation in a #SkiplistList# should be obvious. We follow the search path for the node at position #i#. Each time the search path takes a step down from a node, #u#, at level #r# we decrement the length of the edge leaving #u# at that level. We also check if #u.next[r]# is the element of rank #i# and, if so, splice it out of the list at that level. An example is shown in \figref{skiplist-removei}.
\begin{figure}
\begin{center}
\includegraphics{figs/skiplist-removei}
\end{center}
\caption[Removing an element from a SkiplistList]{Removing an element from a #SkiplistList#.}
\figlabel{skiplist-removei}
\end{figure}
\codeimport{ods/SkiplistList.remove(i)}
\subsection{Summary}
The following theorem summarizes the performance of the #SkiplistList#
data structure:
\begin{thm}\thmlabel{skiplistlist}
A #SkiplistList# implements the #List# interface. A #SkiplistList#
supports the operations #get(i)#, #set(i,x)#, #add(i,x)#, and
#remove(i)# in $O(\log #n#)$ expected time per operation.
\end{thm}
%\section{Skiplists as Ropes}
%TODO: A section on ropes
\section{Analysis of Skiplists}
\seclabel{skiplist-analysis}
In this section, we analyze the expected height, size, and length of
the search path in a skiplist. This section requires a background in
basic probability. Several proofs are based on the following basic
observation about coin tosses.
\begin{lem}\lemlabel{coin-tosses}
Let $T$ be the number of times a fair coin is tossed up to and including
the first time the coin comes up heads. Then $\E[T]=2$.
\end{lem}
\begin{proof}
Suppose we stop tossing the coin the first time it comes up
heads. Define the indicator variable
\[ I_{i} = \left\{\begin{array}{ll}
0 & \mbox{if the coin is tossed less than $i$ times} \\
1 & \mbox{if the coin is tossed $i$ or more times}
\end{array}\right.
\]
Note that $I_i=1$ if and only if the first $i-1$ coin tosses are tails,
so $\E[I_i]=\Pr\{I_i=1\}=1/2^{i-1}$. Observe that $T$, the total
number of coin tosses, can be written as $T=\sum_{i=1}^{\infty} I_i$.
Therefore,
\begin{align*}
\E[T] & = \E\left[\sum_{i=1}^\infty I_i\right] \\
& = \sum_{i=1}^\infty \E\left[I_i\right] \\
& = \sum_{i=1}^\infty 1/2^{i-1} \\
& = 1 + 1/2 + 1/4 + 1/8 + \cdots \\
& = 2 \enspace . \qedhere
\end{align*}
\end{proof}
The next two lemmata tell us that skiplists have linear size:
\begin{lem}\lemlabel{skiplist-size1}
The expected number of nodes in a skiplist containing $#n#$ elements,
not including occurrences of the sentinel, is $2#n#$.
\end{lem}
\begin{proof}
The probability that any particular element, #x#, is included in list
$L_{#r#}$ is $1/2^{#r#}$, so the expected number of nodes in $L_{#r#}$
is $#n#/2^{#r#}$. Therefore, the total number of nodes in all lists is
\[ \sum_{#r#=0}^\infty #n#/2^{#r#} = #n#(1+1/2+1/4+1/8+\cdots) = 2#n# \enspace . \qedhere \]
\end{proof}
\begin{lem}\lemlabel{skiplist-height}
The expected height of a skiplist containing #n# elements is at most
$\log #n# + 2$.
\end{lem}
\begin{proof}
For each $#r#\in\{1,2,3,\ldots,\infty\}$,
define the indicator random variable
\[ I_{#r#} = \left\{\begin{array}{ll}
0 & \mbox{if $L_{#r#}$ is empty} \\
1 & \mbox{if $L_{#r#}$ is non-empty}
\end{array}\right.
\]
The height, #h#, of the skiplist is then given by
\[
#h# = \sum_{i=1}^\infty I_{#r#} \enspace .
\]
Note that $I_{#r#}$ is never more than the length, $|L_{#r#}|$, of $L_{#r#}$, so
\[
\E[I_{#r#}] \le \E[|L_{#r#}|] = #n#/2^{#r#} \enspace .
\]
Therefore, we have
\begin{align*}
\E[#h#] &= \E\left[\sum_{r=1}^\infty I_{#r#}\right] \\
&= \sum_{#r#=1}^{\infty} E[I_{#r#}] \\
&= \sum_{#r#=1}^{\lfloor\log #n#\rfloor} E[I_{#r#}]
+ \sum_{r=\lfloor\log #n#\rfloor+1}^{\infty} E[I_{#r#}] \\
&\le \sum_{#r#=1}^{\lfloor\log #n#\rfloor} 1
+ \sum_{r=\lfloor\log #n#\rfloor+1}^{\infty} #n#/2^{#r#} \\
&\le \log #n#
+ \sum_{#r#=0}^\infty 1/2^{#r#} \\
&= \log #n# + 2 \enspace . \qedhere
\end{align*}
\end{proof}
\begin{lem}\lemlabel{skiplist-size2}
The expected number of nodes in a skiplist containing $#n#$ elements,
including all occurrences of the sentinel, is $2#n#+O(\log #n#)$.
\end{lem}
\begin{proof}
By \lemref{skiplist-size1}, the expected number of nodes, not
including the sentinel, is $2#n#$. The number of occurrences of
the sentinel is equal to the height, $#h#$, of the skiplist so, by
\lemref{skiplist-height} the expected number of occurrences of the
sentinel is at most $\log #n#+2 = O(\log #n#)$.
\end{proof}
\begin{lem}
The expected length of a search path in a skiplist is at most $2\log #n# + O(1)$.
\end{lem}
\begin{proof}
The easiest way to see this is to consider the \emph{reverse search
path} for a node, #x#. This path starts at the predecessor of #x#
in $L_0$. At any point in time, if the path can go up a level, then
it does. If it cannot go up a level then it goes left. Observe that
the reverse search path for #x# is identical to the search path for #x#,
except that it is reversed.
The number of nodes that the reverse search path visits at a particular
level, #r#, is related to the following experiment: Toss a coin.
If the coin comes up heads then go up and stop, otherwise go left and
repeat the experiment. The number of coin tosses before the heads then
represents the number of steps to the left that a reverse search path
takes at a particular level.\footnote{Note that this might overcount
the number of steps to the left, since the experiment should end either at
the first heads or when the search path reaches the sentinel, whichever
comes first. This is not a problem since the lemma is only stating an
upper bound.} \lemref{coin-tosses} tells us that the expected number
of coin tosses before the first heads is 1.
Let $S_{#r#}$ denote the number of steps the forward search path takes at level
$#r#$ that go to the right. We have just argued that $\E[S_{#r#}]\le
1$. Furthermore, $S_{#r#}\le |L_{#r#}|$, since we can't take more steps
in $L_{#r#}$ than the length of $L_{#r#}$, so
\[
\E[S_{#r#}] \le E[|L_{#r#}|] = #n#/2^{#r#} \enspace .
\]
We can now finish as in the proof of \lemref{skiplist-height}.
Let $S$ be the length of the search path for some node, #u#, in a
skiplist, and let $#h#$ be the height of the skiplist. Then
\begin{align*}
\E[S]
&= \E\left[ #h# + \sum_{#r#=0}^\infty S_{#r#} \right] \\
&= \E[#h#] + \sum_{#r#=0}^\infty \E[S_{#r#}] \\
&= \E[#h#] + \sum_{#r#=0}^{\lfloor\log #n#\rfloor} \E[S_{#r#}]
+ \sum_{#r#=\lfloor\log #n#\rfloor+1}^\infty \E[S_{#r#}] \\
&\le \E[#h#] + \sum_{#r#=0}^{\lfloor\log #n#\rfloor} 1
+ \sum_{r=\lfloor\log #n#\rfloor+1}^\infty #n#/2^{#r#} \\
&\le \E[#h#] + \sum_{#r#=0}^{\lfloor\log #n#\rfloor} 1
+ \sum_{#r#=0}^{\infty} 1/2^{#r#} \\
&\le \E[#h#] + \sum_{#r#=0}^{\lfloor\log #n#\rfloor} 1
+ \sum_{#r#=0}^{\infty} 1/2^{#r#} \\
&\le \E[#h#] + \log #n# + 3 \\
&\le 2\log #n# + 5 \enspace . \qedhere
\end{align*}
\end{proof}
The following theorem summarizes the results in this section:
\begin{thm}
A skiplist containing $#n#$ elements has expected size $O(#n#)$ and the
expected length of the search path for any particular element is at most
$2\log #n# + O(1)$.
\end{thm}
%\section{Iteration and Finger Search}
%TODO: Write this section
\section{Discussion and Exercises}
Skiplists were introduced by Pugh \cite{p91} who also presented
a number of applications of skiplists \cite{p89}. Since then they
have been studied extensively. Several researchers have done very
precise analysis of the expected length and variance in length of the
search path for the #i#th element in a skiplist \cite{kp94,kmp95,pmp92}.
Deterministic versions \cite{mps92}, biased versions \cite{bbg02,esss01},
and self-adjusting versions \cite{bdl08} of skiplists have all been
developed. Skiplist implementations have been written for various
languages and frameworks and have seen use in open-source database
systems \cite{skipdb,redis}. A variant of skiplists is used in the HP-UX
operating system kernel's process management structures \cite{hpux}.
\javaonly{Skiplists are even part of the Java 1.6 API \cite{oracle_jdk6}.}
\begin{exc}\exclabel{skiplist-changes}
Show that, during an #add(x)# or a #remove(x)# operation, the expected
number of pointers in the structure that get changed is constant.
\end{exc}
\begin{exc}\exclabel{skiplist-opt}
Suppose that, instead of promoting an element from $L_{i-1}$ into $L_i$
based on a coin toss, we promote it with some probability $p$, $0 <
p < 1$. Show that the expected length of the search path in this case
is at most $(1/p)\log_{1/p} #n# + O(1)$. What is the value of $p$
that minimizes this expression? What is the expected height of the
skiplist? What is the expected number of nodes in the skiplist?
\end{exc}
\begin{exc}\exclabel{skiplist-opt-2}
Design and implement a #find(x)# method for #SkiplistSSet# that avoids
\emph{locally-redundant comparisons}; these are comparisons that have already
been done and occur because $#u.next[r]# = #u.next[r-1]#$.
Analyze the expected number of comparisons done by your modified #find(x)#
method.
\end{exc}
\begin{exc}
Design and implement a version of a skiplist that implements the
#SSet# interface, but also allows fast access to elements by rank.
That is, it also supports the function #get(i)#, which returns the
element whose rank is #i# in $O(\log #n#)$ expected time. (The rank
of an element #x# in an #SSet# is the number of elements in the #SSet#
that are less than #x#.)
\end{exc}
\begin{exc}
Using the ideas from the space-efficient linked-list, #SEList#,
design and implement a space-efficient #SSet#, #SESSet#. Do this by
storing the data, in order, in an #SEList# and then storing the blocks
of this #SEList# in an #SSet#. If the original #SSet# implementation
uses $O(#n#)$ space to store #n# elements, then the #SESSet# will use
enough space for #n# elements plus $O(#n#/#b#+#b#)$ wasted space.
\end{exc}
\begin{exc}
Using an #SSet# as your underlying structure, design and implement an
application that reads a (large) text file and allow you to search,
interactively, for any substring contained in the text.
\noindent Hint 1: Every substring is a prefix of some suffix, so it
suffices to store all suffixes of the text file.
\noindent Hint 2: Any suffix can be represented compactly as a single
integer indicating where the suffix begins in the text.
\noindent Test your application on some large texts like some of the books
available at Project Gutenberg \cite{gutenberg}.
\end{exc}