# Sequences of values

Two basic ways of storing a sequence of values:
1. Arrays
2. Lists

# What is the difference?

Arrays:

A single block of memory. All elements have a uniform type, and the size of the sequence is fixed in advance.

Indexing is fast: accessing seq[i] takes constant time for any i.
Inserting between seq[i] and seq[i+1] is expensive, because every later element has to shift by one position.
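
To make the cost of insertion concrete, here is a minimal sketch that inserts into an array by shifting elements by hand. The helper name `insert_after` is an assumption for illustration, and a Python list stands in for the array.

```python
def insert_after(seq, i, v):
    """Insert v between seq[i] and seq[i+1] by shifting later elements right."""
    seq.append(None)                 # make room for one more element
    shifts = 0
    for k in range(len(seq) - 1, i + 1, -1):
        seq[k] = seq[k - 1]          # move each later element one slot to the right
        shifts += 1
    seq[i + 1] = v
    return shifts                    # number of elements that had to move

seq = [10, 20, 30, 40, 50]
print(insert_after(seq, 0, 15))      # 4
print(seq)                           # [10, 15, 20, 30, 40, 50]
```

Inserting near the front moves almost every element, which is why the cost grows with the length of the sequence.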

Lists:

Values are scattered in memory. Each element points to the next, forming a “linked” list, so the size is flexible.

Follow i links to access seq[i], so the cost is proportional to i.
Inserting or deleting an element is easy once the position has been reached: it is just “plumbing”, redirecting a few links.
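
A linked list can be sketched in a few lines. The `Node` class and the helpers `from_values`, `node_at`, `insert_after` and `to_list` are hypothetical names chosen for this illustration, not part of the notes.

```python
class Node:
    """One cell of a singly linked list: a value and a link to the next cell."""
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def from_values(values):
    """Build a linked list from a Python list and return its head."""
    head = None
    for v in reversed(values):
        head = Node(v, head)
    return head

def node_at(head, i):
    """Follow i links from the head: cost proportional to i."""
    node = head
    for _ in range(i):
        node = node.next
    return node

def insert_after(node, v):
    """The 'plumbing': redirect one link, independent of the list length."""
    node.next = Node(v, node.next)

def to_list(head):
    """Collect the values in order, for printing."""
    out = []
    while head is not None:
        out.append(head.value)
        head = head.next
    return out

head = from_values([10, 20, 30, 40])
insert_after(node_at(head, 1), 25)
print(to_list(head))                 # [10, 20, 25, 30, 40]
```

Reaching the position takes a walk along the links, but the insertion itself touches only one link.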


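# Operations

1. Exchange seq[i] and seq[j]
    Array -> Constant
    List  -> Linear (the links to positions i and j must be followed first)
    
2. Delete seq[i] or insert v after seq[i]
    Array -> Linear (later elements must be shifted)
    List  -> Constant, once position i has been reached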


# Search problem

Algorithms on one data structure may not transfer to another.
Example: binary search relies on constant-time indexing, so it suits arrays but not linked lists.

Does the structure of seq matter?
Array vs list

Does the organization of the information matter?
Values sorted vs unsorted

Searching for a value:
Unsorted array: linear scan, O(n)
Sorted array: binary search, O(log n)

# The unsorted case: linear scan

def search(seq, v):
    # Compare v with each element in turn
    for x in seq:
        if x == v:
            return True
    # Reached the end without finding v
    return False

Worst case:

Need to scan the entire sequence seq, for example when v is not present.
Time is proportional to the length of the sequence.
It does not matter whether seq is an array or a list.
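
The worst case can be observed by counting comparisons. The sketch below uses a hypothetical variant, `search_count`, that returns the number of elements examined along with the answer.

```python
def search_count(seq, v):
    """Linear scan that also reports how many elements were compared with v."""
    comparisons = 0
    for x in seq:
        comparisons += 1
        if x == v:
            return True, comparisons
    return False, comparisons

seq = [5, 3, 8, 6, 2, 7, 4, 1]
print(search_count(seq, 5))   # (True, 1): found at the first position
print(search_count(seq, 1))   # (True, 8): found at the last position
print(search_count(seq, 9))   # (False, 8): not present, every element examined
```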

# Search a sorted sequence

What if seq is sorted?

1. Compare v with the element at the midpoint of seq
2. If the midpoint element is v, the value is found
3. If v < midpoint element, search the left half of seq
4. If v > midpoint element, search the right half of seq

# Binary search

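def bsearch(seq, v, l, r):
    # Search for v in the sorted slice seq[l:r]; r is exclusive
    if r - l == 0:
        # Empty interval: v is not present
        return False
    mid = (l + r) // 2
    if seq[mid] == v:
        return True
    if v < seq[mid]:
        # Continue in the left half seq[l:mid]
        return bsearch(seq, v, l, mid)
    else:
        # Continue in the right half seq[mid+1:r]
        return bsearch(seq, v, mid + 1, r)

print(bsearch([1, 2, 3, 4, 5, 6, 7, 8, 9], 9, 0, 9))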

True
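
The same idea can also be written as a loop, which spares callers from passing the interval bounds. This is a sketch rather than the version used in these notes: the names `bsearch_iter` and `bsearch_bisect` are assumptions, and the second function uses `bisect_left` from Python's standard `bisect` module as a cross-check.

```python
from bisect import bisect_left

def bsearch_iter(seq, v):
    """Iterative binary search for v in the sorted sequence seq."""
    l, r = 0, len(seq)               # search the half-open interval seq[l:r]
    while r - l > 0:
        mid = (l + r) // 2
        if seq[mid] == v:
            return True
        if v < seq[mid]:
            r = mid                  # keep the left half
        else:
            l = mid + 1              # keep the right half
    return False

def bsearch_bisect(seq, v):
    """Membership test using the standard library's bisect_left."""
    pos = bisect_left(seq, v)        # leftmost position where v could be inserted
    return pos < len(seq) and seq[pos] == v

seq = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(bsearch_iter(seq, 9))          # True
print(bsearch_iter(seq, 10))         # False
```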


# How long does this take?

Each step halves the interval to search. For an interval of size 0, the answer is immediate.

T(n): time to search in an array of size n
T(0) = 1
T(n) = 1 + T(n/2)

Unwind the recurrence

$$
\begin{aligned}
T(n) &= 1 + T(n/2) = 1 + 1 + T(n/2^2) = \cdots \\
     &= 1 + 1 + \cdots + 1 + T(n/2^k) \\
     &= 1 + 1 + \cdots + 1 + T(n/2^{\log n}) = O(\log n)
\end{aligned}
$$
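
The recurrence can be checked empirically by counting recursive calls. The sketch below uses a hypothetical helper, `bsearch_calls`, which follows the same logic as the binary search in these notes but returns the number of calls instead of the answer. Searching for a value smaller than every element always takes the left half, so the interval sizes are n, n/2, n/4, ..., 1, 0 (rounding down).

```python
def bsearch_calls(seq, v, l, r):
    """Count the calls made by the recursive binary search for v in seq[l:r]."""
    if r - l == 0:
        return 1
    mid = (l + r) // 2
    if seq[mid] == v:
        return 1
    if v < seq[mid]:
        return 1 + bsearch_calls(seq, v, l, mid)
    return 1 + bsearch_calls(seq, v, mid + 1, r)

for n in [8, 1024, 10**6]:
    seq = list(range(n))
    calls = bsearch_calls(seq, -1, 0, n)     # -1 is smaller than every element
    # n.bit_length() - 1 is floor(log2(n)), so the last column is floor(log2(n)) + 2
    print(n, calls, n.bit_length() + 1)
```

For each n the two counts agree: the number of calls is the base-2 logarithm of n (rounded down) plus 2, which grows as O(log n).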

# Efficiency

Measure the time taken by an algorithm as a function T(n) of the input size n.

Usually report worst case behaviour.

The worst case for searching in a sequence is when the value is not found.

The worst case is easier to calculate than the “average” case or other more reasonable measures.

# O( ) notation

We are interested in the broad relationship between input size and running time.

Is T(n) proportional to log n, n, n log n, $n^2$, ..., $2^n$?

Write T(n) = O(n), T(n) = O(n log n), ... to indicate this.

Linear scan is O(n) for arrays and lists.

Binary search is O(log n) for sorted arrays.

# How to sort?

You are a Teaching Assistant for a course.

The instructor gives you a stack of exam answer papers with marks, ordered randomly. Your task is to arrange them in descending order.



# Strategy 1

1. Scan the entire stack and find the paper with the minimum marks.
2. Move this paper to a new stack.
3. Repeat with the remaining papers.
4. Each time, add the paper with the next minimum marks on top of the new stack.
5. Eventually, the new stack is sorted in descending order from the top.
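
The steps above can be sketched directly with a second list. The function name `sort_with_new_stack` and the sample marks are assumptions for illustration. Index 0 of the new list plays the role of the top of the new stack.

```python
def sort_with_new_stack(papers):
    """Strategy 1: repeatedly move the minimum to the top of a new stack."""
    remaining = list(papers)         # copy, so the caller's list is untouched
    new_stack = []                   # index 0 is the top of the new stack
    while remaining:
        lowest = min(remaining)      # scan the whole remaining stack
        remaining.remove(lowest)     # take that paper out
        new_stack.insert(0, lowest)  # place it on top of the new stack
    return new_stack

marks = [55, 72, 38, 91, 64]
print(sort_with_new_stack(marks))    # [91, 72, 64, 55, 38]
```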

# Selection Sort

Select the next element in sorted order
Move it into its correct place in the final sorted list

Avoid using a second list:
Swap minimum element with value in first position.
Swap second minimum element to second position.

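def SelectionSort(seq):
    for start in range(len(seq)):
        # Find the position of the minimum in the unsorted segment seq[start:]
        minpos = start
        for i in range(start+1,len(seq)):
            if seq[i] < seq[minpos]:
                minpos = i
        # Swap the minimum into position start; seq[:start+1] is now sorted
        seq[start],seq[minpos] = seq[minpos],seq[start]

seq = [5,3,8,6,2,7,4,1]
SelectionSort(seq)
print(seq)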

[1, 2, 3, 4, 5, 6, 7, 8]


# Analysis of Selection Sort

Finding the minimum in an unsorted segment of length k requires one scan, k steps.

In each iteration, the segment to be scanned shrinks by 1.

$T(n) = n + (n-1) + (n-2) + \cdots + 1 = n(n+1)/2 = O(n^2)$
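
The sum can be checked by instrumenting the sort. In this sketch the hypothetical function `selection_sort_steps` works on a copy and counts one step for every element of each scanned segment, matching the "k steps for a segment of length k" accounting above.

```python
def selection_sort_steps(seq):
    """Selection sort on a copy; returns the sorted list and the scan steps taken."""
    seq = list(seq)
    steps = 0
    for start in range(len(seq)):
        minpos = start
        steps += 1                               # looking at seq[start] itself
        for i in range(start + 1, len(seq)):
            steps += 1                           # one step per remaining element
            if seq[i] < seq[minpos]:
                minpos = i
        seq[start], seq[minpos] = seq[minpos], seq[start]
    return seq, steps

for n in [4, 8, 100]:
    _, steps = selection_sort_steps(range(n, 0, -1))
    print(n, steps, n * (n + 1) // 2)            # the two counts agree
```

The count does not depend on the input order: selection sort always scans the whole unsorted segment.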

# Strategy 2

First paper: put it in a new stack.

Second paper:

Lower marks than the first? Place it below the first paper.
Higher marks than the first? Place it above the first paper.

Third paper:

Insert it into the correct position with respect to the first two papers.

Do this for each subsequent paper: insert it into the correct position in the new sorted stack.
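
This strategy can be sketched with a second list as well. The function name `sort_by_insertion` and the sample marks are assumptions for illustration. Index 0 is the top of the new stack, which is kept in descending order throughout.

```python
def sort_by_insertion(papers):
    """Strategy 2: insert each paper into its place in a new sorted stack."""
    new_stack = []                   # index 0 is the top; highest marks on top
    for marks in papers:
        pos = 0
        # Walk down from the top past every paper with higher marks
        while pos < len(new_stack) and new_stack[pos] > marks:
            pos += 1
        new_stack.insert(pos, marks) # place the paper at that position
    return new_stack

print(sort_by_insertion([55, 72, 38, 91, 64]))   # [91, 72, 64, 55, 38]
```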

# Insertion Sort

Start building a sorted sequence with one element

Pick up next unsorted element and insert it into its
correct place in the already sorted sequence

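def InsertionSort(seq):
    for sliceEnd in range(len(seq)):
        # seq[:sliceEnd] is already sorted; move seq[sliceEnd] left into place
        pos = sliceEnd
        while pos > 0 and seq[pos] < seq[pos-1]:
            # Swap with the left neighbour while it is larger
            (seq[pos],seq[pos-1]) = (seq[pos-1],seq[pos])
            pos = pos-1

seq = [5,3,8,6,2,7,4,1]
InsertionSort(seq)
print(seq)

# A longer input in descending order: the worst case for insertion sort
list1 = list(range(1000,10,-5))
print("original",list1)
InsertionSort(list1)
print("sorted list:",list1)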


[1, 2, 3, 4, 5, 6, 7, 8]
[15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 305, 310, 315, 320, 325, 330, 335, 340, 345, 350, 355, 360, 365, 370, 375, 380, 385, 390, 395, 400, 405, 410, 415, 420, 425, 430, 435, 440, 445, 450, 455, 460, 465, 470, 475, 480, 485, 490, 495, 500, 505, 510, 515, 520, 525, 530, 535, 540, 545, 550, 555, 560, 565, 570, 575, 580, 585, 590, 595, 600, 605, 610, 615, 620, 625, 630, 635, 640, 645, 650, 655, 660, 665, 670, 675, 680, 685, 690, 695, 700, 705, 710, 715, 720, 725, 730, 735, 740, 745, 750, 755, 760, 765, 770, 775, 780, 785, 790, 795, 800, 805, 810, 815, 820, 825, 830, 835, 840, 845, 850, 855, 860, 865, 870, 875, 880, 885, 890, 895, 900, 905, 910, 915, 920, 925, 930, 935, 940, 945, 950, 955, 960, 965, 970, 975, 980, 985, 990, 995, 1000]


# Analysis of Insertion Sort

Inserting a new value into a sorted segment of length k requires up to k steps in the worst case.

In each iteration, the sorted segment in which to insert grows by 1.

$T(n) = 1 + 2 + \cdots + (n-1) = n(n-1)/2 = O(n^2)$
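
The worst case bound can be checked by counting swaps. In this sketch the hypothetical function `insertion_sort_swaps` works on a copy and uses the same logic as the insertion sort in these notes. A descending input forces every new element to travel all the way to the front, while an already sorted input needs no swaps at all.

```python
def insertion_sort_swaps(seq):
    """Insertion sort on a copy; returns the sorted list and the number of swaps."""
    seq = list(seq)
    swaps = 0
    for slice_end in range(len(seq)):
        pos = slice_end
        while pos > 0 and seq[pos] < seq[pos - 1]:
            seq[pos], seq[pos - 1] = seq[pos - 1], seq[pos]
            pos -= 1
            swaps += 1
    return seq, swaps

for n in [4, 8, 100]:
    _, worst = insertion_sort_swaps(range(n, 0, -1))   # descending input
    _, best = insertion_sort_swaps(range(1, n + 1))    # already sorted input
    print(n, worst, n * (n - 1) // 2, best)            # worst matches n(n-1)/2, best is 0
```

Unlike selection sort, the running time of insertion sort depends on how disordered the input is.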