In [1]:
# Initialization cell
try:  # for CS1302 JupyterLite pyodide kernel
    import piplite

    with open("requirements.txt") as f:
        for package in f:
            package = package.strip()
            print("Installing", package)
            await piplite.install(package)
except ModuleNotFoundError:
    pass

# Operations on Sequences

**CS1302 Introduction to Computer Programming**
___

In [2]:
import random
%reload_ext divewidgets

## Mutating a list

```{important}

For list (but not tuple), subscription and slicing can also be used as the target of an assignment operation to mutate the list.
```

In [3]:
%%optlite -h 350
b = [*range(10)]  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
b[::2] = b[:5]    # [0, 1, 1, 3, 2, 5, 3, 7, 4, 9]
b[0:1] = b[:5]    # [0, 1, 1, 3, 2, 1, 1, 3, 2, 5, 3, 7, 4, 9]
b[::2] = b[:5]    # fails; b[::2] length 7, b[:5] length 5

OPTWidget(value=None, height=350, script='b = [*range(10)]  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]\nb[::2] = b[:5]  …

The last assignment fails with a `ValueError` because `b[::2]`, whose step size is not `1`, is an *extended slice*, which can only be assigned a sequence of the same length.
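As a minimal sketch of this rule (the names `c`, `d`, and `message` are only illustrative), a slice with step `1` may be assigned a sequence of any length, whereas an extended slice raises a `ValueError` when the lengths differ:

```python
c = [*range(10)]
c[0:1] = ['x', 'y', 'z']  # step 1: one item is replaced by three, so the list grows
print(len(c))

d = [*range(10)]
try:
    d[::2] = [0, 1, 2]    # the extended slice selects 5 positions but gets 3 values
except ValueError as e:
    message = str(e)      # the error message reports the two sizes
else:
    message = None
print(message)
```

Catching the exception is only a way to let the sketch run to the end; in practice one would make the two lengths match.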

**What is the difference between mutation and aliasing?**

In the previous code:
- The first assignment `b = [*range(10)]` is aliasing, which gives the list the target name/identifier `b`.
- Other assignments such as `b[::2] = b[:5]` are mutations that [call `__setitem__`](https://docs.python.org/3/reference/simple_stmts.html#assignment-statements) because the target `b[::2]` is not an identifier.
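The distinction can be checked with `is` and `id`. The following is a small sketch in which the names `alias`, `same_object`, and `unchanged_identity` are chosen only for illustration:

```python
b = [*range(10)]
alias = b                 # aliasing: a second name for the same list object
same_object = alias is b

before = id(b)
b[::2] = b[:5]                     # mutation written as a slice assignment
b.__setitem__(slice(0, 1), ['x'])  # mutation written as an explicit method call
unchanged_identity = id(b) == before

print(same_object, unchanged_identity)
print(alias)              # the mutations are visible through the alias
```

Mutation changes the content of the list but never rebinds the name, which is why the identity stays the same.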

In [4]:
list.__setitem__?

Signature:      list.__setitem__(self, key, value, /)
Call signature: list.__setitem__(*args, **kwargs)
Type:           wrapper_descriptor
String form:    <slot wrapper '__setitem__' of 'list' objects>
Namespace:      Python builtin
Docstring:      Set self[key] to value.


**Exercise** 

Explain why the check returns False.

In [5]:
# %%optlite -l -h 400
a = b = [0]
b[0] = a[0] + 1
# print(a, b)
print(a[0] < b[0])

False


**Answer**

`a` and `b` refer to the same list `[0]`.

When `b[0]` is changed to `1`, both `a` and `b` still refer to the same list, which is now `[1]`, so the check compares `1 < 1`.
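A short sketch of the same situation, followed by one possible way to keep the two lists independent using the list method `copy`:

```python
a = b = [0]
print(a is b)      # both names are bound to one list object

b[0] = a[0] + 1    # mutates that single list
print(a, b)

# Binding b to a separate copy keeps the two lists independent.
a = [0]
b = a.copy()
b[0] = a[0] + 1
print(a[0] < b[0])
```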

**Exercise** 

Explain why the mutations below have different effects.

In [6]:
a = [0, 1]
i = 0
a.__setitem__(i := i + 1, i)
print(a)

[0, 1]


In [7]:
a = [0, 1]
i = 0
a[i := i + 1] = a[i]
print(a)

[0, 0]


**Answer**

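In the first code cell, the arguments of the call are evaluated from left to right:
1. `i := i + 1` -> `i = 1`
2. `a.__setitem__(i, i)` -> `a[1] = 1`

In the second code cell, the right-hand side of the assignment is evaluated before the subscript of the target:
1. `tmp = a[i]` -> `tmp = a[0] = 0`
2. `i := i + 1` -> `i = 1`
3. `a[i] = tmp` -> `a[1] = 0`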
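The order of the steps in the second cell can be observed directly. The sketch below uses a hypothetical subclass `TracedList` and a list `log`, neither of which is part of the course code, to record each subscription and item assignment:

```python
log = []  # records the operations in the order they happen


class TracedList(list):
    """Hypothetical list subclass that logs subscription and item assignment."""

    def __getitem__(self, key):
        log.append(('get', key))
        return super().__getitem__(key)

    def __setitem__(self, key, value):
        log.append(('set', key, value))
        super().__setitem__(key, value)


a = TracedList([0, 1])
i = 0
a[(i := i + 1)] = a[i]  # same statement as in the exercise, walrus parenthesized
print(log)
print(list(a))
```

The log shows the read of `a[0]` before the write to `a[1]`, which matches the three steps listed in the answer.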

**Why mutate a list?**

The following is another implementation of `composite_sequence` that takes advantage of the mutability of list. 

In [8]:
def sieve_composite_sequence(stop):
    is_composite = [False] * stop  # initialization
    for factor in range(2,stop):
        if is_composite[factor]: continue
        for multiple in range(factor*2,stop,factor):
            is_composite[multiple] = True
    return (x for x in range(4,stop) if is_composite[x])

for x in sieve_composite_sequence(100): print(x, end=' ')

4 6 8 9 10 12 14 15 16 18 20 21 22 24 25 26 27 28 30 32 33 34 35 36 38 39 40 42 44 45 46 48 49 50 51 52 54 55 56 57 58 60 62 63 64 65 66 68 69 70 72 74 75 76 77 78 80 81 82 84 85 86 87 88 90 91 92 93 94 95 96 98 99 

The algorithm 
1. changes `is_composite[x]` from `False` to `True` if `x` is a multiple of a smaller number `factor`, and
2. returns a generator that generates composite numbers according to `is_composite`.
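To see the two steps separately, the following sketch returns the flags themselves instead of a generator. The helper name `sieve_flags` is an assumption made for this illustration. The same flags also identify the primes, since a number from `2` onward that is never marked is not a multiple of any smaller factor.

```python
def sieve_flags(stop):
    """Return the is_composite flags computed by the sieve (illustrative helper)."""
    is_composite = [False] * stop
    for factor in range(2, stop):
        if is_composite[factor]:
            continue  # its multiples were already marked by a smaller factor
        for multiple in range(factor * 2, stop, factor):
            is_composite[multiple] = True
    return is_composite


flags = sieve_flags(20)
composites = [x for x in range(4, 20) if flags[x]]
primes = [x for x in range(2, 20) if not flags[x]]
print(composites)
print(primes)
```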

**Exercise** 

Is `sieve_composite_sequence` more efficient than your solution `composite_sequence`? Why?

In [9]:
composite_sequence = lambda stop: (
    x for x in range(2, stop) if any(x % divisor == 0 for divisor in range(2, x))
)

In [10]:
%%timeit
for x in composite_sequence(10000): pass

363 ms ± 1.67 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [11]:
%%timeit
for x in sieve_composite_sequence(10000): pass

1.77 ms ± 15.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [12]:
for x in sieve_composite_sequence(10000000): pass

**Answer**

`sieve_composite_sequence` is a lot more efficient.

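`if is_composite[factor]: continue` skips every factor that is already known to be composite, which avoids a lot of redundant calculations. The inner loop also marks the multiples directly instead of testing divisibility.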
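One rough way to compare the amount of work, independent of timing noise, is to count the basic steps of each approach. The functions `sieve_steps` and `trial_division_steps` below are hypothetical counters written for this sketch; the second mimics how `any` stops at the first divisor found.

```python
def sieve_steps(stop):
    """Count the assignments made in the inner loop of the sieve."""
    is_composite = [False] * stop
    steps = 0
    for factor in range(2, stop):
        if is_composite[factor]:
            continue
        for multiple in range(factor * 2, stop, factor):
            is_composite[multiple] = True
            steps += 1
    return steps


def trial_division_steps(stop):
    """Count the remainder tests made by trial division."""
    steps = 0
    for x in range(2, stop):
        for divisor in range(2, x):
            steps += 1
            if x % divisor == 0:
                break  # any() stops at the first divisor
    return steps


print(sieve_steps(1000), trial_division_steps(1000))
```

The counts ignore the cost of each individual step, so they are only an indication of why the measured times differ so much.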

**Exercise** 

Note that the multiplication operation `*` is the most efficient way to [initialize a 1D list with a specified size](https://www.geeksforgeeks.org/python-which-is-faster-to-initialize-lists/), but we should not use it to initialize a 2D list. Fix the following code so that `a` becomes `[[1, 0], [0, 1]]`.

In [13]:
%%optlite -h 300
a = [[0] * 2] * 2
a[0][0] = a[1][1] = 1
print(a)

OPTWidget(value=None, height=300, script='a = [[0] * 2] * 2\na[0][0] = a[1][1] = 1\nprint(a)\n')

In [15]:
a = [[0] * 2 for _ in range(2)]
a[0][0] = a[1][1] = 1
print(a)

[[1, 0], [0, 1]]
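The reason for the difference can be checked with `is`. In this sketch the names `shared` and `separate` are only illustrative:

```python
shared = [[0] * 2] * 2                  # the outer * repeats references to ONE inner list
separate = [[0] * 2 for _ in range(2)]  # the comprehension builds a new inner list each time

print(shared[0] is shared[1], separate[0] is separate[1])

shared[0][0] = 1    # also visible through shared[1]
separate[0][0] = 1  # affects only the first row
print(shared)
print(separate)
```

The inner `[0] * 2` is safe because integers are immutable; the problem only arises when the repeated item is itself mutable.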


## Different methods to operate on a sequence

Recall the `quicksort` algorithm:

In [16]:
def quicksort(seq):
    '''Return a sorted list of items from seq.'''
    if len(seq) <= 1:
        return list(seq)
    i = random.randint(0, len(seq) - 1)
    pivot, others = seq[i], [*seq[:i], *seq[i + 1:]]
    left = quicksort([x for x in others if x < pivot])
    right = quicksort([x for x in others if x >= pivot])
    return [*left, pivot, *right]


seq = [random.randint(0, 99) for i in range(10)]
print(seq, quicksort(seq), sep='\n')

[63, 88, 3, 83, 56, 62, 93, 42, 8, 37]
[3, 8, 37, 42, 56, 62, 63, 83, 88, 93]


There is also a built-in function `sorted` for sorting a sequence:

In [17]:
sorted?
sorted(seq)

[3, 8, 37, 42, 56, 62, 63, 83, 88, 93]

Signature: sorted(iterable, /, *, key=None, reverse=False)
Docstring:
Return a new list containing all items from the iterable in ascending order.

A custom key function can be supplied to customize the sort order, and the
reverse flag can be set to request the result in descending order.
Type:      builtin_function_or_method


**Is `quicksort` quicker?**

In [19]:
%%timeit
quicksort(seq)

12.4 µs ± 136 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)


In [20]:
%%timeit
sorted(seq)

203 ns ± 0.66 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)


The built-in `sorted` in CPython is implemented in C and is based on the [Timsort](https://en.wikipedia.org/wiki/Timsort) algorithm, which is very efficient.
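The `key` and `reverse` parameters mentioned in the docstring of `sorted` can be tried on a small example. The list `words` below is made up for this sketch:

```python
words = ['pear', 'Fig', 'apple', 'kiwi']

by_default = sorted(words)          # uppercase letters sort before lowercase ones
by_length = sorted(words, key=len)  # ties keep their original order (the sort is stable)
descending = sorted(words, key=str.lower, reverse=True)

print(by_default, by_length, descending, sep='\n')
print(words)                        # the input list is left unchanged
```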

**What are other operations on sequences?**

The following compares the lists of public attributes for `tuple` and `list`. 
- We determine membership using the [operator `in` or `not in`](https://docs.python.org/3/reference/expressions.html#membership-test-operations).
- Different from the [keyword `in` in a for loop](https://docs.python.org/3/reference/compound_stmts.html#the-for-statement), operator `in` calls the method `__contains__`.

In [21]:
list_attributes = dir(list)
tuple_attributes = dir(tuple)

print(
    'Common attributes:', ', '.join([
        attr for attr in list_attributes
        if attr in tuple_attributes and attr[0] != '_'
    ]))

print(
    'Tuple-specific attributes:', ', '.join([
        attr for attr in tuple_attributes
        if attr not in list_attributes and attr[0] != '_'
    ]))

print(
    'List-specific attributes:', ', '.join([
        attr for attr in list_attributes
        if attr not in tuple_attributes and attr[0] != '_'
    ]))

Common attributes: count, index
Tuple-specific attributes: 
List-specific attributes: append, clear, copy, extend, insert, pop, remove, reverse, sort


- There are no public tuple-specific attributes, and
- all the list-specific attributes are methods that mutate the list, except `copy`.
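The filter `attr[0] != '_'` hides special methods such as `__contains__`, which both `list` and `tuple` provide. The sketch below, built around a hypothetical class `Evens`, illustrates that the operator `in` uses `__contains__` while the `in` of a for loop uses iteration:

```python
class Evens:
    """Hypothetical container of the even integers from 0 up to a limit."""

    def __init__(self, limit):
        self.limit = limit

    def __contains__(self, x):  # used by the operators `in` and `not in`
        return isinstance(x, int) and 0 <= x < self.limit and x % 2 == 0

    def __iter__(self):         # used by the `in` of a for loop
        return iter(range(0, self.limit, 2))


evens = Evens(10)
print(4 in evens, 5 not in evens)
print([x for x in evens])
print('__contains__' in dir(list), '__contains__' in dir(tuple))
```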

The common attributes
- `count` method returns the number of occurrences of a value in a tuple/list, and
- `index` method returns the index of the first occurrence of a value in a tuple/list.

In [22]:
%%optlite -l -h 450
a = (1,2,2,4,5)
count_of_2 = a.count(2)
index_of_1st_2 = a.index(2)

OPTWidget(value=None, height=450, script='a = (1,2,2,4,5)\ncount_of_2 = a.count(2)\nindex_of_1st_2 = a.index(2…
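The two methods behave differently for a value that does not occur. The following sketch also uses the optional start argument of `index`; the name `position` is only illustrative.

```python
a = (1, 2, 2, 4, 5)

print(a.count(3))     # a missing value simply has a count of 0
print(a.index(2, 2))  # the search starts from index 2, so the second 2 is found

try:
    a.index(3)        # unlike count, index raises an error for a missing value
except ValueError:
    position = None
else:
    position = 'found'
print(position)
```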

`reverse` method reverses the list *in place* and returns `None` instead of returning a reversed list.

In [23]:
%%optlite -h 300
a = [*range(10)]
print(reversed(a))
print(*reversed(a))
print(a.reverse())

OPTWidget(value=None, height=300, script='a = [*range(10)]\nprint(reversed(a))\nprint(*reversed(a))\nprint(a.r…
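A plain version of the comparison, as a sketch with illustrative names: `reversed` gives an iterator that can be consumed only once, while `reverse` changes the list itself.

```python
a = [*range(5)]

r = reversed(a)       # an iterator; a itself is not changed
as_list = list(r)
leftover = list(r)    # empty, because the iterator is already exhausted

result = a.reverse()  # reverses a in place and returns None
print(as_list, leftover, result, a)
```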

- `copy` method returns a shallow copy of a list.  
- `tuple` does not have the `copy` method, but slicing such as `b[::-1]` creates a new tuple.

In [24]:
%%optlite -h 400
a = [*range(10)]
b = tuple(a)
a_reversed = a.copy()
a_reversed.reverse()
b_reversed = b[::-1]

OPTWidget(value=None, height=400, script='a = [*range(10)]\nb = tuple(a)\na_reversed = a.copy()\na_reversed.re…
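The copy made by `copy` is shallow: the outer list is new, but the items are shared. A small sketch with a nested list makes this visible:

```python
a = [[0, 1], [2, 3]]
b = a.copy()       # new outer list, same inner lists

b.append([4, 5])   # changes only b
b[0][0] = 'x'      # changes the inner list shared by a and b
print(a)
print(b)

t = (1, 2, 3)
print(t[::-1], t)  # slicing builds a new tuple; t is unchanged
```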

`sort` method sorts the list *in place* instead of returning a sorted list.

In [25]:
%%optlite -h 300
import random
a = [random.randint(0,10) for i in range(10)]
print(sorted(a))
print(a.sort())

OPTWidget(value=None, height=300, script='import random\na = [random.randint(0,10) for i in range(10)]\nprint(…
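The same comparison without random data, as a sketch. It also shows that `sorted` accepts a tuple, which has no `sort` method of its own:

```python
a = [3, 1, 2]
b = sorted(a)                  # new list; a is unchanged at this point
result = a.sort(reverse=True)  # sorts a in place and returns None
print(a, b, result)

t = (3, 1, 2)
print(sorted(t))               # sorted accepts any iterable and returns a list
print(hasattr(t, 'sort'))      # a tuple has no sort method
```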

- `extend` method extends a list in place instead of creating a new concatenated list.
- `append` method adds an object to the end of a list.
- `insert` method inserts an object at a specified index.

In [26]:
%%optlite -h 300
a = b = [*range(5)]
print(a + b)
print(a.extend(b))
print(a.append('stop'))
print(a.insert(0,'start'))

OPTWidget(value=None, height=300, script="a = b = [*range(5)]\nprint(a + b)\nprint(a.extend(b))\nprint(a.appen…
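Because `a` and `b` are aliases in the cell above, the effects are easier to tell apart with separate lists. A minimal sketch:

```python
a = [0, 1]
a.append([2, 3])    # adds ONE item, which happens to be a list
a.extend([4, 5])    # adds each item of the iterable
a.insert(1, 'new')  # inserts before index 1
print(a)

b = [0, 1]
c = b + [2]         # concatenation creates a new list and leaves b unchanged
print(b, c)
```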

- `pop` method deletes and returns the last item of the list.  
- `remove` method removes the first occurrence of a value in the list.  
- `clear` method clears the entire list.

We can also use the `del` statement to delete a selection of a list.

In [27]:
%%optlite -h 300
a = [*range(10)]
del a[::2]
print(a.pop())
print(a.remove(5))
print(a.clear())

OPTWidget(value=None, height=300, script='a = [*range(10)]\ndel a[::2]\nprint(a.pop())\nprint(a.remove(5))\npr…
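A few further details, sketched with illustrative names: `pop` also accepts an index, and `remove` raises a `ValueError` when the value is not in the list.

```python
a = [*range(10)]
del a[::2]        # deletes the items at even indices
print(a)

last = a.pop()    # removes and returns the last item
first = a.pop(0)  # an index may be given
a.remove(5)       # removes the first item equal to 5

try:
    a.remove(5)   # 5 is no longer in the list
except ValueError:
    removed_again = False
else:
    removed_again = True
print(last, first, a, removed_again)
```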