# ArrayList

- Basic Idea
    - Store data in sequential order
    - Stored contiguously
    - Random access, variable-size list data structure that allows elements to be added or removed
    - E.g., a list of available hotel rooms, a list of cities, and a list of books
- Data Abstraction
    - Decide what data elements you will be operating on 
    - Decide what operations you will be doing to each data element 
    - Define a clean interface to these operations 
    - Implement the objects
    - Now you have an Abstract Data Type (ADT)
    

### Python Array vs. List

#### Array Module -  Sequence of fixed-type data

The array module defines a sequence data structure that looks very much like a list except that all of the members have to be of the same type. The types supported are all numeric or other fixed-size primitive types such as bytes.
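The same-type restriction can be seen directly. The following is a minimal sketch, assuming the type code `'i'` (a C signed int); the variable names `numbers` and `rejected` are made up for illustration.

```python
import array

# Type code 'i' restricts every element to a C signed int.
numbers = array.array('i', [1, 2, 3])
numbers.append(4)            # accepted: 4 is an integer

try:
    numbers.append('five')   # a str does not match the type code
except TypeError as exc:
    rejected = True
    print('Rejected:', exc)
else:
    rejected = False

print(numbers)
print('Bytes per element:', numbers.itemsize)
```

A plain list would accept the string without complaint. The array refuses it because every cell must hold a value of the declared fixed-size type.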

A computer system will have a huge number of bytes of memory, and to keep track of what information is stored in what byte, the computer uses an abstraction known as a memory address. In effect, each byte of memory is associated with a unique number that serves as its address (more formally, the binary representation of the number serves as the address). In this way, the computer system can refer to the data in “byte #2150” versus the data in “byte #2157,” for example. Memory addresses are typically coordinated with the physical layout of the memory system, and so we often portray the numbers in sequential fashion.

<img src="../images/ch02/memory.jpg" width="500"/> 

In an array, a group of related variables is stored one after another in a <B><I>contiguous</I></B> portion of the computer’s memory.
<img src="../images/ch02/memorystr.jpg" width="500"/>

We describe this as an array of six characters, even though it requires 12 bytes of memory (each character occupies 2 bytes). We refer to each location within an array as a cell, and use an integer index to describe its location within the array, with cells numbered 0, 1, 2, and so on.

Each cell of an array must use the same number of bytes. This requirement is what allows an arbitrary cell of the array to be accessed in <B><I>constant time</I></B> based on its index.

The appropriate memory address can be computed as start + cellsize · index.

For example, cell 4 begins at memory location 2146 + 2 · 4 = 2146 + 8 = 2154
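The address arithmetic can be sketched in a few lines. The helper name `cell_address` is hypothetical; the starting address 2146 and the 2-byte cell size are taken from the example above.

```python
def cell_address(start, cell_size, index):
    """Return the address of the first byte of the cell at `index`."""
    return start + cell_size * index

# Values from the example: the array starts at byte 2146
# and each cell occupies 2 bytes.
START, CELL_SIZE = 2146, 2

for i in range(6):
    print(f'cell {i} begins at byte {cell_address(START, CELL_SIZE, i)}')
```

Because the computation is one multiplication and one addition regardless of `index`, locating any cell takes constant time.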

<img src="../images/ch02/arraysimple.jpg" width="160"/>

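In the example below, the array is configured to hold a sequence of Unicode characters (type code `'u'`) and is initialized with a simple string. Note that the `'u'` type code is deprecated in current Python versions.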

In [1]:
import array

s = 'sample' 
a = array.array('u', s)

print('As string:', s)
print('As array :', a)

As string: sample
As array : array('u', 'sample')


In [2]:
? array.array

Init signature: array.array(self, /, *args, **kwargs)
Docstring:
array(typecode [, initializer]) -> array

Return a new array whose items are restricted by typecode, and
initialized from the optional initializer value, which must be a list,
string or iterable over elements of the appropriate type.

Arrays represent basic values and behave very much like lists, except
the type of objects stored in them is constrained. The type is specified
at object creation time by using a type code, which is a single character.
The following type codes are defined:

    Type code   C Type             Minimum size in bytes
    'b'         signed integer     1
    'B'         unsigned integer   1
    'u'         Unicode character  2 (see note)
    'h'         signed integer     2
    'H'         unsigned integer  

In [3]:
len(a) 

6

What about strings? Suppose we want to store a list of people's names in an array. The names have different lengths, so they cannot all fit into cells of one fixed size.

Python represents a list or tuple instance using an internal storage mechanism of an array of object references. At the lowest level, what is stored is a consecutive sequence of memory addresses at which the elements of the sequence reside. A high-level diagram of such a list is shown below:

<img src="../images/ch02/memorystrings.jpg" width="420"/>

Although the relative size of the individual elements may vary, the number of bits used to store the memory address of each element is fixed (e.g., 64 bits per address). In this way, Python can support constant-time access to a list or tuple element based on its index.
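One way to observe the fixed-size references is to compare the size of two lists whose elements differ greatly in length. This is a rough sketch that assumes CPython, where `sys.getsizeof` reports the list object itself and not the objects it refers to; the names below are invented for illustration.

```python
import sys

short_names = ['Al', 'Bo', 'Cy']
long_names = ['Alexandria', 'Bartholomew', 'Christopher']

# getsizeof measures the list object (header plus one reference per
# element), not the strings that the references point to.
short_size = sys.getsizeof(short_names)
long_size = sys.getsizeof(long_names)
print(short_size, long_size)

# The strings themselves do differ in size.
print(sys.getsizeof('Al'), sys.getsizeof('Alexandria'))
```

Both lists hold three references, so they occupy the same amount of memory even though the names they refer to do not.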


A single list instance may include multiple references to the same object as elements of the list, and it is possible for a single object to be an element of two or more lists, as those lists simply store references back to that object. As an example, when you compute a slice of a list, the result is a new list instance, but that new list has references to the same elements that are in the original list.

<img src="../images/ch02/listslice.jpg" width="360"/>
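The sharing of elements between a list and its slice is easiest to see with mutable elements. The sketch below uses made-up names (`rooms`, `some`) and the `is` operator to test object identity.

```python
rooms = [['101'], ['102'], ['103'], ['104']]   # a list of mutable objects
some = rooms[1:3]                              # new list, same element objects

print(some is rooms)          # False: the slice is a distinct list
print(some[0] is rooms[1])    # True: both refer to the same inner list

# Rebinding a cell of the slice leaves the original list untouched...
some[0] = ['999']
# ...but mutating a shared element is visible through both lists.
some[1].append('sea view')
print(rooms)   # [['101'], ['102'], ['103', 'sea view'], ['104']]
```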

In [4]:
# Slicing a list builds a new list (a shallow copy),
# so rebinding b[0] leaves a unchanged.

a = list(range(0, 9))
print(a)
b = a[3:7]
print(b)

b[0] = -1
print(a)
print(b)

[0, 1, 2, 3, 4, 5, 6, 7, 8]
[3, 4, 5, 6]
[0, 1, 2, 3, 4, 5, 6, 7, 8]
[-1, 4, 5, 6]


In [5]:
import numpy as np

a = np.arange(10)
print(a)
b = a[3:7]  # NumPy slicing returns a view onto a, not a copy
print(b)

b[0] = -1
print(a)
print(b)

b = a[3:7].copy()  # explicit copy: independent of a
print(b)
b[1] = -2
print(a)
print(b)

[0 1 2 3 4 5 6 7 8 9]
[3 4 5 6]
[ 0  1  2 -1  4  5  6  7  8  9]
[-1  4  5  6]
[-1  4  5  6]
[ 0  1  2 -1  4  5  6  7  8  9]
[-1 -2  5  6]


<img src="../images/ch02/listslice2.jpg" width="360"/>

####  Shallow Copy vs. Deep Copy

<img src="../images/ch02/shallow1.jpg" width="300"/>
<img src="../images/ch02/shallow2.jpg" width="240"/>
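The difference between the two kinds of copy can be illustrated with the standard `copy` module. This is a small sketch with invented variable names, not the only way to copy nested lists.

```python
import copy

original = [[1, 2], [3, 4]]

shallow = copy.copy(original)      # new outer list, shared inner lists
deep = copy.deepcopy(original)     # new outer list and new inner lists

original[0].append(99)             # mutate an inner list in place

print(original)   # [[1, 2, 99], [3, 4]]
print(shallow)    # [[1, 2, 99], [3, 4]]  (shares the inner lists)
print(deep)       # [[1, 2], [3, 4]]      (fully independent)
```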

In [6]:
a = [0] * 8
a[2] = 1
a

[0, 0, 1, 0, 0, 0, 0, 0]

In [7]:
a = [[0] * 8] * 8
a[0][0] = 9
a

[[9, 0, 0, 0, 0, 0, 0, 0],
 [9, 0, 0, 0, 0, 0, 0, 0],
 [9, 0, 0, 0, 0, 0, 0, 0],
 [9, 0, 0, 0, 0, 0, 0, 0],
 [9, 0, 0, 0, 0, 0, 0, 0],
 [9, 0, 0, 0, 0, 0, 0, 0],
 [9, 0, 0, 0, 0, 0, 0, 0],
 [9, 0, 0, 0, 0, 0, 0, 0]]

<img src="../images/ch02/shallow3.jpg" width="240"/>

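All eight rows change because `[[0] * 8] * 8` stores eight references to the same inner list. The intended representation, with an independent row for each cell of the outer list, looks like this: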
<img src="../images/ch02/shallow4.jpg" width="380"/>

To properly initialize a two-dimensional list, we must ensure that each cell of the primary list refers to an independent instance of a secondary list. This can be accomplished through the use of Python’s list comprehension syntax.

In [8]:
m, n = 3, 2
data = [ [0] * n for j in range(m) ]
data

[[0, 0], [0, 0], [0, 0]]
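To check that the comprehension really produces independent rows, the two constructions can be compared side by side. The names `aliased` and `independent` are chosen here only for illustration.

```python
m, n = 3, 2

aliased = [[0] * n] * m                       # m references to ONE row
independent = [[0] * n for _ in range(m)]     # m separate rows

aliased[0][0] = 9
independent[0][0] = 9

print(aliased)                            # [[9, 0], [9, 0], [9, 0]]
print(independent)                        # [[9, 0], [0, 0], [0, 0]]
print(aliased[0] is aliased[1])           # True
print(independent[0] is independent[1])   # False
```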

An array can be extended and otherwise manipulated in the same ways as other Python sequences.

In [9]:
a = array.array('i', range(5))
print('Initial :', a)

a.extend(range(5))
print('Extended:', a)

print('Slice   :', a[3:6])

print('Iterator:', list(enumerate(a)))

print(type(a))
print(len(a))

Initial : array('i', [0, 1, 2, 3, 4])
Extended: array('i', [0, 1, 2, 3, 4, 0, 1, 2, 3, 4])
Slice   : array('i', [3, 4, 0])
Iterator: [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 0), (6, 1), (7, 2), (8, 3), (9, 4)]
<class 'array.array'>
10


In [10]:
a = array.array("B", range(16)) # unsigned char
b = array.array("h", range(-8,9)) # signed short

print(a)
print(repr(a.tobytes()))  # tostring() was removed in Python 3.9

print(b)
print(repr(b.tobytes()))

array('B', [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15])
b'\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f'
array('h', [-8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8])
b'\xf8\xff\xf9\xff\xfa\xff\xfb\xff\xfc\xff\xfd\xff\xfe\xff\xff\xff\x00\x00\x01\x00\x02\x00\x03\x00\x04\x00\x05\x00\x06\x00\x07\x00\x08\x00'
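The raw bytes can also be turned back into an array. The sketch below assumes the bytes are read on the same machine that produced them, since the byte order of multi-byte items is platform dependent; `values`, `raw`, and `restored` are illustrative names.

```python
import array

values = array.array('h', range(-8, 9))   # 17 signed shorts
raw = values.tobytes()

# Each element contributes exactly `itemsize` bytes.
print(len(values), values.itemsize, len(raw))

# Rebuild an equal array from the raw bytes.
restored = array.array('h')
restored.frombytes(raw)
print(restored == values)
```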


### List

A string is a sequence of characters; a list is a sequence of objects. Use a list when the order of the objects matters.

- ADT List Operations
    - Create an empty list
    - Determine whether the list is empty
    - Determine the number of items in a list
    - Add an item at a given position in a list
    - Remove the item at a given position in a list
    - Remove all the items from a list
    - Get the item at a given position in a list
    - Other operations?
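
The operations listed above could be expressed as a small class. This is only a sketch: the class name `ArrayList` and its method names are hypothetical, and it delegates storage to a built-in Python list instead of managing memory itself.

```python
class ArrayList:
    """A minimal list ADT backed by a Python list (illustrative only)."""

    def __init__(self):
        self._items = []                      # create an empty list

    def is_empty(self):
        return len(self._items) == 0

    def __len__(self):
        return len(self._items)

    def add(self, position, item):
        # Valid positions run from 0 up to and including the current length.
        if not 0 <= position <= len(self._items):
            raise IndexError('position out of range')
        self._items.insert(position, item)

    def remove(self, position):
        if not 0 <= position < len(self._items):
            raise IndexError('position out of range')
        return self._items.pop(position)

    def clear(self):
        self._items.clear()

    def get(self, position):
        if not 0 <= position < len(self._items):
            raise IndexError('position out of range')
        return self._items[position]


cities = ArrayList()
cities.add(0, 'Paris')
cities.add(1, 'Tokyo')
cities.add(1, 'Lima')        # shifts 'Tokyo' to position 2
print(len(cities), cities.get(1))
removed = cities.remove(0)
print(removed, cities.get(0))
```

Callers see only the interface (`add`, `remove`, `get`, and so on), so the underlying storage could later be swapped without changing code that uses the class.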
    
<img src="../images/ch02/list.png" width="680"/>

<img src="../images/ch02/list2.png" width="300"/>

<img src="../images/ch02/list3.png" width="300"/>    

In [11]:
fruits = ['orange', 'apple', 'pear', 'banana', 'kiwi', 'apple', 'banana']
fruits.index('banana')

3

In [12]:
fruits.index('banana', 4)

6

In [13]:
fruits.reverse()
fruits

['banana', 'apple', 'kiwi', 'banana', 'pear', 'apple', 'orange']

In [14]:
fruits.append('grape')
fruits

['banana', 'apple', 'kiwi', 'banana', 'pear', 'apple', 'orange', 'grape']

In [15]:
fruits.sort()
fruits

['apple', 'apple', 'banana', 'banana', 'grape', 'kiwi', 'orange', 'pear']

In [16]:
squares = []
for x in range(10):
    squares.append(x**2)
squares

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

In [17]:
squares = [x**2 for x in range(10)]
squares

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

In [18]:
[(x, y) for x in [1,2,3] for y in [3,1,4] if x != y]

[(1, 3), (1, 4), (2, 3), (2, 1), (2, 4), (3, 1), (3, 4)]

In [19]:
vec = [-4, -2, 0, 2, 4]
[x*2 for x in vec]

[-8, -4, 0, 4, 8]

In [20]:
matrix = [
   [1, 2, 3, 4],
   [5, 6, 7, 8],
   [9, 10, 11, 12],
]
[[row[i] for row in matrix] for i in range(4)]

[[1, 5, 9], [2, 6, 10], [3, 7, 11], [4, 8, 12]]
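
The nested comprehension above transposes the matrix. As an alternative sketch, the built-in `zip` can do the same when the rows are unpacked as separate arguments; the name `transposed` is introduced here for illustration.

```python
matrix = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]

# zip(*matrix) groups the i-th entries of every row; each group is a tuple,
# so it is converted back to a list to match the comprehension's output.
transposed = [list(column) for column in zip(*matrix)]
print(transposed)   # [[1, 5, 9], [2, 6, 10], [3, 7, 11], [4, 8, 12]]
```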