# 03 Python Data Structures I - Lists and Tuples


## Plan for the Lecture:

1. Concept of a Data Structure

2. Arrays and Strings

3. Lists 

4. Tuples 

## 1.0 Concept of a Data Structure

* Unlike a variable which stores one value at a time, a data structure is built to store a collection of values. 

* Indexed structures allow for random access (as in RAM): an item can be located directly by its index. Arrays and vectors allow for this.

* Non-indexed (or referenced) structures, on the other hand, are navigated sequentially. Examples include a stream of data from the keyboard or from a file, or a linked list in which each node has a pointer to the next in the sequence.
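
The two access styles can be sketched in Python. This is a minimal illustration only: it assumes a list as the indexed structure and an iterator as a stand-in for a sequential stream, and the variable names are made up for the example.

```python
# Indexed structure: jump straight to any position.
readings = [10, 20, 30, 40, 50]
third = readings[2]                 # direct access by index

# Sequential structure: an iterator can only hand out the next item.
stream = iter(readings)
first = next(stream)
second = next(stream)
third_from_stream = next(stream)    # had to pass the first two items to get here

print(third, third_from_stream)     # 30 30
```

Both routes reach the same value, but the iterator had to step through every earlier item first.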



In [29]:
num1 = 10
num2 = 20
num3 = 30
num4 = 40
num5 = 50
num6 = 60
num7 = 70
num8 = 80
num9 = 90
num10 = 100

This is a lot of variables... isn't there a better way to <b>collect</b> these values in one place?

In [40]:
numbers = [10,20,30,40,50,60,70,80,90,100]

## 1.1 Application of Data Structures
* There are entire modules (courses) dedicated to this subject.

* The performance of typical operations (insert, delete, search and sort) varies across the structures. 

* Big O notation (complexity): constant, linear, polynomial, linearithmic, quadratic etc. 

* Path finding algorithms. 

* Computer vision. 
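
To make the point about performance more concrete, the sketch below counts how many elements two search strategies examine. The function names are hypothetical helpers written for this example; the linear search is O(n), and the binary search is O(log n) but assumes the data is already sorted.

```python
def linear_search_steps(values, target):
    """Return how many elements are examined before target is found."""
    steps = 0
    for value in values:
        steps += 1
        if value == target:
            break
    return steps


def binary_search_steps(sorted_values, target):
    """Return how many midpoints are examined; the input must be sorted."""
    low, high = 0, len(sorted_values) - 1
    steps = 0
    while low <= high:
        steps += 1
        mid = (low + high) // 2
        if sorted_values[mid] == target:
            break
        if sorted_values[mid] < target:
            low = mid + 1       # target is in the upper half
        else:
            high = mid - 1      # target is in the lower half
    return steps


data = list(range(1, 1025))     # 1..1024, already sorted
linear_steps = linear_search_steps(data, 1024)
binary_steps = binary_search_steps(data, 1024)
print(linear_steps, binary_steps)   # 1024 11
```

Searching for the last of 1024 values takes 1024 checks one way and 11 the other, which is the kind of gap Big O notation describes.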


## 1.2 Recap on the Data Types we've seen: 

* `str` = string 

* `int` = integer

* `float` = floating point number

* `bool` = boolean (True/False)

In [None]:
name = "Nick"

In [None]:
num = 5

In [None]:
pi = 3.14

In [None]:
is_enrolled = True

## 1.3 Memory (briefly) - Base-10 to Base-2

* We're familiar with base-10 (decimal): 10, 255, 1024 etc

* A byte is 8 bits. 

* Bits (base-2) can be turned on or off. 

<img src="https://knowthecode.io/wp-content/uploads/2016/10/CS_0100_Understanding_How_a_Computer_Works__1__key10.png" alt="base_2_number_system" width="650"> 

In [17]:
num = 10 

An easy way to convert decimal to binary (base-2) in Python is the `bin()` function.

In [19]:
bin(num)

'0b1010'

In [20]:
bin(num)[2:]

'1010'

In [23]:
binary = "1010"
decimal = int(binary, 2)
decimal

10
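
For comparison, the conversion that `bin()` performs can be written by hand with repeated division by 2. This is a minimal sketch for non-negative integers only, and `to_binary` is a hypothetical name chosen for this example.

```python
def to_binary(number):
    """Convert a non-negative integer to a binary string by repeated division by 2."""
    if number == 0:
        return "0"
    digits = []
    while number > 0:
        digits.append(str(number % 2))   # the remainder is the next bit, least significant first
        number //= 2
    return "".join(reversed(digits))


print(to_binary(10))    # 1010
print(to_binary(255))   # 11111111
```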

## 1.4 How Data Types are represented in Memory

* The figures below are <b>signed</b>, meaning the range is spread across positive and <b>negative</b> numbers.


<img src="https://scaler.com/topics/images/data-type-size-based-3.webp" alt="base_2_number_system" width="650"> 

What about a boolean? Could this be represented as a single 'bit'?   
   0 = False and 1 = True?

In [25]:
binary = "00000000"
decimal = int(binary, 2)
decimal

0

In [24]:
binary = "11111111"
decimal = int(binary, 2)
decimal

255

* The above is <b>unsigned</b> as the range is 0 to 255

* As you can see in the diagram, a <b>signed</b> byte would be -128 to 127

## 1.5 Signed vs unsigned 

* The diagram below shows the contrast between signed and unsigned ranges: 

![signed_unsigned](https://fastbitlab.com/wp-content/uploads/2022/05/Figure-2-1024x544.png)
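
The same eight bits can be read either way. The sketch below uses the built-in `int.from_bytes()`, which interprets the byte as two's complement when `signed=True`; the variable names are made up for the example.

```python
raw = bytes([0b11111111])   # one byte with every bit switched on

as_unsigned = int.from_bytes(raw, byteorder="big", signed=False)
as_signed = int.from_bytes(raw, byteorder="big", signed=True)
print(as_unsigned, as_signed)   # 255 -1

# The extremes of a signed byte:
lowest = int.from_bytes(bytes([0b10000000]), byteorder="big", signed=True)
highest = int.from_bytes(bytes([0b01111111]), byteorder="big", signed=True)
print(lowest, highest)          # -128 127
```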

In [39]:
import sys

num = 10
print(type(num))
print("number of bytes:", sys.getsizeof(num))

<class 'int'>
number of bytes: 28


Huh?!

  Shouldn't an integer be either 32 bits (4 bytes) or 64 bits (8 bytes)? 

   <b>Python adds a significant overhead of bytes here</b>
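
One way to see where the extra bytes come from: in CPython an `int` is a full object that carries bookkeeping data (such as a reference count and a pointer to its type) and grows to fit the value it holds. The sketch below is illustrative only, and the exact byte counts depend on the Python version and platform.

```python
import sys

small = 10
large = 2 ** 100      # far too big for a 64-bit machine integer

small_size = sys.getsizeof(small)
large_size = sys.getsizeof(large)
print(small_size, large_size)   # exact figures depend on the Python build
```

Because the integer grows as needed, Python integers do not overflow the way fixed-width 32-bit or 64-bit integers do.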

## 2.0 Array 

* The items in an array are called elements.

* We specify how many elements an array will have when we declare the size of the array (if ‘fixed-size’), unlike flexible sized collections (e.g. ArrayList in Java).

* Elements are numbered and can be referred to by that number. The number inside the `[ ]` is called the index, and it is used when data is input and output.

* An array can only store data that matches the type the array is declared with.



<img src="https://scaler.com/topics/images/character-in-character-array.webp" alt="char_array" width="650"> 

* In statically typed languages such as C, we have to declare up-front how much memory is required:

* `char name[5] = {'N', 'i', 'c', 'k', '\0'}; `

* Here the array takes 5 bytes: 4 characters plus the null terminator `'\0'`

* In Python it is not necessary to allocate memory at declaration. 
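
Python's standard library does offer a typed array through the `array` module, which gives a feel for the "one type only" rule. This is a sketch for illustration: unlike a fixed-size C array it can still grow, and the variable names are made up for the example.

```python
from array import array

scores = array("i", [10, 20, 30])   # type code "i" = signed integers
scores.append(40)                   # fine: same type

try:
    scores.append(2.5)              # a float does not match the declared type
    rejected = False
except TypeError:
    rejected = True

print(scores.tolist(), rejected)    # [10, 20, 30, 40] True
print(scores.itemsize)              # bytes per element; platform dependent
```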

## ASCII Table (quickly)

* How to represent alphanumerics (characters, symbols, numbers, keyboard characters) in decimal format: 

![ascii_table](https://www.asciitable.com/asciifull.gif)

* Just as `bin()` converts a number to binary, the function `ord()` converts a character to its decimal code.

* We can also use `chr()` to convert a code back to its character.

In [32]:
ord('A')

65

In [31]:
ord('a')

97

In [30]:
ord('N')

78

In [36]:
name = "Nick"

for letter in name:
    print(letter, ":", ord(letter))

N : 78
i : 105
c : 99
k : 107


In [43]:
ord('A')

65

In [42]:
chr(65)

'A'

In [44]:
bin(65)

'0b1000001'

In [47]:
for i in range(65, 91):
    print(i, chr(i))

65 A
66 B
67 C
68 D
69 E
70 F
71 G
72 H
73 I
74 J
75 K
76 L
77 M
78 N
79 O
80 P
81 Q
82 R
83 S
84 T
85 U
86 V
87 W
88 X
89 Y
90 Z


## 2.1 Strings are Arrays of Characters (in Python, a `str`)

* A String (str) object is an immutable array of characters. 

* Each character has a numbered position in the array (index):

* We can make use of methods (functions that belong to the string) to perform operations on the string.
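
Immutability means a character cannot be replaced in place. The sketch below shows the error Python raises and the usual workaround of building a new string; the flag `changed_in_place` is a name made up for this example.

```python
name = "Nick"

try:
    name[0] = "R"            # strings do not support item assignment
    changed_in_place = True
except TypeError:
    changed_in_place = False

# Build a new string instead of editing the old one.
new_name = "R" + name[1:]
print(changed_in_place, name, new_name)   # False Nick Rick
```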



In [1]:
name = "Nick"

In [2]:
i = 0
for x in name: 
    print("[" + str(i) + "]" + " : " + str(x))
    i += 1

[0] : N
[1] : i
[2] : c
[3] : k


In [3]:
name[0]

'N'

In [4]:
name[3]

'k'

In [6]:
type(name[0])

str

In [5]:
print(type(name[0]))

<class 'str'>


## 2.2 The `dir` of methods

In [None]:
dir(str)

In [None]:
name

In [None]:
name.find('c')

In [None]:
name.find('C')

In [None]:
name.lower()

In [None]:
name.upper()

## 3. Python Lists `[ ]`

* A list in Python does use the subscript operator `[ ]` typically associated with an array. Elements in this list are also indexed.

* The list maintains a pointer (reference) to each object, rather than storing the values themselves (remember Python types are classes).

* Lists in Python are resizable, unlike static arrays, which are fixed.

* Python lists can store elements of different types, whereas arrays are declared to store values of one type.
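
The points above can be sketched in a few lines. The names `inner` and `outer` are made up for the example; the key observation is that the list holds a reference to `inner`, not a copy of it.

```python
inner = ["a", "b"]
outer = [1, 2.5, inner]     # mixed types; the third element refers to `inner`

inner.append("c")           # change the object through its original name
print(outer)                # [1, 2.5, ['a', 'b', 'c']]
print(outer[2] is inner)    # True: same object, not a copy

outer.append(True)          # lists can grow after creation
print(len(outer))           # 4
```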


In [None]:
l = [1,2.25,"Nick","N",True]
l

In [None]:
l[5]  # IndexError: the list has 5 elements, so valid indices are 0 to 4

In [None]:
len(l)

In [None]:
l = [1,2,3,4,5,6]
l

In [None]:
l[0]

In [None]:
l[-1]

In [None]:
l[-2]

In [None]:
l

In [None]:
l[2:4]

In [None]:
type(name)

In [None]:
type(l)

In [None]:
dir(list)

In [None]:
l

In [None]:
l.append(7)
l

In [None]:
l.remove(7)
l

## 3.2 NumPy Arrays

![numpy_logo](https://upload.wikimedia.org/wikipedia/commons/thumb/3/31/NumPy_logo_2020.svg/320px-NumPy_logo_2020.svg.png)
* Numerical Python (NumPy) is a package full of methods that can perform useful operations on data.  

* NumPy provides a convenient API (Application Programming Interface) that provides a way to ‘interface’ with / operate on data. 

* It reintroduces types, which means a little more code but gives a more efficient way to search, sort and store data than the ‘loosely’ typed nature of Python that we’ve seen so far. 

* More documentation available at: https://numpy.org 

## 3.2.1 NumPy Arrays vs Python List

* NumPy arrays are different to Python Lists. 

* NumPy arrays reintroduce the ‘typed’ nature of more ‘verbose’ languages (C, C++, Java), where everything is explicitly typed. 

* NumPy arrays operate like arrays in C and Java: they are declared to store data of one type (e.g. only integers), unlike Python lists and JS arrays, which can store data of different types. 

* Data placed in a NumPy array is therefore ‘cast’ to the array's type: floating point numbers are truncated to integers when an integer `dtype` is requested, and in some cases (strings to integers) an error is produced.

In [12]:
import numpy as np

In [13]:
numpy_arr = np.array([3.14,2,3,4,5]) 
numpy_arr

array([3.14, 2.  , 3.  , 4.  , 5.  ])

In [14]:
py_list = [0, 1.0, "N", True]
py_list

[0, 1.0, 'N', True]

In [16]:
numpy_arr[3].dtype

dtype('float64')

How many bytes is 64 bits? 

In [45]:
a = np.array([1,2,3],dtype='float32')
a


array([1., 2., 3.], dtype=float32)

In [46]:
a[0].dtype

dtype('float32')

How many bytes? 

## 4. Tuples in Python `( )`

* We’ve seen that a Python list is indexed and can store elements of different types (heterogeneity) 

* Tuples are constant (immutable): once a tuple is created, its elements cannot be reassigned. 

* A list is declared with `[ ]` whereas the tuple is declared with `( )`

* We can still refer to elements in a tuple via the `[ ]` 


In [None]:
l = []

In [None]:
t = (1,2,3,4,5,6)
t

In [None]:
t[0]

In [None]:
t[0] = 5  # TypeError: tuples do not support item assignment

In [None]:
type(t)

In [None]:
dir(tuple)

In [97]:
t1 = (1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,3,3)

In [None]:
t1.count(3)

In [None]:
t.index(5)

## 5. Tuples vs Lists 

* Tuples are immutable (constant): once a tuple is created, its elements cannot be reassigned. 

* A list is mutable – elements can be reassigned. 

* A list is declared with `[ ]` whereas the tuple is declared with `( )`

* We can refer to elements in both a list and tuple via the `[ ]` 
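
The contrast can be sketched side by side. The variable names are made up for the example, and the last two lines show that a tuple's name can still be pointed at a brand-new tuple even though the original tuple never changes.

```python
point_list = [1, 2, 3]
point_tuple = (1, 2, 3)

point_list[0] = 99            # allowed: lists are mutable

try:
    point_tuple[0] = 99       # not allowed: tuples are immutable
    tuple_changed = True
except TypeError:
    tuple_changed = False

print(point_list, point_tuple, tuple_changed)   # [99, 2, 3] (1, 2, 3) False

# The name can still be bound to a new tuple built from the old one.
point_tuple = point_tuple + (4,)
print(point_tuple)            # (1, 2, 3, 4)
```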


## Summary 

* You can distinguish between the key collections by the pairs of brackets used: 

| Structure | Brackets | Characteristics |
| ----------- | ----------- | --------- |
| Lists |	`[ , ]` | mutable |
| Tuples |	`( , )` | immutable | 


#### This Jupyter Notebook contains exercises for you to extend your introduction to OOP, by creating lists, tuples, sets, dictionaries of objects. Attempt the following exercises, which gradually build in complexity. If you get stuck, check back to the <a href="https://www.youtube.com/watch?v=359eGFD7hS4">Python lecture recording on Data Structures here</a> or view the <a href="https://www.w3schools.com/python/python_lists.asp">W3Schools page on Python Lists</a>, which includes examples, exercises and quizzes to help your understanding. 

### Exercise 1: 

Write your name as a string, and print out the last character of your name. Is there a convenient way to get to the last element of a `list`?

Extension: Can you write a loop to print out each letter of your name on a separate line?

Extension: Can you use Python to write your name in binary (base-2)?

In [None]:
# Write your solution here


### Exercise 2:
Create a Python `list` that stores the numbers 1-10 in individual elements. Then print out the contents of the `list` to check the values have been stored correctly.

Extension: make use of an appropriate `list` method to reverse the order of this `list`.

Extension: how much memory does this list take up? Can you use Python to calculate the total size in memory? 

In [None]:
# Write your solution here


### Exercise 3: 

Now create a function that returns the first and last values of a `list` passed in. Code this function so that it will work with any length of `list`. 

Question: Which built-in function will return the length of a `list`?

In [None]:
def first_and_last(values):
    ... # Write your solution here.

In [None]:
l = ...
first_and_last(l)

### Exercise 4: 

Consider the given `tuple` below. Is there a method you can call to return the `count` of the value `6`?

In [1]:
t = (1,2,3,4,5,6,4,5,6,7,8,9,1,2,3,7,8,9,4,5,6)

# Write your solution here

### Exercise 5:

Write a function which will square (raise to the power of 2) the contents of a list of values passed in. Test that this works by passing in the list of numbers (1-10) you created in Exercise 2.

<b>Extension</b>: what happens if the values in a list are not `ints` or `floats`? How would you respond to this event?

In [None]:
def square(values):
    ...

In [None]:
# Write your solution here


### Exercise 6: 

Write a function that will find the middle element of a given `list`. 

If you had a `list` with an odd number as its length, such as `[1,2,3,4,5,6,7,8,9]`, then the middle element will have symmetric halves (four values either side of the middle value 5). 

For an even number as a length, either return both elements in the middle or round up or down to either side.  

Check your function works for a variety of `list` sizes.

In [None]:
# Write your solution here


### Exercise 7:
Write a function that takes two `lists` of numbers, and returns the number that appears most frequently across both `lists` (the mode). 

Hint: if you get stuck, try creating a tally of how many times each number appears. Have you `counted` the instances of the same value before?


In [None]:
def most_frequent(a, b):
    ... #write your solution here

a = [0,1,3,4,6,3,2,4,1,9,5,6,7,7,1,8,4,0]
b = [7,3,9,6,7,4,2,1,3,9,7,5,1,3,4,2,1,8]

result = most_frequent(a, b)
result


### (Bonus) Exercise (in the style of an interview question)

You are given a list of integers, and your task is to find the longest run of consecutive integers (values that each differ by 1, such as 4, 5, 6) that can be formed from the values in the list. The values do not need to be adjacent, or in order, in the original list. 

Write a Python function to solve this problem. Your function should return the longest consecutive subsequence found in the original list.

For example, given the input list: ``` [4, 2, 8, 5, 6, 7, 11, 12, 10]```

The longest consecutive subsequence is: ``` [4, 5, 6, 7, 8] ```


In [None]:
def longest_consecutive_subsequence(numbers):
    #write your solution here
    ...
    #write your solution above


numbers = [4, 2, 8, 5, 6, 7, 11, 12, 10]
result = longest_consecutive_subsequence(numbers)
print(result)  