It might help to remember that Python's `str` type holds Unicode text (a sequence of code points), whereas `bytes` holds raw bytes, such as the result of encoding that text with an encoding like the commonly used UTF-8.

In [56]:
# convert a unicode string into its UTF-8 bytes representation
str_val = 'foo'
print(type(str_val))
val_utf8 = str_val.encode('utf-8') #reminder: encode() is a method
print(type(val_utf8))

# decode it back
val_utf8.decode('utf-8') # as we already knew the encoding

<class 'str'>
<class 'bytes'>


'foo'
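The difference between the two types only becomes visible once a string contains non-ASCII characters. A minimal sketch (the sample string is just an illustrative choice, written with an escape so the code point is unambiguous):

```python
# A str counts Unicode code points; bytes counts encoded bytes.
text = 'caf\u00e9'                 # 'café', the last letter is U+00E9
encoded = text.encode('utf-8')

print(len(text))                   # 4 characters
print(len(encoded))                # 5 bytes: U+00E9 takes two bytes in UTF-8
print(encoded)                     # b'caf\xc3\xa9'

# Decoding with the same encoding recovers the original text
roundtrip = encoded.decode('utf-8')
print(roundtrip == text)           # True
```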

OOP Talk: Everything is an object in Python: numbers, functions, strings, data structures, you name it. This also means that every object has an inherent type (class).

In [57]:
#To check if an object is a particular type
integer = 5
print(isinstance(integer, int))

#To check if an object is of one of many types
integer = 5
isinstance(integer, (float, str))

True


False

In [58]:
# floats can be written in scientific notation
float_val = 9.3e-4
float_val

0.00093

We start with sequences. <br>
Note: Lists, dicts and NumPy arrays are mutable (their values can be modified in place), whereas strings and tuples are immutable.
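To see what mutable vs immutable means in practice, here is a small sketch (variable names are made up for illustration):

```python
numbers_list = [1, 2, 3]
numbers_list[0] = 100            # fine: lists can be changed in place
print(numbers_list)              # [100, 2, 3]

numbers_tuple = (1, 2, 3)
try:
    numbers_tuple[0] = 100       # tuples do not support item assignment
except TypeError as err:
    print("tuple:", err)

word = 'foo'
try:
    word[0] = 'b'                # neither do strings
except TypeError as err:
    print("str:", err)

# To "change" an immutable object, build a new one instead
new_word = 'b' + word[1:]
print(new_word)                  # boo
```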

### List

- `[ ]` is used to define a list
- the values within a list are modifiable, i.e. you can change them
- the order of elements is maintained

In [75]:
letters_list = ['a','b','c']
letters_list

['a', 'b', 'c']

Subscript notation <br><br>
`letters_list[index]` accesses the value at position `index`. Indices start at 0, so the first element is `letters_list[0]`, which is `'a'`.
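A quick sketch of subscripting, including negative indices and what happens when the index is out of range (the list here is a fresh one, just for illustration):

```python
letters = ['a', 'b', 'c']

first = letters[0]       # indices start at 0
last = letters[-1]       # negative indices count from the end
print(first, last)       # a c

try:
    letters[3]           # there is no element at index 3
except IndexError as err:
    print("IndexError:", err)
```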

**Slicing** is a frequently used feature. The notation is `list[start:stop:step]`, i.e. the slice starts at the _start index_ and stops at the element just before the _stop index_.

In [7]:
init_list = [1, 2, 3, 4]
print(init_list)

sliced_list = init_list[2:3] 
# the output does NOT include the element at index 3 (the stop index is exclusive)
print("Sliced list:",  sliced_list)

init_list[1:2] = [100, 200]
print("Same list with an assigned slice:", init_list)

print("And this is negative slicing that starts at the end:", init_list[-2:])

# A common use of slicing is reversing a list using a negative step
print("Reversed list", init_list[::-1])

# alternatively you can use reversed()
print(list(reversed(init_list)))

[1, 2, 3, 4]
Sliced list: [3]
Same list with an assigned slice: [1, 100, 200, 3, 4]
And this is negative slicing that starts at the end: [3, 4]
Reversed list [4, 3, 200, 100, 1]
[4, 3, 200, 100, 1]


#### append() and insert()

_method_
<br>
`append()` adds a new element at the **end** of the list, whereas `insert()` lets you specify its position. `insert()` is, however, computationally more expensive, because every element after the insertion point has to be shifted to make room.

In [76]:
letters_list.append('d')
print(letters_list)

letters_list.insert(1, 'foo')
print(letters_list)

['a', 'b', 'c', 'd']
['a', 'foo', 'b', 'c', 'd']
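If you need to insert at the front a lot, one option from the standard library is `collections.deque`, which supports fast appends on both ends. A minimal sketch for comparison (not a benchmark):

```python
from collections import deque

letters = ['b', 'c', 'd']
letters.insert(0, 'a')           # every existing element shifts right by one
print(letters)                   # ['a', 'b', 'c', 'd']

letters_deque = deque(['b', 'c', 'd'])
letters_deque.appendleft('a')    # constant-time insertion at the left end
print(list(letters_deque))       # ['a', 'b', 'c', 'd']
```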


#### remove() and pop()
*method*
<br>`remove()` removes the first occurrence of a given value, whereas `pop()` removes and returns the element at a given index (the last element if no index is given).

In [77]:
letters_list.remove('c')
print(letters_list)

print(letters_list.pop(-1))
print(letters_list)

['a', 'foo', 'b', 'd']
d
['a', 'foo', 'b']
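Two details worth knowing: `remove()` only deletes the first match and raises a `ValueError` if the value is absent, and `pop()` defaults to the last element. A small sketch on a throwaway list:

```python
values = ['a', 'b', 'a', 'c']

values.remove('a')               # only the first 'a' is removed
print(values)                    # ['b', 'a', 'c']

try:
    values.remove('z')           # 'z' is not in the list
except ValueError as err:
    print("ValueError:", err)

last = values.pop()              # no index: remove and return the last element
print(last, values)              # c ['b', 'a']

first = values.pop(0)            # with an index: remove and return that element
print(first, values)             # b ['a']
```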


#### extend()
_method_<br>
Appends all the elements of another sequence to an existing list, in place.

In [78]:
new_list = [4, 44, 444]
letters_list.extend(new_list)
print(letters_list)

['a', 'foo', 'b', 4, 44, 444]
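It is easy to mix up `extend()` with `append()` and `+`, so here is a short side-by-side sketch (lists are made up for illustration):

```python
appended = [1, 2]
appended.append([3, 4])          # the whole list is added as ONE element
print(appended)                  # [1, 2, [3, 4]]

extended = [1, 2]
extended.extend([3, 4])          # each element is added individually, in place
print(extended)                  # [1, 2, 3, 4]

left = [1, 2]
combined = left + [3, 4]         # + builds a brand-new list, left is unchanged
print(combined, left)            # [1, 2, 3, 4] [1, 2]
```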


#### list()
_function_<br>
`list()` always creates a new list, either as a (shallow) copy of an existing list or built from any other iterable such as a `range`.

In [62]:
list(range(6))

[0, 1, 2, 3, 4, 5]
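Note that the copy made by `list()` is shallow: the outer list is new, but nested objects are shared with the original. A minimal sketch:

```python
original = [[1, 2], [3, 4]]
copied = list(original)

print(copied is original)        # False: a new outer list
copied.append([5, 6])            # does not affect the original
print(original)                  # [[1, 2], [3, 4]]

copied[0].append(99)             # the inner lists are the same objects
print(original)                  # [[1, 2, 99], [3, 4]]
```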

In [21]:
seq = [1, 2, 3, 4]
for i in range(len(seq)):
    val = seq[i]
    print(val)

1
2
3
4


_Checking whether a list contains a value is a lot slower than doing so with dicts and
sets (to be introduced shortly), as Python makes a linear scan across the values of the
list, whereas it can check the others (based on hash tables) in constant time._ - Wes McKinney, Python for Data Analysis
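You can get a rough feel for this with `timeit`. The sketch below uses arbitrary sizes and repeat counts, and the timings will differ from machine to machine:

```python
import timeit

values_list = list(range(100_000))
values_set = set(values_list)
target = 99_999                  # last element: worst case for the list scan

# Time 100 membership checks against each container
list_time = timeit.timeit(lambda: target in values_list, number=100)
set_time = timeit.timeit(lambda: target in values_set, number=100)

print(f"list: {list_time:.5f}s, set: {set_time:.5f}s")
```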

#### enumerate()
_function_<br>
A widely used function for when you want to iterate over a sequence while keeping track of the index: it yields (index, value) tuples.

In [9]:
some_list = ['a', 'b']

for i, v in enumerate(some_list):
    print("index {0} value {1}".format(i, v))

index 0 value a
index 1 value b
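`enumerate()` also takes an optional `start` argument, handy when you want the count to begin at 1. A small sketch:

```python
some_list = ['a', 'b']

for position, value in enumerate(some_list, start=1):
    print(f"{position}. {value}")

# enumerate is lazy, so wrap it in list() to see the pairs
pairs = list(enumerate(some_list, start=1))
print(pairs)                     # [(1, 'a'), (2, 'b')]
```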


#### zip()

_function_<br>
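Pairs up elements from different sequences and yields them as tuples (in Python 3 it returns a lazy iterator, so wrap it in `list()` if you need a list of tuples). Often combined with `enumerate()` for iterating over multiple sequences as follows: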

In [1]:
seq1 = ['a', 'b', 'c']
seq2 = ['A', 'B', 'C']

for i, (element1, element2) in enumerate(zip(seq1, seq2)):
    print('Index {0}: {1} is from seq1 and {2} is from seq2'.format(i, element1, element2))

Index 0: a is from seq1 and A is from seq2
Index 1: b is from seq1 and B is from seq2
Index 2: c is from seq1 and C is from seq2
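One thing to watch out for: `zip()` stops at the shortest input. If you want to keep going, `itertools.zip_longest` pads the missing values. A sketch with made-up data:

```python
from itertools import zip_longest

keys = ['a', 'b', 'c']
values = [1, 2]

# zip stops as soon as the shortest sequence runs out
print(list(zip(keys, values)))                        # [('a', 1), ('b', 2)]

# zip_longest pads the shorter sequence with fillvalue
print(list(zip_longest(keys, values, fillvalue=0)))   # [('a', 1), ('b', 2), ('c', 0)]

# a common idiom: build a dict from two sequences
mapping = dict(zip(keys, values))
print(mapping)                                        # {'a': 1, 'b': 2}
```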


In [4]:
# Another use of zip is to unzip pairs into tuples
gunners = [('Bukayo', 'Saka'), ('Emile', 'Smith Rowe'), ('Eddie', 'Nketiah')]
first, last = zip(*gunners)

print("First names: ", first)
print("Last names: ", last)

First names:  ('Bukayo', 'Emile', 'Eddie')
Last names:  ('Saka', 'Smith Rowe', 'Nketiah')


#### *List Comprehension*

In [1]:
# instead of a code block, you do the iteration in one line of code
# like this

list_comp = [num for num in range(10) if num % 3 == 0]
list_comp # it's clean, compact and often faster

# set and dict comprehensions are similar

[0, 3, 6, 9]
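For completeness, a short sketch of the set and dict versions (the word list is just an example):

```python
words = ['foo', 'bar', 'python', 'baz']

# set comprehension: curly braces, duplicates collapse automatically
lengths_set = {len(word) for word in words}
print(lengths_set == {3, 6})     # True

# dict comprehension: curly braces with key: value
lengths_dict = {word: len(word) for word in words}
print(lengths_dict)              # {'foo': 3, 'bar': 3, 'python': 6, 'baz': 3}
```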

In [5]:
# let's try a nested for loop

capital_letters = 'ABCDE'
small_letters = 'abcde'

combinations = [caps + smalls for caps in capital_letters for smalls in small_letters]
combinations[:7]


['Aa', 'Ab', 'Ac', 'Ad', 'Ae', 'Ba', 'Bb']

### Tuple

- ( ) are used for tuples
- the values within a tuple are immutable, i.e. you can't change them afterwards. This also means tuples are fixed-length
- the order is maintained (a is the first value, b is the second...)
- subscript notation applicable
- widely used in returning multiple values from a function
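As a sketch of the last point, a function that "returns two values" is really returning one tuple, which the caller can unpack. The helper `min_max` below is hypothetical, written just for this illustration:

```python
def min_max(values):
    """Return the smallest and largest value as a tuple."""
    return min(values), max(values)   # the comma builds a tuple

result = min_max([7, 8, 11, 30, 99])
print(result)                    # (7, 99)
print(type(result))              # <class 'tuple'>

lowest, highest = result         # unpack into separate names
print(lowest, highest)           # 7 99
```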

In [25]:
letters_tuple = ('a','b','c')
print("This is a tuple:", letters_tuple)

nested_tuple = (1,2),('a','b')
print("This is a nested tuple:", nested_tuple)

string_tuple = tuple("foo") # type casting
print("This is a string converted tuple:", string_tuple)

print("Though immutable, you can access elements of a tuple: ", letters_tuple[1])
print("This is a concatenated tuple: ", letters_tuple + string_tuple)
print("So is this: ", string_tuple * 3)

This is a tuple: ('a', 'b', 'c')
This is a nested tuple: ((1, 2), ('a', 'b'))
This is a string converted tuple: ('f', 'o', 'o')
Though immutable, you can access elements of a tuple:  b
This is a concatenated tuple:  ('a', 'b', 'c', 'f', 'o', 'o')
So is this:  ('f', 'o', 'o', 'f', 'o', 'o', 'f', 'o', 'o')


In [37]:
# unpacking a tuple comes in handy
shirt_numbers = (7, 8, 11, 30, 99)
Saka, Ode, Gabi, *_ = shirt_numbers # _ or *_ typically used for unwanted values
print("Gabi got {}!".format(Gabi))

# this makes swapping variable values easy
White = 22
Elneny = 4
Elneny, White = White, Elneny
print("White grabbed {} from Elneny".format(White))

Gabi got 11!
White grabbed 4 from Elneny


#### count()
_method_
<br>counts the frequency of a value in a tuple

In [40]:
shirt_numbers.count(7)

1

### Dict

- { key: value }
- a key can be a scalar type or a tuple, as long as it is immutable, a.k.a. hashable (a tuple only qualifies if all of its elements are hashable too)
- value can be any Python object
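A quick sketch of what "hashable" means for keys (the dict contents are made up):

```python
# tuples of hashable values work as keys
grid = {(0, 0): 'origin', (1, 2): 'some point'}
print(grid[(1, 2)])              # some point

# lists are mutable, hence unhashable, hence not allowed as keys
try:
    bad = {[0, 0]: 'origin'}
except TypeError as err:
    print("list key rejected:", err)

# a tuple that contains a list is not hashable either
try:
    hash((1, [2, 3]))
except TypeError as err:
    print("tuple containing a list rejected:", err)
```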

In [16]:
e = {'e': 'y'}
print("Keys of the dict: ", list(e.keys()))
print("Values of the dict: ", list(e.values()))

f = {'f': 'z'}

# now let's merge them into one dict
e.update(f)
print(e)

'y' in e # False, because `in` only checks whether a key is present in a dict, and 'y' is a value

Keys of the dict:  ['e']
Values of the dict:  ['y']
{'e': 'y', 'f': 'z'}


False
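To test for a value rather than a key, you can check against `values()`. A short sketch that rebuilds the merged dict from above:

```python
e = {'e': 'y', 'f': 'z'}

print('e' in e)                  # True: 'e' is a key
print('y' in e)                  # False: 'y' is a value, not a key
print('y' in e.values())         # True: search the values explicitly

# get() returns a default instead of raising KeyError for a missing key
print(e.get('x', 'missing'))     # missing
```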

Also worth knowing: the `setdefault()` method and the `defaultdict` class from the `collections` module (more on these to be added later).
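In the meantime, a minimal sketch of both, grouping words by their first letter (the word list is made up for illustration):

```python
from collections import defaultdict

words = ['apple', 'bat', 'bar', 'atom']

# setdefault(): return the value for the key, inserting a default if missing
by_letter = {}
for word in words:
    by_letter.setdefault(word[0], []).append(word)
print(by_letter)                 # {'a': ['apple', 'atom'], 'b': ['bat', 'bar']}

# defaultdict: missing keys are created on first access via the factory (list)
by_letter_dd = defaultdict(list)
for word in words:
    by_letter_dd[word[0]].append(word)
print(dict(by_letter_dd))        # {'a': ['apple', 'atom'], 'b': ['bat', 'bar']}
```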


### Set

- `{ }` with elements inside, but `set()` to create an empty set (an empty `{}` creates a dict)
- the contents of a set are modifiable, i.e. you can add and remove elements; duplicates are dropped automatically
- no order is maintained, so subscript notation does not work
- you can perform set operations like union (`set_a.union(set_b)` or `set_a | set_b`) and intersection (`set_a.intersection(set_b)` or `set_a & set_b`) on sets
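A small sketch of these operations, plus the empty-set gotcha (the sets are made up for illustration):

```python
set_a = {1, 2, 3}
set_b = {3, 4}

union = set_a | set_b                 # same as set_a.union(set_b)
intersection = set_a & set_b          # same as set_a.intersection(set_b)
difference = set_a - set_b            # elements only in set_a

print(union == {1, 2, 3, 4})          # True
print(intersection == {3})            # True
print(difference == {1, 2})           # True

# {} is an empty dict, not an empty set
print(type({}), type(set()))          # <class 'dict'> <class 'set'>
```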

In [5]:
letters_set = {'a','b','c'}
letters_set

{'a', 'b', 'c'}

#### add()
_method_
<br> Adds an element to a set. `append()` doesn't exist for sets, because a set has no order and therefore no "end" to append to.

In [18]:
letters_set.add('d')
letters_set

# adding the same element twice has no effect: a set stores each value only once
letters_set.add('d')
letters_set

{'a', 'b', 'c', 'd'}

### Iterator

In [13]:
# The range function returns a lazy sequence of evenly spaced integers (an iterable, though not strictly an iterator)
range_here = range(6)
print(range_here)

range_to_list = list(range_here)
print(range_to_list)

range_negative_step = range(8, 3, -1)
print(list(range_negative_step))

# does it work without the negative step?
does_it_work = range(9, 2)
print("Nyah ", list(does_it_work))

range(0, 6)
[0, 1, 2, 3, 4, 5]
[8, 7, 6, 5, 4]
Nyah  []


The object created using `range()` is commonly used to iterate through a sequence by **index**.
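Strictly speaking, a `range` is a lazy sequence rather than an iterator: you can index it and loop over it repeatedly. An iterator, obtained with `iter()`, is consumed as you go. A sketch of the difference:

```python
numbers = range(3)

# a range behaves like a sequence and can be reused
print(len(numbers), numbers[1], 2 in numbers)   # 3 1 True
print(list(numbers), list(numbers))             # [0, 1, 2] [0, 1, 2]

# an iterator hands out values one at a time and then runs dry
it = iter(numbers)
print(next(it))                  # 0
print(next(it))                  # 1
remaining = list(it)
print(remaining)                 # [2]
print(list(it))                  # []: the iterator is exhausted
```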

In [16]:
# Two common use cases for iterating over a range-produced sequence

print("Over a list")
list_iter = [1, 2, 3]
for i in range(len(list_iter)):
    print(i, list_iter[i])

print("Over a range directly")
for i in range(3):
    print(i)

Over a list
0 1
1 2
2 3
Over a range directly
0
1
2


### Duck Typing
For when you care less about the object type and more about whether it has a certain method. Because _if it walks like a duck and quacks like a duck, then it's a duck_.

In [1]:
# A useful function that comes in handy in the following setting

def isiterable(obj): 
    try:
        iter(obj)
        return True
    except TypeError:
        return False

print(isiterable('foo'))

True


Say you have a function that can take any type of sequence as input. You can use this function to maintain consistency further down the code, <br>
e.g. <br>
`if not isinstance(x, list) and isiterable(x):`
   <br> &nbsp;&nbsp;&nbsp;&nbsp; `x = list(x)`
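Put together as a runnable sketch (the wrapper name `as_list` is hypothetical, chosen just for this example):

```python
def isiterable(obj):
    """Return True if obj can be iterated over."""
    try:
        iter(obj)
        return True
    except TypeError:
        return False

def as_list(x):
    """Convert any non-list iterable to a list; leave everything else alone."""
    if not isinstance(x, list) and isiterable(x):
        x = list(x)
    return x

print(as_list((1, 2)))           # [1, 2]
print(as_list('ab'))             # ['a', 'b']
print(as_list([1, 2]))           # [1, 2] (already a list, returned as-is)
print(as_list(5))                # 5 (not iterable, returned unchanged)
```

Note that strings are iterable too, so a string gets split into characters; whether that is what you want depends on the use case.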