# <font color=red>Tutorial 2 - Python Lists & Numpy Arrays</font>

## <b>Introduction</b>

### <b>Python Data Types</b>

In [1]:
print(type(123)) # int
print(type(12.34)) # float
print(type('hello world')) # string (can also use "" instead of '')
print(type([1,2,3,4])) # list
print(type((1,2,3))) # tuple
print(type({1:'one', 2:'two', 3:'three'})) # dictionary

<class 'int'>
<class 'float'>
<class 'str'>
<class 'list'>
<class 'tuple'>
<class 'dict'>


### <b>Mutable and Immutable Objects</b>

Numeric values (int and float), strings and tuples are immutable, which means their content can't be altered after creation. In contrast, the items in a list or dictionary can be modified.

In [2]:
my_str = 'immutable objects like strings cannot be changed'
print(my_str[1])
#my_str[1] = 's' # TypeError: 'str' object does not support item assignment

m


In [3]:
list_is_mutable = [1, 2, 3]
list_is_mutable[1] = 20
list_is_mutable

[1, 20, 3]
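As a rough illustration of what "immutable" means in practice, the sketch below catches the error raised by assigning to a string index and then builds a new string instead. The variable names (`text`, `new_text`, `error_message`) are made up for this example.

```python
# Strings are immutable: assigning to an index raises a TypeError.
text = 'immutable'
error_message = None
try:
    text[1] = 's'
except TypeError as err:
    error_message = str(err)
    print('TypeError:', error_message)

# To "change" a string we build a new one from slices of the old one.
new_text = text[:1] + 's' + text[2:]
print(new_text)  # ismutable
print(text)      # immutable (the original is untouched)
```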

### <b>Explore Object Attributes and Functionality</b>

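To explore the attributes and functionality of any Python type we can use the dir() function and then call help() on any of the available methods. Names surrounded by double underscores (e.g. '__len__') are special methods that Python calls internally to implement built-in behaviour such as len(), indexing and operators. They are accessible like any other attribute, but we rarely call them directly.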

In [4]:
numbers = [1, 2, 3, 4] # list
dir(numbers)

['__add__',
 '__class__',
 '__class_getitem__',
 '__contains__',
 '__delattr__',
 '__delitem__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__gt__',
 '__hash__',
 '__iadd__',
 '__imul__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__reversed__',
 '__rmul__',
 '__setattr__',
 '__setitem__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'append',
 'clear',
 'copy',
 'count',
 'extend',
 'index',
 'insert',
 'pop',
 'remove',
 'reverse',
 'sort']
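The double-underscore names in the listing above are special methods that Python invokes on our behalf when we use built-in functions and operators. A minimal sketch of that correspondence, using the same `numbers` list:

```python
numbers = [1, 2, 3, 4]

# Built-in syntax delegates to the special methods listed by dir().
print(len(numbers), numbers.__len__())          # 4 4
print(3 in numbers, numbers.__contains__(3))    # True True
print(numbers[0], numbers.__getitem__(0))       # 1 1
print(numbers + [5] == numbers.__add__([5]))    # True
```

In everyday code the left-hand forms (`len(...)`, `in`, `[...]`, `+`) are the ones to use. The explicit calls are shown only to make the link visible.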

In [5]:
help(numbers.reverse)

Help on built-in function reverse:

reverse() method of builtins.list instance
    Reverse *IN PLACE*.



In [6]:
numbers = list(range(10))
print(numbers)
numbers.reverse()
numbers

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]


[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

## <b>Lists</b>

Lists are a built-in data structure for storing and accessing objects in a specific order. A list can contain values of various types. A list is a mutable container, which means that we can add values, delete values, or modify existing values.<br>
A Python list represents the mathematical concept of a finite sequence. The values in a list are called its items or elements. A list can contain the same value multiple times, and each occurrence is considered a distinct item. In this tutorial we'll learn how to create and use lists.

### <b>Creating Lists</b>

Following are two different ways to create empty lists. We can confirm we created a list using the type() function.

In [7]:
example1 = list()
type(example1)

list

In [8]:
example2 = []
type(example2)

list

We can also create and initialize a list in one step. For example, we can initialize a list of prime numbers:

In [9]:
primes = [2, 3, 5, 7, 11, 13, 17]
primes

[2, 3, 5, 7, 11, 13, 17]

A list can contain various types of data:

In [10]:
various = [100, 'mylist', True, 3.72, [1, 2, 3], (4, 6)]
various

[100, 'mylist', True, 3.72, [1, 2, 3], (4, 6)]

We can also create a list by filtering an existing one. For example, let's create a list of small prime numbers, first with an explicit loop and then with an equivalent list comprehension:

In [11]:
small_primes_1 = []
for item in primes:
    if item < 10:
        small_primes_1.append(item)
        
small_primes_1

[2, 3, 5, 7]

In [12]:
small_primes_2 = [item for item in primes if item < 10]
small_primes_2

[2, 3, 5, 7]
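A list comprehension can also transform the items it keeps, not just filter them. The short sketch below squares the primes below 10. The name `small_prime_squares` is introduced only for this example.

```python
primes = [2, 3, 5, 7, 11, 13, 17]

# General form: [expression for item in iterable if condition]
small_prime_squares = [item ** 2 for item in primes if item < 10]
print(small_prime_squares)  # [4, 9, 25, 49]
```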

We can create a list from any iterable collection (e.g. a tuple or a string):

In [13]:
tup = ('one', 'two', 'three')
print(type(tup))
tup

<class 'tuple'>


('one', 'two', 'three')

In [14]:
my_list = list(tup)
print(type(my_list))
my_list

<class 'list'>


['one', 'two', 'three']

In [15]:
my_str = 'string to list'
print(my_str)
my_list = list(my_str)
my_list

string to list


['s', 't', 'r', 'i', 'n', 'g', ' ', 't', 'o', ' ', 'l', 'i', 's', 't']

### <b>Accessing Lists</b>

Let's first check the size of our list. Then we'll use [index] to refer to a specific item. Indices start at 0:

In [16]:
print(len(primes))

7


In [17]:
primes[0]

2

In [18]:
primes[1]

3

We can refer to an item from the end of the list

In [19]:
primes[-1]

17

In [20]:
primes[-2]

13

We can use slicing ([start:stop]) to refer to multiple items. Notice that the item at the stop index is not included:

In [21]:
primes[2:5]

[5, 7, 11]

In [22]:
primes[0:4]

[2, 3, 5, 7]

We can also use steps - list[start:stop:step]. For example, let's get every second prime number:

In [23]:
primes[0:7:2]

[2, 5, 11, 17]

Notice that if we leave start and stop empty, they default to the beginning and the end of the list, so the following gives the same result:

In [24]:
primes[::2]

[2, 5, 11, 17]

Let's get every second prime number, starting from the second:

In [25]:
primes[1::2]

[3, 7, 13]
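Slices also accept negative indices and a negative step. The following sketch, using the same seven primes, shows a few combinations that come up often. The result names are chosen just for illustration.

```python
primes = [2, 3, 5, 7, 11, 13, 17]

reversed_primes = primes[::-1]          # whole list, back to front
last_three = primes[-3:]                # the last three items
every_second_backwards = primes[::-2]   # every second item, from the end

print(reversed_primes)          # [17, 13, 11, 7, 5, 3, 2]
print(last_three)               # [11, 13, 17]
print(every_second_backwards)   # [17, 11, 5, 2]
```

Slicing always returns a new list, so `primes` itself is left unchanged.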

### <b>Looping over lists</b>

In [26]:
for item in primes:
    print(item)

2
3
5
7
11
13
17


If we need both the index and the item, we can use the enumerate function: 

In [27]:
for index, item in enumerate(primes):
    print(index, item)

0 2
1 3
2 5
3 7
4 11
5 13
6 17


### <b>List concatenation</b>

We can concatenate lists using the '+' operator. Note that this operation creates a new list and does not change the original lists:

In [28]:
numbers = [1, 2, 3]
letters = ['a', 'b', 'c']
print(numbers + letters)

[1, 2, 3, 'a', 'b', 'c']


To concatenate in place we can use the extend() method:

In [29]:
numbers.extend(letters)
print(numbers)

[1, 2, 3, 'a', 'b', 'c']
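One detail that is easy to trip over: because extend() works in place, it returns None rather than the extended list. A small sketch of the difference, with illustrative names `left`, `right`, `combined` and `result`:

```python
left = [1, 2, 3]
right = ['a', 'b', 'c']

combined = left + right        # new list; left and right are unchanged
print(combined)                # [1, 2, 3, 'a', 'b', 'c']
print(left)                    # [1, 2, 3]

result = left.extend(right)    # modifies left in place
print(left)                    # [1, 2, 3, 'a', 'b', 'c']
print(result)                  # None
```

Writing `left = left.extend(right)` would therefore replace the list with None, which is a common beginner mistake.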


### <b>Modifying Lists</b>

In [30]:
print(numbers)
numbers[3] = 4
numbers[4:6] = [5,6]
print(numbers)

[1, 2, 3, 'a', 'b', 'c']
[1, 2, 3, 4, 5, 6]


To create a copy of a list we can use slicing, the list function, or the copy method. Notice that just using the '=' sign creates a second reference to the original list rather than a copy. In the following example we can see that numbers3 refers to the original numbers list, so it reflects the change, while the copies do not.

In [31]:
numbers = [1, 2, 3]
numbers1 = numbers[:]
numbers2 = list(numbers)
numbers3 = numbers
numbers4 = numbers.copy()
numbers[0] = 100
print(numbers)
print(numbers1)
print(numbers2)
print(numbers3)
print(numbers4)

[100, 2, 3]
[1, 2, 3]
[1, 2, 3]
[100, 2, 3]
[1, 2, 3]
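The copies above are shallow: they create a new outer list but share any nested objects with the original. The sketch below contrasts a shallow copy with `copy.deepcopy` from the standard library. The nested example list is hypothetical.

```python
import copy

nested = [[1, 2], [3, 4]]
shallow = nested.copy()          # new outer list, same inner lists
deep = copy.deepcopy(nested)     # inner lists are copied as well

nested[0][0] = 100               # change an item of an inner list

print(nested)    # [[100, 2], [3, 4]]
print(shallow)   # [[100, 2], [3, 4]]  (shares the inner lists)
print(deep)      # [[1, 2], [3, 4]]    (fully independent)
```

For flat lists of numbers or strings a shallow copy is enough. The distinction only matters when the items themselves are mutable.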


To add an item to a list we can use either append or insert: **append** adds the item to the end of the list, while **insert** places an item at a given index and shifts the remaining items to the right.

In [32]:
print(primes)
primes.append(23)
print(primes)
primes.insert(7, 19)
print(primes)

[2, 3, 5, 7, 11, 13, 17]
[2, 3, 5, 7, 11, 13, 17, 23]
[2, 3, 5, 7, 11, 13, 17, 19, 23]


To delete items from a list we can use each of the following:
- del - removes an individual item or all items identified by a slice
- pop - removes an individual item at specific index and returns it
- remove - searches for an item, and removes the first matching item from the list

In [33]:
numbers = list(range(1,20,2)) # odd numbers from 1 to 19 (the stop value 20 is excluded)
print(numbers)
first = numbers.pop(0) # remove and return item at index 0
print(first) # pop returns the item we removed from the list
print(numbers)
del numbers[-2:] # remove last 2 items
print(numbers)
numbers.insert(0, first) # insert 'first' to index 0
print(numbers)
numbers.remove(5) # search for the first item with value 5 and remove it
print(numbers)

[1, 3, 5, 7, 9, 11, 13, 15, 17, 19]
1
[3, 5, 7, 9, 11, 13, 15, 17, 19]
[3, 5, 7, 9, 11, 13, 15]
[1, 3, 5, 7, 9, 11, 13, 15]
[1, 3, 7, 9, 11, 13, 15]
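Two behaviours of these methods are worth knowing: remove raises a ValueError when the value is not in the list, and pop without an index removes the last item. A minimal sketch, using a hypothetical list `odd_numbers`:

```python
odd_numbers = [1, 3, 7, 9]

# remove() raises ValueError if the value is missing.
try:
    odd_numbers.remove(5)
except ValueError:
    print('5 is not in the list')

# Checking membership first avoids the exception.
if 7 in odd_numbers:
    odd_numbers.remove(7)

last = odd_numbers.pop()     # no index given: removes the last item
print(last)                  # 9
print(odd_numbers)           # [1, 3]
```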


# <b>Numpy</b>

Numpy is the core library for scientific computing in Python. It provides a high-performance multi-dimensional array object, and tools for working with these arrays.  If you are going to work on data analysis or machine learning projects, then having a solid understanding of Numpy is nearly mandatory, since both the Pandas (data analysis) and the Scikit-learn (machine learning) packages are built on top of Numpy.

### <b>Numpy Arrays</b>

At the core, Numpy provides the excellent ndarray object, short for n-dimensional array. In an ‘ndarray’ object, you can store multiple items of the same data type. It is the facilities around the array object that make Numpy so convenient for performing math and data manipulations.<br> There are multiple ways to create a numpy array. One of the most common is to create one from a list or a list-like object by passing it to the np.array function:

In [34]:
import numpy as np

py_list = list(range(10))
np_array = np.array(py_list)
print(str(py_list) + ', Type: ' + str(type(py_list)))
print(str(np_array) + ', Type: ' + str(type(np_array)))

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9], Type: <class 'list'>
[0 1 2 3 4 5 6 7 8 9], Type: <class 'numpy.ndarray'>


The key difference between an array and a list is that an array is designed to handle vectorized operations while a Python list is not. Suppose you want the values to range from 1 to 10 instead of 0 to 9, so you need to add 1 to every item. The intuitive way to do it is something like this:

In [35]:
#py_list + 1 # Error

That is not possible with a list. But you can do that on an ndarray:

In [36]:
np_array + 1

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])
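For comparison, the sketch below shows what the same operation takes with a plain list, and how the `*` operator means something different for the two types. The names `values_list` and `values_array` are used here only to avoid overwriting the tutorial's variables.

```python
import numpy as np

values_list = list(range(10))
values_array = np.array(values_list)

# With a list we need an explicit loop (here a list comprehension).
list_plus_one = [item + 1 for item in values_list]
# With an array the operation is applied to every item at once.
array_plus_one = values_array + 1

print(list_plus_one)      # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(array_plus_one)     # [ 1  2  3  4  5  6  7  8  9 10]

# '*' multiplies each item of an array, but repeats a list.
print(values_array * 2)       # [ 0  2  4  6  8 10 12 14 16 18]
print(len(values_list * 2))   # 20
```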

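Once an ndarray is created, you cannot alter its size. To do so, you will have to create a new ndarray. Note also that '+' behaves differently for the two types: on lists it concatenates, while on arrays it adds element-wise, so adding arrays of incompatible shapes raises an error: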

In [37]:
py_list + [11, 12]

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 12]

In [38]:
#np_array + np.array([11,12]) # Error: '+' is element-wise, and shapes (10,) and (2,) cannot be broadcast together
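To get the array equivalent of list concatenation, one option is `np.concatenate`, which builds a new, larger array and leaves the inputs untouched. A minimal sketch with illustrative names `base` and `extended`:

```python
import numpy as np

base = np.arange(10)                          # [0 1 2 ... 9]
extended = np.concatenate([base, np.array([11, 12])])

print(extended)                   # [ 0  1  2  3  4  5  6  7  8  9 11 12]
print(base.size, extended.size)   # 10 12
```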

You may also specify the data type by setting the dtype argument. Some of the most commonly used numpy dtypes are: 'float', 'int', 'bool', 'str' and 'object'. Unlike lists, all items of a numpy array must be of the same data type. This is another significant difference.

In [39]:
np_array_float = np.array([1, 2, 3 ,4 ,5], dtype='float')
np_array_float

array([1., 2., 3., 4., 5.])

To summarise, the main differences between Numpy arrays and Python lists are:

1. Numpy Arrays support vectorized operations, while lists don’t.
2. Once an array is created, you cannot change its size. You will have to create a new array or overwrite the existing one.
3. Every array has one and only one dtype. All items in it should be of that dtype.
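
When the input values have mixed types, np.array picks a single dtype that can represent all of them. The sketch below shows a few cases. The example values are made up, and the exact string dtype width may differ between platforms, so it is not printed here.

```python
import numpy as np

# Integers mixed with a float are all stored as floats.
ints_and_float = np.array([1, 2, 3.5])
print(ints_and_float)         # [1.  2.  3.5]
print(ints_and_float.dtype)   # float64

# Numbers mixed with text are all converted to strings.
numbers_and_text = np.array([1, 2, 'three'])
print(numbers_and_text)       # ['1' '2' 'three']

# dtype=object keeps the original Python objects, at the cost of speed.
mixed = np.array([1, 'two', 3.0], dtype=object)
print(type(mixed[0]), type(mixed[1]))   # <class 'int'> <class 'str'>
```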

### <b>Numpy Basic Attributes and Functionality</b>

The dtype, shape, and size attributes of a Numpy array describe its item type, its dimensions, and its total number of items:

In [40]:
np_arr = np.array([[1, 2, 3], [4, 5, 6]])
print(str(np_arr) + ', Type: ' + str(np_arr.dtype) + ', Shape: ' + str(np_arr.shape) + ', Size: ' + str(np_arr.size))

[[1 2 3]
 [4 5 6]], Type: int32, Shape: (2, 3), Size: 6


Basic functionality of Numpy array object: 

In [41]:
# mean, max and min
print(np_arr)
print("Mean value is: ", np_arr.mean()) # the mean method returns the average of the numpy array
print("Max value is: ", np_arr.max())
print("Min value is: ", np_arr.min())

[[1 2 3]
 [4 5 6]]
Mean value is:  3.5
Max value is:  6
Min value is:  1
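These methods also accept an `axis` argument to aggregate along one dimension instead of over the whole array. A short sketch on the same 2x3 values, stored under the illustrative name `grid`:

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6]])

column_means = grid.mean(axis=0)   # collapse the rows: one value per column
row_max = grid.max(axis=1)         # collapse the columns: one value per row

print(column_means)   # [2.5 3.5 4.5]
print(row_max)        # [3 6]
```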


In [42]:
print(np_arr)
print(np_arr**2)
print(np.sqrt(np_arr))

[[1 2 3]
 [4 5 6]]
[[ 1  4  9]
 [16 25 36]]
[[1.         1.41421356 1.73205081]
 [2.         2.23606798 2.44948974]]


The random module provides nice functions to generate random numbers (and also statistical distributions) of any given shape.

In [43]:
# Random integers between [0, 10) of shape 2,2
print(np.random.randint(0, 10, size=[2,2]))

[[2 8]
 [7 3]]


In [44]:
# One random number between [0,1)
print(np.random.random())

0.49562276312320463


In [45]:
# Random numbers between [0,1) of shape 2,2
print(np.random.random(size=[2,2]))

[[0.50622657 0.058678  ]
 [0.890051   0.48619705]]


In [46]:
# Pick 10 items from a given list, with equal probability
print(np.random.choice(['a', 'e', 'i', 'o', 'u'], size=10))

['a' 'u' 'a' 'i' 'i' 'o' 'u' 'u' 'e' 'i']


In [47]:
# Pick 10 items from a given list with a predefined probability 'p'
print(np.random.choice(['a', 'e', 'i', 'o', 'u'], size=10, p=[0.3, 0.1, 0.1, 0.4, 0.1]))  # picks more o's and a's

['a' 'i' 'a' 'a' 'a' 'a' 'o' 'a' 'a' 'u']
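The outputs above will differ on every run. When reproducible results are needed, one option is a seeded generator created with `np.random.default_rng`, which is part of NumPy's newer random API and is not used elsewhere in this tutorial. The seed value 42 below is arbitrary.

```python
import numpy as np

# Two generators created with the same seed produce the same numbers.
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

draw_a = rng_a.integers(0, 10, size=(2, 2))   # integers in [0, 10)
draw_b = rng_b.integers(0, 10, size=(2, 2))

print(np.array_equal(draw_a, draw_b))   # True
```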


The np.unique() function can be used to get the unique items of a Numpy array. If we also want the number of times each item occurs, we should set the return_counts parameter to True:

In [48]:
bin_array = np.random.randint(0, 2, size=10) # Create random binary array of size 10
print(bin_array)
uniques, counts = np.unique(bin_array, return_counts=True)
print("Unique items : ", uniques)
print("Counts       : ", counts)

[0 0 0 1 0 0 1 0 0 1]
Unique items :  [0 1]
Counts       :  [7 3]
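The two arrays returned by np.unique are aligned, so they can be paired into a dictionary for easier lookup. A small sketch on a fixed array (the same values as the sample output above), with illustrative names:

```python
import numpy as np

sample = np.array([0, 0, 0, 1, 0, 0, 1, 0, 0, 1])
values, value_counts = np.unique(sample, return_counts=True)

# Pair each unique value with its count.
count_by_value = {int(v): int(c) for v, c in zip(values, value_counts)}
print(count_by_value)   # {0: 7, 1: 3}
```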


## <font color=blue> **Exercise** </font>

<b>Important</b>: Use the help() function to get more information regarding the suggested functions (e.g. help(np.random.uniform)) <br>
1. Use the `numpy.random.uniform` function to create a random uniform numpy array containing 100 numbers representing basketball players' heights ranging from 1.65 to 2.20.
2. Print the minimum, maximum and mean of the players' heights.
3. Use the `np.where` function to create a new binary array in which heights of 1.80 and above are represented by '1' and heights below 1.80 are represented by '0'.
4. Use the `np.unique` function to present how many players are at or above 1.80, and how many are below.

In [49]:
# Write your code here


In [50]:
# Solution
import numpy as np

# 1
heights = np.random.uniform(1.65, 2.20, 100)

# 2
print('Min: ' + str(heights.min()) + ', Max: ' + str(heights.max()) + ', Mean: ' + str(heights.mean()) )

# 3
bin_heights = np.where(heights < 1.80, 0, 1)

# 4
uniques, counts = np.unique(bin_heights, return_counts=True)
print("Unique items : ", uniques)
print("Counts       : ", counts)

# 4 Alternative solution: number of players below 1.80 (the zeros in the binary array)
bin_heights.size - bin_heights.sum()

Min: 1.6579896247786208, Max: 2.188516091437082, Mean: 1.9114109140278248
Unique items :  [0 1]
Counts       :  [34 66]


34
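
Another way to count, sketched here under the assumption that a small fixed array stands in for the random heights, is to sum a boolean mask directly, since True counts as 1 and False as 0. The names `example_heights` and `tall_mask` are hypothetical.

```python
import numpy as np

example_heights = np.array([1.70, 1.85, 1.95, 1.78, 2.10])

tall_mask = example_heights >= 1.80   # boolean array
n_tall = int(tall_mask.sum())         # True counts as 1
n_short = int((~tall_mask).sum())     # ~ inverts the mask

print(tall_mask)         # [False  True  True False  True]
print(n_tall, n_short)   # 3 2
```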