# NumPy

In this section, you will:

* Compare NumPy and standard Python ways to do computation
    * Understand the advantages of NumPy by comparing computation times of NumPy and standard Python
* Create NumPy arrays
    * Convert lists to NumPy arrays
    * Create (initialise) empty and non-empty arrays
* Manipulate one-dimensional arrays
    * Apply built-in functions on arrays
    * Manipulate multiple arrays
    * Apply your own functions on arrays
* Create and manipulate two, three and higher dimensional arrays

In [3]:
import numpy as np

The basic data structure in NumPy is the array (of type `ndarray`), which is created using the function np.array(). <br>
The following code converts a list into a NumPy array.

In [4]:
numpy_array = np.array([2, 4, 5, 6, 7, 9])
print(numpy_array)
print(type(numpy_array))

[2 4 5 6 7 9]
<class 'numpy.ndarray'>


### Advantages of NumPy 

So why use an array rather than a list, especially for data analysis? <br>
1. You can write **vectorised** code on NumPy arrays: an arithmetic operator applies to every element at once, which lists do not support.
2. NumPy is much faster than the equivalent element-by-element computation in standard Python.

Let's see an example. 

Say you have two lists of numbers, and want to calculate the element-wise product. 

In [13]:
list_1 = [3, 6, 7, 5]
list_2 = [4, 5, 1, 7]

# the list way to do it: map a product function to the two lists
product_list = list(map(lambda x, y: x*y, list_1, list_2))
print(product_list)


[12, 30, 7, 35]


In [14]:
# The numpy array way to do it: simply multiply the two arrays
array_1 = np.array(list_1)
array_2 = np.array(list_2)

array_3 = array_1*array_2
print(array_3)
print(type(array_3))

[12 30  7 35]
<class 'numpy.ndarray'>
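
To make the contrast concrete, here is a minimal sketch using the same values as above. The names `list_a`, `list_b`, `array_a` and `array_b` are illustrative only. It shows that `*` between two lists raises a `TypeError`, while arithmetic operators on arrays work element by element.

```python
import numpy as np

list_a = [3, 6, 7, 5]
list_b = [4, 5, 1, 7]

# Lists do not support element-wise arithmetic: list * list is an error
try:
    list_a * list_b
    list_product_failed = False
except TypeError:
    list_product_failed = True
print(list_product_failed)

# Arrays apply arithmetic operators to every element
array_a = np.array(list_a)
array_b = np.array(list_b)
print(array_a + array_b)
print(array_a * 2)
```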


### Creating NumPy Arrays 

You can generate a sequence of evenly spaced numbers in a NumPy array using the function np.arange().

In [7]:
# generating a sequence of numbers from 10 to 95 with a gap of 5, in a numpy array
numbers = np.arange(10, 100, 5)
print(numbers)


[10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95]
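
One detail worth checking for yourself is that the stop value passed to `np.arange()` is excluded from the result, which is why the sequence above ends at 95 and not 100. A minimal sketch (the name `first_five` is illustrative):

```python
import numpy as np

# The stop value (100) is excluded, so the last element is 95
numbers = np.arange(10, 100, 5)
print(numbers[0], numbers[-1], len(numbers))

# With a single argument, arange counts up from 0 in steps of 1
first_five = np.arange(5)
print(first_five)
```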


### Applying Operations (Functions) on NumPy arrays

NumPy provides most common aggregate operations on arrays, such as mean, median and sum.
There are two ways to apply them, as shown in these examples:
1. As a method of the array: numpy_array.mean()
2. As a NumPy function that takes the array: np.median(numpy_array)

In [8]:
# mean of an array
print(array_1.mean())

# max value in an array
print(array_1.max())

# median
print(np.median(array_1))


5.25
7
5.5
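
The two calling styles are not always interchangeable. The sketch below, which uses an illustrative array named `values`, suggests that `mean`, `max` and `sum` are available in both forms, while `median` is only available as a function.

```python
import numpy as np

values = np.array([3, 6, 7, 5])

# mean, max and sum exist both as array methods and as np.* functions
print(values.mean() == np.mean(values))
print(values.max() == np.max(values))
print(values.sum() == np.sum(values))

# median is only provided as a function; arrays have no .median() method
print(hasattr(values, "median"))
print(np.median(values))
```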


### Applying Your Own Functions to NumPy Arrays

You can also apply your own functions to NumPy arrays. For instance, say you want to apply the operation x/(x+1) to each element of an array.

In [50]:
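# a numpy array of the integers 100 to 149
some_array = np.arange(100, 150)
# map() returns a lazy iterator (a map object), not a numpy array
new_array = map(lambda x: x/(x+1), some_array)
# list() consumes the iterator; after this call new_array yields nothing more
print(list(new_array))
print(type(new_array))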


[0.99009900990099009, 0.99019607843137258, 0.99029126213592233, 0.99038461538461542, 0.99047619047619051, 0.99056603773584906, 0.99065420560747663, 0.9907407407407407, 0.99082568807339455, 0.99090909090909096, 0.99099099099099097, 0.9910714285714286, 0.99115044247787609, 0.99122807017543857, 0.99130434782608701, 0.99137931034482762, 0.99145299145299148, 0.99152542372881358, 0.99159663865546221, 0.9916666666666667, 0.99173553719008267, 0.99180327868852458, 0.99186991869918695, 0.99193548387096775, 0.99199999999999999, 0.99206349206349209, 0.99212598425196852, 0.9921875, 0.99224806201550386, 0.99230769230769234, 0.99236641221374045, 0.99242424242424243, 0.99248120300751874, 0.9925373134328358, 0.99259259259259258, 0.99264705882352944, 0.99270072992700731, 0.99275362318840576, 0.9928057553956835, 0.99285714285714288, 0.99290780141843971, 0.99295774647887325, 0.99300699300699302, 0.99305555555555558, 0.99310344827586206, 0.99315068493150682, 0.99319727891156462, 0.9932432432432432, 0.99328

In [51]:
# Say you want to round this off to two decimals

# The standard round() function
print(list(new_array))
new_array_rounded = map(lambda x: round(x, 2), list(new_array))
print(list(new_array_rounded))

[]
[]
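
For a simple formula such as x/(x+1), `map()` is not strictly needed. As a sketch of the vectorised alternative (the names `ratio_array` and `ratio_rounded` are illustrative), you can write the expression directly on the array and round it with `np.round()`. The result stays an `ndarray` and can be reused as often as you like.

```python
import numpy as np

some_array = np.arange(100, 150)

# Arithmetic operators work element by element, so no map() or lambda is needed
ratio_array = some_array / (some_array + 1)

# np.round rounds every element; the result is still an ndarray
ratio_rounded = np.round(ratio_array, 2)

print(type(ratio_array))
print(ratio_rounded[:5])
```

Unlike a map object, `ratio_array` is not consumed by printing it, so a second `print(ratio_array)` shows the same values again.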


In [10]:
## Comparing time taken for computation
list_1 = [i for i in range(1000000)]
list_2 = [j**2 for j in range(1000000)]

# list multiplication
import time
t0 = time.time()
product_list = list(map(lambda x, y: x*y, list_1, list_2))
t1 = time.time()
list_time = t1 - t0 
print(t1-t0)


# numpy array 
array_1 = np.array(list_1)
array_2 = np.array(list_2)

t0 = time.time()
array_3 = array_1*array_2
t1 = time.time()
numpy_time = t1 - t0

print(t1-t0)

print("The ratio of time taken is {}".format(list_time/numpy_time))

0.5543947219848633
0.023016691207885742
The ratio of time taken is 24.086638560581733


For calculating the element-wise product of two sequences of one million numbers, the *list* way took about 0.55 seconds, whereas NumPy took about 0.02 seconds. 
In this run, NumPy is about 24 times faster than the standard Python implementation; the exact ratio varies between machines and runs. 
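
A single `time.time()` measurement can be noisy. As a hedged alternative, the sketch below uses the standard library's `timeit` module and takes the best of several repeats. The helper names `list_product` and `numpy_product` are illustrative. It also makes two assumptions that differ from the cell above: a smaller size `n` so that it runs quickly, and a list comprehension with `zip()` in place of `map()`.

```python
import timeit
import numpy as np

n = 100_000  # assumption: smaller than the 1,000,000 used above, to keep the run short
list_1 = list(range(n))
list_2 = [j ** 2 for j in range(n)]
array_1 = np.array(list_1, dtype=np.int64)
array_2 = np.array(list_2, dtype=np.int64)


def list_product():
    # Pure-Python element-wise product
    return [x * y for x, y in zip(list_1, list_2)]


def numpy_product():
    # Vectorised element-wise product
    return array_1 * array_2


# Each measurement runs the function 10 times; keep the fastest of 3 measurements
list_time = min(timeit.repeat(list_product, number=10, repeat=3))
numpy_time = min(timeit.repeat(numpy_product, number=10, repeat=3))

print("list: {:.4f} s, numpy: {:.4f} s".format(list_time, numpy_time))
print("The ratio of time taken is {:.1f}".format(list_time / numpy_time))
```

The numbers printed depend on your machine, so treat the ratio as indicative only.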



### Working with 2-D and 3-D Arrays

In [65]:
# Creating 2-D arrays

# Using each row as a tuple
array_2d = np.array([(1, 4, 5), (3, 8, 6)])
print(array_2d)

# Initialising to zeros
array_of_zeros = np.zeros((3, 4))
print(array_of_zeros)

# Ones
array_of_ones = np.ones((2, 4))
print(array_of_ones)

# Reshaping a 1-D sequence of 10 numbers into a 2 x 5 array
print("\n", np.arange(10).reshape(2, 5))


[[1 4 5]
 [3 8 6]]
[[ 0.  0.  0.  0.]
 [ 0.  0.  0.  0.]
 [ 0.  0.  0.  0.]]
[[ 1.  1.  1.  1.]
 [ 1.  1.  1.  1.]]
[[0 1 2 3 4]
 [5 6 7 8 9]]
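
The zeros and ones above print as `0.` and `1.` because `np.zeros()` and `np.ones()` create floating-point arrays by default. The following sketch (the name `int_zeros` is illustrative) inspects the `shape` and `dtype` attributes and shows one way to request integers instead.

```python
import numpy as np

array_2d = np.array([(1, 4, 5), (3, 8, 6)])
array_of_zeros = np.zeros((3, 4))

# shape is (number of rows, number of columns)
print(array_2d.shape)
print(array_of_zeros.shape)

# np.zeros and np.ones default to floats
print(array_of_zeros.dtype)

# Pass dtype to get integers instead
int_zeros = np.zeros((3, 4), dtype=int)
print(int_zeros.dtype.kind)
```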


In [61]:
# Creating 3-d arrays
array_3d = np.ones((2, 3, 4))
print(array_3d)

[[[ 1.  1.  1.  1.]
  [ 1.  1.  1.  1.]
  [ 1.  1.  1.  1.]]

 [[ 1.  1.  1.  1.]
  [ 1.  1.  1.  1.]
  [ 1.  1.  1.  1.]]]
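
An array of ones makes it hard to see which element is which. As a small sketch, the example below builds a 3-D array from `np.arange(24)` instead (an assumption made purely for illustration), then inspects its dimensions and picks out a single element with one index per dimension.

```python
import numpy as np

# 24 distinct values arranged as 2 blocks of 3 rows and 4 columns
array_3d = np.arange(24).reshape(2, 3, 4)

print(array_3d.ndim)   # number of dimensions
print(array_3d.shape)  # (blocks, rows, columns)
print(array_3d.size)   # total number of elements

# One index per dimension: second block, first row, third column
element = array_3d[1, 0, 2]
print(element)
```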


In [57]:
# Indexing in arrays
array_1d = np.arange(1, 10)
print(array_1d)

# getting the element at index 3 (the fourth element) of a 1-D array
print(array_1d[3])


[1 2 3 4 5 6 7 8 9]
4
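
Indexing starts at 0, as with lists. Negative indices and slices also work the same way as on lists, as this short sketch suggests:

```python
import numpy as np

array_1d = np.arange(1, 10)

print(array_1d[-1])    # last element
print(array_1d[2:5])   # elements at indices 2, 3 and 4
print(array_1d[::2])   # every second element
```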


In [63]:
# Indexing in 2-d arrays
print(array_2d)

# First row, second column
print(array_2d[0, 1])

[[1 4 5]
 [3 8 6]]
4
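
The same comma-separated notation extends to whole rows, columns and sub-blocks by using `:` as a slice. A minimal sketch, with illustrative variable names:

```python
import numpy as np

array_2d = np.array([(1, 4, 5), (3, 8, 6)])

# Second row, all columns
second_row = array_2d[1, :]
print(second_row)

# All rows, second column
second_column = array_2d[:, 1]
print(second_column)

# All rows, columns from index 1 onwards
sub_block = array_2d[:, 1:]
print(sub_block)
```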
