# NumPy (Numerical Python)

NumPy is the core library for scientific computing in Python. It provides a high-performance multidimensional array object and tools for working with these arrays.


*   An open source extension module for Python.
*   Provides fast precompiled functions for mathematical and numerical routines.
*   Focuses on huge matrices and arrays.
*   Supplies a large library of high-level mathematical functions to operate on these matrices and arrays.

**Advantages of using NumPy**

*   Array-oriented computing.
*   Efficiently implemented multi-dimensional arrays.
*   Designed for scientific computation.



# **numpy Arrays**

In [0]:
import numpy as np

In [2]:
#1D Array

a=np.array([1,2,3])
print(a)


[1 2 3]


In [3]:
a=np.array([(1,2,3),(4,5,6)])
print(a)

[[1 2 3]
 [4 5 6]]


# NumPy Array v/s List

Reasons to choose a NumPy array over a Python list:


1.   Less memory consumption
2.   Fast
3.   Convenient





In [5]:
#Size Comparison between numpy array and list
import numpy as np
from sys import getsizeof as size

lst = [1,2,3]
size_of_elements = len(lst) * size(0)
size_of_list_object = size(lst)   
total_list_size = size_of_list_object + size_of_elements
print("Size without the size of the elements: ", size_of_list_object)
print("Size of all the elements: ", size_of_elements)
print("Total size of list, including elements: ", total_list_size)

print()

arr = np.array([1, 2, 3])
size_of_elements = arr.itemsize * arr.size
size_of_array_object = size(arr) - size_of_elements
total_array_size = size(arr)
print("Size without the size of the elements: ", size_of_array_object)
print("Size of all the elements: ", size_of_elements)
print("Total size of array, including elements: ", total_array_size)


Size without the size of the elements:  88
Size of all the elements:  72
Total size of list, including elements:  160

Size without the size of the elements:  96
Size of all the elements:  24
Total size of array, including elements:  120
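
The size comparison addresses the first reason in the list. As a rough sketch of the second one (speed), the cell below times doubling every element of a list against the same operation on an array. The element count `n`, the repeat count, and the variable names are arbitrary choices for illustration. Timings depend on the machine and the NumPy version, so no output is shown.

```python
import timeit
import numpy as np

n = 100_000                      # arbitrary size chosen for this sketch
lst = list(range(n))
arr = np.arange(n)

# Double every element: Python-level loop vs. one vectorized operation.
list_time = timeit.timeit(lambda: [v * 2 for v in lst], number=20)
array_time = timeit.timeit(lambda: arr * 2, number=20)

print("List comprehension: ", list_time)
print("NumPy array:        ", array_time)
```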


**Aggregate Functions : max/min/sum**



In [7]:
a= np.array([54,76,45,76,45,76,23,43,687,54,-23])
print(a.min())
print(a.max())
print(a.sum())

-23
687
1156
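
The aggregate methods above reduce the whole array to a single number. On arrays with more than one dimension they also accept an `axis` argument. The small array `m` below is made up for illustration.

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

print(m.sum())         # sum over all elements: 21
print(m.sum(axis=0))   # column sums: [5 7 9]
print(m.sum(axis=1))   # row sums: [ 6 15]
print(m.max(axis=1))   # maximum of each row: [3 6]
```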


# Creating numpy Arrays

**arange**

arange([start,] stop[, step][, dtype=None])

arange returns evenly spaced values within a given interval. The values are generated within the half-open interval '[start, stop)'. When used with integers, it is nearly equivalent to the Python built-in function range, but arange returns an ndarray, whereas range returns a lazy range object. If the 'start' parameter is not given, it is set to 0. The end of the interval is determined by the parameter 'stop'. Usually the interval does not include this value, except in some cases where 'step' is not an integer and floating-point round-off affects the length of the output ndarray. The spacing between two adjacent values of the output array is set with the optional parameter 'step', which defaults to 1.

In [12]:
import numpy as np

a = np.arange(1, 10)
print(a)

#x = range(1, 10)
#print(x)    # x is a range object, not a list
#print(list(x))

# further arange examples:
x = np.arange(10.4)
print(x)

x = np.arange(0.5, 10.4, 0.8)
print(x)

x = np.arange(0.5, 10.4, 0.8, int)
print(x)

[1 2 3 4 5 6 7 8 9]
[ 0.  1.  2.  3.  4.  5.  6.  7.  8.  9. 10.]
[ 0.5  1.3  2.1  2.9  3.7  4.5  5.3  6.1  6.9  7.7  8.5  9.3 10.1]
[ 0  1  2  3  4  5  6  7  8  9 10 11 12]
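
The round-off caveat mentioned for non-integer steps can be seen in a small example. This is only a sketch: the values 1, 1.3 and 0.1 are chosen because they are known to trigger the effect with 64-bit floats.

```python
import numpy as np

# Round-off makes (1.3 - 1) / 0.1 slightly larger than 3, so arange
# produces 4 values and the last one is approximately equal to 'stop'.
x = np.arange(1, 1.3, 0.1)
print(len(x))

# linspace takes the number of samples explicitly instead.
y = np.linspace(1, 1.3, 3, endpoint=False)
print(len(y))
```

When the number of samples matters more than the exact step, linspace is usually the safer choice.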


**linspace**

linspace(start, stop, num=50, endpoint=True, retstep=False)

linspace returns an ndarray consisting of 'num' equally spaced samples in the closed interval [start, stop] or the half-open interval [start, stop). Whether a closed or a half-open interval is returned depends on whether 'endpoint' is True or False. The parameter 'start' defines the start value of the sequence. 'stop' is the end value of the sequence, unless 'endpoint' is set to False. In the latter case, the resulting sequence consists of all but the last of 'num + 1' evenly spaced samples, which means that 'stop' is excluded.

In [21]:
import numpy as np

# 50 values between 1 and 10:
print(np.linspace(1, 10))

# 7 values between 1 and 10:
print(np.linspace(1, 10, 7))

# excluding the endpoint:
print(np.linspace(1, 10, 7, endpoint=False))

[ 1.          1.18367347  1.36734694  1.55102041  1.73469388  1.91836735
  2.10204082  2.28571429  2.46938776  2.65306122  2.83673469  3.02040816
  3.20408163  3.3877551   3.57142857  3.75510204  3.93877551  4.12244898
  4.30612245  4.48979592  4.67346939  4.85714286  5.04081633  5.2244898
  5.40816327  5.59183673  5.7755102   5.95918367  6.14285714  6.32653061
  6.51020408  6.69387755  6.87755102  7.06122449  7.24489796  7.42857143
  7.6122449   7.79591837  7.97959184  8.16326531  8.34693878  8.53061224
  8.71428571  8.89795918  9.08163265  9.26530612  9.44897959  9.63265306
  9.81632653 10.        ]
[ 1.   2.5  4.   5.5  7.   8.5 10. ]
[1.         2.28571429 3.57142857 4.85714286 6.14285714 7.42857143
 8.71428571]
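
The signature above also lists 'retstep', which the examples do not use. A brief sketch, assuming the same interval as above: with `retstep=True`, linspace returns the spacing together with the samples. The variable names are illustrative.

```python
import numpy as np

closed, step_closed = np.linspace(1, 10, 7, retstep=True)
print(step_closed)    # spacing with the endpoint included: 1.5

half_open, step_open = np.linspace(1, 10, 7, endpoint=False, retstep=True)
print(step_open)      # spacing with the endpoint excluded: 9 / 7
```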


**Zero Dimensional Array**

It's possible to create multidimensional arrays in NumPy. Scalars are zero-dimensional in NumPy.

In [1]:
import numpy as np
x = np.array(99)
print("x: ", x)
print("The type of x: ", type(x))
print("The dimension of x:", np.ndim(x))

x:  99
The type of x:  <class 'numpy.ndarray'>
The dimension of x: 0
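
A short sketch of two further properties of a zero-dimensional array: its shape is the empty tuple, and `item()` converts it back to a plain Python scalar. The variable name `value` is illustrative.

```python
import numpy as np

x = np.array(99)
print(x.shape)          # an empty tuple: ()

value = x.item()        # convert to a plain Python scalar
print(type(value))      # <class 'int'>
```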


**One Dimensional Array**

NumPy arrays are containers of items of the same type, e.g. only integers. The homogeneous type of the array can be determined with the attribute "dtype".

In [2]:
F = np.array([1, 1, 2, 3, 5, 8, 13, 21])
V = np.array([3.4, 6.9, 99.8, 12.8])

print("F: ", F)
print("V: ", V)

print("Type of F: ", F.dtype)
print("Type of V: ", V.dtype)

print("Dimension of F: ", np.ndim(F))
print("Dimension of V: ", np.ndim(V))

F:  [ 1  1  2  3  5  8 13 21]
V:  [ 3.4  6.9 99.8 12.8]
Type of F:  int64
Type of V:  float64
Dimension of F:  1
Dimension of V:  1
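
Because all items share one type, mixing integers and floats in the input gives an array of floats. The sketch below illustrates this with made-up values and shows an explicit conversion with `astype`.

```python
import numpy as np

mixed = np.array([1, 2, 3.5])      # ints and a float in one list
print(mixed.dtype)                 # float64

as_int = mixed.astype(np.int64)    # explicit conversion truncates 3.5 to 3
print(as_int)
```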


**M Dimensional Array**

In [5]:
B = np.array([ [[111, 112], [121, 122]],
               [[211, 212], [221, 222]],
               [[311, 312], [321, 322]] ])
print("Elements of B : ", B)
print("Dimension of B : ", B.ndim)

Elements of B :  [[[111 112]
  [121 122]]

 [[211 212]
  [221 222]]

 [[311 312]
  [321 322]]]
Dimension of B :  3


**shape**

"shape" returns the shape of an array. The shape is a tuple of integers. These numbers denote the lengths of the corresponding array dimension.

In [0]:
x = np.array([ [67, 63, 87],
               [77, 69, 59],
               [85, 87, 99],
               [79, 72, 71],
               [63, 89, 93],
               [68, 92, 78]])
print(np.shape(x))


#"shape" can also be used to change the shape of an array.

x.shape = (3, 6)
print(x)

x.shape = (2, 9)
print(x)

**reshape**

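reshape changes the number of rows and columns of an array without changing its data. It returns a new array object, which is a view of the original data where possible, and leaves the original array unchanged.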

In [9]:
x = np.array([ [67, 63, 87],
               [77, 69, 59],
               [85, 87, 99],
               [79, 72, 71],
               [63, 89, 93],
               [68, 92, 78]])
print(np.shape(x))

y=x.reshape(2,9)
print(y)

(6, 3)
[[67 63 87 77 69 59 85 87 99]
 [79 72 71 63 89 93 68 92 78]]
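
A minimal sketch of what "view" means in practice, using a made-up array of 18 consecutive integers instead of the values above: the reshaped array shares its data with the original, a dimension given as -1 is inferred, and an incompatible shape raises a ValueError.

```python
import numpy as np

x = np.arange(18).reshape(6, 3)
y = x.reshape(2, 9)

print(np.shares_memory(x, y))   # True: y is a view of x's data
y[0, 0] = -1
print(x[0, 0])                  # -1: the change is visible through x

z = x.reshape(3, -1)            # -1 lets NumPy infer the remaining dimension
print(z.shape)                  # (3, 6)

try:
    x.reshape(4, 5)             # 18 elements do not fit into 4 x 5
except ValueError as err:
    print("reshape failed:", err)
```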


**arrays of ones and zeroes**

NumPy provides the functions ones and zeros for initializing arrays with ones or zeros. Both take the shape as a tuple, and the default dtype is float64.

In [10]:
F = np.ones((3,4),dtype=int)
print(F)

Z = np.zeros((2,4))
print(Z)

[[1 1 1 1]
 [1 1 1 1]
 [1 1 1 1]]
[[0. 0. 0. 0.]
 [0. 0. 0. 0.]]
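
Related helpers take the shape and dtype from an existing array rather than from a tuple. The sketch below uses `np.ones_like`, `np.zeros_like` and `np.full`; the array `x` and the fill value 7 are made up for illustration.

```python
import numpy as np

x = np.array([[2, 5, 18], [14, 3, 9]])

E = np.ones_like(x)      # same shape and dtype as x, filled with ones
Z = np.zeros_like(x)     # same shape and dtype as x, filled with zeros
S = np.full((2, 3), 7)   # arbitrary fill value

print(E)
print(Z)
print(S)
```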


**Identity Array (NXN Matrix)**

identity(n, dtype=None)

The output of identity is an 'n' x 'n' array with its main diagonal set to one, and all other elements are 0.

In [13]:
np.identity(4, dtype=int)

array([[1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1]])
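
As a quick check of what the identity array is for, the sketch below multiplies a made-up 3 x 3 matrix `A` by the identity and compares the result with `A`. It also shows that `np.eye` with a single size argument builds the same array.

```python
import numpy as np

I = np.identity(3, dtype=int)
A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

print(np.array_equal(A @ I, A))                  # True
print(np.array_equal(np.eye(3, dtype=int), I))   # True
```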

**Data Type Objects, dtype**

dtype objects make it possible to create "Structured Arrays", also known as "Record Arrays". Structured arrays provide the ability to have different data types per column.

In [18]:
dt = np.dtype([('country', 'S20'), ('density', 'i4'), ('area', 'i4'), ('population', 'i4')])

population_table = np.array([
    ('Netherlands', 393, 41526, 16928800),
    ('Belgium', 337, 30510, 11007020),
    ('United Kingdom', 256, 243610, 62262000),
    ('Germany', 233, 357021, 81799600),
    ('Liechtenstein', 205, 160, 32842),
    ('Italy', 192, 301230, 59715625),
    ('Switzerland', 177, 41290, 7301994),
    ('Luxembourg', 173, 2586, 512000),
    ('France', 111, 547030, 63601002),
    ('Austria', 97, 83858, 8169929),
    ('Greece', 81, 131940, 11606813),
    ('Ireland', 65, 70280, 4581269),
    ('Sweden', 20, 449964, 9515744),
    ('Finland', 16, 338424, 5410233),
    ('Norway', 13, 385252, 5033675)],
    dtype=dt)

print(population_table)

print("Countries : ", population_table['country'])

[(b'Netherlands', 393,  41526, 16928800)
 (b'Belgium', 337,  30510, 11007020)
 (b'United Kingdom', 256, 243610, 62262000)
 (b'Germany', 233, 357021, 81799600)
 (b'Liechtenstein', 205,    160,    32842)
 (b'Italy', 192, 301230, 59715625) (b'Switzerland', 177,  41290,  7301994)
 (b'Luxembourg', 173,   2586,   512000) (b'France', 111, 547030, 63601002)
 (b'Austria',  97,  83858,  8169929) (b'Greece',  81, 131940, 11606813)
 (b'Ireland',  65,  70280,  4581269) (b'Sweden',  20, 449964,  9515744)
 (b'Finland',  16, 338424,  5410233) (b'Norway',  13, 385252,  5033675)]
Countries :  [b'Netherlands' b'Belgium' b'United Kingdom' b'Germany' b'Liechtenstein'
 b'Italy' b'Switzerland' b'Luxembourg' b'France' b'Austria' b'Greece'
 b'Ireland' b'Sweden' b'Finland' b'Norway']
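
Field names can also be used for filtering and sorting. The following sketch uses a shortened, two-column version of the table. It assumes the Unicode type 'U20' in place of 'S20', so the names print without the b prefix.

```python
import numpy as np

dt = np.dtype([('country', 'U20'), ('density', 'i4')])
table = np.array([('Netherlands', 393),
                  ('Belgium', 337),
                  ('Norway', 13)], dtype=dt)

dense = table[table['density'] > 200]          # boolean mask on one field
print(dense['country'])

by_density = np.sort(table, order='density')   # ascending by 'density'
print(by_density['country'])
```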


**Input and Output of Structured Arrays**



In [0]:
np.savetxt("population_table.csv",
           population_table,
           fmt="%s;%d;%d;%d",           
           delimiter=";")

In [20]:
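# 'U20' is a Unicode string of up to 20 characters; np.unicode is no longer available in recent NumPy releases
dt = np.dtype([('country', 'U20'), ('density', 'i4'), ('area', 'i4'), ('population', 'i4')])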
x = np.genfromtxt("population_table.csv",
               dtype=dt,
               delimiter=";")
print(x)

[("b'Netherlands'", 393,  41526, 16928800)
 ("b'Belgium'", 337,  30510, 11007020)
 ("b'United Kingdom'", 256, 243610, 62262000)
 ("b'Germany'", 233, 357021, 81799600)
 ("b'Liechtenstein'", 205,    160,    32842)
 ("b'Italy'", 192, 301230, 59715625)
 ("b'Switzerland'", 177,  41290,  7301994)
 ("b'Luxembourg'", 173,   2586,   512000)
 ("b'France'", 111, 547030, 63601002)
 ("b'Austria'",  97,  83858,  8169929)
 ("b'Greece'",  81, 131940, 11606813)
 ("b'Ireland'",  65,  70280,  4581269)
 ("b'Sweden'",  20, 449964,  9515744)
 ("b'Finland'",  16, 338424,  5410233)
 ("b'Norway'",  13, 385252,  5033675)]
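
The country names above come back as "b'...'" because the table was saved with the byte-string type 'S20', and "%s" writes the textual form of each bytes object. A hedged sketch of one way to avoid this is to use a Unicode field from the start. It uses a shortened table and an in-memory `io.StringIO` buffer as a stand-in for the CSV file; both are illustrative choices.

```python
import io
import numpy as np

dt = np.dtype([('country', 'U20'), ('density', 'i4'),
               ('area', 'i4'), ('population', 'i4')])
table = np.array([('Netherlands', 393, 41526, 16928800),
                  ('Belgium', 337, 30510, 11007020),
                  ('United Kingdom', 256, 243610, 62262000)], dtype=dt)

buffer = io.StringIO()                       # stands in for the CSV file
np.savetxt(buffer, table, fmt="%s;%d;%d;%d")
buffer.seek(0)                               # rewind before reading back

loaded = np.genfromtxt(buffer, dtype=dt, delimiter=";", encoding="utf-8")
print(loaded['country'])
```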


**Exercise**

Define a structured array with two columns. The first column contains the product ID, which can be defined as an int32. The second column shall contain the price for the product. How can you print out the column with the product IDs, the first row and the price for the third article of this structured array?

In [21]:
mytype = [('productID', np.int32), ('price', np.float64)]
stock = np.array([(34765, 603.76), 
                  (45765, 439.93),
                  (99661, 344.19),
                  (12129, 129.39)], dtype=mytype)

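print(stock["productID"])    # column with the product IDs
print(stock[0])              # first row
print(stock[2]["price"])     # price of the third article


[34765 45765 99661 12129]
(34765, 603.76)
344.19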


# **Numerical Operations on numpy arrays**



In [23]:
#Using Scalars

a= np.array([1,2,3])

print(a*5)
print(a-5)
print(a*a)


#Arithmetic Operations with Two arrays

A = np.array([ [11, 12, 13], [21, 22, 23], [31, 32, 33] ])
B = np.ones((3,3))

print("Adding two arrays: ")
print(A + B)
print("Multiplying two arrays: ")
print(A * (B + 1))
print("Matrix Multiplication:")
print(np.dot(A,B))

[ 5 10 15]
[-4 -3 -2]
[1 4 9]
Adding two arrays: 
[[12. 13. 14.]
 [22. 23. 24.]
 [32. 33. 34.]]
Multiplying two arrays: 
[[22. 24. 26.]
 [42. 44. 46.]
 [62. 64. 66.]]
Matrix Multiplication:
[[36. 36. 36.]
 [66. 66. 66.]
 [96. 96. 96.]]
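
The difference between `*` and matrix multiplication is easy to miss when one operand consists only of ones. A small sketch with two made-up 2 x 2 arrays makes it visible. The `@` operator is used here as an equivalent of `np.dot` for two-dimensional arrays.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])

print(A * B)        # element-wise product
print(A @ B)        # matrix product, same result as np.dot(A, B)
```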


**Logical Operations**



In [25]:
a = np.array([ [True, True], [False, False]])
b = np.array([ [True, False], [True, False]])
print(np.logical_or(a, b))
print(np.logical_and(a, b))
print(np.array_equal(a,b))

[[ True  True]
 [ True False]]
[[ True False]
 [False False]]
False
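
Boolean arrays usually come from comparisons, not from literals. The sketch below, with made-up values, combines two comparisons with `&` and uses the result as a mask.

```python
import numpy as np

a = np.array([4, 9, 15, 23])

mask = (a > 5) & (a < 20)       # element-wise AND of two comparisons
print(mask)                     # [False  True  True False]
print(a[mask])                  # [ 9 15]
print(np.logical_not(mask))     # [ True False False  True]
print(mask.any(), mask.all())   # True False
```

The parentheses around each comparison are needed because `&` binds more tightly than `>` and `<`.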


**Distance Matrix**

In mathematics, computer science and especially graph theory, a distance matrix is a matrix or a two-dimensional array, which contains the distances between the elements of a set, pairwise taken.

In [26]:
cities = ["Barcelona", "Berlin", "Brussels", "Bucharest",
          "Budapest", "Copenhagen", "Dublin", "Hamburg", "Istanbul",
          "Kiev", "London", "Madrid", "Milan", "Moscow", "Munich",
          "Paris", "Prague", "Rome", "Saint Petersburg", 
          "Stockholm", "Vienna", "Warsaw"]
dist2barcelona = [0,  1498, 1063, 1968, 
                  1498, 1758, 1469, 1472, 2230, 
                  2391, 1138, 505, 725, 3007, 1055, 
                  833, 1354, 857, 2813, 
                  2277, 1347, 1862]
dists =  np.array(dist2barcelona[:12])
print(dists)
print(np.abs(dists - dists[:, np.newaxis]))

[   0 1498 1063 1968 1498 1758 1469 1472 2230 2391 1138  505]
[[   0 1498 1063 1968 1498 1758 1469 1472 2230 2391 1138  505]
 [1498    0  435  470    0  260   29   26  732  893  360  993]
 [1063  435    0  905  435  695  406  409 1167 1328   75  558]
 [1968  470  905    0  470  210  499  496  262  423  830 1463]
 [1498    0  435  470    0  260   29   26  732  893  360  993]
 [1758  260  695  210  260    0  289  286  472  633  620 1253]
 [1469   29  406  499   29  289    0    3  761  922  331  964]
 [1472   26  409  496   26  286    3    0  758  919  334  967]
 [2230  732 1167  262  732  472  761  758    0  161 1092 1725]
 [2391  893 1328  423  893  633  922  919  161    0 1253 1886]
 [1138  360   75  830  360  620  331  334 1092 1253    0  633]
 [ 505  993  558 1463  993 1253  964  967 1725 1886  633    0]]
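
The expression `dists - dists[:, np.newaxis]` relies on broadcasting. A reduced sketch with the first three cities shows the shapes involved. Note that the entries are absolute differences of the distances to Barcelona, which is how the code above builds its matrix.

```python
import numpy as np

cities = ["Barcelona", "Berlin", "Brussels"]
dists = np.array([0, 1498, 1063])      # distances to Barcelona, from the list above

column = dists[:, np.newaxis]          # shape (3, 1)
diff = np.abs(dists - column)          # (3,) and (3, 1) broadcast to (3, 3)
print(diff.shape)

i, j = cities.index("Berlin"), cities.index("Brussels")
print(diff[i, j])                      # 435, as in the matrix above
```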


# Concatenating, Flattening and Adding Dimensions

**flatten**

flatten is an ndarray method with an optional keyword parameter "order". order can have the values "C", "F" and "A". The default is "C". "C" means to flatten in C style, i.e. row-major order: the row index varies the slowest and the column index the quickest, so that a[0,1] follows a[0,0]. In general, the rightmost index changes the fastest.
"F" stands for Fortran-style column-major order, in which the leftmost index changes the fastest. "A" means to flatten in column-major order if the array is Fortran contiguous in memory, and in row-major order otherwise.

In [27]:
A = np.array([[[ 0,  1],
               [ 2,  3],
               [ 4,  5],
               [ 6,  7]],
              [[ 8,  9],
               [10, 11],
               [12, 13],
               [14, 15]],
              [[16, 17],
               [18, 19],
               [20, 21],
               [22, 23]]])
Flattened_X = A.flatten()
print(Flattened_X)
print(A.flatten(order="C"))
print(A.flatten(order="F"))
print(A.flatten(order="A"))

[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23]
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23]
[ 0  8 16  2 10 18  4 12 20  6 14 22  1  9 17  3 11 19  5 13 21  7 15 23]
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23]
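
In the output above, "A" gives the same result as "C" because the array is stored in C order. A small sketch with a made-up 2 x 3 array shows the case where "A" follows Fortran order. It also shows that flatten returns a copy.

```python
import numpy as np

a = np.array([[0, 1, 2],
              [3, 4, 5]])

print(a.flatten(order="F"))    # [0 3 1 4 2 5]

f = np.asfortranarray(a)       # same values, column-major memory layout
print(f.flatten(order="A"))    # [0 3 1 4 2 5]
print(a.flatten(order="A"))    # [0 1 2 3 4 5]

flat = a.flatten()             # flatten always returns a copy
flat[0] = 99
print(a[0, 0])                 # still 0
```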
