# Operations on NumPy Arrays

The learning objectives of this section are:

* Manipulate arrays
    * Reshape arrays
    * Stack arrays
* Perform operations on arrays
    * Perform basic mathematical operations
    * Apply built-in functions 
    * Apply your own functions 
    * Apply basic linear algebra operations 


### Manipulating Arrays

Let's look at some ways to manipulate arrays, i.e. changing the shape, combining and splitting arrays, etc.   

#### Reshaping Arrays

Reshaping is done using the ```reshape()``` method. The new shape must hold the same total number of elements as the original array.


In [2]:
import numpy as np

# Reshape a 1-D array to a 3 x 4 array
some_array = np.arange(0, 12).reshape(3, 4)
print(some_array)

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]


In [26]:
np_array = np.random.rand(24).reshape(4,-1)
np_array

array([[0.84031406, 0.26965811, 0.93470515, 0.23841032, 0.38779949,
        0.5290163 ],
       [0.30700639, 0.83277247, 0.78702489, 0.80612649, 0.34200263,
        0.86437963],
       [0.80916323, 0.53974556, 0.58882123, 0.31634349, 0.53091729,
        0.20156946],
       [0.91843169, 0.3315271 , 0.67847363, 0.47495926, 0.32195073,
        0.51234671]])

In [27]:
np_array.shape

(4, 6)

In [30]:
# Can reshape it further 
some_array.reshape(2, 6)

array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11]])

In [31]:
# If you specify -1 as a dimension, the dimensions are automatically calculated
# -1 means "whatever dimension is needed" 
some_array.reshape(4, -1)

array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])
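
The reshape rules above can be checked on a small case. The following is a minimal sketch, assuming a hypothetical 12-element array named `values`; it shows that `-1` is inferred from the remaining dimension and that an incompatible shape raises a `ValueError`.

```python
import numpy as np

# Hypothetical example array with 12 elements
values = np.arange(12)

# 12 elements fit a 3 x 4 layout
print(values.reshape(3, 4).shape)   # (3, 4)

# -1 is inferred as 12 / 2 = 6
inferred = values.reshape(2, -1)
print(inferred.shape)               # (2, 6)

# 12 elements cannot be split into 5 equal rows, so NumPy raises ValueError
try:
    values.reshape(5, -1)
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print("reshape failed:", err)
```

Only one dimension can be given as `-1`, since NumPy needs the others to work out the missing size.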

```array.T``` returns the transpose of an array.

In [32]:
# Transposing an array
some_array.T

array([[ 0,  4,  8],
       [ 1,  5,  9],
       [ 2,  6, 10],
       [ 3,  7, 11]])

In [33]:
np_array.T

array([[0.84031406, 0.30700639, 0.80916323, 0.91843169],
       [0.26965811, 0.83277247, 0.53974556, 0.3315271 ],
       [0.93470515, 0.78702489, 0.58882123, 0.67847363],
       [0.23841032, 0.80612649, 0.31634349, 0.47495926],
       [0.38779949, 0.34200263, 0.53091729, 0.32195073],
       [0.5290163 , 0.86437963, 0.20156946, 0.51234671]])
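
A short sketch of what the transpose does to shapes and indices, assuming a hypothetical 3 x 4 array named `grid`. It also illustrates that, for a 2-D array, `.T` returns a view of the same data rather than a copy.

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)   # hypothetical 3 x 4 example
flipped = grid.T

print(grid.shape, flipped.shape)     # (3, 4) (4, 3)

# Element (i, j) of the original is element (j, i) of the transpose
print(grid[1, 3] == flipped[3, 1])   # True

# .T returns a view: writing through it changes the original array
flipped[0, 0] = 100
print(grid[0, 0])                    # 100
```

If an independent transposed array is needed, `grid.T.copy()` is one way to get it.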

### Stacking and Splitting Arrays

#### Stacking: ```np.hstack()``` and ```np.vstack()```

Stacking is done using the ```np.hstack()``` and ```np.vstack()``` functions. For horizontal stacking, the arrays must have the same number of rows, while for vertical stacking, they must have the same number of columns.

In [34]:
# Creating two arrays
array_1 = np.arange(12).reshape(3, 4)
array_2 = np.arange(20).reshape(5, 4)

print(array_1)
print("\n")
print(array_2)

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]


[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]
 [16 17 18 19]]


In [36]:
# vstack
# Note that np.vstack(a, b) throws an error - you need to pass the arrays as a single sequence (a tuple or list)
np.vstack((array_1, array_2))

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19]])

In [38]:
np_arrays1 = np.random.rand(5,4)
np_arrays2= np.random.rand(6, 4)
print(np_arrays1)
print()
print(np_arrays2)

[[0.33112064 0.27949875 0.98047647 0.05222393]
 [0.80431935 0.59611061 0.9624192  0.81935952]
 [0.65771846 0.98661731 0.96767623 0.51034578]
 [0.04164961 0.04851168 0.61549497 0.42007576]
 [0.43230627 0.46867022 0.97002653 0.47004848]]

[[0.93902716 0.89578073 0.31486321 0.2017403 ]
 [0.69203795 0.96584573 0.27614724 0.44993481]
 [0.26304609 0.53369815 0.87073121 0.76557653]
 [0.04980542 0.0718783  0.57557772 0.07912574]
 [0.90039579 0.00401615 0.96848228 0.70924557]
 [0.4016378  0.39915533 0.42553754 0.81231506]]


In [39]:
stacked_np_array =  np.vstack((np_arrays1, np_arrays2))
stacked_np_array

array([[0.33112064, 0.27949875, 0.98047647, 0.05222393],
       [0.80431935, 0.59611061, 0.9624192 , 0.81935952],
       [0.65771846, 0.98661731, 0.96767623, 0.51034578],
       [0.04164961, 0.04851168, 0.61549497, 0.42007576],
       [0.43230627, 0.46867022, 0.97002653, 0.47004848],
       [0.93902716, 0.89578073, 0.31486321, 0.2017403 ],
       [0.69203795, 0.96584573, 0.27614724, 0.44993481],
       [0.26304609, 0.53369815, 0.87073121, 0.76557653],
       [0.04980542, 0.0718783 , 0.57557772, 0.07912574],
       [0.90039579, 0.00401615, 0.96848228, 0.70924557],
       [0.4016378 , 0.39915533, 0.42553754, 0.81231506]])

Similarly, two arrays having the same number of rows can be horizontally stacked using ```np.hstack((a, b))```.

In [42]:
a = np.random.rand(4, 5)
print(a)
print()

b = np.random.rand(4, 6)
print(b)
print()

c = np.hstack((a, b))
print(c)

[[0.41723231 0.34591743 0.88468288 0.76886988 0.81339919]
 [0.4974245  0.06467744 0.14562472 0.32918862 0.38522401]
 [0.69372557 0.36821633 0.47568952 0.29220725 0.96136218]
 [0.34981024 0.50896369 0.80862266 0.45114476 0.35954806]]

[[0.92142262 0.58622148 0.21628602 0.18199134 0.61834998 0.54240743]
 [0.66144277 0.09788981 0.01949405 0.78945665 0.67974322 0.26173504]
 [0.29916706 0.53482105 0.90816348 0.04192444 0.94780113 0.09030388]
 [0.83345818 0.87869257 0.55712822 0.80896268 0.3998586  0.79580228]]

[[0.41723231 0.34591743 0.88468288 0.76886988 0.81339919 0.92142262
  0.58622148 0.21628602 0.18199134 0.61834998 0.54240743]
 [0.4974245  0.06467744 0.14562472 0.32918862 0.38522401 0.66144277
  0.09788981 0.01949405 0.78945665 0.67974322 0.26173504]
 [0.69372557 0.36821633 0.47568952 0.29220725 0.96136218 0.29916706
  0.53482105 0.90816348 0.04192444 0.94780113 0.09030388]
 [0.34981024 0.50896369 0.80862266 0.45114476 0.35954806 0.83345818
  0.87869257 0.55712822 0.80896268 0.3998586  0.79580228]]
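
The shape requirements for stacking can be summarised in a small sketch. The arrays `top`, `bottom` and `right` below are hypothetical; the example also shows that, for 2-D arrays, `np.concatenate` with an explicit `axis` gives the same results.

```python
import numpy as np

top = np.zeros((3, 4))      # hypothetical example arrays
bottom = np.ones((5, 4))    # same number of columns as `top`
right = np.ones((3, 2))     # same number of rows as `top`

tall = np.vstack((top, bottom))
wide = np.hstack((top, right))
print(tall.shape)   # (8, 4)
print(wide.shape)   # (3, 6)

# For 2-D arrays, np.concatenate with an explicit axis is equivalent
same_tall = np.concatenate((top, bottom), axis=0)
same_wide = np.concatenate((top, right), axis=1)
print(np.array_equal(tall, same_tall), np.array_equal(wide, same_wide))  # True True

# Mismatched shapes raise ValueError (3 rows vs 5 rows here)
try:
    np.hstack((top, bottom))
    stack_failed = False
except ValueError:
    stack_failed = True
print(stack_failed)  # True
```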

### Perform Operations on Arrays

Performing mathematical operations on arrays is extremely simple. Let's see some common operations.


#### Basic Mathematical Operations

NumPy provides almost all the basic math functions - exp, sin, cos, log, sqrt etc. The function is applied to each element of the array.


In [43]:
# Basic mathematical operations
a = np.arange(1, 20)

# sin, cos, exp, log
print(np.sin(a))
print(np.cos(a))
print(np.exp(a))
print(np.log(a))

[ 0.84147098  0.90929743  0.14112001 -0.7568025  -0.95892427 -0.2794155
  0.6569866   0.98935825  0.41211849 -0.54402111 -0.99999021 -0.53657292
  0.42016704  0.99060736  0.65028784 -0.28790332 -0.96139749 -0.75098725
  0.14987721]
[ 0.54030231 -0.41614684 -0.9899925  -0.65364362  0.28366219  0.96017029
  0.75390225 -0.14550003 -0.91113026 -0.83907153  0.0044257   0.84385396
  0.90744678  0.13673722 -0.75968791 -0.95765948 -0.27516334  0.66031671
  0.98870462]
[2.71828183e+00 7.38905610e+00 2.00855369e+01 5.45981500e+01
 1.48413159e+02 4.03428793e+02 1.09663316e+03 2.98095799e+03
 8.10308393e+03 2.20264658e+04 5.98741417e+04 1.62754791e+05
 4.42413392e+05 1.20260428e+06 3.26901737e+06 8.88611052e+06
 2.41549528e+07 6.56599691e+07 1.78482301e+08]
[0.         0.69314718 1.09861229 1.38629436 1.60943791 1.79175947
 1.94591015 2.07944154 2.19722458 2.30258509 2.39789527 2.48490665
 2.56494936 2.63905733 2.7080502  2.77258872 2.83321334 2.89037176
 2.94443898]
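
The arithmetic operators work element by element in the same way as the functions above. A minimal sketch, assuming two small hypothetical arrays `x` and `y` of equal length:

```python
import numpy as np

x = np.array([1.0, 4.0, 9.0])   # hypothetical example values
y = np.array([2.0, 2.0, 2.0])

total = x + y          # elements: 3, 6, 11
product = x * y        # elements: 2, 8, 18
squared = x ** 2       # elements: 1, 16, 81
roots = np.sqrt(x)     # elements: 1, 2, 3

print(total, product, squared, roots)
```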


In [55]:
array_np = np.random.rand(20)
array_np

array([0.14189215, 0.06847257, 0.2153452 , 0.69122867, 0.76355413,
       0.60681229, 0.63591087, 0.55062461, 0.49316719, 0.45916749,
       0.24289218, 0.5640661 , 0.56165623, 0.91685842, 0.75694209,
       0.14005068, 0.44880079, 0.96125577, 0.03600147, 0.14170773])

In [56]:
import time

t0 = time.time()

print('sin...')
print(np.sin(array_np))
print('cos...')
print(np.cos(array_np))
print('tan...')
print(np.tan(array_np))
print('log...')
print(np.log(array_np))
print('exp...')
print(np.exp(array_np))
print('sqrt...')
print(np.sqrt(array_np))


t1 = time.time()
print()
print('Time taken to finish the above mathematical operations, in milliseconds:')
print((t1-t0)*1000)

sin...
[0.1414165  0.06841908 0.21368467 0.63748431 0.69149325 0.57025176
 0.59391059 0.52321962 0.47341804 0.44320198 0.24051091 0.53462682
 0.53258872 0.79369447 0.68670175 0.1395933  0.4338854  0.81991113
 0.0359937  0.14123393]
cos...
[0.98995019 0.99765667 0.97690269 0.77046334 0.72238292 0.82146998
 0.80453105 0.85219788 0.88083787 0.89642178 0.97064643 0.84508826
 0.84637418 0.60831661 0.72693928 0.99020892 0.90096807 0.57249082
 0.99935202 0.98997625]
tan...
[0.14285213 0.06857978 0.21873691 0.82740382 0.95723919 0.69418453
 0.73820716 0.61396494 0.53746332 0.49441233 0.24778427 0.63262839
 0.62925917 1.30473911 0.94464802 0.14097358 0.48157689 1.43218214
 0.03601704 0.14266396]
log...
[-1.95268805 -2.68132205 -1.53551294 -0.36928458 -0.26977127 -0.49953577
 -0.45269686 -0.59670199 -0.70690703 -0.77834023 -1.41513763 -0.57258383
 -0.57686531 -0.08680221 -0.27846852 -1.96575092 -0.80117616 -0.03951476
 -3.3241954  -1.95398856]
exp...
[1.15245235 1.07087125 1.24028997 1.99616666 
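
When timing a block of operations like this, keep in mind that the difference between two clock readings is in seconds, so converting to milliseconds means multiplying by 1000. The sketch below is one possible way to do it; the array `samples` is hypothetical, and `time.perf_counter()` is used here as an assumption because it is intended for measuring short intervals.

```python
import time
import numpy as np

samples = np.random.rand(20)   # hypothetical input

start = time.perf_counter()
results = [np.sin(samples), np.cos(samples), np.exp(samples), np.sqrt(samples)]
elapsed_seconds = time.perf_counter() - start

# 1 second = 1000 milliseconds, so multiply to convert
elapsed_ms = elapsed_seconds * 1000
print(f"Elapsed: {elapsed_ms:.3f} ms")
```

A single run on such a small array is noisy; repeating the measurement several times gives a more reliable figure.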

#### Apply User Defined Functions

You can also apply your own functions to arrays, for example applying the function ```x/(x+1)``` to each element of an array.

One way to do that is to loop through the array, which is the non-NumPy way. It is usually better to write **vectorised code**.

A simple way to do that is to vectorise the function you want, and then apply it to the array. NumPy provides the ```np.vectorize()``` function for this. Note that ```np.vectorize()``` is a convenience wrapper that still loops over the elements internally, so it improves readability rather than speed.

Let's look at both the ways to do it.

In [57]:
print(a)

[ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]


In [58]:
# The non-numpy way, not recommended
a_list = [x/(x+1) for x in a]
print(a_list)

[0.5, 0.6666666666666666, 0.75, 0.8, 0.8333333333333334, 0.8571428571428571, 0.875, 0.8888888888888888, 0.9, 0.9090909090909091, 0.9166666666666666, 0.9230769230769231, 0.9285714285714286, 0.9333333333333333, 0.9375, 0.9411764705882353, 0.9444444444444444, 0.9473684210526315, 0.95]


In [60]:
print(array_np)

[0.14189215 0.06847257 0.2153452  0.69122867 0.76355413 0.60681229
 0.63591087 0.55062461 0.49316719 0.45916749 0.24289218 0.5640661
 0.56165623 0.91685842 0.75694209 0.14005068 0.44880079 0.96125577
 0.03600147 0.14170773]


In [64]:
list_of_np_array= [item/(item+1) for item in array_np]
print(list_of_np_array)

[0.12426054980011332, 0.06408453695306919, 0.1771885077827871, 0.4087139036698376, 0.4329632497337398, 0.3776497697700838, 0.3887197544170875, 0.3550985886101051, 0.33028263440274924, 0.3146777148079756, 0.19542498146585408, 0.36064083243039174, 0.3596542049989793, 0.4783130629209152, 0.43082927791229086, 0.12284601233661158, 0.30977398405480044, 0.4901225954660245, 0.03475040811533494, 0.12411909680249893]


In [62]:
# The numpy way: vectorize the function, then apply it
f = np.vectorize(lambda x: x/(x+1))
f(a)

array([0.5       , 0.66666667, 0.75      , 0.8       , 0.83333333,
       0.85714286, 0.875     , 0.88888889, 0.9       , 0.90909091,
       0.91666667, 0.92307692, 0.92857143, 0.93333333, 0.9375    ,
       0.94117647, 0.94444444, 0.94736842, 0.95      ])

In [65]:
vectorized_fun = np.vectorize(lambda item: item/(item+1))
# Note: the input here is list_of_np_array, which was already transformed above,
# so the function is effectively applied a second time
vectorized_fun(list_of_np_array)

array([0.11052647, 0.06022504, 0.15051838, 0.29013265, 0.3021454 ,
       0.27412611, 0.27991231, 0.26204631, 0.24828005, 0.23935731,
       0.16347741, 0.26505219, 0.26451888, 0.3235533 , 0.3011046 ,
       0.10940593, 0.2365095 , 0.32891428, 0.03358337, 0.11041454])

In [68]:
# Apply the function to a 1-D array created with linspace: applied to each element
b = np.linspace(1, 100, 10)
print(b)
print()
print(f(b))

[  1.  12.  23.  34.  45.  56.  67.  78.  89. 100.]

[0.5        0.92307692 0.95833333 0.97142857 0.97826087 0.98245614
 0.98529412 0.98734177 0.98888889 0.99009901]


In [69]:
bb = np.linspace(1, 50 ,20)
print(bb)
print()
print(vectorized_fun(bb))

[ 1.          3.57894737  6.15789474  8.73684211 11.31578947 13.89473684
 16.47368421 19.05263158 21.63157895 24.21052632 26.78947368 29.36842105
 31.94736842 34.52631579 37.10526316 39.68421053 42.26315789 44.84210526
 47.42105263 50.        ]

[0.5        0.7816092  0.86029412 0.8972973  0.91880342 0.93286219
 0.94277108 0.95013123 0.95581395 0.96033403 0.96401515 0.96707106
 0.96964856 0.97185185 0.97375691 0.97542044 0.97688564 0.97818599
 0.97934783 0.98039216]


This also has the advantage that you can vectorize the function once, and then apply it as many times as needed. 
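
For a simple arithmetic expression such as ```x/(x+1)```, the array can also be used in the expression directly, since the operators already work element by element. The sketch below compares both routes on the same values; the names `direct`, `wrapped` and `via_vectorize` are hypothetical.

```python
import numpy as np

values = np.arange(1, 20)

# Arithmetic operators already work element-wise on arrays
direct = values / (values + 1)

# np.vectorize wraps a scalar function so that it accepts arrays
wrapped = np.vectorize(lambda x: x / (x + 1))
via_vectorize = wrapped(values)

print(np.allclose(direct, via_vectorize))   # True
```

```np.vectorize()``` is mainly useful when the function contains plain Python logic, such as an `if`/`else` branch, that cannot be written as a single array expression.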

#### Apply Basic Linear Algebra Operations

NumPy provides the ```np.linalg``` package to apply common linear algebra operations, such as:
* ```np.linalg.inv```: Inverse of a matrix
* ```np.linalg.det```: Determinant of a matrix
* ```np.linalg.eig```: Eigenvalues and eigenvectors of a matrix
    
Also, you can multiply matrices using ```np.dot(a, b)```.
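
A minimal sketch of these functions on a small invertible matrix. The matrix `A` is a hypothetical example chosen so that the determinant is easy to verify by hand (4*6 - 7*2 = 10).

```python
import numpy as np

# Hypothetical invertible 2 x 2 matrix (determinant is 4*6 - 7*2 = 10)
A = np.array([[4.0, 7.0],
              [2.0, 6.0]])

A_inv = np.linalg.inv(A)
det_A = np.linalg.det(A)

print(np.isclose(det_A, 10.0))                    # True

# A matrix times its inverse gives the identity, up to rounding
print(np.allclose(np.dot(A, A_inv), np.eye(2)))   # True

# For 2-D arrays, the @ operator gives the same product as np.dot
print(np.allclose(A @ A_inv, np.dot(A, A_inv)))   # True
```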


In [71]:
# np.linalg documentation
help(np.linalg)

Help on package numpy.linalg in numpy:

NAME
    numpy.linalg

DESCRIPTION
    ``numpy.linalg``
    
    The NumPy linear algebra functions rely on BLAS and LAPACK to provide efficient
    low level implementations of standard linear algebra algorithms. Those
    libraries may be provided by NumPy itself using C versions of a subset of their
    reference implementations but, when possible, highly optimized libraries that
    take advantage of specialized processor functionality are preferred. Examples
    of such libraries are OpenBLAS, MKL (TM), and ATLAS. Because those libraries
    are multithreaded and processor dependent, environmental variables and external
    packages such as threadpoolctl may be needed to control the number of threads
    or specify the processor architecture.
    
    - OpenBLAS: https://www.openblas.net/
    - threadpoolctl: https://github.com/joblib/threadpoolctl
    
    Please note that the most-used linear algebra functions in NumPy are present in
    t

In [13]:
# Creating arrays
a = np.arange(1, 10).reshape(3, 3)
b= np.arange(1, 13).reshape(3, 4)
print(a)
print(b)

[[1 2 3]
 [4 5 6]
 [7 8 9]]
[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]


In [14]:
# Inverse (note: this particular matrix is singular, so the result below is numerically meaningless)
np.linalg.inv(a)

array([[ 3.15251974e+15, -6.30503948e+15,  3.15251974e+15],
       [-6.30503948e+15,  1.26100790e+16, -6.30503948e+15],
       [ 3.15251974e+15, -6.30503948e+15,  3.15251974e+15]])

In [15]:
# Determinant (mathematically 0 for this matrix; the tiny value below is floating-point rounding error)
np.linalg.det(a)

-9.51619735392994e-16
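
The 3 x 3 matrix built from 1 to 9 is singular: its rows are linearly dependent, so its determinant is mathematically zero and it has no true inverse. The sketch below shows a few checks that can flag this before calling ```np.linalg.inv```; the threshold `1e12` is an arbitrary assumption for illustration.

```python
import numpy as np

singular = np.arange(1, 10).reshape(3, 3)

# A rank below 3 means the rows are linearly dependent
rank = np.linalg.matrix_rank(singular)
print(rank)                        # 2

# The determinant is zero up to floating-point rounding
det_is_zero = np.isclose(np.linalg.det(singular), 0.0)
print(det_is_zero)                 # True

# A very large condition number is another warning sign
badly_conditioned = np.linalg.cond(singular) > 1e12
print(badly_conditioned)
```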

In [16]:
# Eigenvalues and eigenvectors
np.linalg.eig(a)

(array([ 1.61168440e+01, -1.11684397e+00, -9.75918483e-16]),
 array([[-0.23197069, -0.78583024,  0.40824829],
        [-0.52532209, -0.08675134, -0.81649658],
        [-0.8186735 ,  0.61232756,  0.40824829]]))

In [17]:
# Multiply matrices
np.dot(a, b)

array([[ 38,  44,  50,  56],
       [ 83,  98, 113, 128],
       [128, 152, 176, 200]])

In [77]:
m1 = np.random.rand(4,4)
m2 = np.random.rand(5,7)
print(m1)
print()
print(m2)

[[0.42528881 0.40657457 0.34164631 0.91217016]
 [0.71403532 0.03413834 0.42666704 0.53775728]
 [0.33927705 0.89720408 0.49237759 0.39872151]
 [0.37141594 0.42857114 0.48674924 0.95567864]]

[[0.99016122 0.45640536 0.51764117 0.8950096  0.4607941  0.86869394
  0.1802253 ]
 [0.23882228 0.13494693 0.56396152 0.34652317 0.62783537 0.09906281
  0.45925911]
 [0.970964   0.61291739 0.39657619 0.13431738 0.50796839 0.67393927
  0.42342808]
 [0.79058973 0.15532091 0.95483765 0.8118161  0.17539229 0.37728607
  0.69763216]
 [0.86249985 0.3316584  0.41654502 0.55007373 0.2238746  0.25848376
  0.5125277 ]]


In [78]:
np.linalg.inv(m1)

array([[ 2.78049138,  1.43583541,  0.42337654, -3.63848412],
       [ 1.93321085, -0.70703799,  1.1949293 , -1.94589098],
       [-6.57211399,  0.84094698,  0.21959491,  5.70809421],
       [ 1.39977416, -0.66926931, -0.81224862,  0.4258055 ]])

In [81]:
np.linalg.eig(m1)

(array([ 2.07119093+0.j        , -0.44619183+0.j        ,
         0.14124214+0.20771187j,  0.14124214-0.20771187j]),
 array([[-0.5150631 +0.j        , -0.1213811 +0.j        ,
         -0.41317104+0.20457401j, -0.41317104-0.20457401j],
        [-0.42997827+0.j        ,  0.72967831+0.j        ,
         -0.07397569+0.22084363j, -0.07397569-0.22084363j],
        [-0.49455838+0.j        , -0.67160177+0.j        ,
          0.79623596+0.j        ,  0.79623596-0.j        ],
        [-0.55248593+0.j        ,  0.04227599+0.j        ,
         -0.18317529-0.25622248j, -0.18317529+0.25622248j]]))

In [83]:
np.linalg.det(m1)

-0.05830782041664099

In [85]:
np.dot(m1, m2)

ValueError: shapes (4,4) and (5,7) not aligned: 4 (dim 1) != 5 (dim 0)

In [92]:
m3 = np.random.rand(4,6)
m4= np.dot(m1,m3)
print(m4)
print()
print('Shape is:', m4.shape)

[[1.32996563 0.96431737 1.13203722 1.05347258 1.08625951 0.83273443]
 [0.8783824  0.7062268  1.01385947 0.79347462 0.93943754 0.82438905]
 [1.30332131 1.2024648  0.72966391 1.01526931 1.02408105 0.74638174]
 [1.5043924  1.04181246 1.13100414 1.22930048 1.16983303 0.8985078 ]]

Shape is: (4, 6)
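
The rule behind the error and the successful product above is that, for 2-D arrays, the number of columns of the first array must equal the number of rows of the second; the result takes its rows from the first and its columns from the second. A sketch with a hypothetical helper `can_multiply`:

```python
import numpy as np

left = np.ones((4, 4))        # hypothetical shapes
right_bad = np.ones((5, 7))
right_ok = np.ones((4, 6))

def can_multiply(x, y):
    """Return True when the 2-D product np.dot(x, y) is defined."""
    return x.shape[1] == y.shape[0]

print(can_multiply(left, right_bad))   # False
print(can_multiply(left, right_ok))    # True
print(np.dot(left, right_ok).shape)    # (4, 6)
```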


In [94]:
# The * operator multiplies element by element; this is not the matrix product
print(m1*m1)

[[0.18087057 0.16530288 0.1167222  0.8320544 ]
 [0.50984644 0.00116543 0.18204476 0.2891829 ]
 [0.11510892 0.80497517 0.24243569 0.15897885]
 [0.1379498  0.18367322 0.23692482 0.91332166]]
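
The `*` operator multiplies arrays element by element, which is different from the matrix product computed by ```np.dot```. A minimal sketch on a hypothetical 2 x 2 matrix `P` makes the difference visible:

```python
import numpy as np

P = np.array([[1, 2],
              [3, 4]])

elementwise = P * P        # squares each entry: 1, 4, 9, 16
matrix_product = P @ P     # row-by-column product, same as np.dot(P, P)

print(elementwise)
print(matrix_product)      # entries: 7, 10, 15, 22
```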


In [98]:
a = np.random.rand(4,3)
print(a)
print()
print(a[2, :])

[[0.4864531  0.0431352  0.63217345]
 [0.88639858 0.29356583 0.9861039 ]
 [0.90957921 0.56680361 0.16976195]
 [0.68136181 0.45663306 0.63341502]]

[0.90957921 0.56680361 0.16976195]


In [104]:
a = np.array([[4, 3, 1], [5, 7, 0], [9, 9, 3], [8, 2, 4]])
print(a)
print()
m, n = 0, 2
# Swap rows m and n. Copy row m first: a[m, :] on its own is a view,
# so it would change as soon as that row is overwritten.
temp = a[m, :].copy()
print(temp)
print()
a[m, :] = a[n, :]
print(a)
print()
a[n, :] = temp
# Print the array after swapping
print(a)

[[4 3 1]
 [5 7 0]
 [9 9 3]
 [8 2 4]]

[4 3 1]

[[9 9 3]
 [5 7 0]
 [9 9 3]
 [8 2 4]]

[[9 9 3]
 [5 7 0]
 [4 3 1]
 [8 2 4]]
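
Swapping rows is a place where the difference between a view and a copy matters: basic slicing such as `a[m, :]` returns a view of the same data, not an independent copy. The sketch below shows a compact alternative that needs no temporary variable, followed by a small view-versus-copy demonstration. The names `rows`, `view` and `snapshot` are hypothetical.

```python
import numpy as np

# Same values as the 4 x 3 example array above
rows = np.array([[4, 3, 1], [5, 7, 0], [9, 9, 3], [8, 2, 4]])
m, n = 0, 2

# Indexing with a list builds a new array on the right-hand side,
# so both rows are read before either one is overwritten
rows[[m, n]] = rows[[n, m]]
print(rows)       # rows 0 and 2 have traded places

# Basic slicing returns a view; .copy() takes an independent snapshot
view = rows[m, :]
snapshot = rows[m, :].copy()
rows[m, :] = 0
print(view)       # follows the change: all zeros
print(snapshot)   # keeps the old values: 9, 9, 3
```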


In [108]:
a.ndim


2

In [109]:
a.shape

(4, 3)