<h1 style="margin-bottom:0"><center>DI 501 - Introduction to Data Informatics</center></h1>
<h2 style="margin-top:0"><center>NumPy Tutorial</center></h2>
<br>
<p style="margin-top:0"><center><b>This tutorial is prepared for Middle East Technical University's DI 501 - Introduction to Data Informatics course.</b></center></p>
<hr style="height:2px;color:navy;margin-top:0">
<p style="margin-top:0; text-align: justify; font-size:15px">NumPy (<b>Num</b>erical <b>Py</b>thon) is an open-source Python library widely used by scientists and engineers. It is a core library of the scientific Python ecosystem, and its API is used extensively by many other data science libraries such as Pandas, SciPy, Matplotlib and scikit-learn. It provides multidimensional array and matrix structures, together with a wide variety of mathematical operations that work on them.</p>

<h3 style="margin-bottom:0">1) Installation</h3>
<br>
<p style="margin-top:0; text-align: justify">To install NumPy, you first need a Python environment. If you do not have Python yet, we strongly recommend the <a href="https://www.anaconda.com/">Anaconda</a> distribution, as it is beginner friendly. </p>
<p style="margin-top:1; text-align: justify">If you already have Python, you can install NumPy with the following command: </p>

In [None]:
conda install numpy

<p style="margin-top:0; text-align: justify">or </p>

In [None]:
pip install numpy

<p style="margin-top:0; text-align: justify">You can run either command in the command prompt or the Anaconda Prompt. If you have any problems, you can refer to the <a href="https://numpy.org/install/">official installation guide</a> or ask the course assistants directly. </p>

<h3 style="margin-bottom:0">2) Importing</h3>
<br>
<p style="margin-top:0; text-align: justify">To use the NumPy library, you first need to import it. By convention, NumPy is imported under the alias np. </p>

In [1]:
import numpy as np

<h3 style="margin-bottom:0">3) Creating Arrays</h3>
<br>
<p style="margin-top:0; text-align: justify">There are many ways to create an array; we will cover some popular ones. </p>
<br>
<p style="margin-top:0; text-align: justify">The code below creates a 1D array named "a": </p>

In [2]:
a = np.array([1,2,3,4,5,6])
a

array([1, 2, 3, 4, 5, 6])

<p style="margin-top:0; text-align: justify">To create a 2D array, we can create an array named "b" as follows: </p>

In [3]:
b = np.array([(1,2,3), (4,5,6), (7,8,9)])
b

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

<p style="margin-top:0; text-align: justify">Finally, we can create a 3D array named "c" as follows (dtype = float stores the values as floating-point numbers): </p>

In [4]:
c = np.array([[(1,2,3),(4,5,6),(7,8,9)], [(10,11,12),(13,14,15),(16,17,18)], [(19,20,21),(22,23,24),(25,26,27)]],
 dtype = float)
c

array([[[ 1.,  2.,  3.],
        [ 4.,  5.,  6.],
        [ 7.,  8.,  9.]],

       [[10., 11., 12.],
        [13., 14., 15.],
        [16., 17., 18.]],

       [[19., 20., 21.],
        [22., 23., 24.],
        [25., 26., 27.]]])

<p style="margin-top:0; text-align: justify">There are some built-in functions in NumPy library that can help us to create different types of arrays. </p>
<br>
<p style="margin-top:0; text-align: justify">For example, we can create an array filled with zeros by passing the shape of the array as a tuple: </p>

In [5]:
d = np.zeros((3,2))
d

array([[0., 0.],
       [0., 0.],
       [0., 0.]])

<p style="margin-top:0; text-align: justify">Likewise, we can create an array filled with ones:</p>

In [6]:
e = np.ones((3,3))
e

array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])

<p style="margin-top:0; text-align: justify">We can create an array filled with a constant value by passing the shape of the array first and the fill value second:</p>

In [7]:
f = np.full((3,3),'a')
f

array([['a', 'a', 'a'],
       ['a', 'a', 'a'],
       ['a', 'a', 'a']], dtype='<U1')

<p style="margin-top:0; text-align: justify">We can create an identity matrix by:</p>

In [8]:
g = np.eye(4)
g

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])

<p style="margin-top:0; text-align: justify">We can create an array of random values between 0 and 1. The seed makes the results consistent: if you comment out the first line and run the same cell repeatedly, you will get different values each time, whereas setting the seed reproduces the same values on every run:</p>

In [9]:
np.random.seed(123)

h = np.random.random((3,3))
h

array([[0.69646919, 0.28613933, 0.22685145],
       [0.55131477, 0.71946897, 0.42310646],
       [0.9807642 , 0.68482974, 0.4809319 ]])
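<p style="margin-top:0; text-align: justify">The effect of the seed can be checked with a minimal sketch like the one below. The variable names first, second, rng and sample are hypothetical, and the np.random.default_rng generator is shown only as an optional alternative that this tutorial does not otherwise use:</p>

```python
import numpy as np

# Re-seeding with the same value restarts the random sequence,
# so both draws below contain identical numbers.
np.random.seed(123)
first = np.random.random((3, 3))

np.random.seed(123)
second = np.random.random((3, 3))

print(np.array_equal(first, second))  # True

# Optional alternative (an assumption, not used elsewhere in this tutorial):
# a local generator object that carries its own seed.
rng = np.random.default_rng(123)
sample = rng.random((3, 3))  # drawn from a separate random stream
print(sample.shape)  # (3, 3)
```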

<h3 style="margin-bottom:0">4) Inspecting</h3>
<br>
<p style="margin-top:0; text-align: justify">In this section, we will look at some properties of NumPy arrays that can be useful for different purposes. </p>
<br>
<p style="margin-top:0; text-align: justify">To get the shape of an array (its size along each dimension), we can use: </p>

In [10]:
c.shape

(3, 3, 3)

<p style="margin-top:0; text-align: justify">We can find the number of dimensions by: </p>

In [11]:
c.ndim

3

<p style="margin-top:0; text-align: justify">We can find the total number of elements in an array by: </p>

In [12]:
c.size

27

<p style="margin-top:0; text-align: justify">We can get the data type of the elements by: </p>

In [13]:
c.dtype.name

'float64'

<h3 style="margin-bottom:0">5) Indexing and Slicing</h3>
<br>
<p style="margin-top:0; text-align: justify">We may need to select individual elements or some parts of the given array or matrix for different purposes. </p>

<h4 style="margin-bottom:0">Case 1</h4>

<p style="margin-top:0; text-align: justify">The simplest case is getting a single value from an array. Inside square brackets, we give the position of the desired element along each dimension (positions start from 0). To get the value 3 from the array 'a' above: </p>

In [14]:
a[2]

3

![slicing1.png](attachment:slicing1.png)

<h4 style="margin-bottom:0">Case 2</h4>

<p style="margin-top:0; text-align: justify">To get the value 22 from array 'c', we can use: </p>

In [15]:
c[2,1,0]

22.0

![slicing2.png](attachment:slicing2.png)

<p style="margin-top:0; text-align: justify">Note that when passing indices in square brackets, we first identify the page the element is in, then its row, and finally its column. </p>

<h4 style="margin-bottom:0">Case 3</h4>

<p style="margin-top:0; text-align: justify">If we want to capture the very last element of our array, we can simply put -1, instead of counting number of elements in our array. </p>

In [16]:
a[-1]

6

![slicing3.png](attachment:slicing3.png)

<p style="margin-top:0; text-align: justify">To get the element "5" from the array above, we could put -2 in the square brackets. Note that when we index forwards, the index starts from 0. However, when we index backwards, the index starts from -1. </p>

<h4 style="margin-bottom:0">Case 4</h4>

<p style="margin-top:0; text-align: justify">We can select multiple items at once by putting the required indices into a list. If we want to get 1, 2 and 6 from the array 'a', we can use: </p>

In [17]:
a[[0,1,5]]

array([1, 2, 6])

![slicing4-2.png](attachment:slicing4-2.png)

<h4 style="margin-bottom:0">Case 5</h4>

<p style="margin-top:0; text-align: justify">If the required elements are consecutive, we can use ":" to select every element between two indices. Be careful: the index to the left of the colon is inclusive, whereas the index to the right is exclusive. To get all the numbers from 2 to 5 (inclusive) from the array 'a' above: </p>

In [18]:
a[1:5]

array([2, 3, 4, 5])

![slicing5.png](attachment:slicing5.png)

<h4 style="margin-bottom:0">Case 6</h4>

<p style="margin-top:0; text-align: justify">If you want to select every number from 3 to the end of array a, you can put the index of 3 to the left of ":" and leave the right side empty: </p>

In [19]:
a[2:]

array([3, 4, 5, 6])

![slicing6.png](attachment:slicing6.png)

<p style="margin-top:0; text-align: justify">We could select the whole array with a[ : ]. We could also select all the numbers up to and including 3 with a[ : 3] (we use 3 because, again, the right side is exclusive whereas the left side is inclusive). </p>

<h4 style="margin-bottom:0">Case 7</h4>

<p style="margin-top:0; text-align: justify">If we need to select all of the even numbers, we could put their indices into a list. However, since they are evenly spaced, we can instead give the <b>step size</b> as a third element, again separated by ":". We start from index 1 (the index of the first even number), go all the way to the end (the second position is left empty), and increment the index by 2 each time:</p>

In [20]:
a[1::2]

array([2, 4, 6])

![slicing7.png](attachment:slicing7.png)

<h4 style="margin-bottom:0">Case 8</h4>

This step size method can be used for reversing an array. Recall that a[ : ] would return the entire array. If we add the step size of -1 as the third argument, we will get our array reversed.

In [21]:
a[::-1]

array([6, 5, 4, 3, 2, 1])

![slicing8.png](attachment:slicing8.png)

<h4 style="margin-bottom:0">Case 9</h4>

<p style="margin-top:0; text-align: justify">When we work with arrays that have more than one dimension, we may need to select a whole row or column. If we want to grab the second row of array b above:</p>

In [22]:
b[1,:]

array([4, 5, 6])

![slicing9.png](attachment:slicing9.png)

First, we passed the row we require, then we passed : to indicate we want everything from that row. For example, we could put ":2" as the second argument to indicate we only want 4 and 5.
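<p style="margin-top:0; text-align: justify">Selecting a column works the same way with the two positions swapped. The following is a minimal sketch on the same array b; the names second_row and second_column are hypothetical and used only for illustration:</p>

```python
import numpy as np

b = np.array([(1, 2, 3), (4, 5, 6), (7, 8, 9)])

second_row = b[1, :]     # row index fixed, every column kept
second_column = b[:, 1]  # every row kept, column index fixed

print(second_row)     # [4 5 6]
print(second_column)  # [2 5 8]
print(b[1, :2])       # [4 5]
```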

<h4 style="margin-bottom:0">Case 10</h4>

<p style="margin-top:0; text-align: justify">If we want to pick all of the elements in the corners of array b:</p>

In [23]:
b[::2,::2]

array([[1, 3],
       [7, 9]])

![slicing10.png](attachment:slicing10.png)

<h4 style="margin-bottom:0">Case 11</h4>

<p style="margin-top:0; text-align: justify">If we want to pick all of the elements in array c that are in the second or third column:</p>

In [24]:
c[:,:,1:3]

array([[[ 2.,  3.],
        [ 5.,  6.],
        [ 8.,  9.]],

       [[11., 12.],
        [14., 15.],
        [17., 18.]],

       [[20., 21.],
        [23., 24.],
        [26., 27.]]])

![slicing11.png](attachment:slicing11.png)

<h4 style="margin-bottom:0">Case 12</h4>

<p style="margin-top:0; text-align: justify">Apart from the examples above, we can do boolean indexing by passing a boolean condition between square brackets. Returned array will contain all of the values that satisfy the boolean condition. If we want to retrieve all of the numbers that are greater than 22 in array c: </p>

In [25]:
c[c>22]

array([23., 24., 25., 26., 27.])

![slicing12.png](attachment:slicing12.png)

<h4 style="margin-bottom:0">Case 13</h4>

<p style="margin-top:0; text-align: justify">Finally, we can combine multiple boolean conditions by using NumPy's logical_and function. If we want to retrieve all of the numbers that are greater than 22 but less than 26 in array c: </p>

In [26]:
c[np.logical_and(c>22, c<26)]

array([23., 24., 25.])

![slicing13.png](attachment:slicing13.png)
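<p style="margin-top:0; text-align: justify">Boolean conditions can also be combined with the element-wise operators & (and) and | (or). The sketch below rebuilds an array with the same values as c using np.arange and reshape, which is an assumption made only to keep the example self-contained; the names mask and either are hypothetical:</p>

```python
import numpy as np

# Same values as array c in this tutorial: 1.0 to 27.0 in a 3x3x3 array.
c = np.arange(1, 28, dtype=float).reshape(3, 3, 3)

# Each comparison needs its own parentheses because & binds
# more tightly than the comparison operators.
mask = (c > 22) & (c < 26)
print(c[mask])  # [23. 24. 25.]

# | keeps the elements that satisfy at least one condition.
either = (c < 3) | (c > 26)
print(c[either].tolist())  # [1.0, 2.0, 27.0]
```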

<h3 style="margin-bottom:0">6) Mathematics</h3>
<br>
<p style="margin-top:0; text-align: justify">We may need to perform mathematical operations on arrays.</p>
<br>
<p style="margin-top:0; text-align: justify">For example, we can perform the four basic arithmetic operations element by element:</p>

In [27]:
np.add(b, e)

array([[ 2.,  3.,  4.],
       [ 5.,  6.,  7.],
       [ 8.,  9., 10.]])

In [28]:
np.subtract(b, e)

array([[0., 1., 2.],
       [3., 4., 5.],
       [6., 7., 8.]])

In [29]:
np.multiply(b, e)

array([[1., 2., 3.],
       [4., 5., 6.],
       [7., 8., 9.]])

In [30]:
np.divide(b, e)

array([[1., 2., 3.],
       [4., 5., 6.],
       [7., 8., 9.]])

We can also apply these functions to an array and a single integer or float; the number is combined with every element of the array.

In [31]:
np.add(b, 3)

array([[ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12]])

In [32]:
np.multiply(b, 2)

array([[ 2,  4,  6],
       [ 8, 10, 12],
       [14, 16, 18]])
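<p style="margin-top:0; text-align: justify">The same operations can also be written with Python's arithmetic operators. The sketch below recreates b and e so that it runs on its own; the names same_add, same_sub, same_mul, same_div and halves are hypothetical:</p>

```python
import numpy as np

b = np.array([(1, 2, 3), (4, 5, 6), (7, 8, 9)])
e = np.ones((3, 3))

# Each operator gives the same result as the matching NumPy function.
same_add = np.array_equal(b + e, np.add(b, e))
same_sub = np.array_equal(b - e, np.subtract(b, e))
same_mul = np.array_equal(b * 2, np.multiply(b, 2))
same_div = np.array_equal(b / e, np.divide(b, e))
print(same_add, same_sub, same_mul, same_div)  # True True True True

# True division always produces floating-point results,
# even when the input array holds integers.
halves = b / 2
print(halves.dtype)  # float64
```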

<p style="margin-top:0; text-align: justify">Or, we can compute the exponential of each element (e raised to the power of that element):</p>

In [33]:
np.exp(b)

array([[2.71828183e+00, 7.38905610e+00, 2.00855369e+01],
       [5.45981500e+01, 1.48413159e+02, 4.03428793e+02],
       [1.09663316e+03, 2.98095799e+03, 8.10308393e+03]])

<p style="margin-top:0; text-align: justify">And, we can find the square root of each element:</p>

In [34]:
np.sqrt(b)

array([[1.        , 1.41421356, 1.73205081],
       [2.        , 2.23606798, 2.44948974],
       [2.64575131, 2.82842712, 3.        ]])

We can find the sum of all the elements in an array as follows:

In [35]:
c.sum()

378.0

We can find the maximum element of an array as follows (use .min() instead to find the minimum):

In [36]:
c.max()

27.0

We can find the mean of the elements in an array as follows (for the median, use the function np.median(c); arrays have no median() method):

In [37]:
c.mean()

14.0

We can find the standard deviation of the elements in an array as follows:

In [38]:
np.std(c)

7.788880963698615
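<p style="margin-top:0; text-align: justify">The sketch below collects the remaining statistics mentioned above on an array with the same values as c (rebuilt here with np.arange, an assumption made to keep the example self-contained). The axis argument in the last lines goes beyond what this tutorial covers and is shown only as an optional extra; the name per_position is hypothetical:</p>

```python
import numpy as np

# Same values as array c in this tutorial.
c = np.arange(1, 28, dtype=float).reshape(3, 3, 3)

print(c.min())               # 1.0
print(np.median(c))          # 14.0
print(hasattr(c, "median"))  # False: median is a function, not a method

# Optional extra: axis=0 sums across the three pages,
# leaving one total per (row, column) position.
per_position = c.sum(axis=0)
print(per_position.shape)    # (3, 3)
print(per_position[0, 0])    # 30.0  (1 + 10 + 19)
```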

We can compare two arrays element by element, which gives an array of True and False values:

In [39]:
b == e

array([[ True, False, False],
       [False, False, False],
       [False, False, False]])

We can also check whether two arrays are equal as a whole (same shape and same elements), which gives a single True or False:

In [40]:
np.array_equal(b,e)

False
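<p style="margin-top:0; text-align: justify">The two kinds of comparison are related, as the following minimal sketch suggests. It recreates b and e so that it runs on its own, and the name elementwise is hypothetical:</p>

```python
import numpy as np

b = np.array([(1, 2, 3), (4, 5, 6), (7, 8, 9)])
e = np.ones((3, 3))

elementwise = b == e      # one True/False per position
print(elementwise.any())  # True: the top-left elements match (1 == 1.0)
print(elementwise.all())  # False: not every position matches

# array_equal condenses the comparison into a single answer.
print(np.array_equal(b, e))                # False
print(np.array_equal(e, np.ones((3, 3))))  # True
```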

<h3 style="margin-bottom:0">7) Other Useful Functions</h3>
<br>
<p style="margin-top:0; text-align: justify">In this section, we will present other useful functions that can be used from NumPy library.</p>
<br>
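<p style="margin-top:0; text-align: justify">For example, you can transpose an array with np.transpose. For a 2D matrix this swaps rows and columns; for an array with more dimensions, such as c, it reverses the order of the axes:</p>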

In [41]:
np.transpose(c)

array([[[ 1., 10., 19.],
        [ 4., 13., 22.],
        [ 7., 16., 25.]],

       [[ 2., 11., 20.],
        [ 5., 14., 23.],
        [ 8., 17., 26.]],

       [[ 3., 12., 21.],
        [ 6., 15., 24.],
        [ 9., 18., 27.]]])
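<p style="margin-top:0; text-align: justify">Because c has the same size along every axis, the reversal of the axes is easier to see on arrays with unequal sizes. The sketch below uses two hypothetical arrays, m and x, that do not appear elsewhere in this tutorial:</p>

```python
import numpy as np

# 2D case: rows become columns.
m = np.array([[1, 2, 3], [4, 5, 6]])
print(m.shape)                # (2, 3)
print(np.transpose(m).shape)  # (3, 2)
print(np.transpose(m)[0])     # [1 4]

# 3D case: the order of the axes is reversed.
x = np.zeros((2, 3, 4))
print(np.transpose(x).shape)  # (4, 3, 2)
```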

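We can split an array horizontally with np.hsplit (for arrays with two or more dimensions, the split is made along the second axis). The result is a list of sub-arrays, so if we store it in a variable and look at its first element, we see that it is an array: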

In [42]:
i = np.hsplit(c,3)
i[0]

array([[[ 1.,  2.,  3.]],

       [[10., 11., 12.]],

       [[19., 20., 21.]]])

We can split an array vertically with np.vsplit (the split is made along the first axis). Again, the result is a list of sub-arrays, and its first element is an array:

In [43]:
j = np.vsplit(c,3)
j[0]

array([[[1., 2., 3.],
        [4., 5., 6.],
        [7., 8., 9.]]])
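<p style="margin-top:0; text-align: justify">The two splits are perhaps easier to picture on a 2D array. The following is a minimal sketch on the array b from earlier; the names columns and rows are hypothetical:</p>

```python
import numpy as np

b = np.array([(1, 2, 3), (4, 5, 6), (7, 8, 9)])

columns = np.hsplit(b, 3)  # three pieces, each one column wide
rows = np.vsplit(b, 3)     # three pieces, each one row tall

print(type(columns))       # <class 'list'>
print(columns[0].shape)    # (3, 1)
print(rows[0].shape)       # (1, 3)
print(rows[0])             # [[1 2 3]]
```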

We can sort an array in place. The sort() method returns nothing; for a 2D array it sorts the values within each row (along the last axis), and when we display the array again, we see that it is sorted:

In [44]:
h.sort()
h

array([[0.22685145, 0.28613933, 0.69646919],
       [0.42310646, 0.55131477, 0.71946897],
       [0.4809319 , 0.68482974, 0.9807642 ]])
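<p style="margin-top:0; text-align: justify">If the original order must be kept, the function np.sort returns a sorted copy instead. The sketch below contrasts the two on a small hypothetical array x; the names sorted_copy and result are also hypothetical:</p>

```python
import numpy as np

x = np.array([[3, 1, 2], [9, 7, 8]])

# np.sort returns a new array and leaves x untouched.
sorted_copy = np.sort(x)
print(sorted_copy.tolist())  # [[1, 2, 3], [7, 8, 9]]
print(x.tolist())            # [[3, 1, 2], [9, 7, 8]]

# The sort() method changes x itself and returns None.
result = x.sort()
print(result)                # None
print(x.tolist())            # [[1, 2, 3], [7, 8, 9]]
```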

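We can get a reshaped version of an array as follows (the new shape must contain the same number of elements, and the original array keeps its own shape):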

In [45]:
h.reshape(9,1)

array([[0.22685145],
       [0.28613933],
       [0.69646919],
       [0.42310646],
       [0.55131477],
       [0.71946897],
       [0.4809319 ],
       [0.68482974],
       [0.9807642 ]])
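<p style="margin-top:0; text-align: justify">A minimal sketch of reshaping on a hypothetical array named grid is given below. Passing -1 for one dimension, which asks NumPy to work out that size from the number of elements, is an optional extra not covered elsewhere in this tutorial:</p>

```python
import numpy as np

grid = np.arange(1, 10).reshape(3, 3)  # values 1..9 in a 3x3 array

column = grid.reshape(9, 1)     # explicit shape
inferred = grid.reshape(-1, 1)  # -1 lets NumPy infer the 9
flat = grid.reshape(-1)         # back to one dimension

print(column.shape)    # (9, 1)
print(inferred.shape)  # (9, 1)
print(flat.tolist())   # [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(grid.shape)      # (3, 3): the original shape is unchanged
```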

We can insert a new element with np.insert. First we pass the array, then the index, and finally the value. Because no axis is given, the array is flattened before the value is inserted. Note that the original array does not change; assign the result to a new variable if you want to use it later:

In [46]:
np.insert(h, 5, 2)

array([0.22685145, 0.28613933, 0.69646919, 0.42310646, 0.55131477,
       2.        , 0.71946897, 0.4809319 , 0.68482974, 0.9807642 ])

We can delete an element with np.delete. First we pass the array, then the index. As with np.insert, the array is flattened because no axis is given, and the original array does not change; assign the result to a new variable if you want to use it later:

In [47]:
np.delete(h, 1)

array([0.22685145, 0.69646919, 0.42310646, 0.55131477, 0.71946897,
       0.4809319 , 0.68482974, 0.9807642 ])
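<p style="margin-top:0; text-align: justify">To keep the 2D shape, both functions accept an axis argument. The sketch below uses a small hypothetical array named table; the names flat_insert, with_row and without_column are also hypothetical:</p>

```python
import numpy as np

table = np.array([[1, 2, 3], [4, 5, 6]])

# Without axis, the array is flattened first.
flat_insert = np.insert(table, 5, 99)
print(flat_insert.tolist())     # [1, 2, 3, 4, 5, 99, 6]

# With axis=0, a whole row is inserted before row index 1.
with_row = np.insert(table, 1, [7, 8, 9], axis=0)
print(with_row.tolist())        # [[1, 2, 3], [7, 8, 9], [4, 5, 6]]

# With axis=1, the column at index 1 is removed.
without_column = np.delete(table, 1, axis=1)
print(without_column.tolist())  # [[1, 3], [4, 6]]

print(table.tolist())           # [[1, 2, 3], [4, 5, 6]]: unchanged
```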

<h3 style="margin-bottom:0">8) Input and Output</h3>
<br>
<p style="margin-top:0; text-align: justify">In this section, we will present how to save or load NumPy arrays.</p>
<br>
<p style="margin-top:0; text-align: justify">For example, you can save your array as a NumPy file by putting the desired directory in place of "..." and choosing a file name (the .npy extension is added automatically):</p>

In [None]:
np.save('.../new_file',h)

You can load NumPy array files by:

In [None]:
np.load('.../new_file.npy')

You can also load txt or csv files as follows (use the delimiter argument to state which character separates the values, such as a comma for csv files):

In [None]:
np.loadtxt('.../new_file.txt')

In [None]:
np.genfromtxt('.../new_file.csv', delimiter=',')

And you can save your arrays into txt or csv files like:

In [None]:
np.savetxt('.../new_file.txt', h, delimiter=' ')

In [None]:
np.savetxt('.../new_file.csv', h, delimiter=',')
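<p style="margin-top:0; text-align: justify">The sketch below shows a full save-and-load round trip. It writes into a temporary folder created with Python's tempfile module so that it does not depend on any particular directory; the file names example.npy and example.csv and the array data are hypothetical:</p>

```python
import os
import tempfile

import numpy as np

data = np.array([[1.5, 2.5], [3.5, 4.5]])

# The temporary folder and its files are removed when the block ends.
with tempfile.TemporaryDirectory() as folder:
    npy_path = os.path.join(folder, "example.npy")
    csv_path = os.path.join(folder, "example.csv")

    # Binary NumPy format.
    np.save(npy_path, data)
    from_npy = np.load(npy_path)

    # Text format with a comma between values.
    np.savetxt(csv_path, data, delimiter=",")
    from_csv = np.loadtxt(csv_path, delimiter=",")

print(np.array_equal(data, from_npy))  # True
print(np.array_equal(data, from_csv))  # True
```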

<h3 style="margin-bottom:0">9) Help, References & Useful Links</h3>

You can look at how a function works directly in Jupyter Notebook. For example, if you do not understand how np.ones function works, you can simply write:

In [48]:
np.info(np.ones)

 ones(shape, dtype=None, order='C', *, like=None)

Return a new array of given shape and type, filled with ones.

Parameters
----------
shape : int or sequence of ints
    Shape of the new array, e.g., ``(2, 3)`` or ``2``.
dtype : data-type, optional
    The desired data-type for the array, e.g., `numpy.int8`.  Default is
    `numpy.float64`.
order : {'C', 'F'}, optional, default: C
    Whether to store multi-dimensional data in row-major
    (C-style) or column-major (Fortran-style) order in
    memory.
like : array_like
    Reference object to allow the creation of arrays which are not
    NumPy arrays. If an array-like passed in as ``like`` supports
    the ``__array_function__`` protocol, the result will be defined
    by it. In this case, it ensures the creation of an array object
    compatible with that passed in via this argument.

    .. note::
        The ``like`` keyword is an experimental feature pending on
        acceptance of :ref:`NEP 35 <NEP35>`.

    .. versionadded::

<hr style="height:2px;color:navy;margin-top:0">
<p style="margin-top:0; text-align: justify">This tutorial is prepared with the help of <a href="https://numpy.org/devdocs/user/absolute_beginners.html">original website</a> documentation.</p>
<br>
<p style="margin-top:0; text-align: justify">You can always refer to this documentation, as it is comprehensive. If you cannot find a solution there, you are very likely to find an answer to your question on the internet, since this library is very widely used. If you still cannot find an answer, please do not hesitate to ask the course assistants.</p>

<h4 style="margin-bottom:0">Useful Links</h4>

<p style="margin-top:0; text-align: justify">Here are some useful tutorials or blogs that are related to NumPy:</p>

<p style="margin-top:0; text-align: justify"><a href="https://numpy.org/devdocs/user/absolute_beginners.html">Original Website:</a> This is the website of NumPy. You may find tons of useful material that covers each aspect of the NumPy library.</p>

<p style="margin-top:0; text-align: justify"><a href="https://www.youtube.com/watch?v=QUT1VHiLmmI">Python NumPy Tutorial for Beginners:</a> This is a YouTube video provided by freeCodeCamp.org. It covers the fundamentals of the NumPy library and also touches on linear algebra, mathematics and statistics.</p>

<p style="margin-top:0; text-align: justify"><a href="https://rukshanpramoditha.medium.com/numpy-for-data-science-part-2-7399ffc605e5">A Complete Step-by-Step Guide to NumPy Array Indexing and Slicing:</a> This Medium post is about indexing and slicing. You may find many examples of both here.</p>