<h1 align = "center">NumPy Fundamentals</h1>

![](https://bit.ly/2ZVLj7m)


<div class="alert alert-info">
<b>
<p>NumPy is a must-learn Python library for data science. NumPy, which stands for <b>Numerical Python</b>, is a tool for scientific computing and for performing basic and advanced array operations. It provides a multi-dimensional array object that goes well beyond what Python lists offer. It is an open source project which you can freely use. It was created by Travis Oliphant in 2005 and now has many contributors.</p>
<p>Because NumPy is open source, you can always check out its source code in its <a href = https://github.com/numpy/numpy>GitHub repository</a></p>

<h3>Why NumPy</h3>

<ul>
    <li>A NumPy array is similar to a Python list, but it is usually much faster. NumPy is often quoted as being up to 50X faster than the traditional Python list, although the actual speed-up depends on the operation and the size of the data.</li>
    <li>It is also convenient and uses less memory when compared to a Python list</li></ul>

<p>This is why it is one of the popular libraries in data science where resources and speed are very important. </p>

<p>In this notebook, you will learn:</p>
<p>    <ul>
    <li>NumPy arrays</li>
    <li>NumPy arrays vs Python list</li>
    <li>Array dimensions</li>
    <li>Array indexing</li>
    <li>Array slicing</li>
    <li>NumPy data types</li>
    <li>Iterating over arrays</li>
    <li>Joining arrays</li>
    <li>Searching and sorting arrays </li>
     <li>Filtering an array</li>
     <li>NumPy random numbers</li>
     <li>NumPy arange function</li>
     <li>NumPy arithmetic operations</li>
    <li>NumPy string operations</li>
     </ul></p>

</b>

<h4>Prerequisite</h4>
<p>You should have a basic understanding of Python or any other programming language.<br>If you are new to programming, you can check out my beginner-friendly interactive notebook on <a href = https://github.com/Abisola-ds/Python-Fundamentals> Python Fundamentals</a></p>
<br><br>
Time Required: <strong>4 Hours</strong>
</div>


<h3>Getting started</h3>

<div class="alert alert-info">
<p>
The first thing you need to do is to install NumPy. However, if you are using a Python distribution that already has NumPy installed (e.g. Anaconda or Spyder), then you do not need to install NumPy. You can skip the installation step.</p>
    <p>To install NumPy, you can use either of the Python package managers shown in the commands below:</p>
    <b><p >pip install numpy<br>
    conda install numpy</p></b>
    <p>Once NumPy is installed, you import it in your program by using the import keyword as shown below:</p>
    <b><p>import numpy</P></b>
    <p>To confirm if you have NumPy installed and to check the version, you can use the command: </p>
    <b><p>print(numpy.__version__)</p></b>
 <p>Now, NumPy is ready for use. Let's try it out</p>
 </div>

In [1]:
import numpy

a = numpy.array([3,6,8,1,0,3,4])

print(a)

[3 6 8 1 0 3 4]


<div class="alert alert-info">
<p>
    You will often see the convention:<br> <b>import numpy as np</b><br>
This is because NumPy is usually imported under the <b>'np'</b> alias. In Python, an alias is an alternate name for referring to the same thing. We use an alias to save time and to keep our code standardized so that anyone working with it can easily understand it. An alias is created using the keyword <b>'as'</b> in the import statement.</p>

In [2]:
import numpy as np                        #Now the NumPy package can be referred to as 'np' instead of numpy.

a = np.array([3,6,8,1,0,3,4])

print(a)

[3 6 8 1 0 3 4]


<h3>NumPy arrays</h3>

<div class="alert alert-info">
<p>
NumPy is used to work with arrays. The array object in NumPy is called <b>ndarray</b>.
The array is the central data structure of the NumPy library. An array is a grid of values, and it contains information about the raw data, how to locate an element, and how to interpret an element. The elements of the grid can be indexed in various ways.</p>

<p>We can create a NumPy ndarray object by using the <b>array()</b> function.</p>


In [10]:
a = np.array([1, 2, 3, 4, 5])

print(a)

print(type(a))

[1 2 3 4 5]
<class 'numpy.ndarray'>


<h3>Numpy array vs Python list</h3>

<div class="alert alert-info">
<p>As we said earlier, the NumPy array is much faster than the Python list. We also mentioned that NumPy consumes less memory than the Python list. We will demonstrate both claims in this section.</p>
<p>We will import two packages (sys and time) to help prove this point.</p>
<p>Note: You don't have to understand the code in this section. We just want to prove the point above. So, if you don't understand it, do not worry, as you will come to understand it in the course of this notebook.</p>
</div>

In [11]:
import time
import sys

<h4>Memory Comparison</h4>

In [12]:
#Memory consumed by a Python list (estimate: size of one integer object times the number of elements)

b = list(range(1000))
print(sys.getsizeof(5)*len(b))

28000


In [13]:
#Memory consumed by Numpy array

c = np.arange(1000)
print(c.size*c.itemsize)

4000
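The estimate above multiplies the size of one integer object by the number of elements. A slightly fuller sketch, assuming the standard CPython interpreter, also counts the list container itself and uses the array's nbytes attribute. The variable names here are only illustrative, and the exact numbers depend on your platform and Python version.

```python
import sys
import numpy as np

numbers = list(range(1000))

# Size of the list container plus the size of every integer object it refers to
list_bytes = sys.getsizeof(numbers) + sum(sys.getsizeof(n) for n in numbers)

arr = np.arange(1000)

# nbytes is the memory used by the array data: number of elements * bytes per element
array_bytes = arr.nbytes

print("Python list:", list_bytes, "bytes")
print("NumPy array:", array_bytes, "bytes")
```

The array stores its values side by side in one block of memory, while the list stores references to separate integer objects, which is where most of the difference comes from.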


<h4>Time Comparison</h4>

In [16]:
size = 100000
L1 = list(range(size))
L2 = list(range(size))
A1 = np.arange(size)
A2 = np.arange(size)


#Where L1 and L2 are Python lists and A1 and A2 are NumPy arrays

In [17]:
#Python
start = time.time()
result = [(x+y) for x,y in zip(L1, L2)]
print("Python list took:", (time.time() - start)*1000 , "milliseconds")

Python list took: 335.9971046447754 milliseconds


In [18]:
#Numpy
start = time.time()
result = A1 + A2
print("NumPy array took:", (time.time() - start)*1000, "milliseconds")

NumPy array took: 70.8918571472168 milliseconds
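A single measurement with time.time() can vary a lot from run to run. As a rough sketch, the standard library timeit module can repeat each operation several times, which gives a steadier comparison. The variable names and the choice of 20 repetitions are assumptions made for this illustration, and the timings you get will differ from machine to machine.

```python
import timeit
import numpy as np

size = 100_000
list_1 = list(range(size))
list_2 = list(range(size))
arr_1 = np.arange(size)
arr_2 = np.arange(size)

# Run each addition 20 times and report the total time in seconds
list_time = timeit.timeit(lambda: [x + y for x, y in zip(list_1, list_2)], number=20)
array_time = timeit.timeit(lambda: arr_1 + arr_2, number=20)

print("Python lists:", list_time, "seconds")
print("NumPy arrays:", array_time, "seconds")

# Both approaches compute the same values
list_result = [x + y for x, y in zip(list_1, list_2)]
array_result = arr_1 + arr_2
```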


<h3>Array Dimensions</h3>

<div class="alert alert-info">
<p>Just like we can have nested lists in Python, we can also have nested arrays in NumPy. Nested arrays are arrays that have arrays as their elements. The number of dimensions of an array is its level of nesting (array depth). An array can have any number of dimensions.</p>
    <p>Let's look at some common dimensional arrays</p>
     <ul>
         <li><b>0-Dimensional Arrays:</b> 0-Dimensional arrays are also referred to as scalars. They are arrays with only a single element. E.g. 57</li>
         <li><b>1-Dimensional Arrays:</b> 1-Dimensional arrays are arrays that have 0-Dimensional arrays as their elements. These are the most basic and common type of array. E.g. [1, 2, 3, 4, 5]</li>
         <li><b>2-Dimensional Arrays:</b> 2-Dimensional arrays are arrays that have 1-Dimensional arrays as their elements. This type of array is often used to represent matrices. E.g. [[1, 2, 3], [4, 5, 6]]</li>
         <li><b>3-Dimensional Arrays:</b> 3-Dimensional arrays are arrays that have 2-Dimensional arrays as their elements. These are often used to represent a 3rd order tensor.</li>
     </ul>
    
</div>

In [21]:
#Creating a 0-Dimensional array with value 57

a = np.array(57)

print(a)

57


In [22]:
#Creating a 1-Dimensional array with values 1, 2, 3, 4, 5 

a = np.array([1, 2, 3, 4, 5 ])

print(a)

[1 2 3 4 5]


In [24]:
#Creating a 2-Dimensional array with values 1, 2, 3 and 4,5,6

a = np.array([[1, 2, 3], [4,5,6]])

print(a)

[[1 2 3]
 [4 5 6]]


In [33]:
#Creating a 3-Dimensional array; mixing numbers and strings makes NumPy store every element as a string

a = np.array([[[1, 2, 3], [4,5,6]], [["a", "b", "c"], ["d","e","f"]]])

print(a)

[[['1' '2' '3']
  ['4' '5' '6']]

 [['a' 'b' 'c']
  ['d' 'e' 'f']]]


<div class="alert alert-info">
<p>You can go on like that, creating any number of dimensions for the array.</p>
    <p>To check the number of dimensions of your array, you can use the <b>ndim</b> attribute, which returns an integer that tells you how many dimensions the array has.</p>
    <p>NumPy arrays have an attribute called <b>shape</b> that returns a tuple where the number of elements in the tuple signifies the number of dimensions and each element signifies the number of elements along that dimension. Simply put, the shape of an array is the number of elements in each dimension. For example, if the shape attribute returns (2, 3), this means that the array has 2 dimensions: the outer array has 2 elements, and each of those is an array of 3 elements. Also, if the shape attribute returns (2, 2, 3), this means that the array has 3 dimensions, where the outermost array and the next one have 2 elements each, while the innermost arrays have 3 elements each.</p>
  <p>When the array is created, you can specify the minimum number of dimensions it should have using the <b>ndmin</b> argument.</p>
 </div>

In [37]:
#Using the ndim to check the number of dimensions of an array
arr_1 = np.array(57)
arr_2 = np.array([1, 2, 3, 4, 5])
arr_3= np.array([[1, 2, 3], [4, 5, 6]])
arr_4 = np.array([[[1, 2, 3], [4, 5, 6]], [["a", "b", "c"], ["d","e","f"]]])

print(arr_1.ndim)
print(arr_2.ndim)
print(arr_3.ndim)
print(arr_4.ndim)

0
1
2
3


In [87]:
#Using the ndmin to specify the number of dimensions of an array
a = np.array([1, 2, 3, 4,5], ndmin=5)

print(a)
print('The number of dimensions is:', a.ndim)

[[[[[1 2 3 4 5]]]]]
The number of dimensions is: 5


<div class="alert alert-danger">
Tip: One quick way to know the number of dimensions of an array is by counting the opening square brackets at the start of the printed array. 
The number of opening square brackets equals the number of dimensions of the array. 
</div>

In [88]:
#Checking the shape of an array
print(arr_1.shape)
print(arr_2.shape)
print(arr_3.shape)
print(arr_4.shape)

()
(5,)
(2, 3)
(2, 2, 3)
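The relationship between shape, ndim and the total number of elements can be checked directly. The short sketch below uses the size attribute, which has not been introduced above; it returns the total number of elements in the array.

```python
import numpy as np

arr = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])

print(arr.shape)   # (2, 2, 3)
print(arr.ndim)    # 3, the length of the shape tuple
print(arr.size)    # 12, the product 2 * 2 * 3

# ndim always equals the number of entries in shape
dims_match = len(arr.shape) == arr.ndim

# size always equals the product of the entries in shape
size_match = arr.size == int(np.prod(arr.shape))

print(dims_match, size_match)
```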


<div class="alert alert-info">
NumPy arrays have a <b>reshape</b> method for reshaping an array. Reshaping means changing the shape of an array. The shape of an array is the number of elements in each dimension. By reshaping, we can add or remove dimensions or change the number of elements in each dimension.
    
<p>Note: When using the reshape method, ensure that the new shape you are specifying is achievable given the number of elements in the array. If not, you will get an error.</p>


In [95]:
a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10,11,12])
print(a.shape)


(12,)


In [98]:
#Reshaping from 1-D to 2-D

new_arr = a.reshape(4, 3)
print(new_arr.shape)
print(new_arr)

(4, 3)
[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]


In [99]:
#Reshaping from 1-D to 3-D

new_arr = a.reshape(2, 3,2)
print(new_arr.shape)
print(new_arr)

(2, 3, 2)
[[[ 1  2]
  [ 3  4]
  [ 5  6]]

 [[ 7  8]
  [ 9 10]
  [11 12]]]


In [106]:
a = np.array([[1, 2, 3, 4, 5, 6],[7, 8, 9, 10,11,12]])
print(a.shape)

(2, 6)


In [107]:
#Reshaping from 2-D to 3-D

new_arr = a.reshape(2, 3,2)
print(new_arr.shape)
print(new_arr)

(2, 3, 2)
[[[ 1  2]
  [ 3  4]
  [ 5  6]]

 [[ 7  8]
  [ 9 10]
  [11 12]]]
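When reshaping, you do not have to work out every dimension yourself. As a small sketch, passing -1 for one dimension asks NumPy to calculate it from the number of elements. The variable names are only illustrative.

```python
import numpy as np

a = np.arange(1, 13)            # 12 elements: 1 to 12

rows_fixed = a.reshape(3, -1)   # NumPy works out that the second dimension must be 4
cols_fixed = a.reshape(-1, 6)   # NumPy works out that the first dimension must be 2

print(rows_fixed.shape)
print(cols_fixed.shape)
```

Only one dimension can be given as -1, and the other dimensions must still divide the number of elements exactly.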


<div class="alert alert-info">
    Converting a multi-dimensional array to a 1-dimensional array is known as <b>flattening an array.</b> We can use the flatten() method or reshape(-1) to do this.
   </div>

In [26]:
#Flattening an array using the flatten() method
a = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
print(a.shape)

(2, 2, 3)


In [27]:
new_arr = a.flatten()
print(new_arr.shape)
print(new_arr)

(12,)
[ 1  2  3  4  5  6  7  8  9 10 11 12]


In [28]:
#Flattening an array using reshape(-1)

a = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
print(a.shape)

(2, 2, 3)


In [None]:
new_arr = a.reshape(-1)
print(new_arr.shape)
print(new_arr)
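The two approaches give the same values but differ in one detail: flatten() always returns a copy, while reshape(-1) returns a view of the original data whenever it can. The sketch below, with illustrative variable names, shows what that means when you modify the result.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

flat_copy = a.flatten()      # always a new, independent array
flat_view = a.reshape(-1)    # a view of the same data here, because `a` is contiguous

flat_copy[0] = 100           # does not affect `a`
print(a[0, 0])               # still 1

flat_view[0] = 999           # changes `a` as well
print(a[0, 0])               # now 999
```

If you need to be sure that changes do not leak back into the original array, flatten() or an explicit .copy() is the safer choice.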

<div class="alert alert-info">There is also a <b>resize()</b> function in NumPy that works almost like the reshape() function, except that np.resize() will not generate an error if the shape specified is not achievable with the number of elements. Instead, it repeats the elements of the array to fill the new shape, or drops the extra elements if the new shape is smaller.</div>

In [67]:
#Expect an error

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10,11,12])
np.reshape(a, [3,5])

ValueError: cannot reshape array of size 12 into shape (3,5)

In [69]:
#Now, let's change reshape to resize
a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10,11,12])
np.resize(a, [3,5])

array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10],
       [11, 12,  1,  2,  3]])

<h3>Array Indexing</h3>

<div class="alert alert-info">
Just like we use list indexing to access elements in a list, we also have array indexing. You can access an array element by referring to its index number. Indexing in NumPy arrays starts at 0, meaning that the first element has index 0, and the second has index 1. Negative indexes count from the end, so index -1 refers to the last element.
</div>

In [42]:
a = np.array([1, 2, 3, 4,5])

print(a[0])
print(a[1])
print(a[-1]- a[0])

1
2
4


<div class="alert alert-info">
This method is for accessing 1-dimensional arrays. To access an element of a 2- or 3-dimensional array, you use comma-separated integers, one index for each dimension. Check the examples below to understand better.
</div>

In [48]:
#Accessing a 2-dimensional array
a = np.array([[1, 2, 3], [4,5,6]])
print(a)

print('3rd element of the 2nd row: ', a[1, 2])
print('2nd element of the 1st row: ', a[0, 1])
print('1st element of the 2nd row: ', a[1, 0])

[[1 2 3]
 [4 5 6]]
3rd element of the 2nd row:  6
2nd element of the 1st row:  2
1st element of the 2nd row:  4


In [50]:
#Accessing a 3-dimensional array
a = np.array([[[1, 2, 3], [4,5,6]], [["a", "b", "c"], ["d","e","f"]]])


print(a[1,1,1])

e


<div class="alert alert-info">
<p>print(a[1,1,1]) above returned e. This is because we are indexing a[1,1,1]. See the explanation below: </p>

<p>The first number represents the first dimension, which contains two elements:
[[1, 2, 3], [4,5,6]]
and:
[["a", "b", "c"], ["d","e","f"]]
. Since we selected 1, it means we are referencing the second element, i.e. [["a", "b", "c"], ["d","e","f"]]
</p><p>
The second number represents the second dimension, which also contains two elements:
["a", "b", "c"]
and:
["d","e","f"]
. Since we selected 1, it means we are referencing the second element of this second array, i.e. ["d","e","f"]
</p><p>
The third number represents the third dimension, which contains three elements: d,e,f. Since we selected 1, it means we are referencing the second element, i.e. 'e'</p>
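The same lookup can be done one dimension at a time, which may make the explanation easier to follow. This is only a sketch with illustrative variable names; in practice a[1, 1, 1] is the usual way to write it.

```python
import numpy as np

a = np.array([[[1, 2, 3], [4, 5, 6]], [["a", "b", "c"], ["d", "e", "f"]]])

step_1 = a[1]          # second 2-D block: the one holding the letters
step_2 = step_1[1]     # second row of that block: ['d' 'e' 'f']
step_3 = step_2[1]     # second element of that row: 'e'

print(step_1)
print(step_2)
print(step_3)
```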


<h3>Array slicing</h3>

<div class="alert alert-info">
We can also slice NumPy arrays. We can reference a range of elements using the format below:<br>

We pass a slice instead of an index, like this: <b>arr[start:end].</b>
<br>
We can also define the step, like this: <b>arr[start:end:step].</b>
<br>
If you don't specify the start index, it defaults to 0. Likewise, if you don't specify the end index, it defaults to the length of the array in that dimension. If you don't specify the step, it defaults to 1.

<p>Note: The result includes the start index, but excludes the end index.</p>
</div>
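Slices also accept negative numbers, which count from the end of the array, and a negative step, which walks backwards. The sketch below is a small illustration with made-up variable names.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7, 8])

last_three = a[-3:]       # from the third-to-last element to the end
without_ends = a[1:-1]    # drops the first and the last element
reversed_a = a[::-1]      # a step of -1 walks through the array backwards

print(last_three)
print(without_ends)
print(reversed_a)
```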

<h4>Slicing 1-D arrays</h4>


In [70]:
a = np.array([1, 2, 3, 4,5,6,7,8])

print(a[1:3])
print(a[1:])


[2 3]
[2 3 4 5 6 7 8]


In [71]:
print(a[:3])
print(a[0:5:2])
print(a[::2])

[1 2 3]
[1 3 5]
[1 3 5 7]


<h4>Slicing 2-D arrays</h4>

In [72]:
a = np.array([[1, 2, 3], [4,5,6]])

print(a[1, :])


[4 5 6]


In [74]:
print(a[0, :2])              #for the first row, return index 0 up to (but excluding) index 2


[1 2]


In [75]:
print(a[:, 2])              #for both rows, return the element at index 2

[3 6]


In [76]:
print("For both rows, this is index 0 to 2 (excluding index 2)")
print(a[0:2, 0:2])

For both rows, this is index 0 to 2 (excluding index 2)
[[1 2]
 [4 5]]


<h4>Slicing 3-D arrays</h4>

In [78]:
a = np.array([[[1, 2, 3], [4,5,6]], [["a", "b", "c"], ["d","e","f"]]])

print(a[1, 1, :])

['d' 'e' 'f']


In [79]:
print(a[1, :, :2])

[['a' 'b']
 ['d' 'e']]


In [80]:
print(a[:, :, 1:])

[[['2' '3']
  ['5' '6']]

 [['b' 'c']
  ['e' 'f']]]


<h3>NumPy Data types</h3>

<div class="alert alert-info">
        <p>NumPy has some additional data types beyond the conventional Python data types. Below is a list of the kinds of data types in NumPy and the characters used to represent them.</p><br>

<li>i - integer</li>
<li>b - boolean</li>
<li>u - unsigned integer</li>
<li>f - float</li>
<li>c - complex float</li>
<li>m - timedelta</li>
<li>M - datetime</li>
<li>O - object</li>
<li>S - byte string</li>
<li>U - unicode string</li>
<li>V - fixed chunk of memory for other type ( void )</li>

<p>To check the data type of an array, you use the <b>dtype</b> attribute.</p>


In [81]:
a = np.array([1, 2, 3, 4,5])

print(a.dtype)

int32


In [82]:
a = np.array(['a', 'b', 'c'])

print(a.dtype)


<U1


In [83]:
a = np.array(['presh', 'praise', 'mj'])

print(a.dtype)


<U6


<div class="alert alert-info">
    You can also use the <b>dtype</b> argument to specify the data type of the array elements
</div>

In [84]:
a = np.array([1, 2, 3, 4,5], dtype='S')

print(a)
print(a.dtype)

[b'1' b'2' b'3' b'4' b'5']
|S1


In [85]:
a = np.array([[1,2],[3,4],[4,5],[5,6]], dtype = np.float64)
print(a)
print(a.dtype)

[[1. 2.]
 [3. 4.]
 [4. 5.]
 [5. 6.]]
float64


In [86]:
a = np.array([[1,2],[3,4],[4,5],[5,6]], dtype = complex)     #np.complex was removed in NumPy 1.24; the built-in complex works instead
print(a)
print(a.dtype)

[[1.+0.j 2.+0.j]
 [3.+0.j 4.+0.j]
 [4.+0.j 5.+0.j]
 [5.+0.j 6.+0.j]]
complex128
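You can also change the data type of an array that already exists. A minimal sketch, using the astype() method (not covered above), which returns a new array with the requested type:

```python
import numpy as np

a = np.array([1.7, 2.2, 3.9])
print(a.dtype)                                # float64

as_int = a.astype(int)                        # the decimal part is dropped, not rounded
as_str = a.astype(str)                        # every element becomes a string
as_bool = np.array([0, 1, 2]).astype(bool)    # 0 becomes False, everything else True

print(as_int)
print(as_str)
print(as_bool)
```

The original array is left unchanged, so you need to keep the returned array if you want to use the converted values.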


<h3>Iterating over an array</h3>

<div class="alert alert-info">
    <p>Iterating over an array can be done using the basic <b>for</b> loop in Python. However, with basic for loops, iterating through each scalar of an n-dimensional array requires <b>n nested for loops</b>, which can be difficult to write for arrays with very high dimensionality.</p>

<p>The <b>nditer()</b> function can be used instead. It is a helper function that supports everything from very basic to very advanced iteration, and it visits every scalar with a single loop regardless of the number of dimensions.</p>

<p>In this section, we will use both the for loop and the nditer to iterate over the array and compare the two methods</p>
    </div>

In [4]:
#Iterating over an array using for loop
a = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])

for x in a:
    for y in x:
        for z in y:
            print(z)

1
2
3
4
5
6
7
8
9
10
11
12


In [5]:
#Iterating over an array using nditer()

a = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
for x in np.nditer(a):
    print(x)

1
2
3
4
5
6
7
8
9
10
11
12


<div class="alert alert-info">
    <p>You can see that nditer() needs only one loop, which makes it more convenient than nested for loops for high-dimensional data</p>
    </div>
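Sometimes you need the position of each element as well as its value. As a sketch, the np.ndenumerate() function (not covered above) yields both in a single loop. The pairs list is only there to collect the results for illustration.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

pairs = []
for index, value in np.ndenumerate(a):
    # index is a tuple such as (0, 2); value is the element stored at that position
    pairs.append((index, int(value)))
    print(index, value)
```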

<div class="alert alert-info">
    <p>You can also specify the order (row-major or column-major) in which elements are visited by using the order argument of the nditer() function</p>
    <p>"F" is the Fortran (column-major) order, while "C" is the C programming (row-major) order. The examples below use both.</p>
    </div>

In [12]:
a = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])

print(a)

[[[ 1  2  3]
  [ 4  5  6]]

 [[ 7  8  9]
  [10 11 12]]]


In [13]:
for x in np.nditer(a, order = "C"):
    print(x)
    

1
2
3
4
5
6
7
8
9
10
11
12


In [11]:
for x in np.nditer(a, order = "F"):
    print(x)

1
7
4
10
2
8
5
11
3
9
6
12


<h3>Joining arrays</h3>

<div class="alert alert-info">
<p>Joining means putting the contents of two or more arrays into a single array. In SQL, we join tables based on a key, whereas in NumPy we join arrays by axes.</p>

<p>To do this, you pass a sequence of the arrays that you want to join to the concatenate() function, along with the axis. If the axis is not explicitly passed, it is taken as 0. Another method for joining arrays is the stack() function, which is similar to the concatenate() function except that it joins the arrays along a new axis. NumPy also provides helper functions to specify how you want to stack the arrays:<br>
    <li><b>hstack():</b> to stack horizontally (side by side, column-wise)</li>
    <li><b>vstack():</b> to stack vertically (one on top of the other, row-wise)</li>
    <li><b>dstack():</b> to stack along the depth (third axis)</li>
</p>

In [30]:
#Joining arrays using the concatenate function
a = np.array([[1,2],[3,4]])
b = np.array([[5,6],[7,8]])

print("Joining the two arrays along the axis 0:")
print(np.concatenate((a,b)))

print("Joining the two arrays along the axis 1:")
print(np.concatenate((a,b), axis = 1))

Joining the two arrays along the axis 0:
[[1 2]
 [3 4]
 [5 6]
 [7 8]]
Joining the two arrays along the axis 1:
[[1 2 5 6]
 [3 4 7 8]]


In [34]:
#Joining arrays using the stack function
a = np.array([[1,2],[3,4]])
b = np.array([[5,6],[7,8]])

print("Stacking the two arrays along a new axis 0:")
print(np.stack((a,b)))

print("Stacking the two arrays along a new axis 1:")
print(np.stack((a,b), axis = 1))


Stacking the two arrays along a new axis 0:
[[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]
Stacking the two arrays along a new axis 1:
[[[1 2]
  [5 6]]

 [[3 4]
  [7 8]]]


In [35]:
#Stacking horizontally (side by side)
a = np.array([[1,2],[3,4]])
b = np.array([[5,6],[7,8]])

print(np.hstack((a,b)))

[[1 2 5 6]
 [3 4 7 8]]


In [36]:
#Stacking vertically (one on top of the other)
a = np.array([[1,2],[3,4]])
b = np.array([[5,6],[7,8]])

print(np.vstack((a,b)))

[[1 2]
 [3 4]
 [5 6]
 [7 8]]


In [37]:
#Stacking along the depth (third axis)
a = np.array([[1,2],[3,4]])
b = np.array([[5,6],[7,8]])

print(np.dstack((a,b)))

[[[1 5]
  [2 6]]

 [[3 7]
  [4 8]]]
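A quick way to keep the joining functions apart is to compare the shapes of their results. The sketch below collects them in a dictionary, which is just a convenience for printing.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

# concatenate, hstack and vstack keep 2 dimensions; stack and dstack add a third one
shapes = {
    "concatenate axis=0": np.concatenate((a, b), axis=0).shape,
    "concatenate axis=1": np.concatenate((a, b), axis=1).shape,
    "stack axis=0": np.stack((a, b), axis=0).shape,
    "hstack": np.hstack((a, b)).shape,
    "vstack": np.vstack((a, b)).shape,
    "dstack": np.dstack((a, b)).shape,
}

for name, shape in shapes.items():
    print(name, shape)
```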


<h3>Splitting arrays</h3>

<div class="alert alert-info">
<p>Splitting is simply the reverse operation of Joining. Joining merges multiple arrays into one and splitting breaks one array into multiple arrays.</p>

<p>To split arrays, you use the <b>array_split()</b> function. You pass the array you want to split and the number of splits desired. If the array cannot be divided equally, the first sub-arrays get one extra element each.</p></div>

In [45]:
a = np.array([1, 2, 3, 4, 5, 6,7,8,9,10])

new_arr = np.array_split(a, 3)

print(new_arr)

[array([1, 2, 3, 4]), array([5, 6, 7]), array([ 8,  9, 10])]


<div class="alert alert-info">
    <p>You can also assign each split array to a variable. </p>
</div>

In [46]:
a= np.array([1, 2, 3, 4, 5, 6,7,8,9,10])
a,b,c = np.array_split(a, 3)

print(a)

[1 2 3 4]


In [47]:
print(b)
print(c)

[5 6 7]
[ 8  9 10]


<h3>Searching and sorting arrays</h3>

<div class="alert alert-info">
<p>You can search an array for a certain value and return the indexes where it matches. To search an array, use the <b>where()</b> function. Searching an array is very useful in data science. You will get to understand this as you begin to work with real-world datasets.</p>
    
<p>Sorting means putting elements in an ordered sequence, either ascending or descending. To sort arrays in NumPy, you use the <b>sort()</b> function.</p>
</div>

<h4>Searching an array</h4>

In [50]:
a = np.array([1,4,1,2,5,6,3,8,5,1,4,4])

x = np.where(a == 1)

print(x)

(array([0, 2, 9], dtype=int64),)
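The result above is a tuple that holds one index array per dimension. As a sketch, you can take the index array out with [0], and you can also give where() two extra arguments to choose a value for each element depending on the condition. The "big" and "small" labels are made up for this example.

```python
import numpy as np

a = np.array([1, 4, 1, 2, 5, 6, 3, 8, 5, 1, 4, 4])

# Take the index array out of the tuple returned by where()
indexes = np.where(a == 1)[0]

# Three-argument form: use "big" where the condition is True, otherwise "small"
labels = np.where(a > 4, "big", "small")

print(indexes)
print(labels)
```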


<h4>Sorting an array</h4>

In [55]:
a = np.array([1,4,1,2,5,6,3,8,5,1,4,4])

x = np.sort(a)

print(x)

[1 1 1 2 3 4 4 4 5 5 6 8]


In [56]:
a = np.array(['b', 'd', 'j', 'o', 't', 'c', 'a','i'])

print(np.sort(a))

['a' 'b' 'c' 'd' 'i' 'j' 'o' 't']


In [58]:
a = np.array([[7,2,4], [9,3,5]])

print(np.sort(a))

[[2 4 7]
 [3 5 9]]


In [60]:
a = np.array([True, False, True, False, False, True])

print(np.sort(a))

[False False False  True  True  True]
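np.sort() always sorts in ascending order. A minimal sketch of two common follow-ups: reversing the result to get descending order, and using np.argsort() (not covered above) to get the indexes that would sort the array.

```python
import numpy as np

a = np.array([7, 2, 9, 4])

descending = np.sort(a)[::-1]    # sort ascending, then reverse
order = np.argsort(a)            # indexes of the elements in ascending order

print(descending)
print(order)
print(a[order])                  # same values as np.sort(a)
```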


<h3>Filtering an array</h3>

<div class="alert alert-info">
Getting some elements out of an existing array and creating a new array out of them is called <b>filtering</b>. An array in NumPy is filtered using a boolean index list (a list of booleans corresponding to the indexes in the array). If the value at an index is <b>True</b>, then the element will be included in the filtered array; if the value at that index is <b>False</b>, the element will be excluded from the filtered array. Simply put, only elements whose index value corresponds to True are returned in the filtered array. 

</div>

In [5]:
a = np.array(['b', 'd', 'j', 'o', 't', 'c', 'a','i'])

filtered_array = [False, False, False, True, False, False, True, True]

vowels = a[filtered_array]

print(vowels)

['o' 'a' 'i']


<div class="alert alert-info">
In the example above we hard-coded the True and False values. In real world scenarios, you will have to generate the filter array based on some conditions. We commonly use the <i>if else</i> statements for this. 
</div>

In [7]:
#We want to create a filtered array that will return only even numbers from the array below

a = np.array([1,5,2,7,3,9,1,3,7,2,8,4,7,2])

# Create an empty list
filtered_array = []

# go through each element in the array 
for element in a:
    if element%2 == 0:
        filtered_array.append(True)
    else:
        filtered_array.append(False)

even_numbers = a[filtered_array]

print(filtered_array)
print(even_numbers)

[False, False, True, False, False, False, False, False, False, True, True, True, False, True]
[2 2 8 4 2]
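The loop above can be shortened, because a comparison applied to a NumPy array is evaluated for every element and returns a boolean array. The sketch below builds the same filter in one line; the name mask is just a common convention.

```python
import numpy as np

a = np.array([1, 5, 2, 7, 3, 9, 1, 3, 7, 2, 8, 4, 7, 2])

# The condition is checked for each element and gives an array of True/False values
mask = a % 2 == 0

even_numbers = a[mask]

print(mask)
print(even_numbers)
```

You can also write the condition directly inside the brackets, as in a[a % 2 == 0].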


<h3>NumPy random numbers</h3>

<div class="alert alert-info">
<p>A random number does not necessarily mean a different number every time. Random simply means something that cannot be predicted logically.</p>
    <p>NumPy allows us to generate random numbers using the <b>random</b> module. However, the numbers generated are not truly random, since they are generated using an algorithm. These kinds of random numbers are referred to as pseudo-random. Often, we don't need truly random numbers, except for security-related purposes. </p>

<p>To use the NumPy random module, you have to import it using the command below: <br>
    <b>from numpy import random</b><br>
The random module has two major functions used for generating random numbers:<br>
    <li><b>randint()</b> - for generating random integer numbers</li>
    <li><b>rand()</b> - for generating random floating point numbers</li>

In [13]:
#Returns a random integer from 0 to 9 (the upper bound 10 is excluded)

from numpy import random

a = random.randint(10)

print(a)

5


<div class="alert alert-info">
Aside from the range you want your random number to be generated from, the randint() function takes a <b>size</b> parameter where you can specify the shape of the array.
</div>

<h4>Random numbers using the randint() function</h4>

In [19]:
a=random.randint(10, size=(5))

print(a)

[5 4 7 1 7]


In [20]:
a=random.randint(10, size=(20))

print(a)

[2 1 9 0 4 8 9 4 0 7 4 7 5 4 4 6 5 8 4 1]


In [21]:
a = random.randint(10, size=(2, 5))

print(a)

[[0 1 1 8 4]
 [6 7 4 7 6]]


In [22]:
a = random.randint(10, size=(3, 4))

print(a)

[[6 0 8 0]
 [0 1 5 5]
 [5 5 7 1]]


<h4>Random numbers using the rand() function</h4>

In [14]:
a = random.rand()

print(a)

0.1405626721813733


<div class="alert alert-info">
Unlike the randint() function, the rand() function does not take a range argument because it only returns floats between 0 and 1 (0 included, 1 excluded). However, you can also specify the shape of the array as shown below.
</div>

In [23]:
a = random.rand(10)

print(a)

[0.97495278 0.06076402 0.61091635 0.55397758 0.7577156  0.64889528
 0.92001634 0.19896982 0.8679357  0.86317944]


In [24]:
a = random.rand(3, 4)

print(a)

[[0.31321957 0.94117417 0.85820117 0.62093794]
 [0.08445087 0.8706519  0.70937987 0.4572463 ]
 [0.38049957 0.67381545 0.19678036 0.14803225]]


<div class="alert alert-info">
NumPy also allows you to generate a random value based on the values in an array. This is done using the <b>choice()</b> method. This method takes an array as a parameter and randomly returns one of the values. You can also specify the shape of the array using the size argument</div> 

In [26]:
a = random.choice([1,2,3,4,5,6,7,8,9,0])

print(a)

6


In [27]:
a = random.choice([1,2,3,4,5,6,7,8,9,0], size = (3,4))

print(a)

[[1 7 1 1]
 [8 8 4 6]
 [6 5 7 8]]


In [28]:
a = random.choice([1,2,3,4,5,6,7,8,9,0], size = (5))

print(a)

[7 9 7 8 7]
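Because the numbers are pseudo-random, you can make them repeatable by fixing the starting point of the algorithm, called the seed. The sketch below uses np.random.default_rng(), a newer interface than the functions used in this notebook; the seed value 42 is an arbitrary choice.

```python
import numpy as np

# Two generators created with the same seed produce the same sequence of numbers
rng_1 = np.random.default_rng(seed=42)
rng_2 = np.random.default_rng(seed=42)

ints_1 = rng_1.integers(0, 10, size=5)    # random integers from 0 to 9
ints_2 = rng_2.integers(0, 10, size=5)

floats_1 = rng_1.random((2, 3))           # random floats in the range [0, 1)
floats_2 = rng_2.random((2, 3))

print(ints_1)
print(ints_2)
```

This is useful when you want other people, or your future self, to get the same results when running your notebook again.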


<h3>NumPy arange function</h3>

<div class="alert alert-info">
Python has a built-in range function that allows you to generate a sequence of integers based on your specified range. To do this in NumPy, you use the <b>arange</b> function. </div>

In [37]:
#Python range function

range(10)

range(0, 10)

In [38]:
for number in range(10):
    print(number)

0
1
2
3
4
5
6
7
8
9


<div class="alert alert-info">
The Python range function returns a range object, not a list. To get the individual elements, you have to iterate over it with a for loop or convert it with list(). The NumPy arange function, on the other hand, returns an array directly. You can use the .reshape() method to specify the shape of the array generated</div>

In [46]:
np.arange(12)

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

In [47]:
np.arange(12).reshape(3,4)

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
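Like range, the arange function also accepts a start, a stop and a step. Unlike range, the step can be a decimal number. A short sketch with illustrative variable names:

```python
import numpy as np

evens = np.arange(2, 11, 2)        # start at 2, stop before 11, step of 2
countdown = np.arange(5, 0, -1)    # a negative step counts downwards
halves = np.arange(0, 2, 0.5)      # the step does not have to be a whole number

print(evens)
print(countdown)
print(halves)
```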

<div class="alert alert-info">
NumPy also allows you to generate an array filled with 0's or with 1's. You simply do this using the <b>np.zeros()</b> and <b>np.ones()</b> functions. You will also need to specify the shape of the array.</div>

In [40]:
#Array  of 0's

np.zeros((3,4))

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

In [41]:
np.zeros((5))

array([0., 0., 0., 0., 0.])

In [42]:
#Array  of 1's

np.ones((3,4))

array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

In [43]:
np.ones((5))

array([1., 1., 1., 1., 1.])

<h3>NumPy arithmetic operations</h3>

In [54]:
a = np.arange(1,10).reshape(3,3)
b= np.array([10,100,1000])

print("This is array a: ")
print(a)

This is array a: 
[[1 2 3]
 [4 5 6]
 [7 8 9]]


In [55]:
print("This is array b: ")
print(b)

This is array b: 
[  10  100 1000]


In [48]:
np.add(a,b)

array([[  11,  102, 1003],
       [  14,  105, 1006],
       [  17,  108, 1009]])

In [49]:
np.subtract(a,b)


array([[  -9,  -98, -997],
       [  -6,  -95, -994],
       [  -3,  -92, -991]])

In [50]:
np.multiply(a,b)


array([[  10,  200, 3000],
       [  40,  500, 6000],
       [  70,  800, 9000]])

In [51]:
np.divide(a,b)

array([[0.1  , 0.02 , 0.003],
       [0.4  , 0.05 , 0.006],
       [0.7  , 0.08 , 0.009]])
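In the cells above, a has shape (3, 3) while b has shape (3,). NumPy applies b to every row of a, a behaviour known as broadcasting. The sketch below also shows that the usual arithmetic operators give the same results as the functions; the variable names are only illustrative.

```python
import numpy as np

a = np.arange(1, 10).reshape(3, 3)
b = np.array([10, 100, 1000])

# The + operator is equivalent to np.add, and b is applied to each row of a
sum_operator = a + b
same_as_function = np.array_equal(sum_operator, np.add(a, b))

# A single number is applied to every element
scaled = a * 2

print(sum_operator)
print(same_as_function)
print(scaled)
```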

In [64]:
#Sum along axis

a = np.array([(1,2,3),(3,4,5)])
a.sum(axis=1)

array([ 6, 12])

In [65]:
a.sum(axis=0)

array([4, 6, 8])

In [60]:
#square

a = np.array([(1,2,3),(3,4,5)])
np.square(a)

[[ 1  4  9]
 [ 9 16 25]]


In [61]:
#square root

a = np.array([(1,2,3),(3,4,5) ])
np.sqrt(a)

array([[1.        , 1.41421356, 1.73205081],
       [1.73205081, 2.        , 2.23606798]])

In [63]:
#standard deviation

np.std(a)

1.2909944487358056

In [52]:
np.transpose(np.arange(1,10).reshape(3,3))            #transpose of the 3x3 array used at the start of this section

array([[1, 4, 7],
       [2, 5, 8],
       [3, 6, 9]])

In [58]:
#NumPy log functions

a = np.array([(1,2,3)]) 

print(np.log2(a))
print(np.log10(a))

[[0.        1.        1.5849625]]
[[0.         0.30103    0.47712125]]


<h3>NumPy string functions</h3>

<div class="alert alert-info">
<p>NumPy allows us to perform different operations on string data, just like we have in Python. Some of these functions are: <br>
    <li><b>np.char.add: </b> This will concatenate each element in an array with its corresponding element in another array</li>
    <li><b>np.char.multiply: </b>This will repeat the string the specified number of times</li>
    <li><b>np.char.center: </b> This takes in a string, the desired length, and the character you want to pad the string with if it does not meet the specified length</li>
    <li><b>np.char.capitalize: </b>This will capitalize the first letter in the string</li> 
    <li><b>np.char.title: </b>This will capitalize the first letter of every word</li>
    <li><b>np.char.upper: </b>This will convert the whole string to upper case</li> 
    <li><b>np.char.lower: </b>This will convert the string to lower case</li> 
    <li><b>np.char.split: </b>This will split a single string into different strings. You can specify the separator, but by default it splits on whitespace</li> 
    <li><b>np.char.splitlines: </b>This is to split a string using the newline character(\n)</li> 
    <li><b>np.char.strip: </b>Removes the specified characters from the beginning and the end of the string</li> 
    <li><b>np.char.join: </b>Joins the characters of each string using the given separator</li> 
</div>

In [74]:
#NumPy String Concatenation
print(np.char.add(["Data ", "Abisola "], ["Science", "Fikayomi"]))

['Data Science' 'Abisola Fikayomi']


In [75]:
print(np.char.multiply("hello ", 3))

hello hello hello 


In [21]:
print(np.char.center("hello", 20, fillchar = "!"))

!!!!!!!hello!!!!!!!!


In [76]:
print(np.char.capitalize("hello world"))

Hello world


In [23]:
print(np.char.title("hello world"))

Hello World


In [24]:
print(np.char.upper("hello world"))

HELLO WORLD


In [25]:
print(np.char.lower("hello world"))

hello world


In [77]:
print(np.char.split("hello world"))

['hello', 'world']


In [27]:
print(np.char.splitlines("hello\n world\nokay"))

['hello', ' world', 'okay']


In [28]:
print(np.char.strip(["hello", "worldo"], "o"))

['hell' 'world']


In [80]:
print(np.char.join([":", "-"], ["dmy","data"])) 

['d:m:y' 'd-a-t-a']
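Note that strip works on both ends of each string. If you only want to clean one side, a small sketch using np.char.lstrip and np.char.rstrip (not listed above) looks like this:

```python
import numpy as np

words = np.array(["ohello", "worldo"])

both = np.char.strip(words, "o")     # removes "o" from the start and the end
left = np.char.lstrip(words, "o")    # removes "o" from the start only
right = np.char.rstrip(words, "o")   # removes "o" from the end only

print(both)
print(left)
print(right)
```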


<h3>Other useful NumPy functions</h3>

In [87]:
#Roll axis

a = np.array([(1,2,3),(3,4,5)])

print(np.rollaxis(a,1,0))

[[1 3]
 [2 4]
 [3 5]]


In [88]:
np.rollaxis(a,1)

[[1 3]
 [2 4]
 [3 5]]

In [89]:
#Swap axes

np.swapaxes(a,0,1)

array([[1, 3],
       [2, 4],
       [3, 5]])
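
On a two-dimensional array, rolling and swapping axes both produce the transpose, so the difference is easier to see with three dimensions. The sketch below uses a made-up array of shape (2, 3, 4) and also shows `np.moveaxis`, which the NumPy documentation recommends over `np.rollaxis` in new code.

```python
import numpy as np

x = np.zeros((2, 3, 4))

# rollaxis moves axis 2 to the front and keeps the other axes in order.
rolled = np.rollaxis(x, 2, 0)
# swapaxes exchanges axis 0 and axis 2 and leaves axis 1 in place.
swapped = np.swapaxes(x, 0, 2)
# moveaxis gives the same result as the rollaxis call above.
moved = np.moveaxis(x, 2, 0)

print(rolled.shape)   # (4, 2, 3)
print(swapped.shape)  # (4, 3, 2)
print(moved.shape)    # (4, 2, 3)
```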

In [94]:
#linspace function
a = np.linspace(1,4,10)                         #generate 10 equally spaced numbers between 1 and 4 (both endpoints included)
print(a)

[1.         1.33333333 1.66666667 2.         2.33333333 2.66666667
 3.         3.33333333 3.66666667 4.        ]
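
Two optional arguments of `np.linspace` are worth knowing: `endpoint=False` leaves out the stop value, and `retstep=True` also returns the spacing between samples. A minimal sketch follows; the variable names are chosen for illustration.

```python
import numpy as np

# By default the stop value is included.
closed = np.linspace(0, 1, 5)
print(closed)     # [0.   0.25 0.5  0.75 1.  ]

# endpoint=False excludes the stop value, which changes the spacing.
half_open = np.linspace(0, 1, 5, endpoint=False)
print(half_open)  # [0.  0.2 0.4 0.6 0.8]

# retstep=True returns the samples together with the step size.
values, step = np.linspace(1, 4, 10, retstep=True)
print(step)       # approximately 0.3333
```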


In [95]:
#Ravel function
x = np.array([(1,2,3),(3,4,5)])
print(x.ravel())

[1 2 3 3 4 5]
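
A detail to keep in mind: `ravel` returns a view of the original data whenever it can, while `flatten` always returns a copy. The sketch below shows the difference on the same array as above; the names `flat_view` and `flat_copy` are only for illustration.

```python
import numpy as np

x = np.array([(1, 2, 3), (3, 4, 5)])

flat_view = x.ravel()    # a view here, because x is contiguous in memory
flat_copy = x.flatten()  # always an independent copy

flat_view[0] = 99        # also changes x
flat_copy[1] = -1        # leaves x untouched

print(x)
# [[99  2  3]
#  [ 3  4  5]]
print(np.shares_memory(x, flat_view))  # True
print(np.shares_memory(x, flat_copy))  # False
```

Use `flatten` (or `ravel().copy()`) when the flattened array will be modified and the original must stay unchanged.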


<h3>NumPy Practice Questions</h3>

<div class="alert alert-danger">
<h2>Exercise 1</h2>
Create a 6 by 6 two-dimensional array with 1s and 0s placed alternately across the diagonals (a checkerboard pattern).
</div>

Double click <b>here</b> to see my solution.

<!-- This is my code for the exercise above.
However, note that in programming there is often more than one approach to solving a problem.
If you used a different approach from mine, it doesn't matter.
The objective is to get the desired result.
Copy the code below, run it in another cell, and compare the output with your own.

This is one approach to the problem

a = np.zeros((6,6), dtype = int)
a[1::2, ::2] = 1
a[::2,1::2] = 1
print(a)



This is another approach to the problem

b = np.ones((6,6), dtype = int)
b[1::2, ::2] = 0
b[::2, 1::2] = 0
print(b)


Copy both blocks of code and run them in separate cells.

-->


<div class="alert alert-danger">
<h2>Exercise 2</h2>
Create a random array of floating-point numbers. Replace some of the elements in the array with np.nan (not a number).
Find the total number and the locations of the missing values in the array. Finally, replace these missing values with 999999.
</div>

Double click <b>here</b> to see my solution.

<!-- This is my code for the exercise above.
However, note that in programming there is often more than one approach to solving a problem.
If you used a different approach from mine, it doesn't matter.
The objective is to get the desired result.
Copy the code below, run it in another cell, and compare the output with your own.

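# 10 x 10 array of random floats in [0, 1)
a = np.random.rand(10,10)
# Set up to 5 randomly chosen positions to NaN (the same position may be drawn twice)
a[np.random.randint(10, size = 5), np.random.randint(10, size = 5)] = np.nan
print("Total number of missing values: \n", np.isnan(a).sum())
print("Indices of missing values: \n", np.argwhere(np.isnan(a)))
# Replace every NaN with 999999
j = np.where(np.isnan(a))
a[j]= 999999
a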


-->


<div class="alert alert-info">
<p><big><big>&#128079;&#127996;</big></big>
    <big><big>&#128079;&#127996;</big></big>  <big><big>&#128079;&#127996;</big></big>
    <big><big>&#128079;&#127996;</big></big>  <big><big>&#128079;&#127996;</big></big>
    <big><big>&#128079;&#127996;</big></big>  <big><big>&#128079;&#127996;</big></big>
    <big><big>&#128079;&#127996;</big></big>   <big><big>&#128079;&#127996;</big></big>
    <big><big>&#128079;&#127996;</big></big>   <big><big>&#128079;&#127996;</big></big>
    <big><big>&#128079;&#127996;</big></big>   <big><big>&#128079;&#127996;</big></big>
    <big><big>&#128079;&#127996;</big></big>   <big><big>&#128079;&#127996;</big></big>
    <big><big>&#128079;&#127996;</big></big>   <big><big>&#128079;&#127996;</big></big>
    <big><big>&#128079;&#127996;</big></big>    <big><big>&#128079;&#127996;</big></big>
    <big><big>&#128079;&#127996;</big></big>    <big><big>&#128079;&#127996;</big></big>
    <big><big>&#128079;&#127996;</big></big>    <big><big>&#128079;&#127996;</big></big>
    <big><big>&#128079;&#127996;</big></big>    <big><big>&#128079;&#127996;</big></big>
    <big><big>&#128079;&#127996;</big></big>     <big><big>&#128079;&#127996;</big></big>
    <big><big>&#128079;&#127996;</big></big>     <big><big>&#128079;&#127996;</big></big>
    <big><big>&#128079;&#127996;</big></big>     <big><big>&#128079;&#127996;</big></big>
    </p>
    
<p><h2>Thumbs up for making it to the end of this notebook. You just earned the avid learner certification as well as the NumPy fundamentals certification!</h2>
 </p>
<p><big><big>&#128079;&#127996;</big></big>
    <big><big>&#128079;&#127996;</big></big>  <big><big>&#128079;&#127996;</big></big>
    <big><big>&#128079;&#127996;</big></big>  <big><big>&#128079;&#127996;</big></big>
    <big><big>&#128079;&#127996;</big></big>  <big><big>&#128079;&#127996;</big></big>
    <big><big>&#128079;&#127996;</big></big>   <big><big>&#128079;&#127996;</big></big>
    <big><big>&#128079;&#127996;</big></big>   <big><big>&#128079;&#127996;</big></big>
    <big><big>&#128079;&#127996;</big></big>   <big><big>&#128079;&#127996;</big></big>
    <big><big>&#128079;&#127996;</big></big>   <big><big>&#128079;&#127996;</big></big>
    <big><big>&#128079;&#127996;</big></big>   <big><big>&#128079;&#127996;</big></big>
    <big><big>&#128079;&#127996;</big></big>    <big><big>&#128079;&#127996;</big></big>
    <big><big>&#128079;&#127996;</big></big>    <big><big>&#128079;&#127996;</big></big>
    <big><big>&#128079;&#127996;</big></big>    <big><big>&#128079;&#127996;</big></big>
    <big><big>&#128079;&#127996;</big></big>    <big><big>&#128079;&#127996;</big></big>
    <big><big>&#128079;&#127996;</big></big>     <big><big>&#128079;&#127996;</big></big>
    <big><big>&#128079;&#127996;</big></big>     <big><big>&#128079;&#127996;</big></big>
    <big><big>&#128079;&#127996;</big></big>     <big><big>&#128079;&#127996;</big></big>
    </p>


</div>

<div class="alert alert-info">
<h3>Conclusion and next step</h3>


<p>Congratulations once again on completing this notebook. It covered the basics of NumPy. You can check out the <a href = https://numpy.org/devdocs/user/absolute_beginners.html>NumPy documentation</a> and other resources online to deepen your knowledge of the library.</p>

</div>

<div class="alert alert-info">
<h3>Recommended resources</h3>
<p>For other interactive notebooks:
<ul>
<li><a href = https://github.com/Abisola-ds/Python-Fundamentals>Python Fundamentals</a></li>
<li><a href = >Pandas</a></li>
<li><a href = >Matplotlib</a></li>
<li><a href =>Scikit Learn</a></li>
</ul>
    
</p>

<p>To test your data science knowledge, check out:
<ul>
<li><a href = https://www.kaggle.com/>Kaggle</a></li>
<li><a href = https://zindi.africa/>Zindi</a></li>
</ul>
</p>

<p>You can find virtual data science internships at:
<ul>
<li><a href = https://www.insidesherpa.com/>InsideSherpa</a></li>
</ul>
</p>

<p>If you get stuck, <a href = https://stackoverflow.com/questions>Stack Overflow</a> is your friend. You can find an answer to almost any programming question or error you encounter. You can also contribute to the platform, which can be a plus when you are looking for your first Python job.</p>

<p>Finally, join a data science community where you can learn from and collaborate with other data scientists.</p>
</div>

<p>I hope you found this notebook helpful. If you have any questions or you want to collaborate, feel free to connect with me at <a href = https://www.linkedin.com/in/abisola-fikayomi>Abisola Fikayomi.</a> I am also open to constructive criticism<big><big>&#128522;</big></big>. Thank you!!!</p>

<div align = "center">Copyright &copy; 2020 <a href = https://www.linkedin.com/in/abisola-fikayomi>Abisola Fikayomi.</a></div>