# NumPy Exercises

This kernel works through the NumPy exercises from the Machine Learning Plus webpage --> https://www.machinelearningplus.com/python/101-numpy-exercises-python/

This kernel is part of my 100 days of machine learning challenge.
Check out my progress on this challenge --> https://github.com/themlphdstudent/100DaysofMachineLearning

## <font color='red'> <center> I am continuously updating this kernel. </center> </font>


<a id="top"></a>
# Table of Contents
- [Import required libraries](#import_required_libraries)
- [*Exercise 1*. Import numpy as np and see the version](#1)
- [*Exercise 2*. How to create a 1D array?](#2)
- [*Exercise 3*. How to create a boolean array?](#3)
- [*Exercise 4*. How to extract items that satisfy a given condition from 1D array?](#4)
- [*Exercise 5*. How to replace items that satisfy a condition with another value in numpy array?](#5)
- [*Exercise 6*. How to replace items that satisfy a condition without affecting the original array?](#6)
- [*Exercise 7*. How to reshape an array?](#7)
- [*Exercise 8*. How to stack two arrays vertically?](#8)
- [*Exercise 9*. How to stack two arrays horizontally?](#9)
- [*Exercise 10*. How to generate custom sequences in numpy without hardcoding?](#10)
- [*Exercise 11*. How to get the common items between two python numpy arrays?](#11)
- [*Exercise 12*. How to remove from one array those items that exist in another?](#12)
- [*Exercise 13*. How to get the positions where elements of two arrays match?](#13)
- [*Exercise 14*. How to extract all numbers between a given range from a numpy array?](#14)
- [*Exercise 15*. How to make a python function that handles scalars to work on numpy arrays?](#15)
- [*Exercise 16*. How to swap two columns in a 2d numpy array?](#16)
- [*Exercise 17*. How to swap two rows in a 2d numpy array?](#17)
- [*Exercise 18*. How to reverse the rows of a 2D array?](#18)
- [*Exercise 19*. How to reverse the columns of a 2D array?](#19)
- [*Exercise 20*. How to create a 2D array containing random floats between 5 and 10?](#20)
- [*Exercise 21*. How to print only 3 decimal places in python numpy array?](#21)
- [*Exercise 22*. How to pretty print a numpy array by suppressing the scientific notation (like 1e10)?](#22)
- [*Exercise 23*. How to limit the number of items printed in output of numpy array?](#23)
- [*Exercise 24*. How to print the full numpy array without truncating](#24)
- [*Exercise 25*. How to import a dataset with numbers and texts keeping the text intact in python numpy?](#25)
- [*Exercise 26*. How to extract a particular column from 1D array of tuples?](#26)
- [*Exercise 27*. How to convert a 1d array of tuples to a 2d numpy array? ](#27)
- [*Exercise 28*. How to compute the mean, median, standard deviation of a numpy array?](#28)
- [*Exercise 29*. How to normalize an array so the values range exactly between 0 and 1?](#29)
- [*Exercise 30*. How to compute the softmax score?](#30)
- [*Exercise 31*. How to find the percentile scores of a numpy array?](#31)
- [*Exercise 32*. How to insert values at random positions in an array?](#32)
- [*Exercise 33*. How to find the position of missing values in numpy array?](#33)
- [*Exercise 34*. How to filter a numpy array based on two or more conditions?](#34)
- [*Exercise 35*. How to drop rows that contain a missing value from a numpy array?](#35)
- [*Exercise 36*. How to find the correlation between two columns of a numpy array?](#36)
- [*Exercise 37*. How to find if a given array has any null values?](#37)
- [*Exercise 38*. How to replace all missing values with 0 in a numpy array?](#38)
- [*Exercise 39*. How to find the count of unique values in a numpy array?](#39)
- [*Exercise 40*. How to convert a numeric to a categorical (text) array?](#40)
- [*Exercise 41*. How to create a new column from existing columns of a numpy array?](#41)
- [*Exercise 42*. How to do probabilistic sampling in numpy?](#42)
- [*Exercise 43*. How to get the second largest value of an array when grouped by another array?](#43)
- [*Exercise 44*. How to sort a 2D array by a column](#44)
- [*Exercise 45*. How to find the most frequent value in a numpy array?](#45)
- [*Exercise 46*. How to find the position of the first occurrence of a value greater than a given value?](#46)
- [*Exercise 47*. How to replace all values greater than a given value to a given cutoff?](#47)
- [*Exercise 48*. How to get the positions of top n values from a numpy array?](#48)
- [*Exercise 49*. How to compute the row wise counts of all possible values in an array?](#49)
- [*Exercise 50*. How to convert an array of arrays into a flat 1d array?](#50)

<a id='import_required_libraries'></a>
## Import required libraries
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [1]:
# importing the core library
import numpy as np

import os
for dirname, _, filenames in os.walk('../input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))
        
# print multiple output in single cell
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

../input/heart-disease-uci/heart.csv
../input/iris/Iris.csv
../input/iris/database.sqlite
../input/red-wine-quality-cortez-et-al-2009/winequality-red.csv
../input/mushroom-classification/mushrooms.csv
../input/pima-indians-diabetes-database/diabetes.csv


<a id='1'></a>
### *Exercise 1*. Import numpy as np and see the version
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [2]:

import numpy as np
print(np.__version__)

1.18.5


<a id='2'></a>
### *Exercise 2*. How to create a 1D array?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [3]:
# Question : Create a 1D array of numbers from 0 to 9
# Output : #> array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

# Solution
X = np.arange(10)
X

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

<a id='3'></a>
### *Exercise 3*. How to create a boolean array?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [4]:
# Question : Create a 3×3 numpy array of all True’s

# Solution
np.full((3,3), True, dtype=bool)

#or
np.full((9), True, dtype=bool).reshape(3,3)

#or
np.ones((3,3), dtype=bool)

#or
np.ones((9), dtype=bool).reshape(3,3)

array([[ True,  True,  True],
       [ True,  True,  True],
       [ True,  True,  True]])

array([[ True,  True,  True],
       [ True,  True,  True],
       [ True,  True,  True]])

array([[ True,  True,  True],
       [ True,  True,  True],
       [ True,  True,  True]])

array([[ True,  True,  True],
       [ True,  True,  True],
       [ True,  True,  True]])

<a id='4'></a>
### *Exercise 4*. How to extract items that satisfy a given condition from 1D array?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [5]:
# Question : Extract all odd numbers from array
# input: arr = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
# output: array([1, 3, 5, 7, 9])

#Solution

arr = np.arange(10)

arr[arr%2 == 1]

array([1, 3, 5, 7, 9])

<a id='5'></a>
### *Exercise 5*. How to replace items that satisfy a condition with another value in numpy array?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [6]:
# Question: Replace all odd numbers in arr with -1
# input: arr = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
# output: array([ 0, -1,  2, -1,  4, -1,  6, -1,  8, -1])

# Solution

arr = np.arange(10)

arr[arr%2 == 1] = -1
arr

array([ 0, -1,  2, -1,  4, -1,  6, -1,  8, -1])

<a id='6'></a>
### *Exercise 6*. How to replace items that satisfy a condition without affecting the original array?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [7]:
# Question: Replace all odd numbers in arr with -1 without changing arr
# input: arr = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
# output: out
# array([ 0, -1,  2, -1,  4, -1,  6, -1,  8, -1])
# arr
# array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

# Solution

arr = np.arange(10)

out = arr.copy()

out[out%2 == 1] = -1

print('Modified Array')
out

print('\nOriginal Array')
arr

Modified Array


array([ 0, -1,  2, -1,  4, -1,  6, -1,  8, -1])


Original Array


array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
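
An alternative that avoids the explicit copy is `np.where`, which builds a new array and leaves `arr` untouched. A minimal sketch on the same input:

```python
import numpy as np

arr = np.arange(10)

# np.where(condition, x, y) picks x where the condition holds, else y.
# It returns a new array, so arr itself is not modified.
out = np.where(arr % 2 == 1, -1, arr)

print(out)
print(arr)
```

This form is convenient when the replacement is a one-off expression and no intermediate copy is needed.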

<a id='7'></a>
### *Exercise 7*. How to reshape an array?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [8]:
# Question: Convert a 1D array to a 2D array with 2 rows
# input: np.arange(10)
# output array([[0, 1, 2, 3, 4],
#               [5, 6, 7, 8, 9]])

# Solution

arr = np.arange(10)
arr.reshape(2,5)

# Another solution
arr = np.arange(10)
arr.reshape(2, -1)  # Setting to -1 automatically decides the number of cols

array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

<a id='8'></a>
### *Exercise 8*. How to stack two arrays vertically?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [9]:
# Question: Stack arrays a and b vertically
# input: a = np.arange(10).reshape(2,-1)
#        b = np.repeat(1, 10).reshape(2,-1)

# output: array([[0, 1, 2, 3, 4],
#                [5, 6, 7, 8, 9],
#                [1, 1, 1, 1, 1],
#                [1, 1, 1, 1, 1]])

# Solution

a = np.arange(10).reshape(2,-1)
b = np.repeat(1, 10).reshape(2,-1)

np.vstack([a,b])

array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9],
       [1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1]])

<a id='9'></a>
### *Exercise 9*. How to stack two arrays horizontally?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [10]:
# Question: Stack the arrays a and b horizontally.

# Input: a = np.arange(10).reshape(2,-1)
#        b = np.repeat(1, 10).reshape(2,-1)
# Output: array([[0, 1, 2, 3, 4, 1, 1, 1, 1, 1],
#                [5, 6, 7, 8, 9, 1, 1, 1, 1, 1]])


# Solution:
a = np.arange(10).reshape(2,-1)
b = np.repeat(1, 10).reshape(2,-1)

np.hstack([a,b])

array([[0, 1, 2, 3, 4, 1, 1, 1, 1, 1],
       [5, 6, 7, 8, 9, 1, 1, 1, 1, 1]])

<a id='10'></a>
### *Exercise 10*. How to generate custom sequences in numpy without hardcoding?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [11]:
# Question: Create the following pattern without hardcoding. Use only numpy functions and the below input array a.

# Input: a = np.array([1,2,3])
# Output: array([1, 1, 1, 2, 2, 2, 3, 3, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3])


# Solution

a = np.array([1,2,3])
np.r_[np.repeat(a, 3), np.tile(a, 3)]

#other solution
np.hstack((np.repeat(a, 3), np.tile(a, 3)))

array([1, 1, 1, 2, 2, 2, 3, 3, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3])

array([1, 1, 1, 2, 2, 2, 3, 3, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3])

<a id='11'></a>
### *Exercise 11*. How to get the common items between two python numpy arrays?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [12]:
# Question: Get the common items between a and b

# Input: a = np.array([1,2,3,2,3,4,3,4,5,6])
#        b = np.array([7,2,10,2,7,4,9,4,9,8])

# Output: array([2, 4])


# Solution:
a = np.array([1,2,3,2,3,4,3,4,5,6])
b = np.array([7,2,10,2,7,4,9,4,9,8])
np.intersect1d(a,b)

array([2, 4])

<a id='12'></a>
### *Exercise 12*. How to remove from one array those items that exist in another?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [13]:
# Question: From array a remove all items present in array b

# Input: a = np.array([1,2,3,4,5])
#        b = np.array([5,6,7,8,9])

# Output: array([1,2,3,4])


# Solution
a = np.array([1,2,3,4,5])
b = np.array([5,6,7,8,9])

np.setdiff1d(a,b)

array([1, 2, 3, 4])
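
Note that `np.setdiff1d` returns the sorted unique values. If the original order and any duplicates in `a` should be kept, a boolean mask built with `np.isin` is one option. A small sketch, using made-up input arrays chosen to show the difference:

```python
import numpy as np

# Hypothetical inputs: a is unsorted and contains a duplicate
a = np.array([5, 3, 3, 1, 9])
b = np.array([5, 6, 7, 8, 9])

# Keep every element of a that does not appear in b, in original order
kept = a[~np.isin(a, b)]

# For comparison: setdiff1d sorts and de-duplicates
unique_sorted = np.setdiff1d(a, b)

print(kept)
print(unique_sorted)
```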

<a id='13'></a>
### *Exercise 13*. How to get the positions where elements of two arrays match?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [14]:
# Question: Get the positions where elements of a and b match

# Input: a = np.array([1,2,3,2,3,4,3,4,5,6])
#        b = np.array([7,2,10,2,7,4,9,4,9,8])

# Output: (array([1, 3, 5, 7]),)


# Solution

a = np.array([1,2,3,2,3,4,3,4,5,6])
b = np.array([7,2,10,2,7,4,9,4,9,8])

np.where(a == b)

(array([1, 3, 5, 7]),)

<a id='14'></a>
### *Exercise 14*. How to extract all numbers between a given range from a numpy array?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [15]:
# Question: Get all items between 5 and 10 from a.

# Input: a = np.array([2, 6, 1, 9, 10, 3, 27])
# Output: array([6, 9, 10])


# Solution

a = np.array([2, 6, 1, 9, 10, 3, 27])
a[(a >= 5) & (a <= 10)]

array([ 6,  9, 10])

<a id='15'></a>
### *Exercise 15*. How to make a python function that handles scalars to work on numpy arrays?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [16]:
# Question: Convert the function maxx that works on two scalars, to work on two arrays.
# Input:

def maxx(x, y):
    """Get the maximum of two items"""
    if x >= y:
        return x
    else:
        return y

# maxx(1, 5)
#> 5

# Output:
# a = np.array([5, 7, 9, 8, 6, 4, 5])
# b = np.array([6, 3, 4, 8, 9, 7, 1])
# pair_max(a, b)
# array([ 6.,  7.,  9.,  8.,  9.,  7.,  5.])

# Solution

def pair_max(x, y):
    # zip pairs up the elements of x and y; maxx is applied to each pair
    maximum = [maxx(a, b) for a, b in zip(x, y)]
    return np.array(maximum)

a = np.array([5, 7, 9, 8, 6, 4, 5])
b = np.array([6, 3, 4, 8, 9, 7, 1])

pair_max(a,b)

array([6, 7, 9, 8, 9, 7, 5])
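
NumPy also offers `np.vectorize`, which wraps a scalar function so it can be called on arrays directly. The sketch below redefines `maxx` so that it is self-contained; `pair_max_vec` is just an illustrative name. Passing `otypes=[float]` yields the float output shown in the question.

```python
import numpy as np

def maxx(x, y):
    """Get the maximum of two scalars."""
    return x if x >= y else y

# np.vectorize applies the scalar function element by element.
# otypes=[float] forces a float result.
pair_max_vec = np.vectorize(maxx, otypes=[float])

a = np.array([5, 7, 9, 8, 6, 4, 5])
b = np.array([6, 3, 4, 8, 9, 7, 1])

result = pair_max_vec(a, b)

# For this particular function, the built-in element-wise maximum
# gives the same values without a Python-level loop.
builtin = np.maximum(a, b)

print(result)
print(builtin)
```

`np.vectorize` is a convenience wrapper rather than a speed-up, so a native ufunc such as `np.maximum` is usually preferable when one exists.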

<a id='16'></a>
### *Exercise 16*. How to swap two columns in a 2d numpy array?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [17]:
# Question: Swap columns 1 and 2 (the first two columns, i.e. indices 0 and 1) in the array arr.

# Input:

arr = np.arange(9).reshape(3,3)

print('Original array')
arr

# Solution

print("\nModified array")
arr[:, [1,0,2]]

Original array


array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])


Modified array


array([[1, 0, 2],
       [4, 3, 5],
       [7, 6, 8]])
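
The indexing expression above returns a new array and leaves `arr` unchanged. If the swap should happen in place, assigning through fancy indexing is one way to do it. A minimal sketch:

```python
import numpy as np

arr = np.arange(9).reshape(3, 3)

# The right-hand side is evaluated first and is a copy,
# so the assignment swaps columns 0 and 1 safely.
arr[:, [0, 1]] = arr[:, [1, 0]]

print(arr)
```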

<a id='17'></a>
### *Exercise 17*. How to swap two rows in a 2d numpy array?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [18]:
# Question: Swap rows 1 and 2 (the first two rows, i.e. indices 0 and 1) in the array arr:

# Input: 

arr = np.arange(9).reshape(3,3)
print('Original array')
arr

# Solution

print("\nModified array")
arr[[1,0,2], :]

Original array


array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])


Modified array


array([[3, 4, 5],
       [0, 1, 2],
       [6, 7, 8]])

<a id='18'></a>
### *Exercise 18*. How to reverse the rows of a 2D array?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [19]:
# Question: Reverse the rows of a 2D array arr.

# Input:

arr = np.arange(9).reshape(3,3)

print('Original array')
arr

# Solution

print("\nModified array")
arr[::-1, :]

Original array


array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])


Modified array


array([[6, 7, 8],
       [3, 4, 5],
       [0, 1, 2]])

<a id='19'></a>
### *Exercise 19*. How to reverse the columns of a 2D array?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [20]:
# Question: Reverse the columns of a 2D array arr.

# Input: arr = np.arange(9).reshape(3,3)

# Solution

arr = np.arange(9).reshape(3,3)
print('Original array')
arr


print("\nModified array")
arr[:, ::-1]

Original array


array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])


Modified array


array([[2, 1, 0],
       [5, 4, 3],
       [8, 7, 6]])

<a id='20'></a>
### *Exercise 20*. How to create a 2D array containing random floats between 5 and 10?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [21]:
# Question: Create a 2D array of shape 5x3 to contain random decimal numbers between 5 and 10.

# Solution:

rand_arr = np.random.uniform(5,10, size=(5,3))
rand_arr

array([[8.94215867, 9.48313389, 5.16046125],
       [5.79609337, 5.07610146, 6.87205277],
       [5.58624727, 6.39839238, 8.83909431],
       [9.50567112, 7.81015086, 7.94951279],
       [8.91540928, 7.88798918, 5.93284155]])
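
Because no seed is set, the numbers above change on every run. For reproducible results, the `Generator` API (available from NumPy 1.17) can be seeded explicitly. A short sketch; the seed value 42 is an arbitrary choice:

```python
import numpy as np

# Seeded generator: the same seed gives the same numbers on every run
rng = np.random.default_rng(seed=42)

# Uniform floats drawn from the interval [5, 10)
rand_arr = rng.uniform(5, 10, size=(5, 3))

print(rand_arr)
```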

<a id='21'></a>
### *Exercise 21*. How to print only 3 decimal places in python numpy array?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [22]:
# Question: Print or show only 3 decimal places of the numpy array rand_arr.

# Input: rand_arr = np.random.random((5,3))

rand_arr = np.random.random((5,3))
np.set_printoptions(precision=3)
rand_arr

array([[0.458, 0.417, 0.359],
       [0.469, 0.729, 0.06 ],
       [0.406, 0.504, 0.97 ],
       [0.737, 0.985, 0.863],
       [0.906, 0.716, 0.26 ]])

<a id='22'></a>
### *Exercise 22*. How to pretty print a numpy array by suppressing the scientific notation (like 1e10)?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [23]:
# Question: Pretty print rand_arr by suppressing the scientific notation (like 1e10)

# Input: 
# Create the random array
np.random.seed(100)
rand_arr = np.random.random([3,3])/1e3
np.set_printoptions(suppress=False)  # default: very small values print in scientific notation
rand_arr

# Desired output (shown with 6 decimal places):
#> array([[ 0.000543,  0.000278,  0.000425],
#>        [ 0.000845,  0.000005,  0.000122],
#>        [ 0.000671,  0.000826,  0.000137]])

# Solution
# precision=3 is still in effect from Exercise 21, so the values below are
# rounded to 3 decimals; set precision=6 as well to see six decimal places.
np.set_printoptions(suppress=True)
rand_arr

array([[5.434e-04, 2.784e-04, 4.245e-04],
       [8.448e-04, 4.719e-06, 1.216e-04],
       [6.707e-04, 8.259e-04, 1.367e-04]])

array([[0.001, 0.   , 0.   ],
       [0.001, 0.   , 0.   ],
       [0.001, 0.001, 0.   ]])

<a id='23'></a>
### *Exercise 23*. How to limit the number of items printed in output of numpy array?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [24]:
# Question: Limit the number of items printed in python numpy array a to a maximum of 6 elements.
a = np.arange(15)
np.set_printoptions(threshold=6)
a

array([ 0,  1,  2, ..., 12, 13, 14])

<a id='24'></a>
### *Exercise 24*. How to print the full numpy array without truncating
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [25]:
# Question: Print the full numpy array a without truncating.

# Input: np.set_printoptions(threshold=6)
# a = np.arange(15)
# a

# Output: a
#> array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])

# Solution

a = np.arange(15)


np.set_printoptions(threshold=15)
a

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])
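
Setting `threshold=15` works here only because the array has exactly 15 elements. A more general approach is to pass `sys.maxsize`, and the `np.printoptions` context manager keeps the change local. A minimal sketch:

```python
import sys
import numpy as np

a = np.arange(15)

# Inside the block nothing is truncated, whatever the array size
with np.printoptions(threshold=sys.maxsize):
    full_text = repr(a)

# With a small threshold the middle of the array is replaced by "..."
with np.printoptions(threshold=6):
    short_text = repr(a)

print(full_text)
print(short_text)
```

Outside the `with` blocks the previous print settings are restored automatically.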

<a id='25'></a>
### *Exercise 25*. How to import a dataset with numbers and texts keeping the text intact in python numpy?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [26]:
# Question: Import the iris dataset keeping the text intact.

# Solution:
iris_data = np.genfromtxt('../input/iris/Iris.csv', delimiter=',', skip_header=1, 
                          usecols = [0,1,2,3,4,5], dtype = object)
iris_data

array([[b'1', b'5.1', b'3.5', b'1.4', b'0.2', b'Iris-setosa'],
       [b'2', b'4.9', b'3.0', b'1.4', b'0.2', b'Iris-setosa'],
       [b'3', b'4.7', b'3.2', b'1.3', b'0.2', b'Iris-setosa'],
       ...,
       [b'148', b'6.5', b'3.0', b'5.2', b'2.0', b'Iris-virginica'],
       [b'149', b'6.2', b'3.4', b'5.4', b'2.3', b'Iris-virginica'],
       [b'150', b'5.9', b'3.0', b'5.1', b'1.8', b'Iris-virginica']],
      dtype=object)

<a id='26'></a>
### *Exercise 26*. How to extract a particular column from 1D array of tuples?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [27]:
# Question: Extract the text column species from the 1D iris imported in previous question.

data = np.genfromtxt('../input/iris/Iris.csv', delimiter=',', skip_header=1, 
                          usecols = [-1], dtype = object)
data

array([b'Iris-setosa', b'Iris-setosa', b'Iris-setosa', ...,
       b'Iris-virginica', b'Iris-virginica', b'Iris-virginica'],
      dtype=object)

<a id='27'></a>
### *Exercise 27*. How to convert a 1d array of tuples to a 2d numpy array? 
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [28]:
# Question: Convert the 1D iris to 2D array iris_2d by omitting the species text field.
# Column 0 of Iris.csv is the Id, so the four measurements are columns 1 to 4
iris_data = np.genfromtxt('../input/iris/Iris.csv', delimiter=',', skip_header=1, dtype='float', usecols=[1,2,3,4])
iris_data

array([[5.1, 3.5, 1.4, 0.2],
       [4.9, 3. , 1.4, 0.2],
       [4.7, 3.2, 1.3, 0.2],
       ...,
       [6.5, 3. , 5.2, 2. ],
       [6.2, 3.4, 5.4, 2.3],
       [5.9, 3. , 5.1, 1.8]])

<a id='28'></a>
### *Exercise 28*. How to compute the mean, median, standard deviation of a numpy array?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [29]:
# Question: Find the mean, median, standard deviation of iris's sepallength (1st column)

iris_data = np.genfromtxt('../input/iris/Iris.csv', delimiter=',', skip_header=1, usecols = [1])

print('Mean', np.mean(iris_data))
print('Median', np.median(iris_data))
print('Standard Deviation', np.std(iris_data))

Mean 5.843333333333334
Median 5.8
Standard Deviation 0.8253012917851409


<a id='29'></a>
### *Exercise 29*. How to normalize an array so the values range exactly between 0 and 1?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [30]:
# Question: Create a normalized form of iris's sepallength whose values range exactly between 0 and 1 so that the minimum has value 0 and maximum has value 1.

# Solution

iris_data = np.genfromtxt('../input/iris/Iris.csv', delimiter=',', dtype='float', usecols=[1], skip_header=1)

(iris_data - np.min(iris_data))/(np.max(iris_data) - np.min(iris_data))

array([0.222, 0.167, 0.111, ..., 0.611, 0.528, 0.444])

<a id='30'></a>
### *Exercise 30*. How to compute the softmax score?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [31]:
# Question: Compute the softmax score of sepallength.

iris_data = np.genfromtxt('../input/iris/Iris.csv', delimiter=',', dtype='float', usecols=[1], skip_header=1)
softmax = np.exp(iris_data)/sum(np.exp(iris_data))
softmax.sum()  # should sum to 1, up to floating-point rounding
softmax

0.9999999999999997

array([0.002, 0.002, 0.001, ..., 0.009, 0.007, 0.005])
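
For inputs with large values, `np.exp` can overflow. A common remedy is to subtract the maximum before exponentiating, which leaves the result unchanged because the shift cancels in the ratio. A sketch with a hypothetical helper `softmax` and made-up sample values standing in for the sepallength column:

```python
import numpy as np

def softmax(x):
    """Softmax of a 1D array, shifted by its maximum for numerical stability."""
    shifted = x - np.max(x)   # the shift cancels out in the ratio below
    exp = np.exp(shifted)
    return exp / exp.sum()

# Made-up values standing in for the sepallength column
x = np.array([5.1, 4.9, 4.7, 6.5, 6.2, 5.9])
scores = softmax(x)

# Values large enough that a naive np.exp(x) would overflow to inf
big = softmax(np.array([1000.0, 1001.0, 1002.0]))

print(scores, scores.sum())
print(big)
```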

<a id='31'></a>
### *Exercise 31*. How to find the percentile scores of a numpy array?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [32]:
# Question. Find the 5th and 95th percentile of iris's sepallength
iris_data = np.genfromtxt('../input/iris/Iris.csv', delimiter=',', dtype='float', usecols=[1], skip_header=1)

np.percentile(iris_data, q=[5, 95])

array([4.6  , 7.255])

<a id='32'></a>
### *Exercise 32*. How to insert values at random positions in an array?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [33]:
# Question: Insert np.nan values at 20 random positions in iris_2d dataset

# Solution
iris_data = np.genfromtxt('../input/iris/Iris.csv', delimiter=',', dtype='float', usecols=[1,2,3,4], skip_header=1)
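# Pick 20 random (row, column) positions; the two index arrays are paired element-wise.
# Assigning to iris_data[i] alone would set the whole row i to nan.
iris_data[np.random.randint(len(iris_data), size=20), np.random.randint(4, size=20)] = np.nan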
iris_data

array([[5.1, 3.5, 1.4, 0.2],
       [4.9, 3. , 1.4, 0.2],
       [4.7, 3.2, 1.3, 0.2],
       ...,
       [6.5, 3. , 5.2, 2. ],
       [6.2, 3.4, 5.4, 2.3],
       [5.9, 3. , 5.1, 1.8]])
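
If exactly 20 distinct cells should become NaN, one option is to sample flat indices without replacement and convert them to row and column indices. A minimal sketch, using a synthetic 150 x 4 array in place of the iris data (the array contents and the seed are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # arbitrary seed, for reproducibility

# Synthetic stand-in for the 150 x 4 iris measurements
data = rng.uniform(0, 8, size=(150, 4))

# Draw 20 distinct flat indices, then convert them to (row, col) pairs
flat_idx = rng.choice(data.size, size=20, replace=False)
rows, cols = np.unravel_index(flat_idx, data.shape)
data[rows, cols] = np.nan

print(np.isnan(data).sum())
```

Sampling without replacement guarantees that no cell is picked twice, so the NaN count is always 20.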

<a id='33'></a>
### *Exercise 33*. How to find the position of missing values in numpy array?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [34]:
# Question: Find the number and position of missing values in iris_2d's sepallength (1st column)

iris_data = np.genfromtxt('../input/iris/Iris.csv', delimiter=',', dtype='float', usecols=[1,2,3,4], skip_header=1)
iris_data[np.random.randint(len(iris_data), size=20),np.random.randint(4,size=20)] = np.nan

# Count the missing values in the whole dataset
print("Number of missing values in Iris data: \n", np.isnan(iris_data[:, :]).sum())

# Count the missing values in a single column (sepallength, column 0)
print("Number of missing values in any one feature of Iris data: \n", np.isnan(iris_data[:, 0]).sum())

print("Position of missing values: \n", np.where(np.isnan(iris_data[:, 0])))

Number of missing values in Iris data: 
 20
Number of missing values in any one feature of Iris data: 
 5
Position of missing values: 
 (array([ 38,  80, 106, 113, 121]),)


<a id='34'></a>
### *Exercise 34*. How to filter a numpy array based on two or more conditions?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [35]:
# Question: Filter the rows of iris_2d that have petallength (3rd column) > 1.5 and sepallength (1st column) < 5.0

iris_data = np.genfromtxt('../input/iris/Iris.csv', delimiter=',', dtype='float', usecols=[1,2,3,4], skip_header=1)

# Solution
iris_data[(iris_data[:, 2] > 1.5) & (iris_data[:, 0] < 5.0)]

array([[4.8, 3.4, 1.6, 0.2],
       [4.8, 3.4, 1.9, 0.2],
       [4.7, 3.2, 1.6, 0.2],
       [4.8, 3.1, 1.6, 0.2],
       [4.9, 2.4, 3.3, 1. ],
       [4.9, 2.5, 4.5, 1.7]])

<a id='35'></a>
### *Exercise 35*. How to drop rows that contain a missing value from a numpy array?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [36]:
# Question: Select the rows of iris_2d that do not have any nan value.
# Here the Pima diabetes data is used instead of iris.

diabetes_data = np.genfromtxt('../input/pima-indians-diabetes-database/diabetes.csv', delimiter=',', dtype='float', usecols=[0,1,2,3,4,5,6,7], skip_header=1)
diabetes_data[np.random.randint(150, size=20), np.random.randint(4, size=20)] = np.nan
diabetes_data[np.sum(np.isnan(diabetes_data), axis = 1) == 0][:5]

array([[  1.   ,  85.   ,  66.   , ...,  26.6  ,   0.351,  31.   ],
       [  8.   , 183.   ,  64.   , ...,  23.3  ,   0.672,  32.   ],
       [  1.   ,  89.   ,  66.   , ...,  28.1  ,   0.167,  21.   ],
       [  0.   , 137.   ,  40.   , ...,  43.1  ,   2.288,  33.   ],
       [  3.   ,  78.   ,  50.   , ...,  31.   ,   0.248,  26.   ]])
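
An equivalent and slightly more direct way to build the row mask is `any(axis=1)`. A small sketch on a made-up array:

```python
import numpy as np

# Made-up data with NaNs in rows 1 and 3
data = np.array([[1.0, 2.0, 3.0],
                 [4.0, np.nan, 6.0],
                 [7.0, 8.0, 9.0],
                 [np.nan, np.nan, 12.0]])

# True for every row that contains at least one NaN
has_nan = np.isnan(data).any(axis=1)

# Keep only the complete rows
clean = data[~has_nan]

print(clean)
```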

<a id='36'></a>
### *Exercise 36*. How to find the correlation between two columns of a numpy array?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [37]:
# question: Find the correlation between SepalLength(1st column) and PetalLength(3rd column) in iris_2d
# Instead of using the iris data, I use the Pima diabetes data and find the correlation between Glucose (column index 1) and BMI (column index 5).

diabetes_data = np.genfromtxt('../input/pima-indians-diabetes-database/diabetes.csv',
                              delimiter=',', dtype='float', usecols=[0,1,2,3,4,5,6,7], skip_header=1)

print(np.corrcoef(diabetes_data[:, 1], diabetes_data[:, 5]))

print('\n')
# you can get correlation by getting value at index [0,1] or [1,0]
print(np.corrcoef(diabetes_data[:, 1], diabetes_data[:, 5])[0,1])

[[1.    0.221]
 [0.221 1.   ]]


0.221071069458983
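
To see what `np.corrcoef` computes, the Pearson coefficient can be written out by hand: the sum of products of the centered values, divided by the square root of the product of their sums of squares. A sketch on two small made-up vectors:

```python
import numpy as np

# Made-up sample data
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 1.0, 4.0, 3.0, 5.0])

# Center both variables
xd = x - x.mean()
yd = y - y.mean()

# Pearson correlation written out explicitly
r_manual = (xd * yd).sum() / np.sqrt((xd ** 2).sum() * (yd ** 2).sum())

# Same quantity from NumPy (off-diagonal entry of the 2 x 2 matrix)
r_numpy = np.corrcoef(x, y)[0, 1]

print(r_manual, r_numpy)
```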


<a id='37'></a>
### *Exercise 37*. How to find if a given array has any null values?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [38]:
# question: Find out if iris_2d has any missing values.
diabetes_data = np.genfromtxt('../input/pima-indians-diabetes-database/diabetes.csv',
                              delimiter=',', dtype='float', usecols=[0,1,2,3,4,5,6,7], skip_header=1)

np.isnan(diabetes_data).any()

False

<a id='38'></a>
### *Exercise 38*. How to replace all missing values with 0 in a numpy array?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [39]:
# Question: Replace all occurrences of nan with 0 in numpy array

wine_quality = np.genfromtxt('../input/red-wine-quality-cortez-et-al-2009/winequality-red.csv',
                             delimiter=',', dtype='float', usecols=[0,1,2,3,4,5,6,7,8,9,10], skip_header=1)

wine_quality[np.random.randint(len(wine_quality), size=20), np.random.randint(11, size=20)] = np.nan

print("Does dataset have any Nan value:",np.isnan(wine_quality).any())

wine_quality[np.isnan(wine_quality)] = 0

print("Does dataset have any Nan value:",np.isnan(wine_quality).any())

Does dataset have any Nan value: True
Does dataset have any Nan value: False
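
The boolean-mask assignment above modifies the array in place. When the original should be kept, `np.nan_to_num` returns a filled copy by default; its `nan` keyword is available from NumPy 1.17. A minimal sketch on made-up data:

```python
import numpy as np

# Made-up data with two missing values
data = np.array([1.5, np.nan, 3.0, np.nan])

# Returns a new array with NaN replaced by 0.0; data is left unchanged
filled = np.nan_to_num(data, nan=0.0)

print(filled)
print(data)
```

Note that `np.nan_to_num` also replaces positive and negative infinity with large finite numbers unless told otherwise.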


<a id='39'></a>
### *Exercise 39*. How to find the count of unique values in a numpy array?

<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [40]:
# Question: Find the unique values and their counts in a column of the mushroom data
# The code below uses the class column (column index 0) rather than habitat.

mushroom = np.genfromtxt('../input/mushroom-classification/mushrooms.csv',
                             delimiter=',', dtype=object, usecols=[0], skip_header=1)
#mushroom
np.unique(mushroom, return_counts=True)

(array([b'e', b'p'], dtype=object), array([4208, 3916]))

<a id='40'></a>
### *Exercise 40*. How to convert a numeric to a categorical (text) array?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [41]:
# Question: Bin the petal length (3rd) column of iris_2d to form a text array, such that if petal length is:

# Less than 3 --> 'small'
# 3-5 --> 'medium'
# >=5 --> 'large'

# Solution

iris_data = np.genfromtxt('../input/iris/Iris.csv', delimiter=',', dtype=object, usecols=[3], skip_header=1)

bins = np.array([0, 3, 5, 7])
inds = np.digitize(iris_data.astype('float'), bins)

labels = {1:'small', 2: 'medium', 3:'large'}
iris_cat_data = [labels[x] for x in inds]
iris_cat_data[:10]

['small',
 'small',
 'small',
 'small',
 'small',
 'small',
 'small',
 'small',
 'small',
 'small']
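
The dictionary lookup in a list comprehension can also be replaced by indexing into a label array, which keeps the result as a NumPy array. A sketch with made-up petal lengths (the values are assumptions, not taken from the dataset):

```python
import numpy as np

# Made-up petal lengths covering all three bins
petal_length = np.array([1.4, 4.7, 5.0, 6.0, 2.9, 3.0])

bins = np.array([0, 3, 5, 7])
# digitize returns 1, 2 or 3 for values in [0, 3), [3, 5), [5, 7)
inds = np.digitize(petal_length, bins)

# Index into the label array; bin numbers start at 1, so shift by one
labels = np.array(['small', 'medium', 'large'])
categories = labels[inds - 1]

print(categories)
```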

<a id='41'></a>
### *Exercise 41*. How to create a new column from existing columns of a numpy array?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [42]:
# Question: Create a new column for volume in iris_2d, where volume is (pi x petallength x sepal_length^2)/3

iris_data = np.genfromtxt('../input/iris/Iris.csv', delimiter=',', 
                          dtype=object, usecols=[1,2,3,4], skip_header=1)

sepallength = iris_data[:, 0].astype('float')
petallength = iris_data[:, 2].astype('float')

new_column = (np.pi * petallength * (sepallength**2))/3

new_column = new_column[:, np.newaxis]
#new_column

# Add the new column
out = np.hstack([iris_data, new_column])

# View
out[:4]

array([[b'5.1', b'3.5', b'1.4', b'0.2', 38.13265162927291],
       [b'4.9', b'3.0', b'1.4', b'0.2', 35.200498485922445],
       [b'4.7', b'3.2', b'1.3', b'0.2', 30.0723720777127],
       [b'4.6', b'3.1', b'1.5', b'0.2', 33.238050274980004]], dtype=object)
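
Because `iris_data` has dtype `object`, the stacked result above mixes byte strings with floats. The sketch below uses a small hand-typed array (the first two rows from the output above) and assumes an all-float result is preferred. `raw` and `measurements` are illustrative names.

```python
import numpy as np

# First two rows of the four measurement columns, as byte strings (object dtype)
raw = np.array([[b'5.1', b'3.5', b'1.4', b'0.2'],
                [b'4.9', b'3.0', b'1.4', b'0.2']], dtype=object)

# Convert once so every column is numeric
measurements = raw.astype(float)

sepallength = measurements[:, 0]
petallength = measurements[:, 2]
volume = (np.pi * petallength * sepallength**2) / 3

# column_stack appends the 1D volume as a new last column
out = np.column_stack([measurements, volume])
print(out.shape, out.dtype)
```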

<a id='42'></a>
### *Exercise 42*. How to do probabilistic sampling in numpy?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [43]:
# Question: Randomly sample iris's species such that setosa is twice the number of versicolor and virginica

iris_data = np.genfromtxt('../input/iris/Iris.csv', delimiter=',', 
                          dtype=object, usecols=[1,2,3,4,5], skip_header=1)

np.random.seed(100)
a = np.array(['Iris-setosa', 'Iris-versicolor', 'Iris-virginica'])
species_out = np.random.choice(a, 150, p=[0.5, 0.25, 0.25])
print(np.unique(species_out, return_counts=True))

(array(['Iris-setosa', 'Iris-versicolor', 'Iris-virginica'], dtype='<U15'), array([77, 37, 36]))
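
The counts above (77, 37, 36) are only approximately 2:1:1 because `p` sets sampling probabilities, not exact counts. As a hedged alternative, the same draw can be sketched with NumPy's `Generator` API. The seed value here is arbitrary, and its output will not match the legacy `np.random.seed` result above.

```python
import numpy as np

# Generator-based API; the seed value is arbitrary
rng = np.random.default_rng(100)

species = np.array(['Iris-setosa', 'Iris-versicolor', 'Iris-virginica'])
# p gives setosa twice the probability of each other species
species_out = rng.choice(species, size=150, p=[0.5, 0.25, 0.25])

values, counts = np.unique(species_out, return_counts=True)
print(dict(zip(values, counts)))
```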


<a id='43'></a>
### *Exercise 43*. How to get the second largest value of an array when grouped by another array?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [44]:
# Question: What is the value of the second longest petallength of species setosa?
# For this question, I find the second highest BloodPressure (column index 2) where Outcome is 1
diabetes_data = np.genfromtxt('../input/pima-indians-diabetes-database/diabetes.csv',
                              delimiter=',', dtype=object, usecols=[0,1,2,3,4,5,6,7,8], skip_header=1)

# Solution
bloodpressure = diabetes_data[diabetes_data[:, 8] == b'1', 2].astype('float')

# np.unique returns the distinct values already sorted, so [-2] is the second highest
np.unique(bloodpressure)[-2]

110.0

<a id='44'></a>
### *Exercise 44*. How to sort a 2D array by a column?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [45]:
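# Question: Sort the iris dataset based on sepallength column.
# In this problem, I sort the diabetes dataset by Glucose (column index 1).
# Note: the values are loaded as byte strings, so argsort orders them
# lexicographically (b'99' sorts after b'100'), not numerically.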
diabetes_data = np.genfromtxt('../input/pima-indians-diabetes-database/diabetes.csv',
                              delimiter=',', dtype=object, usecols=[0,1,2,3,4,5,6,7,8], skip_header=1)

diabetes_data[diabetes_data[:,1].argsort()]

array([[b'1', b'0', b'74', ..., b'0.299', b'21', b'0'],
       [b'5', b'0', b'80', ..., b'0.346', b'37', b'1'],
       [b'1', b'0', b'68', ..., b'0.389', b'22', b'0'],
       ...,
       [b'4', b'99', b'72', ..., b'0.294', b'28', b'0'],
       [b'4', b'99', b'76', ..., b'0.223', b'21', b'0'],
       [b'2', b'99', b'60', ..., b'0.453', b'21', b'0']], dtype=object)
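
The output above ends with `b'99'` because byte strings compare character by character. The sketch below uses a small made-up table (not the diabetes file) to contrast that ordering with a numeric sort, assuming the key column can be cast to float.

```python
import numpy as np

# Hypothetical two-column table loaded as byte strings, like the genfromtxt call above
data = np.array([[b'1', b'99'],
                 [b'2', b'148'],
                 [b'3', b'0'],
                 [b'4', b'85']], dtype=object)

# Byte-string order: b'148' < b'85' < b'99'
lexicographic = data[data[:, 1].argsort()]

# Numeric order: cast the key column to float before argsort
numeric = data[data[:, 1].astype(float).argsort()]

print(lexicographic[:, 1])
print(numeric[:, 1])
```

Only the sort key is cast, so the rows themselves keep their original byte-string values.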

<a id='45'></a>
### *Exercise 45*. How to find the most frequent value in a numpy array?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [46]:
# Question: Find the most frequent value of petal length (3rd column) in iris dataset.

iris_data = np.genfromtxt('../input/iris/Iris.csv', delimiter=',', 
                          dtype=object, usecols=[1,2,3,4,5], skip_header=1)

v,c = np.unique(iris_data[:, 2], return_counts=True)
v[np.argmax(c)]

b'1.5'

<a id='46'></a>
### *Exercise 46*. How to find the position of the first occurrence of a value greater than a given value?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [47]:
# Question: Find the position of the first occurrence of a value greater than 1.0 in 
# petalwidth 4th column of iris dataset.
iris_data = np.genfromtxt('../input/iris/Iris.csv', delimiter=',', 
                          dtype=object, usecols=[4], skip_header=1)

np.argwhere(iris_data[:].astype(float) > 1.0)[0]

array([50])
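
`np.argwhere(...)[0]` raises an `IndexError` when no value exceeds the threshold. A possible alternative is sketched below on made-up widths using `np.argmax` on the boolean mask. The `-1` sentinel for "not found" is an assumption of this sketch, not part of the original exercise.

```python
import numpy as np

# Made-up petal widths for illustration
petalwidth = np.array([0.2, 0.4, 1.0, 1.3, 0.3, 1.8])

mask = petalwidth > 1.0
# argmax returns the index of the first True; guard the all-False case,
# where argmax would also return 0
first_pos = int(np.argmax(mask)) if mask.any() else -1
print(first_pos)
```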

<a id='47'></a>
### *Exercise 47*. How to replace all values greater than a given value to a given cutoff?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [48]:
# Question: From the array a, replace all values greater than 30 with 30 and all values less than 10 with 10.

# Solution

np.set_printoptions(precision=2)
np.random.seed(100)
a = np.random.uniform(1,50, 20)

a[a<10]=10
a[a>30]=30
np.set_printoptions(threshold=20)
a

array([27.63, 14.64, 21.8 , 30.  , 10.  , 10.  , 30.  , 30.  , 10.  ,
       29.18, 30.  , 11.25, 10.08, 10.  , 11.77, 30.  , 30.  , 10.  ,
       30.  , 14.43])
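
The boolean assignments above modify `a` in place. If the original array should stay untouched, `np.clip` or nested `np.where` calls are possible alternatives. This is a sketch using the same seed and bounds as the cell above.

```python
import numpy as np

np.random.seed(100)
a = np.random.uniform(1, 50, 20)
original = a.copy()

# np.clip returns a new array and leaves `a` untouched
clipped = np.clip(a, 10, 30)

# Equivalent result with nested np.where
clipped_where = np.where(a < 10, 10, np.where(a > 30, 30, a))
print(clipped)
```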

<a id='48'></a>
### *Exercise 48*. How to get the positions of top n values from a numpy array?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [49]:
# Question: Get the positions of top 5 maximum values in a given array a.
np.random.seed(100)
a = np.random.uniform(1,50, 20)
a
sort = a.argsort()
print('Positions')
sort[-5:][::-1]
print('Values')
a[sort][-5:][::-1]

array([27.63, 14.64, 21.8 , 42.39,  1.23,  6.96, 33.87, 41.47,  7.7 ,
       29.18, 44.67, 11.25, 10.08,  6.31, 11.77, 48.95, 40.77,  9.43,
       41.  , 14.43])

Positions


array([15, 10,  3,  7, 18])

Values


array([48.95, 44.67, 42.39, 41.47, 41.  ])
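
A full `argsort` orders all 20 elements even though only five are needed. The sketch below shows one alternative: `np.argpartition` first isolates the top `n` indices, then only those are sorted. It uses the same seed as the cell above, and `n`, `top_unsorted` and `top_positions` are illustrative names.

```python
import numpy as np

np.random.seed(100)
a = np.random.uniform(1, 50, 20)

n = 5
# argpartition puts the indices of the n largest values in the last n slots,
# in no particular order
top_unsorted = np.argpartition(a, -n)[-n:]
# Sort just those n indices by value, largest first
top_positions = top_unsorted[np.argsort(a[top_unsorted])[::-1]]
print(top_positions)
print(a[top_positions])
```

For small arrays the difference is negligible. The partition approach mainly pays off when the array is large and `n` is small.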

<a id='49'></a>
### *Exercise 49*. How to compute the row wise counts of all possible values in an array?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [50]:
# Question: Compute the counts of unique values row-wise.

# Solution
def counts_of_all_values_rowwise(arr2d):
    # Unique values and their counts, row by row
    num_counts_array = [np.unique(row, return_counts=True) for row in arr2d]

    # For every value present anywhere in arr2d, look up its count in each row (0 if absent)
    all_values = np.unique(arr2d)
    return [[int(b[a == i][0]) if i in a else 0 for i in all_values] for a, b in num_counts_array]

np.random.seed(100)
np.set_printoptions(threshold=10)
arr = np.random.randint(1,11,size=(6, 10))
arr
print(np.arange(1,11))
counts_of_all_values_rowwise(arr)

array([[ 9,  9,  4, ...,  3,  6,  3],
       [ 3,  3,  2, ..., 10,  7,  3],
       [ 5,  2,  6, ...,  8,  2,  2],
       [ 8,  8,  1, ...,  3,  6,  9],
       [ 2,  1,  8, ...,  3,  6,  2],
       [ 9,  2,  6, ...,  6,  1, 10]])

[ 1  2  3  4  5  6  7  8  9 10]


[[1, 0, 2, 1, 1, 1, 0, 2, 2, 0],
 [2, 1, 3, 0, 1, 0, 1, 0, 1, 1],
 [0, 3, 0, 2, 3, 1, 0, 1, 0, 0],
 [1, 0, 2, 1, 0, 1, 0, 2, 1, 2],
 [2, 2, 2, 0, 0, 1, 1, 1, 1, 0],
 [1, 1, 1, 1, 1, 2, 0, 0, 2, 1]]
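
The same row-wise counts can also be computed without Python loops by broadcasting a comparison against every candidate value. This is a sketch of that alternative, using the same seed and shape as the cell above. It builds an intermediate boolean array of shape `(rows, cols, n_values)`, so it suits small arrays.

```python
import numpy as np

np.random.seed(100)
arr = np.random.randint(1, 11, size=(6, 10))

values = np.unique(arr)
# Compare every element with every candidate value: shape (rows, cols, n_values),
# then count the matches along the column axis
counts = (arr[:, :, np.newaxis] == values).sum(axis=1)
print(counts)
```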

<a id='50'></a>
### *Exercise 50*. How to convert an array of arrays into a flat 1d array?
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to TOC</a>

In [51]:
# Question: Convert array_of_arrays into a flat linear 1d array.
arr1 = np.arange(3)
arr2 = np.arange(3,7)
arr3 = np.arange(7,10)

# Join the three arrays into one flat 1D array
arr_1d = np.concatenate([arr1, arr2, arr3])
print(arr_1d)

[0 1 2 3 4 5 6 7 8 9]
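
The question mentions `array_of_arrays`, which the cell above does not build. The sketch below shows one way it could be constructed and flattened, assuming the three arrays are first wrapped in a single object array. `flat` is an illustrative name.

```python
import numpy as np

arr1 = np.arange(3)
arr2 = np.arange(3, 7)
arr3 = np.arange(7, 10)

# The arrays have different lengths, so dtype=object is needed to hold them in one array
array_of_arrays = np.array([arr1, arr2, arr3], dtype=object)

# Concatenate the inner arrays into a single flat 1D array
flat = np.concatenate(list(array_of_arrays))
print(flat)
```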


## I hope you have learned something from this kernel.
## <font color='red'> If you like this kernel, don't forget to upvote. </font>

## Happy Learning