# NumPy 

NumPy (or Numpy) is a numerical computing and linear algebra library for Python.  
Almost every library in the PyData ecosystem uses it as one of its main building blocks.
 
NumPy comes preinstalled with Anaconda and IBM DSX. If for some reason it isn't installed, you can install it from your command prompt with one of the following commands.

If you have Anaconda:
    
    conda install numpy
    
If you do not have Anaconda:
    
    pip install numpy
    
    
API documentation: https://numpy.org/doc/stable/reference/index.html

## Using NumPy

Once you've installed NumPy you can import it as a library:

In [1]:
import numpy as np


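Memory: elements are stored in sequence (one contiguous block), with no per-element pointers.

Efficiency: an array holds one type only, so operations do not need to check the type of each element.
Many built-in operations work directly on arrays and matrices.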

NumPy arrays have a fixed size at creation, unlike Python lists (which can grow dynamically). Changing the size of an ndarray creates a new array and deletes the original.
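
The points above can be checked with a short sketch. It assumes only `import numpy as np`; the names `fixed`, `mixed` and `grown` are illustrative and not part of the original notebook.

```python
import numpy as np

# Every element of an ndarray shares a single dtype.
fixed = np.array([1, 2, 3])
print(fixed.dtype)        # an integer dtype (the exact width depends on the platform)

# Mixing ints and floats upcasts the whole array to one common type.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)        # float64

# "Growing" an array returns a new array; the original keeps its size.
grown = np.append(fixed, 4)
print(fixed.shape, grown.shape)   # (3,) (4,)
print(grown is fixed)             # False
```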


# NumPy Indexing and Selection

In this lecture we will discuss how to select elements or groups of elements from an array.

In [2]:
#Creating sample array
arr = np.arange(0,11)

In [3]:
#Show
arr

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

#### Creating an array of 10 zeros 

In [4]:
np.zeros(10)

array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])

## Bracket Indexing and Selection
The simplest way to pick one or more elements of an array looks very similar to indexing Python lists:

In [5]:
#Get a value at an index
arr[8]

8

In [6]:
#Get values in a range
arr[1:5]

array([1, 2, 3, 4])

In [7]:
#Setting a value with index range
arr[0:5]=100

#Show
arr

array([100, 100, 100, 100, 100,   5,   6,   7,   8,   9,  10])

In [8]:
# Reset array
arr = np.arange(0,11)

#Show
arr

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [9]:
#Important notes on Slices
slice_of_arr = arr[0:6]

#Show slice
slice_of_arr

array([0, 1, 2, 3, 4, 5])

In [10]:
#Change Slice
slice_of_arr[:]=99

#Show Slice again
slice_of_arr

array([99, 99, 99, 99, 99, 99])

Now note the changes also occur in our original array!

In [11]:
arr

array([99, 99, 99, 99, 99, 99,  6,  7,  8,  9, 10])

When we take a slice, or bind an array to another variable using assignment, the data is not copied. A slice is a view of the original array, so writing to it changes the original too. This makes memory usage more efficient.

In [12]:
#To get a copy, need to be explicit
arr_copy = arr.copy()

arr_copy

array([99, 99, 99, 99, 99, 99,  6,  7,  8,  9, 10])
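
To make the difference between a view and a copy explicit, here is a small sketch using `np.shares_memory`. The names `base`, `view` and `copy` are illustrative and separate from the `arr` used in the cells above.

```python
import numpy as np

base = np.arange(0, 11)

view = base[0:6]          # slicing returns a view onto the same data
copy = base[0:6].copy()   # .copy() allocates independent data

print(np.shares_memory(base, view))   # True
print(np.shares_memory(base, copy))   # False

copy[:] = -1              # changes only the copy
print(base[:6])           # [0 1 2 3 4 5]

view[:] = 99              # writes through to the original array
print(base[:6])           # [99 99 99 99 99 99]
```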

## Indexing a 2D array (matrices)

The general format is **arr_2d[row][col]** or **arr_2d[row,col]**. I recommend usually using the comma notation for clarity.

In [13]:
arr_2d = np.array(([5,10,15],[20,25,30],[35,40,45]))

#Show
arr_2d

array([[ 5, 10, 15],
       [20, 25, 30],
       [35, 40, 45]])

In [14]:
#Indexing row
arr_2d[1] # second row (index 1)

array([20, 25, 30])

In [15]:
# Format is arr_2d[row][col] or arr_2d[row,col]

# Getting individual element value
arr_2d[1][0]

20

In [16]:
# Getting individual element value
arr_2d[1,0] # Option 2

20

In [17]:
# 2D array slicing
print(arr_2d)
#Shape (2,2) from top right corner
arr_2d[:2,1:]

[[ 5 10 15]
 [20 25 30]
 [35 40 45]]


array([[10, 15],
       [25, 30]])

In [18]:
#Get the middle column
arr_2d[:,1]

array([10, 25, 40])

### Fancy Indexing

Fancy indexing means passing an array (or list) of indices to access multiple array elements at once, which lets you select entire rows or columns in any order.
To see it in action, let's quickly build a NumPy array:

In [71]:
#Set up matrix
arr2d = np.zeros((10,5))
arr2d

array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])

In [72]:
arr2d.shape

(10, 5)

In [74]:
#Number of rows
arr_length = arr2d.shape[0] # shape[0] is the number of rows; shape[1] would give the number of columns
arr_length

10

In [22]:
#Set up array
for i in range(arr_length):
    arr2d[i] = i
#How should we do it on the columns?    
arr2d

array([[0., 0., 0., 0., 0.],
       [1., 1., 1., 1., 1.],
       [2., 2., 2., 2., 2.],
       [3., 3., 3., 3., 3.],
       [4., 4., 4., 4., 4.],
       [5., 5., 5., 5., 5.],
       [6., 6., 6., 6., 6.],
       [7., 7., 7., 7., 7.],
       [8., 8., 8., 8., 8.],
       [9., 9., 9., 9., 9.]])
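
One possible answer to the question about columns in the cell above, as a sketch: assign to `grid[:, j]` instead of `grid[i]`. The array `grid` is a separate, illustrative array so that `arr2d` stays unchanged.

```python
import numpy as np

grid = np.zeros((10, 5))

# Assigning to grid[:, j] writes the whole j-th column at once.
for j in range(grid.shape[1]):
    grid[:, j] = j

print(grid[0])    # [0. 1. 2. 3. 4.]

# Broadcasting gives the same result without a loop.
same = np.zeros((10, 5)) + np.arange(5)
print(np.array_equal(grid, same))   # True
```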

Fancy indexing allows the following:

In [23]:
arr2d[[2,4,6,8]]

array([[2., 2., 2., 2., 2.],
       [4., 4., 4., 4., 4.],
       [6., 6., 6., 6., 6.],
       [8., 8., 8., 8., 8.]])

In [24]:
# Allows in any order
arr2d[[6,4,2,7]]

array([[6., 6., 6., 6., 6.],
       [4., 4., 4., 4., 4.],
       [2., 2., 2., 2., 2.],
       [7., 7., 7., 7., 7.]])

Fancy indexing also works in multiple dimensions. Consider the following array:

In [25]:
X = np.arange(12).reshape((3, 4))
X

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

Like with standard indexing, the first index refers to the row, and the second to the column:

In [26]:
row = np.array([1, 1, 2]) # row indices into the matrix X above
col = np.array([2, 1, 3]) # column indices, paired one-to-one with row
X[row, col] # picks X[1, 2], X[1, 1], X[2, 3]: one value per (row, col) pair

array([ 6,  5, 11])

Notice that the first value in the result is X[1, 2], the second is X[1, 1], and the third is X[2, 3]. The pairing of indices in fancy indexing follows all the broadcasting rules.

So, for example, if we combine a column vector and a row vector within the indices, we get a two-dimensional result:

In [27]:
row[:, np.newaxis] # np.newaxis adds a new dimension; in the column position it turns row into a column vector

array([[1],
       [1],
       [2]])

In [28]:
X

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [29]:
X[row[:, np.newaxis], col] # each output row uses one row index with every column index

array([[ 6,  5,  7],
       [ 6,  5,  7],
       [10,  9, 11]])

In [30]:
X[row, col[:, np.newaxis]] # each output row uses one column index with every row index

array([[ 6,  6, 10],
       [ 5,  5,  9],
       [ 7,  7, 11]])

**numpy.newaxis** *adds one new dimension of length 1 to an existing array each time it is used.*
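
A short sketch of what `np.newaxis` does to the shape of an array. The name `v` is illustrative; its values match the `row` array used above.

```python
import numpy as np

v = np.array([1, 1, 2])

print(v.shape)                   # (3,)
print(v[:, np.newaxis].shape)    # (3, 1)  column vector
print(v[np.newaxis, :].shape)    # (1, 3)  row vector

# np.newaxis is an alias for None, and reshape gives the same column.
print(np.newaxis is None)                                  # True
print(np.array_equal(v[:, np.newaxis], v.reshape(-1, 1)))  # True
```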

# Selection

Let's briefly go over how to use brackets for selection based on comparison operators.

In [31]:
arr = np.arange(0,11)
arr

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [32]:
arr > 4

array([False, False, False, False, False,  True,  True,  True,  True,
        True,  True])

To get the values of arr for which the condition is True:

In [33]:
bool_arr = arr>4

In [34]:
bool_arr

array([False, False, False, False, False,  True,  True,  True,  True,
        True,  True])

In [35]:
arr[bool_arr] # shows the values where the mask is True

array([ 5,  6,  7,  8,  9, 10])

Another way to get the values of the array according to selection conditions:

In [36]:
arr[arr>2]

array([ 3,  4,  5,  6,  7,  8,  9, 10])

In [37]:
x = 2
arr[arr>x]

array([ 3,  4,  5,  6,  7,  8,  9, 10])
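
Conditions can also be combined. The sketch below assumes the element-wise operators `&`, `|` and `~`, with each comparison wrapped in parentheses; the array `values` is illustrative.

```python
import numpy as np

values = np.arange(0, 11)

# Combine masks with & (and), | (or), ~ (not); parenthesise each comparison.
middle = values[(values > 2) & (values < 8)]
print(middle)     # [3 4 5 6 7]

edges = values[(values < 2) | (values > 8)]
print(edges)      # [ 0  1  9 10]

# A mask can also be used to assign in place.
values[values % 2 == 1] = 0
print(values)     # [ 0  0  2  0  4  0  6  0  8  0 10]
```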

## Broadcasting

NumPy arrays differ from normal Python lists in their ability to broadcast:

The term broadcasting describes how numpy treats arrays with different shapes during arithmetic operations. Subject to certain constraints, the smaller array is “broadcast” across the larger array so that they have compatible shapes.

In [38]:
a = np.array([[1,2,3],
              [4,5,6]])

a + np.array([1,2,3])

array([[2, 4, 6],
       [5, 7, 9]])

In [39]:
a + np.array([[1,2]]).T # transposed to a column: adds 1 to the first row and 2 to the second row

array([[2, 3, 4],
       [6, 7, 8]])

In [40]:
np.array([[1,2]]).T

array([[1],
       [2]])
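
The "certain constraints" can be sketched as follows: shapes are compared from the last dimension backwards, and two sizes are compatible when they are equal or one of them is 1. The failing case at the end is an illustrative addition.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])          # shape (2, 3)

# (3,) lines up with the trailing dimension and is stretched over the rows.
print((a + np.array([1, 2, 3])).shape)       # (2, 3)

# (2, 1) has size 1 in the last dimension and is stretched over the columns.
print((a + np.array([[1], [2]])).shape)      # (2, 3)

# A length-2 vector does not line up with the trailing dimension of size 3.
try:
    a + np.array([1, 2])
except ValueError as err:
    print("broadcast failed:", err)
```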

# NumPy Exercises 

Now that we've learned about NumPy let's test your knowledge. We'll start off with a few simple tasks, and then you'll be asked some more complicated questions.

#### 1. Create an array of 10 ones

In [41]:
arr = np.ones(10)
arr

array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])

#### 2. Create an array of the integers from 10 to 50

In [42]:
arr = np.arange(10,51)
arr

array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26,
       27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43,
       44, 45, 46, 47, 48, 49, 50])

#### 3. Create a 3x3 matrix with values ranging from 0 to 8

In [43]:
arr_2d = np.array(([0,1,2],[3,4,5],[6,7,8]))

#Show
arr_2d

array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

#### 4. Use NumPy to generate a random number between 0 and 1

#### 5. Use NumPy to generate an array of 25 random numbers sampled from a standard normal distribution
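
A possible sketch for exercises 4 and 5, assuming the `np.random.default_rng` Generator API (NumPy 1.17 or later). The seed value is an arbitrary choice that only makes reruns repeatable.

```python
import numpy as np

# The seed is arbitrary; it only makes the output reproducible.
rng = np.random.default_rng(seed=42)

# Exercise 4: one float drawn uniformly from [0, 1)
single = rng.random()
print(single)

# Exercise 5: 25 samples from a standard normal distribution
samples = rng.standard_normal(25)
print(samples.shape)    # (25,)
```

With the older interface, `np.random.rand(1)` and `np.random.randn(25)` give equivalent draws.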

## Append:

**Append of list**

In [44]:
l1 = [1,2,3]
l1.append([4,5])
print('l1 append {}'.format(l1))

# Extend:

l2 = [1,2,3]
l2.extend([4,5])
print('l2 extend {}'.format(l2))

l1 append [1, 2, 3, [4, 5]]
l2 extend [1, 2, 3, 4, 5]


*Formatting output using the format method:
The **format()** method of strings was added in Python 2.6. You use {} to mark where a value will be substituted and can add detailed formatting directives inside the braces; the values to be formatted are passed as arguments to format(). This method lets us combine elements within an output through positional formatting.*
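
A few illustrative uses of `format()`. The strings and the names `name`, `items` and `ratio` are made up for this sketch.

```python
name = "l1"
items = [1, 2, 3]
ratio = 2 / 3

# Empty braces are filled left to right.
print("{} holds {}".format(name, items))                # l1 holds [1, 2, 3]

# Numbered braces pick arguments by position and may repeat.
print("{0} and again {0}, then {1}".format("a", "b"))   # a and again a, then b

# A format spec after the colon controls presentation.
formatted = "{:.3f}".format(ratio)
print(formatted)                                        # 0.667
```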


**Append of numpy array**

In [45]:
# No axis parameter - both inputs are flattened and joined into one 1D array
a1 = np.array([1,2,3])
a1 = np.append(a1, [4,5]) #like extend of list
#print('a1 append {}'.format(list(a1)))
a1

array([1, 2, 3, 4, 5])

In [46]:
# axis=0 - append (extend) axis 0 together
a3 = np.array([[1,2,3]])
a3 = np.append(a3, [[4,5,6]], axis=0) # shapes must match on every axis except axis 0; axis=0 adds rows
#print('a3 append {}'.format(a3))
a3

array([[1, 2, 3],
       [4, 5, 6]])

In [47]:
# axis=1 - append (extend) axis 1 together
a5 = np.array([[1,2,3], [4,5,6]])
a5 = np.append(a5, [[10, 11],[12, 13]], axis=1)  # shapes must match on every axis except axis 1; axis=1 adds columns
#print('a5 append {}'.format(a5))
a5

array([[ 1,  2,  3, 10, 11],
       [ 4,  5,  6, 12, 13]])
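
The shape requirement can be sketched like this. The comparison with `np.concatenate` and the failing call are illustrative additions, and `base`, `rows` and `same` are made-up names.

```python
import numpy as np

base = np.array([[1, 2, 3],
                 [4, 5, 6]])

# With an axis given, np.append behaves like np.concatenate on that axis.
rows = np.append(base, [[7, 8, 9]], axis=0)
same = np.concatenate((base, [[7, 8, 9]]), axis=0)
print(np.array_equal(rows, same))    # True

# A 1D value does not match the 2D array, so this raises ValueError.
try:
    np.append(base, [7, 8, 9], axis=0)
except ValueError as err:
    print("append failed:", err)

# np.append never modifies its input.
print(base.shape, rows.shape)        # (2, 3) (3, 3)
```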

# Automate reshape:


## Reshaping arrays

Reshaping means changing the shape of an array.

The shape of an array is the number of elements in each dimension.

By reshaping we can add or remove dimensions or change number of elements in each dimension.

In [48]:
arr = np.arange(25).reshape(5,5)
arr

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])

In [49]:
print(arr,'\n') # '\n' is a newline character; as a second argument it prints an extra blank line after the array
#print(arr)

[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]
 [20 21 22 23 24]] 



In [50]:
b_column = np.arange(10).reshape((-1,1)) # -1 lets NumPy infer the number of rows (10); the number of columns is 1
# output: a 2-dimensional column array of shape (10, 1)
b_column

array([[0],
       [1],
       [2],
       [3],
       [4],
       [5],
       [6],
       [7],
       [8],
       [9]])

In [51]:
b_row = np.arange(10).reshape((1,-1))
# output: a 2-dimensional row array of shape (1, 10)
b_row 


array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])
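
A sketch of how `-1` is resolved: NumPy infers that one dimension from the total number of elements. The array `flat` is illustrative.

```python
import numpy as np

flat = np.arange(12)

# -1 asks NumPy to infer that dimension from the total number of elements.
print(flat.reshape(3, -1).shape)     # (3, 4)
print(flat.reshape(-1, 6).shape)     # (2, 6)
print(flat.reshape(2, 3, -1).shape)  # (2, 3, 2)

# The element count must divide evenly, otherwise reshape raises ValueError.
try:
    flat.reshape(5, -1)
except ValueError as err:
    print("reshape failed:", err)

# reshape(-1) flattens back to one dimension.
print(flat.reshape(3, 4).reshape(-1).shape)   # (12,)
```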

## Dimension after slicing:

In [52]:
mat = np.arange(1,21).reshape(4,5)
#print(mat, '\n')
mat

array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10],
       [11, 12, 13, 14, 15],
       [16, 17, 18, 19, 20]])

In [53]:
mat[[0],2:4]

array([[3, 4]])

In [54]:
mat = np.arange(1,21).reshape(4,5)
print(mat, '\n')

print('mat[2:4,2]: \n{}\n'.format(mat[2:4,2])) # output: 1D vector; indexing is mat[row, column]
print('mat[2:4,2:3]: \n{}\n'.format(mat[2:4,2:3])) # output: stays 2D (matrix of two rows, one column)
print('mat[2,2:4]: \n{}\n'.format(mat[2,2:4])) # output: 1D vector
print('mat[2:3,2:4]: \n{}\n'.format(mat[2:3,2:4])) # output: stays 2D (matrix of one row)
print('mat[[2],2:4]: \n{}\n'.format(mat[[2],2:4])) # output: stays 2D (matrix of one row)

[[ 1  2  3  4  5]
 [ 6  7  8  9 10]
 [11 12 13 14 15]
 [16 17 18 19 20]] 

mat[2:4,2]: 
[13 18]

mat[2:4,2:3]: 
[[13]
 [18]]

mat[2,2:4]: 
[13 14]

mat[2:3,2:4]: 
[[13 14]]

mat[[2],2:4]: 
[[13 14]]
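
The rule behind these results can be checked through `.shape`: an integer index removes its axis, while a slice or a list of indices keeps it. A sketch on the same `mat`:

```python
import numpy as np

mat = np.arange(1, 21).reshape(4, 5)

# An integer index removes its axis; a slice or a list keeps it.
print(mat[2:4, 2].shape)     # (2,)
print(mat[2:4, 2:3].shape)   # (2, 1)
print(mat[2, 2:4].shape)     # (2,)
print(mat[2:3, 2:4].shape)   # (1, 2)
print(mat[[2], 2:4].shape)   # (1, 2)
print(mat[2, 2].ndim)        # 0
```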



## Universal Array Functions

NumPy comes with many [universal array functions](http://docs.scipy.org/doc/numpy/reference/ufuncs.html), which are mathematical operations applied element by element across an array. Let's show some common ones:

In [55]:
arr = np.arange(25).reshape(5,5)
arr

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])

In [56]:
#Taking Square Roots
np.sqrt(arr)

array([[0.        , 1.        , 1.41421356, 1.73205081, 2.        ],
       [2.23606798, 2.44948974, 2.64575131, 2.82842712, 3.        ],
       [3.16227766, 3.31662479, 3.46410162, 3.60555128, 3.74165739],
       [3.87298335, 4.        , 4.12310563, 4.24264069, 4.35889894],
       [4.47213595, 4.58257569, 4.69041576, 4.79583152, 4.89897949]])

In [57]:
np.max(arr) #same as arr.max()

24

In [58]:
np.sin(arr)

array([[ 0.        ,  0.84147098,  0.90929743,  0.14112001, -0.7568025 ],
       [-0.95892427, -0.2794155 ,  0.6569866 ,  0.98935825,  0.41211849],
       [-0.54402111, -0.99999021, -0.53657292,  0.42016704,  0.99060736],
       [ 0.65028784, -0.28790332, -0.96139749, -0.75098725,  0.14987721],
       [ 0.91294525,  0.83665564, -0.00885131, -0.8462204 , -0.90557836]])

### Exponents and logarithms

Another common type of operation available as NumPy ufuncs (universal functions) is the exponentials:

In [59]:
x = [1, 2, 3]
print("x     =", x)
print("e^x   =", np.exp(x))
print("2^x   =", np.exp2(x))
print("3^x   =", np.power(3, x))

x     = [1, 2, 3]
e^x   = [ 2.71828183  7.3890561  20.08553692]
2^x   = [2. 4. 8.]
3^x   = [ 3  9 27]


The inverse of the exponentials, the logarithms, are also available.
The basic ``np.log`` gives the natural logarithm; if you prefer to compute the base-2 logarithm or the base-10 logarithm, these are available as well:

In [60]:
x = [1, 2, 4, 10]
print("x        =", x)
print("ln(x)    =", np.log(x))
print("log2(x)  =", np.log2(x))
print("log10(x) =", np.log10(x))

x        = [1, 2, 4, 10]
ln(x)    = [0.         0.69314718 1.38629436 2.30258509]
log2(x)  = [0.         1.         2.         3.32192809]
log10(x) = [0.         0.30103    0.60205999 1.        ]
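
As a quick check that the logarithms invert the exponentials, here is a sketch using `np.allclose` to allow for floating-point rounding. The base-3 logarithm via the change-of-base identity is an illustrative addition.

```python
import numpy as np

x = np.array([1.0, 2.0, 4.0, 10.0])

# Each logarithm undoes the matching exponential, up to rounding error.
print(np.allclose(np.exp(np.log(x)), x))           # True
print(np.allclose(np.exp2(np.log2(x)), x))         # True
print(np.allclose(np.power(10, np.log10(x)), x))   # True

# Logarithm in another base b via the change-of-base identity.
b = 3
log3 = np.log(x) / np.log(b)
print(np.allclose(np.power(b, log3), x))           # True
```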


### Other aggregation functions

NumPy provides many other aggregation functions, but we won't discuss them in detail here.
Additionally, most aggregates have a ``NaN``-safe counterpart that computes the result while ignoring missing values, which are marked by the special IEEE floating-point ``NaN`` value (for a fuller discussion of missing data, see [Handling Missing Data](03.04-Missing-Values.ipynb)).
Some of these ``NaN``-safe functions were not added until NumPy 1.8, so they will not be available in older NumPy versions.

The following table provides a list of useful aggregation functions available in NumPy:

|Function Name      |   NaN-safe Version  | Description                                   |
|-------------------|---------------------|-----------------------------------------------|
| ``np.sum``        | ``np.nansum``       | Compute sum of elements                       |
| ``np.prod``       | ``np.nanprod``      | Compute product of elements                   |
| ``np.mean``       | ``np.nanmean``      | Compute mean of elements                      |
| ``np.std``        | ``np.nanstd``       | Compute standard deviation                    |
| ``np.var``        | ``np.nanvar``       | Compute variance                              |
| ``np.min``        | ``np.nanmin``       | Find minimum value                            |
| ``np.max``        | ``np.nanmax``       | Find maximum value                            |
| ``np.argmin``     | ``np.nanargmin``    | Find index of minimum value                   |
| ``np.argmax``     | ``np.nanargmax``    | Find index of maximum value                   |
| ``np.median``     | ``np.nanmedian``    | Compute median of elements                    |
| ``np.percentile`` | ``np.nanpercentile``| Compute rank-based statistics of elements     |
| ``np.any``        | N/A                 | Evaluate whether any elements are true        |
| ``np.all``        | N/A                 | Evaluate whether all elements are true        |
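
A small sketch of the difference between an ordinary aggregate and its NaN-safe version. The array `readings` is made-up data.

```python
import numpy as np

readings = np.array([1.0, 2.0, np.nan, 4.0])

# A single NaN propagates through the ordinary aggregates.
print(np.sum(readings))        # nan
print(np.mean(readings))       # nan

# The NaN-safe versions skip the missing value.
print(np.nansum(readings))     # 7.0
print(np.nanmean(readings))    # 7 / 3, about 2.33
print(np.nanmax(readings))     # 4.0
print(np.nanargmax(readings))  # 3
```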



## Numpy Indexing and Selection Exercises

Now you will be given a few matrices, and be asked to replicate the resulting matrix outputs:

In [61]:
mat = np.arange(1,26).reshape(5,5)
mat

array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10],
       [11, 12, 13, 14, 15],
       [16, 17, 18, 19, 20],
       [21, 22, 23, 24, 25]])

#### 6. Write code here that reproduces the output of the cell below

In [62]:
mat[2:5,1:5]

array([[12, 13, 14, 15],
       [17, 18, 19, 20],
       [22, 23, 24, 25]])

#### 7. Write code here that reproduces the output of the cell below

In [63]:
mat[3,4]

20

In [64]:
mat[3][4]

20

#### 8. Write code here that reproduces the output of the cell below

In [65]:
mat[0:3,1]

array([ 2,  7, 12])

#### 9. Write code here that reproduces the output of the cell below

In [66]:
mat[4]

array([21, 22, 23, 24, 25])

#### 10. Write code here that reproduces the output of the cell below

In [67]:
mat[3:5,0:5]

array([[16, 17, 18, 19, 20],
       [21, 22, 23, 24, 25]])

#### 11. An experiment is being conducted to determine the melting point of water. We have repeated this experiment 10 times to obtain a more accurate calculation of the melting point. The results from the experiment can be observed below.

In [68]:
melting_point_data = np.array([98.5, 99.9, 100.6, 99.3, 100.7, 99.4, 98.4, 99.5, 99.3, 100.7])

Following the experiment, we find that the electric thermometer was miscalibrated by -0.5 °C. If we were using a list, float, or dictionary, we would have to add 0.5 °C to each value manually. However, a NumPy array allows this error to be corrected more simply.

 **Multiplying the original melting point data by two results in:**

**Subtracting 0.5 from the original melting point data results in:**
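
A sketch of the two operations named above, plus the +0.5 correction described in the exercise text. The names `doubled`, `shifted_down` and `corrected` are illustrative; the data is repeated so the block runs on its own.

```python
import numpy as np

melting_point_data = np.array([98.5, 99.9, 100.6, 99.3, 100.7,
                               99.4, 98.4, 99.5, 99.3, 100.7])

# Arithmetic with a scalar is applied to every element (broadcasting).
doubled = melting_point_data * 2
shifted_down = melting_point_data - 0.5

# Compensating for a reading that is 0.5 too low means adding 0.5.
corrected = melting_point_data + 0.5

print(doubled)
print(shifted_down)
print(corrected)
```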

#### 12. Return an array of the odd rows and even columns (counting from 1) of the NumPy array below

In [69]:
sampleArray = np.array([[3 ,6, 9, 12], [15 ,18, 21, 24], 
[27 ,30, 33, 36], [39 ,42, 45, 48], [51 ,54, 57, 60]])
sampleArray

array([[ 3,  6,  9, 12],
       [15, 18, 21, 24],
       [27, 30, 33, 36],
       [39, 42, 45, 48],
       [51, 54, 57, 60]])

In [70]:
sampleArray[0:5:2,1:4:2]

array([[ 6, 12],
       [30, 36],
       [54, 60]])