## ***NUMPY***

---






### ***NumPy (Numerical Python)***


---
It is a library developed for numerical calculations in Python.

It enables working with multidimensional arrays.

It offers a wide range of methods that allow performing mathematical operations on these arrays.


In [357]:
my_list = [1,2,5,[0,1],[2,3,5]]
my_list[4]

[2, 3, 5]

## *Why is NumPy important ⁉*

NumPy is the fundamental library for numerical computation and multidimensional data (vector/matrix) operations in Python.

It provides speed and efficiency, especially when working with numerical data.

1) Speed and performance (why ‘fast’?)

- NumPy's fundamental data structure, the “ndarray”, holds a single data type (homogeneous). Thanks to this single data type, it is stored more compactly in memory (carries less additional information), and vectorised operations/linear algebra operations run faster.

- For this reason, many libraries (e.g., scikit-learn) commonly use NumPy arrays for data input; they are highly compatible with linear algebra and model matrix logic.
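
The points above can be sketched in a few lines. This is only an illustration: the variable names and the explicit `dtype=np.int64` are choices made for this example (so that the byte counts do not depend on the platform default) and are not part of the original notebook.

```python
import numpy as np

# A Python list can mix types; each element is a separate Python object.
mixed_list = [1, 2.5, "three"]

# An ndarray holds a single dtype. It is set explicitly here so the
# byte counts below are the same on every platform.
arr = np.array([1, 2, 3, 4], dtype=np.int64)
print(arr.dtype)      # int64
print(arr.itemsize)   # 8 bytes per element
print(arr.nbytes)     # 32 bytes for the whole data buffer

# Mixing ints and floats upcasts everything to one common dtype.
upcast = np.array([1, 2.5, 3])
print(upcast.dtype)   # float64

# Vectorised arithmetic: one expression instead of a Python loop.
doubled = arr * 2
print(doubled)        # [2 4 6 8]

# The same operation on a list needs an explicit loop or comprehension.
doubled_list = [v * 2 for v in [1, 2, 3, 4]]
```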

2) NumPy vs Pandas: When to use which?

- Mathematics / model matrix / large numerical calculations → NumPy
- Data cleaning, merging, grouping and reporting → Pandas

3) Table features available in Pandas but not in NumPy

Pandas works more like Excel:

- Column names and row indices (e.g. columns such as Age, Gender; row labels such as Ahmet, Mehmet)
- Join/Merge (table merging)
- GroupBy (grouping and summarising)
- Time series resampling (aggregating time series weekly/monthly, etc.)
- Missing data/label alignment (automatic alignment via index)
- Heterogeneous table: Each column in the DataFrame can have a different dtype

NumPy, on the other hand, is more ‘raw’ and lower-level:

- It works with positions (0,1,2…) instead of column/row labels.
- It is very fast, but its ‘table’ ergonomics are not as powerful as Pandas.

4) Why is NumPy like a ‘stepping stone’?

NumPy is a critical transition on the path from lists to Pandas:

- It teaches vector and matrix logic.
- It is highly functional for operations such as slicing, broadcasting, and addition/multiplication on multidimensional data (2D, 3D, …)

5) Appearance: ‘raw’ and more technical

- When printed, NumPy arrays look like nested lists whose rows all have the same length.
- Therefore, at first glance, they may seem more ‘intimidating’ than Pandas; but they are very powerful on the numerical computation side.

***NumPy Arrays***

---



- NumPy arrays are an excellent alternative to Python lists.

- NumPy arrays are much easier to manage and more suitable for computation, especially for scientific calculations and numerical operations.

In [358]:
# CREATING A NUMPY ARRAY
import numpy as np
arr = np.array([1,2,3,4,5])
print(arr)

[1 2 3 4 5]


In [359]:
import numpy as np

one_dimensional_array = np.array([1,2,3])
two_dimensional_array = np.array([[1,2,3],[4,5,6]])

type(one_dimensional_array)

numpy.ndarray

In [360]:
print(one_dimensional_array.shape)
print(two_dimensional_array.shape)

(3,)
(2, 3)


In [361]:
two_dimensional_array   # 2 rows and 3 columns

array([[1, 2, 3],
       [4, 5, 6]])



---


Note: An n × m matrix A is a rectangular array of numbers consisting of n rows and m columns.

---



In [362]:
# Numpy also provides numerous functions for creating arrays:
import numpy as np
a = np.zeros((2,2))
print(a)

[[0. 0.]
 [0. 0.]]


In [363]:
b = np.ones((1,2))
print(b)

[[1. 1.]]


In [364]:
c = np.full((2,2),7)
print(c)

[[7 7]
 [7 7]]


In [365]:
d = np.eye(2)  # eye means identity matrix
print(d)

[[1. 0.]
 [0. 1.]]


In [366]:
e = np.random.random((2,2))
print(e)

[[0.69854209 0.53407563]
 [0.53417873 0.11611349]]


***Array Indexing***

---



- We use indices to access any element of a NumPy array.

- Indexing works as it does for Python lists, except that each dimension has its own index.

For example, to access the third element of the first row (row index 0, column index 2) of the two-dimensional array named two_dimensional_array:

two_dimensional_array[0, 2]
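
A few more indexing patterns on the same kind of array, as a small sketch. The name `grid` is made up for this example.

```python
import numpy as np

grid = np.array([[1, 2, 3],
                 [4, 5, 6]])

print(grid[0, 2])    # row 0, column 2 -> 3
print(grid[1, 0])    # row 1, column 0 -> 4
print(grid[-1, -1])  # last row, last column -> 6
print(grid[1])       # the whole second row -> [4 5 6]
print(grid[:, 1])    # the whole second column -> [2 5]
```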






In [367]:
two_dimensional_array

array([[1, 2, 3],
       [4, 5, 6]])

In [368]:
int(two_dimensional_array[0,2])

3

In [369]:
# Let's create a two dimensional array of size 3 x 4
import numpy as np
m = np.array([[3,7,3,4], [5,6,7,2], [2,1,1,1]])
print(m)

[[3 7 3 4]
 [5 6 7 2]
 [2 1 1 1]]


In [370]:
h = m[:2,1:3]
# m[start row (default 0) : end row (2) , start column (1) : end column (3)]
# The end indices are excluded!
print(h)

[[7 3]
 [6 7]]


In [371]:
print(m[0,1])

7


In [372]:
h[0,0] = 3   # h is a view of m (not a copy), so this also changes m[0,1]
print(m[0,1])

3


In [373]:
print(h)

[[3 3]
 [6 7]]


In [374]:
print(m)

[[3 3 3 4]
 [5 6 7 2]
 [2 1 1 1]]
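
The cells above show that writing to `h` also changed `m`. This happens because a basic slice returns a view that shares memory with the original array. The sketch below contrasts a view with an explicit copy; the names `view` and `independent` are illustrative only.

```python
import numpy as np

m = np.array([[3, 7, 3, 4], [5, 6, 7, 2], [2, 1, 1, 1]])

view = m[:2, 1:3]                  # basic slice: shares memory with m
independent = m[:2, 1:3].copy()    # explicit copy: owns its own data

print(np.shares_memory(m, view))          # True
print(np.shares_memory(m, independent))   # False

independent[0, 0] = 99   # does not touch m
print(m[0, 1])           # still 7

view[0, 0] = 99          # writes through to m
print(m[0, 1])           # now 99
```

If the original array must stay unchanged, take a `.copy()` of the slice before modifying it.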




---


NumPy contains many useful functions. Some of these are as follows:

append, where, add, random, reshape, vstack, mean, median, std, isnan
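
A quick, non-exhaustive taste of a few of the functions listed above. The sample values and variable names are invented for this sketch.

```python
import numpy as np

values = np.array([1.0, np.nan, 3.0, 4.0])

# isnan: boolean mask marking the missing values
missing = np.isnan(values)
print(missing)                 # [False  True False False]

# where: choose element-wise between two alternatives
filled = np.where(missing, 0.0, values)
print(filled)                  # [1. 0. 3. 4.]

# add: element-wise sum (same as the + operator)
print(np.add(filled, 10))      # [11. 10. 13. 14.]

# reshape and vstack: change the shape, then stack row-wise
top = filled.reshape(2, 2)
stacked = np.vstack((top, np.ones((1, 2))))
print(stacked.shape)           # (3, 2)
```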


---



***Exercises (NumPy)***

In [375]:
# Create a 3x3 numpy array consisting of zeros.
np.zeros((3,3))

array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])

In [376]:
# Select and extract all odd numbers within an array.
# Hint -> Input: array[0,1,2,3,4,5,6,7] and desired output: array[1,3,5,7]

x = np.array([0,1,2,3,4,5,6,7])
y = np.array([], dtype=x.dtype)   # match x's integer dtype; an empty array is float by default

for ele in x:
  if ele % 2 != 0:
    y = np.append(y, ele)
y

array([1, 3, 5, 7])
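
The same selection can also be written without a loop. `np.append` builds a new array on every call, whereas a boolean mask selects all matching elements in one step and keeps the input's integer dtype. A minimal sketch (the names `mask` and `odds` are illustrative):

```python
import numpy as np

x = np.array([0, 1, 2, 3, 4, 5, 6, 7])

mask = x % 2 == 1        # True at the positions holding odd numbers
odds = x[mask]           # keeps only the positions where mask is True
print(odds)              # [1 3 5 7]
```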

In [377]:
# Convert a one-dimensional array into a two-dimensional array with two rows.
# Hint -> Input: array[0,1,2,3,4,5,6,7], Desired output: array([[0, 1, 2, 3],[4, 5, 6, 7]])

x = np.array([0,1,2,3,4,5,6,7])
x.shape

(8,)

In [378]:
x.reshape(2,4)

array([[0, 1, 2, 3],
       [4, 5, 6, 7]])

In [379]:
new_column = int(len(x)/2)
x_revise = x.reshape(2, new_column)
print(x_revise)

[[0 1 2 3]
 [4 5 6 7]]


In [380]:
x_revise = np.reshape(x, (2,-1))
print(x_revise)

[[0 1 2 3]
 [4 5 6 7]]


In [381]:
# Stack the first_array and second_array arrays vertically.

first_array = np.arange(10).reshape(2,-1)
second_array = np.repeat(1, 10).reshape(2,-1)

# Step by step: build the 1-D array first, then reshape it in the next cell
first_array = np.arange(10)
print(first_array)
print(first_array.shape)


[0 1 2 3 4 5 6 7 8 9]
(10,)


In [382]:
first_array = first_array.reshape(2,-1)
print(first_array)
print(first_array.shape)

[[0 1 2 3 4]
 [5 6 7 8 9]]
(2, 5)


In [383]:
second_array = np.repeat(1,10)
print(second_array)
print(second_array.shape)

[1 1 1 1 1 1 1 1 1 1]
(10,)


In [384]:
second_array = second_array.reshape(2,-1)
print(second_array)
print(second_array.shape)

[[1 1 1 1 1]
 [1 1 1 1 1]]
(2, 5)


In [385]:
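stack = np.vstack((first_array, second_array))
# np.append(first_array, second_array) without an axis would flatten both arrays into shape (20,)
print(stack)
print(stack.shape)

[[0 1 2 3 4]
 [5 6 7 8 9]
 [1 1 1 1 1]
 [1 1 1 1 1]]
(4, 5)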


In [386]:
# Select the elements of the m array that are greater than 5 and less than 10.
m = np.arange(0,15,2)
m

array([ 0,  2,  4,  6,  8, 10, 12, 14])

In [387]:
m = np.arange(0,15,2)
n = np.array([], dtype=m.dtype)   # keep the integer dtype of m
for ele in m:
  if ele > 5 and ele < 10:
    n = np.append(n, ele)
print(n)

[6 8]
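
Two conditions can also be combined in a single boolean mask. This is a sketch of that alternative, not the notebook's original solution; `selected` is an illustrative name.

```python
import numpy as np

m = np.arange(0, 15, 2)            # [ 0  2  4  6  8 10 12 14]

# Each comparison needs its own parentheses because & binds more tightly
# than > and <. Use & (element-wise), not the Python keyword `and`.
selected = m[(m > 5) & (m < 10)]
print(selected)                    # [6 8]
```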


In [388]:
# Find the mean, median and standard deviation of the m array.
print(np.mean(m))
print(np.median(m))
print(np.std(m))

7.0
7.0
4.58257569495584
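
One detail worth knowing: `np.std` computes the population standard deviation by default (it divides by N). Passing `ddof=1` gives the sample standard deviation (divides by N - 1). A short sketch with the same `m`; the two result names are illustrative:

```python
import numpy as np

m = np.arange(0, 15, 2)

population_std = np.std(m)          # divides by N (ddof=0, the default)
sample_std = np.std(m, ddof=1)      # divides by N - 1

print(population_std)   # about 4.583
print(sample_std)       # about 4.899
```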


In [389]:
# Find the indices of elements greater than 0.5 in the m array.
# Define these indices in a new variable named idx.

m = np.random.random(1000)
idx = np.array([], dtype=int)   # indices must be integers to be usable for indexing

for i in range(len(m)):
  if m[i] > 0.5:
    idx = np.append(idx, i)
print(len(idx))

504
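
The indices can also be found without a loop. The sketch below uses `np.flatnonzero`; `np.where(m > 0.5)[0]` gives the same result. The seeded generator and the seed value 0 are assumptions added only to make the example reproducible.

```python
import numpy as np

rng = np.random.default_rng(0)      # seeded generator; the seed is an arbitrary choice
m = rng.random(1000)

idx = np.flatnonzero(m > 0.5)       # integer positions where the condition holds

print(idx.dtype.kind)               # i (an integer dtype), so idx can index m directly
print(np.all(m[idx] > 0.5))         # True
```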


In [390]:
# Reverse a vector (the first element becomes the last element)

x = np.array([0,1,2,3,4,5,6,7])

reverse = np.flip(x)
reverse

array([7, 6, 5, 4, 3, 2, 1, 0])

In [391]:
# OR -> slice with a step of -1, leaving the start and stop empty

x[::-1]

array([7, 6, 5, 4, 3, 2, 1, 0])

In [392]:
# Create a random vector of size 10 and insert a 0 immediately before the largest value.

vector = np.random.rand(10)
print(vector)

[0.26565472 0.04586685 0.4153434  0.04514602 0.29294593 0.17598683
 0.1816093  0.32763183 0.02962996 0.9297918 ]


In [393]:
print(np.argmax(vector))

9


In [394]:
vector_revise = np.insert(vector, np.argmax(vector), 0)
print(vector_revise)

[0.26565472 0.04586685 0.4153434  0.04514602 0.29294593 0.17598683
 0.1816093  0.32763183 0.02962996 0.         0.9297918 ]


In [395]:
# Subtract the mean of each row from that row.
x = np.reshape(vector, (2,5))
print(x)

[[0.26565472 0.04586685 0.4153434  0.04514602 0.29294593]
 [0.17598683 0.1816093  0.32763183 0.02962996 0.9297918 ]]


In [396]:
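mean = np.mean(x)   # with no axis given, this is the mean of all elements
# The axis argument of the mean function selects the direction of the reduction:
# axis = 1 gives one mean per row, axis = 0 gives one mean per column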

print(mean)

row_mean = np.mean(x, axis = 1)
print(row_mean)


0.2709606645028658
[0.21299138 0.32892994]


In [397]:
column_mean = np.mean(x, axis = 0)
print(column_mean)

[0.22082077 0.11373808 0.37148762 0.03738799 0.61136886]


In [398]:
row_mean = np.mean(x, axis = 1)
row_mean

array([0.21299138, 0.32892994])

In [399]:
np.repeat(row_mean, 5).shape

(10,)

In [400]:
row_mean_array = np.repeat(row_mean, 5).reshape(2,5)
row_mean_array

array([[0.21299138, 0.21299138, 0.21299138, 0.21299138, 0.21299138],
       [0.32892994, 0.32892994, 0.32892994, 0.32892994, 0.32892994]])

In [401]:
final = x - row_mean_array
print(final)

[[ 0.05266333 -0.16712453  0.20235202 -0.16784537  0.07995454]
 [-0.15294312 -0.14732064 -0.00129812 -0.29929998  0.60086186]]
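
The repeat-and-reshape step above can be skipped by letting NumPy broadcast the row means. This is a sketch of that alternative with made-up sample values, not the notebook's original approach.

```python
import numpy as np

x = np.array([[1.0, 2.0, 3.0, 4.0, 5.0],
              [10.0, 20.0, 30.0, 40.0, 50.0]])

# keepdims=True keeps the reduced axis with length 1: shape (2, 1), not (2,)
row_mean = x.mean(axis=1, keepdims=True)
print(row_mean.shape)          # (2, 1)

# The (2, 1) column of means is broadcast across the 5 columns of x.
centered = x - row_mean
print(centered)
print(centered.mean(axis=1))   # each row now averages to 0
```

Without `keepdims=True` the means would have shape `(2,)`, which does not broadcast against a `(2, 5)` array.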




---

How does the “Repeat” function work?


---



In [402]:
print(np.repeat(3,4))                   # Repeats the number 3 four times

x = np.array([[1,2],[3,4]])             # 2x2 matrix
print(x)                                # Prints the matrix

print(np.repeat(x,2))                   # Flattens x and repeats each element twice
print(np.repeat(x, 3, axis = 0))        # Repeats each row 3 times
print(np.repeat(x, 3, axis = 1))        # Repeats each column 3 times
print(np.repeat(x, [1,2], axis = 0))    # Repeats row 1 once and row 2 twice

[3 3 3 3]
[[1 2]
 [3 4]]
[1 1 2 2 3 3 4 4]
[[1 2]
 [1 2]
 [1 2]
 [3 4]
 [3 4]
 [3 4]]
[[1 1 1 2 2 2]
 [3 3 3 4 4 4]]
[[1 2]
 [3 4]
 [3 4]]




---


According to worldometers.info, as of 11:00 on 17 February 2019, the world population was approximately 7,684,621,550 and was growing by approximately 107,000 people per day on average.

World population:
world_pop = 7684621550

Average daily growth:
growth_rate = 107000

Create an array (world_pop_series) holding the world population for each day from 17 February to 26 February. Then create a dictionary containing this information and save it to a file using numpy.save.


---



In [403]:
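days = np.arange(17,27,1)   # 17, 18, ..., 26 February
pop = np.arange(7684621550, 7684621550+(10*107000), 107000)   # population on each of the 10 days
final_array = np.append(days, pop).reshape(2,-1).T   # 10 x 2 table: one (day, population) pair per row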

In [404]:
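my_dict = dict(zip(days, pop))   # keys and values are NumPy integers

# Convert keys and values to plain Python ints
clean_dict = {int(k): int(v) for k, v in my_dict.items()}
np.save("Example_dictionary.npy", clean_dict)   # the dict is pickled into the .npy file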

In [405]:
clean_dict

{17: 7684621550,
 18: 7684728550,
 19: 7684835550,
 20: 7684942550,
 21: 7685049550,
 22: 7685156550,
 23: 7685263550,
 24: 7685370550,
 25: 7685477550,
 26: 7685584550}
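
To read the dictionary back, keep in mind that `np.save` stores a dict as a pickled 0-dimensional object array. A minimal sketch of the round trip follows; the shortened dictionary and the temporary directory are assumptions made so the example is self-contained.

```python
import os
import tempfile

import numpy as np

clean_dict = {17: 7684621550, 18: 7684728550}   # shortened sample of the dictionary

# Write to a temporary directory so the sketch does not leave files behind.
path = os.path.join(tempfile.mkdtemp(), "Example_dictionary.npy")
np.save(path, clean_dict)

# The dict was pickled into a 0-d object array, so loading needs allow_pickle=True;
# .item() unwraps the single stored object.
loaded = np.load(path, allow_pickle=True).item()
print(loaded == clean_dict)   # True
```

Only load pickled files from sources you trust, since unpickling can execute arbitrary code.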



---


Note: The zip() function returns an iterator of tuples: the first tuple combines the first elements of each given iterable, the second tuple combines the second elements, and so on.
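
A tiny illustration of the note above, using plain Python lists with a few of the values from the exercise:

```python
days = [17, 18, 19]
pop = [7684621550, 7684728550, 7684835550]

pairs = list(zip(days, pop))
print(pairs)        # [(17, 7684621550), (18, 7684728550), (19, 7684835550)]

lookup = dict(zip(days, pop))
print(lookup[18])   # 7684728550

# zip stops at the shortest input
short = list(zip([1, 2, 3], "ab"))
print(short)        # [(1, 'a'), (2, 'b')]
```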


---





---


Write a time_resolved function that returns a window of a given width from an array, starting at a given position.

The function will take the following three parameters:
  time_resolved(array, window_width, shift)

For example,
for array = np.array([1, 2, 6, 4, 5, 4, 2, 7, 8, 9]),

when time_resolved(array, 3, 4) is called, the function's output should be [4, 5, 4]: a window of 3 elements starting at the 4th element.


---



In [406]:
def time_resolved(array, window_width, shift):
  # shift is a 1-based position, so the window starts at index shift-1
  start = shift - 1
  return array[start:start + window_width]

In [407]:
array = np.array([1,2,3,4,5,6,7,8,9])
time_resolved(array, 3, 4)

array([4, 5, 6])



---


Hello everyone, we have reached the end of the second episode. It was a much more intense episode than the previous one. I hope you found it useful. If there is anything you don't understand, please don't hesitate to contact me. See you in the next episode  :)


---

