# NumPy
NumPy is a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays [[1]](https://en.wikipedia.org/wiki/NumPy). <br>

Like SciPy, scikit-learn, and pandas, NumPy is one of the core packages to learn for data science. Its array data structure has several advantages over Python lists: it is more compact, faster to read from and write to, and more convenient and efficient to work with [[2]](https://www.datacamp.com/community/tutorials/python-numpy-tutorial).

### Resources
1. [NumPy Documentation](https://docs.scipy.org/doc/numpy/user/quickstart.html)
2. [SciPy Lecture Notes](http://www.scipy-lectures.org/)
3. [Python Data Science Handbook](https://www.amazon.com/Python-Data-Science-Handbook-Essential/dp/1491912057)
4. [NumPy Data Science Essential Training](https://www.lynda.com/NumPy-tutorials/NumPy-Data-Science-Essential-Training/508873-2.html)

### 1. Basics

NumPy arrays are fixed-size containers of items that are more efficient than Python lists or tuples for data processing:
- They store a single data type; mixed inputs are converted to one common type (for example, numbers mixed with text are all stored as strings)
- They can be one-dimensional or multi-dimensional
- Array elements can be modified, but the array size cannot change
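
The single-data-type and fixed-size rules above can be checked with a short sketch. The variable names `mixed_numbers`, `mixed_text`, and `longer_array` are illustrative and not part of the original notebook.

```python
import numpy as np

# Integers mixed with a float are upcast to a common floating-point type
mixed_numbers = np.array([1, 2, 3.5])
print(mixed_numbers.dtype)      # float64

# Numbers mixed with text are all converted to strings
mixed_text = np.array([1, 2, "three"])
print(mixed_text.dtype.kind)    # U (Unicode string)

# The size is fixed: np.append builds a new array, the original is untouched
sales_array = np.array([0, 3, 6])
longer_array = np.append(sales_array, 23)
print(sales_array.size, longer_array.size)   # 3 4
```

Because "growing" an array really means allocating a new one, it is usually cheaper to collect values in a Python list first and convert once with `np.array`.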


In [5]:
import numpy as np

In [6]:
sales = [0, 3, 6, 23, 56, 73, 234, 456, 2391]
sales_array = np.array(sales)
sales_array

array([   0,    3,    6,   23,   56,   73,  234,  456, 2391])

In [3]:
type(sales_array)

numpy.ndarray

NumPy arrays have these key properties:
- ndim – the number of dimensions (axes) in the array
- shape – the size of the array for each dimension
- size – the total number of elements in the array
- dtype – the data type of the elements in the array

In [4]:
print(f"ndim: {sales_array.ndim}")
print(f"shape: {sales_array.shape}")
print(f"size: {sales_array.size}")
print(f"dtype: {sales_array.dtype}")

ndim: 1
shape: (9,)
size: 9
dtype: int64


In [10]:
sales_2d = [[3, 5, 12, 78], [23, 56, 73, 234], [6, 23, 56, 73]]
sales_2d_array = np.array(sales_2d)
print(sales_2d_array)
type(sales_2d_array)

[[  3   5  12  78]
 [ 23  56  73 234]
 [  6  23  56  73]]


numpy.ndarray

In [11]:
print(f"ndim: {sales_2d_array.ndim}")
print(f"shape: {sales_2d_array.shape}")
print(f"size: {sales_2d_array.size}")
print(f"dtype: {sales_2d_array.dtype}")

ndim: 2
shape: (3, 4)
size: 12
dtype: int64


### Array Creation

##### As an alternative to converting lists, you can create arrays using functions
![array creation](img/array_creation.png)

In [13]:
np.ones((3, 4))

array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

In [14]:
np.zeros((3, 4))

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

In [15]:
np.full((3, 4), 5)

array([[5, 5, 5, 5],
       [5, 5, 5, 5],
       [5, 5, 5, 5]])

In [16]:
np.eye(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [17]:
np.random.random((3, 4))

array([[0.35195466, 0.2578571 , 0.44767187, 0.07707635],
       [0.89819737, 0.2578158 , 0.20808739, 0.74219542],
       [0.79869588, 0.77909724, 0.92438517, 0.0789578 ]])

In [18]:
np.arange(10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [19]:
np.linspace(0, 10, 5)

array([ 0. ,  2.5,  5. ,  7.5, 10. ])

In [20]:
np.arange(10).reshape(2, 5)

array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

#### You can create random number arrays from a variety of distributions using NumPy functions and methods (great for sampling and simulation!)

In [24]:
from numpy.random import default_rng
rng = default_rng(1234)
random_array = rng.random((3, 4))
random_array

array([[0.97669977, 0.38019574, 0.92324623, 0.26169242],
       [0.31909706, 0.11809123, 0.24176629, 0.31853393],
       [0.96407925, 0.2636498 , 0.44100612, 0.60987081]])

In [31]:
rng = default_rng(1234)
mean, std_d = 2, 0.8
random_array = rng.normal(mean, std_d, size=(2, 4))
random_array

array([[0.71693056, 2.05127993, 2.59271304, 2.12209535],
       [2.69099511, 4.33047938, 0.81694131, 2.75637838]])
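
As a small sketch of why the generator is seeded in the cells above: two generators built from the same seed should produce the same draws, which keeps simulations reproducible. The names `first_draw`, `second_draw`, `dice_rolls`, and `uniform_prices` are illustrative only.

```python
import numpy as np
from numpy.random import default_rng

# Two generators created with the same seed produce identical draws
first_draw = default_rng(1234).random((3, 4))
second_draw = default_rng(1234).random((3, 4))
print(np.array_equal(first_draw, second_draw))   # True

# Other distributions are methods on the same generator object
rng = default_rng(1234)
dice_rolls = rng.integers(1, 7, size=10)      # whole numbers 1 to 6 (the upper bound 7 is excluded)
uniform_prices = rng.uniform(5, 50, size=5)   # floats drawn from the half-open range [5, 50)
print(dice_rolls)
print(uniform_prices)
```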

In [None]:
# task 1
# create an array with 50 elements from the range 23 to 56
# reshape it to 5 rows and 10 columns



In [32]:
# task 2
# create an array of random numbers from 0 to 1 with a 3x6 shape

### Array Indexing & Slicing

Indexing & slicing one-dimensional arrays is the same as base Python
- array[index] – indexing to access a single element (0-indexed)
- array[start:stop:step size] – slicing to access a series of elements (stop is not inclusive)


In [33]:
arange_arr = np.arange(5, 10)
print(arange_arr)

print(arange_arr[3]) # 4th element (indexing is 0-based)
print(arange_arr[-2]) # 2nd-to-last element
print(arange_arr[1:3]) # the stop index is excluded
print(arange_arr[::2]) # every 2nd element

arange_arr[-1] = 15 # mutable
print(arange_arr)
arange_arr[0:3] = [1, 2, 3]
print(arange_arr)

[5 6 7 8 9]
8
8
[6 7]
[5 7 9]
[ 5  6  7  8 15]
[ 1  2  3  8 15]


Indexing & slicing two-dimensional arrays requires an extra index or slice
- array[row index, column index] – indexing to access a single element (0-indexed)
- array[start:stop:step size, start:stop:step size] – slicing to access a series of elements

In [34]:
arange_arr = np.arange(12)
arange_arr.shape = (4,3)
print(arange_arr)

# arange_arr[row, col]  ---- Basic format
print(arange_arr[1])
print(arange_arr[1][2])
print(arange_arr[1,2]) # alternative format of previous one
print(arange_arr[0:2, 1:3])
print(arange_arr[:, 1:3])
print(arange_arr[0:3, :2])
print(arange_arr[1:, [0, 2]]) # Indexing with a list of integers (discrete columns)

[[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]]
[3 4 5]
5
5
[[1 2]
 [4 5]]
[[ 1  2]
 [ 4  5]
 [ 7  8]
 [10 11]]
[[0 1]
 [3 4]
 [6 7]]
[[ 3  5]
 [ 6  8]
 [ 9 11]]


##### A few more examples
![array indexing](img/array_indexing.png)

#### Array indexing exercise
![array indexing exercise](img/array_indexing_exercise.png)

### 2. Array Operations

In [39]:
sales = [[12, 2, 6, 48], [6, 3, 5, 73]]
sales_array = np.array(sales)
sales_array

array([[12,  2,  6, 48],
       [ 6,  3,  5, 73]])

In [40]:
sales_array + 2

array([[14,  4,  8, 50],
       [ 8,  5,  7, 75]])

In [41]:
quantity = sales_array[0, :]
price = sales_array[1, :]

quantity * price

array([  72,    6,   30, 3504])

In [43]:
# Task 3
# design a campaign: random discount for a product on given prices

prices = np.array([10, 20, 30, 40, 50, 60])

# add a flat 5 tk shipping cost to every product

# generate a random discount for every product

# calculate the final price and round the result to 2 decimal places


##### You can filter arrays by indexing them with a logical test
- Only the array elements in positions where the logical test returns True are returned

In [44]:
prices >= 20

array([False,  True,  True,  True,  True,  True])

In [46]:
prices[prices >= 20]

array([20, 30, 40, 50, 60])

In [47]:
prices[(prices >= 20) & (prices <= 50)]

array([20, 30, 40, 50])
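
A brief sketch of the other ways to combine logical tests, assuming the same price values as above. `sample_prices` and the other names are illustrative. Each comparison needs its own parentheses because `&`, `|`, and `~` bind more tightly than `>=` or `<`.

```python
import numpy as np

sample_prices = np.array([10, 20, 30, 40, 50, 60])

# | means "or": keep prices below 20 or above 50
cheap_or_premium = sample_prices[(sample_prices < 20) | (sample_prices > 50)]
print(cheap_or_premium)    # [10 60]

# ~ negates a mask: everything outside the 20 to 50 range
is_mid_range = (sample_prices >= 20) & (sample_prices <= 50)
not_mid_range = sample_prices[~is_mid_range]
print(not_mid_range)       # [10 60]

# A mask is an ordinary boolean array, so summing it counts the matches
print(is_mid_range.sum())  # 4
```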

In [49]:
# You can modify array values by assigning new ones
prices[2] = 0
prices[4] = 0
print(prices)

[10 20  0 40  0 60]


In [51]:
prices[prices == 0] = 8
prices

array([10, 20,  8, 40,  8, 60])

- The **where()** NumPy function performs a logical test and returns a given value if the test is True, or another if the test is False

![where filter function](img/where.png)

In [78]:
inventory_array = np.array([23, 67, 0, 4, 65, 0])
product_array = np.array(["apple", "orange", "mango", "banana", "grape", "pineapple"])

In [57]:
np.where(inventory_array <= 0, "out of stock", "Stock available")

array(['Stock available', 'Stock available', 'out of stock',
       'Stock available', 'Stock available', 'out of stock'], dtype='<U15')

In [58]:
np.where(inventory_array <= 0, "out of stock", product_array)

array(['apple', 'orange', 'out of stock', 'banana', 'grape',
       'out of stock'], dtype='<U12')

#### Task 4
> Hey there,<br><br>
We’re working on some more promotions. Can you filter our product list to only include prices greater than 25?<br>
Once you’ve done that, modify your logic to force "pepsi" into the list. Call this array ‘fancy_feast_special’.<br>
Finally, we need to modify our shipping logic. Create a new shipping cost array, but this time if price is greater than 20, shipping cost is 0, otherwise shipping cost is 5.<br><br>
Thanks!

- Array aggregation methods let you calculate metrics like sum, mean, and max

In [59]:
inventory_array

array([23, 67,  0,  4, 65,  0])

In [60]:
print(inventory_array.max())
print(inventory_array.min())
print(inventory_array.mean())
print(inventory_array.sum())


67
0
26.5
159


- You can also aggregate across rows or columns

In [83]:
# 2x4 inventory array
inventory_array = np.array([[23, 67, 0, 4], [65, 0, 23, 67]])
print(inventory_array.max(axis=0)) # axis=0 collapses the rows: one max per column
print(inventory_array.max(axis=1)) # axis=1 collapses the columns: one max per row

print(inventory_array.sum(axis=0)) # one sum per column
print(inventory_array.sum(axis=1)) # one sum per row

[65 67 23 67]
[67 67]
[88 67 23 71]
[ 94 155]
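
One way to remember what `axis` does is that the axis you pass is the one that disappears from the shape. The sketch below uses the same 2x4 values. The names `column_totals`, `row_totals`, and `share_of_row` are illustrative, and `keepdims` is shown only as an optional extra.

```python
import numpy as np

inventory = np.array([[23, 67, 0, 4], [65, 0, 23, 67]])   # shape (2, 4)

column_totals = inventory.sum(axis=0)   # the 2 rows are collapsed -> shape (4,)
row_totals = inventory.sum(axis=1)      # the 4 columns are collapsed -> shape (2,)
print(column_totals.shape, row_totals.shape)   # (4,) (2,)

# keepdims=True keeps the collapsed axis with length 1, so the result
# still lines up with the original array for element-wise arithmetic
row_totals_2d = inventory.sum(axis=1, keepdims=True)   # shape (2, 1)
share_of_row = inventory / row_totals_2d
print(share_of_row.sum(axis=1))   # each row's shares add up to approximately 1
```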


In [69]:
np.median(inventory_array)

23.0

In [71]:
np.percentile(inventory_array, 75)

65.5

In [72]:
np.unique(inventory_array)

array([ 0,  4, 23, 65, 67])

In [73]:
np.sqrt(inventory_array)

array([[4.79583152, 8.18535277, 0.        , 2.        ],
       [8.06225775, 0.        , 4.79583152, 8.18535277]])

+ The sort() method will sort arrays in place
    - Use the axis argument to specify the dimension to sort by

In [81]:
inventory_array

array([[23, 67,  0,  4],
       [65,  0, 23, 67]])

In [75]:
inventory_array.sort()
inventory_array

array([[ 0,  4, 23, 67],
       [ 0, 23, 65, 67]])

In [82]:
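inventory_array = np.array([[23, 67, 0, 4], [65, 0, 23, 67]]) # reset first, because sort() changed the array in place
inventory_array.sort(axis=0) # sort each column
inventory_array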

array([[23,  0,  0,  4],
       [65, 67, 23, 67]])

In [84]:
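inventory_array = np.array([[23, 67, 0, 4], [65, 0, 23, 67]]) # reset first, because sort() changed the array in place
inventory_array.sort(axis=1) # sort each row
inventory_array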

array([[ 0,  4, 23, 67],
       [ 0, 23, 65, 67]])
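
Since the `sort()` method overwrites the array, a sketch of the non-destructive alternatives may help. `np.sort` returns a sorted copy, and `np.argsort` returns the positions that would sort the array. The names `sorted_copy` and `order` are illustrative.

```python
import numpy as np

inventory = np.array([[23, 67, 0, 4], [65, 0, 23, 67]])

# np.sort returns a new sorted array and leaves the original alone
sorted_copy = np.sort(inventory, axis=1)
print(inventory[0])      # [23 67  0  4]
print(sorted_copy[0])    # [ 0  4 23 67]

# np.argsort returns the indices that would sort the values
order = np.argsort(inventory[0])
print(order)                 # [2 3 0 1]
print(inventory[0][order])   # [ 0  4 23 67]
```

`argsort` is handy when a second array (for example, product names) has to be reordered to match the sorted values.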

### 3. Copy & View

In [None]:
original_arr = np.arange(10)
print("Original:", original_arr)

copied_arr = original_arr
print("Copied:", copied_arr)

Original: [0 1 2 3 4 5 6 7 8 9]
Copied: [0 1 2 3 4 5 6 7 8 9]


In [None]:
# Same object or different?
# Reference equality: plain assignment does not copy, so both names point to the same array
print(original_arr is copied_arr)
print(id(original_arr))
print(id(copied_arr))

True
4442817040
4442817040


In [None]:
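# Value equality: == compares element by element and returns a boolean array
original_arr == copied_arr
# A change made through one name is visible through the other
copied_arr[0] = 3
original_arr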

array([3, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [None]:
np.may_share_memory(original_arr, copied_arr)


True

In [None]:
# View: a new array object that shares the same underlying data
raw_marks = np.array([88, 91, 82, 89, 95, 85])
final_marks = raw_marks.view()
final_marks.shape = (2, 3)
print("Before Changing:")
print(final_marks)
print(raw_marks)
final_marks[1,2] = 93
print("After Changing:")
print(final_marks)
print(raw_marks)

Before Changing:
[[88 91 82]
 [89 95 85]]
[88 91 82 89 95 85]
After Changing:
[[88 91 82]
 [89 95 93]]
[88 91 82 89 95 93]


In [None]:
np.may_share_memory(raw_marks, final_marks)


True

In [None]:
### Copy
raw_marks = np.array([88, 91, 82, 89, 95, 85])
# final_marks = raw_marks.copy()  # equivalent method form
final_marks = np.copy(raw_marks)
final_marks.shape = (2, 3)
print(final_marks)
np.may_share_memory(raw_marks, final_marks)

[[88 91 82]
 [89 95 85]]


False
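
Views also appear without calling `view()` explicitly. As a sketch, basic slicing returns a view, while boolean indexing returns a copy. The names `first_three` and `top_marks` are illustrative and not part of the original notebook.

```python
import numpy as np

raw_marks = np.array([88, 91, 82, 89, 95, 85])

# A basic slice is a view: it shares data with the original array
first_three = raw_marks[:3]
print(first_three.base is raw_marks)   # True
first_three[0] = 100
print(raw_marks[0])                    # 100

# Boolean (mask) indexing returns a copy, so edits do not propagate back
top_marks = raw_marks[raw_marks > 90]
top_marks[0] = 0
print(raw_marks)                                # [100  91  82  89  95  85]
print(np.shares_memory(raw_marks, top_marks))   # False
```

When a slice must be modified without touching the source data, calling `.copy()` on it is the safe choice.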