# Week #4 - NumPy library for Python
### This week you will learn NumPy, the core Python library for numerical computing.  Why is NumPy important?  It provides fast arrays and mathematical functions for working with large sets of numbers.  Before we can use any NumPy function, we must import the library.  After the import, all of its functions are available.


### Topics You will learn
 - Arrays
 - arange
 - random integers
 - Uniform
 - random real numbers
 - normal random numbers
 - Matrix
 - shape/reshape matrix
 - Arithmetics of arrays
 - Adding arrays
 - Array methods
 - array indexing and Slicing

# Call Numpy library
Before using NumPy data types, you must import the NumPy library.  The standard way is to import it under the alias np.  From then on, whenever you refer to NumPy, use np.

```
import numpy as np
```
### Run the code.

In [17]:
import numpy as np

# Why NumPy? (The Business Case)
### In Excel, if you want to multiply a column of prices by a column of quantities, you drag a formula down. In standard Python, you would need a "for loop." NumPy allows Vectorization, which lets us perform math on entire arrays at once, just like a single formula in an Excel cell applied to a whole range.

### Key Concept: NumPy arrays are homogeneous (all data must be the same type, usually numbers), making them much faster than standard Python lists.
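
The comparison between a "for loop" and vectorization can be sketched as follows. The quantities are hypothetical sample values chosen for illustration; they do not come from the lesson data.

```python
import numpy as np

# Hypothetical sample data: unit prices and quantities sold for three products
prices_list = [19.99, 25.50, 15.00]
quantities_list = [3, 2, 10]

# Standard Python: loop over the pairs one at a time
revenue_loop = []
for price, qty in zip(prices_list, quantities_list):
    revenue_loop.append(price * qty)

# NumPy: multiply the whole arrays in one expression (vectorization)
prices_arr = np.array(prices_list)
quantities_arr = np.array(quantities_list)
revenue_vec = prices_arr * quantities_arr

print(revenue_loop)
print(revenue_vec)
print(prices_arr.dtype)  # every element of the array shares one data type
```

Both approaches give the same revenue figures; the NumPy version simply replaces the loop with a single expression.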

### Creating Your First Arrays
#### Students should practice moving from standard Python lists to NumPy.

```
import numpy as np

# 1D Array (like a single row or column in Excel) - a "Vector" with three elements
prices = np.array([19.99, 25.50, 15.00])

# 2D Array (like a full spreadsheet) - a "Matrix" with three rows and three columns
# Rows = Products, Columns = [Price, Cost, Inventory]
data = np.array([
    [19.99, 10.00, 50],
    [25.50, 12.00, 30],
    [15.00, 5.00, 100]
])
```
### Run the code.

In [18]:
import numpy as np

# 1D Array (like a single row or column in Excel) - A "Vector" with three elements
prices = np.array([19.99, 25.50, 15.00])

# 2D Array (like a full spreadsheet) - A "Matrix",   three rows and three columns.  3x3 = 9 elements
# Rows = Products, Columns = [Price, Cost, Inventory]

data = np.array([
    [19.99, 10.00, 50],
    [25.50, 12.00, 30],
    [15.00, 5.00, 100]
])

# 3. Business Math: Vectorization & Broadcasting
### This is where the precalculus background shines. Instead of manual calculation, we use element-wise operations.  <br>

### Scenario: You want to apply a 10% tax to all prices. <br> 
Formula: $TaxedPrice = Price \times 1.10$
```
taxed_prices = prices * 1.10  # This is "Broadcasting" a single number across an array
```


In [19]:
taxed_prices = np.round(prices * 1.10, 2)  # This is "Broadcasting" a single number across an array
taxed_prices

array([21.99, 28.05, 16.5 ])
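
Broadcasting is not limited to a single number. As a minimal sketch, a 1D array with one value per column can also be applied to every row of a 2D array. The adjustment factors below are hypothetical and only serve to illustrate the idea.

```python
import numpy as np

# Rows = Products, Columns = [Price, Cost, Inventory]
data = np.array([
    [19.99, 10.00, 50],
    [25.50, 12.00, 30],
    [15.00, 5.00, 100]
])

# Hypothetical factors: raise prices 10%, raise costs 5%, leave inventory unchanged
factors = np.array([1.10, 1.05, 1.00])

# factors has shape (3,) and data has shape (3, 3):
# NumPy reuses the same three factors for every row
adjusted = np.round(data * factors, 2)
print(adjusted)
```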

# Slicing and Filtering
Business data is often messy. We use slicing to extract specific values like "Price", "Cost", or "Inventory".

**Indexing: data[0, 0]**<br>
Gets the price of the first item.

**Slicing a column: data[: , i]**<br>
The index i selects the column you want, and the colon selects every row.  Column index 0 is the price, index 1 is the cost, and index 2 is the inventory.  For example, **data[ : , 1]** returns the entire Cost column and **data[ : , 0]** returns the 1st column.

**Slicing rows: data[j : k ]**<br>
The index j is the first row you want, and rows are numbered from 0.  The index k determines where the slice stops: rows j through k-1 are returned and row k is excluded.  For example, **data[0 : 1]** returns the 1st row.

**Boolean Masking: data[: , 2] < 40** <br>
The comparison produces True or False for each row.  Using it as an index keeps only the True rows, here all products with inventory below 40. <br>**low_stock = data[data[:, 2] < 40]** 

```
cost = data[:, 1]
profit = prices - cost

profit  # displays profit
# in another code cell
taxed_prices  # prices with taxes
```


In [20]:
data

array([[ 19.99,  10.  ,  50.  ],
       [ 25.5 ,  12.  ,  30.  ],
       [ 15.  ,   5.  , 100.  ]])

In [21]:
data[:, 0] # 1st column

array([19.99, 25.5 , 15.  ])

In [22]:
data[:, 1]  # 2nd column

array([10., 12.,  5.])

In [23]:
data[:, 2]  # 3rd column

array([ 50.,  30., 100.])

In [24]:
data[0 :1 ]

array([[19.99, 10.  , 50.  ]])

In [25]:
cost = data[:, 1]
profit = prices - cost

profit  # displays profit

array([ 9.99, 13.5 , 10.  ])

In [26]:
# in another code cell
taxed_prices  # prices with taxes

array([21.99, 28.05, 16.5 ])

In [27]:
data[:, 0]  # first column

array([19.99, 25.5 , 15.  ])

In [28]:
# get inventory column
data[: , 2]


array([ 50.,  30., 100.])

In [29]:
low_stock = data[data[:, 2] < 40]
low_stock

array([[25.5, 12. , 30. ]])

In [30]:
data[0:1]  # First row

array([[19.99, 10.  , 50.  ]])

In [31]:
data[1: 2]  # Second row

array([[25.5, 12. , 30. ]])

In [32]:
data[2:3]  # Third row

array([[ 15.,   5., 100.]])
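
The same slicing and masking ideas extend to several rows, several columns, and combined conditions. The following is a small sketch; the price threshold of 20 is a hypothetical value picked for illustration.

```python
import numpy as np

# Rows = Products, Columns = [Price, Cost, Inventory]
data = np.array([
    [19.99, 10.00, 50],
    [25.50, 12.00, 30],
    [15.00, 5.00, 100]
])

first_two_rows = data[0:2]     # rows 0 and 1; the stop index 2 is excluded
price_and_cost = data[:, 0:2]  # every row, columns 0 and 1

# Combine two conditions with & (and); each condition needs its own parentheses
mask = (data[:, 2] < 40) & (data[:, 0] > 20)
low_stock_expensive = data[mask]

print(first_two_rows)
print(price_and_cost)
print(low_stock_expensive)
```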

# Determine the size or shape of an Array

The size of the array is the number of elements in the array.  The shape of the array is the number of rows and columns.
```
print("Number of elements: ", data.size)   # number of elements
print("Number of rows and columns: ", data.shape)  # number of rows and columns
```


In [33]:
print("Number of elements: ", data.size)   # number of elements
print("Number of rows and columns: ", data.shape)  # number of rows and columns

Number of elements:  9
Number of rows and columns:  (3, 3)
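
As a brief sketch of how size and shape relate, reshape rearranges the same elements into a new number of rows and columns. The array below is a hypothetical example, not the lesson's data array.

```python
import numpy as np

values = np.arange(12)        # 12 elements: 0 through 11
table = values.reshape(3, 4)  # 3 rows x 4 columns; 3 * 4 must equal values.size

print(values.shape)  # (12,)
print(table.shape)   # (3, 4)
print(table.size)    # 12
print(table.ndim)    # 2
```

The number of elements never changes during a reshape, so rows times columns must equal the original size.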


# arange()
arange is similar to the range function in Python.  It creates an array of values from start up to, but not including, stop.  With a single argument, the values start at 0.
```
A1 = np.arange(11)
A2 = np.arange(5, 11)
```

In [34]:
A1 = np.arange(11)
A1

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [35]:
A2 = np.arange(5, 11)
A2

array([ 5,  6,  7,  8,  9, 10])
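
Like range, arange also accepts an optional third argument, the step. A minimal sketch with example values:

```python
import numpy as np

evens = np.arange(0, 11, 2)    # start 0, stop before 11, step 2
print(evens)                   # [ 0  2  4  6  8 10]

halves = np.arange(0, 2, 0.5)  # unlike range, a non-integer step also works
print(halves)                  # [0.  0.5 1.  1.5]
```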

# Random Numbers
You may want to generate a random number between 0 and 1.  With no arguments, rand returns a single value that is at least 0 and less than 1.

```
np.random.rand()
```

In [36]:
np.random.rand()

0.5270618619114266

# Random Numbers (rows and columns)
You may want to generate random numbers between 0 and 1 arranged in a matrix with a given number of rows and columns.
```
np.random.rand(3, 5) # (rows, columns)
```

In [37]:
np.random.rand(3, 5)

array([[0.79271479, 0.38396621, 0.8331135 , 0.03719885, 0.08690188],
       [0.78585578, 0.44964238, 0.202725  , 0.09680042, 0.5774812 ],
       [0.12008124, 0.24389083, 0.91794674, 0.69579966, 0.92529777]])
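
The topic list also mentions uniform random real numbers. As a sketch, np.random.uniform draws real numbers between a lower and an upper bound of your choice; the bounds used here are hypothetical.

```python
import numpy as np

# Five random real numbers drawn uniformly between 10 and 20
u = np.random.uniform(10, 20, 5)

# A 2 x 3 matrix of uniform values between -1 and 1
m = np.random.uniform(-1, 1, (2, 3))

print(u)
print(m)
```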

# Random Numbers - Integers (rows, columns)
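You may want to generate random integers in a range, either as one value or as a matrix of values.  The lower bound is included and the upper bound is excluded, so randint(1, 101) returns integers from 1 to 100.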
```
np.random.randint(1, 101) # random number between 1 and 100
np.random.randint(1, 101, (5,3))
```

In [38]:
np.random.randint(1, 101)

22

In [39]:
np.random.randint(1, 101, (5,3))

array([[57, 85, 76],
       [18,  4,  5],
       [87, 64, 66],
       [79, 75, 72],
       [83, 77, 20]])
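
Random results change on every run. If you need repeatable numbers, one option is NumPy's Generator interface with a seed. This is an alternative to the np.random functions used in this lesson, and the seed 42 is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(42)           # 42 is an arbitrary seed

ints = rng.integers(1, 101, size=(5, 3))  # like randint: 101 is excluded
reals = rng.random((3, 5))                # like rand: values from 0 up to 1

# A second generator with the same seed repeats the same draws
rng2 = np.random.default_rng(42)
ints_again = rng2.integers(1, 101, size=(5, 3))
print(np.array_equal(ints, ints_again))   # True
```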

# Random Numbers - Normal
The normal distribution is very important in statistics.  You can generate random numbers from a normal distribution by giving its mean and standard deviation, plus an optional shape (rows, columns).
```
# a random number from a Normal distribution with mean = 75 and std = 5
np.random.normal(75, 5)
```

In [40]:
np.random.normal(75, 5, (4, 5))

array([[71.21726305, 80.81937103, 73.47829239, 77.72780332, 82.00283119],
       [77.96310416, 76.25062054, 74.55816873, 87.11153051, 72.65036388],
       [73.11478165, 76.2685694 , 71.9892267 , 69.05147113, 77.54666411],
       [79.11906143, 73.78926272, 79.02315089, 74.28073896, 67.11312945]])

# Converting a list to an Array
You can use a tuple to assign the mean and standard deviation in one statement.  Then use a list comprehension to build a list of random values, and convert the list to an array with np.array.
```
mean, std = (75, 1.5)
l3 = [np.random.normal(mean, std) for x in np.arange(25)]
L3 = np.array(l3)  # convert the list to an array
L3
```

In [41]:
mean, std = (75, 1.5)
l3 = [np.random.normal(mean, std) for x in np.arange(25)]
L3 = np.array(l3)
L3

array([75.19342644, 74.10090291, 76.75389595, 75.15545685, 73.33188661,
       73.70481788, 72.4157888 , 74.09252912, 77.8321144 , 76.7486797 ,
       76.83156856, 74.77752219, 73.03358406, 76.47213105, 75.54317536,
       75.11301476, 75.23188117, 73.52558889, 76.80409912, 76.05128476,
       73.36353126, 75.17488867, 74.97945775, 74.64370407, 75.98694797])
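
The list comprehension is useful practice for converting lists, but the same kind of array can also be produced in one call by passing the number of values as the third argument. A minimal sketch (the variable name scores is hypothetical):

```python
import numpy as np

mean, std = (75, 1.5)
scores = np.random.normal(mean, std, 25)  # 25 draws in a single call

print(type(scores))
print(scores.shape)  # (25,)
```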

# Arithmetic with NumPy Arrays
You can perform arithmetic operations on arrays element by element.  The usual order of operations applies: parentheses, powers, multiplication and division, then addition and subtraction.
```
import numpy as np
A = np.array([31, 32, 33, 34, 35, 36, 37, 38, 39])
B = np.array([41, 42, 43, 44, 45, 46, 47, 48, 49])
A ** 3
A * 4
A / 3
A + 7
A - 6
(A + B) ** 2 + 3
A + 3 * B
2 * A - 4 * B
```

In [42]:
import numpy as np
A = np.array([31, 32, 33, 34, 35, 36, 37, 38, 39])
B = np.array([41, 42, 43, 44, 45, 46, 47, 48, 49])
A ** 3

array([29791, 32768, 35937, 39304, 42875, 46656, 50653, 54872, 59319])

In [43]:
A * 4

array([124, 128, 132, 136, 140, 144, 148, 152, 156])

In [44]:
A + 7

array([38, 39, 40, 41, 42, 43, 44, 45, 46])

In [45]:
A - 6

array([25, 26, 27, 28, 29, 30, 31, 32, 33])

In [46]:
(A + B) ** 2 + 3

array([5187, 5479, 5779, 6087, 6403, 6727, 7059, 7399, 7747])

In [47]:
A + 3 * B

array([154, 158, 162, 166, 170, 174, 178, 182, 186])

In [48]:
2 * A - 4 * B

array([-102, -104, -106, -108, -110, -112, -114, -116, -118])

In [49]:
G = np.array([11, 12, 13, 14, 15, 16, 17, 18, 19])
A * G

array([341, 384, 429, 476, 525, 576, 629, 684, 741])

In [50]:
A - G

array([20, 20, 20, 20, 20, 20, 20, 20, 20])

In [51]:
A/G

array([2.81818182, 2.66666667, 2.53846154, 2.42857143, 2.33333333,
       2.25      , 2.17647059, 2.11111111, 2.05263158])

In [52]:
(A + 3*G) ** 2

array([4096, 4624, 5184, 5776, 6400, 7056, 7744, 8464, 9216])
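
Division deserves a closer look because it changes the data type. As a short sketch with a three-element example array: `/` always returns real numbers, while `//` keeps the whole-number part and `%` gives the remainder.

```python
import numpy as np

A = np.array([31, 32, 33])

true_div = A / 3    # real-number division; the result is a float array
floor_div = A // 3  # whole-number part of the division
remainder = A % 3   # remainder after division

print(true_div)
print(floor_div)    # [10 10 11]
print(remainder)    # [1 2 0]
```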

# Numpy Statistical Methods 
### Statistical methods available on a NumPy array (vector)
 - max()
 - argmax()
 - min()
 - argmin()
 - mean()
 - np.median(array), a function rather than an array method
 - std()
 - sum()
 - size, an attribute that gives the count of elements




In [53]:
K = np.random.normal(65, 2.5, (20,1))

In [54]:
K.mean().item()

65.46804869234407

In [55]:
# maximum value
L3.max().item()

77.83211439514776

In [56]:
# index of maximum value
L3.argmax().item()

8

In [57]:
# minimum of array
L3.min().item()

72.41578880120036

In [58]:
L3.argmin().item()

6

In [59]:
L3.mean().item()

75.0744751311121

In [60]:
L3.std().item()

1.3710075207560533

In [61]:
np.percentile(L3, 25).item()

74.0925291150512

In [62]:
np.percentile(L3, 75).item()

76.0512847595619

In [63]:
np.percentile(L3, 10).item()

73.34454446674229
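
Two details are worth a small sketch: the median is computed with the function np.median rather than an array method, and std() divides by n unless you pass ddof=1 for the sample standard deviation. The scores below are hypothetical values.

```python
import numpy as np

scores = np.array([72.0, 75.0, 71.0, 80.0, 77.0])  # hypothetical values

med = np.median(scores)          # middle value of the sorted data
count = scores.size              # number of elements
pop_std = scores.std()           # population std (divides by n)
sample_std = scores.std(ddof=1)  # sample std (divides by n - 1)

print(med)    # 75.0
print(count)  # 5
print(pop_std, sample_std)
```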

# axis 
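The axis argument determines the direction in which a method is applied.  With axis = 0 the method works down each column and returns one result per column.  With axis = 1 it works across each row and returns one result per row.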

```
X = np.arange(0, 15)
X = X.reshape(3, 5)
X.mean(axis = 0)
X.sum(axis = 0)
X.mean(axis = 1)
X.sum(axis = 1)
```


In [64]:
X = np.arange(0, 15)
X = X.reshape(3, 5)
X

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

In [65]:
Y = X.sum(axis = 1)
Y

array([10, 35, 60])

In [66]:
row_1_total = Y.item(0)
row_1_total

10

In [67]:
nm = X.sum(axis = 1).item(0)
nm

10

In [68]:
X.mean(axis = 0)  # average of columns

array([5., 6., 7., 8., 9.])

In [69]:
X.mean(axis = 1)  # average of rows

array([ 2.,  7., 12.])

In [70]:
X.std(axis = 0) # standard deviation of columns

array([4.0824829, 4.0824829, 4.0824829, 4.0824829, 4.0824829])

In [71]:
X.std(axis = 1) # standard deviation of rows

array([1.41421356, 1.41421356, 1.41421356])

In [72]:
np.percentile(X, [25, 75],  axis=1)

array([[ 1.,  6., 11.],
       [ 3.,  8., 13.]])
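
To connect axis to a business setting, here is a sketch using a hypothetical sales table in which each row is a product and each column is a quarter.

```python
import numpy as np

# Hypothetical units sold: rows = products, columns = quarters Q1..Q4
sales = np.array([
    [10, 12, 9, 15],   # product A
    [20, 18, 25, 22],  # product B
    [5, 7, 6, 8],      # product C
])

per_product = sales.sum(axis=1)  # one total per row
per_quarter = sales.sum(axis=0)  # one total per column
grand_total = sales.sum()        # no axis: total over every element

print(per_product)  # [46 85 26]
print(per_quarter)  # [35 37 40 45]
print(grand_total)  # 157
```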

In [73]:
(X > 4).any()  # Are there any values in X greater than 4

np.True_

In [74]:
(X > -1).all()  # Are all the values in X greater than -1

np.True_

In [75]:
L4 = np.sort(L3)
L4

array([72.4157888 , 73.03358406, 73.33188661, 73.36353126, 73.52558889,
       73.70481788, 74.09252912, 74.10090291, 74.64370407, 74.77752219,
       74.97945775, 75.11301476, 75.15545685, 75.17488867, 75.19342644,
       75.23188117, 75.54317536, 75.98694797, 76.05128476, 76.47213105,
       76.7486797 , 76.75389595, 76.80409912, 76.83156856, 77.8321144 ])

# Conditional Logic
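np.where creates a new array based on a condition: it returns the first value where the condition is True and the second value where it is False.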
```
np.where(L4 > L4.mean(), 'higher', 'lower')
```

In [76]:
np.where(L4 > L4.mean(), 'higher', 'lower')

array(['lower', 'lower', 'lower', 'lower', 'lower', 'lower', 'lower',
       'lower', 'lower', 'lower', 'lower', 'higher', 'higher', 'higher',
       'higher', 'higher', 'higher', 'higher', 'higher', 'higher',
       'higher', 'higher', 'higher', 'higher', 'higher'], dtype='<U6')

In [77]:
w1 = L4.mean() - 1 * L4.std()
w2 = L4.mean() - 2 * L4.std()
w3 = L4.mean() + 1 * L4.std()
w4 = L4.mean() + 2 * L4.std()
np.where(L4 < w2, "2 std below",
         np.where(L4 < w1, "1 std below",
                  np.where(L4 > w4, "2 std above",
                           np.where(L4 > w3, "1 std above", "within 1 std"))))

array(['1 std below', '1 std below', '1 std below', '1 std below',
       '1 std below', 'within 1 std', 'within 1 std', 'within 1 std',
       'within 1 std', 'within 1 std', 'within 1 std', 'within 1 std',
       'within 1 std', 'within 1 std', 'within 1 std', 'within 1 std',
       'within 1 std', 'within 1 std', 'within 1 std', '1 std above',
       '1 std above', '1 std above', '1 std above', '1 std above',
       '2 std above'], dtype='<U12')
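
Nested np.where calls become hard to read as the number of bands grows. One alternative is np.select, which takes a list of conditions and a matching list of labels and uses the first condition that is True. The sketch below uses a small hypothetical array rather than L4, and the label "within 1 std" is an assumed name for values that match no condition.

```python
import numpy as np

values = np.array([70.0, 73.5, 75.0, 76.5, 80.0])  # hypothetical data
m, s = values.mean(), values.std()

# Conditions are checked in order, so the most extreme bands come first
conditions = [
    values < m - 2 * s,
    values < m - 1 * s,
    values > m + 2 * s,
    values > m + 1 * s,
]
labels = ["2 std below", "1 std below", "2 std above", "1 std above"]

bands = np.select(conditions, labels, default="within 1 std")
print(bands)
```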