## <p align="center" style="font-family: monospace, monaco;"> Chapter 4 - NumPy Foundations </p>
### <p align="center" style="font-family: monospace, monaco;"> Jupyter Notebook </p>

<p style="font-family: monospace, monaco;"> <span style="background-color:#ffb3c6;" ><strong>NumPy</strong></span> &rarr; A core package for scientific computing. It provides support for array-based calculations as well as linear algebra.</p> 

### <p style="font-family: monospace, monaco;"> NumPy Array</p>
- <p style="font-family: monospace, monaco;">To perform array-based calculations with a nested list, you have to write some sort of loop.</p>
<p style="font-family: monospace, monaco"> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Ex] To add a number to every element in a nested list, you could do:</p>


In [1]:
import numpy as np

In [16]:
matrix = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]
[[i + 1 for i in row] for row in matrix]

[[2, 3, 4], [5, 6, 7], [8, 9, 10]]

- <p style="font-family: monospace, monaco;"> Unfortunately, this code is not only difficult to read, it also becomes very slow when looping through big arrays. </p>
- <p style="font-family: monospace, monaco;"> Depending on the use case and the size of the arrays, using NumPy arrays instead of Python lists can be anywhere from a couple of times to around a hundred times faster.</p>
- <p style="font-family: monospace, monaco;"> NumPy achieves this by relying on code written in compiled languages like C and Fortran.</p>
- <p style="font-family: monospace, monaco;"> <span style="background-color:#ffb3c6;" > A NumPy array is an N-dimensional array for <em>homogeneous data</em></span></p>
- <p style="font-family: monospace, monaco;"> <span style="background-color:#ffb3c6;" >Homogeneous Data </span> &rarr; Data of the same data type (ex: all ints) </p>
- <p style="font-family: monospace, monaco;"> The image below shows a one-dimensional and a two-dimensional array of floats:</p>

<img src="media/1-2d.png" alt="one and two dimensional" style="width:500px; height:250px;" >

<p style="font-family: monospace, monaco;"> Creating a one and two dimensional array to work with: </p>

In [6]:
import numpy as np

In [14]:
# Constructing an array from a simple list results in a one-dimensional array
array1 = np.array([10, 100, 1000.])

In [15]:
# Constructing an array from a nested list results in a two-dimensional array
array2 = np.array([[1., 2., 3.],
                   [4., 5., 6.]])
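
<p style="font-family: monospace, monaco;"> To check how many dimensions an array has, a minimal sketch like the following can help. It rebuilds <code>array1</code> and <code>array2</code> so that it runs on its own and uses the <code>ndim</code> and <code>shape</code> attributes, which are not covered in the notes above: </p>

```python
import numpy as np

array1 = np.array([10, 100, 1000.])
array2 = np.array([[1., 2., 3.],
                   [4., 5., 6.]])

# ndim is the number of dimensions, shape is the size along each dimension
print(array1.ndim, array1.shape)  # 1 (3,)
print(array2.ndim, array2.shape)  # 2 (2, 3)
```

<p style="font-family: monospace, monaco;"> The shape <code>(3,)</code> has only one entry, which is one way to see that a one-dimensional array is neither a row nor a column. </p>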

- <p style="font-family: monospace, monaco;"> A one-dimensional array does not have an explicit row or column orientation</p>
- <p style="font-family: monospace, monaco;"> <code>float64</code> is able to accommodate all elements. We can see this because <code>array1</code> consists of <code>ints</code> except for the last number, which is a <code>float</code>; this turns the data type of the entire array into <code>float64</code></p>
- <p style="font-family: monospace, monaco;"> To access the data type of an array, you can use <code>dtype</code>: </p>

In [17]:
array1.dtype

dtype('float64')

- <p style="font-family: monospace, monaco;"> NumPy uses its own data types, which are more granular than Python's</p>
- <p style="font-family: monospace, monaco;"> You can manually convert a NumPy data type back to a Python data type by doing: </p>

In [13]:
float(array1[0])

10.0
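
<p style="font-family: monospace, monaco;"> The data type behavior described above can be sketched as follows. The variable names are made up for illustration, and the <code>.item()</code> and <code>.tolist()</code> methods are additions that the notes do not mention; they are shown here as alternative ways to get back to Python types: </p>

```python
import numpy as np

ints = np.array([10, 100, 1000])      # only ints: NumPy picks an integer dtype
mixed = np.array([10, 100, 1000.])    # one float: the whole array becomes float64

print(ints.dtype)   # an integer dtype; the exact width depends on the platform
print(mixed.dtype)  # float64

first = mixed[0]             # a NumPy scalar (np.float64)
as_float = float(first)      # explicit conversion to a Python float
as_item = first.item()       # .item() also returns a Python scalar
as_list = mixed.tolist()     # .tolist() converts every element to a Python type
print(type(first).__name__, type(as_float).__name__, as_list)
```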

### <p style="font-family: monospace, monaco;"> Vectorization and Broadcasting</p>
- <p style="font-family: monospace, monaco;"> When building the sum of a scalar and a NumPy array, NumPy performs an element-wise operation, so you don't have to loop through the elements yourself. This is referred to as vectorization</p>
- <p style="font-family: monospace, monaco;"> Vectorization allows you to write concise code that practically mirrors the mathematical notation: </p>

In [19]:
array2 + 1

array([[2., 3., 4.],
       [5., 6., 7.]])
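
<p style="font-family: monospace, monaco;"> As a rough comparison, the sketch below applies the same "add 1" operation to the nested list <code>matrix</code> from the beginning of the chapter, once with a loop and once vectorized. The names <code>looped</code> and <code>vectorized</code> are only used for this illustration: </p>

```python
import numpy as np

matrix = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]

# Pure Python: a nested list comprehension visits every element
looped = [[i + 1 for i in row] for row in matrix]

# NumPy: convert once, then add the scalar to the whole array
vectorized = np.array(matrix) + 1

print(vectorized.tolist() == looped)  # True
```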

- <p style="font-family: monospace, monaco;">Scalar &rarr; refers to a basic Python data type like a float or a string</p>
- <p style="font-family: monospace, monaco;"> When working with two arrays of the same shape, NumPy still performs the operation element-wise: </p>

In [21]:
array2 * array2

array([[ 1.,  4.,  9.],
       [16., 25., 36.]])

- <p style="font-family: monospace, monaco;">Broadcasting &rarr; When using two arrays with different shapes in an arithmetic operation, NumPy automatically extends the smaller array across the larger one so that their shapes become compatible (only when possible)</p>

In [22]:
array2 * array1

array([[  10.,  200., 3000.],
       [  40.,  500., 6000.]])
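
<p style="font-family: monospace, monaco;"> To make the "extending" step visible, here is a small sketch that uses <code>np.broadcast_to</code>, which is not part of the notes above, to show what <code>array1</code> looks like once it has been stretched to the shape of <code>array2</code>. It also shows a case where broadcasting is not possible: </p>

```python
import numpy as np

array1 = np.array([10, 100, 1000.])    # shape (3,)
array2 = np.array([[1., 2., 3.],
                   [4., 5., 6.]])      # shape (2, 3)

# array1 is repeated for every row of array2
stretched = np.broadcast_to(array1, array2.shape)
print(stretched)

# Multiplying by the stretched array gives the same result as broadcasting
same = np.array_equal(array2 * array1, array2 * stretched)
print(same)  # True

# A length-2 array cannot be matched against rows of length 3
try:
    array2 * np.array([1., 2.])
    failed = False
except ValueError:
    failed = True
print(failed)  # True
```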

- <p style="font-family: monospace, monaco;"> When performing matrix multiplications or dot products, use the <code>@</code> operator:</p>

In [23]:
array2 @ array2.T  # array2.T is short for array2.transpose()

array([[14., 32.],
       [32., 77.]])

- <p style="font-family: monospace, monaco;"> The <code>.transpose()</code> method turns all column elements into row elements and all row elements into column elements (basically swapping the rows and columns)</p> 
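
<p style="font-family: monospace, monaco;"> The following sketch looks at the shapes involved in the matrix multiplication above. The comparison with <code>np.dot</code> is an addition for illustration; for two-dimensional arrays it should give the same result as <code>@</code>: </p>

```python
import numpy as np

array2 = np.array([[1., 2., 3.],
                   [4., 5., 6.]])

print(array2.shape, array2.T.shape)  # (2, 3) (3, 2)

# (2, 3) @ (3, 2) -> (2, 2); the inner dimensions have to match
product = array2 @ array2.T
# The first entry is 1*1 + 2*2 + 3*3 = 14
print(product)

# Swapping the operands gives a (3, 3) result instead
other = array2.T @ array2
print(other.shape)  # (3, 3)

print(np.array_equal(product, np.dot(array2, array2.T)))  # True
```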

### <p style="font-family: monospace, monaco;"> Universal Function (ufunc) </p>
- <p style="font-family: monospace, monaco;"> Universal Function (ufunc) &rarr; a function that works on every single element in a NumPy array</p>
<p style="font-family: monospace, monaco;"> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Ex] Using <code>sqrt</code> from Python's <code>math</code> module on a NumPy array produces an error:</p>

In [25]:
import math
math.sqrt(array2)

TypeError: only length-1 arrays can be converted to Python scalars

- <p style="font-family: monospace, monaco;">You could write a nested loop (here, a nested list comprehension) to get the square root of every element, then build a NumPy array again from the results: </p>

In [26]:
np.array([[math.sqrt(i) for i in row] for row in array2])

array([[1.        , 1.41421356, 1.73205081],
       [2.        , 2.23606798, 2.44948974]])

- <p style="font-family: monospace, monaco;"> While this works when NumPy does not offer a ufunc and the array is small enough, using NumPy's ufunc (if available) is much faster with big arrays (and also easier to type and read)</p>

In [27]:
np.sqrt(array2)

array([[1.        , 1.41421356, 1.73205081],
       [2.        , 2.23606798, 2.44948974]])
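
<p style="font-family: monospace, monaco;"> As a quick check, the sketch below confirms that the loop and the ufunc produce the same values. It also tries two further ufuncs, <code>np.square</code> and <code>np.maximum</code>, which are not mentioned in the notes and are only picked as examples: </p>

```python
import math
import numpy as np

array2 = np.array([[1., 2., 3.],
                   [4., 5., 6.]])

looped = np.array([[math.sqrt(i) for i in row] for row in array2])
vectorized = np.sqrt(array2)
print(np.allclose(looped, vectorized))  # True

# Other ufuncs work element-wise in the same way
squared = np.square(array2)       # every element multiplied by itself
at_least_3 = np.maximum(array2, 3)  # element-wise maximum against a scalar
print(squared)
print(at_least_3)
```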

- <p style="font-family: monospace, monaco;"> Some NumPy functions, like <code>sum</code>, are also available as array methods</p>
- <p style="font-family: monospace, monaco;"> To find the sum of each column, you can do: </p>

In [28]:
array2.sum(axis=0)

array([5., 7., 9.])

- <p style="font-family: monospace, monaco;"> <code>axis=0</code> &rarr; the axis along the rows, so the operation runs down each column and returns one result per column </p>
- <p style="font-family: monospace, monaco;"> <code>axis=1</code> &rarr; the axis along the columns, so the operation runs across each row and returns one result per row </p>
- <p style="font-family: monospace, monaco;"> Calling <code>sum</code> without the <code>axis</code> argument sums up the whole array</p>

In [29]:
array2.sum()

np.float64(21.0)
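
<p style="font-family: monospace, monaco;"> Since only <code>axis=0</code> is shown above, here is a short sketch of the other cases. The <code>keepdims</code> argument and the <code>mean</code> method are additions for illustration and are not part of the notes: </p>

```python
import numpy as np

array2 = np.array([[1., 2., 3.],
                   [4., 5., 6.]])

column_sums = array2.sum(axis=0)   # one value per column
row_sums = array2.sum(axis=1)      # one value per row
print(column_sums)  # [5. 7. 9.]
print(row_sums)     # [ 6. 15.]

# keepdims=True keeps the result two-dimensional (here: a column of shape (2, 1))
row_sums_2d = array2.sum(axis=1, keepdims=True)
print(row_sums_2d.shape)  # (2, 1)

# Other aggregations accept the axis argument in the same way
column_means = array2.mean(axis=0)
print(column_means)  # [2.5 3.5 4.5]
```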

- <p style="font-family: monospace, monaco;"> NumPy ufuncs can also be used with pandas DataFrames</p>


### <p style="font-family: monospace, monaco;"> Creating and Manipulating Arrays</p> 
**<p style="font-family: monospace, monaco;"> Getting and Setting Array Elements </p>**
- <p style="font-family: monospace, monaco;"> When working with a nested list like <code>matrix</code>, you can use chained indexing: <code>matrix[0][0]</code>. This gets you the first element of the first row </p>
- <p style="font-family: monospace, monaco;"> With NumPy, you provide the index or slice arguments for both dimensions in a single pair of brackets: </p>

In [None]:
numpy_array[row_selection, column_selection] 

- <p style="font-family: monospace, monaco;"> For a one-dimensional array, you'd write: <code>numpy_array[selection]</code> </p>
- <p style="font-family: monospace, monaco;"> The slice notation includes the start index and excludes the end index, like: <code>start:end</code></p>
- <p style="font-family: monospace, monaco;"> When you leave out the start and end index, you are left with a colon, which stands for all rows or all columns</p>
- <p style="font-family: monospace, monaco;"> When selecting a single column or row of a two-dimensional array, you are left with a one-dimensional array, not a two-dimensional row or column vector</p>
<img src="../media/slice.png" alt="slice examples" style="height: 100px; width: 500px;">
- <p style="font-family: monospace, monaco;"> Examples from the image shown above:</p>

In [32]:
array1[2] #returns a scalar

np.float64(1000.0)

In [33]:
array2[0,0] #returns a scalar

np.float64(1.0)

In [34]:
array2[:,1:] #returns a 2d array

array([[2., 3.],
       [5., 6.]])

In [35]:
array2[:,1] #returns a 1d array

array([2., 5.])

In [36]:
array2[1, :2] #returns a 1d array

array([4., 5.])
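
<p style="font-family: monospace, monaco;"> If a two-dimensional column or row vector is needed instead of a one-dimensional array, one possible approach (an addition to these notes) is to select with a slice of length one, or to add an axis with <code>np.newaxis</code>. The variable names below are only for illustration: </p>

```python
import numpy as np

array2 = np.array([[1., 2., 3.],
                   [4., 5., 6.]])

column_1d = array2[:, 1]                   # index -> one-dimensional
column_2d = array2[:, 1:2]                 # slice of length one -> stays two-dimensional
column_new = array2[:, 1][:, np.newaxis]   # add a second axis afterwards
row_2d = array2[:1, :]                     # first row as a two-dimensional row vector

print(column_1d.shape)   # (2,)
print(column_2d.shape)   # (2, 1)
print(column_new.shape)  # (2, 1)
print(row_2d.shape)      # (1, 3)
```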

<strong><p style="font-family: monospace, monaco;">Useful Array Constructors</p></strong>
- <p style="font-family: monospace, monaco;"> You can easily create an array by using <code>arange</code>, which stands for array range</p>
- <p style="font-family: monospace, monaco;"> <code>arange</code> is similar to the built-in <code>range</code>; the main difference is that <code>arange</code> returns a NumPy array</p>
- <p style="font-family: monospace, monaco;"> Using <code>reshape</code> allows us to generate an array with the desired dimensions</p>


In [37]:
np.arange(2*5).reshape(2,5) #2 rows, 5 columns

array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])
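
<p style="font-family: monospace, monaco;"> A few more variations, sketched under the assumption that the start, stop and step arguments behave like those of <code>range</code>. Passing <code>-1</code> to <code>reshape</code> is an addition to these notes; it asks NumPy to work out that dimension from the number of elements: </p>

```python
import numpy as np

evens = np.arange(2, 10, 2)          # start, stop (excluded), step
print(evens)  # [2 4 6 8]

# Unlike range, arange also accepts a non-integer step
quarters = np.arange(0, 1, 0.25)
print(quarters)  # [0.   0.25 0.5  0.75]

# -1 lets NumPy infer the number of columns: 6 elements / 2 rows = 3 columns
inferred = np.arange(6).reshape(2, -1)
print(inferred.shape)  # (2, 3)
```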

- <p style="font-family: monospace, monaco;"> You may also need to generate an array with pseudorandom numbers:</p>

In [38]:
np.random.randn(2,3) #2 rows, 3 columns

array([[ 0.88011096, -0.83695597, -0.23195625],
       [ 0.49071027,  1.33413016, -0.337547  ]])

- <p style="font-family: monospace, monaco;"> Other useful constructors are <code>np.ones</code> and <code>np.zeros</code>, which create arrays of ones and zeros. There is also <code>np.eye</code>, which creates an identity matrix</p>
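
<p style="font-family: monospace, monaco;"> A minimal sketch of these constructors follows. The last part uses <code>np.random.default_rng</code> with a seed, which is not covered in the notes; it is shown as one way to get reproducible pseudorandom numbers, and the seed value <code>42</code> is arbitrary: </p>

```python
import numpy as np

zeros = np.zeros((2, 3))   # the shape is passed as a tuple: 2 rows, 3 columns
ones = np.ones((2, 3))
identity = np.eye(3)       # 3x3 identity matrix
print(zeros)
print(ones)
print(identity)

# A seeded generator returns the same numbers on every run
rng = np.random.default_rng(seed=42)
sample = rng.standard_normal((2, 3))
print(sample.shape)  # (2, 3)
```
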
<strong><p style="font-family: monospace, monaco;"> View vs. Copy</p></strong>
- <p style="font-family: monospace, monaco;"> Slicing an array returns a view. This means you are working with a subset of the original data rather than a copy, so changing the view also changes the original array:</p>

In [39]:
array2

array([[1., 2., 3.],
       [4., 5., 6.]])

In [40]:
subset = array2[:,:2]
subset

array([[1., 2.],
       [4., 5.]])

In [42]:
subset[0,0] = 1000
subset

array([[1000.,    2.],
       [   4.,    5.]])

In [43]:
array2

array([[1000.,    2.,    3.],
       [   4.,    5.,    6.]])

- <p style="font-family: monospace, monaco;"> If that is not what you want, add <code>.copy()</code> at the end of the line where you assign the slice to <code>subset</code>: </p>

In [44]:
subset = array2[:, :2].copy()
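
<p style="font-family: monospace, monaco;"> To check whether a variable is a view or a copy, one option is <code>np.shares_memory</code>, which is not mentioned in the notes above. The sketch below rebuilds <code>array2</code> with its original values, and the names <code>view</code> and <code>independent</code> are only for illustration: </p>

```python
import numpy as np

array2 = np.array([[1., 2., 3.],
                   [4., 5., 6.]])

view = array2[:, :2]                 # slicing returns a view
independent = array2[:, :2].copy()   # .copy() returns separate data

print(np.shares_memory(array2, view))         # True
print(np.shares_memory(array2, independent))  # False

# Changing the copy leaves the original untouched
independent[0, 0] = 1000
print(array2[0, 0])  # 1.0
```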

### <p style="font-family: monospace, monaco;"> Conclusion</p>
- <p style="font-family: monospace, monaco;"> Once text is involved, the array no longer has a numeric data type (NumPy falls back to a string or <code>object</code> data type), so mathematical operations generally can't be performed on it</p>
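
<p style="font-family: monospace, monaco;"> A small sketch of what happens once text enters an array. Depending on how the array is built, NumPy may store the values as a fixed-width string type or, when asked explicitly via <code>dtype=object</code>, as generic Python objects. The example values are made up for illustration: </p>

```python
import numpy as np

numbers = np.array([1., 2., 3.])
as_object = np.array([1., "two", 3.], dtype=object)  # explicitly requested
as_text = np.array(["1", "two", "3"])                # all strings

print(numbers.dtype)       # float64
print(as_object.dtype)     # object
print(as_text.dtype.kind)  # U (a Unicode string type)

# Arithmetic breaks as soon as it reaches the text element
try:
    as_object + 1
    failed = False
except TypeError:
    failed = True
print(failed)  # True
```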
