# Introduction to NumPy
NumPy Notes:

In [141]:
import numpy as np
np.__version__

'2.2.4'

## Differences between lists and NumPy arrays
* Lists - items can be added or deleted, size is mutable, elements can have different data types
* Arrays - fixed size, one data type for every element, stored contiguously in memory
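As a rough illustration of the "same data type" point, here is a minimal sketch (the names `mixed_list` and `mixed_array` are made up for this example). When a list of numbers with mixed types is passed to `np.array`, NumPy picks one dtype that can hold every element.

```python
import numpy as np

# A plain list happily mixes a float, an int, and a bool
mixed_list = [3.4, 3, True]

# np.array converts everything to a single common dtype
mixed_array = np.array(mixed_list)
print(mixed_array.dtype)  # float64

# Arrays have no in-place append method like lists do
print(hasattr(mixed_array, "append"))  # False
```

Growing an array means building a new one, which is what `np.append` does later in these notes.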

In [142]:
gpas_as_list = [3.4, 3.2, 3.6]

In [143]:
# Can add elements to it
gpas_as_list.append(4.0)
# Can have multiple data types
gpas_as_list.insert(1, "Whatevs")
# Can have items removed
gpas_as_list.pop(1)

'Whatevs'

In [144]:
gpas_as_list

[3.4, 3.2, 3.6, 4.0]

In [145]:
gpas = np.array(gpas_as_list)

In [146]:
?gpas

Type:        ndarray
String form: [3.4 3.2 3.6 4. ]
Length:      4
File:        c:\users\blaise\appdata\local\programs\python\python313\lib\site-packages\numpy\__init__.py
Docstring:
ndarray(shape, dtype=float, buffer=None, offset=0,
        strides=None, order=None)

An array object represents a multidimensional, homogeneous array
of fixed-size items.  An associated data-type object describes the
format of each element in the array (its byte-order, how many bytes it
occupies in memory, whether it is an integer, a floating point number,
or something else, etc.)

Arrays should be constructed using `array`, `zeros` or `empty` (refer
to the See Also section below).  The parameters given here refer to
a low-level method (`ndarray(...)`) for instantiating an array.

For more information, refer to the `numpy` module and examine the
methods and attributes of an array.

Parameters
----------
(for the __new__ method; see Notes below)

shape : 

In [147]:
gpas.dtype

dtype('float64')

In [148]:
gpas.itemsize

8

In [149]:
gpas.size

4

In [150]:
len(gpas)

4

In [151]:
gpas.nbytes

32

## Multidimensional arrays
* Keep [floating point precision](https://numpy.org/doc/stable/user/basics.types.html#extended-precision) in mind when adding floats from an array.
* The data structure is called `ndarray`, representing any **n**umber of **d**imensions.
* Arrays can have multiple dimensions; they are set when the array is created.
* Dimensions help define what each element in the array represents. A two-dimensional array can be thought of as an array of arrays.
* **Rank** is the number of dimensions an array contains (the `ndim` attribute).
* **Shape** is the length of each of the array's dimensions.
* Each dimension is also referred to as an **axis**, and axes are zero-indexed.
* A 2D array is also known as a **matrix**.
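To make rank, shape, and axes concrete, here is a small sketch using a made-up 3x4 array named `grid` (not part of the original notes). Summing along axis 0 collapses the rows, and summing along axis 1 collapses the columns.

```python
import numpy as np

# Hypothetical 3x4 array holding the values 0..11
grid = np.arange(12).reshape(3, 4)

print(grid.ndim)   # 2 (rank)
print(grid.shape)  # (3, 4)

# Axis 0 runs down the rows, so this gives one total per column
column_totals = grid.sum(axis=0)
print(column_totals)  # [12 15 18 21]

# Axis 1 runs across the columns, so this gives one total per row
row_totals = grid.sum(axis=1)
print(row_totals)  # [ 6 22 38]
```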

In [152]:
student_gpas = np.array([
    [3.4, 3.2, 3.6, 4.0],
    [3.2, 3.8, 4.0, 4.0],
    [1.2, 2.4, 3.5, 3.8]
], np.float16)
student_gpas

array([[3.4, 3.2, 3.6, 4. ],
       [3.2, 3.8, 4. , 4. ],
       [1.2, 2.4, 3.5, 3.8]], dtype=float16)
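The `np.float16` dtype used above saves memory but stores fewer significant digits, which is where the precision warning comes in. The following is only a sketch, with `half` and `double` as illustrative names: it compares how closely each type stores 3.4.

```python
import numpy as np

half = np.float16(3.4)
double = np.float64(3.4)

# Converting back to a Python float exposes the value actually stored
print(float(half) == 3.4)    # False: float16 holds only a nearby value
print(float(double) == 3.4)  # True: float64 matches Python's float

# Machine epsilon: the gap between 1.0 and the next representable value
print(np.finfo(np.float16).eps)
print(np.finfo(np.float64).eps)
```

The displayed values still look like `3.4` because NumPy prints the shortest representation that round-trips for that dtype.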

In [153]:
student_gpas.ndim

2

In [154]:
student_gpas.shape

(3, 4)

In [155]:
student_gpas.size

12

In [156]:
len(student_gpas)

3

In [157]:
student_gpas.itemsize

2

In [158]:
student_gpas.itemsize * student_gpas.size

24

In [159]:
%whos ndarray

Variable        Type       Data/Info
------------------------------------
fake_log        ndarray    7: 7 elems, type `uint16`, 14 bytes
gpas            ndarray    4: 4 elems, type `float64`, 32 bytes
index           ndarray    2x2: 4 elems, type `int64`, 32 bytes
student_gpas    ndarray    3x4: 12 elems, type `float16`, 24 bytes
study_minutes   ndarray    4x7: 28 elems, type `uint16`, 56 bytes


In [160]:
np.info(student_gpas)

class:  ndarray
shape:  (3, 4)
strides:  (8, 2)
itemsize:  2
aligned:  True
contiguous:  True
fortran:  False
data pointer: 0x1cedbf0bb60
byteorder:  little
byteswap:  False
type: float16
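
The `strides` and `itemsize` values in this output tie back to the byte counts computed earlier. As a hedged sketch, using a stand-in array named `demo` with the same shape and dtype, the strides give the number of bytes to step to reach the next element along each axis.

```python
import numpy as np

# Stand-in for student_gpas: 3 rows, 4 columns, 2 bytes per element
demo = np.zeros((3, 4), dtype=np.float16)

# Total bytes = number of elements * bytes per element
print(demo.nbytes == demo.size * demo.itemsize)  # True
print(demo.nbytes)  # 24

# Next row is 4 elements * 2 bytes away; next column is 2 bytes away
print(demo.strides)  # (8, 2)

# Transposing swaps the strides without copying the data
print(demo.T.strides)  # (2, 8)
```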


In [161]:
student_gpas[2]

array([1.2, 2.4, 3.5, 3.8], dtype=float16)

In [162]:
student_gpas[2][3]

np.float16(3.8)

## About data types
* The available [data types](https://docs.scipy.org/doc/numpy-1.14.0/user/basics.types.html) are listed in the NumPy documentation.
* Data types are maintained by wrapping values in a [scalar representation](https://numpy.org/doc/stable/reference/arrays.scalars.html), so a value pulled out of an array keeps its NumPy type.
* `np.zeros` is a good way to create an array of zeros with a specific data type.
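A short sketch of what the dtype choice changes in practice (the names `minutes`, `default`, and `value` are illustrative only): `np.zeros` defaults to `float64` unless a dtype is given, and a single element comes back as a NumPy scalar rather than a plain Python `int`.

```python
import numpy as np

minutes = np.zeros(7, np.uint16)  # 2 bytes per element
default = np.zeros(7)             # no dtype given, so float64

print(default.dtype)                    # float64
print(minutes.nbytes, default.nbytes)   # 14 56

# Indexing returns a NumPy scalar that remembers its dtype
value = minutes[0]
print(type(value))  # <class 'numpy.uint16'>

# Convert explicitly when a built-in int is needed
print(type(int(value)))  # <class 'int'>
```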

In [163]:
study_minutes = np.zeros(7, np.uint16)
study_minutes

array([0, 0, 0, 0, 0, 0, 0], dtype=uint16)

In [164]:
%whos

Variable             Type           Data/Info
---------------------------------------------
fake_log             ndarray        7: 7 elems, type `uint16`, 14 bytes
first_day_minutes    uint16         Shape: ()
gpas                 ndarray        4: 4 elems, type `float64`, 32 bytes
gpas_as_list         list           n=4
index                ndarray        2x2: 4 elems, type `int64`, 32 bytes
np                   module         Shape: <function shape at 0x000001CEFF6332E0>
rand                 RandomState    RandomState(MT19937)
second_day_minutes   uint16         Shape: ()
student_gpas         ndarray        3x4: 12 elems, type `float16`, 24 bytes
study_minutes        ndarray        7: 7 elems, type `uint16`, 14 bytes


In [165]:
study_minutes[0] = 150

In [166]:
first_day_minutes = study_minutes[0]

In [167]:
first_day_minutes

np.uint16(150)

In [168]:
type(first_day_minutes)

numpy.uint16

In [169]:
study_minutes[1] = 60

In [170]:
second_day_minutes = study_minutes[1]

In [171]:
second_day_minutes

np.uint16(60)

In [172]:
study_minutes[2:6] = [80, 60, 30, 90]

## Creation 
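* The `np.random` package creates arrays of random values.
  * `RandomState` lets you seed your randomness in a repeatable way.
* You can append to an array:
   * Use the `np.append` function with `axis=0`. The new row must have the same number of dimensions as the array and the same length as the existing rows. `np.append` returns a new array and leaves the original unchanged.
   * Create a new array (or reassign the existing name) by including the existing array in the iterable passed to `np.array` (enclosed in square brackets).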
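
The two creation ideas above can be sketched as follows. The variable names (`first`, `second`, `log`, `new_row`) are made up for this illustration, and the random values themselves are not shown because only their repeatability matters here.

```python
import numpy as np

# Two generators seeded the same way produce the same sequence
first = np.random.RandomState(42).randint(30, 180, size=7)
second = np.random.RandomState(42).randint(30, 180, size=7)
print(np.array_equal(first, second))  # True

log = np.zeros((2, 7), np.uint16)
new_row = np.ones(7, np.uint16)

# Wrap the row in a list so it is 2D like `log`, and append along axis 0
appended = np.append(log, [new_row], axis=0)
print(appended.shape)  # (3, 7)
print(log.shape)       # (2, 7): the original is untouched

# Without an axis, np.append flattens both inputs first
flattened = np.append(log, new_row)
print(flattened.shape)  # (21,)
```

Because `np.append` copies the data every time, appending row by row in a loop can get slow for large arrays.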

## Indexing
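* Use the indexing shortcut of separating dimensions with a comma, for example `study_minutes[1, 1]` instead of `study_minutes[1][1]`.
* Index using a `list` or `np.array` of positions. The values at those indices are returned as a new array with the same shape as the index (fancy indexing).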
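
A compact sketch of both indexing styles, using made-up arrays `table` and `values` rather than the study log:

```python
import numpy as np

table = np.arange(12).reshape(3, 4)

# Chained and comma-separated indexing reach the same element
print(table[2][3], table[2, 3])  # 11 11

# Fancy indexing with a list pulls out whole rows 0 and 2
print(table[[0, 2]].shape)  # (2, 4)

# One list per axis pairs them up: elements (0, 1) and (2, 3)
print(table[[0, 2], [1, 3]])  # [ 1 11]

# The result takes the shape of the index array
values = np.array([10, 20, 30, 40, 50])
picked = values[np.array([[3, 4], [0, 1]])]
print(picked.shape)  # (2, 2)
```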

In [173]:
study_minutes = np.array([
    study_minutes,
    np.zeros(7, np.uint16)
])

In [174]:
study_minutes.shape

(2, 7)

In [175]:
study_minutes[1][0] = 60

In [176]:
rand = np.random.RandomState(42)
fake_log = rand.randint(30, 180, size=7, dtype=np.uint16)
fake_log

array([132, 122, 128,  44, 136, 129, 101], dtype=uint16)

In [177]:
[fake_log[3], fake_log[4]]

[np.uint16(44), np.uint16(136)]

In [178]:
fake_log[[3, 4]]

array([ 44, 136], dtype=uint16)

In [179]:
index = np.array([
    [3, 4],
    [0, 1]
])
fake_log[index]

array([[ 44, 136],
       [132, 122]], dtype=uint16)

In [180]:
study_minutes = np.append(study_minutes, [fake_log], axis=0)
study_minutes

array([[150,  60,  80,  60,  30,  90,   0],
       [ 60,   0,   0,   0,   0,   0,   0],
       [132, 122, 128,  44, 136, 129, 101]], dtype=uint16)

In [181]:
study_minutes[1, 1] = 360
study_minutes

array([[150,  60,  80,  60,  30,  90,   0],
       [ 60, 360,   0,   0,   0,   0,   0],
       [132, 122, 128,  44, 136, 129, 101]], dtype=uint16)