# NumPy Library

In [1]:
# import numpy under its conventional alias np
import numpy as np

## NumPy General Info
- All elements of a NumPy array share a single data type
    - if a list mixes floats and ints, NumPy upcasts every element to float
    - use Pandas if you want a dataframe with multiple data types
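A minimal sketch of the upcasting rule described above. The variable name `mixed` is only illustrative.

```python
import numpy as np

# A list mixing ints and floats: every element is stored as a float
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)  # float64

# A list of ints only keeps an integer dtype (int64 or int32, by platform)
ints_only = np.array([1, 2, 3])
print(ints_only.dtype.kind)  # i  (the "kind" code for signed integers)
```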

## Creating NumPy Arrays
- `np.array(list)`
    - can supply the list as a variable or as a list of values in `[]`
    - can specify type of array `dtype='type'` for 'int', 'float32', etc.

#### NumPy Data Types
- set using `dtype='datatype'` or `dtype=np.datatype` in args
    - `bool_` True or False, stored as a byte
    - `int_` default integer type, normally either int64 or int32 depending on platform
    - `intc` identical to C int (normally int32 or int64)
    - `intp` int used for indexing
    - `int8` byte (-128 to 127)
    - `int16` integer (-32768 to 32767)
    - `int32` integer (-2147483648 to 2147483647)
    - `int64` integer (-9223372036854775808 to 9223372036854775807)
    - `uint8` unsigned int (0 to 255)
    - `uint16` unsigned int (0 to 65535)
    - `uint32` unsigned int (0 to 4294967295)
    - `uint64` unsigned int (0 to 18446744073709551615)
    - `float_` shorthand for float64 (alias removed in NumPy 2.0; use `float64` there)
    - `float16` half-precision float: sign bit, 5 bits exponent, 10 bits mantissa
    - `float32` single-precision float: sign bit, 8 bits exponent, 23 bits mantissa
    - `float64` double-precision float: sign bit, 11 bits exponent, 52 bits mantissa
    - `complex_` shorthand for complex128 (alias removed in NumPy 2.0; use `complex128` there)
    - `complex64` complex number made of two 32-bit floats
    - `complex128` complex number made of two 64-bit floats
- more advanced type specification is covered in the [numpy documentation](http://numpy.org)
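The ranges listed above can be checked directly rather than memorized. The sketch below uses `np.iinfo` and `np.finfo`, which are not mentioned elsewhere in these notes, to query a dtype's limits. It also shows that the string and `np.` forms of `dtype` are equivalent.

```python
import numpy as np

# The same dtype can be given as a string or as a NumPy type object
a = np.array([1, 2, 3], dtype='int16')
b = np.array([1, 2, 3], dtype=np.int16)
print(a.dtype == b.dtype)  # True

# np.iinfo reports the representable range of an integer dtype
info8 = np.iinfo(np.int8)
print(info8.min, info8.max)  # -128 127

info_u64 = np.iinfo(np.uint64)
print(info_u64.max)  # 18446744073709551615

# np.finfo plays the same role for float dtypes
float_bits = np.finfo(np.float32).bits
print(float_bits)  # 32
```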

#### Create Arrays from Lists

In [2]:
# create an array from a list using np.array()
np.array([1, 2, 3, 4, 5, 6], dtype='float32')

array([1., 2., 3., 4., 5., 6.], dtype=float32)

In [3]:
# create an array from a list stored in a variable using np.asarray()
mylist = [2, 4, 6, 8, 10]
np.asarray(mylist, dtype='int')

array([ 2,  4,  6,  8, 10])

#### Explicit Multidimensional Array

In [4]:
# each start value i in [2, 4, 6] becomes one row holding range(i, i + 3)
# row 1 is range(2, 5), row 2 is range(4, 7), row 3 is range(6, 9)
np.array([range(i, i + 3) for i in [2, 4, 6]])

array([[2, 3, 4],
       [4, 5, 6],
       [6, 7, 8]])

#### Creating Arrays from Scratch
- This section also covers generating random arrays
- A *seed* can be set when using the `np.random` package
    - `np.random.seed(num)` where "num" is an integer you supply
    - after seeding, the same sequence of `np.random` calls with the same parameters produces the same random numbers
    - this helps make results reproducible
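A short sketch of the seeding behaviour described above. The seed value 42 is arbitrary; any integer works the same way.

```python
import numpy as np

# Seed, then draw five random ints
np.random.seed(42)
first = np.random.randint(0, 10, 5)

# Re-seed with the same value and repeat the identical call
np.random.seed(42)
second = np.random.randint(0, 10, 5)

# Both draws contain the same numbers
print(np.array_equal(first, second))  # True
```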

In [5]:
# length-10 int array filled with zeros using np.zeros(shape, dtype)
# pass a (rows, columns) tuple instead of 10 for a 2d array
np.zeros(10, dtype=int)

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

In [6]:
# 3x5 array of floats filled with ones using np.ones
np.ones((3, 5), dtype=float)

array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])

In [7]:
# create an array where you specify the value to fill using np.full
np.full((4, 6), 3.14, dtype=float)

array([[3.14, 3.14, 3.14, 3.14, 3.14, 3.14],
       [3.14, 3.14, 3.14, 3.14, 3.14, 3.14],
       [3.14, 3.14, 3.14, 3.14, 3.14, 3.14],
       [3.14, 3.14, 3.14, 3.14, 3.14, 3.14]])

In [8]:
# create an array specifying a start, stop, and step using np.arange()
# similar to the range() function
# args are (start_val, stop_not_included, step)
np.arange(0, 20, 2)

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])

In [9]:
# create an array from start to stop with evenly spaced steps between using np.linspace()
# args are (first_val, last_val, num_of_vals)
np.linspace(0, 1, 5)

array([0.  , 0.25, 0.5 , 0.75, 1.  ])

In [10]:
# create array of uniformly distributed random values in [0, 1) using np.random.random()
# can set the "seed" using np.random.seed(value)
np.random.random((4, 3))

array([[0.37312613, 0.70133611, 0.27573336],
       [0.85038517, 0.68947881, 0.30302937],
       [0.02321078, 0.33738511, 0.36386887],
       [0.81801482, 0.8707027 , 0.20365378]])

In [11]:
# create array of normally distributed random values with mean of 0 and stdev of 1
# can set the "seed" using np.random.seed(value)
# args are (mu, sigma, shape_or_number_of_vals)
# mu = mean, sigma = stdev
np.random.normal(0, 1, (3, 3))

array([[-0.442534  , -0.10722454, -0.0318288 ],
       [ 0.61570818,  0.42899879,  0.26139669],
       [-0.64980345, -0.06753001, -0.72745616]])

In [12]:
# create array of random ints using np.random.randint()
# can set the "seed" using np.random.seed(value)
# args are (start, stop_not_included, shape_or_num_of_vals)
np.random.randint(0, 10, (5, 5))

array([[8, 8, 9, 1, 6],
       [4, 3, 8, 1, 6],
       [5, 4, 6, 1, 6],
       [0, 1, 0, 3, 5],
       [4, 0, 5, 2, 2]])

In [13]:
# create identity matrix (1's on diagonal, all else 0's) using np.eye()
# square by default; np.eye(N, M) gives N rows and M columns
# dtype is float by default
np.eye(4)

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])
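As a small supplement to the `np.eye()` cell, this sketch shows the optional second argument (number of columns) and the `dtype` argument. The variable names are illustrative.

```python
import numpy as np

# A second positional argument sets the number of columns,
# so the result need not be square
wide = np.eye(2, 3)
print(wide.shape)  # (2, 3)

# dtype can be overridden when integer entries are wanted
int_eye = np.eye(3, dtype=int)
print(int_eye.dtype.kind)  # i
```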

## Array Attributes
- This attributes section will use the following x1, x2, and x3 arrays as examples:
    - these are 1d, 2d, and 3d arrays respectively
- Attributes
    - access using `object.attribute`
    - `ndim` number of dimensions
    - `shape` size of each dimension
    - `size` total size (number of items)
    - `dtype` data type of array
    - `itemsize` size in bytes of each array element
    - `nbytes` total size in bytes of the array
        - `nbytes` should equal `itemsize` times `size`
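The attributes listed above can be sketched on a small array. The dtype is fixed to `int32` here so the byte counts do not depend on the platform's default integer size. The array `arr` is only an illustration.

```python
import numpy as np

arr = np.zeros((3, 4), dtype='int32')

print(arr.ndim)      # 2
print(arr.shape)     # (3, 4)
print(arr.size)      # 12
print(arr.dtype)     # int32
print(arr.itemsize)  # 4
print(arr.nbytes)    # 48

# nbytes equals itemsize times size
print(arr.nbytes == arr.itemsize * arr.size)  # True
```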

In [14]:
# seed for reproducibility with np.random
np.random.seed(0)

x1 = np.random.randint(10, size=6) # 1d array len=6
x2 = np.random.randint(10, size=(3, 4)) # 2d array 3x4
x3 = np.random.randint(10, size=(3, 4, 5)) # 3d array 3x4x5

In [18]:
# print number of dimensions
print("x1 ndim: ", x1.ndim)
print("x2 ndim: ", x2.ndim)
print("x3 ndim: ", x3.ndim)

x1 ndim:  1
x2 ndim:  2
x3 ndim:  3


In [19]:
# print total size in bytes using nbytes
# the values below reflect a 4-byte default integer; with int64 they double
print("x1 nbytes: ", x1.nbytes)
print("x2 nbytes: ", x2.nbytes)
print("x3 nbytes: ", x3.nbytes)

x1 nbytes:  24
x2 nbytes:  48
x3 nbytes:  240


## Accessing Array Elements
#### Access a single value (uses 0 indexing)
- `array[index]` for 1d
- `array[row_index, column_index]` for 2d
- `array[box_index, row_index, column_index]` for 3d

In [20]:
# show x3 values
x3

array([[[8, 1, 5, 9, 8],
        [9, 4, 3, 0, 3],
        [5, 0, 2, 3, 8],
        [1, 3, 3, 3, 7]],

       [[0, 1, 9, 9, 0],
        [4, 7, 3, 2, 7],
        [2, 0, 0, 4, 5],
        [5, 6, 8, 4, 1]],

       [[4, 9, 8, 1, 1],
        [7, 9, 9, 3, 6],
        [7, 2, 0, 3, 5],
        [9, 4, 4, 6, 4]]])

In [21]:
# select the middle box (index of 1), then first row first column
x3[1, 0, 0]

0

#### Array Slicing
- slicing returns a *view* of the original array, not a copy
    - **modifying a slice assigned to a variable modifies the original array too!**
    - to create a copy that can be changed without modifying the original use `copy()`
        - `subarray_copy = array[rows, cols].copy()`
- 1d arrays
    - `array[start:stop:step]`
    - `array[:5]` first 5 elements (index 0 through 4)
    - `array[5:]` elements starting with index 5 until the end
    - `array[4:7]` elements from index 4 through index 6
    - `array[::2]` every other element
    - `array[1::2]` every other element starting at index 1
    - `array[::-1]` all elements reversed
    - `array[5::-2]` every other element in reverse starting at index 5 (indices 5, 3, 1)
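Each 1d slice form listed above can be tried on a simple array whose values equal their indices, which makes the results easy to read. The array `arr` is only an illustration.

```python
import numpy as np

arr = np.arange(10)  # values 0 through 9, so value == index

print(arr[:5])     # [0 1 2 3 4]
print(arr[5:])     # [5 6 7 8 9]
print(arr[4:7])    # [4 5 6]
print(arr[::2])    # [0 2 4 6 8]
print(arr[1::2])   # [1 3 5 7 9]
print(arr[::-1])   # [9 8 7 6 5 4 3 2 1 0]
print(arr[5::-2])  # [5 3 1]
```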

In [23]:
x1

array([5, 0, 3, 3, 7, 9])

In [24]:
x1[1:4]

array([0, 3, 3])

In [25]:
x1[::-1]

array([9, 7, 3, 3, 0, 5])

- multidimensional arrays
    - creating subarrays
        - `array[:rows, :cols]` accesses the first `rows` rows and the first `cols` columns
            - `array[:2, :3]` two rows, three columns
        - `array[:rows, ::step]` accesses the first `rows` rows and every `step`-th column
            - `array[:3, ::2]` three rows, every other column
        - `array[::-1, ::-1]` reverses the rows and reverses the columns
    - accessing rows and columns
        - `array[:, col]` accesses all rows in column at specified index
        - `array[row, :]` accesses entire row at specified index
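The multidimensional slice forms above can be sketched on a small 3x4 array. The array `grid` and the result names are made up for illustration.

```python
import numpy as np

grid = np.array([[ 0,  1,  2,  3],
                 [ 4,  5,  6,  7],
                 [ 8,  9, 10, 11]])

top_left = grid[:2, :3]      # first two rows, first three columns
every_other = grid[:3, ::2]  # first three rows, every other column
flipped = grid[::-1, ::-1]   # rows and columns both reversed
col1 = grid[:, 1]            # all rows of the column at index 1
row0 = grid[0, :]            # the entire row at index 0

print(top_left)
print(every_other)
print(flipped)
print(col1)  # [1 5 9]
print(row0)  # [0 1 2 3]
```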

In [26]:
# view x2
x2

array([[3, 5, 2, 4],
       [7, 6, 8, 8],
       [1, 6, 7, 7]])

In [33]:
# assign 3rd row to a subarray variable
# modifying this subarray will change the original
x2sub = x2[2, :]
print(x2sub)

[1 6 7 7]


In [35]:
# modifying the subarray modifies the original array
x2sub[0] = 99
print(x2)

[[ 3  5  2  4]
 [ 7  6  8  8]
 [99  6  7  7]]


In [37]:
# copies can be modified without altering the original
x2subcopy = x2[2, :].copy()
print(x2subcopy)
x2subcopy[0] = 1
print(x2subcopy)
print()
print(x2)

[99  6  7  7]
[1 6 7 7]

[[ 3  5  2  4]
 [ 7  6  8  8]
 [99  6  7  7]]
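One way to check whether a subarray is a view or a copy, without modifying anything, is `np.shares_memory`. That function is not used elsewhere in these notes, and the arrays here are illustrative stand-ins for `x2`.

```python
import numpy as np

grid = np.array([[3, 5, 2, 4],
                 [7, 6, 8, 8],
                 [1, 6, 7, 7]])

row_view = grid[2, :]         # a slice: shares memory with grid
row_copy = grid[2, :].copy()  # an independent copy

is_view = np.shares_memory(grid, row_view)
is_copy_shared = np.shares_memory(grid, row_copy)
print(is_view)         # True
print(is_copy_shared)  # False
```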
