# Numpy Tutorial
### What is Numpy?
Numpy is a Python library for managing big data. Big data is just large sets of data. Python is a scripting language designed to be easily written. Because of this, its built-in data storage is very inefficient. When storing and manipulating large quantities of data, this can become a problem: execution takes a long time and requires far more memory than the raw data itself would need. Numpy is a solution which stores data in optimized arrays, while still providing a simple interface for interaction and manipulation.
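
As a rough illustration of the storage difference, the sketch below compares the memory used by a plain Python list of integers with a numpy array holding the same values. It assumes the standard CPython interpreter, where sys.getsizeof is available. The list figure is only an approximation and the exact numbers vary by platform.

```python
import sys
import numpy as np

n = 100_000

# A plain Python list stores one pointer per element, and every
# element is a separate Python int object with its own overhead.
py_list = list(range(n))
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

# A numpy array stores the raw 8-byte values back to back in one buffer.
arr = np.arange(n, dtype=np.int64)

print("list :", list_bytes, "bytes (approximate)")
print("array:", arr.nbytes, "bytes")
```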

### What is an array?
An array is a list of data. Arrays have 'dimensions', which are similar to dimensions in math. A 1-dimensional (1D) array has numbers in a line. This looks like [0,1,2,3,4]. A 2-dimensional (2D) array has numbers in a plane, arranged in rows and columns. This looks like: [[ 0,  1,  2,  3,  4], [ 5,  6,  7,  8,  9], [10, 11, 12, 13, 14]]. The beauty of numpy is n-dimensional arrays. While they become hard to visualize, numpy allows 3D, 4D, or much higher-dimensional arrays (up to a built-in limit of a few dozen dimensions). This is useful for complex algorithms.
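
To make the idea of dimensions concrete, here is a small sketch that builds a 1D, a 2D, and a 3D array and asks each one how many dimensions it has through its ndim attribute. The variable names are only illustrative.

```python
import numpy as np

line = np.array([0, 1, 2, 3, 4])              # 1D: numbers in a line
plane = np.array([[0, 1, 2],
                  [3, 4, 5]])                 # 2D: rows and columns
cube = np.array([[[0, 1], [2, 3]],
                 [[4, 5], [6, 7]]])           # 3D: a stack of 2D arrays

for name, arr in [("line", line), ("plane", plane), ("cube", cube)]:
    print(name, "has", arr.ndim, "dimension(s)")
```

Each extra level of nested square brackets adds one dimension.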

### Array shape
Array shape is just a way of representing the size of the array in each of its dimensions. The example 2D array above has a shape of (3,5), because it is a 3 row by 5 column array. A 3D array's shape would have 3 numbers, and so on.

In [None]:
# Experiment with changing the array and seeing how it
# affects the shape. Can you make it (5, 3)? Can you make
# it only 1 number, or can you make it 3 numbers?

import numpy as np
a = np.array([[ 0,  1,  2,  3,  4],
              [ 5,  6,  7,  8,  9],
              [10, 11, 12, 13, 14]])

print(a.shape)
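
One way to read a shape is that multiplying its numbers together gives the total number of elements, which numpy exposes as the size attribute. The sketch below checks this for a 1D, 2D, and 3D array. np.arange and np.zeros are used here only as a quick way to get arrays of a given shape; both are covered in the Generating arrays section.

```python
import numpy as np

flat = np.arange(5)           # 1D: five values in a line
grid = np.zeros((3, 5))       # 2D: 3 rows by 5 columns
block = np.zeros((2, 3, 4))   # 3D: 2 layers, each 3 rows by 4 columns

for arr in (flat, grid, block):
    # size is the total element count, i.e. the product of the shape
    print(arr.shape, "->", arr.size, "elements")
```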

### Data in Arrays
Numpy arrays don't have to be filled with integers. They can hold strings, floating-point numbers (doubles), or many other data types. Each array keeps track of the data type it is storing, and all elements of one array share that type.

In [None]:
# Note that I am not importing numpy again.
# Jupyter notebook has cached the import from
# the first cell. If this is the first cell you
# run, it will error, and you will have to run
# the first cell first. It is generally good practice
# to put all imports in a cell at the top of the document
# that is run when the document is opened, no matter which
# cell is going to be run afterwards. For all future tutorials,
# that will be the case.

# Try storing different types of data. Python does interesting
# things with data. A string will show up as <U5, which is
# a type of unicode. Can you get <U4 or <U1? What about float64?
b = np.array([1])

print(b.shape)
# dtype stands for data type
print(b.dtype)
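
Because an array has a single data type, numpy has to pick one type when the values it is given differ. The sketch below shows a few cases, including a mix of an integer and a float. The exact integer type printed depends on the platform, so treat the printed names as examples rather than guarantees.

```python
import numpy as np

ints = np.array([1, 2, 3])
floats = np.array([1.0, 2.5])
mixed = np.array([1, 2.5])       # the integer is converted to a float
words = np.array(["a", "bc"])

print(ints.dtype)     # an integer type; the width depends on the platform
print(floats.dtype)   # float64
print(mixed.dtype)    # float64
print(words.dtype)    # a unicode string type
print(mixed)
```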

If you want to define a specific data type (for example, if you are only storing small numbers and want to store them in int16 instead of the default, which is int64 on most platforms), you can specify that when the array is created.

You can see a list of types at https://numpy.org/doc/stable/user/basics.types.html

In [None]:
# np.short is the C 'short' type, which is int16 on common platforms
c = np.array([1], dtype=np.short)
print(c.dtype)
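
The trade-off of a smaller data type can be sketched as follows: int16 uses a quarter of the memory of int64, but it can only hold a limited range of values. The array length of 1000 is an arbitrary choice for the demonstration.

```python
import numpy as np

small = np.zeros(1000, dtype=np.int16)
large = np.zeros(1000, dtype=np.int64)

print(small.itemsize, "bytes per element,", small.nbytes, "bytes total")
print(large.itemsize, "bytes per element,", large.nbytes, "bytes total")

# np.iinfo reports the range of values an integer type can hold.
limits = np.iinfo(np.int16)
print(limits.min, limits.max)   # -32768 32767
```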

### Generating arrays
Arrays can be generated in multiple ways. One is manually, from numbers typed into the code, which is how all the arrays in this file have been defined so far. When creating arrays manually, the data must be passed to np.array() as a *single* argument. That is why every dimension needs its own set of square brackets to enclose it.
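
A short sketch of the single-argument rule: wrapping the values in one list works, while passing them as separate arguments fails. The exact error message differs between numpy versions, so the example only relies on the error being a TypeError.

```python
import numpy as np

# Correct: the values are wrapped in one list, so np.array
# receives a single data argument.
ok = np.array([1, 2, 3])
print(ok)

# Incorrect: three separate arguments. numpy does not treat the
# extra positional arguments as data, so the call fails.
try:
    np.array(1, 2, 3)
except TypeError as err:
    failed = True
    print("TypeError:", err)
else:
    failed = False
```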

Arrays can be automatically generated with various commands. There are demos of select routines below. You can see a full list at https://numpy.org/doc/stable/reference/routines.array-creation.html

In [None]:
# np.arange(start, stop, step) generates evenly
# spaced values within the half-open interval
# [start, stop), using a step size of step.
print(np.arange(2,10,2))

In [None]:
# start defaults to 0. step defaults to 1.
# stop has no default and must be defined.
print(np.arange(5))
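
A few more np.arange calls, sketched to show that the stop value is excluded and that the step can be negative or fractional. The variable names are only illustrative.

```python
import numpy as np

# The stop value itself is never included.
evens = np.arange(2, 10, 2)
print(evens)        # [2 4 6 8]

# A negative step counts down; start must then be larger than stop.
countdown = np.arange(10, 0, -2)
print(countdown)    # [10  8  6  4  2]

# The step can also be a float.
quarters = np.arange(0, 1, 0.25)
print(quarters)
```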

In [None]:
# np.zeros(shape) returns an array filled with 0.

# For more than one dimension, shape is a single argument
# (a tuple), so it must be wrapped in its own parentheses.

print(np.zeros((3,3)))

# The output is filled with 0. because it defaults to
# floats. You can specify integers if desired.

print()
print(np.zeros((3,3), dtype=int))

# np.ones does the same thing but fills with 1.
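
For completeness, here is a sketch of np.ones next to np.zeros-style usage. It also uses np.full, another routine from the array-creation list linked above, which fills an array with any value you choose. The shapes and the fill value 7 are arbitrary.

```python
import numpy as np

ones_float = np.ones((2, 3))            # floats by default, like np.zeros
ones_int = np.ones((2, 3), dtype=int)   # integers on request
sevens = np.full((2, 2), 7)             # any other fill value

print(ones_float)
print(ones_int)
print(sevens)
```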

### Manipulating arrays
Numpy arrays are accessed in the same way that regular Python lists are: with [index]. For numpy, multiple dimensions are listed within a single pair of brackets, separated by commas, not as new brackets. For example, [row,col] will access a specific value in a 2D array. To get an entire row, [row] works. That is a simplified syntax, though. In reality, numpy is reading [row,:], which means that row and all columns (':' represents all). To get a column, the access is [:,col]. The entire syntax is [row_start:row_end,col_start:col_end]. This can expand with more dimensions and more commas.

In [None]:
# The reshape command will transform an array of any
# shape into any other shape, as long as the total
# number of elements stays the same.
d = np.arange(12).reshape(4, 3)
print(d,end="\n\n")

# accessing rows
print(d[0,:],end="\n\n")

# accessing individual items
print(d[0,1],end="\n\n")

# accessing columns
print(d[:,1],end="\n\n")

# mess around with your own commands to access 
# different segments of the array.
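
The full [row_start:row_end,col_start:col_end] syntax can be sketched as below, using the same 4 by 3 array. The last part assumes you also want to change values: assigning to a slice overwrites that region of the array. The copy is only there so that the original array stays intact for comparison.

```python
import numpy as np

d = np.arange(12).reshape(4, 3)

# [row] is shorthand for [row, :]
same = np.array_equal(d[0], d[0, :])
print(same)

# Rows 1 and 2, columns 0 and 1. As with np.arange, the end
# of each range is excluded.
block = d[1:3, 0:2]
print(block)

# Assigning to a slice overwrites that region of the array.
e = d.copy()          # work on a copy so d stays unchanged
e[1:3, 0:2] = 0
print(e)
```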


## Numpy exercise

Generate the following array in as few commands as possible (without manually initializing the entire array)

    [[0,0,0,0,0],
     [0,1,1,1,0],
     [0,1,5,1,0],
     [0,1,1,1,0],
     [0,0,0,0,0]]

<details>
  <summary>Hint 1</summary>
  use np.zeros and np.ones
</details>

<details>
  <summary>Hint 2</summary>
  you can combine different sized arrays
</details>

<details>
  <summary>Hint 3</summary>
  remember to set your data type correctly
</details>

In [None]:
# Write your code here