# Lab 4: Intro to Numpy
The Numpy module is fundamental to almost all analysis tools that use Python. Its power is that it lets you do math on whole arrays at once, and it does this math in compiled C code. This makes the operations very fast, which is essential when dealing with large amounts of data. For more information see:
* http://www.numpy.org

* https://docs.scipy.org/doc/numpy-dev/user/quickstart.html

## Python indices are "zero-based"

* The center of the origin pixel is index 0
* e.g. for a 2D array the origin pixel (lower-left) is [0, 0]

### Comparisons to other languages/applications:

* 0-based:  Python, C, IDL
* 1-based:  Fortran, IRAF, FITS WCS, SExtractor, ds9

## Numpy arrays are stored in "row-major" order

* for a 2D array, if x is the column index and y is the row index, then
the array is indexed as **[y, x]**
  * e.g. **data[y, x]**
  * *x (column) is the fast array index and y (row) is the slow array index*
* for a 3D array, index as e.g. **data[z, y, x]**
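
The **[y, x]** convention can be checked on a small array. This is a minimal sketch; the name `data` and its values are made up for illustration.

```python
import numpy as np

# A small 2D "image" with 2 rows (y) and 3 columns (x); values are illustrative
data = np.array([[10, 11, 12],
                 [20, 21, 22]])

y, x = 1, 2                        # row 1, column 2
print(data[y, x])                  # 22
print(data.shape)                  # (2, 3): (number of rows, number of columns)

# For a default (C-ordered) array, ravel() lists the elements in memory order:
# x varies fastest, y slowest
print(data.ravel())                # [10 11 12 20 21 22]
print(data.flags['C_CONTIGUOUS'])  # True
```

Note that the shape is reported in the same order as the indices: rows first, then columns.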

## Numpy multidimensional array (ndarray):
* an array of homogeneous elements (usually numbers), all of the same type
* a memory-efficient container that provides fast numerical operations
* designed for scientific computation (array-oriented computing)

First you need to import the Numpy module:

In [1]:
import numpy as np    # standard convention

## How to create a Numpy Array
The simplest way is to create a Python list and convert it to a numpy array with `np.array()`.

In [2]:
# define a 1D array of 4 elements
a = np.array([0, 1, 2, 3]) #Convert a list to a numpy array
print(a)
print(type(a))

[0 1 2 3]
<class 'numpy.ndarray'>


In [3]:
# define a 2D (3x3) array
b = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8]])
print(b)

[[0 1 2]
 [3 4 5]
 [6 7 8]]
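
Because an ndarray holds elements of a single type, `np.array()` picks one type that can represent everything in the list. The short sketch below (variable names are illustrative) shows what that looks like in practice.

```python
import numpy as np

ints = np.array([1, 2, 3])          # all integers -> an integer array
mixed = np.array([1, 2, 3.5])       # one float -> every element becomes a float
words = np.array(['a', 'b', 'c'])   # strings -> a string (unicode) array

print(ints.dtype.kind)    # 'i' (integer)
print(mixed)              # [1.  2.  3.5]
print(mixed.dtype)        # float64
print(words.dtype.kind)   # 'U' (unicode string)
```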


## Arange
Like `range()`, `np.arange()` produces a sequence of numbers. Unlike `range()`, it returns a numpy array rather than a range object, and it accepts non-integer steps.

In [4]:
c = np.arange(10)
d = np.arange(2, 5, 0.5)  # start, stop (exclusive), step
print(c)
print(d)

[0 1 2 3 4 5 6 7 8 9]
[2.  2.5 3.  3.5 4.  4.5]
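
A related function, not otherwise covered in this lab, is `np.linspace()`. It can be more convenient when you know how many points you want rather than the step size. The sketch below builds the same sequence both ways.

```python
import numpy as np

# arange: you choose the step; the stop value is excluded
by_step = np.arange(2, 5, 0.5)

# linspace: you choose the number of points; the stop value is included
by_count = np.linspace(2, 4.5, 6)

print(by_step)    # [2.  2.5 3.  3.5 4.  4.5]
print(by_count)   # [2.  2.5 3.  3.5 4.  4.5]
```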


## Zeros
`np.zeros()` lets you create a numpy array of any size or dimension with every element set to zero.

In [5]:
e = np.zeros(3)
print(e)

[0. 0. 0.]


In [6]:
f = np.zeros((3, 3))
print(f)

[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]
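
The zeros above are floats (note the trailing dots). As a small sketch, the `dtype` argument changes the element type, and `np.ones()` works the same way but fills the array with ones.

```python
import numpy as np

z_float = np.zeros((2, 3))             # default dtype is float64
z_int = np.zeros((2, 3), dtype=int)    # ask for integers instead
ones = np.ones(4)                      # same idea, filled with 1.0

print(z_float.dtype)   # float64
print(z_int)
print(ones)            # [1. 1. 1. 1.]
```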


## Array attributes
Numpy arrays have a number of attributes that you can access to determine what kind of array it is.

In [7]:
a = np.array([[0, 1, 2], [3, 4, 5]])
print(a.ndim) #How many dimensions
print(a.size) #How many elements
print(a.shape) #How are those elements arranged
print(a.dtype) #What is the data type of the elements

2
6
(2, 3)
int64


## Basic Numpy Operations
The most powerful thing about Numpy arrays is that operations work *elementwise* and run in C, so you can use a whole numpy array in an equation.

In [8]:
print('a Matrix:')
a = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8]])
print(a)
print('Addition:')
print(a + 10)
print('Power:')
print(a ** 3)
print('Equations:')
print(a + (2 * a))
print('Elementwise multiplication, not matrix multiplication')
print(a * a )
print('Matrix multiplication:')
print(np.dot(a, a))
# a.dot(a)   # shorthand for above
print('To Save memory you can do operations in place:')
a *= 3
print(a)

a Matrix:
[[0 1 2]
 [3 4 5]
 [6 7 8]]
Addition:
[[10 11 12]
 [13 14 15]
 [16 17 18]]
Power:
[[  0   1   8]
 [ 27  64 125]
 [216 343 512]]
Equations:
[[ 0  3  6]
 [ 9 12 15]
 [18 21 24]]
Elementwise multiplication, not matrix multiplication
[[ 0  1  4]
 [ 9 16 25]
 [36 49 64]]
Matrix multiplication:
[[ 15  18  21]
 [ 42  54  66]
 [ 69  90 111]]
To Save memory you can do operations in place:
[[ 0  3  6]
 [ 9 12 15]
 [18 21 24]]
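
To make "elementwise" concrete, the sketch below computes the same result twice: once with an explicit Python loop and once with a single array expression. The names `values`, `looped`, and `vectorized` are illustrative.

```python
import numpy as np

values = np.arange(5)            # [0 1 2 3 4]

# Pure-Python version: visit the elements one at a time
looped = [3 * int(v) + 1 for v in values]

# Numpy version: one expression applied to every element at once
vectorized = 3 * values + 1

print(looped)                    # [1, 4, 7, 10, 13]
print(vectorized)                # [ 1  4  7 10 13]
```

Both give the same numbers, but the array expression is shorter and, for large arrays, usually much faster because the loop runs in C.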


## Statistical Functions
Numpy has many functions that can take a numpy array and return a statistical value. Things like sum, average, median, and standard deviation are built-in. You can often call these functions in two ways.

In [9]:
print(np.sum(a))
print(a.sum())
print(np.mean(a))
print(a.mean())
print(np.std(a))
print(a.std())

108
108
12.0
12.0
7.745966692414834
7.745966692414834
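
For 2D arrays, most of these functions also accept an `axis` argument that computes the statistic along rows or columns instead of over the whole array. A minimal sketch on an illustrative 2x3 array `m`:

```python
import numpy as np

# Illustrative 2x3 array (2 rows, 3 columns)
m = np.array([[0, 1, 2],
              [3, 4, 5]])

print(m.sum())          # 15: all elements
print(m.sum(axis=0))    # [3 5 7]: collapse the rows, one sum per column
print(m.sum(axis=1))    # [ 3 12]: collapse the columns, one sum per row
print(m.mean(axis=1))   # [1. 4.]
print(np.median(m))     # 2.5 (median is a function only; there is no m.median())
```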


## Mathematical Functions
Most of the functions in the `math` module work only on single numbers. The same functions exist in numpy, where they operate on every element of an array at once.

In [10]:
x = np.arange(5)
print(x)
print(np.exp(x))
print(np.sqrt(x))
print(np.sin(x))

[0 1 2 3 4]
[ 1.          2.71828183  7.3890561  20.08553692 54.59815003]
[0.         1.         1.41421356 1.73205081 2.        ]
[ 0.          0.84147098  0.90929743  0.14112001 -0.7568025 ]
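
The difference from the `math` module can be seen directly. In this sketch, `math.sqrt` is handed a whole array and fails, while `np.sqrt` handles every element.

```python
import math
import numpy as np

x = np.arange(5)

print(math.sqrt(4))      # works on a single number: 2.0

try:
    math.sqrt(x)         # math functions expect one number, not an array
except TypeError as err:
    print('math.sqrt failed:', err)

roots = np.sqrt(x)       # the numpy version handles every element
print(roots)
```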


## Indexing and Slicing
Numpy has very powerful indexing and slicing abilities. It has the same indexing abilities as a Python sequence, but it also lets you use boolean expressions to create new numpy arrays from old ones.

Here are some examples of standard indexing:

In [11]:
x = np.arange(10)**3
print(x)
print(x[3])
print(x[-1])
print(x[3:6])

[  0   1   8  27  64 125 216 343 512 729]
27
729
[ 27  64 125]
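
Slices also accept a step, as with Python lists. One difference from lists is worth knowing: a numpy slice is a view of the original data, not a copy. The sketch below illustrates both points; `part` and `safe` are illustrative names.

```python
import numpy as np

x = np.arange(10)**3

print(x[::2])     # every second element: [  0   8  64 216 512]
print(x[::-1])    # reversed

# A slice is a view: changing it changes the original array
part = x[3:6]
part[0] = -1
print(x[3])       # -1

# Use copy() when you need an independent array
safe = x[3:6].copy()
safe[1] = -2
print(x[4])       # 64 (unchanged)
```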


## Matrix Indexing
Remember that arrays are row-major, so a 2D array is indexed as `[row, column]`.

In [12]:
a = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8]])
print(a)
print(a[1,2])

[[0 1 2]
 [3 4 5]
 [6 7 8]]
5
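
Slicing works along each axis too, so whole rows, whole columns, or sub-blocks can be pulled out. A minimal sketch using an illustrative 3x4 array `m`:

```python
import numpy as np

m = np.array([[10, 11, 12, 13],
              [20, 21, 22, 23],
              [30, 31, 32, 33]])

print(m[1, :])        # whole row 1:      [20 21 22 23]
print(m[:, 2])        # whole column 2:   [12 22 32]
print(m[0:2, 1:3])    # rows 0-1, columns 1-2
```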


## Fancy Indexing
You can give numpy arrays indices in any order, and they don't have to be contiguous. You can also use boolean expressions to make mask arrays that let you create new numpy arrays from conditions placed on old ones.

In [13]:
idx = [5, 2, 1]
print(x[idx])
#A shorthand way
print(x[[5,2,1]])

[125   8   1]
[125   8   1]


In [14]:
maskidx = (x > 300)
print(maskidx)
print(x[maskidx])
#A shorthand way to do this
print(x[(x>300)])

[False False False False False False False  True  True  True]
[343 512 729]
[343 512 729]


In [15]:
maskidx = ((x > 50) & (x < 200)) #Logical AND
print(x[maskidx])
maskidx = ((x < 50) | (x > 200)) #Logical OR
print(x[maskidx])

[ 64 125]
[  0   1   8  27 216 343 512 729]
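
A few more things about masks are handy in practice. The sketch below, on an illustrative array `t`, shows counting matches, inverting a mask with `~`, and why the Python keywords `and`/`or` cannot replace `&` and `|`.

```python
import numpy as np

t = np.array([3, 8, 15, 22, 31])

inside = (t > 5) & (t < 25)
print(inside)          # [False  True  True  True False]
print(inside.sum())    # 3: each True counts as 1
print(t[~inside])      # [ 3 31]: ~ is logical NOT

# The Python keywords `and` / `or` do not work elementwise
try:
    (t > 5) and (t < 25)
except ValueError as err:
    print('ValueError:', err)
```

The parentheses around each comparison are required, because `&` and `|` bind more tightly than `>` and `<`.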


## Lab 4: Now it is your turn
Please answer the following questions, then print them off and turn them in. You don't need to print the whole notebook. Only print the pages starting from here.

Name: 

**Q1: Create a string array of letters in alphabetical order with shape (3,4). Print the array, what shape it is, what type it is, what size it is, and how many dimensions it has.**

**Q2: How would I index the element of matrix `a` that contains the number 7?**

In [None]:
a = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8]])

**Q3: Use numpy to find the y values for `x = [56, 62, 84, 16, 57, 73, 84, 27, 93, 42, 33, 17, 30, 72, 57, 53, 41, 13, 36, 79]` given a line with a slope of 3 and a y-intercept of -2.**

**Q4: What is the average of the x array? What is the median of x? What is the standard deviation of x?**

**Q5: What are the average and standard deviation of only those elements of x that are greater than 20 but less than 50?**