# Plotting I

Here we use matplotlib to create plots. To display the plots inside this notebook, we use the %matplotlib inline magic command, which is specific to IPython notebooks.

If you're using Python from a normal editor (such as Sublime or PyCharm), a simple plt.show() at the end of your script will open an external window showing your plot. This allows you to see your plot without having to save it first.

In [None]:
import numpy as np
from astropy.io import fits
import matplotlib.pyplot as plt 
%matplotlib inline

Matplotlib is built on two basic components: figures and axes. The figure is the window in which a plot is created. Each figure can contain one plot or multiple subplots/axes.
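
To make the figure/axes distinction concrete, here is a minimal sketch that creates both explicitly with plt.subplots. The variable names (fig, ax_left, ax_right) and the two plotted functions are illustrative choices, not part of the original notebook.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(0, 11, 1)

# one figure that contains two axes (subplots) side by side
fig, (ax_left, ax_right) = plt.subplots(nrows=1, ncols=2, figsize=(8, 3))

# each axes object has its own plot and its own title
ax_left.plot(x, x + 10)
ax_left.set_title('y = x + 10')

ax_right.plot(x, x**2)
ax_right.set_title('y = x^2')
```

Calling methods on an axes object (ax_left.plot) rather than on plt makes it explicit which subplot you are drawing into.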

If we want to create a very simple plot, there is no need to define the figure and the axes. Matplotlib takes care of it for us:    

In [None]:
#create an array with values between 0 and 10
x = np.arange(0, 11, 1)
#create an array with values between 10 and 20
y = np.arange(10, 21, 1)

#plot y(x)
plt.plot(x, y)

# Python masks

Masks can speed up your code considerably by replacing for loops. They are extremely useful if you're dealing with large arrays. Let's do the same thing twice: once with a for loop and once with a mask:

## Example I

In [None]:
# this will create a 1D array with 20 entries
# starting at 0 and ending at 19
a = np.arange(0, 20, 1)

# we want to set all values that are divisible by 3 (remainder 0) to -999.

# with a for loop
for i in range(len(a)):
    if a[i]%3 == 0:
        a[i] = -999.

print ('array a, with the for loop: ')
print (a) 


a = np.arange(0, 20, 1)
# we can speed this (ok, very simple) process up by using a mask
# this is our mask: it will select all entries for which modulo 3 is 0
mask = (a%3 == 0)
# we can now apply this to our array a
# and set the values to which this applies to -999.
a[mask] = -999.

print ('array a, with the mask: ')
print (a)
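
To get a feeling for the speed difference, the two approaches can be timed on a larger array. This is only a rough sketch: the helper names with_loop and with_mask, the array size n and the number of repetitions are arbitrary choices made here, and the measured times will depend on your machine.

```python
import timeit
import numpy as np

def with_loop(n):
    # same logic as the for loop above, for an array of length n
    a = np.arange(0, n, 1)
    for i in range(len(a)):
        if a[i] % 3 == 0:
            a[i] = -999
    return a

def with_mask(n):
    # same logic as the mask above, for an array of length n
    a = np.arange(0, n, 1)
    a[a % 3 == 0] = -999
    return a

n = 100000

# both versions should produce exactly the same array
print(np.array_equal(with_loop(n), with_mask(n)))

# run each version 3 times and report the total time in seconds
t_loop = timeit.timeit(lambda: with_loop(n), number=3)
t_mask = timeit.timeit(lambda: with_mask(n), number=3)
print('for loop: %.4f s' % t_loop)
print('mask:     %.4f s' % t_mask)
```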

## Example II

In this second example we want to create 1000 points with random x and y coordinates and then use a function to select a subset of these points.

To do so, we use numpy's random_sample function which draws a certain number of float values from the interval [0.0, 1.0) 

In [None]:
# create 1000 random x and y values 
x = np.random.random_sample(1000)
y = np.random.random_sample(1000)

# let's plot these points
plt.plot(x, y, marker='.', color='black', linestyle='', 
         rasterized=True)

In [None]:
def func(x):
    return -x**2 + 1
# let's plot the function first
x_range = np.arange(0, 1.1, 0.1)
plt.plot(x_range, func(x_range), linestyle='-',
        color='red')

To select the points below func, we will now introduce a mask called below_line:

In [None]:
below_line = (y < func(x))
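
Before plotting, it can help to look at the mask itself. The sketch below rebuilds x, y and the mask so that it runs on its own; it uses a seeded generator (np.random.default_rng) instead of np.random.random_sample purely to make the run reproducible, which is an assumption of this example and not what the cells above do.

```python
import numpy as np

def func(x):
    return -x**2 + 1

# seeded generator so that repeated runs give the same points
rng = np.random.default_rng(42)
x = rng.random(1000)
y = rng.random(1000)

below_line = (y < func(x))

# a mask is an array of True/False values with the same shape as x and y
print(below_line.dtype)
print(below_line.shape)
print(below_line[:5])

# count the selected points and compute the selected fraction
n_below = np.count_nonzero(below_line)
print('points below the line: %d' % n_below)
print('fraction below the line: %.3f' % below_line.mean())
```

Since the area under 1 - x^2 between 0 and 1 is 2/3, the fraction should come out somewhere near 0.67.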

Check if this gives the right result by plotting the points: 

In [None]:
plt.plot(x_range, func(x_range), linestyle='-',
        color='red')
plt.plot(x[below_line], y[below_line], marker='.', color='red',
         linestyle='', rasterized=True)

We can also select the points above the line:

In [None]:
above_line = (y >= func(x))
plt.plot(x_range, func(x_range), linestyle='-',
        color='red')
plt.plot(x[above_line], y[above_line], marker='.', color='black',
         linestyle='', rasterized=True)

Our masks can also contain more than one condition: 

In [None]:
mask1 = (y >= func(x)) & (x <= 0.5)
mask2 = (y < func(x)) & (x >= 0.5)

plt.plot(x_range, func(x_range), linestyle='-',
        color='red')
plt.plot(x[mask1], y[mask1], marker='.', color='red',
         linestyle='', rasterized=True)
plt.plot(x[mask2], y[mask2], marker='.', color='black',
         linestyle='', rasterized=True)
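
The conditions above are combined element by element with & (and). NumPy also provides | (or) and ~ (not). The short sketch below uses a small made-up array to show all three; note that each condition needs its own parentheses and that Python's and/or keywords do not work on arrays.

```python
import numpy as np

a = np.arange(0, 10, 1)

# both conditions have to be true
both = (a > 2) & (a < 7)
# at least one condition has to be true
either = (a < 2) | (a > 7)
# invert a mask
neither = ~either

print(a[both])     # [3 4 5 6]
print(a[either])   # [0 1 8 9]
print(a[neither])  # [2 3 4 5 6 7]

# the keyword 'and' cannot combine two arrays element by element
try:
    (a > 2) and (a < 7)
except ValueError as err:
    print('ValueError:', err)
```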

## Example III

Of course we can also apply masks to real data. Let's read in our fits file again:

In [None]:
path = 'SDSS_test_cat.fits'
cat = fits.open(path)[1].data
ID = cat.field('SDSS_ID')
z = cat.field('Z')

# select objects in redshift range
mask = (z > 0.01) & (z < 0.05)

# print the ID of all objects in this redshift range
print ('IDs of objects in range: ')
print (ID[mask])
# print the total number of objects in this redshift range
# (counts are integers, so we print them with %d)
print ('Number of objects in range: %d' % len(ID[mask]))

# we can also use the same mask to select the objects that are not
# in this redshift range
print ('IDs of objects NOT in range: ')
print (ID[~mask])
print ('Number of objects NOT in range: %d' % len(ID[~mask]))

# and let's make sure we are really selecting all objects
print ('all objects in file: %d' % len(ID))
print ('objects selected with mask and ~mask: %d' % (len(ID[mask]) + len(ID[~mask])))

Above we have defined the mask based on the 'z' array. We have then applied it to the 'ID' array. If you're working with masks it's important that __all__ of your arrays have the __same shape and the same length__. Otherwise you might be selecting the wrong objects. Also: rather than using multiple masks on the same array at the same time, create one big mask and apply it once. Keep it simple to make sure that everything is working correctly! 
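
Both pieces of advice can be sketched with a few made-up values. The arrays z and ID below are hypothetical stand-ins for the catalogue columns (they are not read from SDSS_test_cat.fits), and the second condition (odd IDs) is just an arbitrary example.

```python
import numpy as np

# hypothetical stand-ins for the catalogue columns
z = np.array([0.005, 0.02, 0.04, 0.08, 0.03])
ID = np.array([101, 102, 103, 104, 105])

# check that the arrays match before applying a mask from one to the other
assert z.shape == ID.shape

# build one combined mask and apply it once
in_range = (z > 0.01) & (z < 0.05)
odd_id = (ID % 2 == 1)
combined = in_range & odd_id
print(ID[combined])  # [103 105]

# a mask with the wrong length cannot be applied
short = ID[:3]
try:
    short[in_range]
except IndexError as err:
    print('IndexError:', err)
```

NumPy catches a length mismatch for you, but it cannot notice arrays that have the same length and a different ordering, so that part remains your responsibility.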

## Exercise

Use the array named 'Zoo' below to determine mean ages and weights of Zoo animals. 

* print the name, age and weight of all elephants

Then split the array into four vectors containing the species, the names, the age and the weight of the animals. Convert the age and weight vectors from string to float and calculate:

* the average age of all animals 
* the average weight of all animals
* the average weight of all elephants
* the average weight of all animals that are not elephants
* the average age of all animals that weigh less than 1000kg 
* the average age of all elephants that are less than 50 years old

In [None]:
Zoo = np.array([['Elephant', 'Lisa', '60.', '5560.'],
               ['Tiger', 'Sheldon', '22.', '250.'],
               ['Bear', 'Brutus', '18.', '198.'],
               ['Hippopotamus', 'Lucy', '55.', '1012.'],
               ['Elephant', 'Peanut', '3.', '100.'],
               ['Seaotter', 'Otto', '12.', '25.'],
               ['Elephant', 'Nadim', '32.', '6000.']])
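
As a hint (not a solution), here is how column slicing, string-to-float conversion and masks fit together on a different, made-up table. The array fruit and its columns are hypothetical and only serve to illustrate the steps; the same pattern should carry over to Zoo.

```python
import numpy as np

# hypothetical toy table, NOT the Zoo array: [kind, name, price]
fruit = np.array([['Apple', 'Gala', '1.5'],
                  ['Pear', 'Conference', '2.0'],
                  ['Apple', 'Fuji', '2.5']])

# split the 2D array into column vectors
kind = fruit[:, 0]
name = fruit[:, 1]
# the entries are strings, so convert them before doing any maths
price = fruit[:, 2].astype(float)

# a mask built from one column can be applied to the rows or to another column
is_apple = (kind == 'Apple')
print(fruit[is_apple])
print(np.mean(price[is_apple]))  # 2.0
```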