*this notebook was adapted from 
 [course notes by Dr. Kate Follette](https://github.com/kfollette/ASTR200-Spring2017)*

### ***Names: [Insert Your Names Here; Optional: indicate preferred pronouns]***

# Problem Set 6             
## PHYS 105A   Spring 2019

## Contents

1. Lists vs. Arrays 
2. Basics of plotting
3. Making plots look nicer
4. Scatter Plots

# 1.  Creating Lists vs. Arrays of numbers

Recall, we can create a list of integers using the `range` function and converting the output to a list:

`list(range(start,end,increment))`

In [None]:
a = list(range(10)) # increment by 1, starting at 0
a

In [None]:
aa = list(range(1,10,2)) # increment by 2, starting at 1
aa

In [None]:
# We can also ask for the minimum value of a list of integers
min(aa)

In [None]:
# and the maximal value
max(aa)

Recall that `Lists` are types of containers. Another type of container is an `Array`.

In [2]:
import numpy as np # we are importing the numpy package and will call it using a shortcut `np`

numpy has a function called `arange` that generates an array of numbers

In [None]:
b = np.arange(10) 
b

In [None]:
bb = np.arange(1,10,2) # start at 1, increment by 2
bb

`Arrays` also have indices.  Like `Lists`, the first index of an array is 0

In [None]:
bb[0] # the first element in the array

In [None]:
bb[-1] # the last element in the array

In [None]:
# Index slicing also works with arrays
bb[:]

In [None]:
# select the 2nd through the second-to-last element of the array
# (the end index -1 is excluded; use bb[1:] to include the last element)
bb[1:-1]

In [None]:
# like with lists, you can ask for the length of the array
len(bb)

In [None]:
# and you can ask for the maximal or minimal value of an array
max(bb)

In [None]:
min(bb)

So why use an `Array` instead of a `List` ? 

**Reason 1. The range function only increments by integer values.**

In [None]:
# Try to create a list that starts at 1, ends at 10 and increments by 0.1 
# (this raises a TypeError, because range only accepts integers)
list(range(1,10,0.1)) 

In [None]:
# Instead do this with an array
np.arange(1,10,0.1)
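
One caveat, offered as a hedged aside rather than as part of the original notes: a step such as 0.1 cannot be represented exactly in floating point, so NumPy's documentation suggests `np.linspace` when the step is not an integer. `np.linspace` is not used elsewhere in this notebook; the sketch below only contrasts the two calls.

```python
import numpy as np

# np.arange excludes the end value: the last element stays below 10
steps = np.arange(1, 10, 0.1)
print(steps[-1])

# np.linspace (an alternative not used in the original notes) takes the
# number of points instead of the step size and includes the end value
points = np.linspace(1, 10, 91)  # 91 points give a spacing of 0.1
print(points[0], points[-1])
```

Either form gives an array that can be used in the rest of this notebook; the choice only matters when the exact end point or number of elements is important.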

**Reason 2. With an `array` you can act on the entire set of numbers without requiring a `for loop`**

Increase each item in the `List` `a` and the `Array` `b` by a factor of 2.

In [None]:
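# For the List a, this is done using a for loop over the indices.
# (Looping with `for i in a` and using i as an index only works when
# every value happens to equal its own index.)

for i in range(len(a)):
    a[i] *= 2

print(a)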

 Note: We can use `List comprehension` to make this more compact.
 The cell below produces the same values as the above code cell, but stores them in a new list.


In [None]:
a = list(range(10)) # reset the list 

newlist = [i*2 for i in a]
print(newlist)

In [None]:
# For the array b, increasing each element by a factor of 2 is much simpler
b *= 2 # or equivalently b = b*2
print(b)
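
A related pitfall, sketched here with small made-up containers (`a_list` and `b_arr` are hypothetical names, not variables from the cells above): the `*` operator means different things for the two container types.

```python
import numpy as np

a_list = [0, 1, 2]
b_arr = np.array([0, 1, 2])

# For a list, * repeats the list instead of multiplying its elements
doubled_list = a_list * 2
print(doubled_list)

# For an array, * multiplies every element
doubled_arr = b_arr * 2
print(doubled_arr)
```

This is why the list version needs a loop or a list comprehension, while the array version does not.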

**Reason 3.  Arrays have more efficient index slicing capabilities with conditional statements**


`numpy arrays` have a special feature that enables efficient index slicing with conditionals without using for loops.  

`name_of_array[conditional statement]`

Where the indices are specified by the conditional statement.

Examine the example below, which computes a new array called `evens` containing all even numbers between 1 and 20, inclusive, given an original array `z` of all integers from 1 to 20, inclusive.  

In [None]:
z = np.arange(1,21) # first define a new array with all integers from 1 to 20 inclusively.
print(z) # print it to the screen

In [4]:
# now define a new array that is a subset of the original array z, but the desired indices are the elements 
# in z that satisfy the condition z%2 == 0.

evens = z[z%2 ==0] # this new array contains only the even numbers in z.
print(evens)

[ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20]
[ 2  4  6  8 10 12 14 16 18 20]


The example below prints all numbers between 5 and 15, exclusive. 
Note that we have used the bit-wise operator `&` here, which acts element by element on arrays, instead of the boolean operator `and`. Each condition must be wrapped in parentheses.

In [8]:
print(z[(z>5) & (z<15)])

[ 6  7  8  9 10 11 12 13 14]
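
To see what the conditional inside the brackets produces, here is a minimal sketch (the variable name `mask` is an assumption introduced for illustration). It also shows what happens when `and` is used in place of `&`.

```python
import numpy as np

z = np.arange(1, 21)

# The conditional evaluates to an array of True/False values,
# one per element of z
mask = (z > 5) & (z < 15)
print(mask)

# Indexing with that boolean array keeps only the True positions
print(z[mask])

# `and` tries to reduce each whole array to a single True/False,
# which numpy refuses to do
try:
    z[(z > 5) and (z < 15)]
except ValueError as err:
    print("ValueError:", err)
```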


**Reason 4. Arrays can also have additional dimensions (2D, 3D, etc.), but we'll get to that in another class.**

## Exercise 1.

 Practice with `List comprehension` :
 
a) Create a `List` called `q` using `list(range())` that starts at 1 and ends at 20 in increments of 3. Print `q` to the screen.

b) Create a new `List` called `qq` that is defined using `List comprehension` as `q**3`. Print `qq` to the screen.

c) List comprehension can also include `if statements`.

`newlist = []`

`for item in oldlist:
    if conditional:
        newlist.append(expression)`
        
can be rewritten as: 

`newlist = [ expression for item in oldlist if conditional ]`
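
As a small illustration with made-up data (the names `values`, `halved_loop` and `halved_multiples` are hypothetical), the loop form and the comprehension form give the same result:

```python
values = [3, 5, 6, 9, 10, 12]

# Loop form: keep the multiples of 3 and halve them
halved_loop = []
for v in values:
    if v % 3 == 0:
        halved_loop.append(v / 2)

# Comprehension form of the same operation
halved_multiples = [v / 2 for v in values if v % 3 == 0]

print(halved_loop)
print(halved_multiples)
```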

Create a new list called `qqq` that stores all values of `qq` (from part b) that are larger than 1000, each divided by 1000. Do this using `List comprehension`.

## Exercise 2

Practice with Arrays.


a) Create an `Array` called `qa` that starts at 1 and ends at 20 in increments of 3. Print `qa` to the screen.

b) Create a new `Array` called `qqa` that is defined as `qa**3`. Print `qqa` to the screen.

c) Create a new `numpy Array` called `qqqa` that contains all values of `qqa` that are larger than 1000, each divided by 1000. This is the same task as Exercise 1 c.

Avoid a for loop by defining the indices using a conditional statement.  

Print `qqqa` to the screen.


# 2. Plotting Basics

We are going to create plots using the python library called `matplotlib`

`matplotlib` contains a plotting module called `pyplot`.

The first line in the below code cell `imports` the pyplot module from `matplotlib` and says that we will refer to the `pyplot` as `plt` 

The second line tells the notebook that you want to display any plots `inline` (i.e. inside the notebook). This is done with the magic function `%matplotlib inline`.

In [None]:
# You must always include the below lines of code somewhere in the Jupyter notebook if you 
# want to make a plot.

import matplotlib.pyplot as plt
%matplotlib inline

This gives you access to all of `pyplot`'s functions, which can be called as `plt.functionname`.

In particular, pyplot has a function called `plot`, which you can call as: 
`plt.plot(arg1,arg2)`

In [None]:
# define a function y = x^2 
def y(x):
    return x**2

In [None]:
# Let's plot the function 2 ways:

# Method 1, using Lists and List Comprehension

x = list(range(10))
plt.plot(x, [y(i) for i in x])

In [None]:
# Method 2, using Arrays

xx = np.arange(0,10,0.1)
plt.plot(xx, y(xx)) # no for loops necessary

# 3. Making Plots Nicer


Here are some especially useful pyplot functions, called with `plt.functionname(input(s))`:

`xlabel` and `ylabel` set the labels for the x and y axes, respectively and have a required string input.

`title` sets the title of the plot and also requires a string input.

These are called as separate functions, either before or after the call to `plt.plot`.

In [None]:
# Method 2, using Arrays


xx = np.arange(0,10,0.1) # defining the x axis

# plotting the functions
plt.plot(xx, y(xx)) 

# let's add axis labels
plt.xlabel("the independent variable (no units)")
plt.ylabel("the dependent variable (no units)")

# let's add a title
plt.title("The Quadratic Function")


Let's add another function to this plot.

In [None]:
# These are the same plotting instructions as above.
# But now we have added another line to the plot.

xx = np.arange(0,10,0.1) # we can use this same array for BOTH functions

# plotting the functions 
plt.plot(xx, y(xx))  # original plot
plt.plot(xx, xx**3) # Here is where we have the new plot.
        # We don't *have* to define a function to plot here:  xx is an array 

# axis labels
plt.xlabel("the independent variable (no units)")
plt.ylabel("the dependent variable (no units)")

# plot title
plt.title("The Quadratic and Cubic Function") # the title has been modified


When you include another plot, pyplot will automatically change the color of the line for you.
But you can change the color, linewidth and linestyle.

Line properties are controlled with optional keywords to the plot function, namely `color`, `linestyle` and `linewidth`. The first two take string arguments, and the third (`linewidth`) takes a numerical argument giving the width of the line in points.

`linestyle` options: [ '-' | '--' | '-.' | ':' ]

`color` options: 'red','green','yellow','blue', 'orange', 'cyan', 'magenta','black'


In [None]:
# These are the same plotting instructions as above.
# But now we have modified the appearance of the lines.

xx = np.arange(1,10,0.1) # we can use this same array for BOTH functions

# plotting the functions 
plt.plot(xx, y(xx),linewidth=5, linestyle=':',color='red')  
plt.plot(xx, xx**3,linewidth=5, linestyle='-.',color='green')

# axis labels
plt.xlabel("the independent variable (no units)")
plt.ylabel("the dependent variable (no units)")

# plot title
plt.title("The Quadratic and Cubic Function") 


To add a legend to a plot, you use the pyplot function `legend`. 

You can assign labels to each line with the label keyword (string input), as below, and then call `legend` with no input. 

We can also change the range of the plot. `xlim` and `ylim` set the range of the x and y axes, respectively and they have two required inputs - a minimum and maximum for the range. By default, the axis range is set to encompass all of the values in x and y, but there are many cases where you might want to "zoom in" on certain regions of the plot.

In [None]:
# These are the same plotting instructions as above.
# But now we have added line labels, axis limits and a legend.

xx = np.arange(1,10,0.1) # we can use this same array for BOTH functions

# plotting the functions 
plt.plot(xx, y(xx),linewidth=5, linestyle=':',color='r',label='quadratic') # label added
plt.plot(xx, xx**3,linewidth=5, linestyle='-.',color='g',label='cubic') # label added

# axis labels
plt.xlabel("the independent variable (no units)")
plt.ylabel("the dependent variable (no units)")

# plot title
plt.title("The Quadratic and Cubic Function") 

# fixing the limits to zoom in on a region of the plot
plt.xlim(2,5)
plt.ylim(2,100)

# adding the legend
plt.legend()

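As you can see above, a legend is drawn with a box around it (bounding box) by default. Its location is also chosen by default: older versions of matplotlib always placed it at the upper right of the plot, even when it obscured the underlying lines, while versions 2.0 and later choose a location automatically (`loc='best'`).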

Generally speaking, bounding boxes are rather ugly, so you should nearly always (unless you really want to set the legend apart) use the optional Boolean (True or False) keyword `frameon` to turn this off. 

`legend` also takes the optional keyword `loc` to set the location of the legend. `loc` should be a string, such as `"upper center"`. More options for legend are found [here](https://matplotlib.org/users/legend_guide.html)

You can also save the plot to a file using the function `savefig` and setting the resolution.

In [None]:

xx = np.arange(1,10,0.1) # we can use this same array for BOTH functions

# plotting the functions 
plt.plot(xx, y(xx),linewidth=5, linestyle=':',color='r',label='quadratic') 
plt.plot(xx, xx**3,linewidth=5, linestyle='-.',color='g',label='cubic') 

# axis labels
plt.xlabel("the independent variable (no units)")
plt.ylabel("the dependent variable (no units)")

# plot title
plt.title("The Quadratic and Cubic Function") 

plt.xlim(2,5)
plt.ylim(2,100)

plt.legend(loc="upper center", frameon=False) #modified the legend 


plt.savefig('PSet6_QuadraticCubic.png', dpi=300) # save the figure to a file; 
# dpi indicates the resolution

## Exercise 3.

a) 

Recall when we discussed index slicing, we specified the start and end range. E.g.

`x[1:20]`, meaning `x[start at 1 : end before 20]` 


We can also specify skipping over some indices within that range using the following syntax:

`x[1:20:3]`, meaning `x[ start at 1 : end before 20 : take every 3rd item ]`
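
A quick way to check what a step does is to slice an array whose values equal their own indices. The array `idx` below is a hypothetical example, not one of the arrays used in the plots.

```python
import numpy as np

idx = np.arange(25)  # values 0..24, so each value equals its index

# start at index 1, stop before index 20
print(idx[1:20])

# same range, but step through it 3 indices at a time
every_third = idx[1:20:3]
print(every_third)

# omitting start and end uses the whole array
every_second = idx[::2]
print(every_second)
```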

In the below, a Quartic function is added to the plot using this functionality - blue points are plotted for this function, using only every 2nd index. 

Add another Quartic function that fills in the gap using red dots, as instructed in the comments (#) 

In [None]:
xx = np.arange(1,10,0.1) # we can use this same array for all functions

# plotting the functions 
plt.plot(xx, y(xx),linewidth=5, linestyle=':',color='green',label='quadratic') 
plt.plot(xx, xx**3,linewidth=5, linestyle='-.',color='red',label='cubic') 


# Quartic Function 
# 'bo' by itself indicates plotting the data using blue points 
# [::2] means, start at the beginning, go all the way to the end, but take only every 2nd index. 
plt.plot(xx[::2], xx[::2]**4, 'bo', label='quartic')

###############  Add here a new line to plot  ########## 
# copy the same line as above. 
# but modify the index so that it starts at 1 
# use 'ro' instead of `bo`, to plot the data using red points 





# axis labels
plt.xlabel("the independent variable (no units)")
plt.ylabel("the dependent variable (no units)")

# plot title
plt.title("The Quadratic, Cubic and Quartic Function") 

plt.xlim(2,5)
plt.ylim(0,100)

plt.legend(loc="upper left", frameon=False) #modified the legend 


plt.savefig('PSet6_QuadraticCubicQuartic.png', dpi=300) # save the figure to a file; 
# dpi indicates the resolution

b)
   - Define an array called `n` with a range from 0 to 6*np.pi in increments of 0.1
   
Do each step below, re-generating the plot after *each* line to make sure the plot is doing what you want it to.
Note that numpy has its own functions (like `np.sin`, `np.exp`, etc.) that act on whole arrays element-wise.  

   - Plot `n` vs `np.sin(n)` :   linewidth=3 and color blue
   - Plot `n` vs `-np.sin(n)` : linewidth=3,  linestyle =':' and color blue 
   - Plot `n` vs `cosine` of `n` : linewidth=3 and color red
   - Plot `n` vs `cosine` of `n` with an amplitude of `2` : linewidth=3, color red, linestyle=':'
   - add a horizontal line at y = 0 by plotting `n` vs. `0*n` : color green, linestyle='--'
   - change the xlimits to go from 0 to the maximal value of `n`
   - Add axes labels : `radians`  and `amplitude`
   - Add a title
   - Add a legend that is outside the plot:  plt.legend(loc='upper right', bbox_to_anchor=(1.3, 1.05)).  You will need to add a `label` to each plt.plot call.
   - Save the plot to a file called Pset6_Exercise3.png
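
The remark above about numpy functions acting on whole arrays can be checked with a short sketch. The array `t` is a hypothetical example, and `np.exp` is used so the sketch stays separate from the exercise. The standard-library `math` module is included only for contrast.

```python
import math
import numpy as np

t = np.array([0.0, 1.0, 2.0])

# np.exp is applied to every element and returns an array of the same shape
exp_t = np.exp(t)
print(exp_t)

# math.exp expects a single number, so passing a whole array fails
try:
    math.exp(t)
except TypeError as err:
    print("TypeError:", err)
```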


# 4. Scatter Plots

We can use `numpy arrays` to generate arrays with random numbers.

The below code generates 10 random `floats` from `0 to 1` and stores them in an array.

In [None]:
N = 10 # number of desired random floats. 

x = np.random.rand(N) # this numpy function creates an array of the given shape (N) 
 #and populates it with random samples from a uniform distribution over [0, 1).

print(x)

If we wanted random numbers between 0 and 20 we simply multiply x by 20

In [None]:
x_new = x*20
print(x_new)
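
The same idea extends to a range that does not start at 0: scale by the width of the range, then shift by its lower end. This is a minimal sketch; the endpoints `low` and `high` and the fixed seed are arbitrary choices made for illustration.

```python
import numpy as np

np.random.seed(0)  # fixed seed (an arbitrary choice) so the sketch is repeatable

low, high = 5, 20                 # hypothetical range endpoints
u = np.random.rand(10)            # uniform samples in [0, 1)
shifted = low + (high - low) * u  # uniform samples in [5, 20)

print(shifted.min(), shifted.max())
```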

We can use arrays with random numbers to generate `scatter plots`. We will use these plots later to plot tables of real data.

In [None]:
# let's define two arrays with 1000 random numbers sampled between 0 and 100
# (note: the name y now refers to an array, replacing the function y(x) defined earlier)
N = 1000
x = np.random.rand(N)*100
y = np.random.rand(N)*100

colors = np.random.rand(N) # This is another random array of 1000 numbers from 0 to 1 that 
                             # matplotlib will map onto a color table. 
    
area = (30*np.random.rand(N))**2 # randomly assigning area of each point. 0 to 15 point radii 

# Scatter plots 
plt.scatter(x, y, s=area, c=colors, alpha=0.5) # alpha indicates transparency of each point

plt.xlabel('x')
plt.ylabel('y')
plt.title('Scatter Plot')
plt.savefig('Pset6_ScatterPlot.png', dpi=300) # save the figure to a file


## Exercise 4.

a) Using the previously defined arrays x and y, plot only the largest points. 
Remember that x, y, area and colors are all arrays. To identify the largest points, you need
those points that satisfy the following conditional statement: 

( (area <= max(area)) &  (area >= max(area)/1.2))


This is a long condition and it would be annoying to have to write this out each time for each array. 

Instead, `numpy arrays` have a special feature to allow you to store the desired index range as a variable: 

`desired_index = np.where(conditional statement)`

This defines a variable `desired_index` that stores the indices of the original array where the conditional statement is true (`np.where` returns them as a tuple of index arrays).  An example is shown below. Use this method to plot only the largest points. 


In [21]:
## Make sure you understand what is going on here: 

test = np.arange(1,21)
print(f"old test array {test}")

index = np.where( (test <= max(test)) & (test >= max(test)/2.0))
print(f"subset of test array: {test[index]}")

old test array [ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20]
subset of test array: [10 11 12 13 14 15 16 17 18 19 20]
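
The stored index is useful because it can be reused on any array of the same length. Here is a minimal sketch with made-up data (`temps`, `days` and `warm` are hypothetical names):

```python
import numpy as np

temps = np.array([12.0, 25.5, 31.0, 8.5, 27.0])
days = np.array([1, 2, 3, 4, 5])

# np.where returns a tuple holding one array of indices per dimension
warm = np.where(temps > 20)
print(warm)

# the same stored index selects matching entries from both arrays
print(days[warm], temps[warm])

# indexing with the boolean condition directly selects the same values
mask = temps > 20
print(temps[mask])
```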


b) Do the same as part (a) but now for the points with the smallest areas.  This doesn't need to be precise, but it should visually appear to only be small point sizes. 

Note that min(area) will be a small number, so to define an upper end to the range you may need to multiply that value by a large number. 

More information about plotting with matplotlib is found [here](https://matplotlib.org/index.html)

# Submitting Problem Sets for Grading

When you're done with a notebook, click "restart & clear all output" from the kernel menu and then "close and halt" from the file menu. Then to shut down the notebook server, go back to the original window where you started things up and type ctrl-C. You will be asked for confirmation, and if you type y, the server will then shut down.

**Change the name of the notebook to user_name_ProblemSet6.ipynb**

Submit the notebook on D2L

The problem set will be due before 5 PM the day before next class.