# Demo of Python : Simple matrix operations & Plotting

Author : Keith Hawkins (UT Austin)
AST352K Spring 2023

Class : AST352K

Demo of Python

Date : Aug 25, 2022

## Learning goals
The purpose of this tutorial/demo is to introduce you to the basics of Python that will be required to complete the various projects in this class. Here we will practice using arrays and lists, simple mathematical operations, plotting, etc.

Introduce arrays and lists
Introduce simple plotting in 1D and 2D with matplotlib (e.g. x-y plots, histograms, and 2D histograms via hexbin)
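
As a quick illustration of why both lists and arrays appear in this demo, here is a minimal sketch of how they behave differently under arithmetic. The variable names `values_list` and `values_array` are made up for this example and are not used elsewhere in the notebook.

```python
import numpy as np

values_list = [1, 2, 3]               # a plain Python list
values_array = np.array(values_list)  # the same numbers as a NumPy array

print(values_list * 2)    # repeats the list: [1, 2, 3, 1, 2, 3]
print(values_array * 2)   # multiplies each element: [2 4 6]
print(values_array ** 2)  # element-wise power: [1 4 9]

values_list.append(4)     # lists grow in place with append
print(values_list)        # [1, 2, 3, 4]
```

In short, lists are convenient for collecting items one at a time, while arrays are the natural choice for element-wise math.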


We will start by loading in the necessary libraries. These libraries contain definitions (functions) that allow us to carry out what we want the code to do.


Let's begin!
DATA LOCATION 
Data sets: 
high_quality_gaia.fits [50 MB] = astrometry and photometry for 500,000+ random Milky Way stars with parallax uncertainties better than 10% (the parallax_over_error values in the table are all above 10). Data location: https://utexas.box.com/s/x4kfcv97bdxrhwkxllm687wjo3u2ls54

Download and save to the SAME LOCATION as this file.

In [18]:
#let's make the plots interactive; comment this out if you do not have widget installed
%matplotlib widget 

#if you want the plots inline in the notebook rather than interactive, uncomment the line below; if you don't have widget installed, let's talk
#%matplotlib inline

#Importing libraries 
import numpy as np #numpy
import matplotlib.pyplot as p
import astropy
from astropy.table import Table
from cycler import cycler

Styling plots: 
    Let's set some of the default parameters for plotting. This cell is not needed, but I like to style my plots.

In [5]:
#Let's set some of the default parameters for plotting. This cell is not needed, but I like to style my plots.
p.rc('axes',prop_cycle=(cycler('color', ['k','b','g','r','c','m','y'])))
p.rcParams['lines.linewidth']= 1.5
p.rcParams['axes.linewidth']=2.0
p.rcParams['font.size']= 15.0
p.rcParams['axes.labelsize']=16.0
p.rcParams['axes.unicode_minus']=False
p.rcParams['xtick.major.size']=6
p.rcParams['xtick.minor.size']=3
p.rcParams['xtick.major.width']=1.5#2.0
p.rcParams['xtick.minor.width']=1.0
p.rcParams['axes.linewidth']=2.5
p.rcParams['axes.titlesize']=20#'large'
p.rcParams['xtick.labelsize'] = 20#'x-large' # fontsize of the tick labels
p.rcParams['ytick.labelsize']=20 #'x-large'
p.rcParams['ytick.major.width']=2.0 #4
p.rcParams['ytick.minor.width']=1.0 #2.0
#--- use these ONLY if you have LaTeX installed; otherwise comment out this block -----
p.rcParams['text.usetex']= True
p.rcParams['mathtext.fontset']= 'custom'
p.rcParams['mathtext.default']= 'rm'
p.rcParams['axes.formatter.use_mathtext']=False
#-----------------------------------------------
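
If you are not sure whether LaTeX is installed, one possible check is sketched below. It only looks for a `latex` executable on your PATH using the standard library; the helper name `latex_available` is made up for this example, and a full `usetex` setup may need additional tools, so treat the result as a hint rather than a guarantee.

```python
import shutil

def latex_available():
    """Return True if a `latex` executable is found on the PATH."""
    return shutil.which("latex") is not None

use_tex = latex_available()
print("LaTeX found:", use_tex)
```

You could then write `p.rcParams['text.usetex'] = use_tex` instead of hard-coding `True` in the styling cell.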

## Arrays and Printing 

In [13]:
#let's define an array x that ranges from -20 to 20 in steps of 0.1
x = np.arange(-20.0,20.1,0.1) #generates an array from -20 to 20 with step size 0.1; the stop value (20.1) is excluded, so the last element is approximately 20

#what is in x?
print(x) #let's print x and see what's inside


[-2.00000000e+01 -1.99000000e+01 -1.98000000e+01 -1.97000000e+01
 -1.96000000e+01 -1.95000000e+01 -1.94000000e+01 -1.93000000e+01
 -1.92000000e+01 -1.91000000e+01 -1.90000000e+01 -1.89000000e+01
 -1.88000000e+01 -1.87000000e+01 -1.86000000e+01 -1.85000000e+01
 -1.84000000e+01 -1.83000000e+01 -1.82000000e+01 -1.81000000e+01
 -1.80000000e+01 -1.79000000e+01 -1.78000000e+01 -1.77000000e+01
 -1.76000000e+01 -1.75000000e+01 -1.74000000e+01 -1.73000000e+01
 -1.72000000e+01 -1.71000000e+01 -1.70000000e+01 -1.69000000e+01
 -1.68000000e+01 -1.67000000e+01 -1.66000000e+01 -1.65000000e+01
 -1.64000000e+01 -1.63000000e+01 -1.62000000e+01 -1.61000000e+01
 -1.60000000e+01 -1.59000000e+01 -1.58000000e+01 -1.57000000e+01
 -1.56000000e+01 -1.55000000e+01 -1.54000000e+01 -1.53000000e+01
 -1.52000000e+01 -1.51000000e+01 -1.50000000e+01 -1.49000000e+01
 -1.48000000e+01 -1.47000000e+01 -1.46000000e+01 -1.45000000e+01
 -1.44000000e+01 -1.43000000e+01 -1.42000000e+01 -1.41000000e+01
 -1.40000000e+01 -1.39000

In [14]:
#how many elements are in x?
print('There are %i elements in array x'%(len(x))) #let's print how many elements make up array x

There are 401 elements in array x
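
With a non-integer step, the number of elements that `np.arange` returns depends on floating-point rounding, which is why the stop value above is 20.1 rather than 20. A hedged alternative is `np.linspace`, where you specify the number of points and both endpoints are included. The name `x_lin` is made up for this sketch.

```python
import numpy as np

# 401 evenly spaced points from -20 to 20, endpoints included
x_lin = np.linspace(-20.0, 20.0, 401)

print(len(x_lin))             # number of elements
print(x_lin[0], x_lin[-1])    # first and last element
print(x_lin[1] - x_lin[0])    # spacing between neighbouring points
```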


## Mean / Median : Simple Statistics
What do you think the mean and median of x will be?

In [15]:
meanx = np.mean(x)
medianx = np.median(x)
print('The Mean of X = %.2f'%(meanx))
print('The Median of X = %.2f'%(medianx))

The Mean of X = 0.00
The Median of X = 0.00


Printed to two decimal places, both values look like 0. If you print more digits, are they exactly 0? Should they be?
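
One way to explore this question is to print the statistics without rounding. The sketch below rebuilds the same x array and prints the full values. Because 0.1 cannot be represented exactly in binary floating point, tiny rounding errors can accumulate, so the results may be very small numbers rather than exactly 0. Comparing with a tolerance via `np.isclose` is usually safer than testing for exact equality.

```python
import numpy as np

x = np.arange(-20.0, 20.1, 0.1)
mean_x = np.mean(x)
median_x = np.median(x)

# repr shows the full floating-point value rather than a rounded one
print(repr(mean_x), repr(median_x))

# compare with a tolerance instead of testing for exact equality
print(np.isclose(mean_x, 0.0, atol=1e-9))
print(np.isclose(median_x, 0.0, atol=1e-9))
```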

## Simple Operations on Arrays and Plotting

In [16]:
#let's define a new variable y such that y = x^2. NOTE: ** is the power operator in Python
y = x**2 

#let's also define new variables for log(x) and log10(x)
y2 = np.log(x) #natural log. Note: NumPy will emit a RuntimeWarning and return nan here because the log of a negative number is undefined
y3 = np.log10(x) #log base 10


#how many elements are in y?
print('There are %i elements in array y'%(len(y))) #let's print how many elements make up array y

There are 401 elements in array y
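
If you would rather avoid the warnings from taking the log of negative numbers, one option is to select only the positive values of x first with a boolean mask. This is a minimal sketch; the names `positive`, `x_pos`, `y2_pos`, and `y3_pos` are made up for this example.

```python
import numpy as np

x = np.arange(-20.0, 20.1, 0.1)

positive = x > 0          # boolean array: True where x is positive
x_pos = x[positive]       # keep only the positive values

y2_pos = np.log(x_pos)    # natural log, now defined everywhere
y3_pos = np.log10(x_pos)  # log base 10

print(len(x), len(x_pos))            # the masked array is shorter
print(np.all(np.isfinite(y2_pos)))   # no nan or inf left
```

Note that `x_pos` and `y2_pos` must be plotted together, since they no longer have the same length as the original x.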



## Simple plotting 
Let's try to plot y as a function of x!

In [19]:
p.figure() #create a figure 
p.plot(x,y,ls='-',color='k') #plot y vs x as a solid black line
p.xlabel('x') #label the x axis with 'x'
p.ylabel('y') #label the y axis with 'y'
p.tight_layout() #trim the extra white space around the figure

Let's now try plotting y2 (i.e. y2 = log(x)) and y3 (i.e. y3 = log10(x)) as a function of x on the same plot. 

In [21]:
p.figure() 
p.plot(x,y2,ls='-',color='k',label='y=log(x)',lw=0.5) #natural log (ln); solid black line with label and line width=0.5
p.plot(x,y3,ls='-.',color='r',label='y=log10(x)',lw=2) #log base 10; dot-dashed red line with label and line width=2
p.xlabel('X') ; p.ylabel('Y')
p.legend()
p.tight_layout()

What if we wanted to make a histogram of y? 
NOTE: Histograms allow us to see the distribution of a parameter.

In [24]:
p.figure() #create a figure
p.hist(y, histtype='step',lw=4,bins=20) #plot the histogram of y with 20 bins
p.ylabel('N') #create a y label
p.xlabel('y') #create an x label 
p.tight_layout() #makes the figure "tight" so there isn't a lot of white space
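
To see the numbers behind a histogram, it can help to compute the bin counts directly. The sketch below uses `np.histogram` with the same 20 bins on the same y = x^2 array; the names `counts` and `edges` are made up for this example.

```python
import numpy as np

x = np.arange(-20.0, 20.1, 0.1)
y = x**2

# counts[i] is the number of y values between edges[i] and edges[i+1]
counts, edges = np.histogram(y, bins=20)

print(counts)                   # number of values in each bin
print(edges[:3])                # the first few bin edges
print(counts.sum() == len(y))   # every value lands in exactly one bin
```

Most of the values fall in the first bin because x^2 changes slowly near x = 0, so many x values map to small y.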

## Loading in Data Tables w/ Astropy, Plotting Density Diagrams 

OK, so now that we have constructed an array, performed a mathematical operation on it, and plotted the results, let's try something harder. Let's try to read in some actual Gaia data using astropy and plot the on-sky positions of the stars.

In [25]:
T = Table.read('./high_quality_gaia.fits') #let's read in the FITS table; it was downloaded from the Gaia archive after running an ADQL query
#NOTE: for this to work, pass Table.read the full path of the data table ('./' = local folder/directory)
T.colnames #this lets us see the column names 



['source_id',
 'ra',
 'dec',
 'l',
 'b',
 'parallax',
 'pmra',
 'pmdec',
 'phot_g_mean_mag',
 'phot_bp_mean_mag',
 'phot_rp_mean_mag',
 'bp_rp',
 'ag_gspphot',
 'azero_gspphot',
 'ebpminrp_gspphot',
 'has_rvs',
 'parallax_over_error',
 'radial_velocity']

In [26]:
T #display a preview of the table (T.show_in_notebook without parentheses only returns the bound method, not the table)

<Table masked=True length=513123>
     source_id              ra         ... parallax_over_error radial_velocity
                           deg         ...                          km / s    
       int64             float64       ...       float32           float32    
------------------- ------------------ ... ------------------- ---------------
4274927567020274560 274.03823198505614 ...           31.061579             nan
4275021089923348096 274.27771116663604 ...            14.45642             nan
4274940932959028352 274.50170677949245 ...           18.332487             nan
4274930758169291776 274.35488241748374 ...           19.541174             nan
4274935160522571904  274.2652202652083 ...           19.613132             nan
4089962012974230016  276.8876609140014 ...           10.008068             nan
4089962253497885184  276.8140700627627 ...           11.857016      -30.657034
4089963662247137536   277.014841049893 ...             10

In [36]:
print('There are %i stars in this dataset'%(len(T)))

There are 513123 stars in this dataset


In [27]:
T['ra'] #calling column ra 

0
274.03823198505614
274.27771116663604
274.50170677949245
274.35488241748374
274.2652202652083
276.8876609140014
276.8140700627627
277.014841049893
276.93704475699496
276.7523240629263


In this case, we want to make two plots. With this many points, a 2D histogram is usually a better choice than plotting every star individually:
1. a 2D histogram which shows the on-sky location of the stars in RA/DEC (Equatorial Coordinates)
2. a 2D histogram which shows the on-sky location of the stars in l, b (Galactic Coordinates)

In [31]:
#----plot 1: Equatorial Coordinates------
p.figure()
#we will use hexbin, which creates hexagonal bins; the color of each bin represents the number of stars in that bin
#bins='log' puts the bin colors on a logarithmic scale; mincnt sets the minimum count a bin needs in order to be drawn

PLT= p.hexbin(T['ra'],T['dec'],bins='log', mincnt=1,cmap='Greys') #create a hexbin/2D histogram of ra,dec with log-scaled colors
#in the above, mincnt=1 means there must be at least 1 star in the bin!
p.colorbar(PLT,label='log(N)') #add the colorbar for the hexbin stored in PLT
p.xlabel('RA (deg)') #RA x label
p.ylabel('DEC (deg)') #Dec y label
p.tight_layout() #remove white space

#--------
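
To make the idea of a 2D histogram concrete, here is a minimal sketch that counts points in a rectangular grid with `np.histogram2d`. It uses randomly generated stand-in coordinates (`ra_fake`, `dec_fake`), not the Gaia table, and rectangular rather than hexagonal bins, so it only illustrates the counting step and does not reproduce what `hexbin` does internally.

```python
import numpy as np

# made-up coordinates for illustration only (NOT the Gaia data)
rng = np.random.default_rng(42)
ra_fake = rng.uniform(0.0, 360.0, size=10_000)
dec_fake = rng.uniform(-90.0, 90.0, size=10_000)

# count how many points fall in each cell of a 36 x 18 grid
counts, ra_edges, dec_edges = np.histogram2d(ra_fake, dec_fake, bins=(36, 18))
print(counts.shape)       # (36, 18)
print(int(counts.sum()))  # 10000

# mimic mincnt=1 with a log color scale: only cells with >= 1 point get a value
log_counts = np.full(counts.shape, np.nan)
filled = counts >= 1
log_counts[filled] = np.log10(counts[filled])
```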

In [32]:
#----plot 2: Galactic Coordinates------
p.figure()
#we will use hexbin, which creates hexagonal bins; the color of each bin represents the number of stars in that bin
#bins='log' puts the bin colors on a logarithmic scale; mincnt sets the minimum count a bin needs in order to be drawn
p.hexbin(T['l'],T['b'],bins='log', mincnt=1,cmap='Greys') 
p.xlabel('l (deg)') #Galactic longitude x label
p.ylabel('b (deg)') #Galactic latitude y label
p.tight_layout()

#--------

Let's transform this a little so that l runs from -180 to 180 degrees, which lets us see the Galaxy without the wrap-around.

To do this, we need to search the Galactic longitude array for all values that are 180 degrees or larger and subtract 360 degrees from them. For example, a star with l = 350 degrees will have a modified l of -10 degrees, so stars just below l = 360 degrees land next to those at l = 0. We will use the "where" function in numpy to do this.

In [35]:
l = T['l'].copy() #work on a copy so that the column in the table itself is not changed
l[np.where(l>= 180)] = l[np.where(l>= 180)]-360 #find all places where l >= 180 and subtract 360 deg from them
#----plot 2: Galactic Coordinates with l wrapped to -180..180------
p.figure()
#we will use hexbin, which creates hexagonal bins; the color of each bin represents the number of stars in that bin
#bins='log' puts the bin colors on a logarithmic scale; mincnt sets the minimum count a bin needs in order to be drawn
p.hexbin(l,T['b'], mincnt=1,gridsize=100, bins='log',cmap='Greys') 
p.xlabel('l (deg)') #Galactic longitude x label
p.ylabel('b (deg)') #Galactic latitude y label
p.axhline(y=0,ls='--',color='k',lw=1)
p.axvline(x=0,ls='--',color='k',lw=1)
p.tight_layout()
#--------
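
To check the longitude wrapping on numbers you can verify by hand, here is a small sketch on a made-up array `l_demo`. It shows the three-argument form of `np.where`, which returns a new array and leaves the input untouched, together with an equivalent modular-arithmetic version.

```python
import numpy as np

# made-up longitudes in degrees, for illustration only
l_demo = np.array([0.0, 90.0, 179.0, 180.0, 270.0, 350.0])

# where l >= 180 take l - 360, otherwise keep l
wrapped_where = np.where(l_demo >= 180, l_demo - 360, l_demo)

# the same wrap written with the modulo operator
wrapped_mod = (l_demo + 180.0) % 360.0 - 180.0

print(wrapped_where)                               # [0, 90, 179, -180, -90, -10]
print(np.array_equal(wrapped_where, wrapped_mod))  # True
```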

## Making Definitions and Using for loops
Sometimes when coding it is easier to make a definition/function whose purpose is to take in some variables and output something. It may also be necessary to do something many times over, which is where for loops can be very helpful. Here we will work on both.

Let's say we want to build a function that takes as inputs an array x and the constants a, b, c of a 2nd order polynomial equation, $y = ax^2 + bx + c$, and returns y as output.

In [36]:
def poly_order2(x, a,b,c): #make a definition called poly_order2 and take as inputs x, a,b,c
    y = a*x**2 + b*x + c
    return y

Now let's make an array x over which we will evaluate the polynomial.

In [37]:
x = np.arange(-40,40,0.1) # makes an array of numbers from -40 up to (but not including) 40 in steps of 0.1

Now let's evaluate the function over all x's, putting in a=2, b=1, c=5, and print y.

In [38]:
y = poly_order2(x,2,1,5)
print(y)

[3165.   3149.12 3133.28 3117.48 3101.72 3086.   3070.32 3054.68 3039.08
 3023.52 3008.   2992.52 2977.08 2961.68 2946.32 2931.   2915.72 2900.48
 2885.28 2870.12 2855.   2839.92 2824.88 2809.88 2794.92 2780.   2765.12
 2750.28 2735.48 2720.72 2706.   2691.32 2676.68 2662.08 2647.52 2633.
 2618.52 2604.08 2589.68 2575.32 2561.   2546.72 2532.48 2518.28 2504.12
 2490.   2475.92 2461.88 2447.88 2433.92 2420.   2406.12 2392.28 2378.48
 2364.72 2351.   2337.32 2323.68 2310.08 2296.52 2283.   2269.52 2256.08
 2242.68 2229.32 2216.   2202.72 2189.48 2176.28 2163.12 2150.   2136.92
 2123.88 2110.88 2097.92 2085.   2072.12 2059.28 2046.48 2033.72 2021.
 2008.32 1995.68 1983.08 1970.52 1958.   1945.52 1933.08 1920.68 1908.32
 1896.   1883.72 1871.48 1859.28 1847.12 1835.   1822.92 1810.88 1798.88
 1786.92 1775.   1763.12 1751.28 1739.48 1727.72 1716.   1704.32 1692.68
 1681.08 1669.52 1658.   1646.52 1635.08 1623.68 1612.32 1601.   1589.72
 1578.48 1567.28 1556.12 1545.   1533.92 1522.88 1511.8
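
As a sanity check on a hand-written function like this, you can compare it with NumPy's built-in `np.polyval`, which takes the coefficients from the highest power to the lowest. This is only a cross-check sketch; the names `y_manual` and `y_polyval` are made up here.

```python
import numpy as np

def poly_order2(x, a, b, c):
    # evaluate y = a*x^2 + b*x + c
    return a * x**2 + b * x + c

x = np.arange(-40, 40, 0.1)
y_manual = poly_order2(x, 2, 1, 5)

# np.polyval expects coefficients ordered from highest power to lowest
y_polyval = np.polyval([2, 1, 5], x)

print(np.allclose(y_manual, y_polyval))  # True
print(poly_order2(-40.0, 2, 1, 5))       # 3165.0, matches the first printed value
```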

Now let's plot x and y and show that it is quadratic.

In [39]:
p.figure()
p.plot(x,y,'k-') #plot y vs x as a black line
p.plot(x,poly_order2(x,2,1,1000),'r-') #let's also plot a red line using the same x, a, b, but with c (the offset) now 1000 instead of 5
p.xlabel('x') ; p.ylabel('y')
p.tight_layout()

So now we should understand how to make a definition/function. Let's now write one to figure out the value of the infinite series $\sum_{n=1}^{\infty} \frac{1}{2^n} = $?

In [44]:
def inf_series(num_iterations=2):
    Y = [] #make an empty list that we will append to
    n = 1 #let's start at n=1
    for i in range(num_iterations):
        Y.append(1/2**n) #this appends the term 1/2^n for this iteration
        n = n+1 #this increments n. Can also be written as n += 1

    print(Y) #let's print it so we can see what each iteration gives us
    return sum(Y) #this sums up everything in the list (i.e. the term from each iteration)
    

Now let's run this series with only 2, 5, 100, and 1000 iterations.

In [46]:
inf_series(num_iterations=2)

[0.5, 0.25]


0.75

In [47]:
inf_series(num_iterations=5)

[0.5, 0.25, 0.125, 0.0625, 0.03125]


0.96875

In [48]:
inf_series(num_iterations=100)

[0.5, 0.25, 0.125, 0.0625, 0.03125, 0.015625, 0.0078125, 0.00390625, 0.001953125, 0.0009765625, 0.00048828125, 0.000244140625, 0.0001220703125, 6.103515625e-05, 3.0517578125e-05, 1.52587890625e-05, 7.62939453125e-06, 3.814697265625e-06, 1.9073486328125e-06, 9.5367431640625e-07, 4.76837158203125e-07, 2.384185791015625e-07, 1.1920928955078125e-07, 5.960464477539063e-08, 2.9802322387695312e-08, 1.4901161193847656e-08, 7.450580596923828e-09, 3.725290298461914e-09, 1.862645149230957e-09, 9.313225746154785e-10, 4.656612873077393e-10, 2.3283064365386963e-10, 1.1641532182693481e-10, 5.820766091346741e-11, 2.9103830456733704e-11, 1.4551915228366852e-11, 7.275957614183426e-12, 3.637978807091713e-12, 1.8189894035458565e-12, 9.094947017729282e-13, 4.547473508864641e-13, 2.2737367544323206e-13, 1.1368683772161603e-13, 5.684341886080802e-14, 2.842170943040401e-14, 1.4210854715202004e-14, 7.105427357601002e-15, 3.552713678800501e-15, 1.7763568394002505e-15, 8.881784197001252e-16, 4.440892098500626e-1

1.0

In [49]:
inf_series(num_iterations=1000)

[0.5, 0.25, 0.125, 0.0625, 0.03125, 0.015625, 0.0078125, 0.00390625, 0.001953125, 0.0009765625, 0.00048828125, 0.000244140625, 0.0001220703125, 6.103515625e-05, 3.0517578125e-05, 1.52587890625e-05, 7.62939453125e-06, 3.814697265625e-06, 1.9073486328125e-06, 9.5367431640625e-07, 4.76837158203125e-07, 2.384185791015625e-07, 1.1920928955078125e-07, 5.960464477539063e-08, 2.9802322387695312e-08, 1.4901161193847656e-08, 7.450580596923828e-09, 3.725290298461914e-09, 1.862645149230957e-09, 9.313225746154785e-10, 4.656612873077393e-10, 2.3283064365386963e-10, 1.1641532182693481e-10, 5.820766091346741e-11, 2.9103830456733704e-11, 1.4551915228366852e-11, 7.275957614183426e-12, 3.637978807091713e-12, 1.8189894035458565e-12, 9.094947017729282e-13, 4.547473508864641e-13, 2.2737367544323206e-13, 1.1368683772161603e-13, 5.684341886080802e-14, 2.842170943040401e-14, 1.4210854715202004e-14, 7.105427357601002e-15, 3.552713678800501e-15, 1.7763568394002505e-15, 8.881784197001252e-16, 4.440892098500626e-1

1.0

Clearly we can see that this infinite series converges to 1, as expected, i.e. $\sum_{n=1}^{\infty} \frac{1}{2^n} = 1$
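
The printed 1.0 for 100 and 1000 iterations is a rounded floating-point result: the partial sum of the first N terms is exactly $1 - \frac{1}{2^N}$, which is always slightly less than 1. A minimal sketch using Python's built-in `fractions` module makes this visible with exact arithmetic; the helper name `partial_sum` is made up for this example.

```python
from fractions import Fraction

def partial_sum(num_terms):
    """Exact sum of 1/2^n for n = 1 .. num_terms, as a fraction."""
    return sum(Fraction(1, 2**n) for n in range(1, num_terms + 1))

for N in (2, 5, 10):
    s = partial_sum(N)
    print(N, s, 1 - s)  # the gap to 1 is exactly 1/2^N

# with 100 terms the gap is far below double precision, so the float is 1.0
print(float(partial_sum(100)))
```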