# Table

I didn't go into opening and reading files in Python because the `astropy.table` module covers most of what astronomers will need.  If you want to open and edit files directly, see Python's built-in `open()` function and the `close()` method of the file object it returns.
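
For reference, here is a minimal sketch of reading a plain-text file directly with `open()`.  The file name and its contents are made up for this illustration; a `with` block is used so the file is closed automatically.

```python
import os
import tempfile

# Hypothetical file name, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "example.txt")

# The `with` block closes the file automatically, even if an error occurs.
with open(path, "w") as f:
    f.write("1 2.0 x\n4 5.0 y\n")

# Read the file back, splitting each line on whitespace.
with open(path) as f:
    lines = [line.split() for line in f]

print(lines)
```

Everything comes back as strings, so any type conversion is up to you.  That bookkeeping is what the table readers handle for you.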

Astropy has a module (`astropy.io`) that handles input and output for a variety of file types.  The `table` module uses it to read and write many file formats.

In [None]:
from astropy import units as u
from astropy.table import Table, Column, Row

In [None]:
# Tables can be formed from columns
a = [1, 4, 5]
b = [2.0, 5.0, 8.2]
c = ['x', 'y', 'z']
t = Table([a, b, c], names=('a', 'b', 'c'))

In [None]:
t

In [None]:
# Tables can be formed from rows
data_rows = [(1, 2.0, 'x'),
             (4, 5.0, 'y'),
             (5, 8.2, 'z')]
t = Table(rows=data_rows, names=('a', 'b', 'c'), meta={'name': 'first table'},
          dtype=(int, float, str))

In [None]:
t

In [None]:
# Tables work with astropy units
# We can assign unit of seconds to column b
t['b'].unit = u.second

In [None]:
t

In [None]:
# To get summary information about a table
t.info

In [None]:
t['b'].format = '7.3f'
t

### Reading from Files

In [None]:
data = Table.read('table.tex', format='latex')

In [None]:
data

In [None]:
# Tables have several tools for convenient display.  For example, they can display interactively in a notebook!
data.show_in_notebook()

Astropy tables are very powerful (and thus can be somewhat complex).  They can be manipulated in numerous ways.

In [None]:
data.info()

In [None]:
data.keys()

In [None]:
# Let's clean up some of those column names to remove the LaTeX
data['$M_r$'].name = 'M_r'
data[r'log $M^\star/M_\odot$'].name = 'log M/M_o'  # raw string: avoids invalid-escape warnings for \s and \o
data['GZ1$_c$'].name = 'GZ1_c'
data['$C$'].name = 'C'
data['$A$'].name = 'A'
data['$L(\\rm{[OIII]})$'].name = 'L_OIII'

In [None]:
data.info()

### Using Tables

Let's try to do some calculations with the data in the above table.

First, what if we wanted to sum all the A values for those objects which have a C value above 0.45?  We can index tables with a boolean array that would pick out the rows we want.

In [None]:
data['C'] > 0.45

In [None]:
data[data['C'] > 0.45]

In [None]:
col = data[data['C'] > 0.45]['A']
col.data

In [None]:
sum(data[data['C'] > 0.45]['A'])

In [None]:
import numpy as np
np.mean(data[data['C'] > 0.45]['A'])

In [None]:
np.median(data[data['C'] > 0.45]['A'])

In [None]:
np.std((data[data['C'] > 0.45]['A']))

I can also select out groups using the `group_by` method.

In [None]:
byGZ1_c = data.group_by('GZ1_c')

In [None]:
# The output is the same table, but it now has a `.groups` property.
# Let's see what is in the zeroth group
byGZ1_c.groups[0]

In [None]:
# Let's see what is in the first group
byGZ1_c.groups[1]

In [None]:
# The info about which group has which value of GZ1_c is in the `.keys` property.
byGZ1_c.groups.keys

In [None]:
# We can find out which elements of the keys `Column` are equal to the one we are interested in:
byGZ1_c.groups.keys['GZ1_c'] == 'S'

In [None]:
# Therefore if I want to examine those rows of the table which have GZ1_c == S:
byGZ1_c.groups[byGZ1_c.groups.keys['GZ1_c'] == 'S']

### Editing Tables

There are times when you want to edit data in a Table.  For example, in the table above the `L_OIII` column is a string, which is inconvenient.

We want to split out the value and the uncertainty.

In [None]:
# First I want to figure out how to get the values.  I get those by using the `.split` method on the string.
# An example of the content of a cell is:
data['L_OIII'][0]

In [None]:
# So I want to split on the "$\\pm$" LaTeX string.
# Using list comprehension I would do that like this:
L_OIII_value = [x.split("$\\pm$")[0] for x in data['L_OIII']]
L_OIII_value

In [None]:
# but that just gives me strings, so I need to get the float of that:
L_OIII_value = [float(x.split("$\\pm$")[0]) for x in data['L_OIII']]
L_OIII_value

In [None]:
# but that gives me a list type, I want a table Column:
L_OIII_value = Column([float(x.split("$\\pm$")[0]) for x in data['L_OIII']], name="L_OIII_value")

In [None]:
L_OIII_value

In [None]:
# Let's get the uncertainty the same way:
L_OIII_uncertainty = Column([float(x.split("$\\pm$")[1]) for x in data['L_OIII']], name="L_OIII_uncertainty")
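
The two comprehensions above split every cell twice.  One possible alternative is a small helper that splits once and returns both numbers.  The helper name and the sample strings below are hypothetical; the strings are assumed to follow the same `value$\pm$uncertainty` pattern as the `L_OIII` column.

```python
def split_value_uncertainty(cell, sep="$\\pm$"):
    """Split a 'value<sep>uncertainty' string into two floats."""
    value, uncertainty = cell.split(sep)
    return float(value), float(uncertainty)


# Hypothetical cell contents, assumed to follow the pattern of the L_OIII column.
cells = ["40.12$\\pm$0.05", "39.87$\\pm$0.10"]

# zip(*...) turns a sequence of (value, uncertainty) pairs into two tuples.
values, uncertainties = zip(*(split_value_uncertainty(c) for c in cells))

print(values, uncertainties)
```

The unpacking inside the helper raises a `ValueError` if a cell does not contain exactly one separator, which flags malformed rows early.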

In [None]:
# Now let's add those back in to the table:
data.add_columns([L_OIII_value, L_OIII_uncertainty])
# And look at the result:
data

In [None]:
# Let's do something with that new information.  Let's add a SNR column for the L_OIII measurement.
SNR = Column([x['L_OIII_value']/x['L_OIII_uncertainty'] for x in data], name='L_OIII_SNR', format='.2f')

In [None]:
data.add_column(SNR)

In [None]:
data.write?

### Other Features of Tables

Tables also have a few other nice features which we haven't had time to go into here.

For one thing, table columns can be associated with units.  Thus, data in a table can play well with calculations using `astropy.units`.

Another is that table columns can contain other types of data such as `SkyCoord`s, so you can have a table with object coordinates and carry around all the power of a `SkyCoord` or a `Time`.

# Exercises

Tables can also be read in from FITS files which use the FITS table format.  You'll try that out in this exercise.

First, we have to get the data.  Astropy has utilities for downloading data and we will use that here.

(much of this exercise is stolen from a notebook by Lia R. Corrales)

In [None]:
import numpy as np

from astropy.io import fits
from astropy.table import Table
from astropy.utils.data import download_file

%matplotlib inline
import matplotlib.pyplot as plt

In [None]:
# Download the file we want (it is cached locally after the first call)
event_filename = download_file('http://data.astropy.org/tutorials/FITS-tables/chandra_events.fits', cache=True)

In [None]:
# First read in the FITS data.
# We use memmap to prevent RAM storage issues for large files.
hdu_list = fits.open(event_filename, memmap=True)
hdu_list.info()

In [None]:
# Let's look at the EVENTS data.  It is stored in a FITS table.
type(hdu_list[1].data)

In [None]:
# We could work with it in that form, but Tables are much nicer.
evt_data = Table(hdu_list[1].data)
evt_data

### Exercise 1) Make a Histogram Plot

Generate a histogram of the energy of each photon.  Set a reasonable number of bins so that the plot is informative and contains as much of the raw information as possible, but not busy.

Hint: pyplot's `hist` may be useful.

In [None]:
# your code here

### Exercise 2) Select out the events which fall on the main (ACIS-I) chips.

This particular observation spans five CCD chips. First we determine the events that only fell on the main (ACIS-I) chips, which have numerical IDs ("ccd_id") equal to 0, 1, 2, and 3.

Once you have the data for those chips, print out the total number of events which meet the criteria.

In [None]:
# your code here

### Exercise 3) Make a 2D Spatial Histogram of the Events

Generate an image of the event hits in the 2D space of the detector.

Hint: You may find the `np.histogram2d` tool useful.

In [None]:
# your code here

### Exercise 4) Make the same image with a logarithmic color scale

You may notice that the dynamic range of your "image" from exercise 3 is far greater than can be easily displayed.  Make a similar "image" with the event count displayed in log space, so that the dynamic range is compressed.

Hint: the `plt.hist2d` tool has options you might be interested in.

In [None]:
# your code here