# A First Look at an X-ray Image Dataset

Images are data. They can be 1D (from spectrographs), 2D (from cameras), or 3D (from integral field units, or IFUs). In each case, the data come packaged as an *array* of numbers, which we can visualize and do calculations with.

Let's suppose we are interested in clusters of galaxies. We choose one, Abell 1835, and propose to observe it with the XMM-Newton space telescope. Our proposal is successful: we design the observations, and they are taken for us. Next, we download the data and take a look at them.

## Getting the Data 

We will download our images from HEASARC, the online archive where XMM data are stored. 

In [2]:
import astropy.io.fits as pyfits
import numpy as np
import os.path

Download the example data files if we don't already have them.

In [7]:
targdir = 'a1835_xmm/'
!mkdir -p $targdir

imagefile = 'P0098010101M2U009IMAGE_3000.FTZ'
expmapfile = 'P0098010101M2U009EXPMAP3000.FTZ'
bkgmapfile = 'P0098010101M2X000BKGMAP3000.FTZ'

remotedir = 'http://heasarc.gsfc.nasa.gov/FTP/xmm/data/rev0/0098010101/PPS/'

for filename in [imagefile,expmapfile,bkgmapfile]:
    path = targdir + filename
    url = remotedir + filename
    if not os.path.isfile(path): # i.e. if the file does not exist already:
        !wget -nd -O $path $url 

!du -h $targdir

984K	a1835_xmm/
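
If `wget` is not available, the same download-if-missing logic can be sketched in pure Python. This is a minimal sketch, not part of the original notebook: the helper name `fetch_if_missing` and its `retrieve` parameter are hypothetical, and it assumes the standard library's `urlretrieve` can reach the archive URL.

```python
import os

try:
    from urllib.request import urlretrieve  # Python 3
except ImportError:
    from urllib import urlretrieve  # Python 2


def fetch_if_missing(remotedir, targdir, filenames, retrieve=urlretrieve):
    """Download each file into targdir unless it is already there.

    Hypothetical helper: returns the list of filenames actually fetched.
    `retrieve(url, path)` defaults to the standard library's urlretrieve.
    """
    if not os.path.isdir(targdir):
        os.makedirs(targdir)
    fetched = []
    for filename in filenames:
        path = os.path.join(targdir, filename)
        if not os.path.isfile(path):  # skip files we already have
            retrieve(remotedir + filename, path)
            fetched.append(filename)
    return fetched
```

With the variables defined above, the call would be `fetch_if_missing(remotedir, targdir, [imagefile, expmapfile, bkgmapfile])`.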


## The XMM MOS2 image

Let's find the "science" image, and display it.

In [8]:
imfits = pyfits.open('a1835_xmm/P0098010101M2U009IMAGE_3000.FTZ')
imfits.info()

Filename: a1835_xmm/P0098010101M2U009IMAGE_3000.FTZ
No.    Name         Type      Cards   Dimensions   Format
0    PRIMARY     PrimaryHDU     262   (648, 648)   int32   
1    GTI00006    BinTableHDU     29   15R x 2C     [D, D]   
2    GTI00106    BinTableHDU     29   15R x 2C     [D, D]   
3    GTI00206    BinTableHDU     29   16R x 2C     [D, D]   
4    GTI00306    BinTableHDU     29   15R x 2C     [D, D]   
5    GTI00406    BinTableHDU     29   15R x 2C     [D, D]   
6    GTI00506    BinTableHDU     29   15R x 2C     [D, D]   
7    GTI00606    BinTableHDU     29   15R x 2C     [D, D]   


`imfits` is a FITS object containing multiple data structures, each called a "header data unit" or HDU. The image itself is a 648x648 pixel array of type int32, stored in the primary HDU; the other HDUs listed are binary tables.

> If we need the image to be floating point for some reason, we need to cast it:
`im = imfits[0].data.astype(np.float32)`
Note that the cast returns a new array that is no longer attached to the HDU, so this (probably?) prevents us from using the pyfits `writeto` method to save any changes made to it. Assuming the integer type is OK, just get a reference to the image data.
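
To see why the cast matters, here is a small sketch using a synthetic integer array as a stand-in for `imfits[0].data` (the array `counts` and its values are made up for illustration). It shows that `astype` returns a new array, so edits to the float copy leave the original untouched.

```python
import numpy as np

# Synthetic stand-in for imfits[0].data: a small int32 "image" (made-up values)
counts = np.array([[0, 1], [2, 223]], dtype=np.int32)

# Pass the dtype itself (or the string 'float32'), not the string 'np.float32'
im_float = counts.astype(np.float32)

# astype returns a copy by default, so modifying it leaves the original alone
im_float[0, 0] = 0.5

print(counts.dtype)    # int32
print(im_float.dtype)  # float32
print(counts[0, 0])    # 0
```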

Accessing the `.data` attribute of the primary HDU, `imfits[0]`, returns the image data as a numpy ndarray.

In [9]:
im = imfits[0].data
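
A square 648x648 image hides a detail that a non-square one makes visible. As a hedged sketch (the tiny array here is synthetic, not XMM data), we can build a primary HDU in memory and compare the FITS header keywords with the numpy shape:

```python
import numpy as np
import astropy.io.fits as pyfits

# Synthetic image with 3 rows and 4 columns (not real data)
small = np.arange(12, dtype=np.int32).reshape(3, 4)

# Wrap it in a primary HDU inside an HDU list, mimicking the structure of imfits
hdulist = pyfits.HDUList([pyfits.PrimaryHDU(data=small)])
hdu = hdulist[0]

print(type(hdu.data))        # a numpy ndarray
print(hdu.data.shape)        # (3, 4): numpy reports (rows, columns)
print(hdu.header['NAXIS1'])  # 4: the FITS x-axis length is the number of columns
print(hdu.header['NAXIS2'])  # 3: the FITS y-axis length is the number of rows
```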

Also read in the model background and exposure maps. (We may not need these immediately.) They are both float32 type.

In [10]:
bkfits = pyfits.open('a1835_xmm/P0098010101M2X000BKGMAP3000.FTZ')
bkfits.info()

Filename: a1835_xmm/P0098010101M2X000BKGMAP3000.FTZ
No.    Name         Type      Cards   Dimensions   Format
0    PRIMARY     PrimaryHDU      77   (648, 648)   float32   


In [11]:
exfits = pyfits.open('a1835_xmm/P0098010101M2U009EXPMAP3000.FTZ')
exfits.info()

Filename: a1835_xmm/P0098010101M2U009EXPMAP3000.FTZ
No.    Name         Type      Cards   Dimensions   Format
0    PRIMARY     PrimaryHDU      63   (648, 648)   float32   


Let's look at the data image with `ds9`. This requires the SAOImage DS9 viewer to be installed and on the shell's path.

In [12]:
!ds9 -log "$targdir$imagefile"

/bin/sh: ds9: command not found
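
If `ds9` is not installed, as in the output above, the logarithmic stretch requested by the `-log` flag can be approximated in numpy before handing the array to whatever plotting tool is available. This is only a sketch: the function name `log_stretch` is hypothetical, and the `log10(1 + counts)` scaling is one common choice, not necessarily what `ds9` does internally.

```python
import numpy as np


def log_stretch(image):
    """Return log10(1 + image), rescaled to lie between 0 and 1.

    Hypothetical helper; assumes non-negative pixel values (e.g. counts).
    """
    scaled = np.log10(1.0 + np.asarray(image, dtype=np.float64))
    peak = scaled.max()
    return scaled / peak if peak > 0 else scaled


# Synthetic counts spanning the range seen in the image (made-up layout)
demo = np.array([[0, 1], [9, 223]])
stretched = log_stretch(demo)
print(stretched)
```

The result could then be displayed with, for example, matplotlib's `imshow(stretched, origin='lower')`, if matplotlib is installed.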


-----
## Exercise

What do you think is going on in this image? Make a list with your neighbor, and report back in 5 minutes.

-----

In [27]:
print im, len(im), im.shape, im[324,324], np.max(im), im.argmax(), type(im), im[348,328], im.T

[[0 0 0 ..., 0 0 0]
 [0 0 0 ..., 0 0 0]
 [0 0 0 ..., 0 0 0]
 ..., 
 [0 0 0 ..., 0 0 0]
 [0 0 0 ..., 0 0 0]
 [0 0 0 ..., 0 0 0]] 648 (648, 648) 0 223 225832 <type 'numpy.ndarray'> 223 [[0 0 0 ..., 0 0 0]
 [0 0 0 ..., 0 0 0]
 [0 0 0 ..., 0 0 0]
 ..., 
 [0 0 0 ..., 0 0 0]
 [0 0 0 ..., 0 0 0]
 [0 0 0 ..., 0 0 0]]
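
The value returned by `im.argmax()` above, 225832, is an index into the flattened array. Here is a short sketch of how to turn it back into a (row, column) pair, using only the image shape reported above:

```python
import numpy as np

shape = (648, 648)   # im.shape, from the output above
flat_index = 225832  # im.argmax(), from the output above

# Convert the flat index into a (row, column) pair for an array of this shape
row, col = np.unravel_index(flat_index, shape)
print(int(row))  # 348
print(int(col))  # 328

# Equivalent arithmetic for a 2D row-major array
assert (row, col) == divmod(flat_index, shape[1])
```

This matches `im[348,328]`, which printed the maximum value, 223.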


Note that the index ordering of the ndarray does not match the usual image coordinates: the two axes are reversed (the first index is the row, i.e. y), and the indices start from zero.

### Practical Note
We will mainly work in "image coordinates" because they are simpler to deal with than celestial coordinates and map more directly to the python/numpy representation of the data. Unfortunately, there are still a couple of sticking points:
1. Image coordinates conventionally start at 1, whereas python indexing of the array starts at 0.
2. The two dimensions are transposed with respect to one another. We can potentially deal with this by taking the transpose (`.T`) immediately after reading in the data. (todo)
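
The two sticking points above can be captured in a pair of small conversion functions. This is a sketch under stated assumptions: the names `image_to_index` and `index_to_image` are hypothetical, and integer pixel coordinates are assumed, with image `x` running along columns and `y` along rows.

```python
import numpy as np


def image_to_index(x, y):
    """Map 1-based image coordinates (x, y) to a 0-based numpy index (row, col)."""
    return (y - 1, x - 1)


def index_to_image(row, col):
    """Map a 0-based numpy index (row, col) to 1-based image coordinates (x, y)."""
    return (col + 1, row + 1)


# Synthetic 3-row by 4-column array with one bright pixel (made-up data)
demo = np.zeros((3, 4), dtype=np.int32)
demo[2, 0] = 223

print(index_to_image(2, 0))        # (1, 3)
print(demo[image_to_index(1, 3)])  # 223
```

For the brightest pixel found earlier at `im[348,328]`, `index_to_image(348, 328)` gives image coordinates x = 329, y = 349.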