HST Duplication checking
---------

In [1]:
from astropy.table import Table
from astropy import coordinates as coords
import astropy.units as u
import pickle

Download the PAEC catalog from https://archive.stsci.edu/hst/paec.html.
Remove the header and replace with the following:

```
targname        | ra          | dec       |config  | mode       | aper       |spec         |  wave |time |prop |cy|dataset   |release |
```
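The header swap can also be scripted. The sketch below is one possible way to do it; the function name `replace_header`, its parameters, and the idea that the header occupies a fixed number of leading lines (`n_header_lines`) are assumptions for illustration. Inspect the downloaded file to find out how many lines its header actually spans.

```python
from pathlib import Path

# Column header expected by the fixed-width reader (copied from the text above).
NEW_HEADER = (
    "targname        | ra          | dec       |config  | mode       | aper       "
    "|spec         |  wave |time |prop |cy|dataset   |release |"
)


def replace_header(src, dst, n_header_lines, new_header=NEW_HEADER):
    """Copy src to dst, replacing the first n_header_lines with new_header.

    n_header_lines is an assumption the caller must supply after
    looking at the downloaded file. Returns the number of data lines kept.
    """
    lines = Path(src).read_text().splitlines()
    body = lines[n_header_lines:]
    Path(dst).write_text("\n".join([new_header] + body) + "\n")
    return len(body)
```

Writing to a separate output file keeps the original download untouched in case the header line count was guessed wrong.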

Read it in, add the SkyCoord objects, and write it out as a pickle file so that it is *much* faster to reuse.

In [2]:
paec = Table.read('../tables/paec_7-present.cat',format='ascii.fixed_width')
paec['coords'] = coords.SkyCoord(paec['ra'],paec['dec'],unit=(u.hour, u.deg),frame='icrs')
# Use a context manager so the file is flushed and closed after writing
with open('paec.p','wb') as fp:
    pickle.dump(paec,fp,protocol=pickle.HIGHEST_PROTOCOL)
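The build-once, reuse-later pattern can be wrapped in a small helper. This is a minimal sketch using only the standard library; `load_or_build` and its `build` callback are hypothetical names, not part of the original notebook.

```python
import os
import pickle


def load_or_build(cache_path, build):
    """Return the object pickled at cache_path, building and caching it if absent.

    build is a zero-argument callable that produces the object
    (hypothetical helper, for illustration only).
    """
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as fp:
            return pickle.load(fp)
    obj = build()
    with open(cache_path, "wb") as fp:
        pickle.dump(obj, fp, protocol=pickle.HIGHEST_PROTOCOL)
    return obj
```

In this notebook, `build` would be a function that wraps the `Table.read` call and the `SkyCoord` column assignment, and `cache_path` would be `'paec.p'`.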



**The functions below do the duplication checking.** A target counts as a possible duplication if any PAEC entry lies within 200 arcsec of it.

 * `targets` is a table of targets that already has a `coords` column holding a `coords.SkyCoord` object
 * `paec` is the PAEC table
 * Returns:
   * `urls` -- list with one string per target, holding HTML links to the HST program status pages of the matching proposals. Empty string if there is no match.
   * `propids` -- dictionary of matching proposal ids keyed by the row index in `targets`; no entry in the dictionary if there is no match
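To make the match criterion concrete, here is a pure-Python sketch of an angular-separation test with the same 200 arcsec radius that `duplications` passes to astropy. It uses the haversine formula as a stand-in; it is not how astropy performs the search internally, and the function names are made up for illustration.

```python
import math

ARCSEC_PER_RADIAN = 180.0 / math.pi * 3600.0


def separation_arcsec(ra1_deg, dec1_deg, ra2_deg, dec2_deg):
    """Great-circle separation of two sky positions, in arcseconds (haversine)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1_deg, dec1_deg, ra2_deg, dec2_deg))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    # Clamp to 1.0 to guard against rounding just above the asin domain
    return 2 * math.asin(min(1.0, math.sqrt(h))) * ARCSEC_PER_RADIAN


def is_possible_duplicate(ra1_deg, dec1_deg, ra2_deg, dec2_deg, radius_arcsec=200.0):
    """True if the two positions lie within radius_arcsec of each other."""
    return separation_arcsec(ra1_deg, dec1_deg, ra2_deg, dec2_deg) <= radius_arcsec
```

Note that a fixed offset in RA corresponds to a smaller separation at high declination, which is why the test works on the great-circle distance rather than on raw coordinate differences.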

In [8]:
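def propurl(propid):
    # Build an HTML link to the HST program status page for one proposal id.
    # The argument is named propid to avoid shadowing the built-in id().
    base="<a href=http://www.stsci.edu/cgi-bin/get-proposal-info?id="
    suffix="&observatory=HST"
    return "%s%d%s> %d </a>" % (base,propid,suffix,propid)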

def dup_urls(targets,propids):
    # One space-separated string of links per target; "" when there is no match
    urls = []
    for i in range(len(targets)):
        urls.append(" ".join(propurl(p) for p in propids.get(i,[])))
    return urls

def duplications(targets,paec):
    # search_around_sky returns indices into its argument (paec) first,
    # then indices into the calling object (targets)
    idx_paec, idx_targ, d2d, d3d = targets['coords'].search_around_sky(paec['coords'],200*u.arcsec)
    propids = {}
    for id_targ,id_paec in zip(idx_targ,idx_paec):
        # Collect the proposal ids matched to each target
        propids.setdefault(id_targ,[]).append(paec['prop'][id_paec])
    for p in propids:
        # Drop repeated proposal ids
        propids[p] = list(set(propids[p]))
    urls = dup_urls(targets,propids)
    return urls, propids
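The grouping step inside `duplications` can be looked at in isolation. The sketch below uses plain Python lists with made-up indices and proposal ids in place of the arrays returned by astropy; `group_matches` is a hypothetical name. Unlike the function above, it sorts each list so the result is deterministic.

```python
from collections import defaultdict


def group_matches(target_indices, paec_indices, paec_props):
    """Map each matched target index to its sorted, de-duplicated proposal ids."""
    groups = defaultdict(set)
    for i_targ, i_paec in zip(target_indices, paec_indices):
        groups[i_targ].add(paec_props[i_paec])
    return {i_targ: sorted(props) for i_targ, props in groups.items()}


# Made-up data: four PAEC rows, three of which match target 0 and one target 2
paec_props = [9683, 10242, 9683, 8249]
matches = group_matches([0, 0, 0, 2], [0, 1, 2, 3], paec_props)
print(matches)  # {0: [9683, 10242], 2: [8249]}
```

Target 1 has no matches in this toy input, so it has no key in the result, mirroring the "no entry if there is no match" behaviour described for `propids`.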

Example usage
----
First create a table with a couple of sources. In practice you would read in your own table, probably using `Table.read`. Then add a SkyCoord column for the coordinates.

In [9]:
data_rows = [('IC10',5.072250,59.303780),
             ('Abell209',22.95901,-13.591956)
            ]
my_catalog = Table(rows=data_rows,names=['name','RA','Dec'])
my_catalog['coords']=coords.SkyCoord(my_catalog['RA'],my_catalog['Dec'],
                                     unit=(u.deg, u.deg),frame='icrs')

Read in the PAEC pickled file

In [10]:
with open('paec.p','rb') as fp:
    paec = pickle.load(fp)

Do the checking

In [11]:
urls,propids = duplications(my_catalog,paec)

Show the dictionary

In [12]:
propids

{0: [14073, 10242, 9683], 1: [8249, 12451]}

Format the catalog for the notebook, hacking the URL fields so they are clickable

In [17]:
my_catalog['paec']=urls
ipy_html = my_catalog.show_in_notebook()
ipy_html.data = ipy_html.data.replace('&lt;','<')
ipy_html.data = ipy_html.data.replace('&gt;','>')
ipy_html

| idx | name | RA | Dec | coords (deg,deg) | paec |
|---|---|---|---|---|---|
| 0 | IC10 | 5.07225 | 59.30378 | 5.07225,59.30378 | 14073 10242 9683 |
| 1 | Abell209 | 22.95901 | -13.591956 | 22.95901,-13.591956 | 8249 12451 |
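The two `replace` calls in the formatting cell work because the rendered table contains the link markup with its angle brackets HTML-escaped. The sketch below illustrates the idea with the standard-library `html.escape` standing in for whatever escaping the table renderer applies (an assumption); `unescape_anchor_tags` is a hypothetical name.

```python
import html


def unescape_anchor_tags(escaped):
    """Undo the escaping of angle brackets only, as the notebook cell does."""
    return escaped.replace("&lt;", "<").replace("&gt;", ">")


link = ("<a href=http://www.stsci.edu/cgi-bin/get-proposal-info?id=14073"
        "&observatory=HST> 14073 </a>")
escaped = html.escape(link)            # '<' becomes '&lt;', '>' becomes '&gt;', '&' becomes '&amp;'
restored = unescape_anchor_tags(escaped)
# The ampersand in the query string stays as '&amp;', which a browser
# decodes back to '&' when it reads the href attribute.
```

Restoring only the angle brackets is enough to turn the text back into a live anchor tag while leaving other escaped characters alone.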
