(chap:daoclean)=
# Identifying Optical Counterparts to XRBs

Now that we've identified all *HST* optical point sources with `DaoFind` and calculated the best positions and 2-$\sigma$ radii of each *CXO* X-ray source, we can finally select all candidate optical counterparts for each XRB. This is done by isolating all sources that fall within the 2-$\sigma$ radius of each X-ray source. One can do so manually, but the easiest way is to run `Sources.DaoClean()`. 

(sec:daoclean)=
## Isolating Candidate Counterparts with `DaoClean`
Before running `DaoClean`, place the coordinates of each `DaoFind` point source into a new `DataFrame`. This can be done by running `DataFrameMod.BuildFrame()` or `Sources.LoadSources()` on the `.ecsv` data file produced by `DaoFind` (mine was saved as, for example, `photometry_M101_f555w_acs_full.ecsv`), *but only if you did not apply an additional shift to your region files*. I applied a shift during the `RunPhots()` phase because, for reasons unknown, the region files it produced did not align well with the *HST* images. Because my region files are shifted, and because my X-ray coordinates were aligned to these shifted regions, I need to build the `DataFrame` from the coordinates in those shifted region files. While we're at it, it's a good idea to compile both the `img` and `fk5` coordinates into a single `DataFrame`, for greater flexibility later on.

In [4]:
from os import chdir as cd
from os import getcwd as pwd
from XRBID.Sources import LoadSources, GetCoords, GetIDs, DaoClean
from XRBID.DataFrameMod import BuildFrame

x,y = GetCoords("M101_daofind_f555w_acs_img.reg")
ids = GetIDs("M101_daofind_f555w_acs_img.reg")
ra,dec = GetCoords("M101_daofind_f555w_acs_fk5.reg")

# Compiling into a single DataFrame
DaoFrame = BuildFrame(headers=['DaoID','X','Y','RA','Dec'], 
                      values=[ids,x,y,ra,dec])
DaoFrame.to_csv("M101_daofind_acs_coords.frame")
display(DaoFrame)

Retrieving coordinates from M101_daofind_f555w_acs_img.reg
Retrieving IDs from M101_daofind_f555w_acs_img.reg
Retrieving coordinates from M101_daofind_f555w_acs_fk5.reg


Unnamed: 0,DaoID,X,Y,RA,Dec
0,1,7710.069326,3655.196224,210.888379,54.225405
1,2,7712.935688,3688.774732,210.888312,54.225872
2,3,7926.745455,3696.670351,210.883232,54.225984
3,4,7740.506306,3722.354818,210.887658,54.226338
4,5,7802.588601,3728.903381,210.886183,54.226430
...,...,...,...,...,...
178097,178098,8939.411035,19364.831601,210.859372,54.443607
178098,178099,9082.172967,19379.774137,210.855963,54.443815
178099,178100,8920.492988,19385.066555,210.859824,54.443888
178100,178101,8947.054147,19392.263104,210.859190,54.443988


```{important}
`DaoClean` allows the user to attempt the radius search using the `fk5` (degree) coordinate system. However, because of how this system works, the search may return unexpected results (e.g. returning `DaoClean` sources within an ellipse around each XRB instead of a circle). This is most likely because, away from the celestial equator, one degree of right ascension spans a smaller angle on the sky than one degree of declination. It is recommended that one use the `img` coordinate system instead, which returns more predictable results.
```
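To give a sense of the size of this effect, here is a minimal sketch using only the standard library. It assumes the search compares plain differences in RA and Dec against the same radius in degrees, which is an assumption about the cause rather than a description of what `DaoClean` does internally. The function name `fk5_radius_extent` and the 1-arcsecond radius are made up for illustration.

```python
import math

def fk5_radius_extent(radius_deg, dec_deg):
    """Return the on-sky (RA, Dec) extent, in arcseconds, of a search that
    applies the same radius in degrees along both coordinate axes."""
    dec_extent = radius_deg * 3600.0
    # One degree of RA covers only cos(Dec) degrees on the sky
    ra_extent = radius_deg * 3600.0 * math.cos(math.radians(dec_deg))
    return ra_extent, dec_extent

# Hypothetical 1-arcsecond search radius at roughly the declination of M101
ra_extent, dec_extent = fk5_radius_extent(1.0 / 3600.0, 54.35)
print(f"RA extent: {ra_extent:.3f} arcsec, Dec extent: {dec_extent:.3f} arcsec")
```

At this declination the two extents differ noticeably, so a "circle" defined in degrees is squeezed along the RA direction on the sky.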

As noted above, it is best to run `DaoClean` using the `img` coordinate system, which gives coordinates and radii in pixel units relative to the reference image. In `DS9`, open one of the region files containing the corrected coordinates of the X-ray sources and resave it using image coordinates. Then read in the image coordinates and save them to your X-ray source `DataFrame`.

In [14]:
M101_best = LoadSources("M101_csc_bestrads.frame")

xsources, ysources = GetCoords("M101_bestrads_2sig_img.reg")
M101_best["X"] = xsources
M101_best["Y"] = ysources

# Convert the radius from arcseconds to pixels by dividing by the pixel scale:
# 0.05 arcsec/pixel for ACS/WFC and 0.03962 arcsec/pixel for WFC3/UVIS
M101_best["2Sig (pix)"] = M101_best["2Sig"] / 0.05 
M101_best.to_csv("M101_csc_bestrads.frame")
display(M101_best)

Reading in sources from M101_csc_bestrads.frame...
Retrieving coordinates from M101_bestrads_2sig_img.reg


Unnamed: 0,Separation,CSC ID,RA,Dec,ExpTime,Theta,Err Ellipse Major,Err Ellipse Minor,Err Ellipse Angle,Significance,...,MS Ratio lolim,MS Ratio hilim,Counts,Counts lolim,Counts hilim,1Sig,2Sig,x,y,2Sig (pix)
0,0.835778,2CXO J140312.5+542056,210.802227,54.348952,98379.672624,1.280763,0.296164,0.295254,89.606978,22.302136,...,-0.616490,-0.485322,234.492581,216.678502,252.306660,0.493011,0.988256,11333.8500,12549.3440,19.765113
1,1.951564,2CXO J140312.7+542055,210.803345,54.348663,49085.676020,4.725025,0.548072,0.380093,86.863933,2.648649,...,-1.000000,-0.715178,209.970149,192.843759,227.096539,0.516471,1.038077,11286.9600,12528.5350,20.761533
2,2.480949,2CXO J140312.5+542053,210.802221,54.348072,98379.672624,1.321652,0.296164,0.295733,84.016491,19.554042,...,-0.364147,-0.265459,252.846172,234.295673,271.396671,0.492957,0.988086,11334.1110,12486.0170,19.761726
3,5.227586,2CXO J140313.1+542052,210.804553,54.347994,132129.439441,2.762119,0.880650,0.578584,29.467809,3.421053,...,-0.973766,-0.718926,23.229213,16.216620,30.241806,0.561749,1.188360,11236.2880,12480.3160,23.767200
4,9.124404,2CXO J140313.5+542053,210.806667,54.348188,98379.672624,1.410741,0.633314,0.466131,58.539300,1.739130,...,-1.000000,-0.370394,7.040343,2.992146,11.088540,0.593981,1.300029,11147.5470,12494.3280,26.000572
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
550,883.411465,2CXO J140138.8+541527,210.411860,54.257747,132129.439441,12.087047,8.647887,6.103176,111.468541,4.277778,...,-0.932542,-0.617739,76.034629,52.273807,99.795450,1.921653,4.517698,27753.7720,6032.4759,90.353960
551,884.529187,2CXO J140450.6+541721,211.211172,54.289330,139822.610581,9.951487,1.357801,1.057531,95.305118,8.976584,...,-0.472829,-0.270456,155.562531,135.448493,174.718757,0.793067,1.558488,-5851.2446,8301.6458,31.169764
552,886.380054,2CXO J140216.3+543313,210.567813,54.553731,131222.705704,13.887674,2.480733,2.445301,115.654637,6.783292,...,-0.322923,-0.118051,149.738677,121.454705,176.358886,1.676834,3.103807,21117.9750,27312.5690,62.076131
553,890.405834,2CXO J140445.4+541452,211.189426,54.247897,139819.410649,10.703677,4.581313,2.651713,135.472019,4.411765,...,0.501562,1.000000,,,,0.487466,0.974933,-4952.8736,5313.5560,19.498651
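
If you work with more than one instrument, the arcsecond-to-pixel conversion can be wrapped in a small helper so the pixel scale is not hard-coded in each cell. This is only a sketch: `PIXEL_SCALE` and `arcsec_to_pixels` are hypothetical names, not part of `XRBID`, and the two pixel scales are the ones quoted in the cell above.

```python
# Pixel scales in arcseconds per pixel, as quoted in the cell above
PIXEL_SCALE = {"ACS/WFC": 0.05, "WFC3/UVIS": 0.03962}

def arcsec_to_pixels(radius_arcsec, instrument="ACS/WFC"):
    """Convert an angular radius in arcseconds to pixels for an instrument."""
    return radius_arcsec / PIXEL_SCALE[instrument]

# The 2-sigma radius of the first X-ray source in the table above
radius_pix = arcsec_to_pixels(0.988256)
print(radius_pix)
```

Because the function only divides by a number, it should work equally well on a whole `DataFrame` column, e.g. `arcsec_to_pixels(M101_best["2Sig"])`.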


Now everything should be ready to run through `DaoClean`: 

In [16]:
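# NOTE: WriteReg is assumed here to be importable from XRBID.WriteScript
from XRBID.WriteScript import WriteReg

DaoCleanFrame = DaoClean(daosources=DaoFrame, sources=M101_best, sourceid="CSC ID", 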
                         coordsys="img", coordheader=['X','Y'], radheader="2Sig (pix)") 

display(DaoCleanFrame)

# NOTE: there is currently an issue with writing region files with 'img' coordinate system. 
# Instead, using the fk5 coordinate system to save the region file.
WriteReg(DaoCleanFrame, coordsys='fk5', width=2, outfile='M101_daoclean_f555w_acs_fk5.reg')

TypeError: DaoClean() got an unexpected keyword argument 'sourceid'

Check the results of the search by plotting both the 2-$\sigma$ radii of the X-ray sources and the `DaoClean` sources on the *HST* image in `DS9`. If the search appears to be missing sources, as though the search radius were too small, you can increase the search radius with the `wiggle` parameter of `DaoClean()` (in pixels or degrees). This makes the search a little less strict, so you can include sources whose centroids fall *just* outside of the 2-$\sigma$ radius of each X-ray source, if needed.
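
To illustrate what a radius search with some extra tolerance looks like, here is a minimal, self-contained sketch in pixel coordinates. It is not the `XRBID` implementation: the function `within_radius` is hypothetical, the coordinates are made up, and treating `wiggle` as a simple addition to the radius is an assumption.

```python
import math

def within_radius(points, center, radius, wiggle=0.0):
    """Return the IDs of points whose centroid lies within radius + wiggle
    of center. points holds (id, x, y) tuples; all units are pixels."""
    cx, cy = center
    return [pid for pid, x, y in points
            if math.hypot(x - cx, y - cy) <= radius + wiggle]

# Made-up point sources around a made-up X-ray position
points = [(1, 100.0, 100.0), (2, 112.0, 100.0), (3, 100.0, 121.0), (4, 140.0, 140.0)]
center = (100.0, 100.0)

strict = within_radius(points, center, radius=20.0)
relaxed = within_radius(points, center, radius=20.0, wiggle=2.0)
print(strict, relaxed)
```

Source 3 sits 21 pixels from the center, so it is only picked up once the 2-pixel tolerance is added.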

(sec:crossref)=
## Cross-referencing `DaoFind` Sources Across Filters
`RunPhots()` was used to create separate lists of point sources in each of the *HST* filters ({ref}`sec:runphots`). In order to do a full color analysis of the optical counterparts of each X-ray source, we need to figure out which point sources in each filter are associated with the same object. This can be done with `Sources.Crossref()`.

```{note}
`Sources.Crossref()` isn't actually very different, functionally, from `Sources.DaoClean()`, except that `Crossref` will search through multiple region files (AKA catalogs) at once and add as many extra lines to the `DataFrame` as needed to accommodate all possible associations. Also, `Crossref` only keeps the coordinates and ID of each source, along with the IDs of sources that may be associated with it in the other catalogs. It is not recommended to use `Crossref` instead of `DaoClean` to find daosources associated with each XRB, because several sources may be found within the 2-$\sigma$ radius of each XRB, and the way the IDs of point sources in each catalog are organized may give the false impression that they are associated with one another across catalogs (e.g. in a single line of the resulting `DataFrame`, you could have an F555W ID from one nearby point source, an F435W ID from another, and an F814W ID from a third). This is only mitigated by reducing the search radius, which I do in `Crossref` below.
```

In [None]:
from XRBID.Sources import Crossref

DaoCleanMatch = Crossref(DaoCleanFrame, regions=['M101_daofind_f435w_acs_img.reg',
                                                 'M101_daofind_f814w_acs_img.reg'], 
                         catalogs=['F435W', 'F814W'], coordheads=['X','Y'], 
                         sourceid="DaoID", outfile="M101_daoclean_matches.frame")

# Renaming the DaoID header to "F555W ID", to match the output of the other filters
DaoCleanMatch = DaoCleanMatch.rename(columns={'DaoID': 'F555W ID'})
DaoCleanMatch

Note that each source above may be associated with more than one source in one of the catalogs (region files) it's being cross-referenced with. For example, a source that's saturated in one filter may appear as multiple separate sources in another filter. `Crossref` keeps track of every possible association, so `DaoCleanMatch` will likely end up longer than `DaoCleanFrame`. Keeping the search radius low (the default is 3 pixels) limits these multiple associations as much as possible, which is what we want in this case.
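
A quick way to see which sources picked up more than one association is to count how many rows each ID occupies. The sketch below uses a made-up list of IDs and assumes the F555W ID is repeated on each extra line that `Crossref` adds; with the real table you would pass something like `DaoCleanMatch["DaoID"]` instead of the hypothetical `f555w_ids` list.

```python
from collections import Counter

# Hypothetical F555W IDs, one entry per row of a cross-matched table
f555w_ids = [101, 102, 102, 103, 103, 103]

# Count the rows per ID and keep the IDs that appear more than once
rows_per_id = Counter(f555w_ids)
ambiguous = sorted(i for i, n in rows_per_id.items() if n > 1)
print(ambiguous)
```

IDs that appear in this list are the ones worth inspecting by eye in `DS9` before trusting their colors.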

It's also useful for the next step to add the CSC ID back onto the `DataFrame`, which can be done as follows:

In [None]:
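# Adding the CSC ID of the X-ray source that each daosource is associated with
# NOTE: Find is assumed here to be importable from XRBID.DataFrameMod
from XRBID.DataFrameMod import Find

DaoCleanMatch["CSC ID"] = None  # create the column before filling it row by row
for i in range(len(DaoCleanMatch)): 
    # Searches for the CSC ID associated with each F555W ID (DaoID in DaoCleanFrame)
    tempid = DaoCleanMatch["F555W ID"][i] 
    tempcsc = Find(DaoCleanFrame, "DaoID = " + str(tempid))["CSC ID"].iloc[0]
    # .loc avoids pandas chained assignment, which may silently fail to write
    DaoCleanMatch.loc[i, "CSC ID"] = tempcsc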
    
DaoCleanMatch
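
For large tables, the same lookup can be done without a Python loop. The following is a sketch in plain `pandas` on toy data, not a feature of `XRBID`: the frames `clean` and `match` are made-up stand-ins for `DaoCleanFrame` and `DaoCleanMatch`, and, like the loop above, it keeps only the first CSC ID found for each `DaoID`.

```python
import pandas as pd

# Toy stand-ins for DaoCleanFrame and DaoCleanMatch (all values are made up)
clean = pd.DataFrame({"DaoID": [11, 12, 13], "CSC ID": ["A", "A", "B"]})
match = pd.DataFrame({"F555W ID": [11, 12, 12, 13], "F435W ID": [5, 6, 7, 8]})

# Build a DaoID -> CSC ID lookup, keeping the first CSC ID for each DaoID
lookup = clean.drop_duplicates("DaoID").set_index("DaoID")["CSC ID"]

# Attach the CSC ID to every row of the cross-matched table in one step
match["CSC ID"] = match["F555W ID"].map(lookup)
print(match)
```

Any F555W ID that is missing from the lookup ends up as `NaN`, which makes unmatched rows easy to spot.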

Next, we need to pull the photometry associated with each ID in each filter. This can be done with `Sources.GetDaoPhots()`. 

In [None]:
GetDaoPhots(df, daofiles, 