# PANOPTES Utils Data Explorer

The tools in the `panoptes.utils.data` module are designed to help you easily find and start using PANOPTES data.

The module primarily offers an interface for searching the observation metadata from the PANOPTES network. It also offers convenient methods for downloading the raw data.

In [1]:
import sys

import numpy as np
import holoviews as hv
from holoviews import opts
import pandas as pd
from astropy.coordinates import SkyCoord

from panoptes.utils.data import search_observations
from panoptes.utils.data import get_metadata
from panoptes.utils.images import fits as fits_utils
from panoptes.utils.images import crop_data
from panoptes.utils.logger import logger

hv.extension('bokeh')

# Set up the logger for notebook viewing
logger.enable('panoptes')
logger.remove()
_ = logger.add(sys.stdout, format='<{level}> {message}')

In [2]:
# Holoviews styling options
opts.defaults(
    opts.Image(cmap='viridis', tools=['hover'], width=400, height=400),
    opts.Labels(text_color='white', text_font_size='8pt', text_align='left', text_baseline='bottom'),
)

### Search for observations

We can search for observations in a variety of ways. Here we just look up the M42 coordinates and search within the default 5° radius.

In [3]:
# Resolve the M42 coordinates by name (requires network access)
m42_coords = SkyCoord.from_name('M42')

search_results = search_observations(coords=m42_coords, min_num_images=10)

<DEBUG> Getting new firestore client
<DEBUG> Setting up search params
<DEBUG> Searching for observations
<DEBUG> Found 439 observations
<DEBUG> Filtering observations
<DEBUG> Found 60 observations after filtering
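To make the idea of a search radius concrete, here is a minimal, dependency-free sketch of a great-circle separation check. The helpers `angular_separation_deg` and `within_radius` are hypothetical names used for illustration and are not part of `panoptes.utils`; how `search_observations` applies its radius internally is not shown in this notebook. The two pointings are copied from rows of the `search_results` sample shown further down.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between two sky positions (inputs in degrees)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula, which stays numerically stable for small separations.
    sin_ddec = math.sin((dec2 - dec1) / 2)
    sin_dra = math.sin((ra2 - ra1) / 2)
    a = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def within_radius(center, target, radius_deg=5.0):
    """True if target (ra, dec) lies within radius_deg of center (ra, dec)."""
    return angular_separation_deg(*center, *target) <= radius_deg


# (ra, dec) pointings in degrees for an M42 row and a FlameNebula row.
m42_pointing = (83.8232, -5.393048)
flame_pointing = (85.452651, -1.845487)

separation = angular_separation_deg(*m42_pointing, *flame_pointing)
print(f'Separation: {separation:.2f} deg')
print(f'Within 5 deg: {within_radius(m42_pointing, flame_pointing)}')
```

In a notebook that already uses `astropy`, `SkyCoord.separation` gives the same quantity directly.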


### Visualize

The `search_results` DataFrame has been automatically wrapped by [`hvplot`](https://hvplot.holoviz.org/), meaning you can easily visualize the results.

In [5]:
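# Group by unit and field and sum the metrics.
# Note: only the counts and exposure times are meaningful as sums; the summed ra, dec, and iso are not.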
field_sums = search_results.groupby(['unit_id', 'field_name']).sum().reset_index()
field_sums.sample(5)

| | unit_id | field_name | dec | exptime | iso | num_images | ra | total_minutes_exptime |
|---|---|---|---|---|---|---|---|---|
| 0 | PAN001 | FlameNebula | -27.722065 | 899.4 | 1500 | 436 | 1281.313925 | 436.0 |
| 2 | PAN001 | TESS_SEC06_CAM01 | 4.525611 | 120.3 | 100 | 57 | 93.483361 | 114.0 |
| 4 | PAN006 | M42 | -27.991933 | 601.5 | 500 | 111 | 447.295011 | 222.0 |
| 3 | PAN001 | Wasp 35 | -87.426748 | 1680.6 | 1400 | 688 | 1069.846135 | 1376.0 |
| 5 | PAN006 | Wasp 35 | -13.99781 | 360.8 | 300 | 101 | 240.284605 | 202.0 |
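Summing every numeric column also adds up coordinates, which has no physical meaning. As a rough, dependency-free sketch of the alternative, the snippet below totals only the additive columns per unit and field. The rows are copied from the `search_results` sample in this notebook, and `ADDITIVE_COLUMNS` is a name chosen here for illustration.

```python
from collections import defaultdict

# A few rows copied from the search_results sample in this notebook.
rows = [
    {'unit_id': 'PAN001', 'field_name': 'M42', 'num_images': 17, 'total_minutes_exptime': 34.0},
    {'unit_id': 'PAN001', 'field_name': 'M42', 'num_images': 13, 'total_minutes_exptime': 26.0},
    {'unit_id': 'PAN006', 'field_name': 'M42', 'num_images': 40, 'total_minutes_exptime': 80.0},
    {'unit_id': 'PAN001', 'field_name': 'Wasp 35', 'num_images': 57, 'total_minutes_exptime': 114.0},
]

# Only these columns make sense as sums (hypothetical constant for this sketch).
ADDITIVE_COLUMNS = ('num_images', 'total_minutes_exptime')

# Accumulate one running total per (unit_id, field_name) group.
totals = defaultdict(lambda: dict.fromkeys(ADDITIVE_COLUMNS, 0))
for row in rows:
    key = (row['unit_id'], row['field_name'])
    for column in ADDITIVE_COLUMNS:
        totals[key][column] += row[column]

for (unit_id, field_name), sums in sorted(totals.items()):
    print(unit_id, field_name, sums)
```

With pandas, the same idea is to select the columns before summing, for example `search_results.groupby(['unit_id', 'field_name'])[['num_images', 'total_minutes_exptime']].sum()`.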


In [4]:
# Look at a random set
search_results.sample(5)

| | camera_id | dec | exptime | field_name | iso | num_images | project | ra | received_time | sequence_id | software_version | status | time | total_minutes_exptime | unit_id |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 211 | ee04d1 | -5.393048 | 120.3 | M42 | 100 | 17 | Project PANOPTES | 83.8232 | 2020-04-17 08:36:02.404000+00:00 | PAN001_ee04d1_20181015T132754 | POCSv0.6.2 | receiving_files | 2018-10-15 13:27:54+00:00 | 34.0 | PAN001 |
| 225 | 14d3bd | -5.344291 | 120.0 | M42 | 100 | 13 | Project PANOPTES | 83.84606 | 2020-04-17 09:10:28.723000+00:00 | PAN001_14d3bd_20181011T134202 | POCSv0.6.2 | receiving_files | 2018-10-11 13:42:02+00:00 | 26.0 | PAN001 |
| 256 | 7bab97 | -3.764367 | 120.2 | M42 | 100 | 40 | Project PANOPTES | 88.81163 | 2020-04-17 04:50:24.165000+00:00 | PAN006_7bab97_20181005T071422 | POCSv0.6.2 | receiving_files | 2018-10-05 07:14:22+00:00 | 80.0 | PAN006 |
| 192 | 14d3bd | -6.225509 | 120.0 | Wasp 35 | 100 | 57 | Project PANOPTES | 76.045046 | 2020-04-17 08:48:00.464000+00:00 | PAN001_14d3bd_20181022T094546 | POCSv0.6.2 | receiving_files | 2018-10-22 09:45:46+00:00 | 114.0 | PAN001 |
| 296 | 14d3bd | -1.845487 | 59.9 | FlameNebula | 100 | 30 | Project PANOPTES | 85.452651 | 2020-04-17 08:11:13.072000+00:00 | PAN001_14d3bd_20191016T135841 | POCSv0.6.2 | receiving_files | 2019-10-16 13:58:41+00:00 | 30.0 | PAN001 |


In [7]:
search_results.hvplot.scatter(
    x='ra',
    y='dec',
    by='field_name',
    ylabel='Dec [deg]',
    xlabel='RA [deg]',
)

### Filter to needs

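Not all of the search results will be relevant. Here we see that the search radius covers several fields besides M42. Let's keep only the rows whose `field_name` is `M42`; the cells below refer to that subset as `m42_df`.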

In [9]:
print(f'Total minutes exptime: {m42_df.total_minutes_exptime.sum()}')

Total minutes exptime: 988.0


### Get image metadata for all observations

We currently have the observation-level metadata for all the observations we are interested in. Next we want the image metadata associated with each of these observations.

In [10]:
sequence_ids = m42_df.sequence_id.tolist()
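Judging by the sample output, each `sequence_id` appears to be built from the unit ID, the camera ID, and the observation start time. The following is a small sketch of splitting one apart under that assumption. `SequenceInfo` and `parse_sequence_id` are hypothetical helpers written for this example, not functions from `panoptes.utils`.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class SequenceInfo(NamedTuple):
    unit_id: str
    camera_id: str
    start_time: datetime


def parse_sequence_id(sequence_id):
    """Split 'UNIT_CAMERA_YYYYMMDDTHHMMSS' into its three parts."""
    unit_id, camera_id, timestamp = sequence_id.split('_')
    # The sample output lists these times with a +00:00 offset, so treat them as UTC.
    start_time = datetime.strptime(timestamp, '%Y%m%dT%H%M%S').replace(tzinfo=timezone.utc)
    return SequenceInfo(unit_id, camera_id, start_time)


# A sequence_id copied from the search_results sample in this notebook.
info = parse_sequence_id('PAN001_14d3bd_20181011T134202')
print(info.unit_id, info.camera_id, info.start_time.isoformat())
```

This can be handy for grouping or sorting a list of sequence IDs by unit or by night before requesting their metadata.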

In [11]:
images_df = get_metadata(sequence_id=sequence_ids)

<DEBUG> Getting new firestore client
<DEBUG> Getting images metadata for observation=PAN001_14d3bd_20180919T142354
<DEBUG> Getting images metadata for observation=PAN001_14d3bd_20180929T133352
<DEBUG> Getting images metadata for observation=PAN001_14d3bd_20181001T131909
<DEBUG> Getting images metadata for observation=PAN006_7bab97_20181005T071422
<DEBUG> Getting images metadata for observation=PAN001_14d3bd_20181010T135222
<DEBUG> Getting images metadata for observation=PAN001_14d3bd_20181011T134202
<DEBUG> Getting images metadata for observation=PAN006_6575fc_20181014T075926
<DEBUG> Getting images metadata for observation=PAN001_ee04d1_20181015T132754
<DEBUG> Getting images metadata for observation=PAN001_ee04d1_20181015T142232
<DEBUG> Getting images metadata for observation=PAN006_6575fc_20181019T090734
<DEBUG> Getting images metadata for observation=PAN006_6575fc_20181022T081839
<DEBUG> Getting images metadata for observation=PAN001_14d3bd_20181023T141457
<DEBUG> Getting images meta

In [10]:
# Alternative: fetch the metadata one observation at a time, then concatenate
images = list()
for seq_id in m42_df.sequence_id:
    images.append(get_metadata(sequence_id=seq_id))
    
images_df = pd.concat(images)

CPU times: user 3 µs, sys: 1e+03 ns, total: 4 µs
Wall time: 5.72 µs


In [57]:
# num_images is an observation-level column, so plot it from m42_df rather than images_df
m42_df.hvplot.bar(
    x='sequence_id',
    y='num_images'
)

### Get the actual images

We now have the image metadata and want to download the FITS files for the images that have been successfully solved. The loop below expects `m42_fits_files` to be a list of the URLs of those files.

We can use the `fits_utils.getdata` utility function to get both the data and the header from a URL.

#### Fetch and crop data

Here we are going to just grab the first 10 images, crop the center, then visualize.

In [15]:
data_list = list()
header_list = list()

# Loop through the first 10 files.
for url in m42_fits_files[:10]:
    # Fetch the remote data and header
    d0, h0 = fits_utils.getdata(url, header=True)
    
    # Crop
    d1 = crop_data(d0)
    
    # Store
    data_list.append(d1)
    header_list.append(h0)
    
m42_data = np.array(data_list)    

<DEBUG> Using center: 1738 2604
<DEBUG> Box width: 200
<DEBUG> Using center: 1738 2604
<DEBUG> Box width: 200
<DEBUG> Using center: 1738 2604
<DEBUG> Box width: 200
<DEBUG> Using center: 1738 2604
<DEBUG> Box width: 200
<DEBUG> Using center: 1738 2604
<DEBUG> Box width: 200
<DEBUG> Using center: 1738 2604
<DEBUG> Box width: 200
<DEBUG> Using center: 1738 2604
<DEBUG> Box width: 200
<DEBUG> Using center: 1738 2604
<DEBUG> Box width: 200
<DEBUG> Using center: 1738 2604
<DEBUG> Box width: 200
<DEBUG> Using center: 1738 2604
<DEBUG> Box width: 200
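The log lines above report a center and a box width for each crop. The sketch below shows one way such a center crop can be computed with plain slices. It is an illustration only: `center_crop_slices` is a hypothetical helper, treating `box_width` as the full side length is an assumption about `crop_data`, and the 3476 x 5208 frame shape is inferred from the logged center rather than stated anywhere in this notebook.

```python
def center_crop_slices(shape, box_width=200, center=None):
    """Return (row_slice, col_slice) for a square box of box_width pixels.

    Assumption: box_width is the full side length of the box; crop_data
    itself may define it differently.
    """
    n_rows, n_cols = shape
    if center is None:
        center = (n_rows // 2, n_cols // 2)
    half = box_width // 2
    row_c, col_c = center
    # Clamp to the image bounds so a box near an edge cannot wrap around.
    row_slice = slice(max(row_c - half, 0), min(row_c + half, n_rows))
    col_slice = slice(max(col_c - half, 0), min(col_c + half, n_cols))
    return row_slice, col_slice


# An assumed 3476 x 5208 frame has its center at (1738, 2604), as in the log.
rows, cols = center_crop_slices((3476, 5208), box_width=200)
print(rows, cols)

# The same slices work on nested lists, shown here on a tiny 8 x 10 frame.
tiny = [[r * 10 + c for c in range(10)] for r in range(8)]
r_sl, c_sl = center_crop_slices((8, 10), box_width=4)
cropped = [row[c_sl] for row in tiny[r_sl]]
print(cropped)
```

On a NumPy array the two slices can be applied in one step as `data[rows, cols]`.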


In [16]:
# Create some domain and range axes for our Dataset. Note they are reversed.
axes = list()
for dim in reversed(m42_data.shape):
    axes.append(np.arange(dim))

In [74]:
# Create the holoviews dataset
ds = hv.Dataset(
    (*axes, m42_data),
    ['x', 'y', 'Time'], 
    'Image'
)

In [76]:
# Visualize the first 10 frames.
ds.to(hv.Image, ['x', 'y']).hist()

### Going on...

We can see that the center of the image is saturated here because our exposure time is too long.

However, our list of observations contains sets that have shorter exposure times. What we should do is select some sets at various different exposures and intelligently combine them to bring out the best image we can.

This is left as an exercise for the reader. ;)