<a href="https://colab.research.google.com/github/jkanner/odw-2018/blob/master/gwpy/0%20-%20GWOSC%20python.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Query the OpenScience datasets using `gwosc`

This pre-tutorial describes how to use the [`gwosc`](//gwosc.readthedocs.io) Python module to search for GW open data information.

First, let's install it:

In [1]:
# -- For Google Colab
#! pip install 'gwosc==0.4.3'

## Querying for event information

The `gwosc.datasets` module provides tools to search for datasets, including filtering on GPS times.

For example, we can search for what event datasets are available:

In [2]:
from gwosc.datasets import find_datasets
events = find_datasets(type='event')
print(events)

['151008', '151012A', '151116', '161202', '161217', '170208', '170219', '170405', '170412', '170423', '170616', '170630', '170705', '170720', 'GW150914', 'GW151012', 'GW151226', 'GW170104', 'GW170608', 'GW170729', 'GW170809', 'GW170814', 'GW170817', 'GW170818', 'GW170823']


Here we see the list of confirmed detections (those prefixed with 'GW'); the remaining entries, named only by a date-style number, are marginal candidates rather than confirmed detections. `find_datasets` also accepts a `detector` keyword to return only those datasets that include data for that detector.

We can query for the GPS time of a given event:

In [3]:
from gwosc.datasets import event_gps
gps = event_gps('GW170817')
print(gps)

1187008882.4


<div class="alert alert-info">All of these times are returned in the GPS time system, which counts the number of seconds that have elapsed since the start of the GPS epoch at midnight (00:00) on January 6th 1980. LOSC provides a <a href="https://losc.ligo.org/gps/">GPS time converter</a> you can use to translate into datetime, or you can use <a href="https://gwpy.github.io/docs/stable/time/"><code>gwpy.time</code></a>.</div>
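
As a rough illustration of the conversion described above, the sketch below turns a GPS time into a UTC `datetime` using only the standard library. The helper name `gps_to_utc` and the constant `GPS_UTC_OFFSET` are made up for this example. The fixed 18-second leap-second offset is an assumption that only holds for times from 2017 onwards, so prefer `gwpy.time` or the online converter for general use.

```python
from datetime import datetime, timedelta

# Start of the GPS epoch: midnight on January 6th 1980 (UTC)
GPS_EPOCH = datetime(1980, 1, 6, 0, 0, 0)

# ASSUMPTION: GPS time runs ahead of UTC by a fixed 18 leap seconds,
# which is only correct for times on or after 2017-01-01
GPS_UTC_OFFSET = 18


def gps_to_utc(gps):
    """Convert a GPS time (in seconds) to a naive UTC datetime."""
    return GPS_EPOCH + timedelta(seconds=gps - GPS_UTC_OFFSET)


# GPS time of GW170817, as printed above
print(gps_to_utc(1187008882.4))
```

For events before 2017 the offset is smaller, so this sketch would be off by one or more seconds there.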

We can query for the GPS time interval for an observing run:

In [4]:
from gwosc.datasets import run_segment
print(run_segment('S6'))

(931035615, 971622015)
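
One thing you can do with such an interval is check whether a given GPS time falls inside it. The sketch below hard-codes the S6 segment and the GW170817 time printed above so that it runs offline. The helper `in_segment` is hypothetical (it is not part of `gwosc`), and treating the interval as half-open, `[start, end)`, is an assumption.

```python
def in_segment(gps, segment):
    """Return True if `gps` lies within the (start, end) segment.

    ASSUMPTION: the segment is treated as half-open, [start, end).
    """
    start, end = segment
    return start <= gps < end


s6 = (931035615, 971622015)  # run_segment('S6'), as printed above
gw170817 = 1187008882.4      # event_gps('GW170817'), as printed above

print(in_segment(gw170817, s6))  # False: GW170817 came long after S6
print(in_segment(s6[0], s6))     # True: the start time is included
```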


## Querying for data files

### Events during O1

The `gwosc.locate` module provides a function to find the URLs of data files associated with a given dataset.

For event datasets, one can get the list of URLs using only the event name:

In [5]:
from gwosc.locate import get_event_urls
urls = get_event_urls('GW150914')
print(urls)

['https://www.gw-osc.org/catalog/GWTC-1-confident/data/GW150914/H-H1_GWOSC_4KHZ_R1-1126259447-32.hdf5', 'https://www.gw-osc.org/catalog/GWTC-1-confident/data/GW150914/L-L1_GWOSC_4KHZ_R1-1126259447-32.hdf5', 'https://www.gw-osc.org/catalog/GWTC-1-confident/data/GW150914/H-H1_GWOSC_4KHZ_R1-1126257415-4096.hdf5', 'https://www.gw-osc.org/catalog/GWTC-1-confident/data/GW150914/L-L1_GWOSC_4KHZ_R1-1126257415-4096.hdf5']


By default, this function returns all of the files associated with a given event, which isn't particularly helpful. However, we can filter on any of these by using keyword arguments, for example to get the URL for the 32-second file for the LIGO-Livingston detector:

In [6]:
urls = get_event_urls('GW150914', duration=32, detector='L1')
print(urls)

['https://www.gw-osc.org/catalog/GWTC-1-confident/data/GW150914/L-L1_GWOSC_4KHZ_R1-1126259447-32.hdf5']
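
The file names in these URLs appear to follow the pattern `<observatory>-<detector>_<description>-<GPS start>-<duration>.hdf5`, so a similar selection can be made by hand. The sketch below parses the GW150914 URLs printed above with the standard library. The helper names `parse_url` and `select` are hypothetical, and this only illustrates the idea; it is not how `gwosc` implements its keyword filtering.

```python
import os.path
from urllib.parse import urlparse

# URLs for GW150914, copied from the output above
base = 'https://www.gw-osc.org/catalog/GWTC-1-confident/data/GW150914/'
urls = [
    base + 'H-H1_GWOSC_4KHZ_R1-1126259447-32.hdf5',
    base + 'L-L1_GWOSC_4KHZ_R1-1126259447-32.hdf5',
    base + 'H-H1_GWOSC_4KHZ_R1-1126257415-4096.hdf5',
    base + 'L-L1_GWOSC_4KHZ_R1-1126257415-4096.hdf5',
]


def parse_url(url):
    """Split a data file URL into (detector, GPS start, duration)."""
    name = os.path.basename(urlparse(url).path)
    stem = os.path.splitext(name)[0]
    # e.g. 'L-L1_GWOSC_4KHZ_R1-1126259447-32'
    _, description, start, duration = stem.split('-')
    detector = description.split('_')[0]
    return detector, int(start), int(duration)


def select(urls, detector=None, duration=None):
    """Keep only the URLs that match the given detector and/or duration."""
    matches = []
    for url in urls:
        det, _, dur = parse_url(url)
        if detector is not None and det != detector:
            continue
        if duration is not None and dur != duration:
            continue
        matches.append(url)
    return matches


print(select(urls, detector='L1', duration=32))
```

In practice the keyword arguments to `get_event_urls` are the simpler route; parsing by hand is mainly useful when you already have a list of URLs and no network access.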


### Events during O2

For events during O2 (and beyond), we can also get the list of URLs:

In [7]:
urls = get_event_urls('GW170817')
print(urls)

['https://www.gw-osc.org/catalog/GWTC-1-confident/data/GW170817/G-G1_GWOSC_4KHZ_R1-1187008867-32.hdf5', 'https://www.gw-osc.org/catalog/GWTC-1-confident/data/GW170817/H-H1_GWOSC_4KHZ_R1-1187008867-32.hdf5', 'https://www.gw-osc.org/catalog/GWTC-1-confident/data/GW170817/L-L1_GWOSC_4KHZ_R1-1187008867-32.hdf5', 'https://www.gw-osc.org/catalog/GWTC-1-confident/data/GW170817/V-V1_GWOSC_4KHZ_R1-1187008867-32.hdf5', 'https://www.gw-osc.org/catalog/GWTC-1-confident/data/GW170817/G-G1_GWOSC_4KHZ_R1-1187006835-4096.hdf5', 'https://www.gw-osc.org/catalog/GWTC-1-confident/data/GW170817/H-H1_GWOSC_4KHZ_R1-1187006835-4096.hdf5', 'https://www.gw-osc.org/catalog/GWTC-1-confident/data/GW170817/L-L1_GWOSC_4KHZ_R1-1187006835-4096.hdf5', 'https://www.gw-osc.org/catalog/GWTC-1-confident/data/GW170817/V-V1_GWOSC_4KHZ_R1-1187006835-4096.hdf5']


We could select one with keywords:

In [8]:
urls = get_event_urls('GW170817', duration=4096, detector='V1')
print(urls)

['https://www.gw-osc.org/catalog/GWTC-1-confident/data/GW170817/V-V1_GWOSC_4KHZ_R1-1187006835-4096.hdf5']


# Exercises

Now that you've seen examples of how to query for dataset information using the `gwosc` package, please try to complete the following exercises using that interface:

- How many months did S6 last?
- How many events were detected during O1?
- Which event releases include data for the Virgo detector?