_This notebook was developed by [Keneth Garcia](https://stivengarcia7113.wixsite.com/kenethgarcia). Source and license info are on [GitHub](https://github.com/KenethGarcia/ClassiPyGRB)._

# Swift Data Download
Results for the Gamma-Ray Bursts (GRBs) detected by the Burst Alert Telescope (BAT) on board the Neil Gehrels Swift Observatory are presented on [this website](https://swift.gsfc.nasa.gov/results/batgrbcat/) (open access).

This notebook summarizes how to download these data at different resolutions. Throughout this document, we use the _Python 3_ implementations from the _ClassiPyGRB_ package. You need an internet connection and a _Jupyter Notebook_/_Python 3_ environment.

First, we need to import the _SWIFT_ class of _ClassiPyGRB_ into our notebook (along with some other packages used here):

In [1]:
from ClassiPyGRB import SWIFT
# Packages needed in this notebook
import os
import pandas as pd
import numpy as np

## Changing the Swift GRB binning
By default, the data are downloaded with the 64 ms binning of Swift. Some cases call for a different data resolution or binning; the package handles them through the _resolution_ argument $res$.

Through **ClassiPyGRB**, you can set $res$ to $2$, $8$, $16$, $64$, or $256$ ms. Additionally, you can set $res=1000$ for 1 s binning and $res=10000$ for data binned to a signal-to-noise ratio higher than 5 or to 10 s (the 10 s data do not have uniform time spacing).
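
As a quick reference, the values listed above can be gathered in a small lookup. This is only an illustrative sketch: `RES_LABELS` and `describe_resolution` are hypothetical names that are not part of ClassiPyGRB, and the labels merely restate the paragraph above.

```python
# Hypothetical lookup (not part of ClassiPyGRB): maps each `res` value
# mentioned above to a human-readable description of the binning.
RES_LABELS = {
    2: "2 ms",
    8: "8 ms",
    16: "16 ms",
    64: "64 ms (default)",
    256: "256 ms",
    1000: "1 s",
    10000: "S/N > 5 or 10 s (non-uniform time spacing)",
}


def describe_resolution(res: int) -> str:
    """Return a description for `res`, or raise ValueError if it is not listed."""
    try:
        return RES_LABELS[res]
    except KeyError:
        valid = ", ".join(str(r) for r in sorted(RES_LABELS))
        raise ValueError(f"res must be one of: {valid}") from None


print(describe_resolution(16))
```

A check like this can catch a mistyped resolution before any download starts.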

**ClassiPyGRB** offers a high level of customization for where data are saved. You can set the following paths:
- $root\_path$ (str): Main path to save data/results from SWIFT. The only mandatory path, required for the SWIFT class to work.
- $original\_data\_path$ (str, optional): Path to save non-manipulated data from SWIFT. Defaults to Original\_Data folder inside data\_path.
- $noise\_data\_path$ (str, optional): Path to save noise-reduced data from SWIFT. Defaults to Noise\_Filtered\_Data folder inside data\_path.
- $results\_path$ (str, optional): Path to save results from SWIFT. Defaults to Results folder inside root\_path.
- $noise\_images\_path$ (str, optional): Path to save noise-reduced plots from SWIFT. Defaults to Noise\_Filter\_Images folder inside results\_path.
- $table\_path$ (str, optional): Path to save tables from SWIFT. Defaults to Tables folder inside root\_path.

However, the simplest use of **ClassiPyGRB** sets only $root\_path$ as the main folder and saves both data and results under it:


In [2]:
swift = SWIFT(root_path=r'type-your-path-here', res=16)
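
When only $root\_path$ is given, the defaults listed above imply a folder tree that can be sketched with `pathlib`. This is an illustration, not the package's code: `default_layout` is a hypothetical helper, and the location of `data_path` (a `Data` folder inside `root_path`) is an assumption, since the list above does not state its default.

```python
from pathlib import Path


def default_layout(root_path: str) -> dict:
    """Sketch of the default folder tree described above (illustrative only)."""
    root = Path(root_path)
    data = root / "Data"  # assumption: data_path defaults to <root_path>/Data
    results = root / "Results"
    return {
        "root_path": root,
        "original_data_path": data / "Original_Data",
        "noise_data_path": data / "Noise_Filtered_Data",
        "results_path": results,
        "noise_images_path": results / "Noise_Filter_Images",
        "table_path": root / "Tables",
    }


# Print the sketched layout for a made-up root folder name.
for key, folder in default_layout("my_swift_folder").items():
    print(f"{key:20s} -> {folder}")
```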

Now, there are two approaches to downloading data from Swift/BAT: download a single GRB or the complete dataset.

## Single data download
If you would like to download data for a single GRB, for example GRB060614, you can use the `single_download` method. It takes only the GRB name as an argument. As the check in the cell below assumes, it returns `None` when the download succeeds or, if the request fails, a string with details about the error:

In [3]:
name = 'GRB060614'  # Change this name if you want another GRB
result_GRB = swift.single_download(name)
if result_GRB is None:  # No error message means the download succeeded
    print(f"{name} has been downloaded")
else:
    print(f"Error downloading {name} data: {result_GRB}")

GRB060614 has been downloaded


A remark at this point: **for some GRBs, no data are available due to Swift technical problems**. As of June 27, 2022, (at least) 22 GRBs have this issue for the 64 ms binning: _170131A, 160623A, 070125, 060123, 160409A, 140611A, 131031A, 130913A, 130518A, 120817B, 110604A, 101204A, 090827, 090720A, 071112C, 071028B, 071010C, 071006, 070227, 140909A, and 041219A._ If you get the _Not Found for url_ error, this may be the cause.
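
If you prefer to skip these bursts up front, the names listed above can be filtered out of any list before downloading. The following is a minimal sketch in plain Python; `KNOWN_MISSING` and `split_known_missing` are hypothetical names (not part of ClassiPyGRB), and the set only reflects the names listed in this notebook for the 64 ms binning.

```python
# GRBs listed above as having no 64 ms data (as of June 27, 2022).
KNOWN_MISSING = {
    "170131A", "160623A", "070125", "060123", "160409A", "140611A",
    "131031A", "130913A", "130518A", "120817B", "110604A", "101204A",
    "090827", "090720A", "071112C", "071028B", "071010C", "071006",
    "070227", "140909A", "041219A",
}


def split_known_missing(names):
    """Split GRB names into (available, known_missing), keeping input order."""
    available, missing = [], []
    for name in names:
        # Names may carry the "GRB" prefix (e.g. "GRB060614"); drop it to compare.
        key = name[3:] if name.startswith("GRB") else name
        (missing if key in KNOWN_MISSING else available).append(name)
    return available, missing


available, missing = split_known_missing(["GRB060614", "GRB170131A", "GRB070125"])
print("To download:", available)
print("Known to be missing:", missing)
```

Since the list may change over time or differ between resolutions, treat it as a convenience filter, not as a definitive catalog.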

## Multiple data download

The `multiple_downloads` method works like `single_download`, but takes an array of GRB names instead of a single name. Additionally, a boolean argument named `error` indicates whether to save an error report file. Let's get the GRB names from the Swift summary table:


In [4]:
df = swift.summary_table()  # Obtain Summary Table
GRB_names = df['GRBname']  # Extract column with GRB Names
print(GRB_names)

0       GRB220715B
1       GRB220714B
2       GRB220711B
3       GRB220708A
4       GRB220706A
           ...    
1522     GRB041220
1523    GRB041219C
1524    GRB041219B
1525    GRB041219A
1526     GRB041217
Name: GRBname, Length: 1527, dtype: object


To download the entire GRB dataset, you need only one line of code:

In [5]:
swift.multiple_downloads(GRB_names)

Downloading: 100%|██████████| 1527/1527 [03:29<00:00,  7.30GRB/s]


Now, in the Data folder that was created, you can see the `Original_Data` subfolder and an errors summary file (named "Errors_64ms.txt" for the 64 ms binning). Reading this file, we can check how many errors there are:

In [6]:
df_error = pd.read_table(os.path.join(swift.original_data_path, f"Errors_{swift.end}.txt"), sep='\t', comment='#', names=['GRB Name', 'Error'], header=None)
GRB_errors = np.array(df_error['GRB Name'])
print(df_error)

      GRB Name                                              Error
0   GRB170131A  404 Client Error: Not Found for url: https://s...
1   GRB160623A  404 Client Error: Not Found for url: https://s...
2   GRB160409A  404 Client Error: Not Found for url: https://s...
3   GRB150407A  404 Client Error: Not Found for url: https://s...
4   GRB140909A  404 Client Error: Not Found for url: https://s...
5   GRB140611A  404 Client Error: Not Found for url: https://s...
6   GRB131031A  404 Client Error: Not Found for url: https://s...
7   GRB130913A  404 Client Error: Not Found for url: https://s...
8   GRB130518A  404 Client Error: Not Found for url: https://s...
9   GRB120817B  404 Client Error: Not Found for url: https://s...
10  GRB110604A  404 Client Error: Not Found for url: https://s...
11  GRB101204A  404 Client Error: Not Found for url: https://s...
12   GRB090827  404 Client Error: Not Found for url: https://s...
13  GRB090720A  404 Client Error: Not Found for url: https://s...
14  GRB071

If the errors summary file contains _HTTPSConnectionPool_ or _HDF5ExtError_ entries, you can rerun the following lines as many times as needed:

In [7]:
match = np.where(np.isin(GRB_names, GRB_errors))[0]  # Index the IDs of GRB Errors
swift.multiple_downloads(np.array(GRB_names[match]))  # Try to re-download the GRBs

Downloading: 100%|██████████| 22/22 [00:01<00:00, 15.78GRB/s]
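
Not every failure is worth retrying: a _Not Found for url_ error usually means the data do not exist, while _HTTPSConnectionPool_ or _HDF5ExtError_ entries may succeed on a second attempt. A minimal sketch of that selection is shown below; `retry_candidates` is a hypothetical helper, and the sample messages are made up for illustration.

```python
def retry_candidates(errors):
    """Return the GRB names whose error message looks transient.

    `errors` is an iterable of (grb_name, error_message) pairs, for example
    the rows of the errors summary file.
    """
    transient_markers = ("HTTPSConnectionPool", "HDF5ExtError")
    return [
        name
        for name, message in errors
        if any(marker in message for marker in transient_markers)
    ]


# Made-up sample rows, only to illustrate the selection.
sample = [
    ("GRB170131A", "404 Client Error: Not Found for url: ..."),
    ("GRB060614", "HTTPSConnectionPool: illustrative timeout message"),
]
print(retry_candidates(sample))
```

With the table read above, the pairs could be built as `zip(df_error['GRB Name'], df_error['Error'])`, and the returned names passed to `multiple_downloads`.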


We close this section by noting that the original data take up approximately 2.4 GB of disk space. To be precise:

In [8]:
size = 0  # Set size variable to zero
for path, dirs, files in os.walk(swift.original_data_path):  # Loop over the folder containing all data downloaded
    for f in files:  # Loop over files into folder
        fp = os.path.join(path, f)  # Join file name with folder path
        size += os.stat(fp).st_size  # Get file size and sum over previous size
print(f"There are {round(size / (1024 * 1024), 3)} MB of data")

There are 2492.048 MB of data
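
The same computation can be wrapped in a reusable helper. The sketch below uses `pathlib` instead of `os.walk`; `folder_size_mb` is a hypothetical name, and the demo runs on a temporary folder so that it works without any downloaded data.

```python
import tempfile
from pathlib import Path


def folder_size_mb(folder) -> float:
    """Total size, in MB (1 MB = 1024 * 1024 bytes), of all files under `folder`."""
    total_bytes = sum(p.stat().st_size for p in Path(folder).rglob("*") if p.is_file())
    return total_bytes / (1024 * 1024)


# Self-contained demo on a temporary folder holding one 2 MB file.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "sub").mkdir()
    (Path(tmp) / "sub" / "demo.dat").write_bytes(b"\0" * 2 * 1024 * 1024)
    print(f"There are {folder_size_mb(tmp):.3f} MB of data")
```

In the notebook, you could call `folder_size_mb(swift.original_data_path)` to reproduce the figure above.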
