# A workflow for calculating SHARP keywords directly from magnetic field maps using Dask

The SHARP data series contain patches of vector magnetic field data (that is, all three components of the surface magnetic field) taken by the Helioseismic and Magnetic Imager (HMI) instrument on NASA's Solar Dynamics Observatory (SDO). These patches enclose automatically detected active regions that are tracked throughout their disk passage. A bitmap array, called `bitmap`, indicates whether a given pixel in a patch is part of an active region. Another array, called `conf_disambig`, describes the confidence in the azimuthal component of the magnetic field vector. The SHARP data series also include keywords, stored as metadata, that describe various physical parameters of solar active regions; for example, the total unsigned flux within an active region.

In this notebook, we will calculate these SHARP keywords directly from the magnetic field maps using Dask.

First, we import some modules (note that this assumes that [dask-jobqueue](https://jobqueue.dask.org/en/latest/install.html) is already installed and the [Jupyter notebook is properly configured for interactive use](https://jobqueue.dask.org/en/latest/interactive.html)).

In [1]:
import dask
import drms
import numpy as np
import math
from astropy.io import fits
from datetime import datetime as dt_obj
from dask.distributed import Client
import dask.array as da
import calculate_swx_fits as swx

Define some constants useful for calculating space weather keywords:

In [2]:
radsindeg = np.pi/180.            # radians per degree
munaught  = 0.0000012566370614    # vacuum permeability, 4*pi*1e-7 (SI units)

Now, start a Dask client. In this case, we're running Dask locally on a personal machine, so [scheduling](https://docs.dask.org/en/latest/scheduling.html#) is relatively straightforward. The Dask client starts up a dashboard for monitoring the machine and computational processes.

In [3]:
dask_client = Client()
dask_client

Client: Scheduler: tcp://127.0.0.1:51699, Dashboard: http://127.0.0.1:8787/status
Cluster: Workers: 4, Cores: 8, Memory: 17.18 GB


Fetch the magnetic field image data using the SunPy-affiliated package [`drms`](https://joss.theoj.org/papers/10.21105/joss.01614).

In [4]:
drms_client = drms.Client()

In [5]:
keys, segments = drms_client.query(
    'hmi.sharp_cea_720s[377][2011.02.14_15:00:00/12h]',
    key='T_REC, CDELT1, RSUN_REF, RSUN_OBS, DSUN_OBS, USFLUX, ERRVF, CMASK',
    seg='Br, Bp, Bt, Br_err, Bp_err, Bt_err, bitmap, conf_disambig')

In [6]:
def parse_tai_string(tstr, datetime=True):
    """Parse a T_REC string of the form YYYY.MM.DD_hh:mm:ss_TAI; seconds are ignored."""
    year   = int(tstr[:4])
    month  = int(tstr[5:7])
    day    = int(tstr[8:10])
    hour   = int(tstr[11:13])
    minute = int(tstr[14:16])
    if datetime:
        return dt_obj(year, month, day, hour, minute)
    return year, month, day, hour, minute

In [7]:
t_rec = np.array([parse_tai_string(keys.T_REC[i],datetime=True) for i in range(keys.T_REC.size)])
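To make the slicing in `parse_tai_string` concrete, here is a minimal standalone sketch. The two sample strings are hypothetical `T_REC` values written in the format the function expects (they are not read from JSOC), and the `strptime` call is shown only as an equivalent alternative, not as the notebook's method.

```python
from datetime import datetime

def parse_tai_string(tstr):
    """Parse 'YYYY.MM.DD_hh:mm:ss_TAI' into a naive datetime, dropping the seconds."""
    return datetime(int(tstr[:4]), int(tstr[5:7]), int(tstr[8:10]),
                    int(tstr[11:13]), int(tstr[14:16]))

# Hypothetical T_REC strings one 720 s cadence step apart
samples = ['2011.02.14_15:00:00_TAI', '2011.02.14_15:12:00_TAI']
times = [parse_tai_string(s) for s in samples]

# Equivalent parse using strptime on the first 19 characters (the '_TAI' suffix is cut off)
alt = [datetime.strptime(s[:19], '%Y.%m.%d_%H:%M:%S') for s in samples]

print(times[1] - times[0])  # 0:12:00
```

Note that the result is a naive `datetime` that carries the TAI clock reading as-is; no conversion to UTC is performed.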

As an example, read one magnetic field map (the radial component, `Br`, of the first record) into a Dask array:

In [8]:
url = 'http://jsoc.stanford.edu' + segments.Br[0]
bz = da.from_array(fits.getdata(url))
bz

|       | Array      | Chunk         |
|-------|------------|---------------|
| Bytes | 2.24 MB    | 2.24 MB       |
| Shape | (377, 744) | (377, 744)    |
| Count | 1 Tasks    | 1 Chunks      |
| Type  | float64    | numpy.ndarray |
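The output above reports a single chunk, so the whole map is treated as one block. As a rough sketch of how chunking changes this, the example below splits an array of the same shape into several blocks and reduces over them. The array is filled with random numbers as a stand-in for a `Br` map, and the chunk size of 128 by 256 pixels is an arbitrary choice for illustration, not a recommendation.

```python
import numpy as np
import dask.array as da

# Synthetic stand-in for a (377, 744) Br map; random values, not HMI data
rng = np.random.default_rng(0)
fake_br = rng.normal(size=(377, 744))

# Ask Dask to split the map into blocks of at most 128 x 256 pixels
br_chunked = da.from_array(fake_br, chunks=(128, 256))
print(br_chunked.numblocks)  # (3, 3)

# Build a lazy task graph for the sum of |Br|, then evaluate it
total_abs = abs(br_chunked).sum()
result = total_abs.compute()
```

For an array this small a single chunk is probably the sensible choice, since per-task overhead can outweigh any gain from parallelism; chunking matters more for much larger arrays or stacks of many maps.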


In [9]:
def get_data_for_a_single_trec(i):
    """Fetch the arrays and keywords needed for the flux calculation at record i."""
    # Radial field component and its uncertainty
    url = 'http://jsoc.stanford.edu' + segments.Br[i]
    bz = da.from_array(fits.getdata(url))
    url = 'http://jsoc.stanford.edu' + segments.Br_err[i]
    bz_err = da.from_array(fits.getdata(url))
    # Disambiguation confidence map and active-region bitmap
    url = 'http://jsoc.stanford.edu' + segments.conf_disambig[i]
    mask = da.from_array(fits.getdata(url))
    url = 'http://jsoc.stanford.edu' + segments.bitmap[i]
    bitmask = da.from_array(fits.getdata(url))
    # Array dimensions: axis 0 is y (rows), axis 1 is x (columns)
    nx = bz.shape[1]
    ny = bz.shape[0]
    rsun_ref = keys.RSUN_REF[i]
    rsun_obs = keys.RSUN_OBS[i]
    cdelt1 = keys.CDELT1[i]
    dsun_obs = keys.DSUN_OBS[i]
    # Convert the pixel scale from heliographic degrees to arcseconds as seen from SDO
    cdelt1_arcsec = (math.atan((rsun_ref*cdelt1*radsindeg)/(dsun_obs)))*(1/radsindeg)*(3600.)
    return bz, bz_err, mask, bitmask, nx, ny, rsun_ref, rsun_obs, cdelt1_arcsec
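The `cdelt1_arcsec` expression is easier to follow when broken into steps: `rsun_ref*cdelt1*radsindeg` is the linear size of one pixel on the solar surface, and dividing by `dsun_obs` and taking the arctangent gives the angle that pixel subtends at the observer. The sketch below walks through this with illustrative keyword values that are assumptions chosen to be of a plausible magnitude, not values read from the query.

```python
import math

radsindeg = math.pi / 180.0

# Illustrative values only (assumptions, not read from JSOC)
rsun_ref = 6.96e8     # reference solar radius, in meters
cdelt1 = 0.03         # pixel scale, in heliographic degrees per pixel
dsun_obs = 1.476e11   # observer-Sun distance, in meters

# Step 1: linear size of one pixel on the solar surface, in meters
pixel_size_m = rsun_ref * cdelt1 * radsindeg

# Step 2: angle subtended by that length at the observer, converted to arcseconds
cdelt1_arcsec = math.atan(pixel_size_m / dsun_obs) / radsindeg * 3600.0

print(round(pixel_size_m / 1e3, 1), 'km per pixel')
print(round(cdelt1_arcsec, 3), 'arcsec per pixel')
```

With these inputs the pixel comes out at a few hundred kilometers across and about half an arcsecond on the sky.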

In [None]:
mean_vf_list = []
for i in range(10):
    bz, bz_err, mask, bitmask, nx, ny, rsun_ref, rsun_obs, cdelt1_arcsec = get_data_for_a_single_trec(i)
    mean_vf, mean_vf_err, count_mask  = swx.compute_abs_flux(bz, bz_err, mask, bitmask, nx, ny, rsun_ref, rsun_obs, cdelt1_arcsec)
    print(mean_vf)
    mean_vf_list.append(mean_vf)

In [None]:
mean_vf_list

In [None]:
print(mean_vf, mean_vf_err, count_mask)

In [None]:
print(keys.USFLUX[0:10],keys.ERRVF[0],keys.CMASK[0])
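The source of `swx.compute_abs_flux` is not shown in this notebook, so the following is only a sketch of how a total unsigned flux could be computed from the same inputs. The function name `unsigned_flux_sketch`, the thresholds of 70 on `conf_disambig` and 30 on `bitmap`, and the conversion of the pixel area to square centimeters are all assumptions made for illustration; they may differ from what `calculate_swx_fits` actually does. The input arrays are tiny made-up examples, and `rsun_obs` is assumed to be in arcseconds.

```python
import numpy as np

def unsigned_flux_sketch(bz, mask, bitmask, cdelt1_arcsec, rsun_ref, rsun_obs,
                         conf_min=70, bitmap_min=30):
    """Sum |Bz| over selected pixels and multiply by the pixel area in cm^2.

    conf_min and bitmap_min are assumed thresholds, not confirmed by the notebook.
    """
    # Keep pixels with high disambiguation confidence, inside the region, and finite
    good = (mask >= conf_min) & (bitmask >= bitmap_min) & np.isfinite(bz)
    # arcsec/pixel * (meters per arcsec) * (100 cm per m), then squared for area
    pixel_area_cm2 = (cdelt1_arcsec * rsun_ref / rsun_obs * 100.0) ** 2
    flux = np.abs(bz[good]).sum() * pixel_area_cm2
    return flux, int(good.sum())

# Made-up 2 x 2 inputs: one NaN pixel and one low-confidence pixel get excluded
bz = np.array([[100.0, -200.0], [np.nan, 50.0]])   # Gauss
mask = np.array([[90, 90], [90, 60]])
bitmask = np.array([[33, 34], [33, 33]])

flux, count = unsigned_flux_sketch(bz, mask, bitmask,
                                   cdelt1_arcsec=0.5, rsun_ref=6.96e8, rsun_obs=960.0)
print(flux, count)
```

A function like this also works on Dask arrays in principle, though boolean indexing produces chunks of unknown size there; writing the sum as `(np.abs(bz) * good).sum()` after replacing NaNs is one way to sidestep that.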