# The SoHO/MDI and SDO/HMI Line-of-sight Magnetogram Dataset

In this notebook we demonstrate how to interact with a small sample of the co-temporal line-of-sight SoHO/MDI and SDO/HMI dataset.

*Paul Wright*

---

## Introduction

Over the last 50 years, space- and ground-based instruments have imaged the magnetic field on the surface of the Sun. These images, known as magnetograms, have significantly advanced our understanding of solar magnetism and its role in space weather. Unfortunately, most instruments are operational for roughly one solar cycle (11 years), which prevents long-term studies.

Over this period, two notable space-based instruments have been paramount in improving our understanding of these processes:

### SoHO/MDI Overview

Prior to 2010, the Michelson Doppler Imager (MDI; Scherrer et al. 1995) on board the Solar and Heliospheric Observatory (SoHO) was the community's flagship space-based magnetic field imager. MDI is HMI’s predecessor, with a resolution of 4 arcseconds (2 arcseconds/pixel). By comparison, its successor HMI has a resolution of 1 arcsecond (0.5 arcsecond/pixel) and takes one magnetogram every 45 seconds.

MDI and HMI have an overlapping set of 4,850 magnetograms. MDI is no longer operational, but it was designed as a scientific instrument. A successful calibration between the two enables the joint use of both instruments from 1996 until the present (>24 years).


### SDO/HMI Overview

Since its launch in 2010, NASA’s Solar Dynamics Observatory (SDO; [Pesnell et al. 2012](https://ui.adsabs.harvard.edu/abs/2012SoPh..275....3P/abstract)) has continuously monitored the Sun's activity, delivering a wealth of valuable scientific data for heliophysics researchers with the use of three instruments.

The Helioseismic and Magnetic Imager (HMI; [Schou et al. 2012](https://ui.adsabs.harvard.edu/abs/2012SoPh..275..229S/abstract)) was launched as part of SDO in 2010. HMI is a successor to MDI and captures visible-wavelength filtergrams of the full Sun at 4096 x 4096 pixels (a pixel size of 0.5 arcsecond), which are then processed into a number of data products, including photospheric Dopplergrams, line-of-sight magnetograms, and vector magnetograms ([Hoeksema et al. 2014](https://ui.adsabs.harvard.edu/abs/2014SoPh..289.3483H/abstract)).


## The co-temporal magnetogram dataset

This dataset contains pre-processed magnetograms covering 1995 - 2021, with co-temporal, co-aligned data during the one year of overlap (2010 - 2011).

These data have been:

* Cleaned based on `QUALITY` flags
* Rotated to solar North
* Normalised in solar radius to account for the instruments residing in different orbits
* De-rotated to a common observer location
* Temporally aligned
* Registered on a per-patch basis to account for a distortion map that exists between the two instruments
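
As a rough illustration of the radius normalisation step listed above, the sketch below computes the zoom factor that would bring an image to a common solar-disk radius in pixels. The function name, the target radii, and the numerical values are illustrative assumptions; the notebook does not state the values or the interpolation scheme used to build the dataset.

```python
def radius_scale_factor(rsun_obs_arcsec, pixel_scale_arcsec, target_radius_pix):
    """Zoom factor that makes the solar disk span `target_radius_pix` pixels.

    Hypothetical helper: the arguments mirror the apparent solar radius and
    the pixel scale that an instrument header typically provides.
    """
    current_radius_pix = rsun_obs_arcsec / pixel_scale_arcsec
    return target_radius_pix / current_radius_pix


# Illustrative (not measured) values for the two instruments
hmi_factor = radius_scale_factor(960.0, 0.5, target_radius_pix=1920.0)
mdi_factor = radius_scale_factor(970.0, 2.0, target_radius_pix=480.0)

print(hmi_factor, mdi_factor)
```

A factor below one shrinks the image and a factor above one enlarges it, so that the disk occupies the same number of pixels regardless of the observer's distance from the Sun.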

*The dataset is currently in preparation for publication, and will be accessible from the Google Cloud Platform.*

### Who is the dataset for?

The sheer volume of structured scientific data obtained by SDO and SoHO is ideal for a range of machine learning tasks, from time-series analysis to computer vision (such as super-resolution tasks). Magnetic field images are unique in that the distribution of values is non-Gaussian, centered around zero, and, importantly, *unbounded* at both high and low magnetic field values.

While the HMI and MDI data are easily accessible, pre-processing these data for scientific analysis requires specialised heliophysics (and instrument-specific) knowledge as each instrument observes the Sun from a different location in space. This may act as an unnecessary hurdle for non-heliophysics machine learning researchers who may wish to experiment with datasets from the physical sciences, but are unaware of domain-specific nuances (e.g., that images must be spatially and temporally adjusted).

**Our aim is to supply this standardised co-temporal, co-aligned dataset for heliophysicists who wish to use machine learning in their own research, as well as machine-learning researchers who wish to develop models specialised for the physical sciences.**

---

## Table of Contents

The notebook is set out as follows:

1. Setting up the notebook <br>
2. Loading SDO/HMI & SoHO/MDI from JSOC <br>
    2a. Generating a SunPy map <br>
3. Reading and Loading the ML dataset <br>
4. Visualise the ML dataset

## 1. Setting up the notebook

In [None]:
# sunpy isn't included on Colab
!pip install "sunpy[all]"

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import time

import os
from os import listdir
from os.path import isfile, join

import sunpy
import sunpy.map
from sunpy.net import Fido, attrs as a
import matplotlib.colors as colors
import skimage.transform

## 2. Loading SDO/HMI & SoHO/MDI from JSOC


In [None]:
# Query the data (using a registered e-mail address)
jsoc_email = 'paul@wrightai.com'

In [None]:
mdi_data = Fido.search(
    a.Time('2011/03/31 01:36:00', '2011/03/31 01:37:00'),
    a.jsoc.Series.mdi_fd_m_96m_lev182,
    a.jsoc.Notify(jsoc_email),
)
dwnld_mdi = Fido.fetch(mdi_data[0])

hmi_data = Fido.search(
    a.Time('2011/03/31 01:34:00', '2011/03/31 01:35:00'),
    a.jsoc.Series.hmi_m_720s,
    a.jsoc.Notify(jsoc_email),
)
dwnld_hmi = Fido.fetch(hmi_data[0])

Because each telescope is in a different orbit, SDO/HMI and SoHO/MDI observations are not co-aligned by default. This can be easily demonstrated as follows:

In [None]:
import astropy.io.fits as fits

### 2a. Generating a SunPy map

In [None]:
mdi_map = sunpy.map.Map(dwnld_mdi[0])
mdi_rotated = mdi_map.rotate(order=3)

hmi_map = sunpy.map.Map(dwnld_hmi[0])
hmi_rotated = hmi_map.rotate(order=3)

In [None]:
hmimag = plt.get_cmap('hmimag')

In [None]:
fig = plt.figure(figsize=(16,6))

ax = fig.add_subplot(121, projection=hmi_rotated)
hmi_rotated.plot(
    norm=colors.Normalize(vmin=-1400,vmax=1400), 
    cmap=hmimag
)
hmi_rotated.draw_limb()
plt.colorbar()

ax = fig.add_subplot(122, projection=mdi_rotated)
mdi_rotated.plot(
    norm=colors.Normalize(vmin=-1400,vmax=1400), 
    cmap=hmimag
)
mdi_rotated.draw_limb()
plt.colorbar()

**Figure 1:** Full-disk SDO/HMI and SoHO/MDI magnetograms on 31st March 2011, after rotation to solar North. These images are shown prior to any further alignment.

In [None]:
import astropy.units as u
from astropy.coordinates import SkyCoord

In [None]:
bottom_left = SkyCoord(453 * u.arcsec, -225 * u.arcsec, frame=hmi_rotated.coordinate_frame)
width = 60 * u.arcsec
height = 60 * u.arcsec

hmisub = hmi_rotated.submap(bottom_left, width=width, height=height)
mdisub = mdi_rotated.submap(bottom_left, width=width, height=height)

In [None]:
# order=3 requests bi-cubic interpolation (the default, order=1, is bi-linear)
bicubic_sub = skimage.transform.resize(mdisub.data, hmisub.data.shape, order=3)
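
The upsampling above follows from the pixel scales quoted in the introduction. As a small worked example, assuming the nominal scales of 2 arcseconds/pixel for MDI and 0.5 arcsecond/pixel for HMI, a 60 arcsecond wide patch spans roughly the pixel counts computed below. The actual submap shapes may differ by a pixel or so, since the real pixel scales are not exactly the nominal values and the patch edges rarely fall on pixel boundaries.

```python
# Nominal pixel scales quoted in the text (arcseconds per pixel)
hmi_scale = 0.5
mdi_scale = 2.0

# Width of the patch requested from both maps (arcseconds)
patch_width = 60.0

upsampling_factor = mdi_scale / hmi_scale  # linear factor between the two grids
hmi_pixels = patch_width / hmi_scale       # approximate HMI patch width in pixels
mdi_pixels = patch_width / mdi_scale       # approximate MDI patch width in pixels

print(upsampling_factor, hmi_pixels, mdi_pixels)  # 4.0 120.0 30.0
```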

In [None]:
# NOTE: plot_images is defined in Section 4; run that cell before this one
plot_images(hmisub.data, mdisub.data, bicubic_sub, [-300,300])

**Figure 2:** *Left and center*: Example pair of (non-aligned) patches from SDO/HMI and SoHO/MDI magnetograms. *Right*: Bi-cubic upsampled SoHO/MDI magnetogram with the corresponding SDO/HMI contours (+/- 300 Gauss) overplotted to highlight the misalignment.
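
The misalignment seen in Figure 2 can also be quantified. The following is a minimal sketch, not the registration method used to build the dataset: it estimates an integer pixel offset between two equally sized patches from the peak of their circular cross-correlation. The helper name `estimate_shift` is hypothetical, and the demonstration uses synthetic data rather than the magnetograms above.

```python
import numpy as np


def estimate_shift(reference, moving):
    """Integer (dy, dx) such that np.roll(moving, (dy, dx), axis=(0, 1))
    best matches `reference`, using circular cross-correlation."""
    ref = reference - reference.mean()
    mov = moving - moving.mean()
    # Cross-correlation theorem: correlate via a product in Fourier space
    xcorr = np.fft.ifft2(np.fft.fft2(ref) * np.conj(np.fft.fft2(mov))).real
    peak = np.unravel_index(np.argmax(xcorr), xcorr.shape)
    # Peaks beyond half the image size correspond to negative shifts
    return tuple(int(p) if p <= s // 2 else int(p - s)
                 for p, s in zip(peak, xcorr.shape))


# Synthetic check: shift a random image by a known amount and recover it
rng = np.random.default_rng(0)
image = rng.normal(size=(32, 32))
shifted = np.roll(image, (3, -5), axis=(0, 1))
print(estimate_shift(shifted, image))  # (3, -5)
```

With the patches from this section, a call such as `estimate_shift(hmisub.data, bicubic_sub)` would give an approximate offset in HMI pixels, provided both arrays have the same shape and contain no NaN values.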

---

## 3. Reading and Loading the ML dataset

In [None]:
path = '/media/paul/data/mdi-hmi/train/2011/3/31/01'
onlyfiles = [f for f in listdir(path) if isfile(join(path, f))]
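
Once the file names are listed, HMI and MDI patches need to be matched to each other. The sketch below pairs files by their trailing patch index. It assumes the naming pattern `<INSTRUMENT>_<YYYYMMDD>-<HHMMSS>_<patch>.npy`, inferred from the example file names used later in this notebook; the helper `pair_by_patch` is hypothetical and not part of the dataset tooling.

```python
import re
from collections import defaultdict

# Assumed pattern: <INSTRUMENT>_<YYYYMMDD>-<HHMMSS>_<patch>.npy
PATTERN = re.compile(
    r'^(?P<instrument>[A-Za-z-]+)_(?P<stamp>\d{8}-\d{6})_(?P<patch>\d+)\.npy$'
)


def pair_by_patch(filenames):
    """Map patch index -> (HMI file, MDI file) for patches that have both."""
    by_patch = defaultdict(dict)
    for name in filenames:
        match = PATTERN.match(name)
        if match is None:
            continue  # skip anything that does not follow the assumed pattern
        instrument = match.group('instrument')
        key = 'MDI' if instrument.startswith('MDI') else instrument
        by_patch[int(match.group('patch'))][key] = name
    return {
        patch: (files['HMI'], files['MDI'])
        for patch, files in sorted(by_patch.items())
        if 'HMI' in files and 'MDI' in files
    }


example_files = [
    'HMI_20110331-013417_641.npy',
    'MDI-NEW_20110331-013626_641.npy',
    'HMI_20110331-013417_407.npy',
    'MDI-NEW_20110331-013626_407.npy',
]
pairs = pair_by_patch(example_files)
print(pairs[407])
```

In the notebook, `pair_by_patch(onlyfiles)` would return one entry per patch index that has both an HMI and an MDI file in `path`.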

## 4. Visualise the ML dataset

In [None]:
def plot_images(one, two, three, levels):
    fig = plt.figure(figsize=(17,4))
    ax = fig.add_subplot(131)

    plt.imshow(one, norm=colors.Normalize(vmin=-1400,vmax=1400), cmap=hmimag)
    plt.colorbar().set_label('Gauss')

    ax = fig.add_subplot(132)
    plt.imshow(two, norm=colors.Normalize(vmin=-1400,vmax=1400), cmap=hmimag)
    plt.colorbar().set_label('Gauss')

    ax = fig.add_subplot(133)
    plt.imshow(three, norm=colors.Normalize(vmin=-1400,vmax=1400), cmap=hmimag)
    plt.colorbar().set_label('Gauss')
    #plt.contour(data[0], colors='black', levels=[-240,240]);
    CS = ax.contour(one, colors='black', levels=levels)
    ax.clabel(CS, inline=True, fontsize=8)

def plot_images_radius(one, two, three, levels):
    fig = plt.figure(figsize=(17,4))
    ax = fig.add_subplot(131)

    plt.imshow(one, norm=colors.Normalize(vmin=-1400,vmax=1400), cmap=hmimag)
    plt.colorbar().set_label('Gauss')

    ax = fig.add_subplot(132)
    plt.imshow(two, norm=colors.Normalize(vmin=-1400,vmax=1400), cmap=hmimag)
    plt.colorbar().set_label('Gauss')

    ax = fig.add_subplot(133)
    plt.imshow(three)
    CS = ax.contour(three, colors='black', levels=levels)
    ax.clabel(CS, inline=True, fontsize=8)
    plt.colorbar().set_label('Radius [Solar Radii]')

In [None]:
hmi = np.load(join(path, 'HMI_20110331-013417_641.npy'))
mdi = np.load(join(path, 'MDI-NEW_20110331-013626_641.npy'))

In [None]:
plot_images_radius(hmi[0], mdi[0], mdi[1], [1.00])

**Figure 3:** Example pair of co-aligned, co-temporal SDO/HMI and SoHO/MDI magnetograms (left, center), alongside the corresponding SoHO/MDI location channel (right), with the solar radius highlighted. As the Sun is split into ?? x ?? patches, each patch contains a corresponding location channel in units of solar radius.
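
To make the location channel concrete, the sketch below shows one way such a channel could be computed for a patch: the distance of every pixel from the disk centre, divided by the disk radius in pixels. The function name, arguments, and numbers are assumptions for illustration; the notebook does not describe how the channel in the dataset was generated.

```python
import numpy as np


def radius_channel(shape, origin, disk_centre, disk_radius_pix):
    """Distance from disk centre, in solar radii, for each pixel of a patch.

    shape           -- (rows, cols) of the patch
    origin          -- (row, col) of the patch's first pixel in the full image
    disk_centre     -- (row, col) of the solar disk centre in the full image
    disk_radius_pix -- solar radius in pixels
    """
    rows = origin[0] + np.arange(shape[0])
    cols = origin[1] + np.arange(shape[1])
    yy, xx = np.meshgrid(rows, cols, indexing='ij')
    return np.hypot(yy - disk_centre[0], xx - disk_centre[1]) / disk_radius_pix


# Hypothetical patch whose first pixel sits exactly on the limb
channel = radius_channel((32, 32), origin=(512, 992),
                         disk_centre=(512, 512), disk_radius_pix=480.0)
print(channel[0, 0])  # 1.0
```

Values below 1 lie on the disk and values above 1 lie off the limb, which is why a contour at 1.00 traces the solar limb in Figure 3.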

In [None]:
hmi = np.load(join(path, 'HMI_20110331-013417_407.npy'))
mdi = np.load(join(path, 'MDI-NEW_20110331-013626_407.npy'))
# order=3 requests bi-cubic interpolation (the default, order=1, is bi-linear)
bicubic = skimage.transform.resize(mdi[0], hmi[0].shape, order=3)

In [None]:
plot_images(hmi[0], mdi[0], bicubic, [-300,300])

**Figure 4:** *Left and center*: Example pair of co-aligned, co-temporal SDO/HMI and SoHO/MDI magnetograms. *Right*: Bi-cubic upsampled SoHO/MDI magnetogram with the corresponding SDO/HMI contours (+/- 300 Gauss) overplotted to demonstrate the alignment.
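
Beyond the visual comparison in Figure 4, a single number can summarise how well two patches agree. The sketch below computes the Pearson correlation between two equally sized arrays, ignoring non-finite pixels. It is a simple assumed metric for illustration, not one the dataset authors report, and `patch_correlation` is a hypothetical helper demonstrated on synthetic data.

```python
import numpy as np


def patch_correlation(a, b):
    """Pearson correlation between two equally shaped patches,
    computed over pixels that are finite in both."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    valid = np.isfinite(a) & np.isfinite(b)
    return float(np.corrcoef(a[valid], b[valid])[0, 1])


# Synthetic check: a linearly rescaled copy correlates perfectly,
# while a spatially shifted copy of random data does not
rng = np.random.default_rng(0)
patch = rng.normal(size=(16, 16))
aligned_score = patch_correlation(patch, 2.0 * patch + 1.0)
shifted_score = patch_correlation(patch, np.roll(patch, 5, axis=1))
print(aligned_score, shifted_score)
```

Applied to the arrays above, `patch_correlation(hmi[0], bicubic)` should be noticeably higher for the co-aligned pair than the same metric computed on the non-aligned patches from Section 2.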

---