# GOES-R Solar Irradiance
The GOES satellites have been monitoring atmospheric conditions and space weather since 1975; the GOES-R series is the current generation. They are operated by the National Oceanic and Atmospheric Administration (NOAA) with the support of the National Aeronautics and Space Administration (NASA).
https://www.goes-r.gov/

## About
[GOES-R Extreme Ultraviolet and X-ray Irradiance Sensors (EXIS)](https://www.ncei.noaa.gov/products/goes-r-extreme-ultraviolet-xray-irradiance)
>The Geostationary Operational Environmental Satellite (GOES)-R Extreme Ultraviolet and X-Ray Irradiance Sensors (EXIS) suite consists of two main instruments, the Extreme Ultraviolet Sensor (EUVS) and the X-Ray Sensor (XRS). EUVS products include high-cadence and time-averaged measurements of irradiance for seven extreme ultraviolet (EUV) solar lines, the magnesium II core-to-wing ratio (Mg II index), and modeled EUV spectra. 


# Extreme Ultraviolet Sensors (EUVS) Data

## Lyman-Alpha (EUV Proxy)
**Lyman-Alpha (121.6 nm)** is the brightest solar emission line in the ultraviolet. It is emitted by hydrogen in the Sun's atmosphere. Lyman-Alpha measurements correlate better with the square root of the F10.7 index than with F10.7 itself.

### How Lyman-Alpha correlates with the F10.7 index

>"The second model (used when the Mg II index is not available) is based on the 10.7‐cm solar radio flux (F10.7) measurements made near Ottawa and Penticton (Canada) (Tapping, 2013). The Lyman α composite goes back to 1947 with the use of the F10.7 proxy... The F10.7 proxy is based on the square root of the F10.7, because the square root of the F10.7 has a better correlation with the Lyman α measurements."
>- Machol, J., Snow, M., Woodraska, D., Woods, T., Viereck, R., & Coddington, O. (2019). An improved Lyman‐alpha composite. Earth and Space Science, 6, 2263–2272. https://doi.org/10.1029/2019EA000648
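
The square-root relationship described in the quote can be illustrated with a small fit. This is only a sketch on synthetic data: the F10.7 values, the noise level, and the coefficients `true_slope` and `true_intercept` are made up for illustration and are not taken from Machol et al. or from any GOES file. It fits Lyman-Alpha against `sqrt(F10.7)` and compares the correlation with the one obtained from F10.7 directly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic F10.7 values in solar flux units (illustrative only)
f107 = rng.uniform(70.0, 250.0, size=200)

# Hypothetical coefficients, NOT taken from the paper or the data
true_slope, true_intercept = 4.0e-4, 2.5e-3

# Synthetic Lyman-Alpha irradiance that is linear in sqrt(F10.7), plus noise
lya = true_intercept + true_slope * np.sqrt(f107) + rng.normal(0.0, 1e-5, size=200)

# Least-squares line: lya ~ slope * sqrt(F10.7) + intercept
slope, intercept = np.polyfit(np.sqrt(f107), lya, deg=1)

# Pearson correlation with sqrt(F10.7) versus with F10.7 itself
r_sqrt = np.corrcoef(np.sqrt(f107), lya)[0, 1]
r_lin = np.corrcoef(f107, lya)[0, 1]
```

With real data, the same two `np.corrcoef` calls on matched daily F10.7 and Lyman-Alpha series would show whether the square root gives the better proxy for a given period.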

## GOES-13
- **Source:** [NOAA](https://www.ncei.noaa.gov/products/goes-1-15/space-weather-instruments)
- **File:** G13_EUVE_daily_2006_2016_v4.txt
- **Product:**  1-min & Daily Averages (version 4)
     - EUVS channel E (121.6nm) 1-min and 1-day averaged counts, irradiances and flags
- **Date:** 2006-2016

**Note:**
The degradation-corrected average irradiance (`irrad_ly`) must be multiplied by `au_corr` to normalize it to 1 AU, because the distance between the Earth and the Sun varies throughout the year.
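
As a rough illustration of why this factor is needed: irradiance falls off with the square of distance, so a value measured at distance r (in AU) is scaled to 1 AU by multiplying by r². The sketch below assumes that `au_corr` is this squared distance, which is consistent with the values printed in this notebook (about 1.034 in early July, about 0.967 in early January). The function name `approx_au_factor`, the fixed perihelion day, and the first-order orbit formula are illustrative assumptions, not part of the NOAA product.

```python
import math

ECCENTRICITY = 0.0167   # approximate eccentricity of Earth's orbit
PERIHELION_DOY = 4      # perihelion falls around January 3-4 (assumed fixed here)


def approx_au_factor(day_of_year):
    """Approximate factor that scales an irradiance measured at Earth to 1 AU."""
    mean_anomaly = 2.0 * math.pi * (day_of_year - PERIHELION_DOY) / 365.25
    # Earth-Sun distance in AU, to first order in the eccentricity
    r_au = 1.0 - ECCENTRICITY * math.cos(mean_anomaly)
    # Irradiance scales as 1/r^2, so multiplying by r^2 normalizes it to 1 AU
    return r_au ** 2


# Example: scale an irradiance measured around July 4 (day 185) to 1 AU
factor_july = approx_au_factor(185)
irradiance_1au = 0.006585 * factor_july
```

For the actual analysis, the `au_corr` column shipped with the data should be preferred over this approximation.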

In [34]:
import pandas as pd


path = "datasets/G13_EUVE_daily_2006_2016_v4.txt"

df = pd.read_csv(
    path,
    comment=";",
    sep=r"\s+",
    header=None,
    names=["date", "Julday", "counts", "flags", "num", "irrad", "irrad_ly", "au_corr"],
    na_values=["-999.0", "-999"],
    dtype=str
)


# Change irrad_ly and au_corr to numeric
df[["irrad_ly", "au_corr"]] = df[["irrad_ly", "au_corr"]].apply(pd.to_numeric, errors="coerce")

# Create datetime column
df["date"] = pd.to_datetime(df["date"], format='%Y-%m-%d', errors="coerce")

# 1 AU correction
df["irrad_corr"] = df["irrad_ly"] * df["au_corr"]

# Drop bad/missing data
df = df.dropna().reset_index(drop=True)

# Drop unnecessary columns
goes13 = df.drop(columns=["Julday", "counts", "flags", "num", "irrad"])

# rename to stay consistent with GOES-16
goes13 = goes13.rename(columns={
    'irrad_ly': 'Lyman_alpha',
    'au_corr': 'au_factor',
    'irrad_corr': 'Lyman_alpha_corr'
})

#save as csv
goes13.to_csv("datasets/GOES13_average_irradiance.csv", index=False)

print(goes13)

           date  Lyman_alpha  au_factor  Lyman_alpha_corr
0    2006-07-04     0.006585   1.033726          0.006807
1    2006-07-05     0.006513   1.033738          0.006733
2    2006-07-06     0.006426   1.033739          0.006643
3    2006-07-07     0.006533   1.033730          0.006753
4    2006-07-08     0.006335   1.033711          0.006549
...         ...          ...        ...               ...
1729 2016-07-28     0.006372   1.030941          0.006569
1730 2016-07-29     0.006427   1.030700          0.006624
1731 2016-07-30     0.006430   1.030450          0.006626
1732 2016-07-31     0.006452   1.030191          0.006647
1733 2016-08-01     0.006444   1.029922          0.006637

[1734 rows x 4 columns]


## GOES-15
- **Source:** [NOAA](https://www.ncei.noaa.gov/products/goes-1-15/space-weather-instruments)
- **File:** G15_EUVE_daily_2010_2016_v4.txt
- **Product:**  1-min & Daily Averages (version 4)
     - EUVS channel E (121.6nm) 1-min and 1-day averaged counts, irradiances and flags
- **Date:** 2010-2016

**Note:**
The degradation-corrected average irradiance (`irrad_ly`) must be multiplied by `au_corr` to normalize it to 1 AU, because the distance between the Earth and the Sun varies throughout the year.

In [35]:
import pandas as pd


path = "datasets/G15_EUVE_daily_2010_2016_v4.txt"

df = pd.read_csv(
    path,
    comment=";",
    sep=r"\s+",
    header=None,
    names=["date", "Julday", "counts", "flags", "num", "irrad", "irrad_ly", "au_corr"],
    na_values=["-999.0", "-999"],
    dtype=str
)


# Change irrad_ly and au_corr to numeric
df[["irrad_ly", "au_corr"]] = df[["irrad_ly", "au_corr"]].apply(pd.to_numeric, errors="coerce")

# Create datetime column
df["date"] = pd.to_datetime(df["date"], format='%Y-%m-%d', errors="coerce")

# 1 AU correction
df["irrad_corr"] = df["irrad_ly"] * df["au_corr"]

# Drop bad/missing data
df = df.dropna().reset_index(drop=True)

# Drop unnecessary columns
goes15 = df.drop(columns=["Julday", "counts", "flags", "num", "irrad"])

# rename to stay consistent with GOES-16
goes15 = goes15.rename(columns={
    'irrad_ly': 'Lyman_alpha',
    'au_corr': 'au_factor',
    'irrad_corr': 'Lyman_alpha_corr'
})

#save as csv
goes15.to_csv("datasets/GOES15_average_irradiance.csv", index=False)

print(goes15)

           date  Lyman_alpha  au_factor  Lyman_alpha_corr
0    2010-04-07     0.006309   1.000411          0.006312
1    2010-04-08     0.006492   1.000987          0.006498
2    2010-04-09     0.006475   1.001563          0.006485
3    2010-04-10     0.006446   1.002138          0.006460
4    2010-04-11     0.006424   1.002714          0.006441
...         ...          ...        ...               ...
2195 2016-06-02     0.006737   1.028364          0.006928
2196 2016-06-03     0.006713   1.028679          0.006906
2197 2016-06-04     0.006695   1.028986          0.006889
2198 2016-06-05     0.006666   1.029284          0.006861
2199 2016-06-06     0.006651   1.029572          0.006848

[2200 rows x 4 columns]
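
The GOES-13 and GOES-15 cells differ only in the input path and the output name, so the shared steps could be factored into one helper. The following is a minimal sketch, not the notebook's actual code: the function name `load_euve_daily` is hypothetical, and the two-row sample is synthetic (only the `irrad_ly` and `au_corr` values of the first row are copied from the GOES-13 output above).

```python
import io

import pandas as pd

EUVE_COLUMNS = ["date", "Julday", "counts", "flags", "num", "irrad", "irrad_ly", "au_corr"]


def load_euve_daily(source):
    """Load a daily EUVE text file (path or file-like) and apply the 1 AU correction."""
    df = pd.read_csv(
        source,
        comment=";",
        sep=r"\s+",
        header=None,
        names=EUVE_COLUMNS,
        na_values=["-999.0", "-999"],
    )
    df["date"] = pd.to_datetime(df["date"], format="%Y-%m-%d", errors="coerce")
    # Normalize the degradation-corrected irradiance to 1 AU
    df["Lyman_alpha_corr"] = df["irrad_ly"] * df["au_corr"]
    # Drop rows with any missing value, as the cells above do
    df = df.dropna().reset_index(drop=True)
    # Keep only the columns used later, renamed to match GOES-16
    return df[["date", "irrad_ly", "au_corr", "Lyman_alpha_corr"]].rename(
        columns={"irrad_ly": "Lyman_alpha", "au_corr": "au_factor"}
    )


# Synthetic two-row sample; the second row contains fill values and is dropped
sample = io.StringIO(
    "; synthetic header line, skipped because of comment=';'\n"
    "2006-07-04 2453921 100.0 0 1440 0.0065 0.006585 1.033726\n"
    "2006-07-05 2453922 -999 0 0 -999 -999 1.033738\n"
)
demo = load_euve_daily(sample)
```

With the real files, `load_euve_daily("datasets/G13_EUVE_daily_2006_2016_v4.txt")` and the GOES-15 equivalent would replace the two near-identical cells.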


## GOES-16
- **Source:** [NOAA](https://www.ncei.noaa.gov/products/goes-r-extreme-ultraviolet-xray-irradiance)
- **File:** sci_euvs-l2-avg1d_g16_s20170207_e20250406_v1-0-6.nc
- **Product:**  EUVS Daily Average
     - Daily averages of spectral line irradiances, the Mg II index, and proxy spectra.
- **Date:** 2017-2025

In [36]:
import netCDF4 as nc
import xarray as xr

path = "datasets/sci_euvs-l2-avg1d_g16_s20170207_e20250406_v1-0-6.nc"

# see variables we are interested in
file = nc.Dataset(path)
print(file.variables['irr_1216'])
print("\n")
print(file.variables['MgII_standard'])
print("\n")
print(file.variables['au_factor'])

<class 'netCDF4.Variable'>
float32 irr_1216(time)
    _FillValue: -9999.0
    long_name: Daily average irradiance for 121.6-nm line.
    comments: Average excludes hours of major geocoronal impact.
    units: W/m2
    valid_min: 0.004
    valid_max: 0.2
unlimited dimensions: time
current shape = (2981,)
filling on


<class 'netCDF4.Variable'>
float32 MgII_standard(time)
    _FillValue: -9999.0
    long_name: Daily average of EXIS Mg II index scaled to  a standard Mg II measurement resolution.
    valid_min: 0.0
    valid_max: 1.0
unlimited dimensions: time
current shape = (2981,)
filling on


<class 'netCDF4.Variable'>
float32 au_factor(time)
    _FillValue: 0.0
    long_name: 1-AU factor.
    comments: Multiplicative factor to convert fluxes to values at 1 AU.
unlimited dimensions: time
current shape = (2981,)
filling on
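
The metadata above lists `valid_min` and `valid_max` for `irr_1216` (0.004 to 0.2 W/m2) and `MgII_standard` (0.0 to 1.0). One way to use these limits, sketched below, is to mask any value outside the range before further processing. The helper name `mask_out_of_range` and the sample values are illustrative assumptions. Note that `xr.open_dataset` normally converts `_FillValue` entries to NaN on its own, so this check is an extra safeguard rather than a required step.

```python
import pandas as pd

# Valid ranges copied from the variable metadata printed above
VALID_RANGES = {
    "irr_1216": (0.004, 0.2),
    "MgII_standard": (0.0, 1.0),
}


def mask_out_of_range(df, ranges=VALID_RANGES):
    """Return a copy of df with out-of-range values replaced by NaN."""
    out = df.copy()
    for col, (low, high) in ranges.items():
        # between() is inclusive on both ends; NaN and fill values fail the test
        out[col] = out[col].where(out[col].between(low, high))
    return out


# Synthetic sample: one good row, one fill value, one value above valid_max
sample = pd.DataFrame({
    "irr_1216": [0.0063, -9999.0, 0.5],
    "MgII_standard": [0.26, 0.27, -9999.0],
})
clean = mask_out_of_range(sample)
```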


In [37]:
# convert to dataframe
ds = xr.open_dataset(path)
df = ds.to_dataframe()

# flatten dataset (one sample per day)
df_time = df.groupby('time').mean(numeric_only=True)

# assign features & rename
features = df_time[['irr_1216', 'MgII_standard', 'au_factor']]
features = features.reset_index() # make time a col not index

features = features.rename(columns={
    'time': 'date',
    'irr_1216': 'Lyman_alpha',
    'MgII_standard': 'MgII'
})

features.head(5)

        date  Lyman_alpha      MgII  au_factor
0 2017-02-07     0.006339  0.262891   0.972852
1 2017-02-08     0.006352  0.263016   0.973178
2 2017-02-09     0.006431  0.263391   0.973516
3 2017-02-10     0.006431  0.263411   0.973863
4 2017-02-11     0.006484  0.263713   0.974222


In [49]:
import numpy as np

# Change columns to numeric
features[['Lyman_alpha', 'MgII', 'au_factor']] = features[['Lyman_alpha', 'MgII', 'au_factor']].apply(pd.to_numeric, errors="coerce")

# Assign datetime column
features["date"] = pd.to_datetime(features["date"], errors="coerce")

# Mask fill values before deriving anything from them
features[['Lyman_alpha', 'MgII', 'au_factor']] = features[['Lyman_alpha', 'MgII', 'au_factor']].replace([-9999.0, 0], np.nan)

# 1 AU correction
# Note: the Mg II index is a core-to-wing ratio, so it is dimensionless and should not
# depend on the Earth-Sun distance; treat MgII_corr with caution and prefer MgII.
features["Lyman_alpha_corr"] = features["Lyman_alpha"] * features["au_factor"]
features["MgII_corr"] = features["MgII"] * features["au_factor"]

# Drop bad/missing data
goes16 = features.dropna().reset_index(drop=True)

#save as csv
goes16.to_csv("datasets/GOES16_average_irradiance.csv", index=False)

features.head(5)

        date  Lyman_alpha      MgII  au_factor  Lyman_alpha_corr  MgII_corr
0 2017-02-07     0.006339  0.262891   0.972852          0.006166   0.255754
1 2017-02-08     0.006352  0.263016   0.973178          0.006181   0.255961
2 2017-02-09     0.006431  0.263391   0.973516          0.006260   0.256415
3 2017-02-10     0.006431  0.263411   0.973863          0.006263   0.256527
4 2017-02-11     0.006484  0.263713   0.974222          0.006317   0.256915


In [50]:
# Merge GOES-13, 15, and 16; on overlapping dates keep='last' keeps the newer satellite's row
full_EUV = pd.concat([goes13, goes15]).drop_duplicates(subset='date', keep='last')
full_EUV = pd.concat([full_EUV, goes16]).drop_duplicates(subset='date', keep='last')
full_EUV = full_EUV.sort_values('date').reset_index(drop=True)
full_EUV[::500]

           date  Lyman_alpha  au_factor  Lyman_alpha_corr      MgII  MgII_corr
0    2006-07-04     0.006585   1.033726          0.006807       NaN        NaN
500  2010-07-14     0.006411   1.033376          0.006625       NaN        NaN
1000 2011-11-26     0.008047   0.973841          0.007836       NaN        NaN
1500 2013-04-16     0.007191   1.005581          0.007231       NaN        NaN
2000 2014-09-02     0.007725   1.017637          0.007861       NaN        NaN
2500 2016-01-17     0.007477   0.967613          0.007235       NaN        NaN
3000 2018-01-03     0.006450   0.966849          0.006236  0.263272   0.254544
3500 2019-05-18     0.005941   1.023007          0.006078  0.262929   0.268978
4000 2020-09-29     0.006367   1.003181          0.006388  0.264396   0.265237
4500 2022-02-16     0.007725   0.975926          0.007539  0.269990   0.263490


In [53]:
full_EUV.to_csv("datasets/full_EUV.csv", index=False)
print(full_EUV)

           date  Lyman_alpha  au_factor  Lyman_alpha_corr      MgII  MgII_corr
0    2006-07-04     0.006585   1.033726          0.006807       NaN        NaN
1    2006-07-05     0.006513   1.033738          0.006733       NaN        NaN
2    2006-07-06     0.006426   1.033739          0.006643       NaN        NaN
3    2006-07-07     0.006533   1.033730          0.006753       NaN        NaN
4    2006-07-08     0.006335   1.033711          0.006549       NaN        NaN
...         ...          ...        ...               ...       ...        ...
5641 2025-04-02     0.008860   0.999291          0.008853  0.281381   0.281182
5642 2025-04-03     0.008705   0.999852          0.008704  0.280452   0.280410
5643 2025-04-04     0.008656   1.000412          0.008659  0.280049   0.280165
5644 2025-04-05     0.008621   1.000972          0.008629  0.279801   0.280073
5645 2025-04-06     0.008610   1.001531          0.008623  0.279584   0.280012

[5646 rows x 6 columns]
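
The merge relies on the order of the frames passed to `pd.concat`: with `keep='last'`, a date that appears in two satellites' records keeps the row from the frame listed later. The toy example below shows this behavior with made-up values and hypothetical frame names `older` and `newer`; it is a sketch of the mechanism, not a reproduction of the real data.

```python
import pandas as pd

# Two toy records that overlap on 2010-04-07 (values are made up)
older = pd.DataFrame({
    "date": pd.to_datetime(["2010-04-06", "2010-04-07"]),
    "Lyman_alpha": [1.0, 2.0],
})
newer = pd.DataFrame({
    "date": pd.to_datetime(["2010-04-07", "2010-04-08"]),
    "Lyman_alpha": [20.0, 30.0],
})

# Same pattern as the merge cell: concatenate, keep the last duplicate, sort by date
merged = (
    pd.concat([older, newer])
    .drop_duplicates(subset="date", keep="last")
    .sort_values("date")
    .reset_index(drop=True)
)
```

Here the overlapping date keeps the value 20.0 from `newer`. Swapping the order in `pd.concat`, or using `keep='first'`, would prefer the older record instead.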


# X-Ray Sensors Data

## Solar Flares

## GOES-15
- **Source:** [NOAA](https://www.ncei.noaa.gov/products/goes-1-15/space-weather-instruments)
- **File:** sci_xrsf-l2-flsum_g15_s20100407_e20200304_v2-3-0.nc
- **Product:**  Flare Summary
     - List of solar flares with times, flare classes and integrated fluxes
- **Date:** 2010-2020
