# ACE_Import

# Data Wrangling
### ACE Satellite Mission Data - Space Weather Prediction Center
ACE carries nine scientific instruments that make the comprehensive, coordinated in situ measurements needed to meet the mission's scientific objectives.

These instruments are:

High Resolution Spectrometers

- Cosmic Ray Isotope Spectrometer (CRIS)
- Solar Isotope Spectrometer (SIS)
- Ultra Low Energy Isotope Spectrometer (ULEIS)
- Solar Energetic Particle Ionic Charge Analyzer (SEPICA)
- Solar Wind Ion Composition Spectrometer (SWICS)
- Solar Wind Ion Mass Spectrometer (SWIMS)

Monitoring Instruments

- Electron, Proton and Alpha Monitor (EPAM)
- Solar Wind Electron, Proton and Alpha Monitor (SWEPAM)
- Magnetic Field Monitor (MAG)


The Space Weather Prediction Center also uses predicted ACE spacecraft location information to create a monthly list of hourly predicted locations. Each file contains values from the first of the month through hour 23 of the current day. The hourly predicted location values are the X, Y, and Z positions in GSE coordinates, with an accuracy of 0.1 Earth radii (about 640 km).
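As a quick check of the quoted accuracy, the conversion from Earth radii to kilometres can be sketched as below. The constant `EARTH_RADIUS_KM` and the helper `earth_radii_to_km` are illustrative names, and the mean radius of 6371 km is an assumption; the data provider may use a slightly different value.

```python
# Mean Earth radius in kilometres (assumed value, not taken from the data files).
EARTH_RADIUS_KM = 6371.0


def earth_radii_to_km(value_re):
    """Convert a distance expressed in Earth radii to kilometres."""
    return value_re * EARTH_RADIUS_KM


# The stated location accuracy of 0.1 Earth radii, expressed in kilometres.
accuracy_km = earth_radii_to_km(0.1)
print(f"0.1 Earth radii is roughly {accuracy_km:.0f} km")
```

The same helper could be applied to the X, Y, and Z columns if positions in kilometres are preferred.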

This notebook aggregates the near-real-time, 1-hour averaged data from [solarsoft](https://sohoftp.nascom.nasa.gov/sdb/goes/ace/monthly/). Real-time solar wind data is captured by the MAG, SWEPAM, EPAM, and SIS instruments.


For more information about the ACE spacecraft data, refer to the README file in the GitHub repo.

In [88]:
from bs4 import BeautifulSoup
import requests
from datetime import date
import pandas as pd

In [103]:
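# Get the list of monthly data files from SolarSoft

root_url = "https://sohoftp.nascom.nasa.gov/sdb/goes/ace/monthly/"
r = requests.get(root_url)
r.raise_for_status()  # fail early if the directory listing could not be fetched
soup = BeautifulSoup(r.text, "html.parser")  # name the parser explicitly to avoid a bs4 warning

file_list = []

# Data file names start with a digit; skip every other link in the listing
for link in soup.find_all('a'):
    hl = link.get('href')
    if hl and hl[0].isdigit():  # also guards against anchors with a missing or empty href
        file_list.append(root_url + hl)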
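The link filter above can be tried without network access. The sketch below swaps BeautifulSoup for the standard library's `html.parser` so it runs with no third-party packages; `HrefCollector`, `list_data_files`, and the `SAMPLE_LISTING` markup are hypothetical and only imitate what a directory listing might look like.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class HrefCollector(HTMLParser):
    """Collect the href attribute of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:  # skip anchors without an href
                self.hrefs.append(href)


def list_data_files(html, base_url):
    """Return absolute URLs for links whose href starts with a digit."""
    collector = HrefCollector()
    collector.feed(html)
    return [urljoin(base_url, h) for h in collector.hrefs if h[0].isdigit()]


# Made-up listing: two sort/navigation links, two data files, one anchor with no href.
SAMPLE_LISTING = """
<a href="?C=N;O=D">Name</a>
<a href="/sdb/goes/ace/">Parent Directory</a>
<a href="200008_ace_epam_1h.txt">200008_ace_epam_1h.txt</a>
<a href="200008_ace_mag_1h.txt">200008_ace_mag_1h.txt</a>
<a name="top">no href</a>
"""

files = list_data_files(
    SAMPLE_LISTING, "https://sohoftp.nascom.nasa.gov/sdb/goes/ace/monthly/"
)
for url in files:
    print(url)
```

Only the two `.txt` entries survive the filter, because the navigation links start with `?` or `/` rather than a digit.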

In [104]:
#Creating Dataframe for each sensor
epam_df = pd.DataFrame(columns=['Year', 'Month', 'Day', 'HHMM', 'Julian_Day', 'Seconds_OTD', 'Status_E', 'E_38-53', 'E_175-315', 'Status_P', 'P_47-65', 'P_112-187', 'P_310-580', 'P_761-1220', 'P_1060-1910', 'Anis_Index'])
loc_df = pd.DataFrame(columns=['Year', 'Month', 'Day', 'HHMM', 'Julian_Day', 'Seconds_OTD', 'X', 'Y', 'Z'])
mag_df = pd.DataFrame(columns=['Year', 'Month', 'Day', 'HHMM', 'Julian_Day', 'Seconds_OTD', 'Status_Mag', 'Bx', 'By', 'Bz', 'Bt', 'Lat','Long'])
sis_df = pd.DataFrame(columns=['Year', 'Month', 'Day', 'HHMM', 'Julian_Day', 'Seconds_OTD', 'Status_PF_Low', '>10_MeV', 'Status_PF_High', '>30_MeV'])
swepam_df = pd.DataFrame(columns=['Year', 'Month', 'Day', 'HHMM', 'Julian_Day', 'Seconds_OTD', 'Status_SW', 'Proton_Density', 'Bulk_Speed', 'Ion_Temp'])
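Every dataframe above shares the `Year`, `Month`, `Day`, and `HHMM` columns, which can be combined into a single timestamp for later merging or plotting. The sketch below assumes `HHMM` is an hour-minute value such as `1300` (read by pandas as an integer, so `0030` arrives as `30`) and that times are in UTC; `to_timestamp` is a hypothetical helper, not part of the notebook.

```python
from datetime import datetime


def to_timestamp(year, month, day, hhmm):
    """Build a naive datetime from the Year, Month, Day and HHMM columns.

    Assumes HHMM encodes hours and minutes as HH * 100 + MM.
    """
    hhmm = int(hhmm)
    return datetime(int(year), int(month), int(day), hhmm // 100, hhmm % 100)


# One row's worth of values, as they might appear after parsing a monthly file.
print(to_timestamp(2000, 8, 1, 1300))
```

With pandas, the helper could be applied row by row, for example `mag_df.apply(lambda r: to_timestamp(r.Year, r.Month, r.Day, r.HHMM), axis=1)`.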


In [122]:
# Pull each monthly file from SolarSoft into the matching dataframe.
# Note: re-running this cell appends the same files again, so re-create the empty dataframes first.
for link in file_list:
    print("Importing:" + link)
    # Skip the first two lines and any '#' comment lines; columns are whitespace-separated
    data = pd.read_csv(link, comment='#', sep=r'\s+', header=None, skiprows=2)
    # Route the file by the instrument name embedded in its file name
    if "_epam" in link:
        data.columns = epam_df.columns
        epam_df = pd.concat([epam_df, data], ignore_index=True)
    elif "_loc" in link:
        data.columns = loc_df.columns
        loc_df = pd.concat([loc_df, data], ignore_index=True)
    elif "_mag" in link:
        data.columns = mag_df.columns
        mag_df = pd.concat([mag_df, data], ignore_index=True)
    elif "_sis" in link:
        data.columns = sis_df.columns
        sis_df = pd.concat([sis_df, data], ignore_index=True)
    elif "_swepam" in link:
        data.columns = swepam_df.columns
        swepam_df = pd.concat([swepam_df, data], ignore_index=True)

Importing:https://sohoftp.nascom.nasa.gov/sdb/goes/ace/monthly/200008_ace_epam_1h.txt
Importing:https://sohoftp.nascom.nasa.gov/sdb/goes/ace/monthly/200008_ace_loc_1h.txt
Importing:https://sohoftp.nascom.nasa.gov/sdb/goes/ace/monthly/200008_ace_mag_1h.txt
Importing:https://sohoftp.nascom.nasa.gov/sdb/goes/ace/monthly/200008_ace_sis_1h.txt
Importing:https://sohoftp.nascom.nasa.gov/sdb/goes/ace/monthly/200008_ace_swepam_1h.txt
Importing:https://sohoftp.nascom.nasa.gov/sdb/goes/ace/monthly/200009_ace_epam_1h.txt
Importing:https://sohoftp.nascom.nasa.gov/sdb/goes/ace/monthly/200009_ace_loc_1h.txt
Importing:https://sohoftp.nascom.nasa.gov/sdb/goes/ace/monthly/200009_ace_mag_1h.txt
Importing:https://sohoftp.nascom.nasa.gov/sdb/goes/ace/monthly/200009_ace_sis_1h.txt
Importing:https://sohoftp.nascom.nasa.gov/sdb/goes/ace/monthly/200009_ace_swepam_1h.txt
Importing:https://sohoftp.nascom.nasa.gov/sdb/goes/ace/monthly/200010_ace_epam_1h.txt
Importing:https://sohoftp.nascom.nasa.gov/sdb/goes/ace/m
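The import loop routes each file by a substring of its URL. A stricter alternative, sketched below, parses the file name with a regular expression and also recovers the year and month. The pattern assumes every file follows the `YYYYMM_ace_<instrument>_1h.txt` naming seen in the output above; `FILENAME_PATTERN` and `classify` are hypothetical names.

```python
import re

# Assumed naming scheme, based on the file names printed during import.
FILENAME_PATTERN = re.compile(r"(\d{6})_ace_(epam|loc|mag|sis|swepam)_1h\.txt$")


def classify(url):
    """Return (instrument, year, month) for a data file URL, or None if it does not match."""
    match = FILENAME_PATTERN.search(url)
    if match is None:
        return None
    yyyymm, instrument = match.groups()
    return instrument, int(yyyymm[:4]), int(yyyymm[4:])


url = "https://sohoftp.nascom.nasa.gov/sdb/goes/ace/monthly/200008_ace_swepam_1h.txt"
print(classify(url))
```

Because the instrument name is matched as a whole token between `_ace_` and `_1h`, `swepam` files cannot be mistaken for `epam` files, and unexpected file names are reported as `None` instead of being silently skipped.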

In [123]:
#Saving dataframes as csv files

epam_df.to_csv('/data/workspace_files/'+ str(date.today()) + '_ace_master_epam_1hr.csv', index=False)
loc_df.to_csv('/data/workspace_files/'+ str(date.today()) + '_ace_master_loc_1hr.csv', index=False)
mag_df.to_csv('/data/workspace_files/'+ str(date.today()) + '_ace_master_mag_1hr.csv', index=False)
sis_df.to_csv('/data/workspace_files/'+ str(date.today()) + '_ace_master_sis_1hr.csv', index=False)
swepam_df.to_csv('/data/workspace_files/'+ str(date.today()) + '_ace_master_swepam_1hr.csv', index=False)
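The five output paths above differ only in the instrument name, so they could be built by one small helper. This is a sketch rather than the notebook's method: `output_path` is a hypothetical function, and the directory default simply repeats the `/data/workspace_files/` path used above.

```python
from datetime import date
from pathlib import Path


def output_path(instrument, out_dir="/data/workspace_files", day=None):
    """Build the dated CSV path for one instrument, e.g. 'mag' or 'swepam'."""
    day = day or date.today()
    # date.isoformat() gives YYYY-MM-DD, the same text as str(date.today()).
    return Path(out_dir) / f"{day.isoformat()}_ace_master_{instrument}_1hr.csv"


for name in ["epam", "loc", "mag", "sis", "swepam"]:
    print(output_path(name))
```

Each dataframe could then be written with a call such as `mag_df.to_csv(output_path("mag"), index=False)`.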