<p><img alt="NASA Space Apps logo" width="140" height="140" src="https://www.nasa.gov/wp-content/uploads/2021/07/space_apps_003.png" align="left" hspace="15px" ></p>


<h1>Helios</h1>

----
<p> <b>Challenge: </b>Save the Earth from another Carrington Event!</p>



[See More](https://www.youtube.com/watch?v=m_pDSJive-E&embeds_referring_euri=https%3A%2F%2F2022.spaceappschallenge.org%2F&source_ve_path=Mjg2NjY&feature=emb_logo)

## Data Scraping


The Large Angle and Spectrometric Coronagraph (LASCO) is a suite of three solar coronagraphs onboard the Solar and Heliospheric Observatory (SOHO) satellite. It continuously observes the solar corona and provides crucial data for studying coronal mass ejections (CMEs). LASCO includes a Fabry-Pérot interferometer coronagraph (C1) and two white-light coronagraphs (C2 and C3), which image the corona at different distances from the Sun. In this project we use CACTUS (Computer Aided CME Tracking Software), a tool that autonomously detects and tracks CMEs in LASCO image sequences. CACTUS provides a list of detected CME events with their principal angle, angular width, and velocity estimate, which allows faster and more objective identification of potentially dangerous CMEs than manually assembled catalogs. Analyzing the CACTUS output gives insight into the frequency and properties of CMEs that could threaten the Earth and space-based systems, which is the starting point for strategies to mitigate those risks.

[More info about LASCO](https://www.swpc.noaa.gov/products/lasco-coronagraph)
<br>
[More info about CACTUS](https://www.sidc.be/cactus/)

**Import required modules**

In [1]:
import pandas as pd
import urllib.request

We use the catalog that CACTUS generates by autonomously identifying CMEs in LASCO image sequences. The catalog is comparable to traditional, manually assembled ones and lists the principal angle, angular width, and velocity estimate of each detected CME. Because detection is done by software rather than by human operators, it is faster, which matters for space weather monitoring, and the detection criteria encoded in the software make the identification more objective and consistent.
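
The catalog is published as one file per month, and `get_lasco` below expects the year and month as strings such as `'2021'` and `'04'`. As a minimal sketch, a small helper can build the same URL from integers and take care of the zero padding. The name `cactus_catalog_url` is hypothetical and not part of the project; the URL pattern is the one used in `get_lasco`.

```python
def cactus_catalog_url(year: int, month: int) -> str:
    """Build the URL of the monthly CACTUS catalog (hypothetical helper)."""
    if not 1 <= month <= 12:
        raise ValueError(f"month must be in 1..12, got {month}")
    # Zero-pad the month so that 4 becomes '04', as the catalog path expects.
    return (
        "https://www.sidc.be/cactus/catalog/LASCO/2_5_0/qkl/"
        f"{year:04d}/{month:02d}/cmecat.txt"
    )


print(cactus_catalog_url(2021, 4))
```

Validating the month up front turns a malformed request into an immediate `ValueError` instead of an HTTP error from the server.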

In [None]:
def get_lasco(year: str, month: str):
    """Get a DataFrame of the CMEs that CACTUS detected in LASCO data for the given month.

    The catalog is updated every five days.
    Arguments: year: str; month: str (in numeric format)
    Example of expected arguments: year='2021', month='04'
    OUTPUT: DataFrame"""
    from io import StringIO  # wraps the JSON string, as recent pandas versions require

    # Get the CACTUS cmecat.txt for this month
    cactus = f'https://www.sidc.be/cactus/catalog/LASCO/2_5_0/qkl/{year}/{month}/cmecat.txt'

    # -- Decode the txt --
    with urllib.request.urlopen(cactus) as cmecat:
        lines = [line.decode("utf-8") for line in cmecat]

    # -- Clean the data --
    # The table starts at row 26 (column header first) and ends at the first ' \n' line
    datos = lines[26: 26+lines[26:].index(' \n')]
    data = {i: [j.replace('\n', '').replace('?', '').replace('#', '') for j in datos[i].split('|')] for i in range(len(datos))}

    # -- Build an auxiliary DataFrame to fix the column names --
    df_cme = pd.DataFrame.from_dict(data, orient='index')
    # The column names are in the first row; remove their spaces
    df_cme.columns = [df_cme.iloc[0][i].replace(' ', '') for i in range(df_cme.shape[1])]

    # Build the CME DataFrame; the JSON round trip lets pandas infer the column types
    lasco = pd.read_json(StringIO(df_cme.iloc[1:].to_json())).set_index('CME')
    lasco['t0'] = pd.to_datetime(lasco['t0'])
    lasco.rename(columns={'t0': 'time_tag'}, inplace=True)

    return lasco
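
`get_lasco` relies on the table starting at row 26 of the file. As a hedged alternative, the sketch below finds the column header by its content and stops at the first blank line, so it can be tried offline. The helper name `parse_cmecat` and the sample text are assumptions made for illustration: the sample only imitates the pipe-delimited layout that the function above parses, and its values are not real detections.

```python
import pandas as pd

# Made-up sample shaped like the table that get_lasco slices out of cmecat.txt.
# The rows are illustrative values, not real detections.
SAMPLE = "\n".join([
    "# free-form header text",
    "# CME | t0 | dt0| pa | da | v | dv | minv| maxv| halo?",
    " 0003|2021/04/25 12:00| 02| 090| 040| 0450| 0050| 0300| 0600|",
    " 0002|2021/04/20 06:12| 01| 270| 120| 0900| 0100| 0500| 1200| II",
    " ",
    "# trailing section that must be ignored",
])


def parse_cmecat(text: str) -> pd.DataFrame:
    """Parse the CME table of a CACTUS-style catalog (hypothetical helper)."""
    lines = text.splitlines()
    # Locate the column header by content instead of by a fixed row number.
    start = next(i for i, line in enumerate(lines)
                 if line.lstrip("# ").startswith("CME"))
    columns = [name.strip(" #?") for name in lines[start].split("|")]

    rows = []
    for line in lines[start + 1:]:
        if not line.strip():  # the table ends at the first blank line
            break
        rows.append([cell.strip() for cell in line.split("|")])

    df = pd.DataFrame(rows, columns=columns)
    numeric = ["CME", "dt0", "pa", "da", "v", "dv", "minv", "maxv"]
    df[numeric] = df[numeric].apply(pd.to_numeric)
    df["t0"] = pd.to_datetime(df["t0"], format="%Y/%m/%d %H:%M")
    return df.rename(columns={"t0": "time_tag"}).set_index("CME")


cmes = parse_cmecat(SAMPLE)
print(cmes[["time_tag", "pa", "da", "v"]])
```

Converting the columns explicitly avoids the JSON round trip, at the cost of hard-coding which columns are numeric.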


In [None]:


def get_lasco_rt():
    """Get a DataFrame of the CMEs that CACTUS detected in near-real-time LASCO data
    OUTPUT: DataFrame"""

    from io import StringIO  # wraps the JSON string, as recent pandas versions require
    import pandas as pd
    import urllib.request

    cactus = 'https://www.sidc.be/cactus/out/cmecat.txt'

    # -- Decode the txt --
    with urllib.request.urlopen(cactus) as cmecat:
        lines = [line.decode("utf-8") for line in cmecat]

    # -- Clean the data --
    # The table starts at row 26 (column header first) and ends at the first ' \n' line
    datos = lines[26: 26+lines[26:].index(' \n')]
    data = {i: [j.replace('\n', '').replace('?', '').replace('#', '') for j in datos[i].split('|')] for i in range(len(datos))}

    # -- Build an auxiliary DataFrame to fix the column names --
    df_cme = pd.DataFrame.from_dict(data, orient='index')
    # The column names are in the first row; remove their spaces
    df_cme.columns = [df_cme.iloc[0][i].replace(' ', '') for i in range(df_cme.shape[1])]

    # Build the CME DataFrame; the JSON round trip lets pandas infer the column types
    lasco = pd.read_json(StringIO(df_cme.iloc[1:].to_json())).set_index('CME')
    lasco['t0'] = pd.to_datetime(lasco['t0'])
    lasco.rename(columns={'t0': 'time_tag'}, inplace=True)

    return lasco
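
Once a CME table is available, the angular width and velocity estimates can be used to shortlist events worth a closer look. The following is a minimal sketch, not the project's actual criterion: the function name `flag_fast_wide` and both thresholds are placeholders chosen for illustration, and the column names `da` (angular width) and `v` (velocity) are assumed to match the catalog header.

```python
import pandas as pd


def flag_fast_wide(cmes: pd.DataFrame,
                   min_speed: float = 1000.0,
                   min_width: float = 90.0) -> pd.DataFrame:
    """Return the CMEs whose speed and angular width both reach the thresholds.

    The default thresholds are illustrative placeholders, not physical limits.
    """
    mask = (cmes["v"] >= min_speed) & (cmes["da"] >= min_width)
    # Fastest events first, so the most energetic candidates are on top.
    return cmes.loc[mask].sort_values("v", ascending=False)


# Made-up rows with the assumed column names; not real detections.
demo = pd.DataFrame(
    {
        "time_tag": pd.to_datetime(
            ["2021-04-20 06:12", "2021-04-22 18:30", "2021-04-25 12:00"]
        ),
        "da": [120, 360, 40],
        "v": [900, 1500, 450],
    },
    index=pd.Index([1, 2, 3], name="CME"),
)

print(flag_fast_wide(demo))
print(flag_fast_wide(demo, min_speed=800))
```

Keeping the thresholds as parameters makes it easy to tune the shortlist without touching the download code.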

# Functions to get the X-ray data
def get_goes():
    """Get the 7-day real-time X-ray flux data from GOES (GOES-16)
    OUTPUT: DataFrame"""
    url = 'https://services.swpc.noaa.gov/json/goes/primary/xrays-7-day.json'
    xray = pd.read_json(url)

    # -- Handle the time data --
    # Drop the ISO 'T' separator and the trailing 'Z' so the timestamps parse as timezone-naive
    xray['time_tag'] = xray['time_tag'].apply(lambda x: x.replace('T', ' ').replace('Z', ''))
    xray['time_tag'] = pd.to_datetime(xray['time_tag'])

    return xray
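
Both the CME tables and the X-ray table end up with a `time_tag` column, which makes it possible to line them up in time. The sketch below is one possible way to do that with `pd.merge_asof` and is an assumption about how the tables might be combined, not the project's documented method. The helper name `attach_nearest_xray`, the 30-minute tolerance, and the `flux` column name are illustrative; the X-ray table is also assumed to have been filtered to a single energy band beforehand.

```python
import pandas as pd


def attach_nearest_xray(cmes: pd.DataFrame, xray: pd.DataFrame,
                        tolerance: str = "30min") -> pd.DataFrame:
    """Attach the X-ray flux sample closest in time to each CME (hypothetical helper)."""
    # merge_asof requires both tables to be sorted by the key column.
    left = cmes.reset_index().sort_values("time_tag")
    right = xray[["time_tag", "flux"]].sort_values("time_tag")
    return pd.merge_asof(
        left, right, on="time_tag",
        direction="nearest", tolerance=pd.Timedelta(tolerance),
    )


# Made-up tables with the assumed column names; not real measurements.
cme_demo = pd.DataFrame(
    {"time_tag": pd.to_datetime(["2021-04-20 06:12", "2021-04-25 12:00"]),
     "v": [900, 450]},
    index=pd.Index([2, 3], name="CME"),
)
xray_demo = pd.DataFrame(
    {"time_tag": pd.to_datetime(
        ["2021-04-20 06:00", "2021-04-20 06:10", "2021-04-25 13:00"]),
     "flux": [1e-6, 2e-6, 5e-7]}
)

merged = attach_nearest_xray(cme_demo, xray_demo)
print(merged)
```

CMEs with no X-ray sample inside the tolerance window keep a missing `flux` value, so gaps in the X-ray record stay visible instead of being filled with a distant measurement.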




<h3>Data Usage</h3>

[Automated recognition of coronal mass ejections (CMEs) in near-real-time data](https://ui.adsabs.harvard.edu/abs/2004A%26A...425.1097R/abstract) Robbrecht, E.; Berghmans, D.<br>

[DSCOVR real time solar wind](https://www.swpc.noaa.gov/products/real-time-solar-wind)

[Goes Xray real time](https://www.swpc.noaa.gov/products/goes-x-ray-flux)
   