# How to use Mask objects

This tutorial explains why line masks matter and how to create, clean, and use them.

First, import `specpolFlow` and the other packages used in this tutorial.

In [2]:
## Importing Necessary Packages
import pandas as pd
import specpolFlow as pol

import matplotlib.pyplot as plt
import numpy as np


## What is a Mask?

Analytically, a mask is a function with Dirac deltas at wavelengths corresponding to specific spectral lines. The amplitude of each Dirac delta corresponds to the line depth. Numerically, a mask is an array of zeros with a depth at the center of each line. Thus, a **mask tells us the location and depth of all lines** in a spectrum, but it does not tell us about the shape of the lines or of the spectrum as a whole.
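
The numerical picture can be sketched with plain NumPy. This is only an illustration under stated assumptions: the line wavelengths, depths, and grid below are made-up values, and this is not how `specpolFlow` stores a mask internally.

```python
import numpy as np

# Hypothetical line list: central wavelengths (nm) and depths (illustrative values)
line_wl = np.array([500.0, 500.4, 501.1])
line_depth = np.array([0.35, 0.10, 0.60])

# Regular wavelength grid with 0.01 nm steps
grid = np.linspace(499.5, 501.5, 201)

# Numerical mask: zeros everywhere, with the line depth at the pixel
# nearest to each line center
mask_array = np.zeros_like(grid)
idx = np.abs(grid[:, None] - line_wl[None, :]).argmin(axis=0)
mask_array[idx] = line_depth

print(np.count_nonzero(mask_array))  # number of non-zero pixels
print(grid[idx])                     # grid wavelengths carrying a depth
```

The array holds positions and depths only; nothing in it describes how wide or what shape each line is.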

## Why do we care?

Given an LSD profile and a mask, we can convolve the LSD profile with the mask to obtain a model spectrum. Typically, though, we have the spectrum and a mask and want the LSD profile. Going from a spectrum and a line mask to an LSD profile requires the reverse operation, called deconvolution. **We need a mask to weight each spectral line in the spectrum so that the lines can be averaged together into an LSD profile**.
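
The forward direction (mask plus profile gives spectrum) can be sketched as follows. This is a simplified illustration, not the `specpolFlow` implementation: it works on a uniform pixel grid rather than in velocity space, uses a Gaussian as a stand-in for the LSD profile, and all line positions and depths are hypothetical.

```python
import numpy as np

# Mask on a uniform pixel grid: zeros with a depth at each (hypothetical) line position
n_pix = 400
mask_array = np.zeros(n_pix)
line_pix = np.array([80, 200, 320])
line_depth = np.array([0.5, 0.2, 0.8])
mask_array[line_pix] = line_depth

# Common line profile (unit-amplitude Gaussian), standing in for the LSD profile
kernel_pix = np.arange(-30, 31)
profile = np.exp(-0.5 * (kernel_pix / 5.0) ** 2)

# Forward model: every line is the same profile scaled by its depth,
# subtracted from a continuum normalized to 1
spectrum = 1.0 - np.convolve(mask_array, profile, mode="same")

print(spectrum[line_pix])  # flux at the center of each line
```

Deconvolution runs this in reverse: given `spectrum` and `mask_array`, it looks for the single `profile` that best reproduces all lines at once.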

```{note}
Hydrogen lines are automatically excluded when `atomsOnly = True`. This is done because the hydrogen lines, due to their broad wings, have a different shape than all the other lines in the spectrum.
```

## Mask creation 

We will use the `make_mask` function to create a mask. You will usually only use the arguments *lineListFile* and *maskFile*, as well as two optional arguments, *depthCutoff* and *atomsOnly*.

```{margin}

:::{seealso}
You can see the [Mask API](https://folsomcp.github.io/specpolFlow/API/Mask_API.html) for more information on available kwargs.
:::

```

- `lineListFile` is the name of the file containing the line list;
- `maskFile` is the name of the file to write the output mask to;
- `depthCutoff` is a float; only lines deeper than this value are included in the mask;
- `atomsOnly` is a boolean that decides whether to include only atomic lines (no molecular lines and no H-lines).

The input line list is a VALD line list file obtained from the [VALD website](http://vald.astro.uu.se). More details about VALD are given in the tutorial {doc}`../GetStarted/OneObservationFlow_Tutorial`. In the example below, we keep all atomic lines in the line list that are deeper than 0.02, except the H-lines and the lines without effective Lande factors.

In [19]:
LineList_file_name = '../GetStarted/OneObservationFlow_tutorialfiles/LongList_T27000G35.dat'
Mask_file_name = '../GetStarted/OneObservationFlow_tutorialfiles/test_output/T27000G35_depth0.02.mask'

# the mask returned here has not been cleaned yet
mask = pol.make_mask(LineList_file_name, Mask_file_name, 
                     depthCutoff = 0.02, atomsOnly = True)

missing Lande factors for 160 lines (skipped) from:
['He 2', 'O 2']
skipped all lines for species:
['H 1']


```{warning}
The `make_mask` function automatically attempts to calculate the **effective Lande factor** for lines where it is not given in the line list.

However, if the effective Lande factor cannot be calculated, the line is excluded if `includeNoLande = False`, or it is assigned the `DefaultLande` value if `includeNoLande = True`.
```
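
The selection rules described above can be sketched with NumPy. This is a toy illustration, not the code inside `make_mask`: the function `select_lines`, its argument names, the line list, and the default value of 1.0 are all hypothetical, and molecular lines are not modeled.

```python
import numpy as np

# Hypothetical line list; NaN marks a line with no available effective Lande factor
species = np.array(["Fe 2", "H 1", "He 2", "O 2", "Si 3"])
depth = np.array([0.30, 0.90, 0.05, 0.01, 0.12])
lande = np.array([1.2, 1.0, np.nan, 1.5, 0.9])

def select_lines(depth_cutoff, atoms_only, include_no_lande, default_lande=1.0):
    """Return (keep, lande_used) for the hypothetical line list above."""
    keep = depth > depth_cutoff            # keep only lines deeper than the cutoff
    if atoms_only:
        keep &= species != "H 1"           # drop hydrogen lines
    missing = np.isnan(lande)
    # lines without a Lande factor get the default value if they are kept
    lande_used = np.where(missing, default_lande, lande)
    if not include_no_lande:
        keep &= ~missing                   # skip lines without a Lande factor
    return keep, lande_used

keep, lande_used = select_lines(0.02, True, False)
print(species[keep])

keep_all, lande_all = select_lines(0.02, True, True)
print(species[keep_all], lande_all[keep_all])
```

In the first call the `He 2` line is skipped because it has no Lande factor; in the second call it is kept and given the default value.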

## Mask Cleaning

After creating the mask, the next step is to clean it. **Mask cleaning** means excluding lines that we do not want to use when computing LSD profiles. Typically, we exclude lines that fall within **telluric regions** and within the **wings of hydrogen lines**. Lines in telluric regions are contaminated by absorption from Earth's atmosphere and are therefore unusable. Lines in the hydrogen wings are blended with the hydrogen lines, so their shapes differ from those of unblended lines, which would distort the average. Additionally, for stars with emission, take care to **exclude emission lines**, as they also have different shapes.


This tutorial cleans the mask using predefined regions (see {doc}`./4-ExcludeMaskRegionClass_Tutorial` for more details).

In [4]:
# inputs
velrange = 600 # units are in km/s
excluded_regions = pol.get_Balmer_regions_default(velrange) + pol.get_telluric_regions_default()

# display the excluded regions using Pandas
pd.DataFrame(excluded_regions.to_dict())

| | start | stop | type |
|---|---|---|---|
| 0 | 654.967529 | 657.594471 | Halpha |
| 1 | 485.167047 | 487.112953 | Hbeta |
| 2 | 433.181299 | 434.918701 | Hgamma |
| 3 | 409.349092 | 410.990908 | Hdelta |
| 4 | 396.21543 | 397.80457 | Hepsilon |
| 5 | 360.0 | 392.0 | Hjump |
| 6 | 587.5 | 592.0 | telluric |
| 7 | 627.5 | 632.5 | telluric |
| 8 | 684.0 | 705.3 | telluric |
| 9 | 717.0 | 735.0 | telluric |
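
The Halpha and Hbeta rows above are consistent with a symmetric Doppler window of plus or minus `velrange` around each line center. The sketch below reproduces those two rows under that assumption. The helper `velocity_window` is hypothetical, and the line centers are inferred as the midpoints of the tabulated intervals rather than taken from the library source.

```python
# Speed of light in km/s
C_KMS = 299792.458

def velocity_window(center_nm, velrange_kms):
    """Wavelength interval covering +/- velrange_kms around center_nm."""
    half_width = center_nm * velrange_kms / C_KMS
    return center_nm - half_width, center_nm + half_width

# Line centers inferred as the midpoints of the Halpha and Hbeta rows
halpha = velocity_window(656.281, 600)
hbeta = velocity_window(486.14, 600)

print(halpha)
print(hbeta)
```

A larger `velrange` widens each window in proportion to the wavelength of the line, so redder Balmer lines get wider exclusion regions in nm.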


Once we have our excluded regions, we can clean the mask using the `clean` method of the `Mask` object. It takes the `ExcludeMaskRegions` object containing the excluded regions and returns the cleaned mask, in which lines that fall within the `excluded_regions` are flagged with `iuse = 0` so that they are not used. The cleaned mask can then be written to a file with `save`.


In [9]:
# reading in the mask that we created earlier
mask = pol.read_mask('../GetStarted/OneObservationFlow_tutorialfiles/test_output/T27000G35_depth0.02.mask')

# applying the ExcludeMaskRegions that we created
mask_clean = mask.clean(excluded_regions)

# saving the new mask to a file
mask_clean.save('../GetStarted/OneObservationFlow_tutorialfiles/test_output/hd46328_test_depth0.02_clean.mask')
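
Conceptually, cleaning amounts to flagging every line whose wavelength falls inside an excluded region. The NumPy sketch below illustrates the idea with a few regions copied from the table of excluded regions. The line wavelengths are hypothetical, and treating the region boundaries as inclusive is an assumption.

```python
import numpy as np

# Hypothetical line wavelengths (nm) and use flags (1 = used in LSD)
wl = np.array([450.0, 486.5, 589.0, 600.0, 656.3, 700.0])
iuse = np.ones(wl.size, dtype=int)

# A few (start, stop) regions in nm, copied from the table of excluded regions
regions = [(654.967529, 657.594471),   # Halpha
           (485.167047, 487.112953),   # Hbeta
           (587.5, 592.0),             # telluric
           (684.0, 705.3)]             # telluric

# Flag every line that falls inside any region (boundaries assumed inclusive)
for start, stop in regions:
    iuse[(wl >= start) & (wl <= stop)] = 0

print(iuse)
```

The flagged lines stay in the arrays; only their `iuse` value changes.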

## Other useful tools

1. **Interactive Line Cleaning**

    SpecpolFlow also includes an interactive tool to visually inspect a spectrum, select/deselect lines, and compare with the model spectrum from an LSD profile calculated on the fly.

2. **Prune**

    Additionally, the Mask class has a `prune` function that removes all lines with `iuse = 0` from the mask object.

In [10]:
# reading in the cleaned mask that we created earlier
mask_clean = pol.read_mask('../GetStarted/OneObservationFlow_tutorialfiles/test_output/hd46328_test_depth0.02_clean.mask')
print('Number of lines in the clean mask with iuse = 0: {}'.format(np.count_nonzero(mask_clean.iuse == 0)))

# prune removes the lines flagged with iuse = 0
mask_clean_prune = mask_clean.prune()
print('Number of lines in the pruned mask with iuse = 0: {}'.format(np.count_nonzero(mask_clean_prune.iuse == 0)))

Number of lines in the clean mask with iuse = 0: 533
Number of lines in the pruned mask with iuse = 0: 0
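
In array terms, pruning is a boolean filter applied to every column of the mask at once. The sketch below shows the idea on hypothetical columns; it is not the `prune` implementation itself.

```python
import numpy as np

# Hypothetical mask columns: wavelength (nm), depth, and use flag
wl = np.array([450.0, 486.5, 600.0, 656.3])
depth = np.array([0.30, 0.12, 0.45, 0.08])
iuse = np.array([1, 0, 1, 0])

# Keep only the rows whose flag is non-zero, for every column at once
keep = iuse != 0
wl_pruned, depth_pruned, iuse_pruned = wl[keep], depth[keep], iuse[keep]

print(np.count_nonzero(iuse == 0), np.count_nonzero(iuse_pruned == 0))
```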


````{margin}
```{attention}
`get_weights` also returns weights for lines with `iuse = 0`, so make sure to prune the mask beforehand.
```
````

3. **Get Line Weights**

    We can calculate the LSD weight of all lines in the mask using the `get_weights` function. This function requires the following inputs:
    * `normDepth`: the normalizing line depth;
    * `normWave`: the normalizing wavelength in nm; 
    * `normLande`: the normalizing effective Lande factor.
    
    The function then outputs two arrays: the Stokes I weights and the Stokes V weights of the lines.

In [18]:
weightI, weightV = mask_clean_prune.get_weights(normDepth=0.2,normWave=500.0,normLande=1.2)

print(weightI)
print(weightV)

[1.735 1.99  0.14  ... 0.25  0.285 0.335]
[1.51113292 1.52142026 0.08239234 ... 0.39070416 0.4466502  0.52675226]
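
For reference, the usual LSD convention is that the Stokes I weight is the line depth divided by the normalizing depth, and the Stokes V weight is the product of depth, wavelength, and effective Lande factor divided by the product of the three normalizing values. The sketch below assumes `get_weights` follows this convention, which is not confirmed here, and uses hypothetical line parameters.

```python
import numpy as np

# Hypothetical line parameters: wavelength (nm), depth, effective Lande factor
wl = np.array([450.0, 500.0, 650.0])
depth = np.array([0.347, 0.2, 0.05])
lande = np.array([1.0, 1.2, 1.5])

# Same normalization values as in the cell above
norm_depth, norm_wave, norm_lande = 0.2, 500.0, 1.2

# Assumed LSD weight convention
weight_I = depth / norm_depth
weight_V = (depth * wl * lande) / (norm_depth * norm_wave * norm_lande)

print(weight_I)
print(weight_V)
```

A line whose depth, wavelength, and Lande factor all equal the normalizing values has a weight of 1 in both Stokes I and Stokes V, as the second hypothetical line shows.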
