## For 1015861
This notebook will be used to generate the cmask (overall) and the individual masks.
The plan is to make each half mask separately using the polygon feature, and then combine those masks with a general cmask to make the mask that actually goes into the calibration directory in the experiment folder. The individual half masks can then be written separately into another directory in the experiment folder, and the producer can be pointed to them so that we get two azimuthal averages (azavs).
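
A minimal sketch of one reading of this plan, using toy arrays. It assumes every mask is a boolean array of the same shape in which `True` means "keep this pixel"; that convention, the array shape, and the names `general_cmask`, `left_half`, and `right_half` are all assumptions for illustration (invert the logic if the masks mark bad pixels instead).

```python
import numpy as np

# Toy stand-ins for detector-shaped masks; True = keep the pixel (assumed convention).
shape = (4, 6)
general_cmask = np.ones(shape, dtype=bool)
general_cmask[0, 0] = False  # pretend one bad pixel was found

# Two hypothetical half masks, each keeping one half of the columns.
left_half = np.zeros(shape, dtype=bool)
left_half[:, :3] = True
right_half = ~left_half

# A pixel survives only if the general mask and the half mask both keep it.
left_cmask = general_cmask & left_half
right_cmask = general_cmask & right_half
```

With this convention the two combined masks never overlap, and together they cover exactly the pixels kept by the general cmask.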

# Mask Maker Notebook

This notebook will help you make a Jungfrau4M mask and apply it to the data stream.
## 0 - Loading the data
This cell will load pertinent data from the pre-processed .h5 files of three separate runs:
- Dark (no X-rays)
- Background (X-rays, no sample)
- Sample (X-rays, with sample)

#### Notes
__The following keys are required__ inside the .h5 files for the masking process.
- `'lightStatus/xray'`
- `'lightStatus/laser'`
- `'Sums/jungfrau4M_calib_dropped'`
- `'Sums/jungfrau4M_calib_xrayOn_thresADU1'`
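
It can be worth checking for these keys before starting. The helper below is a hypothetical sketch, not part of `xrayscatteringtools`: `missing_keys` accepts anything that supports the `in` operator with slash-separated keys, which an open `h5py.File` does. A plain dict stands in for the file here.

```python
REQUIRED_KEYS = (
    "lightStatus/xray",
    "lightStatus/laser",
    "Sums/jungfrau4M_calib_dropped",
    "Sums/jungfrau4M_calib_xrayOn_thresADU1",
)


def missing_keys(container, required=REQUIRED_KEYS):
    """Return the required keys that are absent from `container`.

    `container` can be anything that supports the `in` operator with
    slash-separated keys, such as an open `h5py.File`.
    """
    return [key for key in required if key not in container]


# Stand-in for an opened file: a plain dict that lacks one key.
fake_file = {
    "lightStatus/xray": None,
    "lightStatus/laser": None,
    "Sums/jungfrau4M_calib_dropped": None,
}
print(missing_keys(fake_file))
```

For a real run, the same call could be made inside `with h5py.File(path, "r") as f:`, where `path` is whichever pre-processed file you want to check.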

Some things can be auto-loaded in via the config.yaml specification. However, this is not required. You can add your own things here if you want.

In [None]:
import xrayscatteringtools as xrst
################ Edit these parameters before starting ################
# Run numbers
dark_run_number = 1
background_run_number = 2
sample_run_number = 3
# Experiment & path
experiment = xrst.get_config('expNumber') # Auto with config.yaml
data_path = xrst.get_data_paths(dark_run_number) # Auto with config.yaml
# Optimized geometry parameters, these are general. They do not have to be perfect.
x0 = 0
y0 = 0
z0 = 90_000 
photon_energy = xrst.get_config_for_runs(sample_run_number,'photon_energy','energy') # Auto with config.yaml
verbose=True
###############################################################################
# Create mask maker object
MaskMaker = xrst.calib.MaskMaker(
    experiment = experiment,
    data_path = data_path,
    dark_run_number=dark_run_number,
    background_run_number=background_run_number,
    sample_run_number=sample_run_number,
    verbose=verbose
)

## 1 - Process Dark run
This will take a __non-thresholded, pedestal-subtracted, unmasked average__ from the `Jungfrau4M.calib(cmpars=(7, 10, 0),mbits=0)` method from `psana`, which are the same images that are sent through the pre-processing notebook.
#### Parameters
- lb $\to$ Lower bound for cutoff. If set to `None`, it will prompt the user for an input and calculate the 0.5 percentile bound as a recommendation. Default is `None`.
- ub $\to$ Upper bound for cutoff. If set to `None`, it will prompt the user for an input and calculate the 99.5 percentile bound as a recommendation. Default is `None`.
- plotting $\to$ `bool`. Determines if plotting happens or not. Plotting slows down the function significantly as it calls `xrayscatteringtools.plot_j4m` three times among some other plots. Then again, you can see the masks. It can be nice. Default is `True`.
#### Notes
The upper and lower bounds are for determining the __inclusive data to keep__. Any pixels outside of the range are selected for masking.


The histograms which are plotted to help the user determine bounds visually have the x-axis scaled via `np.arcsinh()`, which takes the inverse hyperbolic sine. This is done because most pixels that need to be masked have very large values and would throw off the scale. $\text{arcsinh}(x) \approx x$ for $x \to 0$, and $\text{arcsinh}(x) \approx \ln(2x)$ for $x \to \infty$. In a way, it scales large numbers down similarly to taking a log, but it works with negatives. The bounds that the user selects are in the arcsinh-scaled units shown on the plot, __not__ the values before taking the arcsinh.
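
The two limits of the arcsinh scaling can be checked numerically. This sketch uses arbitrary sample values; it also shows that `np.sinh` maps a bound read off the scaled axis back to raw units, in case the raw value is of interest.

```python
import numpy as np

x = np.array([-1e4, -1e-3, 0.0, 1e-3, 1e4])
scaled = np.arcsinh(x)

# Near zero the transform is approximately the identity.
small_ok = np.isclose(scaled[3], x[3], rtol=1e-6)
# For large positive x it behaves like ln(2x).
large_ok = np.isclose(scaled[4], np.log(2 * x[4]), rtol=1e-6)
# Unlike a logarithm, it is defined for negatives and is odd: arcsinh(-x) = -arcsinh(x).
odd_ok = np.allclose(scaled, -scaled[::-1])

# A bound read off the arcsinh-scaled axis maps back to raw units with sinh.
bound_on_plot = 5.0
bound_raw = np.sinh(bound_on_plot)
```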

When `ub` and `lb` are `None`, the user may leave the input fields blank to use the recommended values. This will always result in 1% of the pixels being masked, which usually does the job.
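
A rough sketch of why the recommended bounds mask 1% of the pixels, using synthetic data in place of a real dark average. Computing the percentiles on the arcsinh-scaled values is an assumption about the internals. Because arcsinh is monotonic, the same pixels are selected on either scale in this example.

```python
import numpy as np

rng = np.random.default_rng(0)
dark_avg = rng.normal(loc=0.0, scale=1.0, size=100_000)  # synthetic stand-in for a dark average

# Recommended bounds: the 0.5th and 99.5th percentiles of the arcsinh-scaled values.
scaled = np.arcsinh(dark_avg)
lb, ub = np.percentile(scaled, [0.5, 99.5])

# Bounds are inclusive: pixels inside [lb, ub] are kept, the rest are selected for masking.
keep = (scaled >= lb) & (scaled <= ub)
masked_fraction = 1.0 - keep.mean()
print(f"masked fraction: {masked_fraction:.4f}")
```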

The main idea of this mask is to filter out bad pixels. If you are plotting the results, the last two Jungfrau images are color scaled automatically. A good mask is indicated when the masked detector image looks like "snow", i.e. you can see the inherent noise from the pixels.

In [None]:
MaskMaker.process_dark(
    lb=None,
    ub=None,
    plotting=True
)

## 2 - Process Background run
This will take a __1 ADU thresholded, pedestal-subtracted, dark-masked average__ from the `Jungfrau4M.calib(cmpars=(7, 10, 0),mbits=0)` method from `psana`, which are the same images that are sent through the pre-processing notebook.
#### Parameters
- lb $\to$ Lower bound for cutoff. If set to `None`, it will prompt the user for an input and recommend 0. Default is `None`.
- ub $\to$ Upper bound for cutoff. If set to `None`, it will prompt the user for an input and calculate the 99 percentile bound as a recommendation. Default is `None`.
- plotting $\to$ `bool`. Determines if plotting happens or not. Plotting slows down the function significantly as it calls `xrayscatteringtools.plot_j4m` three times among some other plots. Then again, you can see the masks. It can be nice. Default is `True`.
#### Notes
The upper and lower bounds are for determining the __inclusive data to keep__. Any pixels outside of the range are selected for masking.

When `ub` and `lb` are `None`, the user may leave the input fields blank to use the recommended values. This will always result in ~1% of the pixels being masked, which usually does the job.

When calculating the 99th percentile, it excludes pixels that have already been masked. Therefore, usually a little less than 1% of the total pixels are masked.
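
The effect of excluding already-masked pixels can be sketched with synthetic data. The 5% dark-mask fraction and the exponential background are made up for illustration, and `np.nanpercentile` is used here as one way to skip NaN-marked pixels, not necessarily the way the library does it.

```python
import numpy as np

rng = np.random.default_rng(1)
background_avg = rng.exponential(scale=1.0, size=10_000)  # synthetic stand-in

# Pretend the dark step already masked about 5% of the pixels; mark them with NaN.
dark_masked = rng.random(background_avg.size) < 0.05
values = np.where(dark_masked, np.nan, background_avg)

# The 99th percentile is computed over the unmasked pixels only.
lb = 0.0
ub = np.nanpercentile(values, 99)

# Only pixels that are still valid can be newly masked by the background step.
valid = ~dark_masked
newly_masked = valid & ((background_avg < lb) | (background_avg > ub))
fraction_of_total = newly_masked.mean()
print(f"newly masked fraction of all pixels: {fraction_of_total:.4f}")
```

The printed fraction comes out slightly below 1% because the 1% is taken from the valid pixels rather than from the full detector.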

The general idea of background masking is to remove any pixels which are consistently hit by X-rays from the background. The second Jungfrau-shaped plot already has the dark pixels masked (`np.nan` style) to highlight these over the bad pixels from the dark mask. The color limits are set automatically. Therefore, a good mask is when you can again see the inherent noise in pixels on the third plot. There is a block of overactive pixels which are in quadrant 5; this is expected.


In [None]:
MaskMaker.process_background(
    lb=None,
    ub=None,
    plotting=True
)

## 3 - Optionally, make polygon mask
This will take a __1 ADU thresholded, pedestal-subtracted, dark/background-masked sample average__ from the `Jungfrau4M.calib(cmpars=(7, 10, 0),mbits=0)` method from `psana`, which are the same images that are sent through the pre-processing notebook.

Then it will either draw a polygon interactively or from supplied vertices.

#### Parameters
- num_points $\to$ `int`. Number of polygon vertices (must be >= 3).
- points $\to$ list of (x, y) `tuples`, optional. Pre-defined vertices. If fewer than `num_points` are given, the user is prompted for the remainder.
- plotting $\to$ `bool`, optional. Show the masked result.
- z0 $\to$ `float`. Normal Z-distance from the scattering cell to the Jungfrau4M detector, in microns. Default is `90_000`.
- `*args`, `**kwargs`. Forwarded to `plot_j4m`.
#### Notes
If `points` are not (fully) provided, the user is prompted to enter coordinates one by one, with an updated plot shown after each entry for spatial context. The polygon is automatically closed.
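
For intuition about how a list of vertices becomes a pixel selection, here is a small even-odd ray-casting sketch in NumPy. It is not the library's implementation; `points_in_polygon` is a hypothetical helper, and whether the inside or the outside of the polygon ends up masked is not something this sketch decides.

```python
import numpy as np


def points_in_polygon(x, y, vertices):
    """Even-odd ray casting: True where (x, y) lies inside the polygon.

    `vertices` is a sequence of (x, y) pairs; the last vertex is joined
    back to the first, so the polygon is closed automatically.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    inside = np.zeros(x.shape, dtype=bool)
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        if y1 == y2:
            continue  # a horizontal edge never crosses a horizontal ray
        # Does a horizontal ray starting at the point cross this edge?
        straddles = (y1 > y) != (y2 > y)
        x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
        inside ^= straddles & (x < x_cross)
    return inside


# Toy square and four test points (arbitrary units).
square = [(-1, -1), (1, -1), (1, 1), (-1, 1)]
px = np.array([0.0, 2.0, 0.5, -3.0])
py = np.array([0.0, 0.0, 0.5, 0.0])
inside = points_in_polygon(px, py, square)
print(inside)
```

The same function works on full 2D coordinate arrays, so it could be applied to per-pixel x and y positions in microns.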

In [None]:
MaskMaker.apply_polygon_mask(
    num_points = 4,
    points = [
        # [-86180, -82594],
        # [86180, -82594],
        # [86180, 82594],
        # [-86180, 82594],
    ],
    plotting = True,
)

## 4 - Process Sample run
This will take a __1 ADU thresholded, pedestal-subtracted, dark/background/polygon-masked sample average__ from the `Jungfrau4M.calib(cmpars=(7, 10, 0),mbits=0)` method from `psana`, which are the same images that are sent through the pre-processing notebook.

Then, it will determine a q array, loop through it, determine the mean $\mu$ and standard deviation $\sigma$ of the distribution of Thomson-corrected intensity values for each q-bin, and filter out any pixels that are outside $\mu \pm n\cdot\sigma$, where $n$ is a keyword argument.

Then, it will check if 95% of the pixels are kept within these bounds. If so, it automatically adds the ring mask to the total mask. If not, a histogram will pop up and user input for lower and upper bounds will be required.
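
The per-q-bin filtering and the 95% check can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the helpers `q_from_geometry` and `sigma_clip_by_q`, the uniform binning, and the bin count are hypothetical, the data are synthetic, and the Thomson correction is omitted.

```python
import numpy as np


def q_from_geometry(x, y, x0, y0, z0, keV):
    """Momentum transfer q (1/Angstrom) for pixel positions given in microns."""
    wavelength = 12.398419843 / keV          # Angstrom, from hc = 12.398 keV*Angstrom
    r = np.hypot(x - x0, y - y0)
    two_theta = np.arctan2(r, z0)            # scattering angle
    return 4 * np.pi / wavelength * np.sin(two_theta / 2)


def sigma_clip_by_q(intensity, q, valid, n_std=2, n_bins=50):
    """True where a valid pixel lies within mean +/- n_std * std of its q-bin."""
    edges = np.linspace(q[valid].min(), q[valid].max(), n_bins + 1)
    bin_index = np.clip(np.digitize(q, edges) - 1, 0, n_bins - 1)
    keep = np.zeros(intensity.shape, dtype=bool)
    for b in range(n_bins):
        in_bin = valid & (bin_index == b)
        if not in_bin.any():
            continue
        mu = intensity[in_bin].mean()
        sigma = intensity[in_bin].std()
        keep |= in_bin & (np.abs(intensity - mu) <= n_std * sigma)
    return keep


# Synthetic detector: a smooth radial signal plus noise and one hot pixel.
rng = np.random.default_rng(0)
coords = np.linspace(-50_000, 50_000, 100)   # microns
x, y = np.meshgrid(coords, coords)
q = q_from_geometry(x, y, x0=0, y0=0, z0=90_000, keV=10)
intensity = 100 * np.exp(-q) + rng.normal(0, 1, q.shape)
intensity[50, 20] += 1000                     # hot pixel that should be rejected
valid = np.ones(q.shape, dtype=bool)          # nothing masked earlier in this toy case

keep = sigma_clip_by_q(intensity, q, valid, n_std=2)
kept_fraction = keep.sum() / valid.sum()
if kept_fraction >= 0.95:
    print(f"kept {kept_fraction:.1%}: accept automatically")
else:
    print(f"kept {kept_fraction:.1%}: fall back to manual bounds")
```

Passing the earlier dark and background masks through `valid` keeps already-masked pixels out of both the statistics and the kept fraction.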

#### Parameters
- n_std $\to$ `float`. How many standard deviations away from the mean for each q-bin. Equivalent to $n$ in the above section. Default is `2`.
- x0 $\to$ `float`. X-center of the beam relative to the Jungfrau4M detector coordinate, in microns. Default is `0`.
- y0 $\to$ `float`. Y-center of the beam relative to the Jungfrau4M detector coordinate, in microns. Default is `0`.
- z0 $\to$ `float`. Normal Z-distance from the scattering cell to the Jungfrau4M detector, in microns. Default is `90_000`.
- keV $\to$ `float`. X-ray photon energy in keV. Default is `10`.
- plotting $\to$ `bool`. Determines if extra plotting happens or not. It can be nice. Default is `True`.
#### Notes
Use `diagnose_q_bins()` to determine what to set `n_std` to before running `process_sample()`.

Upper and lower bounds are for determining the __inclusive data to keep__. Any pixels outside of the range are selected for masking.

When calculating the std, mean, and which pixels to mask, it excludes pixels that have already been masked (dark and background).

The general idea of sample masking is to remove any pixels which deviate from the mean: anything that the dark and background masks don't catch, as well as pixels that don't get hit by X-rays, such as the shadow of the tube protruding from the center of the detector. Good masking will cover the area that is shadowed and catch any stray pixels. However, if you see concentric rings, this is bad masking. If this happens, increase the `n_std` parameter.

In [None]:
# Use this function to preview per-q-bin statistics before running process_sample(). Usually only the first two bins or so should need manual review.
########
n_std = 2
########
MaskMaker.diagnose_q_bins(
    n_std=n_std,
    x0=x0,
    y0=y0,
    z0=z0,
    keV=photon_energy,
)

In [1]:
# With the optimal n_std set, now do the actual masking process.

In [None]:
MaskMaker.process_sample(
    n_std=n_std,
    x0=x0,
    y0=y0,
    z0=z0,
    keV=photon_energy,
    plotting=True
)

## 5 - Combine Masks
Combine the dark, background, and sample masks that were just created, along with a line and T mask which are constant. Plot all of the masks if specified.

#### Parameters
- plotting $\to$ `bool`. Determines if extra plotting happens or not. It can be nice. Default is `True`.
#### Notes
Mask inspection is crucial to getting good results. Check for any outlying data. For the last plot, which shows the fully masked sample data, the bounds are set automatically.

In [None]:
MaskMaker.combine_masks(
    plotting=True
)

## 6 - Save combined mask (cmask)
## For 1015861: Instead of running the cell below, you can save the mask to an arbitrary directory as an npz (smaller size) in a different way (look at the next cell)
Save the combined mask to a special directory in the experimental folder, which applies the mask to the valid runs. This mask is used to calculate the azimuthal average. It is __not__ applied to Jungfrau4M sums. The producer will also save it into the .h5 files under the key `'UserDataCfg/jungfrau4M/cmask'`.

#### Parameters
- valid_from_run $\to$ `int`. Determines where this mask will be valid from. If `None`, defaults to the run number for the vacuum background. Default is `None`.
- mask_directory $\to$ `str`. Where the mask will be saved. Only change this if you want to save to a different location. If `None`, will save to the specific directory so that it can be applied to data processing. Default is `None`.
#### Notes
Typically leave both of the keyword arguments as `None`.

In [None]:
MaskMaker.save_mask(
    valid_from_run=None,
    mask_directory=None
)

## For 1015861:
### Save mask to arbitrary location as npz
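
A quick round-trip sketch of the npz format, using a toy array and a temporary directory. The file name is hypothetical; the array is stored under the key `mask`, matching the keyword used in the cell below.

```python
import os
import tempfile

import numpy as np

mask = np.array([[True, False], [True, True]])  # toy stand-in for a detector mask

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "half_mask_example.npz")  # hypothetical file name
    np.savez(path, mask=mask)           # stored under the key "mask"
    with np.load(path) as archive:
        restored = archive["mask"]      # read back as a regular ndarray

print(restored.dtype, restored.shape)
```

`np.savez` writes an uncompressed archive. If file size matters, `np.savez_compressed` takes the same arguments.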

In [None]:
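import numpy as np  # needed for np.savez below

##### Half mask file #####
_half_mask_path = 'Name.npz'
##########################
# Save the combined mask as a boolean array under the key 'mask'
np.savez(_half_mask_path, mask=MaskMaker.cmask.astype(bool))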