Add flagging functions for power spectra #157

Closed
wants to merge 9 commits into from
2 changes: 1 addition & 1 deletion hera_pspec/__init__.py
@@ -1,7 +1,7 @@
"""
__init__.py file for hera_pspec
"""
-from hera_pspec import version, conversions, grouping, pspecbeam, plot, pstokes, testing
+from hera_pspec import version, conversions, grouping, pspecbeam, plot, pstokes, testing, flags
from hera_pspec import uvpspec_utils as uvputils

from hera_pspec.uvpspec import UVPSpec
83 changes: 83 additions & 0 deletions hera_pspec/flags.py
@@ -0,0 +1,83 @@
import numpy as np

def mask_generator(nsamples, flags, n_threshold, greedy=False, axis=None, greedy_threshold=None, retain_flags=True):
Review comment (Collaborator): Maybe a more descriptive function name, e.g. construct_factorizable_mask?

    """
    Generates a greedy flags mask from input flags and nsamples arrays

Review comment (Collaborator): I think it's worth briefly explaining how the algorithm works here.


    Parameters
    ----------
    nsamples : numpy.ndarray
        integer array with number of samples available for each frequency channel at a given LST angle

    flags : numpy.ndarray
        binary array with 1 representing flagged, 0 representing unflagged

Review comment (Collaborator): Is this the same as the output of UVData.get_flags(), or does that need to be modified in some way before passing it to this function?
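A minimal sketch bearing on the question above, assuming the flags arrive as a 2D boolean array (an assumption here; the PR does not state the input type). Because the function only tests `flags[i, j] == 1`, a boolean array and its 0/1 integer cast should behave the same way, so an explicit conversion may not be needed. The array below is a made-up stand-in, not real data.

```python
import numpy as np

# Hypothetical stand-in for a boolean flag array of shape (Ntimes, Nfreqs).
bool_flags = np.array([[True, False, False],
                       [False, False, True]])

# Explicit cast to the 0/1 convention described in the docstring.
int_flags = bool_flags.astype(int)

# The comparison used inside mask_generator (flags[i, j] == 1) gives the
# same element-wise answer for the boolean and the integer array.
same = np.array_equal(bool_flags == 1, int_flags == 1)
```

If the flags carry an extra axis (for example polarization), they would need to be reduced to 2D before being passed in; how to do that is left open here.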


    n_threshold : int
        minimum number of samples needed for a point to remain unflagged

    greedy : bool

Review comment (Collaborator): What if it's False?

        greedy flagging is used if true (default is False)

    axis : int

Review comment (Collaborator): This would be better off as a string, and maybe with a name change, e.g. first='row'.

        which axis to flag first if greedy=True (1 is row-first, 0 is col-first)
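One way the string-valued argument suggested in the review could be sketched is below. The helper name `resolve_first` and the accepted values `'row'` and `'col'` are hypothetical; only the integer convention (1 is row-first, 0 is col-first) comes from the docstring.

```python
def resolve_first(first):
    """Map a hypothetical first='row'/'col' argument onto the integer axis
    convention used in this PR (1 is row-first, 0 is col-first)."""
    axis_by_name = {"row": 1, "col": 0}
    if first not in axis_by_name:
        raise ValueError("first must be 'row' or 'col', got {!r}".format(first))
    return axis_by_name[first]
```

Validating the string up front means a typo raises immediately, whereas an unexpected integer `axis` in the current code would silently skip the greedy step.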

    greedy_threshold : float
        if greedy=True, the threshold used to flag rows or columns if axis=1 or 0, respectively

    retain_flags : bool
        LST-Bin Flags are left flagged even if thresholds are not met (default is True)

Review comment (Collaborator): The data going into this are not necessarily LST binned.


    Returns
    -------
    mask : numpy.ndarray
        binary array of the new mask where 1 is flagged, 0 is unflagged

    """

    shape = nsamples.shape
    flags_output = np.zeros(shape)

    num_exactly_equal = 0

    # comparing the number of samples to the threshold

    for i in range(shape[0]):

Review comment (Collaborator): Can this nested for loop perhaps be replaced by a couple of strategic calls to np.where? We can discuss this.

        for j in range(shape[1]):
            if nsamples[i, j] < n_threshold:
                flags_output[i, j] = 1
            elif nsamples[i, j] > n_threshold:
                if retain_flags and flags[i, j] == 1:
                    flags_output[i, j] = 1
                else:
                    flags_output[i, j] = 0
            elif nsamples[i, j] == n_threshold:
                if retain_flags and flags[i, j] == 1:
                    flags_output[i, j] = 1
                else:
                    flags_output[i, j] = 0
                num_exactly_equal += 1

    # the greedy part

    if axis == 0:
        if greedy:
            column_flags_counter = 0
            for j in range(shape[1]):
                if np.sum(flags_output[:, j])/shape[0] > greedy_threshold:
                    flags_output[:, j] = np.ones([shape[0]])
                    column_flags_counter += 1
            for i in range(shape[0]):
                if np.sum(flags_output[i, :]) > column_flags_counter:
                    flags_output[i, :] = np.ones([shape[1]])
    elif axis == 1:
        if greedy:
            row_flags_counter = 0
            for i in range(shape[0]):
                if np.sum(flags_output[i, :])/shape[1] > greedy_threshold:
                    flags_output[i, :] = np.ones([shape[1]])
                    row_flags_counter += 1
            for j in range(shape[1]):
                if np.sum(flags_output[:, j]) > row_flags_counter:
                    flags_output[:, j] = np.ones([shape[0]])

    return flags_output
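The loops above could be vectorized along the lines the review suggests. This is a sketch under stated assumptions, not the PR's implementation: the helper names `threshold_mask` and `greedy_step` are hypothetical, and the logic is intended to reproduce the two stages of the function (a per-sample threshold, then a two-pass greedy step). Within each greedy pass, flagging one row does not change the sums of other rows (and likewise for columns), which is why each pass can be done with a single boolean index.

```python
import numpy as np


def threshold_mask(nsamples, flags, n_threshold, retain_flags=True):
    """Hypothetical helper: flag samples with nsamples < n_threshold and,
    if retain_flags is True, keep any sample that was already flagged."""
    nsamples = np.asarray(nsamples)
    flags = np.asarray(flags)
    if retain_flags:
        kept = (flags == 1).astype(float)
    else:
        kept = np.zeros(nsamples.shape)
    # Below-threshold samples are flagged; the rest fall back to `kept`.
    return np.where(nsamples < n_threshold, 1.0, kept)


def greedy_step(mask, greedy_threshold, axis):
    """Hypothetical helper mirroring the greedy part of the PR.
    axis=1 is row-first, axis=0 is col-first, as in the docstring."""
    mask = np.array(mask, dtype=float)  # work on a copy
    if axis == 1:
        # Fully flag rows whose flagged fraction exceeds the threshold.
        full_rows = mask.mean(axis=1) > greedy_threshold
        mask[full_rows, :] = 1.0
        # Then fully flag columns with more flags than fully flagged rows.
        full_cols = mask.sum(axis=0) > full_rows.sum()
        mask[:, full_cols] = 1.0
    elif axis == 0:
        full_cols = mask.mean(axis=0) > greedy_threshold
        mask[:, full_cols] = 1.0
        full_rows = mask.sum(axis=1) > full_cols.sum()
        mask[full_rows, :] = 1.0
    return mask


# Small made-up example: one low-sample point and one pre-existing flag.
nsamples = np.array([[5, 1, 5],
                     [5, 5, 5]])
flags = np.array([[0, 0, 1],
                  [0, 0, 0]])
base = threshold_mask(nsamples, flags, n_threshold=3)
row_first = greedy_step(base, greedy_threshold=0.5, axis=1)
```

In this example the first row has two of three samples flagged after thresholding, so the row-first greedy step flags the whole row and leaves the second row untouched. The caller would decide whether to run `greedy_step` at all, matching the `greedy` switch in the PR.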