basically:
cooler.ICE(..., nnz_safety_margin=0)  # suggested value ~3-4
if 2 * cooler.info["nnz"] / cooler.info["nbins"] < nnz_safety_margin * min_nnz:
    do_nothing_and_dont_do_ICE
If nnz_safety_margin = 1 and cooler.ICE is performed, then exactly at the margin about 1/2 of the bins will be filtered out, which is bad. If nnz_safety_margin = 3, then maybe only 5-10% of the bins will be filtered out, and if the dataset is sparser than that, ICE will be aborted.
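A minimal sketch of the proposed check, written as a standalone function so it can be tried without touching the balancing code. The function name `should_skip_ice` and its arguments are hypothetical, not part of cooler; `info` stands in for the `cooler.info` dict and only needs the `"nnz"` and `"nbins"` keys. The factor of 2 assumes the pixel table stores only the upper triangle, so each stored nonzero pixel contributes to two bins.

```python
def should_skip_ice(info, min_nnz, nnz_safety_margin=0):
    """Return True if the matrix looks too sparse to balance safely.

    info: dict with "nnz" (stored nonzero pixels) and "nbins" (number of bins).
    min_nnz: per-bin nonzero threshold used by the min_nnz filter.
    nnz_safety_margin: hypothetical parameter; 0 disables the check.
    """
    # Approximate mean number of nonzero pixels per bin
    # (assumes upper-triangle storage, hence the factor of 2).
    mean_nnz_per_bin = 2 * info["nnz"] / info["nbins"]
    return mean_nnz_per_bin < nnz_safety_margin * min_nnz


# Example with made-up numbers: about 2 nonzero pixels per bin on average.
sparse_info = {"nnz": 1000, "nbins": 1000}
print(should_skip_ice(sparse_info, min_nnz=10, nnz_safety_margin=3))
```

With `nnz_safety_margin=0` the right-hand side is 0, so the check never triggers and the existing behavior is preserved.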
Relatedly, should the actual value that the mad_max filter thresholds at be reported in the balance.py stats? e.g. 'mad_max_cutoff': cutoff
This could be useful for debugging balancing.
Bumping this: for balancing, it would be useful to add an easy way to see what threshold, in terms of counts, a given mad_max value would produce for a given cooler. It would also be useful to save the number of counts as metadata.
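As a rough illustration of what reporting the cutoff could look like, here is a sketch that converts a mad_max value into a threshold on bin marginals. It assumes the filter works on the log of the nonzero marginals and drops bins more than `mad_max` median absolute deviations below the median; that is an assumption about the filter's behavior, not a copy of the balance.py code, and `mad_max_cutoff` here is a hypothetical helper name.

```python
import numpy as np


def mad_max_cutoff(marginals, mad_max):
    """Return the marginal value below which bins would be filtered.

    Assumes the cutoff is exp(median - mad_max * MAD), computed on the
    log of the nonzero marginals.
    """
    marg = np.asarray(marginals, dtype=float)
    log_nz = np.log(marg[marg > 0])            # ignore empty bins
    med = np.median(log_nz)
    mad = np.median(np.abs(log_nz - med))      # median absolute deviation
    return float(np.exp(med - mad_max * mad))


# Made-up marginals, for illustration only.
example_marginals = [0, 1, 10, 100, 1000, 10000]
stats = {"mad_max_cutoff": mad_max_cutoff(example_marginals, mad_max=1)}
print(stats)
```

A value like this could be stored alongside the other balancing stats so that the effective count threshold is visible when debugging.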