# Minimum Bin Count Analysis for Spatial Probability Heatmaps

This notebook provides methods for determining appropriate minimum bin count thresholds when creating probability heatmaps from spatial data across multiple subjects and trials. We'll analyze the statistical properties of different thresholds and their impact on the reliability of probability estimates.

In [None]:
# Import required libraries
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl
from matplotlib import cm
from matplotlib.patches import Polygon
from mpl_toolkits.axes_grid1 import make_axes_locatable
import pandas as pd
import scipy.stats as stats
from scipy.stats import binom
import seaborn as sns
import os
import pickle
import math

## 1. Load Sample Data

First, let's load a sample of the existing data or create simulated data that mimics your heatmap structure. We'll use this to explore different bin count thresholds.
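
For orientation, the analysis below works on two per-bin count arrays: a denominator (trials per bin) and a numerator (the subset of those trials with the outcome of interest). The following is a minimal sketch of how such arrays could be built from raw trial positions with `np.histogram2d`. The coordinates, the `chose_high` outcome vector, and the 2x2 grid are made up for illustration and are not taken from the original analysis.

```python
import numpy as np

# Hypothetical trial data: (x, y) position and a 0/1 outcome per trial
x = np.array([0.1, 0.2, 0.8, 0.9, 0.85])
y = np.array([0.1, 0.3, 0.7, 0.9, 0.95])
chose_high = np.array([1, 0, 1, 1, 0])

# A coarse 2x2 grid over the unit square (illustrative only)
edges = np.linspace(0, 1, 3)

# Denominator: number of trials per bin
seen, _, _ = np.histogram2d(x, y, bins=[edges, edges])
# Numerator: number of "success" trials per bin (the outcome acts as a weight)
chosen, _, _ = np.histogram2d(x, y, bins=[edges, edges], weights=chose_high)

# Per-bin probability, left at 0 where a bin has no trials
prob = np.divide(chosen, seen, out=np.zeros_like(chosen), where=seen > 0)
print(seen)
print(prob)
```

Note that `np.histogram2d` puts x along the first axis, so an array built this way may need transposing before it is passed to `imshow`.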

In [None]:
# Option 1: Load existing data if available
try:
    # Try to load saved bincount data from your existing analysis
    folder = "heatmap_phigh_variables"
    variable_name = 'competitive_see-low_choose-high'  # adjust as needed
    path = os.path.join(folder, variable_name)
    with open(path, 'rb') as f:
        bins_dict_wall_seen_wall_chosen, bins_dict_wall_seen = pickle.load(f)
    
    print("Loaded existing bin data:")
    print(f"Shape of denominator array: {bins_dict_wall_seen.shape}")
    print(f"Total trials in denominator: {np.sum(bins_dict_wall_seen)}")
    print(f"Range of bin counts: {np.min(bins_dict_wall_seen[bins_dict_wall_seen > 0])} to {np.max(bins_dict_wall_seen)}")
    
    # Create a boolean mask for bins with data
    valid_bins = bins_dict_wall_seen > 0
    print(f"Number of bins with data: {np.sum(valid_bins)} out of {bins_dict_wall_seen.size}")
    
    # Calculate probabilities
    probabilities = np.divide(
        bins_dict_wall_seen_wall_chosen, bins_dict_wall_seen,
        out=np.zeros_like(bins_dict_wall_seen_wall_chosen, dtype=float),
        where=bins_dict_wall_seen > 0
    )
    
    has_real_data = True
    
except Exception as e:
    print(f"Could not load existing data: {e}")
    print("Will create simulated data instead.")
    has_real_data = False

In [None]:
# Option 2: Create simulated data if needed
if not has_real_data:
    # Create a simulated dataset with 10x10 bins
    n_rows, n_cols = 10, 10
    
    # Simulate denominator counts (number of trials per bin)
    # Exponential falloff from the grid center, so central bins receive the most trials
    x, y = np.meshgrid(np.linspace(-1, 1, n_cols), np.linspace(-1, 1, n_rows))
    dist_from_center = np.sqrt(x**2 + y**2)
    mean_counts = 100 * np.exp(-2 * dist_from_center)  # Higher in center, lower at edges
    
    # Add random noise to counts (ensure integer values)
    bins_dict_wall_seen = np.random.poisson(mean_counts)
    
    # Create a gradient of probabilities (e.g., higher on one side)
    true_probabilities = 0.3 + 0.4 * (x + 1) / 2  # Probabilities range from 0.3 to 0.7
    
    # Generate binomial samples for each bin based on true probabilities.
    # np.random.binomial broadcasts over arrays, and bins with 0 trials yield 0 successes.
    bins_dict_wall_seen_wall_chosen = np.random.binomial(
        bins_dict_wall_seen, true_probabilities
    ).astype(float)
    
    # Calculate observed probabilities
    probabilities = np.divide(
        bins_dict_wall_seen_wall_chosen, bins_dict_wall_seen,
        out=np.zeros_like(bins_dict_wall_seen_wall_chosen, dtype=float),
        where=bins_dict_wall_seen > 0
    )
    
    print("Created simulated data:")
    print(f"Shape of denominator array: {bins_dict_wall_seen.shape}")
    print(f"Total trials in denominator: {np.sum(bins_dict_wall_seen)}")
    print(f"Range of bin counts: {np.min(bins_dict_wall_seen[bins_dict_wall_seen > 0])} to {np.max(bins_dict_wall_seen)}")

## 2. Confidence Interval Analysis

One principled way to choose a minimum bin count is to look at the width of the confidence interval around each bin's estimate. For a binomial proportion, that width depends on the sample size and on the proportion itself: it is widest at p = 0.5 and shrinks roughly in proportion to 1/sqrt(n).

Let's analyze how confidence interval width changes with different bin counts.
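
For reference, the Wilson score interval has the standard form below, where $\hat{p} = k/n$ is the observed proportion of $k$ successes in $n$ trials and $z$ is the standard normal quantile at $1 - \alpha/2$ (about 1.96 for a 95% interval):

$$
\frac{\hat{p} + \dfrac{z^2}{2n}}{1 + \dfrac{z^2}{n}}
\;\pm\;
\frac{z}{1 + \dfrac{z^2}{n}}
\sqrt{\frac{\hat{p}(1-\hat{p})}{n} + \frac{z^2}{4n^2}}
$$

At $\hat{p} = 0.5$ the full width simplifies to $z / \sqrt{n + z^2}$, which is the worst case for a given $n$.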

In [None]:
# Function to calculate binomial proportion confidence interval (Wilson score interval)
def wilson_ci(successes, trials, alpha=0.05):
    """
    Calculate Wilson score interval for binomial proportion.
    Returns lower and upper bounds of the confidence interval.
    """
    if trials == 0:
        # No data: the interval is uninformative, so return the full [0, 1] range
        return 0.0, 1.0
    
    p_hat = successes / trials
    z = stats.norm.ppf(1 - alpha/2)
    
    denominator = 1 + z**2/trials
    center = (p_hat + z**2/(2*trials)) / denominator
    spread = z * np.sqrt(p_hat * (1 - p_hat) / trials + z**2/(4*trials**2)) / denominator
    
    lower = max(0, center - spread)
    upper = min(1, center + spread)
    
    return lower, upper

# Calculate CI width for different bin counts and different true probabilities
bin_counts = np.arange(5, 101, 5)  # From 5 to 100 in steps of 5
prob_values = [0.1, 0.3, 0.5, 0.7, 0.9]
alpha = 0.05  # 95% confidence interval

ci_widths = {}
for p in prob_values:
    widths = []
    for n in bin_counts:
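        # Expected number of successes for this probability and bin count.
        # Kept fractional on purpose: int(n * p) truncates (int(5 * 0.5) == 2),
        # which would shift p_hat away from p for small or odd n.
        k = n * p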
        lower, upper = wilson_ci(k, n, alpha)
        widths.append(upper - lower)
    ci_widths[p] = widths

# Plot CI width vs. bin count for different probabilities
plt.figure(figsize=(10, 6))
for p in prob_values:
    plt.plot(bin_counts, ci_widths[p], label=f'p = {p}')

plt.axhline(y=0.2, linestyle='--', color='red', label='CI width = 0.2')
plt.axhline(y=0.1, linestyle='--', color='orange', label='CI width = 0.1')

plt.xlabel('Bin Count (Number of Trials)')
plt.ylabel('95% CI Width')
plt.title('Confidence Interval Width vs. Bin Count')
plt.legend()
plt.grid(True)
plt.show()

# Find bin counts needed to achieve certain CI widths
desired_widths = [0.3, 0.2, 0.15, 0.1]
p_middle = 0.5  # Worst-case scenario (widest CI at p=0.5)

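print("Minimum bin counts needed for different CI widths at p=0.5:")
for width in desired_widths:
    for i, n in enumerate(bin_counts):
        if ci_widths[p_middle][i] <= width:
            print(f"CI width ≤ {width}: {n} trials")
            break
    else:
        # No tested bin count was large enough; say so instead of printing nothing
        print(f"CI width ≤ {width}: more than {bin_counts[-1]} trials")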
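
The grid search above can be cross-checked with a closed form. At p_hat = 0.5 the Wilson width reduces to z / sqrt(n + z²), which can be solved for n directly. The sketch below assumes that worst case; the helper name `min_trials_for_width` is illustrative, and it uses the standard library's `NormalDist` in place of `scipy.stats.norm` only to stay dependency-free.

```python
import math
from statistics import NormalDist

def min_trials_for_width(target_width, alpha=0.05):
    """Smallest n whose Wilson interval at p_hat = 0.5 is no wider than target_width."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # At p_hat = 0.5 the Wilson width is z / sqrt(n + z**2);
    # requiring width <= target_width and solving for n gives:
    return max(1, math.ceil(z**2 * (1 / target_width**2 - 1)))

for w in (0.3, 0.2, 0.15, 0.1):
    print(f"CI width <= {w}: at least {min_trials_for_width(w)} trials")
```

This gives 39, 93, 167, and 381 trials respectively, which is consistent with the grid search: it only tests multiples of 5 up to 100, so it can report 40 and 95 but cannot reach the two narrower targets.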

## 3. Visualizing the Effect of Different Minimum Bin Counts

Let's see how applying different minimum bin count thresholds affects your heatmap visualization. We'll create multiple heatmaps with different thresholds and compare them.
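
As a small illustration of what thresholding does, the sketch below masks a toy 2x2 grid. The arrays and the threshold of 10 are invented for the example. Coverage is computed relative to bins that contain any data, as in the plotting function that follows.

```python
import numpy as np

# Toy per-bin probabilities and trial counts (illustrative values only)
probs = np.array([[0.2, 0.8],
                  [0.5, 0.0]])
counts = np.array([[3, 12],
                   [10, 0]])
threshold = 10

# Bins below the threshold become NaN so the colormap can draw them as "no data"
masked = np.where(counts >= threshold, probs, np.nan)

# Share of non-empty bins that survive the threshold
coverage = 100 * np.sum(counts >= threshold) / np.sum(counts > 0)

print(masked)
print(f"{coverage:.1f}% of bins with data pass the threshold")
```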

In [None]:
# Create a function to plot heatmap with different minimum bin count thresholds
def plot_heatmap_with_threshold(probabilities, bin_counts, threshold, ax, title):
    # Make a copy to avoid modifying the original
    prob_masked = probabilities.copy()
    
    # Apply threshold
    prob_masked[bin_counts < threshold] = np.nan
    
    # Count how many bins pass the threshold
    n_valid_bins = np.sum(bin_counts >= threshold)
    percent_valid = 100 * n_valid_bins / np.sum(bin_counts > 0)
    
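    # Setup colormap via the colormap registry (Matplotlib >= 3.5);
    # cm.get_cmap is no longer available in recent Matplotlib releases
    cmap = mpl.colormaps['inferno'].copy()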
    cmap.set_bad(color='lightgrey')
    norm = mpl.colors.Normalize(vmin=0, vmax=1)
    
    # Create the heatmap
    im = ax.imshow(prob_masked, origin='lower', norm=norm, cmap=cmap)
    
    # Add title with information about the threshold
    ax.set_title(f"{title}\n({n_valid_bins} bins, {percent_valid:.1f}% coverage)")
    
    # Remove ticks
    ax.set_xticks([])
    ax.set_yticks([])
    
    return im

# Plot multiple heatmaps with different thresholds
thresholds = [5, 10, 20, 30]
fig, axs = plt.subplots(2, 2, figsize=(12, 10))
axs = axs.flatten()

for i, thresh in enumerate(thresholds):
    im = plot_heatmap_with_threshold(
        probabilities, 
        bins_dict_wall_seen, 
        thresh, 
        axs[i], 
        f"Min {thresh} trials/bin"
    )

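# Lay out the subplots first, leaving room on the right for the colorbar
plt.tight_layout(rect=[0, 0, 0.9, 1])

# Add a shared colorbar in the reserved space
cbar_ax = fig.add_axes([0.92, 0.15, 0.02, 0.7])
cbar = fig.colorbar(im, cax=cbar_ax)
cbar.set_label("Probability of Choosing High Wall", fontsize=12)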
plt.show()

## 4. Bootstrapping Analysis for Stability of Probability Estimates

Another approach to determine an appropriate minimum bin count is to use bootstrapping to assess the stability of probability estimates at different sample sizes.
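
Resampling n binary outcomes with replacement is equivalent to drawing the number of successes from Binomial(n, k/n), where k is the number of successes in the original sample. The sketch below uses that shortcut to get a bootstrap percentile interval without an explicit resampling loop. The function name `bootstrap_ci_width` and the fixed seed are illustrative choices.

```python
import numpy as np

def bootstrap_ci_width(successes, trials, n_bootstrap=1000, seed=0):
    """Width of the 95% percentile-bootstrap interval for a binomial proportion."""
    rng = np.random.default_rng(seed)
    p_hat = successes / trials
    # Each bootstrap replicate's success count follows Binomial(trials, p_hat)
    estimates = rng.binomial(trials, p_hat, size=n_bootstrap) / trials
    lower, upper = np.percentile(estimates, [2.5, 97.5])
    return upper - lower

w_small = bootstrap_ci_width(5, 10)
w_large = bootstrap_ci_width(50, 100)
print(f"n=10:  width {w_small:.3f}")
print(f"n=100: width {w_large:.3f}")
```

One caveat of any bootstrap at small n: if the original sample happens to contain only successes or only failures, every replicate is identical and the interval collapses to zero width, which overstates the precision.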

In [None]:
# Function to perform bootstrapping for a given bin count and true probability
def bootstrap_probability(n_samples, true_prob, n_bootstrap=1000):
    """
    Estimate the variability of probability estimates through bootstrapping.
    
    Parameters:
    - n_samples: Number of samples (bin count)
    - true_prob: True probability for generating data
    - n_bootstrap: Number of bootstrap iterations
    
    Returns:
    - bootstrap_estimates: Array of probability estimates from bootstrap samples
    """
    # Generate the original sample (successes/failures)
    original_data = np.random.binomial(1, true_prob, n_samples)
    
    # Bootstrap resampling
    bootstrap_estimates = []
    for _ in range(n_bootstrap):
        # Resample with replacement
        resampled_data = np.random.choice(original_data, size=n_samples, replace=True)
        # Calculate probability estimate
        p_estimate = np.mean(resampled_data)
        bootstrap_estimates.append(p_estimate)
    
    return np.array(bootstrap_estimates)

# Test various bin counts and probabilities
bin_counts_to_test = [5, 10, 15, 20, 30, 50, 100]
true_probs = [0.2, 0.5, 0.8]
n_bootstrap = 1000

# Store results in a dictionary
bootstrap_results = {}

for true_p in true_probs:
    bootstrap_results[true_p] = {}
    for n in bin_counts_to_test:
        bootstrap_estimates = bootstrap_probability(n, true_p, n_bootstrap)
        bootstrap_results[true_p][n] = {
            'estimates': bootstrap_estimates,
            'mean': np.mean(bootstrap_estimates),
            'std': np.std(bootstrap_estimates),
            '95ci_width': np.percentile(bootstrap_estimates, 97.5) - np.percentile(bootstrap_estimates, 2.5)
        }

# Plot the distribution of bootstrap estimates for different bin counts
fig, axs = plt.subplots(len(true_probs), len(bin_counts_to_test), figsize=(20, 12), sharey='row')

for i, true_p in enumerate(true_probs):
    for j, n in enumerate(bin_counts_to_test):
        results = bootstrap_results[true_p][n]
        
        axs[i, j].hist(results['estimates'], bins=30, alpha=0.7, color='skyblue')
        axs[i, j].axvline(true_p, color='red', linestyle='--', linewidth=2)
        axs[i, j].set_title(f'n={n}, p={true_p}')
        
        if j == 0:
            axs[i, j].set_ylabel('Frequency')
        if i == len(true_probs)-1:
            axs[i, j].set_xlabel('Probability Estimate')

plt.tight_layout()
plt.show()

# Plot the 95% CI width vs bin count
plt.figure(figsize=(10, 6))
for true_p in true_probs:
    # Use a separate name so the ci_widths dictionary from Section 2 is not overwritten
    boot_ci_widths = [bootstrap_results[true_p][n]['95ci_width'] for n in bin_counts_to_test]
    plt.plot(bin_counts_to_test, boot_ci_widths, marker='o', label=f'p={true_p}')

plt.xlabel('Bin Count')
plt.ylabel('Width of 95% CI')
plt.title('Bootstrap 95% CI Width vs. Bin Count')
plt.axhline(y=0.2, linestyle='--', color='red', label='Width = 0.2')
plt.axhline(y=0.1, linestyle='--', color='orange', label='Width = 0.1')
plt.grid(True)
plt.legend()
plt.show()

# Calculate minimum bin counts needed for different levels of precision
precision_levels = [0.3, 0.2, 0.15, 0.1]
print("\nMinimum bin counts needed for different bootstrap CI widths:")
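for precision in precision_levels:
    print(f"\nFor 95% CI width ≤ {precision}:")
    for true_p in true_probs:
        for n in bin_counts_to_test:
            if bootstrap_results[true_p][n]['95ci_width'] <= precision:
                print(f"  p={true_p}: {n} trials")
                break
        else:
            # No tested bin count reached this precision; say so instead of printing nothing
            print(f"  p={true_p}: more than {bin_counts_to_test[-1]} trials")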

## 5. Statistical Power Analysis

When determining a minimum bin count, it's also important to consider the statistical power needed to detect meaningful differences in probabilities.

For example, if we want to detect a difference of 0.2 between two probabilities with adequate power (e.g., 80%), how many samples do we need per bin?
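
The usual normal-approximation formula for the per-group sample size in a two-sided comparison of two proportions is

$$
n = \frac{\left(z_{1-\alpha/2} + z_{\text{power}}\right)^2 \cdot 2\,\bar{p}\,(1-\bar{p})}{\delta^2},
\qquad \bar{p} = \frac{p_1 + p_2}{2}, \quad \delta = |p_2 - p_1| .
$$

As a worked example, take $p_1 = 0.4$, $p_2 = 0.6$, $\alpha = 0.05$ and 80% power, so $z_{0.975} \approx 1.960$, $z_{0.80} \approx 0.842$, $\bar{p} = 0.5$ and $\delta = 0.2$:

$$
n \approx \frac{(1.960 + 0.842)^2 \cdot 2 \cdot 0.5 \cdot 0.5}{0.2^2} \approx \frac{7.85 \cdot 0.5}{0.04} \approx 98.1,
$$

which rounds up to 99 trials per bin.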

In [None]:
from scipy.stats import norm

def power_analysis_proportions(p1, p2, power=0.8, alpha=0.05):
    """
    Calculate the sample size needed to detect a difference between two proportions
    with specified power using a two-tailed z-test.
    
    Parameters:
    - p1: First proportion
    - p2: Second proportion (the difference p2-p1 is the effect size)
    - power: Desired statistical power (default: 0.8)
    - alpha: Significance level (default: 0.05)
    
    Returns:
    - n: Required sample size in each group
    """
    # Effect size
    delta = abs(p2 - p1)
    
    # Z critical values
    z_alpha = norm.ppf(1 - alpha/2)
    z_beta = norm.ppf(power)
    
    # Pooled proportion
    p_pool = (p1 + p2) / 2
    
    # Calculate required sample size
    n = ((z_alpha + z_beta)**2 * 2 * p_pool * (1 - p_pool)) / delta**2
    
    return math.ceil(n)

# Calculate sample sizes needed for different effect sizes
effect_sizes = [0.05, 0.1, 0.15, 0.2, 0.25, 0.3]
baseline_probs = [0.2, 0.3, 0.4, 0.5]
power_level = 0.8
alpha_level = 0.05

# Create a table of results
print(f"Sample sizes needed to detect differences with {power_level:.0%} power:")
print("\nEffect size | Baseline probability | Required sample size per bin")
print("-" * 60)

for p_baseline in baseline_probs:
    for effect in effect_sizes:
        p2 = p_baseline + effect
        if p2 <= 1.0:  # Ensure valid probability
            n = power_analysis_proportions(p_baseline, p2, power_level, alpha_level)
            print(f"{effect:.2f}      | {p_baseline:.2f}                | {n}")

# Plot required sample size vs. effect size for different baseline probabilities
plt.figure(figsize=(10, 6))

for p in baseline_probs:
    sizes = []
    valid_effects = []
    for effect in effect_sizes:
        if p + effect <= 1.0:
            valid_effects.append(effect)
            sizes.append(power_analysis_proportions(p, p + effect, power_level, alpha_level))
    plt.plot(valid_effects, sizes, marker='o', label=f'p={p}')

plt.xlabel('Effect Size (Difference in Proportions)')
plt.ylabel(f'Required Sample Size for {power_level:.0%} Power')
plt.title('Sample Size Requirements for Different Effect Sizes and Baseline Probabilities')
plt.grid(True)
plt.legend()
plt.yscale('log')
plt.show()

## 6. Analysis of Your Current Threshold

You mentioned you're currently using a threshold of 11 trials per bin. Let's analyze the statistical properties of this threshold and see if it's appropriate.

In [None]:
# Your current threshold
current_threshold = 11

# Calculate the properties of this threshold
p_middle = 0.5  # Worst-case scenario for CI width

# Calculate CI width at this threshold.
# k is kept fractional so p_hat stays at exactly 0.5 even for odd thresholds.
k = current_threshold * p_middle
lower, upper = wilson_ci(k, current_threshold)
ci_width = upper - lower

print(f"For a threshold of {current_threshold} trials per bin:")
print(f"- 95% CI width at p=0.5: {ci_width:.3f}")
print(f"- This means the true probability could be approximately ±{ci_width/2:.3f} around the observed probability")

# Effect size detectable with 80% power
from scipy.stats import norm

def min_detectable_effect(n, power=0.8, alpha=0.05, p=0.5):
    """Calculate the minimum detectable effect size with given sample size and power"""
    z_alpha = norm.ppf(1 - alpha/2)
    z_beta = norm.ppf(power)
    min_effect = (z_alpha + z_beta) * np.sqrt(2 * p * (1-p) / n)
    return min_effect

min_effect = min_detectable_effect(current_threshold)
print(f"- Minimum effect size detectable with 80% power: {min_effect:.3f}")
print(f"  (This is the minimum difference in probability that can be reliably detected)")

# Check how many of your bins would pass this threshold
bins_above_threshold = np.sum(bins_dict_wall_seen >= current_threshold)
total_bins_with_data = np.sum(bins_dict_wall_seen > 0)
coverage_percentage = 100 * bins_above_threshold / total_bins_with_data

print(f"\nCoverage with threshold of {current_threshold}:")
print(f"- Bins above threshold: {bins_above_threshold} out of {total_bins_with_data} ({coverage_percentage:.1f}%)")

# Plot histogram of bin counts to see their distribution
plt.figure(figsize=(10, 6))
non_zero_counts = bins_dict_wall_seen[bins_dict_wall_seen > 0].flatten()
plt.hist(non_zero_counts, bins=30, alpha=0.7, color='skyblue')
plt.axvline(current_threshold, color='red', linestyle='--', linewidth=2, label=f'Current threshold ({current_threshold})')

# Suggest alternative thresholds
suggested_thresholds = [5, 10, 15, 20, 25, 30]
colors = ['orange', 'green', 'purple', 'brown', 'pink', 'gray']

for i, thresh in enumerate(suggested_thresholds):
    if thresh != current_threshold:  # Don't duplicate the current threshold
        pct = 100 * np.sum(bins_dict_wall_seen >= thresh) / total_bins_with_data
        plt.axvline(thresh, color=colors[i % len(colors)], linestyle='--', linewidth=2, 
                   label=f'Threshold {thresh} ({pct:.1f}% coverage)')

plt.xlabel('Bin Count')
plt.ylabel('Number of Bins')
plt.title('Distribution of Trial Counts per Bin')
plt.grid(True)
plt.legend()
plt.show()

## 7. Recommended Threshold Selection

Based on the analyses above, we can make an informed decision about an appropriate minimum bin count threshold for your heatmap visualization.

In [None]:
# Calculate the tradeoff between statistical quality and data coverage
# For different thresholds
threshold_options = list(range(5, 51, 5))
threshold_metrics = []

for thresh in threshold_options:
    # Calculate coverage
    coverage = 100 * np.sum(bins_dict_wall_seen >= thresh) / np.sum(bins_dict_wall_seen > 0)
    
    # Calculate CI width at p=0.5 (worst case).
    # k is kept fractional so p_hat stays at exactly 0.5 even for odd thresholds.
    k = thresh * 0.5
    lower, upper = wilson_ci(k, thresh)
    ci_width = upper - lower
    
    # Calculate minimum detectable effect with 80% power
    min_effect = min_detectable_effect(thresh)
    
    threshold_metrics.append({
        'threshold': thresh,
        'coverage': coverage,
        'ci_width': ci_width,
        'min_detectable_effect': min_effect
    })

# Convert to DataFrame for easier analysis
df_metrics = pd.DataFrame(threshold_metrics)

# Plot the tradeoff between coverage and statistical quality
fig, ax1 = plt.subplots(figsize=(10, 6))

color = 'tab:blue'
ax1.set_xlabel('Minimum Bin Count Threshold')
ax1.set_ylabel('Coverage (%)', color=color)
ax1.plot(df_metrics['threshold'], df_metrics['coverage'], color=color, marker='o')
ax1.tick_params(axis='y', labelcolor=color)

ax2 = ax1.twinx()
color = 'tab:red'
ax2.set_ylabel('95% CI Width', color=color)
ax2.plot(df_metrics['threshold'], df_metrics['ci_width'], color=color, marker='s')
ax2.tick_params(axis='y', labelcolor=color)

plt.title('Tradeoff: Coverage vs. Statistical Precision')
plt.grid(True)
plt.show()

# Print recommendation table
print("Threshold Selection Guide:")
print("\nThreshold | Coverage (%) | 95% CI Width | Min. Detectable Effect")
print("-" * 65)

for _, row in df_metrics.iterrows():
    print(f"{row['threshold']:8.0f} | {row['coverage']:11.1f} | {row['ci_width']:11.3f} | {row['min_detectable_effect']:21.3f}")

# Provide a final recommendation based on the analysis
print("\nRecommendation:")
print("Based on the analyses above, consider the following guidelines:")

# Find the smallest threshold whose CI width falls in the 0.25-0.30 range
for _, row in df_metrics.iterrows():
    if 0.25 <= row['ci_width'] <= 0.30:
        print(f"1. A threshold of {row['threshold']:.0f} trials provides a reasonable balance between coverage "
              f"({row['coverage']:.1f}%) and statistical precision (95% CI width of {row['ci_width']:.3f}).")
        break
else:
    print("1. No tested threshold gives a 95% CI width between 0.25 and 0.30.")

# Find the smallest threshold that can detect moderate effect sizes (around 0.2)
for _, row in df_metrics.iterrows():
    if row['min_detectable_effect'] <= 0.2:
        print(f"2. To reliably detect moderate effect sizes (differences in probability of 0.2 or greater), "
              f"consider a threshold of at least {row['threshold']:.0f} trials per bin.")
        break
else:
    print(f"2. None of the tested thresholds (up to {threshold_options[-1]} trials) can detect a "
          "difference of 0.2 with 80% power; larger bin counts would be needed.")

print(f"3. Your current threshold of {current_threshold} trials is reasonable if you're primarily concerned with "
      "identifying large effects while maintaining good spatial coverage.")

## 8. Apply the Selected Threshold to Your Data

Based on the analysis above, let's apply an appropriate threshold to your data and create the final visualization.

In [None]:
# Let's implement a visualization that allows you to explore different thresholds interactively
from ipywidgets import interact, widgets

@interact(threshold=widgets.IntSlider(min=5, max=50, step=5, value=11, 
                                    description='Min Trials:'))
def update_threshold(threshold):
    """Interactive visualization to explore different minimum bin count thresholds"""
    # Apply the threshold
    mask = bins_dict_wall_seen >= threshold
    p_masked = np.ma.masked_where(~mask, probabilities)
    
    # Coverage statistics
    total_bins = np.sum(bins_dict_wall_seen > 0)
    valid_bins = np.sum(mask)
    coverage = 100 * valid_bins / total_bins
    
    # Statistical properties (fractional k keeps p_hat at exactly 0.5 for odd thresholds)
    k = threshold * 0.5
    lower, upper = wilson_ci(k, threshold)
    ci_width = upper - lower
    min_effect = min_detectable_effect(threshold)
    
    # Create the visualization
    fig, ax = plt.subplots(figsize=(10, 8))
    
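    # Setup colormap via the colormap registry (Matplotlib >= 3.5);
    # cm.get_cmap is no longer available in recent Matplotlib releases
    cmap = mpl.colormaps['inferno'].copy()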
    cmap.set_bad(color='lightgrey')
    norm = mpl.colors.Normalize(vmin=0, vmax=1)
    
    # Create the heatmap
    im = ax.imshow(p_masked, origin='lower', norm=norm, cmap=cmap)
    
    # Add colorbar
    divider = make_axes_locatable(ax)
    cax = divider.append_axes("right", size="5%", pad=0.05)
    cbar = plt.colorbar(im, cax=cax)
    cbar.set_label("Probability", fontsize=12)
    
    # Remove ticks
    ax.set_xticks([])
    ax.set_yticks([])
    
    # Add title with statistical info
    plt.suptitle(f"Probability Heatmap with Minimum {threshold} Trials per Bin", fontsize=14)
    ax.set_title(f"Coverage: {valid_bins}/{total_bins} bins ({coverage:.1f}%)\n" +
                f"95% CI Half-Width: ±{ci_width/2:.3f}, Min Detectable Effect: {min_effect:.3f}", 
                fontsize=10)
    
    plt.tight_layout()
    plt.show()
    
    # Print detailed statistics
    print(f"Threshold = {threshold} trials:")
    print(f"- Valid bins: {valid_bins}/{total_bins} ({coverage:.1f}%)")
    print(f"- 95% CI half-width at p=0.5: ±{ci_width/2:.3f}")
    print(f"- Minimum detectable effect with 80% power: {min_effect:.3f}")

## Summary and Conclusions

Based on the analyses in this notebook, we can draw the following conclusions about minimum bin count thresholds for spatial probability heatmaps:

1. **Statistical Considerations**:
   - For reliable probability estimates, the minimum bin count should be determined based on the desired precision of your estimates
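   - With 11 trials per bin, the 95% Wilson interval at p = 0.5 is about 0.51 wide, i.e. roughly ±0.25 around the estimated probability
   - To detect smaller effect sizes, higher bin counts are needed: 11 trials per bin can only detect differences of about 0.6 with 80% power, while detecting a difference of 0.2 takes roughly 100 trials per bin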

2. **Coverage vs. Precision Tradeoff**:
   - Increasing the threshold improves statistical reliability but reduces spatial coverage
   - The optimal threshold depends on your specific research questions and what you consider a meaningful effect size

3. **Recommendations**:
   - For exploratory analyses: 10-15 trials per bin provide reasonable coverage while filtering out the most unreliable estimates (95% CI half-width of roughly ±0.23 to ±0.26 at p = 0.5)
   - For confirmatory analyses: 20-30 trials per bin give more reliable probability estimates (95% CI half-width of roughly ±0.17 to ±0.20 at p = 0.5)
   - When presenting results: Clearly indicate the minimum bin count threshold used and consider showing how results change with different thresholds

4. **Implementation**:
   - Continue using your current approach of masking bins with fewer than the threshold number of trials
   - Consider reporting confidence intervals alongside probability estimates for key regions of interest

This framework can be applied to any spatial binning analysis to determine appropriate minimum bin counts based on statistical principles rather than arbitrary thresholds.
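
To report confidence intervals alongside the per-bin estimates, the Wilson interval can be evaluated on the whole count arrays at once. The following is a minimal sketch under a few assumptions: the name `wilson_ci_array` is illustrative, empty bins are given the uninformative interval [0, 1], and the standard library's `NormalDist` stands in for `scipy.stats.norm`. The example counts are made up.

```python
import numpy as np
from statistics import NormalDist

def wilson_ci_array(successes, trials, alpha=0.05):
    """Element-wise Wilson score interval for arrays of success and trial counts."""
    successes = np.asarray(successes, dtype=float)
    trials = np.asarray(trials, dtype=float)
    z = NormalDist().inv_cdf(1 - alpha / 2)

    # Assumption: bins without data get the full [0, 1] interval
    lower = np.zeros_like(trials)
    upper = np.ones_like(trials)

    has_data = trials > 0
    n = trials[has_data]
    p_hat = successes[has_data] / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    spread = z * np.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom

    lower[has_data] = np.clip(center - spread, 0, 1)
    upper[has_data] = np.clip(center + spread, 0, 1)
    return lower, upper

# Made-up counts: numerator (chosen) and denominator (seen) per bin
chosen = np.array([[5, 0], [11, 50]])
seen = np.array([[11, 0], [11, 100]])
lower, upper = wilson_ci_array(chosen, seen)
print(np.round(upper - lower, 3))
```

The resulting `upper - lower` array can be plotted with the same masking and colormap as the probability heatmap to show where the estimates are least certain.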