[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/GabbyTab/boofun/blob/main/notebooks/lecture2_linearity_testing.ipynb)

# Lecture 2: Convolution and Linearity Testing

**Based on CS294-92: Analysis of Boolean Functions (Spring 2025)**  
**Instructor: Avishay Tal**  
**Based on lecture notes by: Austin Pechan**  
**Notebook by: Gabriel Taboada**

This notebook covers:
1. Expectation and Variance of Boolean Functions
2. Convolution in Fourier Domain
3. The BLR Linearity Test
4. Local Correction

---

## Key Theorems (Recap)

- **Fundamental Theorem**: $f(x) = \sum_{S \subseteq [n]} \hat{f}(S) \cdot \chi_S(x)$
- **Plancherel**: $\langle f, g \rangle = \sum_S \hat{f}(S) \hat{g}(S)$
- **Parseval**: $\sum_S \hat{f}(S)^2 = \mathbf{E}[f(x)^2]$, which equals $1$ for $\pm 1$-valued $f$
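
To make the recap concrete, here is a minimal brute-force sketch that does not use `boofun`. It assumes the $\pm 1$ input convention, and the helper names (`chi`, `subsets`, `fourier_coefficients`, `maj3`) are illustrative only, not part of any library.

```python
from itertools import product

def chi(S, x):
    """Character chi_S(x): the product of x_i over i in S, for x in {+1,-1}^n."""
    result = 1
    for i in S:
        result *= x[i]
    return result

def subsets(n):
    """All subsets of {0, ..., n-1}, each as a tuple of indices."""
    return [tuple(i for i in range(n) if (mask >> i) & 1) for mask in range(2 ** n)]

def fourier_coefficients(f, n):
    """Brute-force f_hat(S) = E_x[f(x) * chi_S(x)] over all 2^n inputs."""
    points = list(product([1, -1], repeat=n))
    return {S: sum(f(x) * chi(S, x) for x in points) / len(points) for S in subsets(n)}

def maj3(x):
    """3-bit majority with +/-1 inputs and outputs."""
    return 1 if sum(x) > 0 else -1

coeffs = fourier_coefficients(maj3, 3)

# Parseval: the squared coefficients of a +/-1-valued function sum to 1
parseval = sum(c ** 2 for c in coeffs.values())

# Fundamental Theorem: summing f_hat(S) * chi_S(x) over S rebuilds f(x) exactly
points = list(product([1, -1], repeat=3))
reconstruction_ok = all(
    abs(sum(coeffs[S] * chi(S, x) for S in coeffs) - maj3(x)) < 1e-12 for x in points
)

print(parseval, reconstruction_ok)
```

The loop costs $O(4^n)$ time, so it is only meant for checking small examples by hand.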

In [1]:
# Install/upgrade boofun (required for Colab)
# This ensures you have the latest version with all features
!pip install --upgrade boofun -q

import boofun as bf
print(f"BooFun version: {bf.__version__}")

# Verify local_correct is available (added in v1.1)
from boofun.analysis import PropertyTester
assert hasattr(PropertyTester, 'local_correct'), "Please restart runtime and re-run this cell"


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m25.2[0m[39;49m -> [0m[32;49m25.3[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49m/Library/Developer/CommandLineTools/usr/bin/python3 -m pip install --upgrade pip[0m






BooFun version: 1.1.1


In [2]:
import numpy as np
from boofun.analysis import SpectralAnalyzer
from boofun.analysis import PropertyTester
from boofun.analysis.fourier import parseval_verify, plancherel_inner_product, convolution

import warnings
warnings.filterwarnings('ignore')

## 2.1 Expectation and Variance

**Fact 2.6**: $\mathbf{E}[f(x)] = \hat{f}(\emptyset)$

The constant Fourier coefficient is the expectation!

**Fact 2.7**: $\mathbf{Var}[f] = \sum_{S \neq \emptyset} \hat{f}(S)^2$

For Boolean functions: $\mathbf{Var}[f] = 1 - \hat{f}(\emptyset)^2$
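
Both facts follow in one line from Parseval and Fact 2.6. The derivation below is a sketch, written for real-valued $f$ so that $\mathbf{E}[f^2]$ appears explicitly:

$$\mathbf{Var}[f] = \mathbf{E}[f^2] - \mathbf{E}[f]^2 = \sum_{S} \hat{f}(S)^2 - \hat{f}(\emptyset)^2 = \sum_{S \neq \emptyset} \hat{f}(S)^2$$

When $f$ takes values in $\{\pm 1\}$ we have $\mathbf{E}[f^2] = 1$, which gives $\mathbf{Var}[f] = 1 - \hat{f}(\emptyset)^2$. For example, a function with $\hat{f}(\emptyset) = 0.75$ has variance $1 - 0.5625 = 0.4375$.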

In [3]:
# Demonstrate expectation = f̂(∅) across many function families
# The library provides f.bias() = f.fourier()[0] = E[f] directly!

functions = {
    # Basic gates
    "AND₃": bf.AND(3),
    "OR₃": bf.OR(3),
    "PARITY₃": bf.parity(3),
    # Threshold functions
    "MAJORITY₃": bf.majority(3),
    "THRESHOLD₃,₂": bf.threshold(3, 2),  # ≥2 of 3 bits
    # Structured functions
    "TRIBES(2,3)": bf.tribes(2, 3),  # OR of 3 ANDs of 2
    "DICTATOR₀": bf.dictator(3, 0),  # Just x₀
    # Random function for comparison
    "RANDOM₃": bf.random(3, seed=42),
}

print("Expectation = f̂(∅) = f.bias() verification:")
print("=" * 65)
print(f"{'Function':<15} | {'f.bias()':<10} | {'f̂(∅)':<10} | {'balanced?':<10} | Interpretation")
print("-" * 65)

for name, f in functions.items():
    # Use library methods directly!
    bias = f.bias()              # Returns E[f] = f̂(∅) in ±1 convention
    f_hat_empty = f.fourier()[0] # Same value
    is_bal = f.is_balanced()     # True if E[f] ≈ 0
    
    # Interpret the expectation
    if is_bal:
        interp = "balanced"
    elif bias > 0:
        interp = f"biased to +1 ({(1+bias)/2:.0%} are 0)"
    else:
        interp = f"biased to -1 ({(1-bias)/2:.0%} are 1)"
    
    print(f"{name:<15} | {bias:<10.4f} | {f_hat_empty:<10.4f} | {str(is_bal):<10} | {interp}")

Expectation = f̂(∅) = f.bias() verification:
Function        | f.bias()   | f̂(∅)      | balanced?  | Interpretation
-----------------------------------------------------------------
AND₃            | 0.7500     | 0.7500     | False      | biased to +1 (88% are 0)
OR₃             | -0.7500    | -0.7500    | False      | biased to -1 (88% are 1)
PARITY₃         | 0.0000     | 0.0000     | True       | balanced
MAJORITY₃       | 0.0000     | 0.0000     | True       | balanced
THRESHOLD₃,₂    | 0.0000     | 0.0000     | True       | balanced
TRIBES(2,3)     | 0.2500     | 0.2500     | False      | biased to +1 (62% are 0)
DICTATOR₀       | 0.0000     | 0.0000     | True       | balanced
RANDOM₃         | 0.5000     | 0.5000     | False      | biased to +1 (75% are 0)


In [4]:
# Demonstrate variance = Σ_{S≠∅} f̂(S)² = 1 - f̂(∅)²
print("Variance verification: Var[f] = 1 - f̂(∅)² = Σ_{S≠∅} f̂(S)²")
print("=" * 70)
print(f"{'Function':<15} | {'Var[f]':<10} | {'1-f̂(∅)²':<10} | {'Σ f̂(S)²':<10} | Match")
print("-" * 70)

for name, f in functions.items():
    tt = np.asarray(f.get_representation("truth_table"))
    pm_vals = 1 - 2 * tt
    var_direct = np.var(pm_vals)
    
    fourier = f.fourier()
    f_hat_empty = fourier[0]
    var_formula = 1 - f_hat_empty**2
    sum_sq = sum(c**2 for c in fourier[1:])
    
    match = "✓" if abs(var_direct - var_formula) < 1e-10 else "✗"
    print(f"{name:<15} | {var_direct:<10.4f} | {var_formula:<10.4f} | {sum_sq:<10.4f} | {match}")

print("\nKey insight: Balanced functions (E[f]=0) have maximum variance = 1")

Variance verification: Var[f] = 1 - f̂(∅)² = Σ_{S≠∅} f̂(S)²
Function        | Var[f]     | 1-f̂(∅)²   | Σ f̂(S)²   | Match
----------------------------------------------------------------------
AND₃            | 0.4375     | 0.4375     | 0.4375     | ✓
OR₃             | 0.4375     | 0.4375     | 0.4375     | ✓
PARITY₃         | 1.0000     | 1.0000     | 1.0000     | ✓
MAJORITY₃       | 1.0000     | 1.0000     | 1.0000     | ✓
THRESHOLD₃,₂    | 1.0000     | 1.0000     | 1.0000     | ✓
TRIBES(2,3)     | 0.9375     | 0.9375     | 0.9375     | ✓
DICTATOR₀       | 1.0000     | 1.0000     | 1.0000     | ✓
RANDOM₃         | 0.7500     | 0.7500     | 0.7500     | ✓

Key insight: Balanced functions (E[f]=0) have maximum variance = 1


## 2.1.1 Singleton Fourier Coefficients and Conditional Expectations

For singleton sets $S = \{i\}$, we have:

$$\hat{f}(\{i\}) = \mathbf{E}[f(x) \cdot x_i] = \frac{1}{2}\left(\mathbf{E}[f(x) | x_i = 1] - \mathbf{E}[f(x) | x_i = -1]\right)$$

**Interpretation**: $\hat{f}(\{i\})$ measures how much variable $i$ "pulls" the function:
- $\hat{f}(\{i\}) > 0$: $f$ tends to output +1 when $x_i = +1$
- $\hat{f}(\{i\}) < 0$: $f$ tends to output +1 when $x_i = -1$
- $\hat{f}(\{i\}) = 0$: Variable $i$ has no "net effect" on the output
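
The identity can be checked by conditioning on the value of $x_i$, which is $+1$ or $-1$ with probability $\frac{1}{2}$ each. A short worked derivation:

$$\mathbf{E}[f(x) \cdot x_i] = \tfrac{1}{2}\,\mathbf{E}[f(x) \cdot (+1) \mid x_i = 1] + \tfrac{1}{2}\,\mathbf{E}[f(x) \cdot (-1) \mid x_i = -1] = \tfrac{1}{2}\left(\mathbf{E}[f(x) \mid x_i = 1] - \mathbf{E}[f(x) \mid x_i = -1]\right)$$

As a sanity check, the dictator $f(x) = x_i$ has conditional expectations $+1$ and $-1$, so $\hat{f}(\{i\}) = \frac{1}{2}(1 - (-1)) = 1$.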

In [5]:
# Demonstrate the conditional expectation interpretation of f̂({i})
def analyze_singleton_coefficients(f, name):
    """Show how f̂({i}) relates to conditional expectations."""
    n = f.n_vars
    fourier = f.fourier()
    
    print(f"\n{name} (n={n}):")
    print("-" * 70)
    print(f"{'Variable':<10} | {'f̂({i})':<10} | {'E[f|xᵢ=+1]':<12} | {'E[f|xᵢ=-1]':<12} | {'Diff/2':<10}")
    print("-" * 70)
    
    tt = np.asarray(f.get_representation("truth_table"))
    pm_vals = 1 - 2 * tt  # Convert to ±1: Boolean 0 → +1, Boolean 1 → -1
    
    for i in range(n):
        # f̂({i}) from Fourier expansion
        idx = 1 << i  # Index for singleton {i}
        f_hat_i = fourier[idx]
        
        # Compute conditional expectations directly
        # Key insight: bit=0 means xᵢ=+1, bit=1 means xᵢ=-1 (O'Donnell convention)
        mask_minus = np.array([(x >> i) & 1 == 1 for x in range(len(tt))])  # xᵢ = -1
        mask_plus = ~mask_minus  # xᵢ = +1
        
        E_given_plus = np.mean(pm_vals[mask_plus])   # E[f | xᵢ = +1]
        E_given_minus = np.mean(pm_vals[mask_minus]) # E[f | xᵢ = -1]
        
        diff_half = (E_given_plus - E_given_minus) / 2
        
        match = "✓" if abs(f_hat_i - diff_half) < 1e-10 else "✗"
        print(f"x_{i:<8} | {f_hat_i:<10.4f} | {E_given_plus:<12.4f} | {E_given_minus:<12.4f} | {diff_half:<10.4f} {match}")

# Analyze several functions
analyze_singleton_coefficients(bf.majority(5), "MAJORITY₅")
analyze_singleton_coefficients(bf.tribes(2, 3), "TRIBES(2,3)")
analyze_singleton_coefficients(bf.dictator(4, 1), "DICTATOR x₁")


MAJORITY₅ (n=5):
----------------------------------------------------------------------
Variable   | f̂({i})    | E[f|xᵢ=+1]   | E[f|xᵢ=-1]   | Diff/2    
----------------------------------------------------------------------
x_0        | 0.3750     | 0.3750       | -0.3750      | 0.3750     ✓
x_1        | 0.3750     | 0.3750       | -0.3750      | 0.3750     ✓
x_2        | 0.3750     | 0.3750       | -0.3750      | 0.3750     ✓
x_3        | 0.3750     | 0.3750       | -0.3750      | 0.3750     ✓
x_4        | 0.3750     | 0.3750       | -0.3750      | 0.3750     ✓

TRIBES(2,3) (n=3):
----------------------------------------------------------------------
Variable   | f̂({i})    | E[f|xᵢ=+1]   | E[f|xᵢ=-1]   | Diff/2    
----------------------------------------------------------------------
x_0        | 0.2500     | 0.5000       | 0.0000       | 0.2500     ✓
x_1        | 0.2500     | 0.5000       | 0.0000       | 0.2500     ✓
x_2        | 0.7500     | 1.0000       | -0.5000      | 0.7500     ✓

## 2.2 Convolution

**Definition**: The convolution of $f$ and $g$ is:
$$f * g(x) = \mathbf{E}_{y}[f(y) \cdot g(x \cdot y)]$$

where $x \cdot y$ denotes coordinate-wise multiplication in $\{\pm 1\}^n$.

**Theorem 2.8 (Convolution Theorem)**: 
$$\widehat{f * g}(S) = \hat{f}(S) \cdot \hat{g}(S)$$

Convolution in the time domain = multiplication in the Fourier domain!
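
The theorem can also be checked straight from the definition, by computing $(f * g)(x)$ pointwise and then taking its Fourier transform. The sketch below is library-free and assumes the $\pm 1$ input convention; `fourier`, `maj`, `par`, and `conv` are illustrative names, and subsets are encoded as bitmasks.

```python
from itertools import product

def fourier(f_vals, points, n):
    """f_hat(S) for every subset S (encoded as a bitmask), by direct averaging."""
    coeffs = []
    for mask in range(2 ** n):
        total = 0
        for x in points:
            chi = 1
            for i in range(n):
                if (mask >> i) & 1:
                    chi *= x[i]
            total += f_vals[x] * chi
        coeffs.append(total / len(points))
    return coeffs

n = 3
points = list(product([1, -1], repeat=n))
maj = {x: (1 if sum(x) > 0 else -1) for x in points}  # 3-bit majority
par = {x: x[0] * x[1] * x[2] for x in points}         # 3-bit parity

# (f * g)(x) = E_y[ f(y) * g(x . y) ], where x . y is the coordinate-wise product
conv = {
    x: sum(maj[y] * par[tuple(a * b for a, b in zip(x, y))] for y in points) / len(points)
    for x in points
}

lhs = fourier(conv, points, n)  # coefficients of f * g, from the definition
rhs = [a * b for a, b in zip(fourier(maj, points, n), fourier(par, points, n))]
max_gap = max(abs(a - b) for a, b in zip(lhs, rhs))
print(max_gap)
```

Note that `conv` is real-valued rather than $\pm 1$-valued: a convolution of two Boolean functions is generally not itself a Boolean function.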

In [6]:
# Demonstrate the Convolution Theorem
f = bf.majority(3)
g = bf.parity(3)

# Get Fourier coefficients
f_fourier = f.fourier()
g_fourier = g.fourier()

# The Convolution Theorem says: (f*g)^(S) = f̂(S) · ĝ(S)
# convolution() returns the Fourier coefficients directly as a numpy array!
conv_fourier = convolution(f, g)  # Already the coefficients (f*g)^(S) = f̂(S) · ĝ(S)

# Verify the theorem
print("Convolution Theorem: (f*g)^(S) = f̂(S) · ĝ(S)")
print("=" * 70)
print(f"{'S':<12} | {'f̂(S)':<10} | {'ĝ(S)':<10} | {'(f*g)^(S)':<10} | {'f̂·ĝ':<10} | Match")
print("-" * 70)

for i, (fc, gc, conv_c) in enumerate(zip(f_fourier, g_fourier, conv_fourier)):
    if abs(fc) > 0.01 or abs(gc) > 0.01 or abs(conv_c) > 0.01:
        expected_c = fc * gc
        match = "✓" if abs(conv_c - expected_c) < 1e-10 else "✗"
        # Convert index to set notation
        bits = [j for j in range(3) if (i >> j) & 1]
        set_str = "{" + ",".join(map(str, bits)) + "}" if bits else "∅"
        print(f"{set_str:<12} | {fc:<10.4f} | {gc:<10.4f} | {conv_c:<10.4f} | {expected_c:<10.4f} | {match}")

print()
print("convolution(f, g) computes (f*g)^(S) = f̂(S) · ĝ(S) directly.")
print("It returns a numpy array of Fourier coefficients, not a BooleanFunction.")

Convolution Theorem: (f*g)^(S) = f̂(S) · ĝ(S)
S            | f̂(S)      | ĝ(S)       | (f*g)^(S)  | f̂·ĝ       | Match
----------------------------------------------------------------------
{0}          | 0.5000     | 0.0000     | 0.0000     | 0.0000     | ✓
{1}          | 0.5000     | 0.0000     | 0.0000     | 0.0000     | ✓
{2}          | 0.5000     | 0.0000     | 0.0000     | 0.0000     | ✓
{0,1,2}      | -0.5000    | 1.0000     | -0.5000    | -0.5000    | ✓

convolution(f, g) computes (f*g)^(S) = f̂(S) · ĝ(S) directly.
It returns a numpy array of Fourier coefficients, not a BooleanFunction.


### The PropertyTester

The `boofun` library provides `PropertyTester`, a class that implements many property testing algorithms from the literature. Property testing lets us check whether a function has a property (or is far from having it) using only a small number of queries, far fewer than reading the whole truth table.

**Available tests:**
- `constant_test()` - Is $f$ constant?
- `blr_linearity_test()` - Is $f$ linear? (BLR algorithm)
- `dictator_test()` - Is $f$ a dictator function?
- `junta_test(k)` - Does $f$ depend on at most $k$ variables?
- `monotonicity_test()` - Is $f$ monotone?
- `symmetry_test()` - Is $f$ symmetric?
- `balanced_test()` - Does $f$ output 0 and 1 equally often?
- `affine_test()` - Is $f$ affine over GF(2)?
- `run_all_tests()` - Run everything!

In [7]:
# PropertyTester demonstration - run all tests on various functions
from boofun.analysis import PropertyTester

demo_functions = {
    "PARITY₄": bf.parity(4),
    "MAJORITY₅": bf.majority(5),
    "DICTATOR x₀": bf.dictator(4, 0),
    "AND₃": bf.AND(3),
    "TRIBES(2,2)": bf.tribes(2, 2),
}

print("PropertyTester: Comprehensive Function Analysis")
print("=" * 80)

for name, f in demo_functions.items():
    tester = PropertyTester(f)
    
    # Run individual tests
    is_const = tester.constant_test()
    is_balanced = tester.balanced_test()
    is_linear = tester.blr_linearity_test(100)
    is_monotone = tester.monotonicity_test(500)
    is_symmetric = tester.symmetry_test(500)
    is_dictator, dictator_var = tester.dictator_test(500)
    
    print(f"\n{name}:")
    props = []
    if is_const: props.append("constant")
    if is_balanced: props.append("balanced")
    if is_linear: props.append("linear")
    if is_monotone: props.append("monotone")
    if is_symmetric: props.append("symmetric")
    if is_dictator: props.append(f"dictator(x_{dictator_var})")
    
    print(f"  Properties: {', '.join(props) if props else 'none detected'}")
    
    # Show Fourier summary
    fourier = f.fourier()
    degree = max(bin(i).count('1') for i, c in enumerate(fourier) if abs(c) > 1e-10)
    sparsity = sum(1 for c in fourier if abs(c) > 1e-10)
    print(f"  Fourier degree: {degree}, Sparsity: {sparsity}/{len(fourier)}")

PropertyTester: Comprehensive Function Analysis

PARITY₄:
  Properties: balanced, linear, symmetric
  Fourier degree: 4, Sparsity: 1/16

MAJORITY₅:
  Properties: balanced, monotone, symmetric
  Fourier degree: 5, Sparsity: 16/32

DICTATOR x₀:
  Properties: balanced, linear, monotone, dictator(x_0)
  Fourier degree: 1, Sparsity: 1/16

AND₃:
  Properties: monotone, symmetric
  Fourier degree: 3, Sparsity: 8/8

TRIBES(2,2):
  Properties: monotone, symmetric
  Fourier degree: 2, Sparsity: 4/4


## 2.3 BLR Linearity Test

**Setup**: Given oracle access to $F: \{0,1\}^n \to \{\pm 1\}$, is $F$ a character $\chi_S$?

**Definitions**:
- **ε-close**: $f$ and $g$ are ε-close if $\Pr_x[f(x) \neq g(x)] \leq \varepsilon$
- **Linear**: $f(x \oplus y) = f(x) \cdot f(y)$ for all $x, y$. The *only* linear functions are characters $\chi_S$.  
  (Sums of characters are not linear — cross-terms break linearity.)
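
The two definitions can be turned into brute-force checks on small examples. This is a minimal sketch without `boofun`; it assumes inputs in $\{0,1\}^n$ and outputs in $\{\pm 1\}$, and all helper names are illustrative.

```python
from itertools import product

n = 3
points = list(product([0, 1], repeat=n))

def xor(x, y):
    return tuple(a ^ b for a, b in zip(x, y))

def character(S):
    """chi_S(x) = (-1)^(sum of x_i for i in S); S is given as a 0/1 indicator tuple."""
    return lambda x: 1 - 2 * (sum(s & xi for s, xi in zip(S, x)) % 2)

def majority(x):
    """Majority with +/-1 outputs: Boolean 1 maps to -1."""
    return 1 - 2 * int(sum(x) >= 2)

def distance(f, g):
    """Fraction of inputs where f and g disagree (f, g are eps-close iff this is <= eps)."""
    return sum(f(x) != g(x) for x in points) / len(points)

def is_linear(f):
    """Check f(x xor y) == f(x) * f(y) for every pair (x, y)."""
    return all(f(xor(x, y)) == f(x) * f(y) for x in points for y in points)

# Distance from majority to the closest of the 2^n characters
nearest = min(distance(majority, character(S)) for S in points)
print(is_linear(character((1, 0, 1))), is_linear(majority), nearest)
```

The exhaustive linearity check reads all $4^n$ pairs, which is exactly the cost the BLR test avoids by sampling.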

**BLR Algorithm** (3 queries):
1. Pick $x, y \sim \{0,1\}^n$ uniformly
2. Accept if $F(x) \cdot F(y) = F(x \oplus y)$

**Key fact**: Characters always satisfy $\chi_S(x) \cdot \chi_S(y) = \chi_S(x \oplus y)$, so they pass with probability 1.

**Theorem 2.10**: $\Pr[\text{accept}] = \frac{1 + \sum_S \hat{F}(S)^3}{2}$
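
For small $n$ the theorem can be verified exactly by enumerating every pair $(x, y)$ instead of sampling. The following is a sketch under the same conventions as above (inputs in $\{0,1\}^n$, outputs in $\{\pm 1\}$); the function names are illustrative and independent of `boofun`.

```python
from itertools import product

n = 3
points = list(product([0, 1], repeat=n))

def xor(x, y):
    return tuple(a ^ b for a, b in zip(x, y))

def chi(S, x):
    """Character chi_S(x) for S given as a 0/1 indicator tuple."""
    return 1 - 2 * (sum(s & xi for s, xi in zip(S, x)) % 2)

def sum_of_cubes(F):
    """Sum over S of F_hat(S)^3, with F_hat(S) = E_x[F(x) * chi_S(x)]."""
    total = 0.0
    for S in points:  # each 0/1 tuple doubles as the indicator vector of a set S
        coeff = sum(F(x) * chi(S, x) for x in points) / len(points)
        total += coeff ** 3
    return total

def blr_accept_probability(F):
    """Exact Pr over (x, y) that F(x) * F(y) == F(x xor y), enumerating all pairs."""
    accepts = sum(1 for x in points for y in points if F(x) * F(y) == F(xor(x, y)))
    return accepts / len(points) ** 2

def parity(x):
    return 1 - 2 * (sum(x) % 2)

def majority(x):
    return 1 - 2 * int(sum(x) >= 2)

for name, F in [("parity", parity), ("majority", majority)]:
    print(name, blr_accept_probability(F), (1 + sum_of_cubes(F)) / 2)
```

For 3-bit majority the coefficients are $\frac{1}{2}$ on the three singletons and $-\frac{1}{2}$ on $\{0,1,2\}$, so $\sum_S \hat{F}(S)^3 = \frac{1}{4}$ and both columns come out to $0.625$.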

In [8]:
# BLR Linearity Testing on diverse function families

# Built-in functions
test_functions = {
    # Linear functions (should PASS)
    "PARITY₅ (linear)": bf.parity(5),
    "x₂ (dictator)": bf.dictator(5, 2),
    
    # Classic non-linear functions
    "MAJORITY₅": bf.majority(5),
    "AND₄": bf.AND(4),
    "OR₄": bf.OR(4),
    "TRIBES(2,3)": bf.tribes(2, 3),
    
    # Random functions
    "RANDOM₅": bf.random(5, seed=123),
    "RANDOM₅ balanced": bf.random(5, balanced=True, seed=456),
}

# Create a "hash-like" function (complex, non-linear)
def create_hash_like(n):
    """Create a pseudorandom-looking function that mixes inputs."""
    def hash_fn(x):
        # XOR of all pairwise ANDs x[i] & x[j]: a quadratic form over GF(2)
        h = 0
        for i in range(n):
            for j in range(i+1, n):
                h ^= (x[i] & x[j])
        return h
    
    tt = [hash_fn([int(b) for b in format(i, f'0{n}b')]) for i in range(2**n)]
    return bf.create(tt)

test_functions["HASH-LIKE₅"] = create_hash_like(5)

print("BLR Linearity Test Results")
print("=" * 70)
print(f"{'Function':<20} | {'Linear?':<8} | {'BLR':<8} | {'Σf̂³':<10} | Interpretation")
print("-" * 70)

for name, f in test_functions.items():
    tester = PropertyTester(f)
    blr_result = tester.blr_linearity_test(200)
    is_linear = f.is_linear()
    
    # Compute theoretical acceptance probability
    fourier = f.fourier()
    sum_cubed = sum(c**3 for c in fourier)
    
    status = "PASS" if blr_result else "FAIL"
    
    # Interpret
    if is_linear:
        interp = "exact linear!"
    elif sum_cubed > 0.9:
        interp = "nearly linear"
    elif sum_cubed > 0.5:
        interp = "somewhat structured"
    else:
        interp = "far from linear"
    
    print(f"{name:<20} | {str(is_linear):<8} | {status:<8} | {sum_cubed:<10.4f} | {interp}")

BLR Linearity Test Results
Function             | Linear?  | BLR      | Σf̂³       | Interpretation
----------------------------------------------------------------------
PARITY₅ (linear)     | True     | PASS     | 1.0000     | exact linear!
x₂ (dictator)        | True     | PASS     | 1.0000     | exact linear!
MAJORITY₅            | False    | FAIL     | 0.2969     | far from linear
AND₄                 | False    | FAIL     | 0.6719     | somewhat structured
OR₄                  | False    | FAIL     | -0.6406    | far from linear
TRIBES(2,3)          | False    | FAIL     | 0.4375     | far from linear
RANDOM₅              | False    | FAIL     | 0.0625     | far from linear
RANDOM₅ balanced     | False    | FAIL     | 0.0312     | far from linear
HASH-LIKE₅           | False    | FAIL     | 0.0625     | far from linear


In [9]:
# Testing arbitrary user-defined functions

def blr_empirical_acceptance(f, trials=10000):
    """Run BLR test many times and compute acceptance rate."""
    n = f.n_vars
    accepts = 0
    np.random.seed(42)  # Reproducibility
    
    for _ in range(trials):
        x = np.random.randint(0, 2, n)
        y = np.random.randint(0, 2, n)
        xy = (x + y) % 2  # XOR in F_2
        
        fx = 1 - 2 * f.evaluate(x)  # Convert to ±1
        fy = 1 - 2 * f.evaluate(y)
        fxy = 1 - 2 * f.evaluate(xy)
        
        if fx * fy == fxy:  # BLR acceptance condition
            accepts += 1
    
    return accepts / trials

print("Testing arbitrary functions with BLR:")
print("=" * 70)

# Example 1: A "noisy parity" - parity with some bits flipped
def create_noisy_linear(n, noise_rate):
    """Create parity with random noise."""
    parity = bf.parity(n)
    tt = list(parity.get_representation("truth_table"))
    np.random.seed(42)
    for i in range(len(tt)):
        if np.random.random() < noise_rate:
            tt[i] = 1 - tt[i]
    return bf.create(tt)

# Example 2: A "nearly majority" - perturb a few entries
def create_near_majority(n):
    """Majority with slight modifications."""
    maj = bf.majority(n)
    tt = list(maj.get_representation("truth_table"))
    tt[0] = 1 - tt[0]  # Flip one entry
    return bf.create(tt)

# Example 3: A completely custom function
def create_weird_function(n):
    """A strange custom function for testing."""
    def weird(x):
        # Returns 1 if popcount is prime, 0 otherwise
        popcount = sum(x)
        return 1 if popcount in [2, 3, 5, 7, 11, 13] else 0
    tt = [weird([int(b) for b in format(i, f'0{n}b')]) for i in range(2**n)]
    return bf.create(tt)

custom_functions = {
    "PARITY₅ (exact)": bf.parity(5),
    "NOISY PARITY (5%)": create_noisy_linear(5, 0.05),
    "NOISY PARITY (20%)": create_noisy_linear(5, 0.20),
    "NEAR MAJORITY₅": create_near_majority(5),
    "POPCOUNT IS PRIME": create_weird_function(5),
}

# Compute both theoretical and empirical BLR acceptance rates
print(f"{'Function':<22} | {'Σf̂³':<10} | {'Pr[accept]':<12} | {'Empirical':<10} | Match")
print("-" * 75)

for name, f in custom_functions.items():
    # Theoretical: Pr[BLR accepts] = (1 + Σ f̂(S)³) / 2
    fourier = f.fourier()
    sum_cubed = sum(c**3 for c in fourier)
    theoretical = (1 + sum_cubed) / 2  # Correct formula!
    
    # Empirical: Run many BLR queries
    empirical = blr_empirical_acceptance(f, trials=10000)
    
    match = "✓" if abs(theoretical - empirical) < 0.02 else "✗"
    print(f"{name:<22} | {sum_cubed:<10.4f} | {theoretical:<12.4f} | {empirical:<10.4f} | {match}")

print()
print("Key insight: Even for non-linear functions, Pr[accept] is always in [0, 1]!")
print("Σf̂³ can be negative, but (1 + Σf̂³)/2 is always a valid probability.")

Testing arbitrary functions with BLR:
Function               | Σf̂³       | Pr[accept]   | Empirical  | Match
---------------------------------------------------------------------------
PARITY₅ (exact)        | 1.0000     | 1.0000       | 1.0000     | ✓
NOISY PARITY (5%)      | 0.6719     | 0.8359       | 0.8381     | ✓
NOISY PARITY (20%)     | 0.0625     | 0.5312       | 0.5360     | ✓
NEAR MAJORITY₅         | 0.1133     | 0.5566       | 0.5579     | ✓
POPCOUNT IS PRIME      | -0.1133    | 0.4434       | 0.4421     | ✓

Key insight: Even for non-linear functions, Pr[accept] is always in [0, 1]!
Σf̂³ can be negative, but (1 + Σf̂³)/2 is always a valid probability.


In [10]:
# Verify: Pr[BLR accepts] = (1 + Σ f̂(S)³) / 2
#
# Why? BLR accepts when f(x)·f(y) = f(x⊕y) in ±1, which means f(x)·f(y)·f(x⊕y) = 1.
# So Pr[accept] = Pr[f(x)·f(y)·f(x⊕y) = 1] = (1 + E[f(x)f(y)f(x⊕y)]) / 2
# And by Theorem 2.10: E[f(x)f(y)f(x⊕y)] = Σ f̂(S)³

print("BLR Acceptance Probability = (1 + Σ f̂(S)³) / 2")
print("=" * 75)
print(f"{'Function':<22} | {'Σ f̂(S)³':<10} | {'(1+Σf̂³)/2':<12} | {'Empirical':<12}")
print("-" * 75)

for name, f in test_functions.items():
    fourier = f.fourier()
    sum_cubed = sum(c**3 for c in fourier)
    theoretical = (1 + sum_cubed) / 2  # Correct formula!
    empirical = blr_empirical_acceptance(f)
    match = "✓" if abs(theoretical - empirical) < 0.02 else "✗"
    print(f"{name:<22} | {sum_cubed:<10.4f} | {theoretical:<12.4f} | {empirical:<12.4f} {match}")

BLR Acceptance Probability = (1 + Σ f̂(S)³) / 2
Function               | Σ f̂(S)³   | (1+Σf̂³)/2   | Empirical   
---------------------------------------------------------------------------
PARITY₅ (linear)       | 1.0000     | 1.0000       | 1.0000       ✓
x₂ (dictator)          | 1.0000     | 1.0000       | 1.0000       ✓
MAJORITY₅              | 0.2969     | 0.6484       | 0.6488       ✓
AND₄                   | 0.6719     | 0.8359       | 0.8310       ✓
OR₄                    | -0.6406    | 0.1797       | 0.1821       ✓
TRIBES(2,3)            | 0.4375     | 0.7188       | 0.7113       ✓
RANDOM₅                | 0.0625     | 0.5312       | 0.5245       ✓
RANDOM₅ balanced       | 0.0312     | 0.5156       | 0.5101       ✓
HASH-LIKE₅             | 0.0625     | 0.5312       | 0.5326       ✓


## 2.4 Local Correction

**Setup**: You have oracle access to $F$, which is ε-close to some unknown character $\chi_S$.  
**Goal**: Compute $\chi_S(x)$ (the true value), not $F(x)$ (which may be corrupted).  
**Challenge**: You don't know $S$, and $F(x)$ might be wrong at $x$.

**LocalCorrect(F, x)** (2 queries):
1. Pick $y \sim \{0,1\}^n$ uniformly
2. Return $F(y) \cdot F(x \oplus y)$

**Theorem 2.15**: $\Pr_y[\text{LocalCorrect}(F, x) = \chi_S(x)] \geq 1 - 2\varepsilon$

**Why it works**: 

By linearity, $\chi_S(x) = \chi_S(y) \cdot \chi_S(x \oplus y)$ for any $y$.  
So if $F$ agrees with $\chi_S$ at both $y$ and $x \oplus y$, we get $\chi_S(x)$.

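Since $y$ is uniform and $x \oplus y$ is also uniform (XOR with a fixed $x$ is a bijection), each of the two query points lands on a corrupted entry with probability at most $\varepsilon$. The two points are not independent, but the union bound does not need independence: the probability that at least one of them is corrupted is at most $2\varepsilon$. If both are wrong the errors cancel, so the true failure probability can be smaller still. Hence $\Pr[\text{fail}] \leq 2\varepsilon$.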
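
The argument above can be checked exactly on a small instance by enumerating every $y$. This is a library-free sketch, not the `boofun` implementation: the hidden set `S`, the seed, and the number of corrupted entries are arbitrary choices made for illustration.

```python
import random
from itertools import product

n = 5
points = list(product([0, 1], repeat=n))
S = (1, 0, 1, 1, 0)  # hypothetical hidden set, as a 0/1 indicator tuple

def chi_S(x):
    return 1 - 2 * (sum(s & xi for s, xi in zip(S, x)) % 2)

def xor(x, y):
    return tuple(a ^ b for a, b in zip(x, y))

# Corrupt exactly 3 of the 32 entries, so F is eps-close to chi_S with eps = 3/32
rng = random.Random(0)
corrupted = set(rng.sample(points, 3))
F = {x: (-chi_S(x) if x in corrupted else chi_S(x)) for x in points}
eps = len(corrupted) / len(points)

def local_correct(x, repetitions, rng):
    """Majority vote over independent draws of F(y) * F(x xor y); use odd repetitions."""
    votes = 0
    for _ in range(repetitions):
        y = rng.choice(points)
        votes += F[y] * F[xor(x, y)]
    return 1 if votes > 0 else -1

def single_draw_success(x):
    """Exact Pr_y[F(y) * F(x xor y) == chi_S(x)], by enumerating every y."""
    good = sum(1 for y in points if F[y] * F[xor(x, y)] == chi_S(x))
    return good / len(points)

worst_case = min(single_draw_success(x) for x in points)
print(f"eps = {eps:.4f}, bound 1 - 2*eps = {1 - 2 * eps:.4f}, worst case = {worst_case:.4f}")
```

A draw fails only when exactly one of $y$ and $x \oplus y$ is corrupted, so at most $2 \cdot 3 = 6$ of the 32 choices of $y$ can fail for any fixed $x$, matching the $1 - 2\varepsilon$ bound.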

In [11]:
# Local Correction: Recovering the true linear function from a noisy version.
# Key idea: we can compute χ_S(x) without knowing S.
# 
# Now available in the library: PropertyTester.local_correct()

from boofun.analysis import PropertyTester
from scipy import stats

def create_noisy_function_exact(base_fn, num_flips, seed=42):
    """Create a noisy version with EXACTLY num_flips corrupted entries."""
    rng = np.random.default_rng(seed)
    tt = list(base_fn.get_representation("truth_table"))
    flip_indices = rng.choice(len(tt), size=num_flips, replace=False)
    for i in flip_indices:
        tt[i] = 1 - tt[i]
    return bf.create(tt)

def majority_vote_bound(p_single, k):
    """Probability that majority of k Bernoulli(p) trials succeeds."""
    if p_single <= 0.5:
        return p_single  # Can't do better than random
    threshold = (k + 1) // 2
    return 1 - stats.binom.cdf(threshold - 1, k, p_single)

print("Local Correction Demonstration (using boofun.PropertyTester)")
print("=" * 78)
print()
print("Theorem 2.15: For F that is ε-close to χ_S:")
print("  Pr[F(y)·F(x⊕y) = χ_S(x)] ≥ 1 - 2ε  (for a SINGLE random y)")
print()
print("With majority voting over k queries, Chernoff bounds give exponential boost!")
print("-" * 78)

# Test local correction at different noise levels
n = 7  # 128 entries for finer granularity
k = 21  # Odd number for clean majority
true_parity = bf.parity(n)
size = 2**n

print(f"\n{'Noise ε':<12} | {'1-Query Bound':<14} | {f'k={k} Majority':<14} | {'Empirical':<12}")
print(f"{'(flips/128)':<12} | {'(≥ 1-2ε)':<14} | {'(theory)':<14} | {'(actual)':<12}")
print("-" * 78)

for num_flips in [3, 6, 10, 16, 25, 35]:  # Exact number of corrupted entries
    noise_rate = num_flips / size
    noisy_fn = create_noisy_function_exact(true_parity, num_flips, seed=num_flips)
    
    # Theoretical bounds (1 - 2ε is a LOWER BOUND on single-query success)
    p_single_bound = max(0, 1 - 2 * noise_rate)
    p_majority_bound = majority_vote_bound(p_single_bound, k)
    
    # Use the library's local_correct method!
    tester = PropertyTester(noisy_fn, random_seed=42)
    
    # Test local correction accuracy on ALL inputs
    correct = 0
    for x in range(size):
        corrected = tester.local_correct(x, repetitions=k)
        true_val = 1 - 2 * (bin(x).count("1") % 2)  # Parity: (-1)^popcount
        if corrected == true_val:
            correct += 1
    
    accuracy = correct / size
    
    print(f"ε = {noise_rate:>5.1%} ({num_flips:>2}) | {p_single_bound:>12.1%}   | {p_majority_bound:>12.1%}   | {accuracy:>10.1%}")

print()
print("Key observations:")
print("  • The '1-Query Bound' (1-2ε) is a WORST-CASE lower bound, not the actual rate")
print("  • Actual per-query success is often much higher → majority voting amplifies it")
print("  • Even at ε = 27%, empirical accuracy >> 45% bound (actual p_query > 50%!)")
print("  • Local correction is remarkably robust across a wide noise range")
print()
print("# The library makes this easy:")
print("# tester = PropertyTester(f)")
print("# tester.local_correct(x, repetitions=21)  # Returns ±1")

Local Correction Demonstration (using boofun.PropertyTester)

Theorem 2.15: For F that is ε-close to χ_S:
  Pr[F(y)·F(x⊕y) = χ_S(x)] ≥ 1 - 2ε  (for a SINGLE random y)

With majority voting over k queries, Chernoff bounds give exponential boost!
------------------------------------------------------------------------------

Noise ε      | 1-Query Bound  | k=21 Majority  | Empirical   
(flips/128)  | (≥ 1-2ε)       | (theory)       | (actual)    
------------------------------------------------------------------------------
ε =  2.3% ( 3) |        95.3%   |       100.0%   |     100.0%
ε =  4.7% ( 6) |        90.6%   |       100.0%   |     100.0%
ε =  7.8% (10) |        84.4%   |       100.0%   |     100.0%
ε = 12.5% (16) |        75.0%   |        99.4%   |      99.2%
ε = 19.5% (25) |        60.9%   |        84.8%   |      97.7%
ε = 27.3% (35) |        45.3%   |        45.3%   |      81.2%

Key observations:
  • The '1-Query Bound' (1-2ε) is a WORST-CASE lower bound, not the actual rate
  • Actual per-query success is often much higher → majority voting amplifies it
  • Even at ε = 27%, empirical accuracy >> 45% bound (actual p_query > 50%!)
  • Local correction is remarkably robust across a wide noise range

# The library makes this easy:
# tester = PropertyTester(f)
# tester.local_correct(x, repetitions=21)  # Returns ±1


## Summary

### Key Takeaways from Lecture 2:

1. **Expectation & Variance**: 
   - $\mathbf{E}[f] = \hat{f}(\emptyset)$ (the constant Fourier coefficient)
   - $\mathbf{Var}[f] = 1 - \hat{f}(\emptyset)^2$ for Boolean functions
   - Balanced functions have $\mathbf{E}[f] = 0$ and maximum variance

2. **Singleton Coefficients**:
   - $\hat{f}(\{i\}) = \frac{1}{2}(\mathbf{E}[f|x_i=+1] - \mathbf{E}[f|x_i=-1])$
   - Measures how much variable $i$ "pulls" the function

3. **Convolution Theorem**: $\widehat{f * g}(S) = \hat{f}(S) \cdot \hat{g}(S)$
   - Convolution in time domain = multiplication in Fourier domain

4. **BLR Linearity Test**: 
   - Tests if $F$ is linear with only 3 queries per round
   - Acceptance probability = $\frac{1 + \sum_S \hat{F}(S)^3}{2}$
   - Works on ANY function - built-ins, custom, hash-like, random!

5. **Local Correction**: 
   - Recover $\chi_S(x)$ without knowing $S$
   - Single random $y$ (2 queries): $F(y) \cdot F(x \oplus y) = \chi_S(x)$ with probability $\geq 1 - 2\varepsilon$
   - With majority voting over $k$ queries: exponential amplification via Chernoff bounds
   - Remarkably robust even at high noise levels!

### Using boofun:

```python
import boofun as bf
from boofun.analysis import PropertyTester
from boofun.analysis.fourier import convolution

# Create functions easily
f = bf.majority(5)
g = bf.tribes(2, 3)
h = bf.random(4, balanced=True)
custom = bf.create([0, 1, 1, 0, 1, 0, 0, 1])  # Any truth table!

# Fourier analysis - simple!
coeffs = f.fourier()
expectation = coeffs[0]  # E[f] = f̂(∅)

# Property testing
tester = PropertyTester(f)
tester.blr_linearity_test(100)
tester.monotonicity_test(500)
tester.run_all_tests()  # Everything at once!

# Local correction (BLR self-correction with majority voting)
x = 0b10110  # Input as an integer index in range(2**n)
tester.local_correct(x, repetitions=21)  # Returns ±1
tester.local_correct_all(repetitions=21)  # Correct all 2^n inputs

# Convolution returns exact Fourier coefficients (both functions must have the same n)
conv_coeffs = convolution(f, bf.parity(5))
```

---
*Based on lecture notes by Austin Pechan. Notebook created by Gabriel Taboada using the `boofun` library.*