[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/GabbyTab/boofun/blob/main/notebooks/lecture2_linearity_testing.ipynb)

# Lecture 2: Convolution and Linearity Testing

**Based on CS294-92: Analysis of Boolean Functions (Spring 2025)**  
**Instructor: Avishay Tal**  
**Based on lecture notes by: Austin Pechan**  
**Notebook by: Gabriel Taboada**

This notebook covers:
1. Expectation and Variance of Boolean Functions
2. Convolution in Fourier Domain
3. The BLR Linearity Test
4. Local Correction

---

## Key Theorems (Recap)

- **Fundamental Theorem**: $f(x) = \sum_{S \subseteq [n]} \hat{f}(S) \cdot \chi_S(x)$
- **Plancherel**: $\langle f, g \rangle = \sum_S \hat{f}(S) \hat{g}(S)$
- **Parseval**: $\sum_S \hat{f}(S)^2 = 1$ for Boolean-valued $f$
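
The recap can be checked by brute force on a small example. The sketch below is a minimal pure-Python illustration that does not use `boofun`; the helper names `chi`, `fourier_coefficients`, and `maj3` are introduced here for illustration only. It computes every coefficient of 3-bit majority in $\pm 1$ notation, confirms that the expansion reproduces the function, and checks Parseval.

```python
from itertools import product

def chi(S, x):
    """Character chi_S(x): product of the coordinates of x indexed by S."""
    out = 1
    for i in S:
        out *= x[i]
    return out

def fourier_coefficients(f, n):
    """Return {S: f_hat(S)} with f_hat(S) = E_x[f(x) * chi_S(x)] over {-1,+1}^n."""
    points = list(product([1, -1], repeat=n))
    subsets = [tuple(i for i in range(n) if (mask >> i) & 1) for mask in range(2 ** n)]
    return {S: sum(f(x) * chi(S, x) for x in points) / len(points) for S in subsets}

def maj3(x):
    """Majority of three +/-1 bits (hypothetical helper for this sketch)."""
    return 1 if sum(x) > 0 else -1

coeffs = fourier_coefficients(maj3, 3)

# Fundamental theorem: the Fourier expansion reproduces f at every point.
for x in product([1, -1], repeat=3):
    reconstructed = sum(c * chi(S, x) for S, c in coeffs.items())
    assert abs(reconstructed - maj3(x)) < 1e-12

# Parseval: the squared coefficients of a +/-1-valued function sum to 1.
parseval_sum = sum(c ** 2 for c in coeffs.values())
print(coeffs)
print(parseval_sum)
```

For majority on three bits, the three singleton coefficients are $\frac{1}{2}$ and the coefficient on $\{0,1,2\}$ is $-\frac{1}{2}$, so the squares sum to 1.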

In [None]:
# Install boofun if running in Google Colab
try:
    import boofun as bf
except ImportError:
    !pip install boofun -q
    import boofun as bf

print(f"BooFun version: {bf.__version__}")

In [None]:
import numpy as np
from boofun.analysis import SpectralAnalyzer
from boofun.analysis import PropertyTester
from boofun.analysis.fourier import parseval_verify, plancherel_inner_product, convolution

import warnings
warnings.filterwarnings('ignore')

## 2.1 Expectation and Variance

**Fact 2.6**: $\mathbf{E}[f(x)] = \hat{f}(\emptyset)$

The constant Fourier coefficient is the expectation!

**Fact 2.7**: $\mathbf{Var}[f] = \sum_{S \neq \emptyset} \hat{f}(S)^2$

For Boolean functions: $\mathbf{Var}[f] = 1 - \hat{f}(\emptyset)^2$
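
Both forms of the variance follow from the definition of variance combined with Plancherel (applied with $g = f$) and Fact 2.6:

$$\mathbf{Var}[f] = \mathbf{E}[f^2] - \mathbf{E}[f]^2 = \sum_{S \subseteq [n]} \hat{f}(S)^2 - \hat{f}(\emptyset)^2 = \sum_{S \neq \emptyset} \hat{f}(S)^2$$

When $f$ takes values in $\{\pm 1\}$ we have $\mathbf{E}[f^2] = 1$, which gives $\mathbf{Var}[f] = 1 - \hat{f}(\emptyset)^2$. As a worked case, AND on 3 bits takes one value on 7 of the 8 inputs and the other value on the remaining input, so $|\mathbf{E}[f]| = \frac{3}{4}$ and $\mathbf{Var}[f] = 1 - \frac{9}{16} = \frac{7}{16}$.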

In [None]:
# Demonstrate expectation = f̂(∅) across many function families
functions = {
    # Basic gates
    "AND₃": bf.AND(3),
    "OR₃": bf.OR(3),
    "PARITY₃": bf.parity(3),
    # Threshold functions
    "MAJORITY₃": bf.majority(3),
    "THRESHOLD₃,₂": bf.threshold(3, 2),  # ≥2 of 3 bits
    # Structured functions
    "TRIBES(2,3)": bf.tribes(2, 3),  # OR of 3 ANDs of 2
    "DICTATOR₀": bf.dictator(3, 0),  # Just x₀
    # Random function for comparison
    "RANDOM₃": bf.random(3, seed=42),
}

print("Expectation = f̂(∅) verification:")
print("=" * 60)
print(f"{'Function':<15} | {'E[f]':<10} | {'f̂(∅)':<10} | {'Interpretation'}")
print("-" * 60)

for name, f in functions.items():
    # Direct expectation in ±1
    tt = np.asarray(f.get_representation("truth_table"))
    pm_vals = 1 - 2 * tt  # Convert to ±1
    expectation = np.mean(pm_vals)
    
    # Fourier coefficient at ∅
    f_hat_empty = f.fourier()[0]
    
    # Interpret the expectation
    if abs(expectation) < 0.01:
        interp = "balanced"
    elif expectation > 0:
        interp = f"biased to +1 ({(1+expectation)/2:.0%})"
    else:
        interp = f"biased to -1 ({(1-expectation)/2:.0%})"
    
    print(f"{name:<15} | {expectation:<10.4f} | {f_hat_empty:<10.4f} | {interp}")

In [None]:
# Demonstrate variance = Σ_{S≠∅} f̂(S)² = 1 - f̂(∅)²
print("Variance verification: Var[f] = 1 - f̂(∅)² = Σ_{S≠∅} f̂(S)²")
print("=" * 70)
print(f"{'Function':<15} | {'Var[f]':<10} | {'1-f̂(∅)²':<10} | {'Σ f̂(S)²':<10} | Match")
print("-" * 70)

for name, f in functions.items():
    tt = np.asarray(f.get_representation("truth_table"))
    pm_vals = 1 - 2 * tt
    var_direct = np.var(pm_vals)
    
    fourier = f.fourier()
    f_hat_empty = fourier[0]
    var_formula = 1 - f_hat_empty**2
    sum_sq = sum(c**2 for c in fourier[1:])
    
    match = "✓" if abs(var_direct - var_formula) < 1e-10 and abs(var_direct - sum_sq) < 1e-10 else "✗"
    print(f"{name:<15} | {var_direct:<10.4f} | {var_formula:<10.4f} | {sum_sq:<10.4f} | {match}")

print("\nKey insight: Balanced functions (E[f]=0) have maximum variance = 1")

## 2.1.1 Singleton Fourier Coefficients and Conditional Expectations

For singleton sets $S = \{i\}$, we have:

$$\hat{f}(\{i\}) = \mathbf{E}[f(x) \cdot x_i] = \frac{1}{2}\left(\mathbf{E}[f(x) | x_i = 1] - \mathbf{E}[f(x) | x_i = -1]\right)$$

**Interpretation**: $\hat{f}(\{i\})$ measures how much variable $i$ "pulls" the function:
- $\hat{f}(\{i\}) > 0$: $f$ tends to output +1 when $x_i = +1$
- $\hat{f}(\{i\}) < 0$: $f$ tends to output +1 when $x_i = -1$
- $\hat{f}(\{i\}) = 0$: Variable $i$ has no "net effect" on the output
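
The conditional-expectation form follows by splitting the expectation on the value of $x_i$, which is $+1$ or $-1$ with probability $\frac{1}{2}$ each:

$$\mathbf{E}[f(x) \cdot x_i] = \tfrac{1}{2} \cdot (+1) \cdot \mathbf{E}[f(x) | x_i = 1] + \tfrac{1}{2} \cdot (-1) \cdot \mathbf{E}[f(x) | x_i = -1]$$

For example, for the dictator $f(x) = x_i$ the two conditional expectations are $+1$ and $-1$, so $\hat{f}(\{i\}) = 1$.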

In [None]:
# Demonstrate the conditional expectation interpretation of f̂({i})
def analyze_singleton_coefficients(f, name):
    """Show how f̂({i}) relates to conditional expectations."""
    n = f.n_vars
    fourier = f.fourier()
    
    print(f"\n{name} (n={n}):")
    print("-" * 70)
    print(f"{'Variable':<10} | {'f̂({i})':<10} | {'E[f|xᵢ=+1]':<12} | {'E[f|xᵢ=-1]':<12} | {'Diff/2':<10}")
    print("-" * 70)
    
    tt = np.asarray(f.get_representation("truth_table"))
    pm_vals = 1 - 2 * tt  # Convert to ±1: Boolean 0 → +1, Boolean 1 → -1
    
    for i in range(n):
        # f̂({i}) from Fourier expansion
        idx = 1 << i  # Index for singleton {i}
        f_hat_i = fourier[idx]
        
        # Compute conditional expectations directly
        # Key insight: bit=0 means xᵢ=+1, bit=1 means xᵢ=-1 (O'Donnell convention)
        mask_minus = np.array([(x >> i) & 1 == 1 for x in range(len(tt))])  # xᵢ = -1
        mask_plus = ~mask_minus  # xᵢ = +1
        
        E_given_plus = np.mean(pm_vals[mask_plus])   # E[f | xᵢ = +1]
        E_given_minus = np.mean(pm_vals[mask_minus]) # E[f | xᵢ = -1]
        
        diff_half = (E_given_plus - E_given_minus) / 2
        
        match = "✓" if abs(f_hat_i - diff_half) < 1e-10 else "✗"
        print(f"x_{i:<8} | {f_hat_i:<10.4f} | {E_given_plus:<12.4f} | {E_given_minus:<12.4f} | {diff_half:<10.4f} {match}")

# Analyze several functions
analyze_singleton_coefficients(bf.majority(5), "MAJORITY₅")
analyze_singleton_coefficients(bf.tribes(2, 3), "TRIBES(2,3)")
analyze_singleton_coefficients(bf.dictator(4, 1), "DICTATOR x₁")

## 2.2 Convolution

**Definition**: The convolution of $f$ and $g$ is:
$$f * g(x) = \mathbf{E}_{y}[f(y) \cdot g(x \cdot y)]$$

where $x \cdot y$ denotes coordinate-wise multiplication in $\{\pm 1\}^n$.

**Theorem 2.8 (Convolution Theorem)**: 
$$\widehat{f * g}(S) = \hat{f}(S) \cdot \hat{g}(S)$$

Convolution in the time domain = multiplication in the Fourier domain!
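
To make the definition concrete, here is a minimal brute-force sketch in pure Python, independent of `boofun`. The helpers `chi`, `fourier_coefficient`, `convolve`, `maj3`, and `parity3` are illustrative names, not library functions. It evaluates $f * g$ directly from the definition and compares its Fourier coefficients with the pointwise products.

```python
from itertools import product

def chi(S, x):
    """Character chi_S(x): product of the coordinates of x indexed by S."""
    out = 1
    for i in S:
        out *= x[i]
    return out

def fourier_coefficient(f, S, n):
    """f_hat(S) = E_x[f(x) * chi_S(x)] over {-1,+1}^n."""
    points = list(product([1, -1], repeat=n))
    return sum(f(x) * chi(S, x) for x in points) / len(points)

def convolve(f, g, n):
    """Return h with h(x) = E_y[f(y) * g(x*y)], x*y taken coordinate-wise."""
    points = list(product([1, -1], repeat=n))
    def h(x):
        return sum(f(y) * g(tuple(a * b for a, b in zip(x, y))) for y in points) / len(points)
    return h

def maj3(x):
    return 1 if sum(x) > 0 else -1

def parity3(x):
    return x[0] * x[1] * x[2]

n = 3
h = convolve(maj3, parity3, n)
subsets = [tuple(i for i in range(n) if (mask >> i) & 1) for mask in range(2 ** n)]

# Largest deviation between (f*g)^(S) and f_hat(S) * g_hat(S) over all S.
max_gap = max(
    abs(fourier_coefficient(h, S, n)
        - fourier_coefficient(maj3, S, n) * fourier_coefficient(parity3, S, n))
    for S in subsets
)
print(max_gap)
```

Note that $f * g$ is real-valued in general. Here it equals $-\frac{1}{2}\chi_{\{0,1,2\}}$, because parity has a single nonzero coefficient and majority's coefficient on that set is $-\frac{1}{2}$.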

In [None]:
# Demonstrate the Convolution Theorem
f = bf.majority(3)
g = bf.parity(3)

# Get Fourier coefficients
f_fourier = f.fourier()
g_fourier = g.fourier()

# The Convolution Theorem says: (f*g)^(S) = f̂(S) · ĝ(S)
# convolution() returns a BooleanFunction, so we get its Fourier coefficients
conv_fn = convolution(f, g)
conv_fourier = conv_fn.fourier()

# Compare both sides of the theorem coefficient by coefficient
print("Convolution Theorem: (f*g)^(S) = f̂(S) · ĝ(S)")
print("=" * 60)
print(f"{'S (index)':<10} | {'f̂(S)':<10} | {'ĝ(S)':<10} | {'(f*g)^(S)':<10} | {'f̂·ĝ':<10} | Match")
print("-" * 60)

for i, (fc, gc, conv_c) in enumerate(zip(f_fourier, g_fourier, conv_fourier)):
    if abs(fc) > 0.01 or abs(gc) > 0.01 or abs(conv_c) > 0.01:
        expected_c = fc * gc
        match = "✓" if abs(conv_c - expected_c) < 1e-10 else "✗"
        # Convert index to set notation
        bits = [j for j in range(3) if (i >> j) & 1]
        set_str = str(set(bits)) if bits else "∅"
        print(f"{set_str:<10} | {fc:<10.4f} | {gc:<10.4f} | {conv_c:<10.4f} | {expected_c:<10.4f} | {match}")

print()
print("Key insight: convolution(f, g) returns the convolved function.")
print("Its Fourier coefficients equal the pointwise product f̂(S) · ĝ(S)!")

### The PropertyTester

The `boofun` library provides `PropertyTester`, a class that implements many property testing algorithms from the literature. Property testing lets us efficiently decide whether a function has a property or is far from having it, using only a small number of queries.

**Available tests:**
- `constant_test()` - Is $f$ constant?
- `blr_linearity_test()` - Is $f$ linear? (BLR algorithm)
- `dictator_test()` - Is $f$ a dictator function?
- `junta_test(k)` - Does $f$ depend on at most $k$ variables?
- `monotonicity_test()` - Is $f$ monotone?
- `symmetry_test()` - Is $f$ symmetric?
- `balanced_test()` - Does $f$ output 0 and 1 equally often?
- `affine_test()` - Is $f$ affine over GF(2)?
- `run_all_tests()` - Run everything!

In [None]:
# PropertyTester demonstration - run all tests on various functions
from boofun.analysis import PropertyTester

demo_functions = {
    "PARITY₄": bf.parity(4),
    "MAJORITY₅": bf.majority(5),
    "DICTATOR x₀": bf.dictator(4, 0),
    "AND₃": bf.AND(3),
    "TRIBES(2,2)": bf.tribes(2, 2),
}

print("PropertyTester: Comprehensive Function Analysis")
print("=" * 80)

for name, f in demo_functions.items():
    tester = PropertyTester(f)
    
    # Run individual tests
    is_const = tester.constant_test()
    is_balanced = tester.balanced_test()
    is_linear = tester.blr_linearity_test(100)
    is_monotone = tester.monotonicity_test(500)
    is_symmetric = tester.symmetry_test(500)
    is_dictator, dictator_var = tester.dictator_test(500)
    
    print(f"\n{name}:")
    props = []
    if is_const: props.append("constant")
    if is_balanced: props.append("balanced")
    if is_linear: props.append("linear")
    if is_monotone: props.append("monotone")
    if is_symmetric: props.append("symmetric")
    if is_dictator: props.append(f"dictator(x_{dictator_var})")
    
    print(f"  Properties: {', '.join(props) if props else 'none detected'}")
    
    # Show Fourier summary
    fourier = f.fourier()
    degree = max(bin(i).count('1') for i, c in enumerate(fourier) if abs(c) > 1e-10)
    sparsity = sum(1 for c in fourier if abs(c) > 1e-10)
    print(f"  Fourier degree: {degree}, Sparsity: {sparsity}/{len(fourier)}")

## 2.3 BLR Linearity Test

The **BLR Test** (Blum-Luby-Rubinfeld, 1993) tests whether a function $F: \mathbb{F}_2^n \to \{\pm 1\}$ is **linear** (a character $\chi_S$).

**Algorithm**:
1. Pick $x, y \sim \mathbb{F}_2^n$ independently and uniformly at random
2. Query $F(x)$, $F(y)$, and $F(x + y)$
3. **Accept** if $F(x) \cdot F(y) = F(x + y)$; otherwise **reject**

**Key Insight**: For a linear function $\chi_S$:
$$\chi_S(x) \cdot \chi_S(y) = \chi_S(x + y)$$
Always! So linear functions pass with probability 1.

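**Theorem 2.10**: $\Pr[\text{BLR accepts } F] = \frac{1}{2} + \frac{1}{2}\sum_{S \subseteq [n]} \hat{F}(S)^3$

This holds because the test accepts exactly when $F(x) F(y) F(x + y) = 1$, and $\mathbf{E}_{x,y}[F(x) F(y) F(x + y)] = \sum_{S} \hat{F}(S)^3$.

**Corollary**: If BLR accepts $F$ with probability at least $1 - \varepsilon$, then $F$ is $\varepsilon$-close to some linear function $\chi_S$. Conversely, if $F$ is $\varepsilon$-close to $\chi_S$, a union bound over the three queries shows that BLR rejects with probability at most $3\varepsilon$.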
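
For small $n$ the acceptance probability can be computed exactly by enumerating every pair $(x, y)$. The sketch below is a pure-Python illustration with hypothetical helper names, not part of `boofun`. It compares that exact value with the closed form $\frac{1}{2} + \frac{1}{2}\sum_S \hat{F}(S)^3$, which follows from $\mathbf{E}_{x,y}[F(x)F(y)F(x+y)] = \sum_S \hat{F}(S)^3$.

```python
from itertools import product

def chi(S, x):
    """Character chi_S(x): product of the coordinates of x indexed by S."""
    out = 1
    for i in S:
        out *= x[i]
    return out

def blr_acceptance_exact(F, n):
    """Fraction of pairs (x, y) for which F(x) * F(y) == F(x + y)."""
    points = list(product([1, -1], repeat=n))
    accepted = 0
    for x in points:
        for y in points:
            # Addition in F_2^n is coordinate-wise multiplication in {-1,+1}^n.
            xy = tuple(a * b for a, b in zip(x, y))
            if F(x) * F(y) == F(xy):
                accepted += 1
    return accepted / len(points) ** 2

def sum_cubed_coefficients(F, n):
    """Sum over all S of F_hat(S)^3, computed by brute force."""
    points = list(product([1, -1], repeat=n))
    total = 0.0
    for mask in range(2 ** n):
        S = [i for i in range(n) if (mask >> i) & 1]
        coeff = sum(F(x) * chi(S, x) for x in points) / len(points)
        total += coeff ** 3
    return total

def parity3(x):
    return x[0] * x[1] * x[2]

def maj3(x):
    return 1 if sum(x) > 0 else -1

results = {}
for name, F in [("parity3", parity3), ("maj3", maj3)]:
    exact = blr_acceptance_exact(F, 3)
    formula = 0.5 + 0.5 * sum_cubed_coefficients(F, 3)
    results[name] = (exact, formula)
    print(f"{name}: exact = {exact:.4f}, formula = {formula:.4f}")
```

Parity is accepted with probability 1. For 3-bit majority, $\sum_S \hat{F}(S)^3 = 3 \cdot \frac{1}{8} - \frac{1}{8} = \frac{1}{4}$, so both values equal $0.625$.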

In [None]:
# BLR Linearity Testing on diverse function families
# The power of the library: test ANY function, even strange custom ones!

# Built-in functions
test_functions = {
    # Linear functions (should PASS)
    "PARITY₅ (linear)": bf.parity(5),
    "x₂ (dictator)": bf.dictator(5, 2),
    
    # Classic non-linear functions
    "MAJORITY₅": bf.majority(5),
    "AND₄": bf.AND(4),
    "OR₄": bf.OR(4),
    "TRIBES(2,3)": bf.tribes(2, 3),
    
    # Random functions
    "RANDOM₅": bf.random(5, seed=123),
    "RANDOM₅ balanced": bf.random(5, balanced=True, seed=456),
}

# Create a "hash-like" function (complex, non-linear)
def create_hash_like(n):
    """Create a pseudorandom-looking function that mixes inputs."""
    def hash_fn(x):
        # Simple hash: XOR of all pairwise ANDs (a quadratic form over GF(2))
        h = 0
        for i in range(n):
            for j in range(i+1, n):
                h ^= (x[i] & x[j])
        return h
    
    tt = [hash_fn([int(b) for b in format(i, f'0{n}b')]) for i in range(2**n)]
    return bf.create(tt)

test_functions["HASH-LIKE₅"] = create_hash_like(5)

print("BLR Linearity Test Results")
print("=" * 70)
print(f"{'Function':<20} | {'Linear?':<8} | {'BLR':<8} | {'Σf̂³':<10} | Interpretation")
print("-" * 70)

for name, f in test_functions.items():
    tester = PropertyTester(f)
    blr_result = tester.blr_linearity_test(200)
    is_linear = f.is_linear()
    
    # Σ f̂(S)³ = E[F(x)·F(y)·F(x+y)]; the acceptance probability is 1/2 + 1/2 · Σ f̂(S)³
    fourier = f.fourier()
    sum_cubed = sum(c**3 for c in fourier)
    
    status = "PASS" if blr_result else "FAIL"
    
    # Interpret
    if is_linear:
        interp = "exact linear!"
    elif sum_cubed > 0.9:
        interp = "nearly linear"
    elif sum_cubed > 0.5:
        interp = "somewhat structured"
    else:
        interp = "far from linear"
    
    print(f"{name:<20} | {str(is_linear):<8} | {status:<8} | {sum_cubed:<10.4f} | {interp}")

In [None]:
# The POWER of the library: test ANY function you can define!
# This is incredibly useful for research and exploration.

print("Testing arbitrary functions with BLR:")
print("=" * 70)

# Example 1: A "noisy parity" - parity with some bits flipped
def create_noisy_linear(n, noise_rate):
    """Create parity with random noise."""
    parity = bf.parity(n)
    tt = list(parity.get_representation("truth_table"))
    np.random.seed(42)
    for i in range(len(tt)):
        if np.random.random() < noise_rate:
            tt[i] = 1 - tt[i]
    return bf.create(tt)

# Example 2: A "nearly majority" - perturb a few entries
def create_near_majority(n):
    """Majority with slight modifications."""
    maj = bf.majority(n)
    tt = list(maj.get_representation("truth_table"))
    tt[0] = 1 - tt[0]  # Flip one entry
    return bf.create(tt)

# Example 3: A completely custom function
def create_weird_function(n):
    """A strange custom function for testing."""
    def weird(x):
        # Returns 1 if popcount is prime, 0 otherwise
        popcount = sum(x)
        return 1 if popcount in [2, 3, 5, 7, 11, 13] else 0
    tt = [weird([int(b) for b in format(i, f'0{n}b')]) for i in range(2**n)]
    return bf.create(tt)

custom_functions = {
    "PARITY₅ (exact)": bf.parity(5),
    "NOISY PARITY (5%)": create_noisy_linear(5, 0.05),
    "NOISY PARITY (20%)": create_noisy_linear(5, 0.20),
    "NEAR MAJORITY₅": create_near_majority(5),
    "POPCOUNT IS PRIME": create_weird_function(5),
}

for name, f in custom_functions.items():
    tester = PropertyTester(f)
    
    # Run BLR multiple times to see variance
    results = [tester.blr_linearity_test(100) for _ in range(5)]
    pass_rate = sum(results) / len(results)
    
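    fourier = f.fourier()
    sum_cubed = sum(c**3 for c in fourier)
    accept_prob = 0.5 + 0.5 * sum_cubed  # Theorem 2.10: single-round acceptance probability
    
    print(f"{name}:")
    print(f"  Pr[one BLR round accepts] theoretical = {accept_prob:.4f}")
    print(f"  BLR pass rate (5 trials)              = {pass_rate:.1%}")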
    print()

In [None]:
# Verify Theorem 2.10: Pr[accept] = 1/2 + 1/2 · Σ f̂(S)³
print("\nBLR Acceptance Probability = 1/2 + 1/2 · Σ f̂(S)³:")
print(f"{'Function':<25} | {'Σ f̂(S)³':<12} | {'1/2 + 1/2·Σ':<12} | {'Empirical'}")
print("-" * 70)

def blr_empirical_acceptance(f, trials=10000):
    """Run BLR test many times and compute acceptance rate."""
    n = f.n_vars
    accepts = 0
    
    for _ in range(trials):
        x = np.random.randint(0, 2, n)
        y = np.random.randint(0, 2, n)
        xy = (x + y) % 2  # XOR in F_2
        
        fx = 1 - 2 * f.evaluate(x)  # Convert to ±1
        fy = 1 - 2 * f.evaluate(y)
        fxy = 1 - 2 * f.evaluate(xy)
        
        if fx * fy == fxy:
            accepts += 1
    
    return accepts / trials

for name, f in test_functions.items():
    # Direct API: f.fourier()
    fourier = f.fourier()
    sum_cubed = sum(c**3 for c in fourier)
    predicted = 0.5 + 0.5 * sum_cubed  # Theorem 2.10
    empirical = blr_empirical_acceptance(f)
    print(f"{name:<25} | {sum_cubed:<12.4f} | {predicted:<12.4f} | {empirical:<12.4f}")

## 2.4 Local Correction

Once we know $F$ is $\varepsilon$-close to some linear function $\chi_S$, we can **locally correct** it!

**LocalCorrect(F, x)**:
1. Choose $y \sim \mathbb{F}_2^n$ uniformly
2. Query $F(y)$ and $F(x + y)$
3. Return $F(y) \cdot F(x + y)$

**Theorem 2.15**: For all $x$:
$$\Pr_y[\text{LocalCorrect}(F, x) = \chi_S(x)] \geq 1 - 2\varepsilon$$

This gives us access to $\chi_S$ even without knowing $S$!
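
The following is a minimal pure-Python sketch of LocalCorrect, independent of `boofun`. The corrupted oracle `F`, the choice of corrupted inputs, and all helper names are assumptions made for illustration. It computes the exact single-query success probability for every $x$ and compares the worst case with the bound $1 - 2\varepsilon$, then runs a majority vote over repeated queries.

```python
import random
from itertools import product

n = 4
points = list(product([1, -1], repeat=n))

def parity(x):
    """The target linear function chi_[n] in +/-1 notation."""
    out = 1
    for xi in x:
        out *= xi
    return out

# Hypothetical corrupted oracle: parity with two of the 16 entries flipped.
corrupted = {points[3], points[10]}

def F(x):
    return -parity(x) if x in corrupted else parity(x)

eps = len(corrupted) / len(points)  # distance from F to parity

def mul(x, y):
    """Coordinate-wise product, i.e. x + y in F_2^n."""
    return tuple(a * b for a, b in zip(x, y))

def local_correct(F, x, rng, repetitions=1):
    """Majority vote over F(y) * F(x + y) for random y; use odd repetitions to avoid ties."""
    votes = 0
    for _ in range(repetitions):
        y = tuple(rng.choice([1, -1]) for _ in range(n))
        votes += F(y) * F(mul(x, y))
    return 1 if votes > 0 else -1

def single_query_success(F, x):
    """Exact Pr_y[F(y) * F(x + y) == parity(x)], by enumerating all y."""
    good = sum(1 for y in points if F(y) * F(mul(x, y)) == parity(x))
    return good / len(points)

worst = min(single_query_success(F, x) for x in points)
print(f"eps = {eps}, worst single-query success = {worst}, bound = {1 - 2 * eps}")

rng = random.Random(0)
recovered = sum(local_correct(F, x, rng, repetitions=15) == parity(x) for x in points)
print(f"majority vote recovered {recovered} of {len(points)} inputs")
```

A single query fails only when exactly one of $y$ and $x + y$ lands on a corrupted input, which is where the union bound $2\varepsilon$ comes from. Repeating the query and taking a majority amplifies the success probability whenever the single-query success rate exceeds $\frac{1}{2}$.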

In [None]:
# Local Correction: Recovering the true linear function from a noisy version.
# Key idea: we can compute χ_S(x) without knowing S.
# 
# Now available in the library: PropertyTester.local_correct()

from boofun.analysis import PropertyTester

def create_noisy_function(base_fn, noise_rate, seed=None):
    """Create a noisy version of any function."""
    if seed is not None:
        np.random.seed(seed)
    tt = list(base_fn.get_representation("truth_table"))
    for i in range(len(tt)):
        if np.random.random() < noise_rate:
            tt[i] = 1 - tt[i]
    return bf.create(tt)

print("Local Correction Demonstration (using boofun.PropertyTester)")
print("=" * 70)

# Test local correction at different noise levels
n = 5
true_parity = bf.parity(n)

for noise_rate in [0.05, 0.10, 0.15, 0.20, 0.30]:
    noisy_fn = create_noisy_function(true_parity, noise_rate, seed=42)
    
    # Measure actual noise level
    tt_true = np.asarray(true_parity.get_representation("truth_table"))
    tt_noisy = np.asarray(noisy_fn.get_representation("truth_table"))
    actual_noise = np.mean(tt_true != tt_noisy)
    
    # Use the library's local_correct method!
    tester = PropertyTester(noisy_fn, random_seed=42)
    
    # Test local correction accuracy
    correct = 0
    for x in range(2**n):
        corrected = tester.local_correct(x, repetitions=15)
        # True parity: χ_[n](x) = (-1)^popcount(x)
        true_val = 1 - 2 * (bin(x).count("1") % 2)
        if corrected == true_val:
            correct += 1
    
    accuracy = correct / (2**n)
    theoretical_bound = 1 - 2 * actual_noise  # Theorem 2.15 (bound for a single query)
    
    print(f"Noise ε={actual_noise:.1%}: LocalCorrect accuracy = {accuracy:.1%} "
          f"(single-query bound: ≥{max(0, theoretical_bound):.1%})")

print("\nKey insight: Even with 20% corruption, we recover >90% accuracy!")
print("This works because we query multiple random y's and take majority vote.")
print("\n# The library does the heavy lifting:")
print("# tester = PropertyTester(f)")
print("# tester.local_correct(x, repetitions=15)  # Returns ±1")

## Summary

### Key Takeaways from Lecture 2:

1. **Expectation & Variance**: 
   - $\mathbf{E}[f] = \hat{f}(\emptyset)$ (the constant Fourier coefficient)
   - $\mathbf{Var}[f] = 1 - \hat{f}(\emptyset)^2$ for Boolean functions
   - Balanced functions have $\mathbf{E}[f] = 0$ and maximum variance

2. **Singleton Coefficients**:
   - $\hat{f}(\{i\}) = \frac{1}{2}(\mathbf{E}[f|x_i=+1] - \mathbf{E}[f|x_i=-1])$
   - Measures how much variable $i$ "pulls" the function

3. **Convolution Theorem**: $\widehat{f * g}(S) = \hat{f}(S) \cdot \hat{g}(S)$
   - Convolution in time domain = multiplication in Fourier domain

4. **BLR Linearity Test**: 
   - Tests if $F$ is linear with only 3 queries per round
   - Acceptance probability = $\frac{1}{2} + \frac{1}{2}\sum_S \hat{F}(S)^3$
   - Works on ANY function - built-ins, custom, hash-like, random!

5. **Local Correction**: 
   - Recover $\chi_S(x)$ without knowing $S$
   - $F(y) \cdot F(x \oplus y)$ gives correct answer with probability $\geq 1 - 2\varepsilon$

### Using boofun:

```python
import boofun as bf
from boofun.analysis import PropertyTester
from boofun.analysis.fourier import convolution

# Create functions easily
f = bf.majority(5)
g = bf.tribes(2, 3)
h = bf.random(4, balanced=True)
custom = bf.create([0, 1, 1, 0, 1, 0, 0, 1])  # Any truth table!

# Fourier analysis - simple!
coeffs = f.fourier()
expectation = coeffs[0]  # E[f] = f̂(∅)

# Property testing
tester = PropertyTester(f)
tester.blr_linearity_test(100)
tester.monotonicity_test(500)
tester.run_all_tests()  # Everything at once!

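# Local correction (BLR self-correction)
x = 5  # an input encoded as an integer index, as in Section 2.4
tester.local_correct(x, repetitions=15)  # Returns ±1
tester.local_correct_all(repetitions=15)  # All 2^n inputs

# Convolution returns a function; its Fourier coefficients are f̂(S) · ĝ(S)
conv_fn = convolution(f, bf.parity(5))  # both inputs need the same number of variables
conv_coeffs = conv_fn.fourier()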
```

---
*Based on lecture notes by Austin Pechan. Notebook created by Gabriel Taboada using the `boofun` library.*