# Bonferroni Inequalities  


**Algebraic approach using Taylor polynomial with remainder (and indicator random variables)**   

consider a degree $n$ monic polynomial in $x$ written in factored form, where $x$ is the usual variable of a single-variable polynomial  
$p(x) = (x - r_1)(x - r_2)...(x - r_n)$  

now suppose we insert a slack parameter, $t$, to scale the roots

$p(x,t) = (x - t \cdot r_1)(x -t \cdot r_2)...(x - t \cdot r_n)$  

and if we fix $x =1$ we have  

$p(1,t) = (1 - t \cdot r_1)(1 -t \cdot r_2)...(1 - t \cdot r_n)$  

or just  
$p(t) = (1 - t \cdot r_1)(1 -t \cdot r_2)...(1 - t \cdot r_n)$  
for conciseness   

using some results (see notes for chapters 6 and 12 of Cauchy Schwarz Masterclass)     

$p(t) = (1 - t \cdot r_1)(1 -t \cdot r_2)...(1 - t \cdot r_n)  = 1 + (- t) e_1(\mathbf r) + (-t)^{2}e_2(\mathbf r) + .... + (-t)^{n-1}e_{n-1}(\mathbf r) + (-t)^n e_{n}(\mathbf r) $  
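
The expansion above can be spot-checked numerically. The following is a minimal sketch, not part of the original notes: the helper names (`elementary_symmetric`, `p_factored`, `p_expanded`) and the sample roots are assumptions chosen for illustration.

```python
from itertools import combinations
from math import isclose, prod


def elementary_symmetric(r, j):
    """e_j(r): sum of the products over all j-element subsets of r (e_0 = 1)."""
    return sum(prod(c) for c in combinations(r, j))


def p_factored(r, t):
    """p(t) = (1 - t r_1)(1 - t r_2)...(1 - t r_n)."""
    return prod(1 - t * ri for ri in r)


def p_expanded(r, t):
    """p(t) = sum over j of (-t)^j e_j(r)."""
    return sum((-t) ** j * elementary_symmetric(r, j) for j in range(len(r) + 1))


# Arbitrary (hypothetical) roots and slack value, used only as a spot check.
r = [0.5, -1.25, 2.0, 3.0]
t = 0.7
assert isclose(p_factored(r, t), p_expanded(r, t))
```

The brute-force sum over subsets is exponential in $n$, which is acceptable here since the point is only to confirm the identity on small inputs.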

if we differentiate $k$ times, we get 

$\frac{d ^k p(t)}{d t^k} = (-1)^k k!\cdot e_k(\mathbf r) + (-1)^k\frac{(-t)^1(k+1)!}{1!} \cdot e_{k+1}(\mathbf r) + (-1)^k \frac{(-t)^2(k+2)!}{2!}  \cdot e_{k+2}(\mathbf r) + ... + (-1)^k \frac{(-t)^{n-k} n!}{(n-k)!}\cdot e_{n}(\mathbf r) $   

*remark: setting $t=0$ gives*  
$\frac{d ^k p(0)}{d t^k} = (-1)^k k!\cdot e_k(\mathbf r)$  
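
The remark can be checked by differentiating the coefficient list directly. This is a sketch under the assumption that integer roots are used so the comparison is exact; the function names are hypothetical.

```python
from itertools import combinations
from math import factorial, prod


def elementary_symmetric(r, j):
    """e_j(r): sum of the products over all j-element subsets of r (e_0 = 1)."""
    return sum(prod(c) for c in combinations(r, j))


def coefficients(r):
    """Coefficients c[j] of t^j in p(t) = prod(1 - t r_i), one factor at a time."""
    c = [1]
    for ri in r:
        nxt = c + [0]
        for j in range(len(c)):
            nxt[j + 1] -= ri * c[j]  # multiply the running product by (1 - ri t)
        c = nxt
    return c


def derivative(c):
    """Coefficient list of the derivative of the polynomial with coefficients c."""
    return [j * c[j] for j in range(1, len(c))]


def kth_derivative_at_zero(r, k):
    """Differentiate k times and read off the constant term."""
    c = coefficients(r)
    for _ in range(k):
        c = derivative(c)
    return c[0] if c else 0


r = [1, 2, 3]  # hypothetical integer roots, so all arithmetic is exact
for k in range(len(r) + 2):
    expected = (-1) ** k * factorial(k) * elementary_symmetric(r, k)
    assert kth_derivative_at_zero(r, k) == expected
```

The loop runs up to $k = n+1$, where both sides are zero because a degree $n$ polynomial has no nonzero $(n+1)$-th derivative.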


hence we have a Taylor polynomial of 

$p(b) = p(a) + p'(a)(b-a) + \frac{p''(a)}{2!}(b-a)^2 +... + \frac{p^{(k-1)}(a)}{(k-1)!}(b-a)^{k-1} + \text{Remainder}$  

where the remainder is in Lagrange form, given by  
$\text{Remainder}=   \frac{p^{(k)}(t^*)}{k!} (b-a)^k$  for $t^* \in [a,b]$  

selecting $a:=0$ and $b :=1$ this reads  

$(1 - r_1)(1 -r_2)...(1 -r_n) = p(1) = p(0) + p'(0) + \frac{p''(0)}{2!} + ... + \frac{p^{(k-1)}(0)}{(k-1)!} + \frac{p^{(k)}(t^*)}{k!}$  
$p(1) = 1 - e_1(\mathbf r) + e_2(\mathbf r)... + (-1)^{(k-1)} e_{k-1}(\mathbf r)+ \text{Remainder}$   
where 
$\text{Remainder}=   \frac{p^{(k)}(t^*)}{k!} (b-a)^k$  for $t^* \in [0,1]$ 

- - - - -  
*remarks:*  
If we set $k=n+1$, the remainder vanishes because the $(n+1)$-th derivative of a degree $n$ polynomial is identically zero, and we recover the exact formula for inclusion-exclusion (see the final paragraphs, which discuss inserting indicator random variables).   
However, we may recover the Bonferroni inequalities for $1 \leq k \leq n-1$, in particular noticing that 

$1 - e_1(\mathbf r) + e_2(\mathbf r)... + (-1)^{(k-1)} e_{k-1}(\mathbf r) \leq p(1)$  
when k is even, and  
$p(1) \leq 1 - e_1(\mathbf r) + e_2(\mathbf r)... + (-1)^{(k-1)} e_{k-1}(\mathbf r)$   
when $k$ is odd  
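
The two inequalities can be verified exhaustively for small $n$. This is a sketch assuming each $r_i$ is $0$ or $1$, which is the case the indicator-variable argument in these notes relies on; `truncated_sum` and `bonferroni_holds` are hypothetical helper names.

```python
from itertools import combinations, product
from math import prod


def elementary_symmetric(r, j):
    """e_j(r): sum of the products over all j-element subsets of r (e_0 = 1)."""
    return sum(prod(c) for c in combinations(r, j))


def truncated_sum(r, k):
    """1 - e_1(r) + e_2(r) - ... + (-1)^(k-1) e_(k-1)(r)."""
    return sum((-1) ** j * elementary_symmetric(r, j) for j in range(k))


def bonferroni_holds(r, k):
    """Truncated sum is a lower bound on p(1) for even k, an upper bound for odd k."""
    p1 = prod(1 - ri for ri in r)
    s = truncated_sum(r, k)
    return s <= p1 if k % 2 == 0 else p1 <= s


n = 5  # small enough to enumerate all 2^n bit tuples
assert all(
    bonferroni_holds(r, k)
    for r in product((0, 1), repeat=n)
    for k in range(1, n + 1)
)

# The restriction to {0, 1} matters: with r = (-1, -1) and k = 1,
# p(1) = 4 exceeds the truncated sum 1, so the odd-k bound fails.
assert not bonferroni_holds((-1, -1), 1)
```
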
- - - - -    
*bounding the remainder*  

if we isolate the effect of the sign, the claim is that 

$(-1)^k\text{Remainder}= \text{Remainder without sign} \geq 0$  
so that the omitted remainder term  
is non-negative when $k$ is even, and non-positive when $k$ is odd. 

it remains to prove that  
$\text{Remainder without sign} \geq 0$ for $t^*\in[0,1]$  
we can do this most easily by assuming each $r_i \in\{0,1\}$  (i.e. a two item set, *not* an interval)  
(see commentary at end on indicator random variables)  
And in particular there are $m$ different $r_i$'s with the value equal to $1$ and $(n-m)$ different $r_i$'s with the value $0$.    

if we re-visit our original polynomial, we now have   
$p(t) = (1 - t \cdot r_1)(1 -t \cdot r_2)...(1 - t \cdot r_n) = (1 - 1\cdot t)^m (1 - 0\cdot t)^{n-m} = (1 - t)^m$    

and if we differentiate once, we recover a polynomial with $(m-1)$ roots equal to one, and by induction / general knowledge (e.g. see Newton's Identities section of Vandermonde Matrix notebook in the Linear Algebra Folder) this holds if we differentiate $v$ times where $1 \leq v \leq m$ -- we then have a polynomial with $m-v$ roots all equal to one.  For differentiating $m$ times we have a constant polynomial, and for differentiating $\gt m$ times the polynomial is identically zero.   

We can now re-visit  

$\text{Remainder} = \frac{p^{(k)}(t^*)}{k!} (b-a)^k$  for $t^*\in [0,1]$  
and recall that if $t^* = 0$ then we have  

$\frac{p^{(k)}(t^*)}{k!} (b-a)^k = (-1)^k e_k(\mathbf r)$   
which is non-negative if $k$ is even and non-positive if $k$ is odd (which we refer to as having a positive sign and a negative sign, respectively)  

and looking back again at the remainder term  
$\frac{p^{(k)}(t^*)}{k!}$

we can conclude that, as a function of $t^*$, this is a polynomial all of whose roots (if any) are equal to $1$. Hence, while the remainder may be $0$ at $t^* = 1$, by the Intermediate Value Theorem there cannot be a sign change for $t^* \in (0,1)$. Equivalently, to figure out whether the remainder is non-positive or non-negative, it is enough to evaluate at $t^* =0$, as that gives us all the information we require: the inequality is completely determined by whether $(-1)^k e_k(\mathbf r)$ is non-negative or non-positive.  
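
The sign claim can also be checked directly from the closed form of the derivatives of $(1-t)^m$, namely $p^{(k)}(t) = (-1)^k \frac{m!}{(m-k)!}(1-t)^{m-k}$ for $k \leq m$ and $0$ for $k > m$. The sketch below assumes that closed form and evaluates it on a grid in $[0,1]$; the function names and grid resolution are hypothetical choices.

```python
from math import factorial


def kth_derivative(m, k, t):
    """k-th derivative of (1 - t)^m evaluated at t."""
    if k > m:
        return 0  # differentiating more than m times gives the zero polynomial
    return (-1) ** k * (factorial(m) // factorial(m - k)) * (1 - t) ** (m - k)


def signless_remainder(m, k, t):
    """(-1)^k p^(k)(t) / k!, the Lagrange remainder with its sign factored out."""
    return (-1) ** k * kth_derivative(m, k, t) / factorial(k)


grid = [i / 20 for i in range(21)]  # 0.0, 0.05, ..., 1.0
assert all(
    signless_remainder(m, k, t) >= 0
    for m in range(7)
    for k in range(1, 8)
    for t in grid
)
```

A grid check is of course not a proof; it only illustrates that the signless remainder, which equals $\binom{m}{k}(1-t)^{m-k}$ for $k \leq m$, never goes negative on the interval.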

- - - - - 
*conclusion:*   

Thus Bonferroni's Inequalities may be interpreted as saying 

$p(1) \leq p(1) + \text{non-negative number in remainder term}$  
and   
$p(1) - \text{non-negative number in remainder term} \leq p(1)$  


To handle the general case of $n$ events we set $r_i := \mathbb I_{A_i}$.  The above bound holds over any and all sample paths of these random variables (there are $2^n$ of them), so the inequality still holds after we take the expectation, which recovers the general case of the Bonferroni Inequalities. Put differently, what we've proven above is that the inequality holds for 




$E\big[(1-\mathbb I_{A_1})(1-\mathbb I_{A_2})...(1-\mathbb I_{A_n})\big \vert \big(\mathbb I_{A_1}, \mathbb I_{A_2}, ...,\mathbb I_{A_n}\big)\big]$

or less abstractly for 

$E\big[(1-\mathbb I_{A_1})(1-\mathbb I_{A_2})...(1-\mathbb I_{A_n})\big \vert \big(\mathbb I_{A_1}, \mathbb I_{A_2}, ...,\mathbb I_{A_n}\big)=\omega_l \big]$  

where $\omega_l$ is some arbitrarily selected bit tuple, e.g. $\big(1,0,...,0\big)$   
(there are $2^n$ of these)  

But we then take the expectation of the above, where the inequality holds over each sample path so  

$Pr\big\{A_1^C \cap A_2^C \cap .... \cap A_n^C \big \}=E\Big[ (1-\mathbb I_{A_1})(1-\mathbb I_{A_2})...(1-\mathbb I_{A_n})\Big] = E\Big[ E\big[(1-\mathbb I_{A_1})(1-\mathbb I_{A_2})...(1-\mathbb I_{A_n})\big \vert \big(\mathbb I_{A_1}, \mathbb I_{A_2}, ...,\mathbb I_{A_n}\big) \big]\Big]$  

and of course we can negate and add one to recall that 

$1 - Pr\big\{A_1^C \cap A_2^C \cap .... \cap A_n^C \big \} = Pr\big\{A_1 \cup A_2 \cup ....\cup A_n \big \}$  
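
To see the resulting bounds on the union in action, here is a small exact computation. It is a sketch on a made-up example: a uniform sample space of 12 outcomes and four arbitrary events, none of which come from the notes. `intersection_sum` computes $E[e_j(\mathbb I_{A_1}, ..., \mathbb I_{A_n})]$, the sum of the probabilities of all $j$-fold intersections, and `union_bound` is one minus the truncated sum.

```python
from fractions import Fraction
from itertools import combinations


def prob(subset, n_outcomes):
    """Probability of a subset under the uniform measure on n_outcomes points."""
    return Fraction(len(subset), n_outcomes)


def intersection_sum(events, n_outcomes, j):
    """Sum over all j-subsets of events of Pr(intersection), for j >= 1."""
    return sum(
        prob(set.intersection(*c), n_outcomes) for c in combinations(events, j)
    )


def union_bound(events, n_outcomes, k):
    """One minus the truncated sum: S_1 - S_2 + ... up to the S_(k-1) term."""
    return sum(
        (-1) ** (j + 1) * intersection_sum(events, n_outcomes, j)
        for j in range(1, k)
    )


# Hypothetical example data: outcomes 0..11, four overlapping events.
n_outcomes = 12
events = [set(range(0, 6)), set(range(4, 9)), {1, 5, 8, 10}, {0, 2, 4, 6, 8, 10}]
union = prob(set.union(*events), n_outcomes)

for k in range(1, len(events) + 2):
    bound = union_bound(events, n_outcomes, k)
    if k == len(events) + 1:
        assert bound == union   # full inclusion-exclusion is exact
    elif k % 2 == 0:
        assert union <= bound   # even k: upper bound on Pr(union)
    else:
        assert bound <= union   # odd k: lower bound on Pr(union)
```

Because negating and adding one flips the direction of each inequality, even $k$ gives an upper bound on the union (for $k=2$ this is the familiar union bound) and odd $k$ gives a lower bound. `Fraction` is used so that the comparisons are exact rather than subject to floating-point rounding.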

