In [None]:
# 📊 Shannon Information Theory: Academic Demonstration

**From: Chapter 2 Literature Review - Shannon's Information Revolution (1948)**

This notebook provides an interactive academic exploration of Shannon's foundational formula and its connection to information physics.

## 📚 Historical Context

Claude Shannon's 1948 paper "A Mathematical Theory of Communication" {cite}`shannon1948mathematical` laid the mathematical foundation of information theory.

## 🧮 Shannon's Information Formula

Following Shannon, the information content (self-information) of an event $x$ is defined as:

$$I(x) = -\log_2 p(x)$$

where:
- $I(x)$ = information content (in bits)
- $p(x)$ = probability of event $x$
- Base 2 logarithm gives units in bits
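
As a short worked illustration of the formula, take a fair coin flip ($p = 1/2$) and a roll of a fair six-sided die ($p = 1/6$); the fairness of both is an assumption made here for concreteness:

$$I(\text{heads}) = -\log_2 \tfrac{1}{2} = 1 \text{ bit}, \qquad I(\text{six}) = -\log_2 \tfrac{1}{6} = \log_2 6 \approx 2.585 \text{ bits}$$

The less likely outcome (the die roll) carries more information than the more likely one (the coin flip).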

### Mathematical Properties

1. **Rare events carry more information:** As $p(x) \rightarrow 0$, $I(x) \rightarrow \infty$
2. **Certain events carry no information:** When $p(x) = 1$, $I(x) = 0$
3. **Additive for independent events:** if $x$ and $y$ are independent, then $p(x,y) = p(x)\,p(y)$, so $I(x,y) = I(x) + I(y)$
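
The three properties above can be checked numerically. The following is a minimal sketch using only the standard library; the helper name `self_information` and the specific probabilities are chosen here for illustration and are not part of the notebook's own code.

```python
import math

def self_information(p: float) -> float:
    """Information content in bits of an event with probability p (0 < p <= 1)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must satisfy 0 < p <= 1")
    return -math.log2(p)

# Property 1: a rarer event carries more information than a common one.
rare, common = self_information(0.01), self_information(0.5)

# Property 2: a certain event carries no information.
certain = self_information(1.0)

# Property 3: for independent events p(x, y) = p(x) * p(y), so information adds.
p_x, p_y = 0.5, 0.25  # illustrative probabilities
joint = self_information(p_x * p_y)
separate = self_information(p_x) + self_information(p_y)

print(rare > common)                  # property 1 holds
print(certain == 0.0)                 # property 2 holds
print(math.isclose(joint, separate))  # property 3 holds
```

The range check on `p` is a design choice of this sketch: it rejects $p = 0$ explicitly, where the information content diverges.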

## 🔗 Connection to Information Physics

Shannon's formula becomes the **surprise component** of information voltage:

$$U_{\text{surprise}} = -\log_2 p(\text{message})$$

In other words, the surprise component of our electrical framework is exactly the Shannon information content of the message, measured in bits.



In [None]:
# Academic imports and setup
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from scipy import stats
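import seaborn as sns
from IPython.display import display  # explicit import so display() is defined outside a notebook too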

# Configure academic plotting style
plt.style.use('seaborn-v0_8-paper')
plt.rcParams.update({
    'figure.figsize': (10, 6),
    'font.size': 12,
    'axes.labelsize': 14,
    'axes.titlesize': 16,
    'xtick.labelsize': 12,
    'ytick.labelsize': 12,
    'legend.fontsize': 12
})

def shannon_information(p):
    """
    Calculate the Shannon information content, I(x) = -log2 p(x).

    Parameters
    ----------
    p : float or array-like
        Probability values (0 < p <= 1).

    Returns
    -------
    float or array
        Information content in bits.

    References
    ----------
    Shannon, C. E. (1948). A mathematical theory of communication.
    The Bell System Technical Journal, 27(3), 379-423.
    """
    return -np.log2(p)

print("📚 Shannon Information Theory - Academic Implementation")
print("✅ Libraries loaded and academic plotting configured")


In [None]:
# Academic Figure 1: Shannon Information vs Probability
# Reproducing key insights from Shannon (1948)

# Generate probability range
probabilities = np.linspace(0.001, 1.0, 1000)
information_content = shannon_information(probabilities)

# Create academic-quality figure
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(16, 6))

# Main relationship plot
ax1.plot(probabilities, information_content, color='navy', linewidth=2.5, label='I(x) = -log₂(p(x))')
ax1.set_xlabel('Probability p(x)')
ax1.set_ylabel('Information Content I(x) [bits]')
ax1.set_title('Shannon Information Function\n(Shannon, 1948)')
ax1.grid(True, alpha=0.3)
ax1.legend()

# Key points annotation
key_points = [(0.001, shannon_information(0.001)), 
              (0.1, shannon_information(0.1)),
              (0.5, shannon_information(0.5)),
              (1.0, shannon_information(1.0))]

for p, i in key_points:
    ax1.plot(p, i, 'ro', markersize=8)
    ax1.annotate(f'p={p:.3f}\nI={i:.1f} bits', 
                xy=(p, i), xytext=(p+0.1, i+1),
                fontsize=10, ha='left',
                arrowprops=dict(arrowstyle='->', color='red', alpha=0.7))

# Examples with real-world context (probabilities are illustrative round numbers)
examples = pd.DataFrame({
    'Event': ['Coin flip (heads)', 'Die roll (6)', 'Lottery win', 'Sunrise tomorrow', 'Random word'],
    'Probability': [0.5, 1/6, 1e-8, 0.999, 1/50000],
    'Context': ['Fair coin', '6-sided die', '1 in 100M', 'Very likely', 'English vocabulary']
})

examples['Information'] = shannon_information(examples['Probability'])

# Bar chart of examples
ax2.bar(range(len(examples)), examples['Information'], color='steelblue', alpha=0.7)
ax2.set_xticks(range(len(examples)))
ax2.set_xticklabels(examples['Event'], rotation=45, ha='right')
ax2.set_ylabel('Information Content [bits]')
ax2.set_title('Real-World Examples\nShannon Information Content')

# Add value labels above each bar
for i, info in enumerate(examples['Information']):
    ax2.text(i, info + 0.5, f'{info:.1f}',
             ha='center', va='bottom', fontweight='bold')

plt.tight_layout()
plt.show()

# Academic summary table
print("📊 TABLE 1: Shannon Information Content for Real-World Events")
print("="*70)
display(examples[['Event', 'Probability', 'Information', 'Context']])

# Compute the range from the data instead of hard-coding it
info_min, info_max = examples['Information'].min(), examples['Information'].max()
print(f"\n📈 KEY INSIGHT: Information content varies from {info_min:.1f} to {info_max:.1f} bits")
print(f"💡 This {info_max - info_min:.1f}-bit range demonstrates the enormous variation in cognitive 'voltage' across different messages.")
