# AGI Brain - Language Learning from Scratch

**A spiking neural network that learns to understand and speak without any pre-trained models.**

This notebook demonstrates:
- 100K-1M neuron sparse SNN architecture (scalable)
- Phoneme recognition through STDP
- Semantic memory for word associations
- Speech synthesis from neural activity
- Optional learning from YouTube video transcripts

---

**Runtime Setup:** Go to `Runtime > Change runtime type > T4 GPU`

## 1. Setup

In [None]:
# Install dependencies
!pip install tensorflow numpy matplotlib scipy gradio yt-dlp librosa --quiet

# Clone repository
!rm -rf agi-brain
!git clone https://github.com/jeebus87/agi-brain.git
%cd agi-brain

import sys
sys.path.insert(0, '.')

In [None]:
# Check GPU
import tensorflow as tf
print("TensorFlow:", tf.__version__)
gpus = tf.config.list_physical_devices('GPU')
print("GPUs available:", gpus)

# Imports
import numpy as np
import matplotlib.pyplot as plt

print("\nSetup complete!")

## 2. Sparse SNN Architecture

Memory-efficient architecture using sparse connectivity:
- **Free Colab Tier:** 100K neurons (~0.2 GB)
- **Colab Pro:** 500K-1M neurons (~1-4 GB)
- Event-driven computation
- Population-level parallelism
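
As a rough sanity check on figures like these, here is a back-of-envelope estimate for a sparsely connected network. The storage layout (CSR-style, with a float32 weight and an int32 index per synapse), the number of state variables per neuron, and the name `rough_sparse_memory_gb` are all assumptions for illustration. The repository's `estimate_memory` may count things differently, so the numbers need not match.

```python
def rough_sparse_memory_gb(n_neurons, connectivity=0.001,
                           bytes_per_weight=4, bytes_per_index=4,
                           state_vars_per_neuron=4):
    """Back-of-envelope memory estimate for a sparse recurrent network.

    Assumes (hypothetically) a CSR-like layout: one float32 weight and one
    int32 column index per synapse, one int32 row pointer per neuron, and a
    few float32 state variables (voltage, traces, ...) per neuron.
    """
    n_synapses = round(n_neurons * n_neurons * connectivity)
    synapse_bytes = n_synapses * (bytes_per_weight + bytes_per_index)
    row_pointer_bytes = (n_neurons + 1) * 4
    state_bytes = n_neurons * state_vars_per_neuron * 4
    return (synapse_bytes + row_pointer_bytes + state_bytes) / 1024**3


for n in (100_000, 500_000, 1_000_000):
    print(f"{n:>9,} neurons: {rough_sparse_memory_gb(n):.2f} GB")
```

At a fixed connectivity fraction the synapse count grows with the square of the neuron count, so under these assumptions the synapse arrays dominate the total long before the per-neuron state does.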

In [None]:
from src.language.sparse_network import estimate_memory

# Check memory requirements
print("Memory Estimates (0.1% connectivity):")
print("=" * 50)
for n in [100_000, 500_000, 1_000_000]:
    est = estimate_memory(n, connectivity=0.001)
    fits = "YES" if est['total_gb'] < 10 else "NO"
    print(f"{n/1e6:.1f}M neurons: {est['total_gb']:.2f} GB (fits Colab: {fits})")

In [None]:
from src.language.sparse_network import create_language_brain

# Create the language brain (100K neurons for free Colab tier)
# Scale up to 500K-1M if you have Colab Pro
print("Creating language brain...")
brain_snn = create_language_brain(n_neurons=100_000, use_gpu=True)

print(f"\nTotal neurons: {brain_snn.total_neurons():,}")
print(f"Memory usage: {brain_snn.memory_usage_mb():.1f} MB")
print(f"Populations: {len(brain_snn.populations)}")

## 3. Audio Encoding (Cochlea Simulation)

Converts audio waveforms to spike patterns, mimicking the biological cochlea.
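
To make the idea concrete, here is a minimal standalone sketch of one way to turn audio into spikes: split the signal into short frames, measure the energy in a set of log-spaced frequency bands, and emit a spike in each band with a probability that grows with that energy. The Goertzel filterbank, the band layout, and the function names are assumptions for illustration. `CochleaEncoder` may work quite differently.

```python
import math
import random


def goertzel_power(frame, freq, sample_rate):
    """Signal power near `freq` in one frame (Goertzel algorithm)."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / sample_rate)
    s_prev, s_prev2 = 0.0, 0.0
    for x in frame:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    # Clamp tiny negative values caused by floating-point rounding.
    return max(0.0, s_prev**2 + s_prev2**2 - coeff * s_prev * s_prev2)


def encode_to_spikes(audio, sample_rate=16000, n_bands=16,
                     frame_len=160, f_min=100.0, f_max=4000.0, seed=0):
    """Return a list of frames, each a list of 0/1 spikes (one per band)."""
    rng = random.Random(seed)
    # Log-spaced centre frequencies, loosely like cochlear place coding.
    ratio = (f_max / f_min) ** (1.0 / (n_bands - 1))
    centres = [f_min * ratio**b for b in range(n_bands)]

    energies = []
    for start in range(0, len(audio) - frame_len + 1, frame_len):
        frame = audio[start:start + frame_len]
        energies.append([goertzel_power(frame, f, sample_rate) for f in centres])

    peak = max(max(row) for row in energies) or 1.0
    # Rate coding: spike probability grows with normalised band amplitude.
    return [[1 if rng.random() < math.sqrt(e / peak) else 0 for e in row]
            for row in energies]


# A 440 Hz tone should mostly drive the band whose centre is closest to 440 Hz.
tone = [0.5 * math.sin(2 * math.pi * 440 * n / 16000) for n in range(3200)]
tone_spikes = encode_to_spikes(tone)
print(f"{len(tone_spikes)} frames x {len(tone_spikes[0])} bands")
```

The output has the same layout as the spike array used in this section (time along the first axis, frequency band along the second), just with far fewer bands.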

In [None]:
from src.language.audio_encoder import CochleaEncoder, AudioProcessor

# Create cochlea encoder
cochlea = CochleaEncoder()

# Generate test audio (steady, vowel-like harmonic tone)
sample_rate = 16000
duration = 1.0
t = np.linspace(0, duration, int(sample_rate * duration), endpoint=False)

# Simulated vowel: harmonics of f0 weighted by formant peaks at 700 Hz and 1200 Hz
f0 = 120  # Fundamental frequency
audio = np.zeros_like(t)
for harmonic in range(1, 20):
    freq = f0 * harmonic
    # Apply formant filter (simplified)
    amp = np.exp(-((freq - 700)**2) / (2 * 100**2))  # F1
    amp += np.exp(-((freq - 1200)**2) / (2 * 150**2))  # F2
    audio += amp * np.sin(2 * np.pi * freq * t) / harmonic

audio = audio / np.max(np.abs(audio)) * 0.5

# Encode to spikes
spikes = cochlea.encode_audio(audio)

print(f"Audio: {len(audio)} samples ({duration}s)")
print(f"Spike representation: {spikes.shape} (time x frequency)")
print(f"Spike density: {spikes.mean():.2%}")

In [None]:
# Visualize cochlea output
fig, axes = plt.subplots(2, 1, figsize=(14, 6))

# Audio waveform
axes[0].plot(t[:1600], audio[:1600], 'b-', linewidth=0.5)
axes[0].set_xlabel('Time (s)')
axes[0].set_ylabel('Amplitude')
axes[0].set_title('Audio Waveform (100ms)', fontweight='bold')
axes[0].set_xlim(0, 0.1)

# Spike representation (y-axis spans however many frequency bands the encoder produced)
im = axes[1].imshow(spikes.T, aspect='auto', cmap='hot', extent=[0, duration, spikes.shape[1], 0])
axes[1].set_xlabel('Time (s)')
axes[1].set_ylabel('Frequency Band')
axes[1].set_title('Cochlea Spike Output', fontweight='bold')
plt.colorbar(im, ax=axes[1], label='Spike')

plt.tight_layout()
plt.show()

## 4. Phoneme Learning with STDP

Learn to recognize phonemes (speech sounds) through spike-timing-dependent plasticity (STDP).
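
For readers new to STDP, the sketch below shows the classic pair-based rule: a synapse is strengthened when the presynaptic neuron fires shortly before the postsynaptic one, and weakened when the order is reversed, with the effect decaying exponentially as the spikes get further apart. The constants and function names are illustrative assumptions, not the values or API used by `PhonemeLearner`.

```python
import math


def stdp_delta_w(dt_ms, a_plus=0.005, a_minus=0.00525,
                 tau_plus=20.0, tau_minus=20.0):
    """Weight change for one pre/post spike pair, dt_ms = t_post - t_pre."""
    if dt_ms > 0:      # pre fired before post: potentiate
        return a_plus * math.exp(-dt_ms / tau_plus)
    if dt_ms < 0:      # post fired before pre: depress
        return -a_minus * math.exp(dt_ms / tau_minus)
    return 0.0


def apply_stdp(weight, pre_times, post_times, w_min=0.0, w_max=1.0):
    """Sum pair-wise updates over all spike pairs and clip the result."""
    for t_pre in pre_times:
        for t_post in post_times:
            weight += stdp_delta_w(t_post - t_pre)
    return min(w_max, max(w_min, weight))


# Input spikes that lead the output by 2 ms strengthen the synapse;
# the same spikes in the opposite order weaken it.
w_causal = apply_stdp(0.5, pre_times=[10, 30], post_times=[12, 32])
w_acausal = apply_stdp(0.5, pre_times=[12, 32], post_times=[10, 30])
print(f"causal: {w_causal:.4f}, acausal: {w_acausal:.4f}")
```

Making `a_minus` slightly larger than `a_plus` is a common choice that keeps uncorrelated inputs from slowly saturating the weights.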

In [None]:
from src.language.phoneme_learner import PhonemeLearner, PhonemeConfig

# Create phoneme learner
config = PhonemeConfig(
    n_input=800,           # Cochlea output size
    n_phoneme_neurons=500, # Neurons per phoneme detector
    n_phonemes=20,         # Number of phonemes to learn
    learning_rate=0.005
)
learner = PhonemeLearner(config)

print("Phoneme learner created:")
print(f"  Detectors: {len(learner.detectors)}")
print(f"  Neurons per detector: {config.n_phoneme_neurons}")
print(f"  Total neurons: {len(learner.detectors) * config.n_phoneme_neurons:,}")

In [None]:
# Generate training patterns (simulating different phonemes)
np.random.seed(42)
n_phonemes = 10
patterns = []

for i in range(n_phonemes):
    # Each phoneme has a distinct sparse pattern
    pattern = np.zeros(800, dtype=np.float32)
    # Activate a different 80-input frequency region per phoneme
    # (neighbouring regions overlap by 20 inputs, so the patterns are similar but not identical)
    start = i * 60
    pattern[start:start+80] = np.random.rand(80)
    pattern[pattern < 0.5] = 0
    patterns.append(pattern)

print(f"Created {len(patterns)} distinct phoneme patterns")

In [None]:
# Train phoneme recognition
print("Training phoneme recognition...")
print("=" * 50)

n_epochs = 5
samples_per_phoneme = 200

for epoch in range(n_epochs):
    correct = 0
    total = 0
    
    for phoneme_idx in range(n_phonemes):
        for _ in range(samples_per_phoneme):
            # Add noise to pattern
            pattern = patterns[phoneme_idx].copy()
            noise = np.random.rand(800).astype(np.float32) * 0.2
            noisy_pattern = np.clip(pattern + noise, 0, 1)
            noisy_pattern[noisy_pattern < 0.3] = 0
            
            # Process with supervised learning
            detected, conf = learner.process(
                noisy_pattern,
                learn=True,
                target_phoneme=phoneme_idx
            )
            
            if detected == phoneme_idx:
                correct += 1
            total += 1
    
    accuracy = correct / total
    print(f"Epoch {epoch+1}: Accuracy = {accuracy:.1%}")

print("\nTraining complete!")

In [None]:
# Test phoneme recognition
print("\nTesting phoneme recognition:")
print("=" * 50)

test_correct = 0
for phoneme_idx in range(n_phonemes):
    pattern = patterns[phoneme_idx]
    detected, conf = learner.process(pattern, learn=False)
    status = "correct" if detected == phoneme_idx else f"wrong (got {detected})"
    print(f"Phoneme {phoneme_idx}: detected {detected} (conf: {conf:.2f}) - {status}")
    if detected == phoneme_idx:
        test_correct += 1

print(f"\nTest accuracy: {test_correct}/{n_phonemes} = {test_correct/n_phonemes:.0%}")

## 5. Semantic Memory

Learn word-meaning associations using Sparse Distributed Memory.
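
As background, here is a minimal sketch of a Kanerva-style Sparse Distributed Memory. A fixed set of random "hard locations" each holds a row of counters. A write updates every location whose address lies within a Hamming radius of the target address, and a read sums the counters of the nearby locations and takes a per-bit majority vote. The class name `TinySDM`, the 128-bit vectors, and the radius are assumptions chosen to keep the example small. `SemanticMemory` may be organised differently.

```python
import random


class TinySDM:
    """Minimal Kanerva-style sparse distributed memory over bit vectors."""

    def __init__(self, n_locations=1000, dim=128, radius=56, seed=0):
        rng = random.Random(seed)
        self.dim = dim
        self.radius = radius
        # Hard locations: fixed random addresses, each with `dim` counters.
        self.addresses = [rng.getrandbits(dim) for _ in range(n_locations)]
        self.counters = [[0] * dim for _ in range(n_locations)]

    def _active(self, address):
        """Indices of hard locations within the Hamming radius of `address`."""
        return [i for i, a in enumerate(self.addresses)
                if bin(a ^ address).count("1") <= self.radius]

    def write(self, address, data):
        for i in self._active(address):
            row = self.counters[i]
            for bit in range(self.dim):
                row[bit] += 1 if (data >> bit) & 1 else -1

    def read(self, address):
        sums = [0] * self.dim
        for i in self._active(address):
            for bit, c in enumerate(self.counters[i]):
                sums[bit] += c
        # Majority vote per bit position.
        return sum(1 << bit for bit, s in enumerate(sums) if s > 0)


rng = random.Random(1)
sdm = TinySDM()
sdm_patterns = [rng.getrandbits(128) for _ in range(5)]
for p in sdm_patterns:
    sdm.write(p, p)                  # auto-associative: address == data

# Corrupt 8 of the 128 bits and ask the memory to clean the cue up.
noise = sum(1 << b for b in rng.sample(range(128), 8))
cue = sdm_patterns[0] ^ noise
recalled = sdm.read(cue)
print("cue errors:     ", bin(cue ^ sdm_patterns[0]).count("1"))
print("recalled errors:", bin(recalled ^ sdm_patterns[0]).count("1"))
```

Because each item is spread over many locations, a noisy or partial cue still activates most of the locations that stored the original, which is what makes this kind of memory useful for association.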

In [None]:
from src.language.semantic_memory import SemanticMemory, SemanticConfig

# Create semantic memory
sem_config = SemanticConfig(
    n_hard_locations=5000,
    address_size=500,
    data_size=500
)
memory = SemanticMemory(sem_config)

print("Semantic memory created")
print(f"  Hard locations: {sem_config.n_hard_locations:,}")
print(f"  Vector dimensions: {sem_config.data_size}")

In [None]:
# Teach some concepts and relationships
print("Teaching concepts...")

# Create basic concepts
concepts = [
    "cat", "dog", "bird", "fish",
    "animal", "pet", "mammal",
    "run", "fly", "swim", "walk",
    "small", "large", "fast", "slow"
]

for word in concepts:
    memory.create_concept(word=word)

# Learn associations
associations = [
    ("cat", "animal"), ("cat", "pet"), ("cat", "mammal"), ("cat", "small"),
    ("dog", "animal"), ("dog", "pet"), ("dog", "mammal"), ("dog", "run"),
    ("bird", "animal"), ("bird", "fly"), ("bird", "small"),
    ("fish", "animal"), ("fish", "swim"), ("fish", "pet"),
    ("mammal", "animal"),
    ("run", "fast"), ("fly", "fast"), ("swim", "fast"),
]

for w1, w2 in associations:
    memory.learn_association(w1, w2, strength=0.3)

print(f"Vocabulary: {memory.get_vocabulary_size()} words")
print(f"Concepts: {len(memory.concepts)}")

In [None]:
# Learn from sentences
sentences = [
    ["the", "cat", "is", "a", "small", "pet"],
    ["the", "dog", "can", "run", "fast"],
    ["birds", "can", "fly", "in", "the", "sky"],
    ["fish", "swim", "in", "water"],
    ["cats", "and", "dogs", "are", "mammals"],
    ["pets", "are", "animals", "we", "love"],
]

print("Learning from sentences...")
for sentence in sentences:
    memory.learn_from_sentence(sentence)

print(f"Updated vocabulary: {memory.get_vocabulary_size()} words")

In [None]:
# Test spreading activation
print("\nSpreading activation from 'cat':")
print("=" * 40)
activations = memory.spread_activation("cat", depth=2)
for word, activation in sorted(activations.items(), key=lambda x: -x[1])[:10]:
    bar = "*" * int(activation * 20)
    print(f"  {word:12} {activation:.3f} {bar}")

print("\nSpreading activation from 'fly':")
print("=" * 40)
activations = memory.spread_activation("fly", depth=2)
for word, activation in sorted(activations.items(), key=lambda x: -x[1])[:10]:
    bar = "*" * int(activation * 20)
    print(f"  {word:12} {activation:.3f} {bar}")

## 6. Speech Generation

Generate speech audio from neural activity using formant synthesis.
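
For intuition, formant synthesis can be sketched as a source-filter model: a pulse train at the pitch frequency (the "glottal source") is passed through a chain of resonators, one per formant. The sketch below is a standalone illustration under that assumption. The bandwidth values and the function names `resonator` and `synth_vowel` are hypothetical and are not taken from `SpeechGenerator` or `FormantSynthesizer`.

```python
import math


def resonator(signal, freq, bandwidth, sample_rate):
    """Second-order IIR resonator (one formant) applied to `signal`."""
    r = math.exp(-math.pi * bandwidth / sample_rate)
    b1 = 2.0 * r * math.cos(2.0 * math.pi * freq / sample_rate)
    b2 = -r * r
    gain = 1.0 - b1 - b2          # normalises the gain at 0 Hz to 1
    out, y1, y2 = [], 0.0, 0.0
    for x in signal:
        y = gain * x + b1 * y1 + b2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def synth_vowel(formants, f0=120.0, duration=0.3, sample_rate=16000):
    """Impulse-train source filtered by a cascade of formant resonators."""
    n = int(duration * sample_rate)
    period = int(round(sample_rate / f0))
    wave = [1.0 if i % period == 0 else 0.0 for i in range(n)]
    for freq, bw in formants:
        wave = resonator(wave, freq, bw, sample_rate)
    peak = max(abs(v) for v in wave) or 1.0
    return [0.5 * v / peak for v in wave]


# (formant frequency in Hz, bandwidth in Hz) pairs for an /a/-like vowel
vowel = synth_vowel([(700, 80), (1200, 90), (2500, 120)])
print(f"{len(vowel)} samples, peak amplitude {max(abs(v) for v in vowel):.2f}")
```

In a notebook the result can be played with `Audio(vowel, rate=16000)`. Changing the formant frequencies changes which vowel is heard, while `f0` controls the pitch.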

In [None]:
from src.language.speech_generator import SpeechGenerator, FormantSynthesizer
from IPython.display import Audio

# Create speech generator
generator = SpeechGenerator()

# Test text to phonemes
test_text = "hello world"
phonemes = generator.text_to_phonemes(test_text)
print(f"Text: '{test_text}'")
print(f"Phonemes: {' '.join(phonemes)}")

In [None]:
# Generate speech
print("\nGenerating speech...")
audio_output = generator.speak(test_text, speed=0.8)
print(f"Audio length: {len(audio_output)} samples ({len(audio_output)/16000:.2f}s)")

# Play audio (works in Colab)
Audio(audio_output, rate=16000)

In [None]:
# Generate different phrases
phrases = [
    "cat",
    "hello", 
    "goodbye",
    "i am learning"
]

for phrase in phrases:
    print(f"\n'{phrase}':")
    # Use a separate name so the test signal `audio` from section 3 is not overwritten
    phrase_audio = generator.speak(phrase)
    display(Audio(phrase_audio, rate=16000))

## 7. Complete AGI Brain Integration

Combine all components into a unified system.

In [None]:
class SimpleBrain:
    """Simplified AGI brain for demonstration."""
    
    def __init__(self):
        # Components
        self.semantic_memory = memory  # Use memory from above
        self.speech_generator = generator
        self.conversation_history = []
        
    def process(self, text):
        """Process input and generate response."""
        words = text.lower().split()
        
        # Learn from input
        self.semantic_memory.learn_from_sentence(words)
        
        # Find associations
        all_activations = {}
        for word in words:
            if self.semantic_memory.lookup(word):
                activations = self.semantic_memory.spread_activation(word, depth=2)
                for w, a in activations.items():
                    all_activations[w] = all_activations.get(w, 0) + a
        
        # Generate response
        if all_activations:
            # Drop the input words first, then keep the three strongest associations
            ranked = sorted(all_activations.items(), key=lambda x: -x[1])
            associated = [w for w, _ in ranked if w not in words][:3]
            if associated:
                response = f"I think of {', '.join(associated)}"
            else:
                response = "Tell me more"
        else:
            response = f"Learning about {words[0] if words else 'nothing'}"
        
        self.conversation_history.append((text, response))
        return response
    
    def speak(self, text):
        """Generate speech audio."""
        return self.speech_generator.speak(text)

# Create brain
brain = SimpleBrain()
print("Brain created!")

In [None]:
# Have a conversation
print("Conversation with AGI Brain:")
print("=" * 50)

inputs = [
    "hello",
    "tell me about cats",
    "can cats fly",
    "what animals can fly",
    "dogs are pets too"
]

for user_input in inputs:
    response = brain.process(user_input)
    print(f"\nYou: {user_input}")
    print(f"Brain: {response}")
    
    # Generate and play audio response
    response_audio = brain.speak(response)
    display(Audio(response_audio, rate=16000))

## 8. Learning from YouTube (Optional)

Learn language from YouTube video transcripts.
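
Since `learn_from_sentence` takes a list of words, a transcript first has to be split into word lists. The sketch below does that for WebVTT caption text, assuming the transcript is available in that format. The function name `vtt_to_sentences` is hypothetical, and `YouTubeLearner` may preprocess transcripts in its own way.

```python
import re

# A cue timing line, e.g. "00:00:01.000 --> 00:00:03.000" (hours optional).
TIMING = re.compile(r"(\d{2}:)?\d{2}:\d{2}\.\d{3}\s+-->")
TAG = re.compile(r"<[^>]+>")


def vtt_to_sentences(vtt_text):
    """Turn WebVTT caption text into lowercase word lists, one per cue line."""
    sentences, in_cue = [], False
    for line in vtt_text.splitlines():
        line = line.strip()
        if TIMING.match(line):
            in_cue = True          # text lines of this cue follow
        elif not line:
            in_cue = False         # a blank line ends the cue
        elif in_cue:
            words = re.findall(r"[a-z']+", TAG.sub("", line).lower())
            if words:
                sentences.append(words)
    return sentences


sample_vtt = """WEBVTT
Kind: captions

00:00:01.000 --> 00:00:03.000
Birds can <c>fly</c> in the sky.

00:00:03.500 --> 00:00:05.000
Fish swim in water!
"""
caption_sentences = vtt_to_sentences(sample_vtt)
print(caption_sentences)
```

Each resulting list could then be passed to `memory.learn_from_sentence(...)` in the same way as the hand-written sentences in section 5.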

In [None]:
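# Check if yt-dlp is available
import subprocess
try:
    # check=True raises CalledProcessError if yt-dlp exits with an error;
    # FileNotFoundError is raised if the executable is not installed at all.
    result = subprocess.run(['yt-dlp', '--version'], capture_output=True, text=True, check=True)
    print(f"yt-dlp available (version {result.stdout.strip()})")
    YTDLP_AVAILABLE = True
except (FileNotFoundError, subprocess.CalledProcessError):
    print("yt-dlp not found. Install with: !pip install yt-dlp")
    YTDLP_AVAILABLE = False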

In [None]:
# Uncomment and run to learn from a YouTube video:
# (This requires yt-dlp and may take several minutes)

# from src.interface.learning_pipeline import YouTubeLearner
# 
# yt_learner = YouTubeLearner()  # separate name: `learner` is the PhonemeLearner from section 4
# 
# # Example: Learn from a short educational video
# url = "https://www.youtube.com/watch?v=dQw4w9WgXcQ"  # Replace with actual video
# 
# stats = yt_learner.learn_from_video(url, brain)
# print(f"Learned {stats['words_learned']} words from video!")

## 9. Interactive Chat (Gradio)

Launch an interactive chat interface.

In [None]:
import gradio as gr

def chat_with_brain(message, history):
    response = brain.process(message)
    return response

def get_status():
    return f"""
    Vocabulary: {brain.semantic_memory.get_vocabulary_size()} words
    Concepts: {len(brain.semantic_memory.concepts)}
    Conversations: {len(brain.conversation_history)} turns
    """

with gr.Blocks(title="AGI Brain") as demo:
    gr.Markdown("# AGI Brain Chat\n\nTalk to a neural network learning language from scratch!")
    
    chatbot = gr.ChatInterface(
        chat_with_brain,
        examples=["hello", "tell me about cats", "what can fly"],
    )
    
    with gr.Accordion("Brain Status", open=False):
        status_btn = gr.Button("Refresh")
        status_text = gr.Textbox(label="Status")
        status_btn.click(get_status, outputs=status_text)

demo.launch(share=True)

---

## Summary

This notebook demonstrated:

1. **Sparse SNN Architecture** - 100K-1M sparsely connected neurons fitting in Colab memory
2. **Cochlea Encoding** - Audio to spike conversion
3. **Phoneme Learning** - STDP-based speech sound recognition
4. **Semantic Memory** - Word associations via Sparse Distributed Memory
5. **Speech Generation** - Formant synthesis from neural output
6. **Integrated Brain** - All components working together

### Key Insight

This brain learns language **from scratch** - no pre-trained LLM, no transfer learning.
It is far more limited than ChatGPT, but everything it knows is learned from the input it receives rather than loaded from pre-trained weights.

### Next Steps
- Scale to 10M+ neurons
- Train on more data (YouTube, web)
- Improve phoneme-to-word mapping
- Add reinforcement learning for conversation