Milo - Your AI Memory That Never Forgets

A Nova-powered cognitive extension that captures everything you learn, structures it into a living knowledge graph, and retrieves any memory through natural conversation.

Amazon Nova AI Hackathon · Category: Agentic AI · March 2026

%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#10B981', 'primaryTextColor': '#fff', 'primaryBorderColor': '#34D399', 'lineColor': '#6EE7B7', 'secondaryColor': '#065F46', 'tertiaryColor': '#064E3B', 'background': '#0a0a0a', 'mainBkg': '#111827', 'nodeBorder': '#10B981', 'clusterBkg': '#111827', 'clusterBorder': '#10B981', 'titleColor': '#fff', 'edgeLabelBackground': '#1f2937'}}}%%

flowchart TB
    subgraph CAPTURE["📥 CAPTURE SOURCES"]
        direction LR
        T["📝 Text / Notes"]
        W["👀 Desktop Watcher"]
        F["📄 PDF · DOCX · TXT"]
        U["🌐 Web URL"]
        V["🎤 Voice Memo"]
        M["🎙️ Meeting Transcript"]
        A["🎧 Audio / Video"]
        B["📦 Bulk Import"]
        EXT["🧩 Browser Extension\n13 Platforms"]
    end

    subgraph NOVA_LITE["⚡ AMAZON NOVA 2 LITE"]
        direction TB
        EX["Entity Extraction\nPeople · Orgs · Concepts"]
        REL["Relationship Detection\nSupports · Contrasts · Mentions"]
        TEMP["Temporal Classification\nDates · Event Types"]
        SUM["Summarization + Tagging"]
    end

    subgraph NOVA_MME["🔮 AMAZON NOVA MULTIMODAL EMBEDDINGS"]
        EMB["1024-dim Semantic Vectors\n512-token chunks · 64 overlap"]
    end

    subgraph STORAGE["💾 LOCAL STORAGE"]
        direction LR
        subgraph SQLITE["SQLite Knowledge Graph"]
            NODES["Nodes\nMemory · Concept\nEntity · Insight"]
            EDGES["Typed Weighted Edges\nMENTIONS · RELATES_TO\nEVOLVED_FROM · CONTRASTS"]
            FTS["FTS5 Full-Text Index"]
        end
        subgraph LANCE["LanceDB Vector Store"]
            VECS["Embedding Vectors\nfloat 1024"]
        end
    end

    subgraph NOVA_PRO["🧠 AMAZON NOVA PRO"]
        direction TB
        RAG["RAG Answers\nSemantic + Keyword + Graph"]
        PAT["Pattern Detection\nClusters · Bridges · Gaps"]
        DIG["Daily Knowledge Digest"]
        VIS["Vision Analysis\nPDF Images · Scanned Pages"]
        EVO["Evolution + Contradiction\nTracking"]
    end

    subgraph RETRIEVAL["🔍 TRIPLE HYBRID RETRIEVAL"]
        direction LR
        SEM["Semantic Search\nLanceDB cosine"]
        KW["Keyword Search\nFTS5"]
        GR["Graph Traversal\n2-hop neighborhood"]
        RRF["Reciprocal Rank\nFusion → Top 15"]
    end

    subgraph OUTPUT["✨ USER EXPERIENCE"]
        direction LR
        GRAPH["Interactive\nKnowledge Graph"]
        CHAT["Natural Language\nChat + Voice"]
        TL["Timeline +\nHeatmap"]
        PATT["Pattern\nInsights"]
        DIGEST["Daily\nDigest"]
    end

    CAPTURE --> NOVA_LITE
    NOVA_LITE --> NOVA_MME
    NOVA_LITE --> SQLITE
    NOVA_MME --> LANCE
    
    STORAGE --> RETRIEVAL
    SEM & KW & GR --> RRF
    RRF --> NOVA_PRO
    
    NOVA_PRO --> OUTPUT

    subgraph SECURITY["🔒 SECURITY"]
        direction LR
        AES["AES-256-GCM\nEncryption at Rest"]
        PBKDF["PBKDF2\n100k iterations"]
    end

    STORAGE -.- SECURITY

    style CAPTURE fill:#064E3B,stroke:#10B981,color:#fff
    style NOVA_LITE fill:#7C2D12,stroke:#F97316,color:#fff
    style NOVA_MME fill:#1E1B4B,stroke:#818CF8,color:#fff
    style STORAGE fill:#1C1917,stroke:#78716C,color:#fff
    style NOVA_PRO fill:#4C1D95,stroke:#A78BFA,color:#fff
    style RETRIEVAL fill:#172554,stroke:#60A5FA,color:#fff
    style OUTPUT fill:#052E16,stroke:#10B981,color:#fff
    style SECURITY fill:#1F2937,stroke:#F59E0B,color:#fff
    style SQLITE fill:#292524,stroke:#78716C,color:#fff
    style LANCE fill:#292524,stroke:#78716C,color:#fff
How Milo Uses Amazon Nova

Milo is built entirely on the Amazon Nova model family. Every intelligent behavior - extraction, understanding, retrieval, pattern detection - runs through Nova.

AI Model Usage

| Model | API | Purpose |
| --- | --- | --- |
| us.amazon.nova-2-lite-v1:0 | Amazon Bedrock | Extraction, summarization, classification, tagging - fast ops on every capture |
| amazon.nova-pro-v1:0 | Amazon Bedrock | RAG answers, pattern detection, bridge/gap analysis, daily digest, vision |
| amazon.nova-2-multimodal-embeddings-v1:0 | Amazon Bedrock | 1024-dimensional multimodal embeddings for semantic search |

Nova is not bolted on - it IS the intelligence layer. Every memory flows through Nova 2 Lite for structuring, Nova Multimodal Embeddings for embedding, and Nova Pro for reasoning.
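
As a rough illustration of this division of labor, the sketch below maps each kind of task to the Bedrock model ID listed in the table above. Only the model IDs come from this README; the `NovaTask` names and the `modelFor` helper are hypothetical and are not taken from Milo's source.

```typescript
// Hypothetical task names; only the model IDs below come from the README.
type NovaTask =
  | "extract" | "summarize" | "classify" | "tag" // fast ops on every capture
  | "rag" | "patterns" | "digest" | "vision"     // heavier reasoning ops
  | "embed";                                     // semantic search vectors

const MODEL_IDS = {
  lite: "us.amazon.nova-2-lite-v1:0",
  pro: "amazon.nova-pro-v1:0",
  embeddings: "amazon.nova-2-multimodal-embeddings-v1:0",
} as const;

// Pick the model a task would be sent to, following the usage table.
function modelFor(task: NovaTask): string {
  switch (task) {
    case "extract":
    case "summarize":
    case "classify":
    case "tag":
      return MODEL_IDS.lite;
    case "rag":
    case "patterns":
    case "digest":
    case "vision":
      return MODEL_IDS.pro;
    case "embed":
      return MODEL_IDS.embeddings;
  }
}
```

Keeping the mapping in one place would make it easy to swap a model ID without touching the callers.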


The Problem

Humans learn constantly - from papers, books, conversations, meetings, browsing - but forget most of it. Traditional note-taking is manual, fragmented, and disconnected. Search engines find other people's knowledge, not yours.

Milo solves this: it captures everything, structures it automatically, finds connections you missed, and retrieves any memory instantly through natural language.


Features

10 Capture Sources

  • Text / Notes - paste or type anything (tested)
  • Local Desktop Watcher - background agent that auto-ingests any file dropped into configured local folders (like an inbox). Configurable directly from the settings UI. (Tested before the UI implementation by running npm run watch and dropping files into the inbox folder.)
  • PDF - 5-layer extraction: text + embedded images (Nova Pro vision) + scanned pages + tables + fallback (tested)
  • DOCX / TXT / Markdown - documents and notes
  • Web URL - scrapes articles, blog posts, documentation
  • Voice Memo - Web Speech API transcription
  • Meeting Transcript - paste raw transcript → Nova structures it
  • Audio/Video - AWS Transcribe for long-form media
  • Bulk Import - JSON / Markdown folder (Obsidian, Notion exports)
  • Browser Extension - captures from 13 platforms (see below)

Interactive Knowledge Graph

  • React Flow visualization with 4 node types (Memory, Concept, Entity, Insight)
  • Focus mode - 2-hop neighborhood on click
  • Memory Replay - animated graph growth over time
  • Type/date/search filters with smooth transitions

Natural Language Chat (RAG)

  • Triple hybrid retrieval: semantic (LanceDB) + keyword (FTS5) + graph traversal (2-hop)
  • Date-scoped: "what did I learn last month?"
  • Entity-scoped: "everything about machine learning"
  • Path queries: "how are consciousness and neural networks connected?"
  • Cited sources with collapsible node cards
  • Export any answer as Markdown

Pattern Detection (Nova Pro)

  • Cluster patterns - emerging topic areas
  • Evolution patterns - how concepts shift over time
  • Bridge discovery - concepts connecting separate knowledge clusters
  • Knowledge gap radar - ideas referenced often but never deeply explored
  • Contradiction detection - conflicting views in your own knowledge
  • Confirm/dismiss workflow for human-in-the-loop validation

Timeline

  • Day/week/month granularity with activity heatmap
  • On This Day - surfaces memories from the same date in past years
  • Entity filter for concept-specific timelines

Entity Profiles

  • Per-entity pages with stats, related concepts, memory grid
  • Concept Evolution tab with chronological co-occurrence analysis

Daily Digest (Nova Pro)

  • AI briefing of last 24 hours of captures
  • Cached per day, refresh on demand

Vault (Encryption by Default)

  • PBKDF2 (100k iterations, SHA-256) → AES-256-GCM session key
  • All raw content encrypted before DB write
  • Lock/unlock from header, change passphrase anytime
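
The key derivation and encryption steps above can be sketched with Node's built-in `crypto` module. This is a minimal sketch, not Milo's actual vault code: the iteration count, hash, and cipher come from this README, while the 12-byte IV, the `iv | tag | ciphertext` layout, and the function names are assumptions.

```typescript
import { createCipheriv, createDecipheriv, pbkdf2Sync, randomBytes } from "node:crypto";

const ITERATIONS = 100_000; // from the README
const KEY_BYTES = 32;       // AES-256 needs a 256-bit key
const IV_BYTES = 12;        // assumption: the usual GCM nonce size
const TAG_BYTES = 16;       // default GCM auth tag length in Node

// Derive the session key from the passphrase with PBKDF2-SHA-256.
function deriveKey(passphrase: string, salt: Buffer): Buffer {
  return pbkdf2Sync(passphrase, salt, ITERATIONS, KEY_BYTES, "sha256");
}

// Encrypt and pack as iv | tag | ciphertext (layout is an assumption).
function encrypt(plaintext: string, key: Buffer): Buffer {
  const iv = randomBytes(IV_BYTES); // fresh nonce for every message
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const body = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), body]);
}

// Decrypt; throws if the key is wrong or the data was tampered with.
function decrypt(blob: Buffer, key: Buffer): string {
  const iv = blob.subarray(0, IV_BYTES);
  const tag = blob.subarray(IV_BYTES, IV_BYTES + TAG_BYTES);
  const body = blob.subarray(IV_BYTES + TAG_BYTES);
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(body), decipher.final()]).toString("utf8");
}
```

Because GCM is authenticated, decrypting with a key derived from the wrong passphrase fails loudly instead of returning garbage, which is one way a verification sentinel like the one in the data model could be checked.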

Browser Extension (13 Platforms)

Chrome MV3 extension that captures from:

| Platform | What It Captures |
| --- | --- |
| YouTube | Video transcripts, comments |
| Google Meet | Live meeting audio/transcript |
| Zoom | Meeting recordings |
| Microsoft Teams | Meeting transcript |
| Gmail | Email threads |
| Outlook | Email messages |
| Slack | Channel messages, threads |
| Discord | Chat messages |
| WhatsApp Web | Conversations |
| Telegram Web | Messages |
| LinkedIn | Posts, profiles, messages |
| Twitter/X | DMs |
| Any webpage | Selected text or full page |

Extension features: popup quick capture, side panel search, omnibox (milo [query]), offline queue, keyboard shortcuts.


Tech Stack

| Layer | Technology |
| --- | --- |
| Framework | Next.js 16 (App Router, TypeScript strict, React 19) |
| AI Models | Amazon Bedrock (Nova 2 Lite, Nova Pro, Nova Multimodal Embeddings) |
| Database | SQLite via @libsql/client + Drizzle ORM |
| Vector Store | LanceDB (local, serverless) |
| Graph Viz | React Flow + Framer Motion |
| UI | Tailwind CSS + custom design system (oklch, 4 fonts, glassmorphism) |
| Encryption | AES-256-GCM + PBKDF2 key derivation |
| Audio | AWS Transcribe (long-form), AWS Polly (TTS), Web Speech API |
| File Parsing | pdf-parse, mammoth, cheerio, chokidar (watcher) |
| Extension | Chrome MV3 (13 content scripts) |

Architecture

Ingestion Pipeline

Every capture flows through: Extract → Chunk (512 tokens, 64 overlap) → Embed (Nova MME) → Structure (Nova 2 Lite) → Graph Update → Pattern Check
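
The chunking step can be pictured as a sliding window: each chunk holds up to 512 tokens and starts 448 tokens (512 minus the 64-token overlap) after the previous one. The sketch below assumes the text has already been split into tokens; the tokenizer Milo uses is not stated here, and `chunkTokens` is a hypothetical name.

```typescript
// Split a token list into overlapping windows.
// Defaults mirror the README: 512-token chunks with a 64-token overlap.
function chunkTokens(tokens: string[], size = 512, overlap = 64): string[][] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new Error("need size > 0 and 0 <= overlap < size");
  }
  const stride = size - overlap; // how far each new window advances
  const chunks: string[][] = [];
  for (let start = 0; start < tokens.length; start += stride) {
    chunks.push(tokens.slice(start, start + size));
    if (start + size >= tokens.length) break; // this window reached the end
  }
  return chunks;
}
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of embedding some tokens twice.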

Query Pipeline

Every question triggers: Embed → Parallel Retrieval (semantic + keyword + graph 2-hop) → Reciprocal Rank Fusion → Nova Pro RAG Answer → Cited Response
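
Reciprocal Rank Fusion merges the three result lists by giving each item a score of 1 / (k + rank) per list and summing across lists. The sketch below is a generic version of that technique, not Milo's implementation: the constant k = 60 is a common default and is an assumption here, the top-15 cutoff follows the architecture diagram, and each input list is assumed to contain unique IDs.

```typescript
// Fuse several ranked lists of node IDs into one ranking.
// k = 60 is an assumed default; topN = 15 follows the architecture diagram.
function reciprocalRankFusion(rankings: string[][], k = 60, topN = 15): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1]) // highest fused score first
    .slice(0, topN)
    .map(([id]) => id);
}

// Example: "b" appears in all three lists, so it ends up first.
const fused = reciprocalRankFusion([
  ["a", "b", "c"], // semantic results
  ["b", "d"],      // keyword results
  ["b", "a"],      // graph results
]);
```

Because only ranks are used, the method needs no score normalization between cosine similarity, FTS5 relevance, and graph distance.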

Graph Algorithms (8 real algorithms)

  • PageRank (node importance)
  • Betweenness Centrality (bridge detection)
  • Label Propagation (community detection)
  • Louvain Clustering
  • Shortest Path (concept distance)
  • Neighborhood Expansion (2-hop traversal)
  • Temporal Decay (forgetting curve)
  • Co-occurrence Analysis
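
As one example from the list above, 2-hop neighborhood expansion can be written as a breadth-first search that stops after two levels. This is a minimal sketch under stated assumptions: edges are treated as undirected, weights are ignored, and the `Edge` shape and `neighborhood` name are hypothetical rather than Milo's own.

```typescript
type Edge = { from: string; to: string };

// Return the start node plus every node reachable within maxHops edges.
function neighborhood(edges: Edge[], start: string, maxHops = 2): Set<string> {
  // Build an adjacency list, treating each edge as undirected (assumption).
  const adjacency = new Map<string, string[]>();
  const link = (a: string, b: string): void => {
    const list = adjacency.get(a);
    if (list) list.push(b);
    else adjacency.set(a, [b]);
  };
  for (const { from, to } of edges) {
    link(from, to);
    link(to, from);
  }

  // Level-by-level breadth-first search, one level per hop.
  const visited = new Set<string>([start]);
  let frontier: string[] = [start];
  for (let hop = 0; hop < maxHops && frontier.length > 0; hop++) {
    const next: string[] = [];
    for (const node of frontier) {
      for (const neighbor of adjacency.get(node) ?? []) {
        if (!visited.has(neighbor)) {
          visited.add(neighbor);
          next.push(neighbor);
        }
      }
    }
    frontier = next;
  }
  return visited;
}
```

The same routine would serve both the graph focus mode and the graph leg of retrieval, since each needs the nodes within two hops of a starting point.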

Data Model

nodes    - id, type, label, summary, raw_content (AES-GCM encrypted),
           tags (JSON), occurred_at, event_type, participants (JSON),
           created_at, updated_at, source_type, source_url

edges    - id, from_node_id, to_node_id, relation, weight (0.0-1.0),
           confidence, source_memory_id, created_at

sources  - id, type, name, uri, ingested_at, node_count, status

patterns - id, title, description, node_ids (JSON), pattern_type,
           confidence, detected_at, confirmed

queries  - id, question, answer, source_node_ids (JSON), created_at

vault    - salt, verification sentinel (PBKDF2 + AES-GCM)

vectors (LanceDB) - id, node_id, chunk_text, chunk_index,
                     embedding float[1024], created_at

Screenshots

Add screenshots before submission

| Screen | Description |
| --- | --- |
| Dashboard | Home - activity feed + quick capture |
| Graph | Interactive knowledge graph with 4 node types |
| Chat | Natural language RAG with cited sources |
| Timeline | Temporal view with heatmap |
| Patterns | AI-detected patterns and insights |
| Capture | Multi-source capture interface |
| Entity | Entity profile with evolution tab |
| Extension | Browser extension popup |

Setup

Prerequisites

  • Node.js 20+
  • AWS account with Bedrock access (Nova 2 Lite, Nova Pro, Nova Multimodal Embeddings - us-east-1)

Install

git clone https://github.com/rayenbabdallah/milo.git
cd milo
npm install

Configure

Create .env.local:

AWS_ACCESS_KEY_ID=your_key
AWS_SECRET_ACCESS_KEY=your_secret
AWS_REGION=us-east-1
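
A startup check along these lines can catch a missing variable before the first Bedrock call fails. It is only a sketch: the three variable names come from the file above, while `readAwsConfig` and the `AwsConfig` shape are hypothetical and not part of Milo.

```typescript
type AwsConfig = { accessKeyId: string; secretAccessKey: string; region: string };

// Validate the variables from .env.local and return them as one object.
function readAwsConfig(env: Record<string, string | undefined>): AwsConfig {
  const required = ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_REGION"] as const;
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing in .env.local: ${missing.join(", ")}`);
  }
  return {
    accessKeyId: env.AWS_ACCESS_KEY_ID as string,
    secretAccessKey: env.AWS_SECRET_ACCESS_KEY as string,
    region: env.AWS_REGION as string,
  };
}

// In a Node or Next.js server context this would be called as:
//   const config = readAwsConfig(process.env);
```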

Run

npm run dev
# Open http://localhost:3000
# First launch: set your vault passphrase

Other Commands

npm run build       # Production build
npm run type-check  # tsc --noEmit
npm run lint        # ESLint

Browser Extension

  1. Open chrome://extensions → Enable Developer Mode
  2. Click Load unpacked → select the extension/ folder
  3. Pin Milo from the extensions menu
  4. The extension connects to your local Milo instance at http://localhost:3000

Keyboard Shortcuts

| Shortcut | Action |
| --- | --- |
| Ctrl/Cmd + Shift + Space | Quick Capture |
| Ctrl/Cmd + K | Search |
| Ctrl/Cmd + / | Open Chat |
| Ctrl/Cmd + G | Open Graph |

Project Structure

milo/
├── src/
│   ├── app/
│   │   ├── (dashboard)/     # 12 pages (home, graph, chat, timeline, capture, patterns, entity, node, settings, vault)
│   │   └── api/             # 37 API routes (thin controllers → lib/)
│   ├── components/          # 24 components (capture, chat, graph, memory, patterns, timeline, vault, layout)
│   └── lib/
│       ├── ai/              # Nova client, extraction, RAG, patterns, embeddings
│       ├── db/              # Drizzle schema, SQLite client, graph queries (10 indexes, FTS5)
│       ├── capture/         # Pipeline, 9 source processors, temporal resolver
│       ├── encryption/      # AES-256-GCM, PBKDF2, key store
│       ├── graph/           # Builder, traversal, 8 algorithms
│       └── vector/          # LanceDB client, hybrid semantic search
└── extension/               # Chrome MV3 browser extension
    ├── manifest.json
    ├── background/          # Service worker + API client
    ├── content/             # 13 platform content scripts
    ├── popup/               # Quick capture popup
    ├── sidepanel/           # Search side panel
    ├── offscreen/           # Audio recorder
    └── icons/

What Makes Milo Different

| Feature | Traditional Notes | Milo |
| --- | --- | --- |
| Structure | Manual folders/tags | Auto-extracted knowledge graph |
| Connections | You create links | Nova discovers relationships |
| Retrieval | Keyword search | Natural language + semantic + graph |
| Patterns | You notice them | Nova detects clusters, bridges, gaps |
| Capture | Copy-paste | 10 sources + browser extension (13 platforms) |
| Security | Cloud provider trust | AES-256-GCM encryption by default |

Hackathon

  • Competition: Amazon Nova AI Hackathon
  • Deadline: March 16, 2026
  • Category: Agentic AI
  • Tagline: Nova-powered agent that turns raw input into a living personal knowledge graph

License

MIT
