# Archive Taxonomy

## Why this notebook

This archive is not designed for perfect retrieval. It is designed to shape attention.

Search works well enough at the scale of my files (~10 TB); the real problem is not finding files but remembering what is worth looking for. Flat abundance collapses curiosity. When everything is equally accessible, habit wins, and familiar choices dominate. This taxonomy exists to reintroduce texture, constraint, and adjacency so that discovery becomes possible again.

Folders answer the question “what kind of thing is this?” They do not answer what the thing means, why it matters, or how it relates to other things. Meaning lives in metadata, memory, and context. The folder structure is deliberately boring at the top and deliberately opinionated where browsing matters.

This system distinguishes between taxonomy and ontology. Ontology asks what something truly is; taxonomy asks where it should live so it can be encountered. The archive does not attempt to be a perfect model of reality. It attempts to be a usable map. Predictable misclassification is acceptable and even desirable. Search exists as a safety net, not as the primary interface.

Different media demand different depths of structure. Photos, documents, data, and projects benefit from shallow hierarchies and strong metadata because retrieval is goal-directed. Books, music, and films benefit from deeper, semantic hierarchies because browsing, recognition, and serendipity are the primary modes of engagement. In these domains, deeper folders are not bureaucracy; they are shelves.

Classification is based on how one wants to arrive at an artifact, not on an abstract definition of what it is. An album that spans multiple genres is placed in the location that matches the listening intention it best serves. Each artifact has one canonical home. That home reflects a choice about behavior, not truth.

Projects are explicitly separated from the archive. Projects are allowed to be messy, temporary, and mutable. When a project ends, its outputs are promoted into the archive and reclassified according to artifact type. This separation prevents entropy from spreading.

The archive is meant to be walked, not queried. Each directory should be small enough to browse without fatigue and rich enough to invite exploration. Constraint is not a limitation; it is the mechanism by which curiosity reappears.

This taxonomy is expected to evolve. Changes should be deliberate and documented, not reactive. The structure should age alongside its owner. Consistency matters more than correctness, and memory is reinforced through use, not optimization.

The success criterion of this system is not whether a file can be found instantly. The success criterion is whether the archive invites engagement, supports discovery, and gently resists the pull of the familiar.


## What this notebook does

This notebook defines and enforces the physical structure of the archive. It creates the directory taxonomy that all automated and manual classification must target. The notebook does not move files, interpret content, or decide meaning; it establishes the stable destinations that give those future decisions somewhere to land.

The structure produced here is intentionally explicit and repeatable. By materializing the taxonomy as directories, the notebook turns an abstract organizational philosophy into a concrete constraint. Any AI agent operating on the archive is required to place artifacts into one and only one canonical location within this structure.
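
As a minimal sketch of what "repeatable" means in practice, the snippet below runs a create-if-missing helper twice against a throwaway directory and checks that a hand-edited README survives the second run. The helper is a close copy of the `ensure_dir_with_readme` function defined later in this notebook (with an explicit encoding added); the temporary directory, the `Music` shelf, and the `demo_idempotence` name are illustrative assumptions.

```python
import tempfile
from pathlib import Path


def ensure_dir_with_readme(path: Path, text: str) -> None:
    """Create the directory and its README only if they are missing."""
    path.mkdir(parents=True, exist_ok=True)
    readme = path / "README.md"
    if not readme.exists():
        readme.write_text(text.strip() + "\n", encoding="utf-8")


def demo_idempotence() -> str:
    """Run the helper twice and return the README text that survives."""
    with tempfile.TemporaryDirectory() as tmp:
        shelf = Path(tmp) / "Archive" / "Music"  # hypothetical location
        ensure_dir_with_readme(shelf, "Music intended for listening.")
        # Simulate a manual edit made between two runs of the notebook.
        (shelf / "README.md").write_text("Edited by hand.\n", encoding="utf-8")
        ensure_dir_with_readme(shelf, "Music intended for listening.")
        return (shelf / "README.md").read_text(encoding="utf-8")


print(demo_idempotence())
```

Because the helper only writes when the README is absent, rerunning the notebook adds new shelves without overwriting descriptions that were refined by hand.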

The notebook serves as the contract between human intention and automated action. It makes the rules visible, inspectable, and versionable. An AI agent may misclassify, but it may not invent categories. Corrections happen by relocating artifacts within the existing structure, not by reshaping the structure itself.
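
One way an agent could honor the "no invented categories" rule is to check a proposed destination against the directories that already exist. The sketch below is an assumption about how such a guard might look, not part of this notebook; `is_canonical_destination` and the `Cookbooks` shelf are hypothetical names.

```python
import tempfile
from pathlib import Path


def is_canonical_destination(archive_root: Path, target_dir: Path) -> bool:
    """Return True if target_dir is an existing directory inside archive_root."""
    root = archive_root.resolve()
    target = target_dir.resolve()
    # The root itself is not a shelf; artifacts must land in a category below it.
    if target == root or root not in target.parents:
        return False
    # An agent may pick the wrong shelf, but the shelf must already exist.
    return target.is_dir()


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "Archive"
    (root / "Books" / "Poetry").mkdir(parents=True)
    # An existing shelf passes the check.
    print(is_canonical_destination(root, root / "Books" / "Poetry"))
    # A shelf the agent made up does not.
    print(is_canonical_destination(root, root / "Books" / "Cookbooks"))
```

The guard deliberately says nothing about whether the chosen shelf is the right one; that remains a judgment call that can be corrected later by relocating the artifact.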

This separation is deliberate. The notebook defines the “where.” The AI agent decides the “which.” Search remains the safety net, but browsing is the primary interface. Together, they form a system in which automation accelerates organization without erasing human judgment.


## Archive Root and Level 1 Taxonomy

In [None]:
from pathlib import Path

# ------------------------------------------------------------
# Archive root
# ------------------------------------------------------------
ARCHIVE_ROOT = Path.cwd() / "Archive"
ARCHIVE_ROOT.mkdir(parents=True, exist_ok=True)

# ------------------------------------------------------------
# Helper: create directory + README.md (idempotent)
# ------------------------------------------------------------
def ensure_dir_with_readme(path: Path, text: str):
    """Create directory and README if they don't exist."""
    path.mkdir(parents=True, exist_ok=True)
    readme = path / "README.md"
    if not readme.exists():
        readme.write_text(text.strip() + "\n", encoding="utf-8")

# ------------------------------------------------------------
# Level 1 taxonomy
# ------------------------------------------------------------
LEVEL_1 = [
    "Applications",
    "Audio",
    "Books",
    "Code",
    "Compressed",
    "Data",
    "Docs",
    "Images",
    "Intake",
    "Journal",
    "Movies",
    "Music",
    "Notes",
    "Projects",
    "Systems",
    "Video",
]

# ------------------------------------------------------------
# README explanations (contract text, not marketing)
# ------------------------------------------------------------
LEVEL_1_EXPLAINERS = {
    "Applications": (
        "Large, mode-shifting creative or technical environments.\n"
        "These are tool ecosystems, not source code or projects.\n"
        "Examples include game engines, DAWs, notation tools, and IDEs."
    ),
    "Audio": (
        "Non-music audio artifacts and audio work-in-progress.\n"
        "Recordings, experiments, stems, sound design, and cooked audio."
    ),
    "Books": (
        "Long-form written works intended to be read as books.\n"
        "PDF, EPUB, MOBI, and similar formats. Mostly read-only."
    ),
    "Code": (
        "Source code and scripts.\n"
        "Repositories, utilities, experiments, and executable logic."
    ),
    "Compressed": (
        "Archive containers such as zip, tar, and 7z files.\n"
        "This is a staging area, not a permanent home."
    ),
    "Data": (
        "Structured data artifacts.\n"
        "CSV, JSON, Parquet, logs, generated datasets, and analysis outputs."
    ),
    "Docs": (
        "Documents that are not books.\n"
        "Manuals, contracts, receipts, reference PDFs, and administrative files."
    ),
    "Images": (
        "Still images.\n"
        "Photography, scans, artwork, screenshots, and visual assets."
    ),
    "Intake": (
        "Temporary landing zone for unclassified material.\n"
        "Nothing here is considered organized or permanent."
    ),
    "Journal": (
        "Chronological personal writing.\n"
        "Reflections, logs, and time-ordered narrative entries."
    ),
    "Movies": (
        "Cinematic works.\n"
        "Feature films, documentaries, and shorts treated as cinema."
    ),
    "Music": (
        "Music intended for listening and browsing.\n"
        "Organized for discovery, adjacency, and strolling."
    ),
    "Notes": (
        "Short-form thinking artifacts.\n"
        "Scratch notes, research fragments, ideas, and provisional text."
    ),
    "Projects": (
        "Active and evolving workspaces.\n"
        "Messy by design. Completed outputs should be promoted elsewhere."
    ),
    "Systems": (
        "System-level artifacts.\n"
        "Backups, disk images, installers, and configuration snapshots."
    ),
    "Video": (
        "Non-cinematic moving images.\n"
        "Home videos, lectures, screen recordings, and clips."
    ),
}

# ------------------------------------------------------------
# Build Level 1 + READMEs
# ------------------------------------------------------------
for name in LEVEL_1:
    ensure_dir_with_readme(
        ARCHIVE_ROOT / name,
        LEVEL_1_EXPLAINERS.get(
            name,
            "No description provided. This directory exists by design."
        ),
    )

print(f"Archive root: {ARCHIVE_ROOT}")
print(f"Level 1 categories: {len(LEVEL_1)}")
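
The build loop above falls back to a default string when a name has no explainer, so a typo in either structure would pass silently. A small consistency check, sketched here with stand-in data, can surface that drift. `taxonomy_gaps` and the sample values are hypothetical; in the notebook the inputs would be `LEVEL_1` and `LEVEL_1_EXPLAINERS`.

```python
from __future__ import annotations


def taxonomy_gaps(
    names: list[str], explainers: dict[str, str]
) -> tuple[list[str], list[str]]:
    """Return (names with no explainer, explainer keys with no directory)."""
    missing = [n for n in names if n not in explainers]
    orphaned = [k for k in explainers if k not in names]
    return missing, orphaned


# Stand-in data with a deliberate typo in one explainer key.
sample_names = ["Audio", "Books", "Video"]
sample_explainers = {
    "Audio": "Non-music audio artifacts.",
    "Books": "Long-form written works.",
    "Vidoe": "Non-cinematic moving images.",
}

missing, orphaned = taxonomy_gaps(sample_names, sample_explainers)
print("Missing explainers:", missing)
print("Orphaned explainers:", orphaned)
```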

### Applications

In [None]:
applications_root = ARCHIVE_ROOT / "Applications"

# ------------------------------------------------------------
# Applications taxonomy
# ------------------------------------------------------------
APPLICATIONS = {
    "3D": ["Blender"],
    "Game_Engines": ["Unreal", "Godot"],
    "Audio": ["Audacity", "Reaper"],
    "Music_Theory": ["MuseScore"],
    "Visual_Design": ["GIMP"],
    "Video": ["DaVinci_Resolve"],
    "Writing": ["Obsidian", "LaTeX_Toolchain"],
    "Electronics": ["Arduino_IDE", "PlatformIO"],
    "IDEs": ["VSCode", "Atom"],
}

# ------------------------------------------------------------
# Task-level explanations
# ------------------------------------------------------------
APPLICATION_TASK_EXPLAINERS = {
    "3D": (
        "Three-dimensional modeling and procedural world-building environments.\n"
        "Tools for spatial thinking, geometry, and form."
    ),
    "Game_Engines": (
        "Interactive simulation and game development environments.\n"
        "Tools for building real-time systems and virtual worlds."
    ),
    "Audio": (
        "Audio production and manipulation environments.\n"
        "Recording, editing, mixing, and sound experimentation."
    ),
    "Music_Theory": (
        "Music notation and theory-focused tools.\n"
        "Used for analysis, composition, and score-based thinking."
    ),
    "Visual_Design": (
        "Visual composition and design tools.\n"
        "Raster and vector-based image creation and editing."
    ),
    "Video": (
        "Video editing and post-production environments.\n"
        "Used for assembling, grading, and rendering moving images."
    ),
    "Writing": (
        "Writing and thinking environments.\n"
        "Tools that shape how long-form text is produced and organized."
    ),
    "Electronics": (
        "Embedded systems and hardware programming environments.\n"
        "Tools tied to physical devices and microcontrollers."
    ),
    "IDEs": (
        "Integrated development environments.\n"
        "General-purpose coding workspaces, kept intentionally minimal."
    ),
}

# ------------------------------------------------------------
# App-level explanations
# ------------------------------------------------------------
APPLICATION_EXPLAINERS = {
    "Blender": "3D creation suite for modeling, animation, rendering, and procedural workflows.",
    "Unreal": "High-performance game engine for real-time simulation and interactive worlds.",
    "Godot": "Lightweight, open-source game engine emphasizing simplicity and rapid iteration.",
    "Audacity": "Audio editor for recording, trimming, and basic sound manipulation.",
    "Reaper": "Highly configurable digital audio workstation focused on precision and scripting.",
    "MuseScore": "Music notation software for score writing, analysis, and playback.",
    "GIMP": "Raster-based image editor for photo manipulation and visual design.",
    "DaVinci_Resolve": "Professional video editing and color grading environment.",
    "Obsidian": "Markdown-based writing environment for linked notes and long-form thinking.",
    "LaTeX_Toolchain": "Document preparation environment for structured, typeset writing.",
    "Arduino_IDE": "Integrated environment for programming Arduino-compatible microcontrollers.",
    "PlatformIO": "Embedded development ecosystem supporting multiple boards and frameworks.",
    "VSCode": "Extensible code editor and development environment.",
    "Atom": "Hackable text editor used as a lightweight IDE.",
}

# ------------------------------------------------------------
# Build Applications tree with READMEs
# ------------------------------------------------------------
for task, apps in APPLICATIONS.items():
    task_dir = applications_root / task

    ensure_dir_with_readme(
        task_dir,
        APPLICATION_TASK_EXPLAINERS.get(task, "Application task category."),
    )

    for app in apps:
        ensure_dir_with_readme(
            task_dir / app,
            APPLICATION_EXPLAINERS.get(app, "Application environment."),
        )

print(f"Applications: {sum(len(apps) for apps in APPLICATIONS.values())} apps in {len(APPLICATIONS)} categories")

### Audio

In [None]:
audio_root = ARCHIVE_ROOT / "Audio"

# ------------------------------------------------------------
# Audio Level 2 taxonomy (Music lives at Level 1, not here)
# ------------------------------------------------------------
AUDIO_LEVEL_2 = [
    "Recordings",
    "Experiments",
    "Programmatic",
    "AI",
]

# ------------------------------------------------------------
# Audio explainers
# ------------------------------------------------------------
AUDIO_EXPLAINERS = {
    "Recordings": (
        "Captured audio from the physical world.\n"
        "Live recordings, voice takes, instrument captures, and raw source material."
    ),
    "Experiments": (
        "Exploratory and provisional audio work.\n"
        "Half-formed ideas, sound tests, sketches, and throwaway explorations."
    ),
    "Programmatic": (
        "Audio generated or manipulated through code.\n"
        "Algorithmic composition, DSP experiments, scripts, and computational sound."
    ),
    "AI": (
        "Audio created or transformed using machine learning systems.\n"
        "Model outputs, training artifacts, prompts, and AI-assisted sound work."
    ),
}

# ------------------------------------------------------------
# Build Audio tree with READMEs
# ------------------------------------------------------------
for name in AUDIO_LEVEL_2:
    ensure_dir_with_readme(
        audio_root / name,
        AUDIO_EXPLAINERS.get(name, "Audio category."),
    )

print(f"Audio: {len(AUDIO_LEVEL_2)} categories")

### Books

In [None]:
books_root = ARCHIVE_ROOT / "Books"

# ------------------------------------------------------------
# Books taxonomy: Level 2 + Level 3
# ------------------------------------------------------------
BOOKS_TAXONOMY = {
    "Mathematics": [
        "Algebra",
        "Analysis",
        "Topology",
        "Graph_Theory",
        "Probability",
        "Number_Theory",
        "Logic",
        "Discrete_Math",
        "Geometry",
    ],
    "Science": [
        "Physics",
        "Biology",
        "Chemistry",
        "Complexity",
        "Systems",
        "Earth_Science",
    ],
    "Philosophy": [
        "Metaphysics",
        "Epistemology",
        "Ethics",
        "Daoism",
        "Philosophy_of_Science",
        "Logic",
        "Aesthetics",
    ],
    "Literature": [
        "Classicism",
        "Modernism",
        "Postmodernism",
        "Magical_Realism",
        "Mythic",
        "Speculative",
        "Essays",
        "Literary_Criticism",
        "Drama",
        "Poetry",
        "Fantasy",
        "Science_Fiction",
        "Historical_Fiction",
        "Mystery",
        "Thriller",
        "Horror",
        "Romance",
    ],
    "History": [
        "Ancient",
        "Medieval",
        "Early_Modern",
        "Modern",
        "History_of_Science",
        "History_of_Philosophy",
    ],
    "Computing": [
        "Programming",
        "Algorithms",
        "Systems",
        "AI",
        "Graphics",
        "Game_Development",
        "Software_Engineering",
    ],
    "Psychology": [
        "Cognitive",
        "Behavioral",
        "Neuroscience",
        "Developmental",
    ],
    "Art": [
        "Art_History",
        "Theory",
        "Visual_Arts",
        "Architecture",
    ],
    "Music": [
        "Music_Theory",
        "History",
        "Composition",
        "Ethnomusicology",
    ],
    "Misc": [
        "Unsorted",
    ],
}

# ------------------------------------------------------------
# README content for Books taxonomy
# ------------------------------------------------------------
BOOKS_READMES = {
    # Level 1 root and Level 2 domains
    "Books": (
        "Long-form works intended for slow reading and sustained thought.\n"
        "This collection is organized to support browsing, wandering, and curiosity,\n"
        "not obligation or completion tracking."
    ),
    "Mathematics": (
        "Mathematical texts organized by domain and mode of thinking.\n"
        "These shelves are meant to be browsed, revisited, and explored non-linearly."
    ),
    "Science": (
        "Scientific works exploring the natural world.\n"
        "Organized by discipline and conceptual focus rather than textbook sequence."
    ),
    "Philosophy": (
        "Philosophical works organized by questions, traditions, and modes of inquiry.\n"
        "This space supports reflection rather than systematic study."
    ),
    "Literature": (
        "Literary works organized by aesthetic tradition and narrative mode.\n"
        "These shelves invite wandering based on tone, voice, and imaginative posture."
    ),
    "History": (
        "Historical works organized by era and thematic focus.\n"
        "Intended for contextual understanding rather than chronological completeness."
    ),
    "Computing": (
        "Books about computation, software, and digital systems.\n"
        "Focused on ideas, architecture, and practice rather than specific tools."
    ),
    "Psychology": (
        "Works exploring mind, behavior, and cognition.\n"
        "Organized by perspective rather than clinical or academic taxonomy."
    ),
    "Art": (
        "Books about visual art, aesthetics, and built environments.\n"
        "These shelves support visual thinking and historical context."
    ),
    "Music": (
        "Books about music as structure, history, and cultural practice.\n"
        "Separate from listening collections; focused on understanding and reflection."
    ),
    "Misc": (
        "Books that do not yet have a clear home.\n"
        "This is a temporary holding space, not a permanent category."
    ),
    # Literature (Level 3)
    "Classicism": (
        "Canonical works that shaped literary tradition.\n"
        "Often slow, foundational, and historically influential."
    ),
    "Modernism": (
        "Works marked by formal experimentation and interiority.\n"
        "Literature responding to rupture, fragmentation, and new ways of seeing."
    ),
    "Postmodernism": (
        "Literature that interrogates narrative, authority, and meaning itself.\n"
        "Often playful, ironic, recursive, or self-aware."
    ),
    "Magical_Realism": (
        "Narratives where the extraordinary is woven seamlessly into the ordinary.\n"
        "Neither fantasy nor realism, but a deliberate suspension between them."
    ),
    "Mythic": (
        "Stories rooted in archetype, legend, and inherited narrative forms.\n"
        "Includes epics, retellings, and works operating at symbolic scale."
    ),
    "Speculative": (
        "Literature that explores alternate realities and hypothetical worlds.\n"
        "Focused on ideas and consequences rather than genre conventions."
    ),
    "Essays": (
        "Short-form literary and philosophical reflections.\n"
        "Well-suited to digital reading and casual exploration."
    ),
    "Literary_Criticism": (
        "Works analyzing literature, narrative form, and aesthetic theory.\n"
        "Intended to deepen engagement rather than prescribe interpretation."
    ),
    "Drama": (
        "Plays and dramatic texts.\n"
        "Written for performance, dialogue, and embodied speech."
    ),
    "Poetry": (
        "Poetic works organized for slow reading and rereading.\n"
        "These texts reward attention, rhythm, and pause."
    ),
    "Fantasy": (
        "Narratives built around imagined worlds and mythic structures.\n"
        "Used here as a browsing shelf, not a promise of escapism."
    ),
    "Science_Fiction": (
        "Stories driven by technological, scientific, or speculative premises.\n"
        "Focused on consequences and ideas rather than genre tropes."
    ),
    "Historical_Fiction": (
        "Narratives set within real historical periods.\n"
        "Used for immersive context rather than strict historical analysis."
    ),
    "Mystery": (
        "Narratives structured around investigation, secrecy, and revelation.\n"
        "Read for tension, structure, and puzzle-solving."
    ),
    "Thriller": (
        "Fast-paced narratives built around urgency and high stakes.\n"
        "Selected for momentum and narrative drive."
    ),
    "Horror": (
        "Works that explore fear, unease, and the unknown.\n"
        "Includes psychological, cosmic, and existential horror."
    ),
    "Romance": (
        "Narratives centered on relationships and emotional development.\n"
        "Organized separately to support intentional browsing."
    ),
}

# ------------------------------------------------------------
# Build Books taxonomy + READMEs
# ------------------------------------------------------------
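ensure_dir_with_readme(books_root, BOOKS_READMES["Books"])

for level2, level3_list in BOOKS_TAXONOMY.items():
    level2_path = books_root / level2
    ensure_dir_with_readme(level2_path, BOOKS_READMES.get(level2, "Books domain."))

    for level3 in level3_list:
        level3_path = level2_path / level3
        # Most Level 3 shelves have no README entry yet, so .get() needs a
        # default; passing None would raise in ensure_dir_with_readme.
        # A shelf that shares its name with a Level 2 domain (Music/History)
        # also gets the default rather than the domain's README.
        text = None if level3 in BOOKS_TAXONOMY else BOOKS_READMES.get(level3)
        ensure_dir_with_readme(level3_path, text or "Books subcategory.")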

print(f"Books: {len(BOOKS_TAXONOMY)} domains, {sum(len(v) for v in BOOKS_TAXONOMY.values())} subcategories")
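
Because `BOOKS_READMES` is keyed by bare folder name, a name used in more than one place in the tree can carry only one description. The sketch below lists such repeated names so they can be reviewed; `find_name_collisions` is a hypothetical helper and the dictionary is a trimmed excerpt of `BOOKS_TAXONOMY`.

```python
from __future__ import annotations

from collections import defaultdict


def find_name_collisions(taxonomy: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each folder name used more than once to the paths that use it."""
    seen: dict[str, list[str]] = defaultdict(list)
    for domain, shelves in taxonomy.items():
        seen[domain].append(domain)
        for shelf in shelves:
            seen[shelf].append(f"{domain}/{shelf}")
    return {name: paths for name, paths in seen.items() if len(paths) > 1}


# Excerpt of BOOKS_TAXONOMY, trimmed for the example.
excerpt = {
    "Mathematics": ["Algebra", "Logic"],
    "Philosophy": ["Ethics", "Logic"],
    "History": ["Ancient", "Modern"],
    "Music": ["Music_Theory", "History"],
}

for name, paths in find_name_collisions(excerpt).items():
    print(name, "->", paths)
```

If distinct descriptions are wanted for such shelves, one option is to key the README dictionary by relative path (for example `"Music/History"`) instead of by name.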

### Code

In [None]:
code_root = ARCHIVE_ROOT / "Code"

# ------------------------------------------------------------
# Code taxonomy: organized by purpose at Level 2 (only Libraries is split by language)
# ------------------------------------------------------------
CODE_TAXONOMY = {
    "Libraries": [
        "Python",
        "JavaScript",
        "Rust",
        "Go",
        "C_CPP",
    ],
    "Tools": [
        "CLI",
        "Scripts",
        "Automation",
    ],
    "Learning": [
        "Tutorials",
        "Exercises",
        "Courses",
    ],
    "Experiments": [
        "Prototypes",
        "Sketches",
        "Spikes",
    ],
    "Reference": [
        "Snippets",
        "Templates",
        "Configs",
    ],
}

# ------------------------------------------------------------
# Code READMEs
# ------------------------------------------------------------
CODE_READMES = {
    "Libraries": (
        "Reusable code organized by language.\n"
        "Packages, modules, and standalone libraries intended for import."
    ),
    "Python": "Python libraries, packages, and reusable modules.",
    "JavaScript": "JavaScript/TypeScript libraries and npm packages.",
    "Rust": "Rust crates and reusable Rust code.",
    "Go": "Go modules and reusable Go packages.",
    "C_CPP": "C and C++ libraries and header-based code.",
    "Tools": (
        "Executable utilities and scripts.\n"
        "Code that does something when run, not imported."
    ),
    "CLI": "Command-line tools and terminal utilities.",
    "Scripts": "Standalone scripts for specific tasks.",
    "Automation": "Scripts for automated workflows, deployment, and maintenance.",
    "Learning": (
        "Educational code and exercises.\n"
        "Code written to learn, not to ship."
    ),
    "Tutorials": "Code from tutorials and walkthroughs.",
    "Exercises": "Practice problems and coding challenges.",
    "Courses": "Course materials and structured learning code.",
    "Experiments": (
        "Exploratory and throwaway code.\n"
        "Ideas tested quickly without polish or permanence."
    ),
    "Prototypes": "Early-stage implementations testing feasibility.",
    "Sketches": "Quick, rough code exploring an idea.",
    "Spikes": "Time-boxed investigations of specific technical questions.",
    "Reference": (
        "Code kept for reference, not execution.\n"
        "Patterns, templates, and configuration examples."
    ),
    "Snippets": "Small, reusable code fragments.",
    "Templates": "Starter templates and boilerplate code.",
    "Configs": "Configuration files and dotfile examples.",
}

# ------------------------------------------------------------
# Build Code taxonomy
# ------------------------------------------------------------
for level2, level3_list in CODE_TAXONOMY.items():
    level2_path = code_root / level2
    ensure_dir_with_readme(level2_path, CODE_READMES.get(level2))

    for level3 in level3_list:
        level3_path = level2_path / level3
        ensure_dir_with_readme(level3_path, CODE_READMES.get(level3))

print(f"Code: {len(CODE_TAXONOMY)} categories, {sum(len(v) for v in CODE_TAXONOMY.values())} subcategories")

### Compressed

In [None]:
compressed_root = ARCHIVE_ROOT / "Compressed"

# ------------------------------------------------------------
# Compressed taxonomy: staging by origin, not file type
# ------------------------------------------------------------
COMPRESSED_TAXONOMY = [
    "Downloads",
    "Backups",
    "Exports",
    "Unknown",
]

# ------------------------------------------------------------
# Compressed READMEs
# ------------------------------------------------------------
COMPRESSED_READMES = {
    "Downloads": (
        "Compressed files from the web.\n"
        "Downloaded archives awaiting extraction and classification."
    ),
    "Backups": (
        "Compressed backup archives.\n"
        "Snapshots and exports from other systems or services."
    ),
    "Exports": (
        "Compressed exports from applications.\n"
        "Project bundles, database dumps, and tool-specific archives."
    ),
    "Unknown": (
        "Compressed files of unclear origin.\n"
        "Awaiting inspection before classification or deletion."
    ),
}

# ------------------------------------------------------------
# Build Compressed taxonomy
# ------------------------------------------------------------
for name in COMPRESSED_TAXONOMY:
    ensure_dir_with_readme(
        compressed_root / name,
        COMPRESSED_READMES.get(name, "Compressed staging area."),
    )

print(f"Compressed: {len(COMPRESSED_TAXONOMY)} staging areas")

### Data

In [None]:
data_root = ARCHIVE_ROOT / "Data"

# ------------------------------------------------------------
# Data taxonomy: structured data by domain and format
# ------------------------------------------------------------
DATA_TAXONOMY = {
    "Tabular": [
        "CSV",
        "Parquet",
        "Excel",
    ],
    "Structured": [
        "JSON",
        "YAML",
        "XML",
    ],
    "Databases": [
        "SQLite",
        "Dumps",
        "Exports",
    ],
    "Logs": [
        "Application",
        "System",
        "Analytics",
    ],
    "Generated": [
        "Model_Outputs",
        "Simulations",
        "Scraped",
    ],
    "Geospatial": [
        "GeoJSON",
        "Shapefiles",
        "Rasters",
    ],
}

# ------------------------------------------------------------
# Data READMEs
# ------------------------------------------------------------
DATA_READMES = {
    "Tabular": (
        "Row-and-column data formats.\n"
        "Data that fits naturally into spreadsheets and dataframes."
    ),
    "CSV": "Comma-separated value files and delimited text data.",
    "Parquet": "Columnar storage format for efficient analytics.",
    "Excel": "Microsoft Excel workbooks and spreadsheets.",
    "Structured": (
        "Hierarchical and nested data formats.\n"
        "Configuration files, API responses, and structured documents."
    ),
    "JSON": "JavaScript Object Notation files.",
    "YAML": "YAML configuration and data files.",
    "XML": "Extensible Markup Language documents.",
    "Databases": (
        "Database files and exports.\n"
        "Portable database containers and migration artifacts."
    ),
    "SQLite": "SQLite database files.",
    "Dumps": "Database dumps and backup exports.",
    "Exports": "Exported data from database tools.",
    "Logs": (
        "Log files and event records.\n"
        "Time-series data from applications and systems."
    ),
    "Application": "Application-level logs and debug output.",
    "System": "Operating system and infrastructure logs.",
    "Analytics": "Analytics events and tracking data.",
    "Generated": (
        "Programmatically created data.\n"
        "Outputs from models, simulations, and automated collection."
    ),
    "Model_Outputs": "Data generated by ML models and algorithms.",
    "Simulations": "Output from simulations and synthetic data generation.",
    "Scraped": "Data collected via web scraping and crawling.",
    "Geospatial": (
        "Geographic and spatial data.\n"
        "Maps, coordinates, and location-based datasets."
    ),
    "GeoJSON": "GeoJSON geographic data files.",
    "Shapefiles": "ESRI shapefiles and related vector data.",
    "Rasters": "Raster-based geographic data and imagery.",
}

# ------------------------------------------------------------
# Build Data taxonomy
# ------------------------------------------------------------
for level2, level3_list in DATA_TAXONOMY.items():
    level2_path = data_root / level2
    ensure_dir_with_readme(level2_path, DATA_READMES.get(level2))

    for level3 in level3_list:
        level3_path = level2_path / level3
        ensure_dir_with_readme(level3_path, DATA_READMES.get(level3))

print(f"Data: {len(DATA_TAXONOMY)} categories, {sum(len(v) for v in DATA_TAXONOMY.values())} subcategories")

### Docs

In [None]:
docs_root = ARCHIVE_ROOT / "Docs"

# ------------------------------------------------------------
# Docs taxonomy: non-book documents by purpose
# ------------------------------------------------------------
DOCS_TAXONOMY = {
    "Reference": [
        "Manuals",
        "Datasheets",
        "Specifications",
        "API_Docs",
    ],
    "Administrative": [
        "Contracts",
        "Receipts",
        "Invoices",
        "Tax",
        "Legal",
    ],
    "Personal": [
        "Identity",
        "Medical",
        "Education",
        "Employment",
    ],
    "Correspondence": [
        "Letters",
        "Emails",
        "Messages",
    ],
    "Presentations": [
        "Slides",
        "Posters",
        "Handouts",
    ],
}

# ------------------------------------------------------------
# Docs READMEs
# ------------------------------------------------------------
DOCS_READMES = {
    "Reference": (
        "Technical reference materials.\n"
        "Documentation meant to be consulted, not read cover-to-cover."
    ),
    "Manuals": "User manuals and instruction guides.",
    "Datasheets": "Technical datasheets and component specifications.",
    "Specifications": "Formal specifications and standards documents.",
    "API_Docs": "API documentation and integration guides.",
    "Administrative": (
        "Business and administrative paperwork.\n"
        "Financial, legal, and transactional documents."
    ),
    "Contracts": "Signed agreements and contracts.",
    "Receipts": "Purchase receipts and proof of payment.",
    "Invoices": "Invoices sent and received.",
    "Tax": "Tax returns, forms, and related documents.",
    "Legal": "Legal documents, filings, and official correspondence.",
    "Personal": (
        "Personal identity and life documents.\n"
        "Records that define identity, history, and credentials."
    ),
    "Identity": "ID documents, passports, and identity records.",
    "Medical": "Medical records and health documents.",
    "Education": "Diplomas, transcripts, and educational records.",
    "Employment": "Employment records, contracts, and pay stubs.",
    "Correspondence": (
        "Written communication archives.\n"
        "Letters, emails, and messages worth preserving."
    ),
    "Letters": "Physical and formal letters.",
    "Emails": "Archived email correspondence.",
    "Messages": "Saved messages from various platforms.",
    "Presentations": (
        "Visual presentation materials.\n"
        "Slides, posters, and materials designed for display."
    ),
    "Slides": "Slide decks and presentation files.",
    "Posters": "Posters and large-format visual documents.",
    "Handouts": "Handouts and supplementary presentation materials.",
}

# ------------------------------------------------------------
# Build Docs taxonomy
# ------------------------------------------------------------
for level2, level3_list in DOCS_TAXONOMY.items():
    level2_path = docs_root / level2
    ensure_dir_with_readme(level2_path, DOCS_READMES.get(level2))

    for level3 in level3_list:
        level3_path = level2_path / level3
        ensure_dir_with_readme(level3_path, DOCS_READMES.get(level3))

print(f"Docs: {len(DOCS_TAXONOMY)} categories, {sum(len(v) for v in DOCS_TAXONOMY.values())} subcategories")

### Images
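
README text in this notebook is looked up by bare folder name, so a name that appears in more than one place in a taxonomy can only carry one description. The sketch below is one way such repeats might be detected before building. `find_repeated_names` and `sample_images` are illustrative names that do not exist elsewhere in the notebook, and the sample is a trimmed copy of the Images layout.

```python
from collections import defaultdict


def find_repeated_names(taxonomy):
    """Map each folder name used more than once to the places it appears.

    `taxonomy` is a dict of level-2 name -> list of level-3 names, the same
    shape as the nested *_TAXONOMY dicts in this notebook.
    """
    locations = defaultdict(list)
    for level2, level3_list in taxonomy.items():
        locations[level2].append(level2)  # the category itself
        for level3 in level3_list:
            locations[level3].append(f"{level2}/{level3}")  # a subfolder
    return {name: paths for name, paths in locations.items() if len(paths) > 1}


# Trimmed sample for illustration only (assumption: two categories suffice).
sample_images = {
    "Scans": ["Documents", "Artwork", "Film", "Prints"],
    "Artwork": ["Digital", "Traditional", "AI_Generated", "Reference"],
}

repeated = find_repeated_names(sample_images)
for name, paths in repeated.items():
    print(f"{name}: {', '.join(paths)}")
```

Any name this reports needs either a path-qualified description or a deliberate decision that one shared description is good enough.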

In [None]:
images_root = ARCHIVE_ROOT / "Images"

# ------------------------------------------------------------
# Images taxonomy: still images by origin and purpose
# ------------------------------------------------------------
IMAGES_TAXONOMY = {
    "Photography": [
        "Personal",
        "Travel",
        "Events",
        "Street",
        "Nature",
    ],
    "Scans": [
        "Documents",
        "Artwork",
        "Film",
        "Prints",
    ],
    "Screenshots": [
        "Desktop",
        "Mobile",
        "Web",
        "Games",
    ],
    "Artwork": [
        "Digital",
        "Traditional",
        "AI_Generated",
        "Reference",
    ],
    "Assets": [
        "Icons",
        "Textures",
        "UI",
        "Stock",
    ],
}

# ------------------------------------------------------------
# Images READMEs
# ------------------------------------------------------------
IMAGES_READMES = {
    "Photography": (
        "Photographs taken with cameras.\n"
        "Organized by context and subject matter."
    ),
    "Personal": "Personal and family photographs.",
    "Travel": "Travel and location photography.",
    "Events": "Events, gatherings, and occasions.",
    "Street": "Street photography and urban scenes.",
    "Nature": "Nature, landscapes, and wildlife.",
    "Scans": (
        "Digitized physical materials.\n"
        "Documents, artwork, and film converted to digital format."
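    ),
    "Documents": "Scanned documents and papers.",
    # NOTE: "Artwork" is also a top-level key further down. A dict literal keeps
    # only the last value for a repeated key, so this text is not what ends up
    # stored under "Artwork".
    "Artwork": "Scanned traditional artwork and illustrations.",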
    "Film": "Scanned film negatives and slides.",
    "Prints": "Scanned photographic prints.",
    "Screenshots": (
        "Screen captures from devices.\n"
        "Visual records of digital interfaces and content."
    ),
    "Desktop": "Desktop and application screenshots.",
    "Mobile": "Mobile device screenshots.",
    "Web": "Web page captures and browser screenshots.",
    "Games": "Video game screenshots and captures.",
    "Artwork": (
        "Created visual art.\n"
        "Original artwork, both digital and digitized traditional."
    ),
    "Digital": "Digitally created artwork and illustrations.",
    "Traditional": "Traditional artwork (digitized).",
    "AI_Generated": "Images created using AI tools and models.",
    "Reference": "Reference images for artistic work.",
    "Assets": (
        "Reusable visual components.\n"
        "Icons, textures, and design elements."
    ),
    "Icons": "Icon sets and symbolic graphics.",
    "Textures": "Textures and patterns for design use.",
    "UI": "User interface elements and components.",
    "Stock": "Stock images and licensed visual assets.",
}

# ------------------------------------------------------------
# Build Images taxonomy
# ------------------------------------------------------------
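# IMAGES_READMES is keyed by bare folder name, and "Artwork" is both a
# top-level category and a subfolder of "Scans". The dict literal keeps only
# the last "Artwork" value, so the scan-specific text is supplied here through
# a path-qualified override that takes precedence over the flat lookup.
IMAGES_README_OVERRIDES = {
    "Scans/Artwork": "Scanned traditional artwork and illustrations.",
}

for level2, level3_list in IMAGES_TAXONOMY.items():
    level2_path = images_root / level2
    ensure_dir_with_readme(level2_path, IMAGES_READMES.get(level2))

    for level3 in level3_list:
        level3_path = level2_path / level3
        readme = IMAGES_README_OVERRIDES.get(
            f"{level2}/{level3}", IMAGES_READMES.get(level3)
        )
        ensure_dir_with_readme(level3_path, readme)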

print(f"Images: {len(IMAGES_TAXONOMY)} categories, {sum(len(v) for v in IMAGES_TAXONOMY.values())} subcategories")

### Intake

In [None]:
intake_root = ARCHIVE_ROOT / "Intake"

# ------------------------------------------------------------
# Intake taxonomy: temporary staging by source
# ------------------------------------------------------------
INTAKE_TAXONOMY = [
    "Downloads",
    "Desktop",
    "Mobile",
    "Email",
    "Transfers",
    "Unsorted",
]

# ------------------------------------------------------------
# Intake READMEs
# ------------------------------------------------------------
INTAKE_READMES = {
    "Downloads": (
        "Files downloaded from the web.\n"
        "Browser downloads awaiting classification."
    ),
    "Desktop": (
        "Files saved to desktop.\n"
        "Quick saves and temporary desktop clutter."
    ),
    "Mobile": (
        "Files transferred from mobile devices.\n"
        "Photos, screenshots, and downloads from phones and tablets."
    ),
    "Email": (
        "Attachments saved from email.\n"
        "Files extracted from email awaiting proper placement."
    ),
    "Transfers": (
        "Files received from other people or systems.\n"
        "Shared files, AirDrop, and external transfers."
    ),
    "Unsorted": (
        "Files of unknown origin.\n"
        "Catch-all for anything that doesn't fit elsewhere in Intake."
    ),
}

# ------------------------------------------------------------
# Build Intake taxonomy
# ------------------------------------------------------------
for name in INTAKE_TAXONOMY:
    ensure_dir_with_readme(
        intake_root / name,
        INTAKE_READMES.get(name, "Intake staging area."),
    )

print(f"Intake: {len(INTAKE_TAXONOMY)} staging areas")

### Journal

In [None]:
journal_root = ARCHIVE_ROOT / "Journal"

# ------------------------------------------------------------
# Journal taxonomy: chronological personal writing by type
# ------------------------------------------------------------
JOURNAL_TAXONOMY = [
    "Daily",
    "Weekly",
    "Monthly",
    "Annual",
    "Dreams",
    "Gratitude",
    "Freewrite",
]

# ------------------------------------------------------------
# Journal READMEs
# ------------------------------------------------------------
JOURNAL_READMES = {
    "Daily": (
        "Daily journal entries.\n"
        "Day-to-day reflections and observations."
    ),
    "Weekly": (
        "Weekly reviews and reflections.\n"
        "Summaries and patterns across the week."
    ),
    "Monthly": (
        "Monthly reviews and retrospectives.\n"
        "Longer-arc reflections and goal tracking."
    ),
    "Annual": (
        "Annual reviews and year-end reflections.\n"
        "Big-picture thinking and year-over-year patterns."
    ),
    "Dreams": (
        "Dream journal entries.\n"
        "Records of dreams and nocturnal experiences."
    ),
    "Gratitude": (
        "Gratitude entries.\n"
        "Intentional acknowledgment of positive experiences."
    ),
    "Freewrite": (
        "Unstructured freewriting.\n"
        "Stream-of-consciousness writing without format."
    ),
}

# ------------------------------------------------------------
# Build Journal taxonomy
# ------------------------------------------------------------
for name in JOURNAL_TAXONOMY:
    ensure_dir_with_readme(
        journal_root / name,
        JOURNAL_READMES.get(name, "Journal category."),
    )

print(f"Journal: {len(JOURNAL_TAXONOMY)} categories")

### Movies
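
Before creating a two-level tree on disk, it can help to preview the paths it would produce. The sketch below only computes paths and never touches the filesystem. `iter_taxonomy_paths`, the `/archive/Movies` root, and the two-genre sample are assumptions made for illustration.

```python
from pathlib import PurePosixPath


def iter_taxonomy_paths(root, taxonomy):
    """Yield every directory a two-level taxonomy would create under `root`."""
    for level2, level3_list in taxonomy.items():
        level2_path = root / level2
        yield level2_path
        for level3 in level3_list:
            yield level2_path / level3


# Hypothetical root and a trimmed sample; nothing is written to disk.
sample_root = PurePosixPath("/archive/Movies")
sample_movies = {
    "Thriller": ["Crime", "Psychological"],
    "Horror": ["Folk", "Psychological"],
}

preview = [str(p) for p in iter_taxonomy_paths(sample_root, sample_movies)]
print("\n".join(preview))
```

Pure paths keep the preview side-effect free, so it can be rerun while the taxonomy is still being edited.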

In [None]:
movies_root = ARCHIVE_ROOT / "Movies"

# ------------------------------------------------------------
# Movies taxonomy: cinematic works by tradition and tone
# Deeper hierarchy supports browsing and serendipity
# ------------------------------------------------------------
MOVIES_TAXONOMY = {
    "Drama": [
        "Character_Study",
        "Family",
        "Historical",
        "Legal",
        "Political",
        "Social",
    ],
    "Comedy": [
        "Absurdist",
        "Dark_Comedy",
        "Romantic_Comedy",
        "Satire",
        "Slapstick",
    ],
    "Thriller": [
        "Crime",
        "Espionage",
        "Mystery",
        "Psychological",
        "Neo_Noir",
    ],
    "Science_Fiction": [
        "Cyberpunk",
        "Dystopian",
        "Hard_SF",
        "Space_Opera",
        "Time_Travel",
    ],
    "Fantasy": [
        "Dark_Fantasy",
        "Epic",
        "Fairy_Tale",
        "Mythic",
        "Urban_Fantasy",
    ],
    "Horror": [
        "Body_Horror",
        "Cosmic",
        "Folk",
        "Psychological",
        "Slasher",
        "Supernatural",
    ],
    "Action": [
        "Martial_Arts",
        "Military",
        "Heist",
        "Disaster",
    ],
    "Documentary": [
        "Nature",
        "Music",
        "Political",
        "Portrait",
        "Science",
        "True_Crime",
    ],
    "Animation": [
        "Anime",
        "Stop_Motion",
        "Traditional",
        "CGI",
    ],
    "Art_House": [
        "Avant_Garde",
        "Minimalist",
        "Surrealist",
        "Slow_Cinema",
    ],
    "World_Cinema": [
        "Asian",
        "European",
        "Latin_American",
        "Middle_Eastern",
        "African",
    ],
    "Classic": [
        "Silent",
        "Golden_Age",
        "New_Hollywood",
        "Pre_Code",
    ],
}

# ------------------------------------------------------------
# Movies READMEs
# ------------------------------------------------------------
MOVIES_READMES = {
    "Drama": (
        "Character-driven narratives exploring human experience.\n"
        "Stories that emphasize emotional depth over spectacle."
    ),
    "Character_Study": "Films focused on individual psychology and transformation.",
    "Family": "Family dynamics and generational stories.",
    "Historical": "Films set in specific historical periods.",
    "Legal": "Courtroom dramas and legal system narratives.",
    "Political": "Political intrigue and power dynamics.",
    "Social": "Social issues and cultural commentary.",
    "Comedy": (
        "Films designed to provoke laughter.\n"
        "Humor as primary mode of engagement."
    ),
    "Absurdist": "Comedy rooted in absurdity and illogic.",
    "Dark_Comedy": "Comedy that finds humor in dark subjects.",
    "Romantic_Comedy": "Love stories with comedic elements.",
    "Satire": "Comedy that critiques society or institutions.",
    "Slapstick": "Physical comedy and visual gags.",
    "Thriller": (
        "Suspense-driven narratives.\n"
        "Tension and uncertainty as core experience."
    ),
    "Crime": "Criminal underworld and heist narratives.",
    "Espionage": "Spy stories and intelligence operations.",
    "Mystery": "Whodunits and investigative narratives.",
    "Psychological": "Mind games and unreliable perceptions.",
    "Neo_Noir": "Modern films in the noir tradition.",
    "Science_Fiction": (
        "Speculative futures and technological premises.\n"
        "Ideas and consequences over action."
    ),
    "Cyberpunk": "High tech, low life. Digital dystopias.",
    "Dystopian": "Oppressive futures and societal collapse.",
    "Hard_SF": "Science fiction grounded in plausible science.",
    "Space_Opera": "Epic space adventures and galactic scope.",
    "Time_Travel": "Temporal manipulation and paradox.",
    "Fantasy": (
        "Magical worlds and mythic structures.\n"
        "Stories that transcend mundane reality."
    ),
    "Dark_Fantasy": "Fantasy with horror or tragic elements.",
    "Epic": "Large-scale fantasy with world-shaping stakes.",
    "Fairy_Tale": "Folk and fairy tale adaptations.",
    "Mythic": "Stories operating at archetypal scale.",
    "Urban_Fantasy": "Magic in contemporary settings.",
    "Horror": (
        "Films designed to frighten and unsettle.\n"
        "Fear as aesthetic experience."
    ),
    "Body_Horror": "Horror focused on bodily transformation.",
    "Cosmic": "Lovecraftian and existential horror.",
    "Folk": "Horror rooted in folklore and tradition.",
    "Slasher": "Killer-focused horror narratives.",
    "Supernatural": "Ghosts, demons, and the uncanny.",
    "Action": (
        "Spectacle-driven physical conflict.\n"
        "Movement, combat, and kinetic energy."
    ),
    "Martial_Arts": "Combat-focused films emphasizing fighting arts.",
    "Military": "War films and military operations.",
    "Heist": "Elaborate theft and con narratives.",
    "Disaster": "Catastrophe and survival narratives.",
    "Documentary": (
        "Non-fiction filmmaking.\n"
        "Reality captured and shaped by a point of view."
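    ),
    "Nature": "Nature and wildlife documentaries.",
    "Music": "Music documentaries and concert films.",
    # NOTE: "Political" is also defined above for Drama. As the later duplicate
    # key, this entry is the one the dict keeps.
    "Political": "Political subjects and social movements.",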
    "Portrait": "Individual and biographical documentaries.",
    "Science": "Science and technology documentaries.",
    "True_Crime": "Criminal cases and investigations.",
    "Animation": (
        "Animated films across traditions.\n"
        "Movement created frame by frame."
    ),
    "Anime": "Japanese animation tradition.",
    "Stop_Motion": "Frame-by-frame physical animation.",
    "Traditional": "Hand-drawn and 2D animation.",
    "CGI": "Computer-generated animation.",
    "Art_House": (
        "Experimental and auteur-driven cinema.\n"
        "Films that prioritize vision over convention."
    ),
    "Avant_Garde": "Experimental and non-narrative cinema.",
    "Minimalist": "Stripped-down, essentialist filmmaking.",
    "Surrealist": "Dream logic and unconscious imagery.",
    "Slow_Cinema": "Contemplative pacing and long takes.",
    "World_Cinema": (
        "Films organized by region of origin.\n"
        "Browsing by cultural tradition."
    ),
    "Asian": "Cinema from Asia (excluding anime).",
    "European": "European cinema traditions.",
    "Latin_American": "Cinema from Latin America.",
    "Middle_Eastern": "Cinema from the Middle East.",
    "African": "Cinema from Africa.",
    "Classic": (
        "Historical periods of cinema.\n"
        "Films from defined eras of filmmaking."
    ),
    "Silent": "Silent era cinema.",
    "Golden_Age": "Classic Hollywood studio era.",
    "New_Hollywood": "1960s-1980s American auteur period.",
    "Pre_Code": "Hollywood films made before the Hays Code was enforced (1934).",
}

# ------------------------------------------------------------
# Build Movies taxonomy
# ------------------------------------------------------------
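# MOVIES_READMES is keyed by bare folder name, but "Political" appears under
# both Drama and Documentary. The dict literal keeps only the Documentary text,
# so the Drama wording is supplied here through a path-qualified override.
# "Psychological" (Thriller and Horror) has a single entry and shares it.
MOVIES_README_OVERRIDES = {
    "Drama/Political": "Political intrigue and power dynamics.",
}

for level2, level3_list in MOVIES_TAXONOMY.items():
    level2_path = movies_root / level2
    ensure_dir_with_readme(level2_path, MOVIES_READMES.get(level2))

    for level3 in level3_list:
        level3_path = level2_path / level3
        readme = MOVIES_README_OVERRIDES.get(
            f"{level2}/{level3}", MOVIES_READMES.get(level3)
        )
        ensure_dir_with_readme(level3_path, readme)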

print(f"Movies: {len(MOVIES_TAXONOMY)} genres, {sum(len(v) for v in MOVIES_TAXONOMY.values())} subgenres")

### Music
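
With a taxonomy this large, a subgenre can be added without a matching description, and `dict.get` then returns `None` without complaint. The sketch below shows one way to check coverage. `missing_readmes` and the sample data are illustrative and not part of the notebook.

```python
def missing_readmes(taxonomy, readmes):
    """Return taxonomy names (categories and subfolders) with no README text."""
    missing = []
    for level2, level3_list in taxonomy.items():
        for name in (level2, *level3_list):
            if name not in readmes and name not in missing:
                missing.append(name)
    return missing


# Trimmed sample for illustration only; "Swing" deliberately has no description.
sample_music = {"Jazz": ["Bebop", "Cool", "Swing"]}
sample_readmes = {
    "Jazz": "American improvisational tradition.",
    "Bebop": "1940s virtuosic, complex jazz.",
    "Cool": "Relaxed, understated jazz.",
}

gaps = missing_readmes(sample_music, sample_readmes)
print(gaps if gaps else "Every folder has a README entry.")
```

Running a check like this against the real dicts before the build loop would surface gaps early, rather than leaving them to be found while browsing.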

In [None]:
music_root = ARCHIVE_ROOT / "Music"

# ------------------------------------------------------------
# Music taxonomy: deep semantic hierarchy for browsing
# "These domains invite wandering based on tone, voice, and imaginative posture"
# Expanded based on Musicmap, Wikipedia, and genre research
# ------------------------------------------------------------
MUSIC_TAXONOMY = {
    "Classical": [
        "Baroque",
        "Classical_Period",
        "Romantic",
        "Modern",
        "Contemporary",
        "Opera",
        "Chamber",
        "Orchestral",
        "Solo_Piano",
        "Choral",
    ],
    "Jazz": [
        "Bebop",
        "Cool",
        "Free_Jazz",
        "Fusion",
        "Hard_Bop",
        "Modal",
        "Smooth",
        "Swing",
        "Vocal",
        "Latin_Jazz",
        "Acid_Jazz",
        "Nu_Jazz",
        "Gypsy_Jazz",
        "Big_Band",
    ],
    "Blues": [
        "Delta",
        "Chicago",
        "Texas",
        "Electric",
        "Acoustic",
        "British_Blues",
        "Jump_Blues",
        "Piedmont",
        "Country_Blues",
        "Blues_Rock",
    ],
    "Country": [
        "Traditional",
        "Honky_Tonk",
        "Outlaw",
        "Nashville_Sound",
        "Countrypolitan",
        "Country_Rock",
        "Alt_Country",
        "Americana",
        "Country_Pop",
        "Bro_Country",
        "Western_Swing",
        "Bakersfield_Sound",
        "Texas_Country",
        "Red_Dirt",
        "Bluegrass",
        "Newgrass",
    ],
    "Rock": [
        "Alternative",
        "Art_Rock",
        "Classic_Rock",
        "Garage",
        "Grunge",
        "Hard_Rock",
        "Indie",
        "Post_Punk",
        "Progressive",
        "Psychedelic",
        "Punk",
        "Shoegaze",
        "Post_Rock",
        "Math_Rock",
        "Emo",
        "Stoner_Rock",
        "Noise_Rock",
        "Surf",
        "Glam",
        "Southern_Rock",
        "Britpop",
    ],
    "Metal": [
        "Black",
        "Death",
        "Doom",
        "Heavy",
        "Power",
        "Progressive_Metal",
        "Sludge",
        "Thrash",
        "Symphonic",
        "Metalcore",
        "Nu_Metal",
        "Djent",
        "Folk_Metal",
        "Gothic_Metal",
        "Speed",
        "Groove",
        "Deathcore",
    ],
    "Punk": [
        "Classic_Punk",
        "Hardcore",
        "Post_Hardcore",
        "Pop_Punk",
        "Ska_Punk",
        "Crust",
        "D_Beat",
        "Oi",
        "Anarcho",
        "Melodic_Hardcore",
        "Screamo",
    ],
    "Electronic": [
        "Ambient",
        "Breakbeat",
        "Downtempo",
        "Drum_and_Bass",
        "House",
        "IDM",
        "Industrial",
        "Synthwave",
        "Techno",
        "Trance",
        "Dubstep",
        "UK_Garage",
        "Electro",
        "Chillwave",
        "Future_Bass",
        "Hardstyle",
        "Gabber",
        "Trip_Hop",
        "Vaporwave",
        "Electronica",
        "Chiptune",
        "Glitch",
    ],
    "Hip_Hop": [
        "Boom_Bap",
        "Conscious",
        "Experimental_Hip_Hop",
        "Gangsta",
        "Jazz_Rap",
        "Lo_Fi_Hip_Hop",
        "Southern",
        "Trap",
        "Underground",
        "Drill",
        "Crunk",
        "Grime",
        "Abstract",
        "Cloud_Rap",
        "Horrorcore",
        "Chopped_and_Screwed",
        "Phonk",
        "UK_Hip_Hop",
        "G_Funk",
        "Hyphy",
    ],
    "Pop": [
        "Synth_Pop",
        "Dance_Pop",
        "Indie_Pop",
        "Art_Pop",
        "Dream_Pop",
        "Electropop",
        "Teen_Pop",
        "Power_Pop",
        "Baroque_Pop",
        "Chamber_Pop",
        "Hyperpop",
        "K_Pop",
        "J_Pop",
        "City_Pop",
        "Sophisti_Pop",
        "Jangle_Pop",
    ],
    "Reggae": [
        "Roots",
        "Dub",
        "Ska",
        "Rocksteady",
        "Dancehall",
        "Lovers_Rock",
        "Ragga",
        "Reggae_Fusion",
        "Dub_Poetry",
        "Digital_Reggae",
        "Reggaeton",
    ],
    "Soul_RnB": [
        "Classic_Soul",
        "Contemporary_RnB",
        "Funk",
        "Gospel",
        "Motown",
        "Neo_Soul",
        "Quiet_Storm",
        "New_Jack_Swing",
        "Philadelphia_Soul",
        "Northern_Soul",
        "Southern_Soul",
        "Psychedelic_Soul",
        "Blue_Eyed_Soul",
    ],
    "Folk": [
        "Acoustic",
        "Americana",
        "Celtic",
        "Contemporary_Folk",
        "Traditional",
        "Singer_Songwriter",
        "Anti_Folk",
        "Neofolk",
        "Psych_Folk",
        "Freak_Folk",
        "Indie_Folk",
    ],
    "World": [
        "African",
        "Asian",
        "Caribbean",
        "Latin",
        "Middle_Eastern",
        "Nordic",
        "Afrobeat",
        "Flamenco",
        "Fado",
        "Bossa_Nova",
        "Samba",
        "Tango",
        "Klezmer",
        "Bollywood",
        "Highlife",
        "Soukous",
        "Cumbia",
        "Salsa",
        "Merengue",
        "Bachata",
    ],
    "Experimental": [
        "Avant_Garde",
        "Drone",
        "Field_Recordings",
        "Musique_Concrete",
        "Noise",
        "Sound_Collage",
        "Electroacoustic",
        "Minimalism",
        "Spectral",
        "Free_Improvisation",
        "Dark_Ambient",
        "Power_Electronics",
    ],
    "New_Age": [
        "Meditation",
        "Healing",
        "Space",
        "Nature_Sounds",
        "Relaxation",
        "Neoclassical",
    ],
    "Soundtrack": [
        "Film",
        "Television",
        "Video_Game",
        "Anime",
        "Musical_Theater",
        "Library_Music",
    ],
}

# ------------------------------------------------------------
# Music READMEs
# ------------------------------------------------------------
MUSIC_READMES = {
    # === CLASSICAL ===
    "Classical": (
        "Western art music tradition.\n"
        "Organized by period, form, and instrumentation."
    ),
    "Baroque": "1600-1750. Bach, Vivaldi, Handel. Ornate and contrapuntal.",
    "Classical_Period": "1750-1820. Mozart, Haydn, early Beethoven. Balance and form.",
    "Romantic": "1820-1900. Emotional expression, virtuosity, and nationalism.",
    "Modern": "1900-1975. Atonality, serialism, neoclassicism.",
    "Contemporary": "1975-present. Living composers and new works.",
    "Opera": "Staged dramatic works combining music, singing, and theater.",
    "Chamber": "Small ensemble works for intimate performance.",
    "Orchestral": "Full orchestra compositions: symphonies, concertos, tone poems.",
    "Solo_Piano": "Works for piano alone across all periods.",
    "Choral": "Vocal ensemble works, sacred and secular.",

    # === JAZZ ===
    "Jazz": (
        "American improvisational tradition.\n"
        "Organized by era, style, and approach."
    ),
    "Bebop": "1940s virtuosic, complex jazz. Parker, Gillespie.",
    "Cool": "Relaxed, understated jazz. Miles, Chet Baker.",
    "Free_Jazz": "Avant-garde improvisation without harmonic constraints.",
    "Fusion": "Jazz mixed with rock, funk, and electronic elements.",
    "Hard_Bop": "Bluesy, soulful extension of bebop. Art Blakey, Horace Silver.",
    "Modal": "Improvisation based on modes rather than chord changes.",
    "Smooth": "Accessible, radio-friendly jazz. Contemporary instrumental.",
    "Swing": "Big band era dance music. 1930s-40s.",
    "Vocal": "Jazz featuring prominent vocals. Standards and scat.",
    "Latin_Jazz": "Afro-Cuban rhythms meet jazz harmony. Tito Puente, Mongo Santamaria.",
    "Acid_Jazz": "Jazz-funk fusion with electronic production. 1980s-90s UK.",
    "Nu_Jazz": "Modern electronic-influenced jazz. Contemporary productions.",
    "Gypsy_Jazz": "Django Reinhardt tradition. Acoustic, virtuosic guitar.",
    "Big_Band": "Large jazz ensembles. Swing era and beyond.",

    # === BLUES ===
    "Blues": (
        "African American roots music.\n"
        "Foundation of rock, jazz, and soul."
    ),
    "Delta": "Mississippi acoustic blues. Robert Johnson, Son House.",
    "Chicago": "Electric, urban blues. Muddy Waters, Howlin' Wolf.",
    "Texas": "Distinct regional style. Stevie Ray Vaughan, T-Bone Walker.",
    "Electric": "Amplified blues across regions and eras.",
    "Acoustic": "Unplugged blues, traditional and contemporary.",
    "British_Blues": "UK interpretation of American blues. Clapton, Mayall.",
    "Jump_Blues": "Upbeat, swing-influenced. Precursor to rock and roll.",
    "Piedmont": "East Coast fingerpicking style. Blind Blake, Rev. Gary Davis.",
    "Country_Blues": "Rural acoustic blues tradition.",
    "Blues_Rock": "Blues-influenced rock. Led Zeppelin, Cream.",

    # === COUNTRY ===
    "Country": (
        "American roots music from the rural South and West.\n"
        "Storytelling, tradition, and evolution."
    ),
    "Traditional": "Classic country sound. Hank Williams, Patsy Cline.",
    "Honky_Tonk": "Jukebox country. Drinking songs and heartbreak.",
    "Outlaw": "Rebellious spirit. Waylon, Willie, and the boys.",
    "Nashville_Sound": "Polished studio production. 1960s crossover.",
    "Countrypolitan": "Orchestral country. Pop sophistication.",
    "Country_Rock": "Country meets rock. Eagles, Gram Parsons.",
    "Alt_Country": "Alternative and indie sensibility. Uncle Tupelo, Wilco.",
    "Americana": "Roots music umbrella. Folk, country, blues fusion.",
    "Country_Pop": "Radio-friendly crossover. Shania, Taylor Swift.",
    "Bro_Country": "Modern party-focused style. Trucks and tailgates.",
    "Western_Swing": "Jazz-influenced dance music. Bob Wills.",
    "Bakersfield_Sound": "California country. Buck Owens, Merle Haggard.",
    "Texas_Country": "Lone Star state tradition. Independent spirit.",
    "Red_Dirt": "Oklahoma roots. Cross Canadian Ragweed.",
    "Bluegrass": "Acoustic string band music. Bill Monroe tradition.",
    "Newgrass": "Progressive bluegrass. Bela Fleck, Nickel Creek.",

    # === ROCK ===
    "Rock": (
        "Guitar-driven popular music.\n"
        "Organized by subgenre, era, and attitude."
    ),
    "Alternative": "Non-mainstream rock sensibilities.",
    "Art_Rock": "Ambitious, experimental rock. Roxy Music, Bowie.",
    "Classic_Rock": "1960s-1980s foundational rock.",
    "Garage": "Raw, stripped-down rock. Back to basics.",
    "Grunge": "Seattle sound. Nirvana, Pearl Jam, Soundgarden.",
    "Hard_Rock": "Heavy, riff-driven rock. AC/DC, Led Zeppelin.",
    "Indie": "Independent label aesthetic and ethos.",
    "Post_Punk": "After punk, darker and more experimental.",
    "Progressive": "Complex compositions and concept albums.",
    "Psychedelic": "Mind-expanding, experimental rock.",
    "Punk": "Fast, aggressive, DIY ethos.",
    "Shoegaze": "Wall of sound, dreamy. My Bloody Valentine, Slowdive.",
    "Post_Rock": "Crescendo-based, cinematic. Godspeed, Mogwai.",
    "Math_Rock": "Complex time signatures and angular riffs.",
    "Emo": "Emotional hardcore derivative. Confessional lyrics.",
    "Stoner_Rock": "Heavy, fuzzy, desert rock. Kyuss, Queens.",
    "Noise_Rock": "Abrasive, feedback-driven. Sonic Youth, Lightning Bolt.",
    "Surf": "Reverb-drenched instrumentals. Dick Dale.",
    "Glam": "Theatrical, androgynous. Bowie, T. Rex.",
    "Southern_Rock": "American South influences. Skynyrd, Allman Brothers.",
    "Britpop": "1990s UK guitar pop. Oasis, Blur, Pulp.",

    # === METAL ===
    "Metal": (
        "Heavy, distorted, intense.\n"
        "Organized by subgenre and extremity."
    ),
    "Black": "Atmospheric, tremolo picking, blast beats. Mayhem, Darkthrone.",
    "Death": "Extreme vocals, technical playing. Death, Morbid Angel.",
    "Doom": "Slow, heavy, dark. Black Sabbath, Electric Wizard.",
    "Heavy": "Traditional heavy metal. Iron Maiden, Judas Priest.",
    "Power": "Melodic, anthemic, fast. Helloween, Blind Guardian.",
    "Progressive_Metal": "Complex, technical metal. Dream Theater, Tool.",
    "Sludge": "Doom meets hardcore. Eyehategod, Crowbar.",
    "Thrash": "Fast, aggressive metal. Metallica, Slayer.",
    "Symphonic": "Orchestral elements. Nightwish, Epica.",
    "Metalcore": "Metal and hardcore fusion. Killswitch Engage.",
    "Nu_Metal": "1990s-2000s mainstream metal. Korn, Limp Bizkit.",
    "Djent": "Modern progressive style. Meshuggah influence.",
    "Folk_Metal": "Traditional folk elements. Korpiklaani, Finntroll.",
    "Gothic_Metal": "Dark, romantic atmosphere. Type O Negative.",
    "Speed": "Fast, technical precursor to thrash.",
    "Groove": "Mid-tempo, riff-focused. Pantera.",
    "Deathcore": "Death metal meets metalcore. Extreme breakdowns.",

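    # === PUNK ===
    # NOTE: "Punk" is also a Rock subgenre key above. As the later duplicate key,
    # this entry replaces the Rock text ("Fast, aggressive, DIY ethos.") in the dict.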
    "Punk": (
        "Fast, loud, anti-establishment.\n"
        "DIY ethos and subcultural identity."
    ),
    "Classic_Punk": "Original 1970s punk. Ramones, Sex Pistols, Clash.",
    "Hardcore": "Faster, more aggressive. Black Flag, Minor Threat.",
    "Post_Hardcore": "Experimental evolution. Fugazi, At the Drive-In.",
    "Pop_Punk": "Melodic, accessible. Blink-182, Green Day.",
    "Ska_Punk": "Punk meets ska. Operation Ivy, Reel Big Fish.",
    "Crust": "Anarcho-punk meets metal. Extreme and political.",
    "D_Beat": "Discharge-influenced. Raw, distorted, fast.",
    "Oi": "Working class punk. Street-level anthems.",
    "Anarcho": "Politically charged. Crass, Conflict.",
    "Melodic_Hardcore": "Tuneful hardcore. Descendents, Bad Religion.",
    "Screamo": "Emotional hardcore with screamed vocals.",

    # === ELECTRONIC ===
    "Electronic": (
        "Synthesized and computer-generated music.\n"
        "Organized by tempo, texture, and scene."
    ),
    "Ambient": "Atmospheric, background-compatible. Eno, Aphex Twin.",
    "Breakbeat": "Syncopated, sample-based rhythms.",
    "Downtempo": "Slow, relaxed electronic. Chill-out.",
    "Drum_and_Bass": "Fast breakbeats, heavy bass. Jungle evolution.",
    "House": "Four-on-the-floor dance music. Chicago origins.",
    "IDM": "Intelligent dance music. Experimental, cerebral.",
    "Industrial": "Harsh, mechanical sounds. NIN, Ministry.",
    "Synthwave": "1980s-inspired synthesizer music. Retrowave.",
    "Techno": "Repetitive, driving electronic. Detroit origins.",
    "Trance": "Euphoric, melodic electronic. Build and release.",
    "Dubstep": "Heavy bass drops, half-time rhythms. UK origins.",
    "UK_Garage": "2-step rhythms, London scene. Precursor to grime.",
    "Electro": "Drum machine driven. Distinct from electro-house.",
    "Chillwave": "Lo-fi, nostalgic aesthetic. Washed Out, Toro y Moi.",
    "Future_Bass": "Contemporary style. Flume, Marshmello.",
    "Hardstyle": "Hard dance music. Fast, distorted kicks.",
    "Gabber": "Extreme tempo hardcore techno. Rotterdam.",
    "Trip_Hop": "Downtempo, atmospheric. Portishead, Massive Attack.",
    "Vaporwave": "Internet-era aesthetic. Slowed samples, nostalgia.",
    "Electronica": "Broad electronic listening music.",
    "Chiptune": "Video game sound chip music. 8-bit aesthetics.",
    "Glitch": "Errors as aesthetic. Clicks, cuts, digital artifacts.",

    # === HIP HOP ===
    "Hip_Hop": (
        "Beat-driven vocal music.\n"
        "Organized by era, region, and style."
    ),
    "Boom_Bap": "Classic East Coast production. Sampled drums.",
    "Conscious": "Socially aware, lyric-focused.",
    "Experimental_Hip_Hop": "Boundary-pushing productions and flows.",
    "Gangsta": "Street life narratives. West Coast origins.",
    "Jazz_Rap": "Jazz samples and sensibilities. Tribe, Guru.",
    "Lo_Fi_Hip_Hop": "Low fidelity, relaxed production. Study beats.",
    "Southern": "Southern regional styles. Houston, Atlanta, Memphis.",
    "Trap": "Heavy 808s, hi-hat rolls. Atlanta origins.",
    "Underground": "Non-commercial, independent.",
    "Drill": "Dark, aggressive. Chicago/UK scenes.",
    "Crunk": "Southern party style. Lil Jon energy.",
    "Grime": "UK evolution. Fast, electronic, aggressive. Skepta, Wiley.",
    "Abstract": "Art-focused, experimental. Anti-Pop Consortium.",
    "Cloud_Rap": "Hazy, atmospheric production. Yung Lean.",
    "Horrorcore": "Horror-themed lyrics. Dark imagery.",
    "Chopped_and_Screwed": "Slowed, pitched-down. Houston DJ Screw.",
    "Phonk": "Memphis revival, dark samples, cowbell.",
    "UK_Hip_Hop": "British hip hop tradition.",
    "G_Funk": "West Coast funk samples. Dr. Dre, Snoop.",
    "Hyphy": "Bay Area style. E-40, Mac Dre.",

    # === POP ===
    "Pop": (
        "Popular music crafted for broad appeal.\n"
        "Melody, hooks, and production polish."
    ),
    "Synth_Pop": "Synthesizer-driven pop. Depeche Mode, Pet Shop Boys.",
    "Dance_Pop": "Club-oriented pop. Madonna, Dua Lipa.",
    "Indie_Pop": "Independent aesthetic. Belle and Sebastian, Vampire Weekend.",
    "Art_Pop": "Experimental, conceptual. Kate Bush, Björk.",
    "Dream_Pop": "Ethereal, atmospheric. Cocteau Twins, Beach House.",
    "Electropop": "Electronic production focus. La Roux, Robyn.",
    "Teen_Pop": "Youth-targeted. Manufactured appeal.",
    "Power_Pop": "Guitar-driven, melodic. Big Star, Cheap Trick.",
    "Baroque_Pop": "Orchestral arrangements. Pet Sounds, Sgt. Pepper.",
    "Chamber_Pop": "Classical instrumentation in pop context.",
    "Hyperpop": "Maximalist, digital, extreme. 100 gecs, SOPHIE.",
    "K_Pop": "Korean pop industry. BTS, BLACKPINK.",
    "J_Pop": "Japanese pop tradition.",
    "City_Pop": "1980s Japanese urban pop. Nostalgic revival.",
    "Sophisti_Pop": "Jazzy, polished 1980s pop. Sade, Prefab Sprout.",
    "Jangle_Pop": "Bright, chiming guitars. R.E.M., The Smiths.",

    # === REGGAE ===
    "Reggae": (
        "Jamaican music tradition.\n"
        "Offbeat rhythms, bass-heavy, culturally rich."
    ),
    "Roots": "Rastafari spirituality. Marley, Burning Spear.",
    "Dub": "Studio manipulation, effects. King Tubby, Lee Perry.",
    "Ska": "1960s upbeat predecessor. Skatalites.",
    "Rocksteady": "Slower, soulful transition. 1966-1968.",
    "Dancehall": "Electronic, DJ-focused. Digital riddims.",
    "Lovers_Rock": "Romantic style. UK origin, smooth.",
    "Ragga": "Digital dancehall. Shabba Ranks.",
    "Reggae_Fusion": "Reggae mixed with other genres.",
    "Dub_Poetry": "Spoken word over dub rhythms. Linton Kwesi Johnson.",
    "Digital_Reggae": "Computer-produced reggae. Modern productions.",
    "Reggaeton": "Puerto Rican evolution. Dembow rhythm.",

    # === SOUL/R&B ===
    "Soul_RnB": (
        "African American vocal traditions.\n"
        "Emotion, groove, and expression."
    ),
    "Classic_Soul": "1960s-1970s soul music. Otis Redding, Aretha.",
    "Contemporary_RnB": "Modern R&B production. Beyoncé, The Weeknd.",
    "Funk": "Rhythmic, bass-heavy grooves. James Brown, Parliament.",
    "Gospel": "Religious vocal music. Spiritual roots.",
    "Motown": "Detroit soul sound. Temptations, Supremes.",
    "Neo_Soul": "1990s-2000s soul revival. D'Angelo, Erykah Badu.",
    "Quiet_Storm": "Smooth, romantic R&B. Late-night radio.",
    "New_Jack_Swing": "Hip hop meets R&B. Teddy Riley, Bobby Brown.",
    "Philadelphia_Soul": "Lush orchestration. Gamble and Huff.",
    "Northern_Soul": "UK scene celebrating rare American soul.",
    "Southern_Soul": "Deep soul from the South. Stax, Muscle Shoals.",
    "Psychedelic_Soul": "Mind-expanding soul. Sly Stone, Temptations.",
    "Blue_Eyed_Soul": "White artists in soul tradition. Hall & Oates.",

    # === FOLK ===
    "Folk": (
        "Acoustic, traditional-rooted music.\n"
        "Stories, simplicity, and heritage."
    ),
    "Acoustic": "Unplugged, natural sound.",
    "Americana": "American roots fusion. Alt-country adjacent.",
    "Celtic": "Irish and Scottish traditions.",
    "Contemporary_Folk": "Modern folk sensibilities. Ani DiFranco.",
    "Traditional": "Historical folk songs preserved.",
    "Singer_Songwriter": "Personal, confessional songwriting.",
    "Anti_Folk": "Irreverent, lo-fi. Moldy Peaches, Kimya Dawson.",
    "Neofolk": "Dark, European folk revival. Current 93, Death in June.",
    "Psych_Folk": "Psychedelic folk experimentation.",
    "Freak_Folk": "Eccentric, avant-garde folk. Devendra Banhart.",
    "Indie_Folk": "Independent folk aesthetic. Fleet Foxes, Bon Iver.",

    # === WORLD ===
    "World": (
        "Music from global traditions.\n"
        "Organized by region and style."
    ),
    "African": "Pan-African musical traditions.",
    "Asian": "Asian musical traditions across the continent.",
    "Caribbean": "Caribbean rhythms and styles beyond reggae.",
    "Latin": "Latin American music traditions.",
    "Middle_Eastern": "Middle Eastern and North African music.",
    "Nordic": "Scandinavian and Nordic traditions.",
    "Afrobeat": "Nigerian fusion. Fela Kuti legacy.",
    "Flamenco": "Andalusian guitar and dance tradition.",
    "Fado": "Portuguese melancholy. Lisbon tradition.",
    "Bossa_Nova": "Brazilian jazz fusion. João Gilberto, Tom Jobim.",
    "Samba": "Brazilian rhythm and dance.",
    "Tango": "Argentine dance music tradition.",
    "Klezmer": "Eastern European Jewish music.",
    "Bollywood": "Indian film music industry.",
    "Highlife": "West African popular music. Ghana, Nigeria.",
    "Soukous": "Congolese dance music.",
    "Cumbia": "Colombian/Latin American dance rhythm.",
    "Salsa": "Cuban-derived dance music. NYC evolution.",
    "Merengue": "Dominican dance rhythm.",
    "Bachata": "Dominican romantic guitar music.",

    # === EXPERIMENTAL ===
    "Experimental": (
        "Boundary-pushing sound art.\n"
        "Music that challenges definition."
    ),
    "Avant_Garde": "Deliberately unconventional composition.",
    "Drone": "Sustained tones and textures. Sunn O))), Earth.",
    "Field_Recordings": "Captured environmental sound.",
    "Musique_Concrete": "Composed from recorded sounds. Pierre Schaeffer.",
    "Noise": "Intentionally harsh and abrasive. Merzbow.",
    "Sound_Collage": "Assembled from disparate sources.",
    "Electroacoustic": "Electronic and acoustic fusion. Academic tradition.",
    "Minimalism": "Repetitive, gradual processes. Reich, Glass, Riley.",
    "Spectral": "Composition based on acoustic properties of sound.",
    "Free_Improvisation": "Spontaneous creation without structure.",
    "Dark_Ambient": "Ominous atmospheric soundscapes.",
    "Power_Electronics": "Extreme, confrontational noise.",

    # === NEW AGE ===
    "New_Age": (
        "Relaxation and spiritual music.\n"
        "Ambient, healing, and contemplative."
    ),
    "Meditation": "Music for meditative practice.",
    "Healing": "Therapeutic and wellness-oriented.",
    "Space": "Cosmic and celestial themes. Vangelis, Tangerine Dream.",
    "Nature_Sounds": "Natural soundscapes. Birds, water, wind.",
    "Relaxation": "General relaxation and stress relief.",
    "Neoclassical": "Modern classical-influenced. Ludovico Einaudi.",

    # === SOUNDTRACK ===
    "Soundtrack": (
        "Music composed for visual media.\n"
        "Organized by medium and purpose."
    ),
    "Film": "Film scores and soundtracks. Williams, Zimmer.",
    "Television": "TV scores and themes.",
    "Video_Game": "Video game music. Interactive and adaptive.",
    "Anime": "Anime soundtracks and themes.",
    "Musical_Theater": "Broadway and stage musicals.",
    "Library_Music": "Production music for licensing.",
}

# ------------------------------------------------------------
# Build Music taxonomy
# ------------------------------------------------------------
for level2, level3_list in MUSIC_TAXONOMY.items():
    level2_path = music_root / level2
    ensure_dir_with_readme(level2_path, MUSIC_READMES.get(level2))

    for level3 in level3_list:
        level3_path = level2_path / level3
        ensure_dir_with_readme(level3_path, MUSIC_READMES.get(level3))

print(f"Music: {len(MUSIC_TAXONOMY)} genres, {sum(len(v) for v in MUSIC_TAXONOMY.values())} subgenres")
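
The build loop above looks up README text by bare folder name (`MUSIC_READMES.get(level3)`), so a name reused under two genres would receive the same description in both places. A minimal sketch of a check for such shared names follows; `find_shared_names` and the sample data are hypothetical and are not part of the notebook.

```python
from collections import defaultdict


def find_shared_names(taxonomy: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each folder name used more than once to the parents it appears under.

    Level-2 names are recorded under the pseudo-parent "<root>" so that a
    genre and a subgenre sharing a name are reported too.
    """
    parents = defaultdict(list)
    for level2, level3_list in taxonomy.items():
        parents[level2].append("<root>")
        for level3 in level3_list:
            parents[level3].append(level2)
    return {name: where for name, where in parents.items() if len(where) > 1}


# Hypothetical sample data, not the real MUSIC_TAXONOMY.
sample = {
    "Classical": ["Baroque", "Minimalism"],
    "Experimental": ["Drone", "Minimalism"],
}
print(find_shared_names(sample))  # {'Minimalism': ['Classical', 'Experimental']}
```

Running it against `MUSIC_TAXONOMY` before the build would show which names, if any, need distinct descriptions or a rename.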

### Notes

In [None]:
notes_root = ARCHIVE_ROOT / "Notes"

# ------------------------------------------------------------
# Notes taxonomy: short-form thinking by domain
# ------------------------------------------------------------
NOTES_TAXONOMY = {
    "Research": [
        "Literature",
        "Technical",
        "Observations",
        "Questions",
    ],
    "Ideas": [
        "Projects",
        "Writing",
        "Creative",
        "Business",
    ],
    "Learning": [
        "Courses",
        "Books",
        "Tutorials",
        "Concepts",
    ],
    "Work": [
        "Meetings",
        "Planning",
        "Reviews",
        "Decisions",
    ],
    "Personal": [
        "Goals",
        "Reflections",
        "Health",
        "Relationships",
    ],
    "Reference": [
        "Cheatsheets",
        "Procedures",
        "Contacts",
        "Lists",
    ],
}

# ------------------------------------------------------------
# Notes READMEs
# ------------------------------------------------------------
NOTES_READMES = {
    "Research": (
        "Notes from research and investigation.\n"
        "Raw material gathered during inquiry."
    ),
    "Literature": "Notes from reading papers and articles.",
    "Technical": "Technical research and findings.",
    "Observations": "Recorded observations and data points.",
    "Questions": "Open questions awaiting answers.",
    "Ideas": (
        "Seeds and sparks of possibility.\n"
        "Thoughts worth capturing before they fade."
    ),
    "Projects": "Project ideas and proposals.",
    "Writing": "Writing ideas and fragments.",
    "Creative": "Creative concepts and inspirations.",
    "Business": "Business ideas and opportunities.",
    "Learning": (
        "Notes from learning activities.\n"
        "What was absorbed, summarized, or captured."
    ),
    "Courses": "Notes from courses and classes.",
    "Books": "Notes from book reading.",
    "Tutorials": "Notes from tutorials and guides.",
    "Concepts": "Concept explanations and mental models.",
    "Work": (
        "Work-related notes.\n"
        "Professional context and decisions."
    ),
    "Meetings": "Meeting notes and minutes.",
    "Planning": "Planning notes and roadmaps.",
    "Reviews": "Review notes and feedback.",
    "Decisions": "Decision logs and rationale.",
    "Personal": (
        "Personal notes and reflections.\n"
        "Private thinking not for external use."
    ),
    "Goals": "Goal setting and tracking notes.",
    "Reflections": "Personal reflections and insights.",
    "Health": "Health-related notes and tracking.",
    "Relationships": "Notes about relationships and people.",
    "Reference": (
        "Quick reference materials.\n"
        "Information kept for easy lookup."
    ),
    "Cheatsheets": "Quick reference cheatsheets.",
    "Procedures": "Step-by-step procedures.",
    "Contacts": "Contact information and details.",
    "Lists": "Lists and inventories.",
}

# ------------------------------------------------------------
# Build Notes taxonomy
# ------------------------------------------------------------
for level2, level3_list in NOTES_TAXONOMY.items():
    level2_path = notes_root / level2
    ensure_dir_with_readme(level2_path, NOTES_READMES.get(level2))

    for level3 in level3_list:
        level3_path = level2_path / level3
        ensure_dir_with_readme(level3_path, NOTES_READMES.get(level3))

print(f"Notes: {len(NOTES_TAXONOMY)} categories, {sum(len(v) for v in NOTES_TAXONOMY.values())} subcategories")
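
Because the loop uses `.get()`, a folder whose name is missing from `NOTES_READMES` is passed `None` rather than raising an error; how `ensure_dir_with_readme` handles `None` is not shown in this section. The sketch below lists folders without README text so gaps can be caught before building. `missing_readmes` is a hypothetical helper and the sample dictionaries are illustrative only.

```python
def missing_readmes(taxonomy: dict[str, list[str]], readmes: dict[str, str]) -> list[str]:
    """Return the folders (as "Parent/Child" or "Parent") that have no README text."""
    missing = []
    for level2, level3_list in taxonomy.items():
        if level2 not in readmes:
            missing.append(level2)
        for level3 in level3_list:
            if level3 not in readmes:
                missing.append(f"{level2}/{level3}")
    return missing


# Illustrative data: "Questions" has no description.
sample_taxonomy = {"Research": ["Literature", "Questions"]}
sample_readmes = {
    "Research": "Notes from research and investigation.",
    "Literature": "Notes from reading papers and articles.",
}
print(missing_readmes(sample_taxonomy, sample_readmes))  # ['Research/Questions']
```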

### Projects

In [None]:
projects_root = ARCHIVE_ROOT / "Projects"

# ------------------------------------------------------------
# Projects taxonomy: active workspaces by domain
# "Messy by design. Completed outputs should be promoted elsewhere."
# ------------------------------------------------------------
PROJECTS_TAXONOMY = {
    "Software": [
        "Active",
        "Paused",
        "Archived",
    ],
    "Writing": [
        "Active",
        "Paused",
        "Archived",
    ],
    "Creative": [
        "Active",
        "Paused",
        "Archived",
    ],
    "Research": [
        "Active",
        "Paused",
        "Archived",
    ],
    "Learning": [
        "Active",
        "Paused",
        "Archived",
    ],
    "Personal": [
        "Active",
        "Paused",
        "Archived",
    ],
}

# ------------------------------------------------------------
# Projects READMEs
# ------------------------------------------------------------
PROJECTS_READMES = {
    "Software": (
        "Software development projects.\n"
        "Code-heavy work with build artifacts and dependencies."
    ),
    "Writing": (
        "Writing projects.\n"
        "Long-form writing with drafts, research, and revisions."
    ),
    "Creative": (
        "Creative projects.\n"
        "Art, music, video, and other creative work."
    ),
    "Research": (
        "Research projects.\n"
        "Investigations, experiments, and exploratory work."
    ),
    "Learning": (
        "Learning projects.\n"
        "Structured learning efforts and skill development."
    ),
    "Personal": (
        "Personal projects.\n"
        "Life projects, organization, and self-improvement."
    ),
    "Active": (
        "Currently active projects.\n"
        "Projects receiving regular attention and effort."
    ),
    "Paused": (
        "Temporarily paused projects.\n"
        "Projects on hold but intended to resume."
    ),
    "Archived": (
        "Completed or abandoned projects.\n"
        "Projects no longer active. Outputs should be promoted to the archive."
    ),
}

# ------------------------------------------------------------
# Build Projects taxonomy
# ------------------------------------------------------------
for level2, level3_list in PROJECTS_TAXONOMY.items():
    level2_path = projects_root / level2
    ensure_dir_with_readme(level2_path, PROJECTS_READMES.get(level2))

    for level3 in level3_list:
        level3_path = level2_path / level3
        ensure_dir_with_readme(level3_path, PROJECTS_READMES.get(level3))

print(f"Projects: {len(PROJECTS_TAXONOMY)} domains, {sum(len(v) for v in PROJECTS_TAXONOMY.values())} state folders")
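
Every project domain carries the same three states, so the dictionary could also be generated rather than written out by hand. The following is a sketch of that alternative, not the notebook's own code; the names `PROJECT_DOMAINS` and `PROJECT_STATES` are assumptions introduced for illustration.

```python
# Hypothetical names; the values mirror the literal PROJECTS_TAXONOMY above.
PROJECT_DOMAINS = ["Software", "Writing", "Creative", "Research", "Learning", "Personal"]
PROJECT_STATES = ["Active", "Paused", "Archived"]

# list(...) gives every domain its own copy, so editing one domain's
# states later does not silently change the others.
projects_taxonomy = {domain: list(PROJECT_STATES) for domain in PROJECT_DOMAINS}

print(len(projects_taxonomy), sum(len(v) for v in projects_taxonomy.values()))  # 6 18
```

The explicit literal is easier to scan; the generated form is easier to keep consistent if a state is ever added.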

### Systems

In [None]:
systems_root = ARCHIVE_ROOT / "Systems"

# ------------------------------------------------------------
# Systems taxonomy: system-level artifacts and backups
# ------------------------------------------------------------
SYSTEMS_TAXONOMY = {
    "Backups": [
        "Full_System",
        "Home_Directory",
        "Application_Data",
        "Cloud_Exports",
    ],
    "Disk_Images": [
        "Boot_Drives",
        "Virtual_Machines",
        "Recovery",
    ],
    "Installers": [
        "Operating_Systems",
        "Applications",
        "Drivers",
        "Firmware",
    ],
    "Configuration": [
        "Dotfiles",
        "System_Settings",
        "Application_Configs",
        "Scripts",
    ],
    "Logs": [
        "System",
        "Security",
        "Application",
    ],
}

# ------------------------------------------------------------
# Systems READMEs
# ------------------------------------------------------------
SYSTEMS_READMES = {
    "Backups": (
        "Backup archives and snapshots.\n"
        "Point-in-time copies of systems and data."
    ),
    "Full_System": "Complete system backups and images.",
    "Home_Directory": "User home directory backups.",
    "Application_Data": "Application-specific data backups.",
    "Cloud_Exports": "Exports from cloud services.",
    "Disk_Images": (
        "Disk and partition images.\n"
        "Bootable and restorable disk captures."
    ),
    "Boot_Drives": "Bootable system images.",
    "Virtual_Machines": "Virtual machine images and snapshots.",
    "Recovery": "Recovery images and rescue media.",
    "Installers": (
        "Installation media and packages.\n"
        "Software needed for system setup."
    ),
    "Operating_Systems": "OS installation media and ISOs.",
    "Applications": "Application installers and packages.",
    "Drivers": "Device drivers and hardware support.",
    "Firmware": "Firmware updates and BIOS images.",
    "Configuration": (
        "System and application configuration.\n"
        "Settings, dotfiles, and setup scripts."
    ),
    "Dotfiles": "User configuration dotfiles.",
    "System_Settings": "System-level configuration.",
    "Application_Configs": "Application configuration files.",
    "Scripts": "Setup and configuration scripts.",
    "Logs": (
        "Archived system logs.\n"
        "Historical logs preserved for reference."
    ),
    "System": "Operating system logs.",
    "Security": "Security and audit logs.",
    "Application": "Application-level logs.",
}

# ------------------------------------------------------------
# Build Systems taxonomy
# ------------------------------------------------------------
for level2, level3_list in SYSTEMS_TAXONOMY.items():
    level2_path = systems_root / level2
    ensure_dir_with_readme(level2_path, SYSTEMS_READMES.get(level2))

    for level3 in level3_list:
        level3_path = level2_path / level3
        ensure_dir_with_readme(level3_path, SYSTEMS_READMES.get(level3))

print(f"Systems: {len(SYSTEMS_TAXONOMY)} categories, {sum(len(v) for v in SYSTEMS_TAXONOMY.values())} subcategories")
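
The same two-level build loop is repeated for each category. One possible way to factor it out is sketched below. `ensure_dir_with_readme` is defined elsewhere in the notebook, so the sketch uses its own stand-in, `write_dir_with_readme`, whose behavior (create the folder, write `README.md` only when text is given) is an assumption; `build_taxonomy` is likewise a hypothetical name.

```python
import tempfile
from pathlib import Path
from typing import Optional


def write_dir_with_readme(path: Path, text: Optional[str]) -> None:
    """Stand-in for ensure_dir_with_readme (assumed behavior, not the notebook's code)."""
    path.mkdir(parents=True, exist_ok=True)
    if text is not None:
        (path / "README.md").write_text(text + "\n", encoding="utf-8")


def build_taxonomy(root: Path, taxonomy: dict[str, list[str]], readmes: dict[str, str]) -> int:
    """Create root/level2/level3 folders and return how many folders were visited."""
    visited = 0
    for level2, level3_list in taxonomy.items():
        write_dir_with_readme(root / level2, readmes.get(level2))
        visited += 1
        for level3 in level3_list:
            write_dir_with_readme(root / level2 / level3, readmes.get(level3))
            visited += 1
    return visited


# Try it in a throwaway directory rather than the real archive.
with tempfile.TemporaryDirectory() as tmp:
    n = build_taxonomy(
        Path(tmp),
        {"Logs": ["System", "Security"]},
        {"Logs": "Archived system logs."},
    )
    print(n, sorted(p.relative_to(tmp).as_posix() for p in Path(tmp).rglob("*")))
```

With a helper like this, each category cell would reduce to a single call plus its summary `print`.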

### Video

In [None]:
video_root = ARCHIVE_ROOT / "Video"

# ------------------------------------------------------------
# Video taxonomy: non-cinematic moving images
# Distinct from Movies - this is for non-film video content
# ------------------------------------------------------------
VIDEO_TAXONOMY = {
    "Personal": [
        "Family",
        "Events",
        "Travel",
        "Memories",
    ],
    "Educational": [
        "Lectures",
        "Tutorials",
        "Courses",
        "Talks",
    ],
    "Screen_Recordings": [
        "Demos",
        "Walkthroughs",
        "Debugging",
        "Presentations",
    ],
    "Creative": [
        "Projects",
        "Experiments",
        "Raw_Footage",
        "Exports",
    ],
    "Clips": [
        "Saved",
        "Memes",
        "Reference",
        "Inspiration",
    ],
    "Streams": [
        "Archived",
        "Highlights",
        "VODs",
    ],
}

# ------------------------------------------------------------
# Video READMEs
# ------------------------------------------------------------
VIDEO_READMES = {
    "Personal": (
        "Personal video recordings.\n"
        "Life moments captured on video."
    ),
    "Family": "Family videos and home movies.",
    "Events": "Videos from events and gatherings.",
    "Travel": "Travel videos and location footage.",
    "Memories": "Personal memory captures.",
    "Educational": (
        "Learning and educational videos.\n"
        "Videos watched or saved for learning."
    ),
    "Lectures": "Academic and educational lectures.",
    "Tutorials": "How-to and tutorial videos.",
    "Courses": "Videos from online courses.",
    "Talks": "Conference talks and presentations.",
    "Screen_Recordings": (
        "Captured screen activity.\n"
        "Recordings of computer or device screens."
    ),
    "Demos": "Demonstration recordings.",
    "Walkthroughs": "Step-by-step walkthrough recordings.",
    "Debugging": "Debugging and troubleshooting recordings.",
    "Presentations": "Recorded presentations and slideshows.",
    "Creative": (
        "Creative video work.\n"
        "Video projects and raw materials."
    ),
    "Projects": "Video project files and edits.",
    "Experiments": "Experimental video work.",
    "Raw_Footage": "Unedited raw video footage.",
    "Exports": "Final exported video files.",
    "Clips": (
        "Short video clips.\n"
        "Saved clips from various sources."
    ),
    "Saved": "Clips saved for later viewing.",
    "Memes": "Meme videos and funny clips.",
    "Reference": "Reference clips for creative work.",
    "Inspiration": "Inspirational video clips.",
    "Streams": (
        "Streamed content.\n"
        "Live stream recordings and archives."
    ),
    "Archived": "Full archived streams.",
    "Highlights": "Stream highlights and clips.",
    "VODs": "Video-on-demand recordings.",
}

# ------------------------------------------------------------
# Build Video taxonomy
# ------------------------------------------------------------
for level2, level3_list in VIDEO_TAXONOMY.items():
    level2_path = video_root / level2
    ensure_dir_with_readme(level2_path, VIDEO_READMES.get(level2))

    for level3 in level3_list:
        level3_path = level2_path / level3
        ensure_dir_with_readme(level3_path, VIDEO_READMES.get(level3))

print(f"Video: {len(VIDEO_TAXONOMY)} categories, {sum(len(v) for v in VIDEO_TAXONOMY.values())} subcategories")
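
Folder names in this section appear to follow one convention: ASCII letters and digits, with single underscores between words (`Screen_Recordings`, `Raw_Footage`, `VODs`). A small sketch of a check for that convention is shown below. The pattern is inferred from the visible names only, and `invalid_folder_names` is a hypothetical helper.

```python
import re

# Assumed convention, inferred from the names visible in this notebook section:
# ASCII letters and digits, with single underscores between words.
FOLDER_NAME = re.compile(r"[A-Za-z0-9]+(?:_[A-Za-z0-9]+)*")


def invalid_folder_names(taxonomy: dict[str, list[str]]) -> list[str]:
    """Return every level-2 or level-3 name that breaks the assumed convention."""
    names = list(taxonomy) + [n for children in taxonomy.values() for n in children]
    return [n for n in names if not FOLDER_NAME.fullmatch(n)]


# "Raw Footage" contains a space, so it is the only name reported.
print(invalid_folder_names({"Screen_Recordings": ["Demos", "Raw Footage", "VODs"]}))
```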

## Summary

In [None]:
# ------------------------------------------------------------
# Taxonomy Summary
# ------------------------------------------------------------

def count_dirs(path: Path) -> tuple[int, int]:
    """Count directories and README files under a path."""
    dirs = 0
    readmes = 0
    for item in path.rglob("*"):
        if item.is_dir():
            dirs += 1
        elif item.name == "README.md":
            readmes += 1
    return dirs, readmes

def print_tree(path: Path, prefix: str = "", max_depth: int = 2, current_depth: int = 0) -> None:
    """Print the directory tree under path.

    Depth is zero-based: max_depth=2 prints three levels of subdirectories.
    """
    if current_depth > max_depth:
        return
    
    items = sorted([p for p in path.iterdir() if p.is_dir()])
    for i, item in enumerate(items):
        is_last = i == len(items) - 1
        connector = "└── " if is_last else "├── "
        print(f"{prefix}{connector}{item.name}/")
        
        if current_depth < max_depth:
            extension = "    " if is_last else "│   "
            print_tree(item, prefix + extension, max_depth, current_depth + 1)

# Count totals
total_dirs, total_readmes = count_dirs(ARCHIVE_ROOT)

print("=" * 60)
print("ARCHIVE TAXONOMY SUMMARY")
print("=" * 60)
print(f"\nArchive root: {ARCHIVE_ROOT}")
print(f"Total directories: {total_dirs}")
print(f"Total README files: {total_readmes}")
print(f"\nLevel 1 categories: {len(LEVEL_1)}")
print("-" * 40)

# Per-category stats
for name in LEVEL_1:
    cat_path = ARCHIVE_ROOT / name
    if cat_path.exists():
        dirs, _ = count_dirs(cat_path)
        print(f"  {name}: {dirs} subdirectories")

print("\n" + "=" * 60)
print("TAXONOMY TREE (depth=2)")
print("=" * 60)
print(f"\n{ARCHIVE_ROOT.name}/")
print_tree(ARCHIVE_ROOT, max_depth=2)
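
`print_tree` reads the filesystem, so it only works after the folders exist. As a rough sketch, the same layout can be previewed straight from a two-level taxonomy dictionary without touching disk. `render_tree` is a hypothetical helper; it sorts names alphabetically and reuses the connector strings from `print_tree`.

```python
def render_tree(taxonomy: dict[str, list[str]]) -> list[str]:
    """Render a two-level taxonomy dict as tree lines, sorted by name."""
    lines = []
    parents = sorted(taxonomy)
    for i, parent in enumerate(parents):
        last_parent = i == len(parents) - 1
        lines.append(("└── " if last_parent else "├── ") + f"{parent}/")
        # Children of the last parent get blank indentation, others a rail.
        indent = "    " if last_parent else "│   "
        children = sorted(taxonomy[parent])
        for j, child in enumerate(children):
            last_child = j == len(children) - 1
            lines.append(indent + ("└── " if last_child else "├── ") + f"{child}/")
    return lines


print("\n".join(render_tree({"Streams": ["VODs", "Archived"], "Clips": ["Saved"]})))
```

Returning lines instead of printing them keeps the function easy to test and lets the caller decide where the preview goes.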

## Export Taxonomy as JSON

In [None]:
import json

# ------------------------------------------------------------
# Combine all taxonomies into a single structure
# ------------------------------------------------------------

COMPLETE_TAXONOMY = {
    "meta": {
        "version": "1.0",
        "description": "Complete archive taxonomy with structure and descriptions",
        "level_1_categories": LEVEL_1,
    },
    "categories": {
        "Applications": {
            "description": LEVEL_1_EXPLAINERS["Applications"],
            "structure": APPLICATIONS,
            "descriptions": {
                "tasks": APPLICATION_TASK_EXPLAINERS,
                "apps": APPLICATION_EXPLAINERS,
            },
        },
        "Audio": {
            "description": LEVEL_1_EXPLAINERS["Audio"],
            "structure": AUDIO_LEVEL_2,
            "descriptions": AUDIO_EXPLAINERS,
        },
        "Books": {
            "description": LEVEL_1_EXPLAINERS["Books"],
            "structure": BOOKS_TAXONOMY,
            "descriptions": BOOKS_READMES,
        },
        "Code": {
            "description": LEVEL_1_EXPLAINERS["Code"],
            "structure": CODE_TAXONOMY,
            "descriptions": CODE_READMES,
        },
        "Compressed": {
            "description": LEVEL_1_EXPLAINERS["Compressed"],
            "structure": COMPRESSED_TAXONOMY,
            "descriptions": COMPRESSED_READMES,
        },
        "Data": {
            "description": LEVEL_1_EXPLAINERS["Data"],
            "structure": DATA_TAXONOMY,
            "descriptions": DATA_READMES,
        },
        "Docs": {
            "description": LEVEL_1_EXPLAINERS["Docs"],
            "structure": DOCS_TAXONOMY,
            "descriptions": DOCS_READMES,
        },
        "Images": {
            "description": LEVEL_1_EXPLAINERS["Images"],
            "structure": IMAGES_TAXONOMY,
            "descriptions": IMAGES_READMES,
        },
        "Intake": {
            "description": LEVEL_1_EXPLAINERS["Intake"],
            "structure": INTAKE_TAXONOMY,
            "descriptions": INTAKE_READMES,
        },
        "Journal": {
            "description": LEVEL_1_EXPLAINERS["Journal"],
            "structure": JOURNAL_TAXONOMY,
            "descriptions": JOURNAL_READMES,
        },
        "Movies": {
            "description": LEVEL_1_EXPLAINERS["Movies"],
            "structure": MOVIES_TAXONOMY,
            "descriptions": MOVIES_READMES,
        },
        "Music": {
            "description": LEVEL_1_EXPLAINERS["Music"],
            "structure": MUSIC_TAXONOMY,
            "descriptions": MUSIC_READMES,
        },
        "Notes": {
            "description": LEVEL_1_EXPLAINERS["Notes"],
            "structure": NOTES_TAXONOMY,
            "descriptions": NOTES_READMES,
        },
        "Projects": {
            "description": LEVEL_1_EXPLAINERS["Projects"],
            "structure": PROJECTS_TAXONOMY,
            "descriptions": PROJECTS_READMES,
        },
        "Systems": {
            "description": LEVEL_1_EXPLAINERS["Systems"],
            "structure": SYSTEMS_TAXONOMY,
            "descriptions": SYSTEMS_READMES,
        },
        "Video": {
            "description": LEVEL_1_EXPLAINERS["Video"],
            "structure": VIDEO_TAXONOMY,
            "descriptions": VIDEO_READMES,
        },
    },
}

# ------------------------------------------------------------
# Save to JSON
# ------------------------------------------------------------
taxonomy_path = Path.cwd() / "taxonomy.json"

with open(taxonomy_path, "w", encoding="utf-8") as f:
    json.dump(COMPLETE_TAXONOMY, f, indent=2, ensure_ascii=False)

print(f"Taxonomy exported to: {taxonomy_path}")
print(f"File size: {taxonomy_path.stat().st_size / 1024:.1f} KB")

# Quick stats: count every level-2 key plus every level-3 entry.
# Values that are not lists (e.g. nested dicts) count as a single entry.
total_subcategories = 0
for cat_data in COMPLETE_TAXONOMY["categories"].values():
    struct = cat_data["structure"]
    if isinstance(struct, dict):
        count = sum(len(v) if isinstance(v, list) else 1 for v in struct.values())
        total_subcategories += len(struct) + count
    elif isinstance(struct, list):
        total_subcategories += len(struct)

print(f"Total categories: {len(COMPLETE_TAXONOMY['categories'])}")
print(f"Total subcategories: {total_subcategories}")
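
A quick way to confirm the export is faithful is to read the file back and compare it with the in-memory structure. The sketch below does this on a small sample in a temporary directory; `roundtrip_matches` is a hypothetical helper, and it uses the same `json.dump` options as the export cell.

```python
import json
import tempfile
from pathlib import Path


def roundtrip_matches(data: dict, path: Path) -> bool:
    """Write data as JSON, read it back, and report whether nothing changed."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=2, ensure_ascii=False)
    with open(path, encoding="utf-8") as f:
        return json.load(f) == data


with tempfile.TemporaryDirectory() as tmp:
    sample = {"Soul_RnB": ["Classic_Soul"], "note": "Beyoncé"}
    print(roundtrip_matches(sample, Path(tmp) / "taxonomy.json"))  # True
```

JSON has no tuple type, so a tuple anywhere in the structure comes back as a list and the comparison returns `False`. The parenthesized README strings in this notebook are ordinary concatenated strings, not tuples, so they are unaffected.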