Before opening this repository, disable all C# extensions (C# Dev Kit, OmniSharp, etc.). With 161K files and 3.8M lines of code, active C# extensions will attempt to load the entire symbol tree and run language analysis on it, which will make your machine very unhappy.
To disable: Extensions → Search "C#" → Click ⚙️ → Disable
A C# taxonomy codebase with 160,000+ species files — built to demonstrate GitHub Copilot on large repositories.
Large repos challenge AI coding assistants. This one tests how well Copilot handles:
- Massive file counts (161K files!)
- Deep folder hierarchies (taxonomic ranks)
- Consistent but varied code patterns
The key insight: Copilot customizations matter more at scale. See .github/copilot-instructions.md for the repo-specific context that makes Copilot effective here.
| Language | Files | Lines | Code | Comments | Blanks |
|---|---|---|---|---|---|
| C# | 146,082 | 3,587,144 | 1,390,952 | 1,457,176 | 739,016 |
| MSBuild | 15,579 | 218,103 | 218,103 | 0 | 0 |
| Markdown | 2 | 162 | 122 | 0 | 40 |
| Total | 161,663 | 3,805,409 | 1,609,177 | 1,457,176 | 739,056 |
📦 123.6 MB of source code
💰 Estimated Cost to Develop: $62.9M • ⏱️ Schedule: 66 months • 👥 Team: 84 people
Metrics via scc, using the COCOMO model
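The cost, schedule, and team figures can be reproduced from the Code total in the table. The sketch below applies the basic COCOMO "organic" formulas; the coefficients (2.4, 1.05, 2.5, 0.38), the average annual wage of 56286, and the overhead factor of 2.4 are assumptions believed to match scc's defaults, not values stated in this README.

```shell
#!/usr/bin/env bash
# Basic COCOMO ("organic" project) applied to a source-line count.
# Assumed parameters: a=2.4, b=1.05, c=2.5, d=0.38, average annual wage 56286,
# overhead factor 2.4 (believed to be scc's defaults; not taken from this repo).
cocomo_estimate() {
  local sloc="$1"
  awk -v sloc="$sloc" 'BEGIN {
    kloc   = sloc / 1000
    effort = 2.4 * exp(1.05 * log(kloc))     # person-months
    months = 2.5 * exp(0.38 * log(effort))   # schedule in months
    people = effort / months                 # average team size
    cost   = effort * (56286 / 12) * 2.4     # monthly wage times overhead
    printf "cost_musd=%.1f months=%.0f people=%.0f\n", cost / 1e6, months, people
  }'
}

# Code column total from the table above
cocomo_estimate 1609177
```

Under these assumptions the script prints `cost_musd=62.9 months=66 people=84`, which agrees with the figures quoted above.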
root/Metazoa/Chordata/Mammalia/Carnivora/Canidae/Canis/
├── Canis.cs # Abstract genus class
├── ICanis.cs # Genus interface
├── Canis_lupus.cs # Species (Wolf)
└── Canis_latrans.cs # Species (Coyote)
The folder hierarchy mirrors the biological taxonomic ranks exactly. Rank information is therefore encoded in the filesystem itself, so Copilot does not have to guess or look up where a taxon sits in the hierarchy.
Given any file path, Copilot can deterministically derive:
- The taxonomic rank (folder depth: Kingdom → Phylum → Class → Order → Family → Genus → Species)
- The namespace (path converted to dotted notation)
- The parent class (folder one level up)
- Sibling genera and families (folder neighbors)
For example, from root/Metazoa/Chordata/Mammalia/Carnivora/Canidae/Canis/Canis_lupus.cs:
| Knowledge | Derivation |
|---|---|
| Rank | 7 levels deep = Species |
| Parent | Folder ../ = Canis class |
| Family | Folder ../../ = Canidae |
| Order | Folder ../../../ = Carnivora |
| Namespace | Path with / → . = AnimalKingdom.root.Metazoa.Chordata.Mammalia.Carnivora.Canidae.Canis |
This follows the DRY principle: each hierarchical relationship is stored in exactly one place, the path, so no separate metadata lookup is needed and filesystem traversal is enough. Navigation stays predictable, and Copilot can reliably construct paths, derive inheritance, and understand scope without exploratory searches.
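The derivation in the table can be sketched as a small shell function. `describe_path` is a hypothetical helper, not something shipped in the repo; it assumes the layout shown above (a literal `root` folder, then one folder per rank down to the genus) and the `AnimalKingdom.` namespace prefix from the table.

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive taxonomy facts from a species file path.
# Assumes root/<Kingdom>/<Phylum>/<Class>/<Order>/<Family>/<Genus>/<file>.cs
describe_path() {
  local path="$1"
  local dir="${path%/*}"               # folder part, e.g. root/.../Canis
  local file="${path##*/}"             # file name, e.g. Canis_lupus.cs
  local -a ranks=(Kingdom Phylum Class Order Family Genus)
  local -a parts
  IFS='/' read -r -a parts <<< "$dir"  # parts[0] is the literal "root" folder

  # Namespace: folder path with "/" replaced by "."
  echo "Namespace: AnimalKingdom.${dir//\//.}"

  # Each folder below "root" corresponds to one rank, in order
  local i
  for i in "${!ranks[@]}"; do
    echo "${ranks[$i]}: ${parts[$((i + 1))]}"
  done
  echo "Species: ${file%.cs}"
}

describe_path root/Metazoa/Chordata/Mammalia/Carnivora/Canidae/Canis/Canis_lupus.cs
```

No file is opened at any point; everything comes from string operations on the path.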
| Customization | Purpose |
|---|---|
| copilot-instructions.md | Explains structure, file patterns, key fields |
| Consistent naming | Genus_species.cs pattern aids completion |
| XML doc comments | Rich context for Copilot to reference |
| IsEnriched field | Distinguishes stubs from real data |
Instructions provide file-type-specific guidance for editing and maintaining the codebase.
| Instruction | Applies To | Purpose |
|---|---|---|
| Breadcrumb Instructions | `**/breadcrumb.md` | YAML metadata structure, navigation patterns, taxonomy-level field conventions |
| C# File Instructions | `**/*.cs` | Namespace conventions, file type patterns, inheritance hierarchy, common patterns |
| Interface Instructions | `**/I[A-Z]*.cs` | Interface contracts, behavioral patterns, genus/family interface design |
| Species Instructions | `**/*_*.cs` | Species file structure, property definitions, enrichment flags, conservation status |
Usage: When editing a file, check the applicable instruction for conventions, patterns, and required fields.
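Because the patterns overlap (every species and interface file is also a `.cs` file), more than one instruction can apply to a single file. The sketch below shows one way to check this by hand. `applicable_instructions` is a hypothetical helper that matches only the file name, on the assumption that the leading `**/` means "in any folder"; it is not how Copilot itself resolves instructions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list which instruction sets apply to a file,
# using the glob patterns from the table above.
applicable_instructions() {
  local name="${1##*/}"   # patterns start with **/, so only the base name matters
  [[ $name == breadcrumb.md ]]        && echo "Breadcrumb Instructions"
  [[ $name == *.cs ]]                 && echo "C# File Instructions"
  # [[:upper:]] stands in for [A-Z] so the match does not depend on locale
  [[ $name == I[[:upper:]]*.cs ]]     && echo "Interface Instructions"
  [[ $name == *_*.cs ]]               && echo "Species Instructions"
  return 0
}

applicable_instructions root/Metazoa/Chordata/Mammalia/Carnivora/Canidae/Canis/ICanis.cs
applicable_instructions root/Metazoa/Chordata/Mammalia/Carnivora/Canidae/Canis/Canis_lupus.cs
```

For `ICanis.cs` this lists the C# file and interface instructions; for `Canis_lupus.cs` it lists the C# file and species instructions.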
Skills are task-focused utilities for solving specific problems within the repository.
| Skill | Purpose | When to Use |
|---|---|---|
| Pet Lookup | Find species commonly kept as pets | Searching for domestic/pet animals, comparing amenability-to-captivity across families |
| Species Lookup | Find specific species by name, common name, or TaxId | Locating individual species, checking properties like conservation status or lifespan |
| Interface Validation | Validate that species and genus classes implement required interfaces | Ensuring interface compliance across large taxa, bulk validation tasks |
| Breadcrumb Traversal | Use breadcrumb metadata for efficient navigation | Finding related taxa, cross-cutting queries, avoiding deep file scans |
| Breadcrumb Creation | Generate and maintain breadcrumb metadata | Creating new taxa, updating taxonomy levels, aggregating species data |
Usage: When facing a repository task, check the relevant skill for the recommended approach and query strategies.
This repository includes two production-grade harnesses for measuring and benchmarking Copilot behavior on large codebases:
| Harness | Technology | Best For |
|---|---|---|
| CLI-Based | PowerShell + Copilot CLI | Cross-platform, detailed resource metrics |
| SDK-Based | Node.js + Copilot SDK | Direct programmatic access, lower latency |
Both harnesses:
- Execute identical scenarios across your codebase
- Collect identical metrics (tool calls, file access, tokens, execution time)
- Track whether breadcrumbs were used
- Measure Copilot's efficiency navigating your repository
- Generate comparable JSON/CSV results
See copilot-harness.md for detailed setup, architecture comparison, and how to run benchmarks. If you've already run the harnesses, check the comparison guide and results analysis.
See Copilot navigate 160K files in seconds!
This repo includes a pet lookup feature that demonstrates breadcrumb-based navigation. Instead of scanning thousands of files, Copilot uses metadata tags to instantly find pet species:
# Find all pet-containing taxa in one command
grep "has-pets" root/**/breadcrumb.md

Result: 14 breadcrumbs tagged, covering dogs, cats, hamsters, rabbits, guinea pigs, chinchillas, ferrets, goldfish, and budgerigars.
| Query | Method | Time |
|---|---|---|
| "Find pet mammals" | grep "has-pets" | <1s |
| "Is Felis catus a pet?" | Read genus breadcrumb | <1s |
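The `root/**/breadcrumb.md` pattern only recurses when the shell expands `**` (in bash, that means enabling `globstar`). A minimal variant that avoids the dependency is sketched below; `find_pet_breadcrumbs` is a hypothetical name, and the only assumption about the breadcrumb format is that the `has-pets` tag appears literally in the file, as the command above implies.

```shell
#!/usr/bin/env bash
# Sketch: list every breadcrumb.md under a folder that mentions "has-pets".
# grep -r with --include does the recursion itself, so the result does not
# depend on the shell's ** (globstar) expansion.
find_pet_breadcrumbs() {
  local root="${1:-root}"
  grep -rl --include='breadcrumb.md' 'has-pets' "$root" | sort
}

# Usage, from the repository top level:
#   find_pet_breadcrumbs root
```

The `-l` flag prints only the matching file names, which is usually what is needed to decide which taxa to open next.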
Try it: Ask Copilot "Can you recommend a pet for my kid that lives in an apartment?"
📄 Read the full pets.md writeup →
This repository also contains a large number of .csproj files. Extensions will try to scan them all, causing VS Code to hang.
Run the included script to disable problematic extensions for this workspace:
Windows (PowerShell):
.\.vscode\disable-extensions.ps1

Linux/macOS:
chmod +x .vscode/disable-extensions.sh
./.vscode/disable-extensions.sh

Use the --global or -Global flag to disable extensions globally instead of per-workspace.
| File | Purpose |
|---|---|
| settings.json | Disables C# project discovery, OmniSharp, file watchers |
| extensions.json | Lists 38 extensions to disable for this workspace |
| disable-extensions.ps1 | PowerShell script to apply extension disabling |
| disable-extensions.sh | Bash script for Linux/macOS |
The scripts disable these extension categories:
| Category | Extensions |
|---|---|
| C#/.NET | ms-dotnettools.csharp, csdevkit, vscode-dotnet-runtime |
| Azure | All ms-azuretools.* extensions (12 total) |
| Python | ms-python.python, pylance, debugpy, ruff |
| Web/JS | vscode.typescript-language-features, eslint, prettier, tailwindcss |
| Other | vue.volar, playwright, emmet, and more |
See .vscode/extensions.json for the complete list.
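For a one-off session, VS Code's command-line `--disable-extension <id>` flag can be used instead of the scripts. The sketch below only builds and prints such a command for the three C#/.NET extensions; it is not what the bundled scripts do, and it assumes the `code` launcher is on your PATH. `disable_args` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: print a VS Code launch command that disables the C#/.NET
# extensions for a single session. Not the bundled script.
disable_args() {
  local id
  for id in "$@"; do
    printf -- '--disable-extension %s ' "$id"
  done
}

csharp_extensions=(
  ms-dotnettools.csharp
  ms-dotnettools.csdevkit
  ms-dotnettools.vscode-dotnet-runtime
)

# Echo rather than execute, so the command can be reviewed first.
# The trailing "." opens the current folder as the workspace.
echo "code $(disable_args "${csharp_extensions[@]}")."
```

Copy the printed line into a terminal at the repository root to start a session with those extensions off.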
Breadcrumb Navigation: This repo's breadcrumb metadata approach was inspired by @ekuris-repos's excellent markdown-frontmatter pattern in the Swarm project. Their implementation demonstrates how YAML frontmatter can elegantly organize and aggregate hierarchical data across large codebases.
AnimalKingdomGenerator — uses NCBI taxonomy + Wikidata + Copilot SDK.
MIT