
ancel1x/Hunter-Hacks


Vantage: The Vantage Chronicle

Next.js React TypeScript Tailwind CSS A-Frame Node.js License: Private

A server-rendered Next.js app that turns real New York City locations into immersive historical scenes you can explore on desktop, phone, or in Google Cardboard. Open the map, choose a landmark, and step into a 360° AI-aged panorama of that place in a specific year, with hotspots, voice clips, and editorial period notes anchored to the world around you.

"See New York through time."

The current build ships with 12 scenes spanning Manhattan, Brooklyn, the Bronx, Queens, Staten Island, and New York Harbor, anchored on years between 1873 and 1917. Each panorama is a real modern Mapillary 360° photo of the actual location, edited by Google Gemini 2.5 Flash Image (Nano Banana) to look period-correct using web-grounded historical research. Character voice clips are synthesised at runtime via a server-side ElevenLabs proxy with disk caching, so each unique line is billed only once.


System Architecture

flowchart TB
    USER["User<br/><i>Desktop / Phone / Cardboard</i>"]

    subgraph FE["Frontend: Next.js App Router"]
        LANDING["Landing Page<br/><code>/</code>"]
        MAP["Map Page<br/><code>/map</code>"]
        VIEWER["VR Viewer<br/><code>/viewer/[locId]/[eraId]</code>"]
        EDITOR["Hotspot Editor<br/><code>/editor/[locId]/[eraId]</code>"]
    end

    subgraph API["API Routes: Server-side"]
        TTS["TTS Proxy<br/><code>/api/tts/[sceneId]/[hotspotId]</code>"]
        TTS_INTRO["Intro TTS<br/><code>/api/tts/intro/[locId]/[eraId]</code>"]
        CACHE["Disk Cache<br/><code>cache/tts/{sha256}.mp3</code>"]
    end

    subgraph BUILD["Build-time Pipeline"]
        B1["scrape-panorama"]
        B2["gather-period-research"]
        B3["age-panorama"]
        B4["generate-hotspots-vision"]
        B5["generate-summaries"]
        B6["bake-infopanels"]
    end

    subgraph EXT["External APIs"]
        MAPILLARY["Mapillary API<br/><i>360° Street Imagery</i>"]
        GEMINI["Google Gemini 2.5 Flash<br/><i>Image Gen + Search + Vision</i>"]
        ELEVEN["ElevenLabs API<br/><i>Text-to-Speech</i>"]
    end
    end

    %% User → Frontend
    USER -->|"Browse"| LANDING
    USER -->|"Explore map"| MAP
    USER -->|"Enter scene"| VIEWER
    VIEWER -->|"Gaze hotspot"| TTS
    VIEWER -->|"Landmark intro"| TTS_INTRO

    %% Frontend → API
    TTS -->|"Cache miss?"| ELEVEN
    TTS_INTRO -->|"Cache miss?"| ELEVEN
    TTS -->|"Cache hit"| CACHE
    TTS_INTRO -->|"Cache hit"| CACHE
    ELEVEN -->|"Write MP3"| CACHE

    %% Build pipeline → External APIs
    B1 -->|"Download 360° photos"| MAPILLARY
    B2 -->|"Search-grounded research"| GEMINI
    B3 -->|"Image-to-image aging"| GEMINI
    B4 -->|"Vision + bbox detection"| GEMINI
    B5 -->|"Summarize"| GEMINI

    %% Build outputs feed the frontend
    B1 -.->|"public/panoramas-modern/"| B3
    B2 -.->|"data/period-research.json"| B3
    B3 -.->|"public/panoramas/"| VIEWER
    B4 -.->|"data/scenes/"| VIEWER
    B5 -.->|"data/location-summaries.json"| MAP
    B6 -.->|"public/infopanels/"| VIEWER

    %% Styling
    classDef user fill:#f4efe6,stroke:#8a2a1f,color:#333,stroke-width:2px
    classDef frontend fill:#fff3e0,stroke:#e65100,color:#333,stroke-width:2px
    classDef api fill:#e8f5e9,stroke:#2e7d32,color:#333,stroke-width:2px
    classDef build fill:#e3f2fd,stroke:#1565c0,color:#333,stroke-width:2px
    classDef external fill:#f3e5f5,stroke:#7b1fa2,color:#333,stroke-width:2px

    class USER user
    class LANDING,MAP,VIEWER,EDITOR frontend
    class TTS,TTS_INTRO,CACHE api
    class B1,B2,B3,B4,B5,B6 build
    class MAPILLARY,GEMINI,ELEVEN external

🎬 The 12 Scenes

# πŸ“ Location πŸ™οΈ Borough πŸ“… Year βš“ Anchor
I Times Square Manhattan 1904 IRT subway opens; Longacre→Times rename
II Brooklyn Bridge Brooklyn 1883 May 24 opening day
III Liberty Island NY Harbor 1907 Peak immigration year across the bay
IV Central Park Manhattan 1873 Bethesda Terrace unveiled
V Lower East Side Manhattan 1911 Triangle Shirtwaist fire; Hester Street market peak
VI Coney Island Brooklyn 1904 Luna Park's million-bulb opening
VII Prospect Park Brooklyn 1873 Olmsted/Vaux's Brooklyn masterpiece
VIII DUMBO Brooklyn 1900 Empire Stores warehouses, working waterfront
IX Grand Concourse Bronx 1909 Risse's Champs-Γ‰lysΓ©es-modeled boulevard opens
X South Bronx Hub Bronx 1909 Third Ave El + commercial peak
XI Jackson Heights Queens 1917 First cooperative garden apartments rise
XII St. George Ferry Terminal Staten Island 1908 New terminal opens after the 1905 fire

How It Works

build-time pipeline (run once)                 runtime (Next.js server)
──────────────────────────────                 ─────────────────────────
1. scrape-panorama          Mapillary 360 →    GET  /              landing
   → public/panoramas-modern/                  GET  /map           NYC map + accordion sidebar
                                               GET  /viewer/[loc]/[era]
2. gather-period-research   Gemini Search →                        A-Frame 360 + hotspots
                            ~200 words/scene                       (gaze 1.5s → InfoPanel)

3. age-panorama             Nano Banana →      GET  /api/tts/[scene]/[hotspot]
                            (modern + research)                    ElevenLabs proxy + disk cache
   → public/panoramas/                            (only runtime API call)

4. generate-hotspots-vision Gemini vision +
                            bbox detection →
   → data/scenes/

5. generate-summaries       Wikipedia + Gemini
                            Flash-Lite →
   → data/location-summaries.json

6. bake-infopanels          @napi-rs/canvas →
   → public/infopanels/             (PNG textures used inside Cardboard mode where
                                     HTML/CSS doesn't reach into the WebGL scene)

The only runtime API call is /api/tts/[sceneId]/[hotspotId]: it proxies ElevenLabs, hashes the (text + voiceId) pair with SHA-256, and caches the resulting MP3 on disk under cache/tts/{sha}.mp3. Each unique line hits the ElevenLabs API once for the lifetime of the cache.


Tech Stack

| Category | Technology |
|---|---|
| Framework | Next.js 15 App Router + TypeScript, server-rendered (with API routes, not a static export) |
| Styling | Tailwind for layout primitives + hand-written editorial CSS in app/globals.css |
| Motion | motion library + CSS keyframes + IntersectionObserver for scroll reveals |
| Map | react-leaflet with CartoDB light_nolabels tiles, run through an SVG feColorMatrix + feComponentTransfer color-lookup filter that maps tile luminance through a 6-stop gradient (ink → 4 oxblood intermediates → paper) so the map reads in the site palette instead of fighting it |
| VR / 360 viewer | A-Frame (loaded via next/script), with built-in stereoscopic Cardboard mode and magic-window fallback |
| Build-time canvas | @napi-rs/canvas for InfoPanel PNG bakes |
| Fonts | Playfair Display + EB Garamond + Inter Tight + JetBrains Mono (self-hosted via @fontsource) |

Design System: The Vantage Chronicle

Cream paper background (#f4efe6), oxblood accent (#8a2a1f), generous serif typography, hairline rules, animated rule-line draws, letter-by-letter title reveals, drop caps, kicker labels, paper grain overlay. Thematically the UI should feel like opening the morning paper to read about Times Square in 1904, not like a generic Tailwind site.

CSS tokens live in app/globals.css. Editorial primitives live in components/ui/.


Setup

Prerequisites

Install + First Run

# install dependencies
npm install

# create env file (do not commit)
cp .env.local.example .env.local
# then edit .env.local and fill in:
#   MAPILLARY_ACCESS_TOKEN=...
#   GEMINI_API_KEY=...
#   ELEVENLABS_API_KEY=...

# run the build-time pipeline (in order)
npm run scrape-panorama          # Mapillary → public/panoramas-modern/
npm run gather-period-research   # Gemini search → data/period-research.json
npm run age-panorama             # Nano Banana → public/panoramas/
npm run generate-hotspots-vision # Gemini vision → data/scenes/
npm run generate-summaries       # Wikipedia + Gemini → data/location-summaries.json
npm run bake-infopanels          # canvas → public/infopanels/

# start the dev server
npm run dev

Open http://localhost:3000.

Phone + Cardboard Testing (HTTPS Required)

iOS Safari requires HTTPS for DeviceOrientationEvent.requestPermission(). Plain LAN HTTP from npm run dev won't trigger the gyroscope on iPhone. Use Cloudflare Tunnel:

# from the project directory, with the dev server running on port 3000
cloudflared tunnel --url http://localhost:3000 --protocol http2

The --protocol http2 flag is important: without it, cloudflared defaults to QUIC over UDP (port 7844), which many networks block. cloudflared prints a https://*.trycloudflare.com URL; open that on your phone, navigate to a viewer route, tap A-Frame's VR icon, grant motion permission, and drop the phone into the headset.


Build-time Pipeline Scripts

All scripts are idempotent and safe to re-run. Most accept --force to regenerate, ignoring cached state.

| Script | What it does | Cost (full 12-scene run) |
|---|---|---|
| npm run scrape-panorama | Calls Mapillary Graph API for each location, picks the highest-resolution 360° photo, downloads to public/panoramas-modern/{locId}.jpg. Records contributor attribution in data/panorama-sources.json. | Free |
| npm run gather-period-research | Calls Gemini 2.5 Flash with Google Search grounding to produce ~200 words of web-sourced visual context per (loc, era): buildings present that year, signage by name, vehicles, dress, surfaces, lighting. Cited URLs preserved in the JSON. | ~$0.42 |
| npm run age-panorama | Sends each modern 360 + the locked aging prompt + research blob to Nano Banana (gemini-2.5-flash-image); writes public/panoramas/{locId}__{eraId}.jpg. Append -- --force to overwrite. | ~$0.47 |
| npm run generate-hotspots-vision | Sends each aged JPG to Gemini 2.5 Flash with vision; uses bbox detection (box_2d as [ymin, xmin, ymax, xmax], normalized to [0, 1000]) to spatially place 3 to 5 period-grounded hotspots per scene. Computes box centers and converts to (yaw, pitch) via the equirectangular projection. | ~$0.05 |
| npm run generate-summaries | Wikipedia REST + Gemini Flash-Lite → ~50-word period blurbs that show in the sidebar accordion. | Negligible |
| npm run bake-infopanels | Renders each hotspot's editorial paper-card to a PNG via @napi-rs/canvas. The PNGs are loaded as <a-image> planes inside Cardboard mode, where HTML/CSS can't reach into the WebGL scene. | Free |

Total cost for a full re-build of all 12 scenes: under $1

For the older Wikidata-based hotspot generator (kept as a fallback), run npm run generate-hotspots.

Mapillary Coverage Gaps

If a lat/lon search returns nothing (common in dense central NYC and outer-borough residential areas), the scrape script tries progressively wider bboxes (150m → 300m → 500m → 800m → 1200m). If still empty, you have two escape hatches:

  1. Find a Mapillary 360 image manually at https://www.mapillary.com/, copy its image ID from the URL, set MAPILLARY_FALLBACK_<LOCID_UPPER>=<id> in .env.local, and re-run.
  2. Drop a hand-sourced equirectangular JPG into public/panoramas-modern/{locId}.jpg. The aging step picks it up regardless of source.

The current build uses fallback IDs for LIBERTY_ISLAND, DUMBO, SOUTH_BRONX_HUB, and JACKSON_HEIGHTS.
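The widening search described in this section can be sketched roughly as below. This is a minimal illustration, not the actual scrape script: `bboxAround`, `findWithWidening`, and the `search` callback are hypothetical names, the distances are assumed to be half-widths around the point, and the degree conversion is a flat-earth approximation. A real Mapillary lookup would be an async network call; it is kept synchronous here so the ladder logic is easy to follow.

```typescript
// Search distances in metres, taken from the ladder listed in the README.
const RADII_M = [150, 300, 500, 800, 1200] as const;

type Bbox = { west: number; south: number; east: number; north: number };

// Approximate a square bbox around a point (assumption: radius = half-width).
function bboxAround(lat: number, lon: number, radiusM: number): Bbox {
  const metresPerDegLat = 111_320; // rough average, good enough at city scale
  const dLat = radiusM / metresPerDegLat;
  const dLon = radiusM / (metresPerDegLat * Math.cos((lat * Math.PI) / 180));
  return { west: lon - dLon, south: lat - dLat, east: lon + dLon, north: lat + dLat };
}

// Try each rung of the ladder and stop at the first one that returns results.
// `search` stands in for whatever actually queries Mapillary (hypothetical).
function findWithWidening<T>(
  lat: number,
  lon: number,
  search: (bbox: Bbox) => T[],
): { radiusM: number; results: T[] } | null {
  for (const radiusM of RADII_M) {
    const results = search(bboxAround(lat, lon, radiusM));
    if (results.length > 0) return { radiusM, results };
  }
  return null; // caller falls back to the env var or a hand-sourced JPG
}
```

Returning `null` rather than throwing leaves the decision about which escape hatch to use with the caller.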


Adding a New Scene

  1. Add an entry to data/locations.json with id, name, lat, lon, numeral, blurb, and one or more eras (each with id, year, label, labelLong, and an event anchor sentence).
  2. If the lat/lon has no Mapillary 360 coverage, set the fallback env var (above).
  3. Add an entry to ERA_NOTES in scripts/age-panorama.ts for any new year (vehicles, dress, signage cues for the period).
  4. Run the pipeline scripts in order.
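As a rough guide to step 1, the shape of a location entry might look like the sketch below. The field names come from the list above, except `event`, which is an assumed key for the "event anchor sentence"; the types, the era `id` format, the label strings, the coordinates, and the `validateLocation` helper are all illustrative guesses rather than the project's actual schema.

```typescript
// Assumed shape of one era inside a location entry.
interface Era {
  id: string;
  year: number;
  label: string;
  labelLong: string;
  event: string; // hypothetical key for the event anchor sentence
}

// Assumed shape of one entry in data/locations.json.
interface LocationEntry {
  id: string;
  name: string;
  lat: number;
  lon: number;
  numeral: string;
  blurb: string;
  eras: Era[];
}

// Return a list of human-readable problems; an empty list means the entry looks sane.
function validateLocation(entry: LocationEntry): string[] {
  const problems: string[] = [];
  if (entry.lat < -90 || entry.lat > 90) problems.push("lat out of range");
  if (entry.lon < -180 || entry.lon > 180) problems.push("lon out of range");
  if (entry.eras.length === 0) problems.push("at least one era is required");
  for (const era of entry.eras) {
    if (!Number.isInteger(era.year)) problems.push(`era ${era.id}: year must be an integer`);
  }
  return problems;
}

// Illustrative entry only; values other than the name, year, and anchor are made up.
const example: LocationEntry = {
  id: "times-square",
  name: "Times Square",
  lat: 40.758,
  lon: -73.9855,
  numeral: "I",
  blurb: "Placeholder blurb.",
  eras: [
    {
      id: "1904",
      year: 1904,
      label: "1904",
      labelLong: "Times Square, 1904",
      event: "IRT subway opens; Longacre→Times rename",
    },
  ],
};
```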

A hidden /editor/[locId]/[eraId] route exists for manual hotspot fine-tuning if you want pixel-perfect placement; it's gated behind NODE_ENV === 'development'.


🎯 Hotspot Positioning + Gaze Hit Detection

Hotspots are placed by Gemini's vision bounding-box detection β€” the model returns box_2d in [ymin, xmin, ymax, xmax]/[0, 1000] (the format Gemini was trained on for grounding) and the script computes the box center, then maps to (yaw, pitch) via:

yaw   = (x - 0.5) Γ— 360Β°    // x in [0, 1] from the equirect
pitch = -(y - 0.5) Γ— 180Β°
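A minimal sketch of that conversion, assuming the box arrives as a four-number tuple on the [0, 1000] scale. The `Box2d` type and the `boxToYawPitch` name are illustrative and not taken from the project's script.

```typescript
// box_2d in Gemini's order: [ymin, xmin, ymax, xmax], each in [0, 1000].
type Box2d = [ymin: number, xmin: number, ymax: number, xmax: number];

// Convert a bounding box on an equirectangular image to viewing angles in degrees.
function boxToYawPitch(box: Box2d): { yaw: number; pitch: number } {
  const [ymin, xmin, ymax, xmax] = box;
  // Box center, rescaled from [0, 1000] to [0, 1].
  const x = (xmin + xmax) / 2 / 1000;
  const y = (ymin + ymax) / 2 / 1000;
  return {
    yaw: (x - 0.5) * 360, // -180 at the left edge, +180 at the right edge
    pitch: -(y - 0.5) * 180, // +90 at the top edge, -90 at the bottom edge
  };
}

// A box centered three quarters across and one quarter down the image:
const angles = boxToYawPitch([200, 700, 300, 800]);
// angles.yaw === 90, angles.pitch === 45
```

The sign flip on pitch is needed because image y grows downward while pitch grows upward.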

The viewer's visible gaze marker is a thin oxblood ring (cosmetic only, ~0.04m thick annulus). The actual raycaster hit target is an invisible 0.45m circle at each hotspot, which gives ~10× the catch area so the gaze reticle reliably triggers the hotspot when the user holds their gaze in the rough vicinity.


How Runtime Audio Works

The browser plays MP3s via the <audio> element. The bytes come from /api/tts/{sceneId}/{hotspotId}, a Next.js server route that:

  1. Looks up the hotspot's voice.text + voice.voiceId in the scene JSON
  2. Hashes that pair (SHA-256) and checks cache/tts/{hash}.mp3
  3. On miss: calls ElevenLabs, writes the bytes to disk, returns them
  4. On hit: streams the cached file
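The four steps above could be sketched along these lines. This is an illustration under assumptions, not the route's real code: the exact string that gets hashed is unknown (the README only says the text and voiceId pair), the function names are hypothetical, and the ElevenLabs call is replaced by an injected `synthesize` callback so the caching logic can be exercised without network access.

```typescript
import { createHash } from "node:crypto";
import { mkdir, mkdtemp, readFile, writeFile } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Stand-in for the ElevenLabs client (hypothetical signature).
type Synthesize = (text: string, voiceId: string) => Promise<Uint8Array>;

// Assumed key scheme: SHA-256 over voiceId and text joined by a newline.
function cacheKey(text: string, voiceId: string): string {
  return createHash("sha256").update(`${voiceId}\n${text}`, "utf8").digest("hex");
}

// Serve from disk when the file exists; otherwise synthesize, store, and return.
async function getOrSynthesize(
  cacheDir: string,
  text: string,
  voiceId: string,
  synthesize: Synthesize,
): Promise<{ audio: Uint8Array; hit: boolean }> {
  const file = join(cacheDir, `${cacheKey(text, voiceId)}.mp3`);
  try {
    return { audio: await readFile(file), hit: true };
  } catch (err) {
    // Only a missing file counts as a cache miss; rethrow anything else.
    if ((err as { code?: string }).code !== "ENOENT") throw err;
  }
  const audio = await synthesize(text, voiceId);
  await mkdir(cacheDir, { recursive: true });
  await writeFile(file, audio);
  return { audio, hit: false };
}

// Demo with a fake synthesizer and a throwaway temp directory.
async function demo(): Promise<{ calls: number; hits: boolean[] }> {
  const dir = await mkdtemp(join(tmpdir(), "tts-cache-"));
  let calls = 0;
  const fakeSynth: Synthesize = async (text) => {
    calls += 1;
    return new TextEncoder().encode(`fake-mp3:${text}`);
  };
  const first = await getOrSynthesize(dir, "Extra! Extra!", "voice-a", fakeSynth);
  const second = await getOrSynthesize(dir, "Extra! Extra!", "voice-a", fakeSynth);
  return { calls, hits: [first.hit, second.hit] };
}
```

In the demo the second request for the same line is served from disk, so the fake synthesizer runs only once.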

Each unique line incurs its character cost once for the lifetime of the cache. Twelve scenes × ~120 chars/line ≈ 1500 chars total, well under the 10k chars/month free tier.


Vercel Deploy (Optional)

npm run dev on a laptop is the demo target. If you want a public share link:

vercel

Caveat: Vercel functions are stateless, so the cache/tts/ directory does not survive cold starts in production. Either commit the prewarmed MP3s into the repo (remove cache/ from .gitignore and git add cache/tts/*.mp3), or move the cache to Vercel Blob storage (free tier).


Project Structure

app/                          - Next.js routes (page.tsx, viewer/, editor/, api/tts/)
components/                   - UI components (LandingClient, MapPageClient, VRScene, InfoPanel, etc.)
components/ui/                - Editorial design primitives (Rule, SplitText, Kicker, Button, PaperGrain)
data/locations.json           - 12 NYC locations
data/scenes/                  - Per-scene hotspot JSON (generated)
data/period-research.json     - Web-grounded visual research per (loc, era) (generated)
data/location-summaries.json  - Sidebar period blurbs (generated)
data/panorama-sources.json    - Mapillary contributor attribution (generated)
scripts/                      - Build-time pipelines
public/panoramas-modern/      - Modern Mapillary 360s (input to aging step)
public/panoramas/             - Aged 360 JPGs (final output)
public/infopanels/            - Baked editorial paper-card PNGs (generated)
public/fonts/raw/             - TTF files for the canvas bake (auto-downloaded)
public/ambient/               - Looping ambient MP3s (drop in CC0 audio from freesound.org)
cache/tts/                    - Runtime ElevenLabs cache (gitignored)
lib/                          - Server-side helpers (scenes loader, ElevenLabs client, types)

🚧 Hard Constraints (Intentional)

  • πŸ’° $0 to ship the demo locally beyond the one-time build pipeline (~$1 total). Runtime spend is bounded by the number of unique voice lines, not the number of users, because of the disk cache.
  • ⚠️ A-Frame + Next.js SSR breaks on first import. All A-Frame mounts must use dynamic({ ssr: false }).
  • πŸ–ΌοΈ Image-to-image aging is bounded by the modern Mapillary frame's geometry. Nano Banana can re-skin facades, swap signage, change dress and vehicles, and remove modern displays β€” but it cannot add buildings that weren't in the source photo, can't move the camera, and can't recreate literal historical geometry.

Attribution

  • 360° street-level imagery © Mapillary contributors, used per their open license.
  • Period imagery aged via Google Gemini 2.5 Flash Image (Nano Banana).
  • Voice clips synthesised with ElevenLabs.
  • Map tiles © OpenStreetMap contributors, served via CARTO.
  • Period research grounded in citations returned by Google Search via Gemini's grounding tool.
