PaperPlanes is a sequential pipeline designed to dismantle the static nature of 2D paintings and reconstruct them as stratified, multi-layered 3D paper theaters. This is not a "filter"; it is a multimodal reconstruction and spatial reconciliation engine.
A monochromatic, "Glitch-Noir" interface that manages the pipeline’s state.
- Object Permanence: Tracks unique `run_id` state changes to prevent the ingestion of stale artifacts from the GitHub cache.
- Manual Overrides: Provides vertex-dragging for perspective warping, text-injection for semantic reconstruction, and a 5-axis affine transformation suite for depth reconciliation.
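
The exact meaning of "5-axis" is not spelled out here. As a minimal sketch, assuming the five controls are x/y translation, x/y scale, and rotation (an assumption, not something the project confirms), they could be composed into a single 3x3 homogeneous matrix and applied to depth-map coordinates. The names `affine_matrix` and `apply_affine` are hypothetical.

```python
import math


def affine_matrix(tx=0.0, ty=0.0, sx=1.0, sy=1.0, theta_deg=0.0):
    """Compose scale, then rotation, then translation into a 3x3 matrix.

    The five parameters are an assumed reading of "5-axis"; the real
    controls may differ.
    """
    c = math.cos(math.radians(theta_deg))
    s = math.sin(math.radians(theta_deg))
    # M = T @ R @ S, written out by hand to avoid a NumPy dependency.
    return [
        [c * sx, -s * sy, tx],
        [s * sx, c * sy, ty],
        [0.0, 0.0, 1.0],
    ]


def apply_affine(m, x, y):
    """Map a single (x, y) point through the homogeneous matrix m."""
    return (
        m[0][0] * x + m[0][1] * y + m[0][2],
        m[1][0] * x + m[1][1] * y + m[1][2],
    )
```

With default arguments the matrix is the identity, so an untouched control panel would leave the depth map exactly where the model placed it.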
A serverless bridge that bypasses GitHub’s aggressive 5-minute CDN cache.
- Binary Reconstruction: Fetches file metadata, retrieves raw Base64 strings, and decodes them in-flight into a `Uint8Array` to serve fresh image bytes directly to the Wizard.
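
The decode step can be sketched as follows. The bridge itself targets a `Uint8Array`, which implies a JavaScript runtime; this Python version only illustrates the same logic, with `bytes` standing in for the typed array. The payload shape (`content` and `encoding` fields, with newline-wrapped Base64) follows what the GitHub REST API returns for file contents; `decode_contents_payload` and `looks_like_png` are hypothetical helpers.

```python
import base64

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_contents_payload(payload: dict) -> bytes:
    """Decode a GitHub-style {"content", "encoding"} payload into raw bytes."""
    if payload.get("encoding") != "base64":
        raise ValueError(f"unexpected encoding: {payload.get('encoding')!r}")
    # The Base64 text arrives wrapped with newlines; strip all whitespace first.
    compact = "".join(payload["content"].split())
    return base64.b64decode(compact, validate=True)


def looks_like_png(data: bytes) -> bool:
    """Cheap sanity check that the decoded bytes start with the PNG signature."""
    return data.startswith(PNG_SIGNATURE)
```

Checking the signature before serving is optional, but it catches a truncated or mis-encoded payload before the Wizard tries to render it.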
The compute-heavy viscera of the operation.
- Serial Sanity: Uses `concurrency` groups to ensure a single lane of execution, preventing repository corruption during simultaneous triggers.
- Multimodal Engine: Leverages Gemini 1.5 Flash to deconstruct and reconstruct imagery, and Depth-Anything-V2 for spatial inference.
- CROP: Auto-detection or manual 4-point vector warp. Result: `cropped_image.png`.
- GENERATE: Multimodal reconstruction. Painting becomes photograph. Result: `generated_image.png`.
- DEPTH: Neural spatial inference. Result: `raw_depth_map.png`.
- ALIGN: Human-in-the-loop affine sync. Result: `realigned_depth_map.png`.
- SEGMENT: Linear binning and mask smoothing. Result: `paper_planes_layers.zip`.
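
The linear binning in the SEGMENT stage can be illustrated with a small sketch. It assumes the depth map is a 2D grid of numeric values and that "linear" means equal-width bins across the observed depth range; the layer count, the function names `linear_bins` and `layer_mask`, and the pure-Python representation are all illustrative assumptions. Mask smoothing is left out.

```python
def linear_bins(depth, n_layers):
    """Assign each depth value to one of n_layers equal-width bins.

    depth is a 2D list of numbers (any range); the result is a 2D list
    of layer indices in [0, n_layers - 1].
    """
    if n_layers < 1:
        raise ValueError("n_layers must be at least 1")
    flat = [v for row in depth for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        # A perfectly flat depth map collapses into a single layer.
        return [[0 for _ in row] for row in depth]

    def index(v):
        # Scale to [0, n_layers) and clamp so the maximum lands in the last bin.
        return min(int((v - lo) / span * n_layers), n_layers - 1)

    return [[index(v) for v in row] for row in depth]


def layer_mask(indices, layer):
    """Binary mask (1 = pixel belongs to the layer) for one layer index."""
    return [[1 if i == layer else 0 for i in row] for row in indices]
```

Each mask would then cut one sheet of the paper theater out of the generated image; equal-width bins keep the sheets evenly spaced in depth, though not evenly populated with pixels.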