
AOG (Attentions on Gaussians)

Project Status

I am currently in the process of cleaning up the codebase. In the meantime, if you're interested, you can check out the project report here: πŸ“„ Project Report

Results

πŸŽ₯ Video Demonstration

Click to Watch Video


Overview

AOG is a novel approach to 3D text-guided editing that enhances multi-view consistency when editing images using diffusion models like Stable Diffusion. The core idea behind AOG is that a set of multi-view images represents a single 3D environment and should not be edited independently.

Our approach follows a similar methodology to Instruct-GS2GS, where we first build a Gaussian Splatting Model from a set of multi-view images. We then use InstructPix2Pix, a guided text-editing diffusion model, to edit the original images.

Our approach is inspired by Prompt2Prompt, but differs in that it leverages the geometry obtained from the Gaussian model to model cross-attention maps during image editing. As each image is edited, we back-project its cross-attention maps onto the 3D geometry and efficiently render them for the next camera view. The rendered maps are then injected into the UNet of the diffusion model, enforcing better 3D consistency across the edited images.
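As a concrete sketch of the injection step, the attention map rendered from the 3D model can be blended into the map the UNet computed for the current view. The function name and the linear blending form below are illustrative assumptions, not the repository's actual API:

```python
import numpy as np

def inject_attention(unet_attn: np.ndarray,
                     rendered_attn: np.ndarray,
                     weight: float = 0.6) -> np.ndarray:
    """Blend a cross-attention map rendered from the 3D Gaussian model
    into the map the diffusion UNet computed for the current view.

    weight=1.0 fully overrides the UNet's attention with the rendered,
    3D-consistent map; weight=0.0 leaves the UNet's attention untouched.
    """
    blended = weight * rendered_attn + (1.0 - weight) * unet_attn
    # Re-normalize so each row still sums to 1, as attention maps should.
    return blended / blended.sum(axis=-1, keepdims=True)
```

This is the same kind of blend used again in the editing paradigm below, where the 0.6 weight trades off the UNet's own attention against the 3D-consistent rendered maps.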

To achieve this, we introduced several changes to the Gaussian Splatting implementation:

  • Added extra attributes to each Gaussian to store cross-attention values πŸ”.
  • Utilized the fast rasterizer for rendering attention maps more efficiently ⚑.
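A minimal sketch of what the extended Gaussian representation might look like: per-Gaussian attention values stored alongside the usual geometric attributes. All names and shapes here are illustrative assumptions, not the repository's actual data structures:

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class AttentionGaussians:
    """Gaussian cloud extended with per-Gaussian cross-attention values,
    one value per text token, stored next to the standard attributes."""
    means: np.ndarray      # (N, 3) Gaussian centers
    opacities: np.ndarray  # (N,)  opacity per Gaussian
    attn: np.ndarray       # (N, T) cross-attention value per text token

    def update_attention(self, idx: np.ndarray, new_attn: np.ndarray,
                         lr: float = 0.5) -> None:
        # Blend attention back-projected from a newly edited view into
        # the stored per-Gaussian values (running average; lr=1 replaces).
        self.attn[idx] = (1.0 - lr) * self.attn[idx] + lr * new_attn
```

Storing attention as extra per-Gaussian attributes is what lets the fast rasterizer render attention maps the same way it renders color.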

AOG overview

Editing Paradigm

Maintaining a 3D cross-attention model from tens or hundreds of images is computationally expensive πŸ–₯️. Instead, we select a set of key frames to build the 3D cross-attention model.

The key frames are edited sequentially by:

  1. Rendering the latest cross-attention model from the current camera view 🎭.
  2. Injecting these attention maps into the diffusion model with a weight of 0.6, allowing it to propagate new edits to previously unseen areas. See the project repo for more on the importance of the injection weights πŸ”„.
  3. After editing each key frame, extracting its cross-attention maps and updating the 3D attention model for improved consistency πŸ“Œ.
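The three steps above can be sketched as a sequential loop over the key frames. Every callable here is a placeholder standing in for one of the repository's components, not its real API:

```python
def edit_key_frames(key_frames, render_attn, edit_frame, backproject,
                    weight=0.6):
    """Sequentially edit key frames against a shared 3D attention model.

    key_frames : list of (camera_view, image) pairs
    render_attn(view)             -> attention maps for that view (step 1)
    edit_frame(img, attn, weight) -> (edited_img, new_attn); runs the
        diffusion edit with the rendered maps injected (step 2)
    backproject(view, attn)       -> updates the 3D attention model (step 3)
    """
    edited = []
    for view, image in key_frames:
        attn = render_attn(view)                               # step 1
        new_image, new_attn = edit_frame(image, attn, weight)  # step 2
        backproject(view, new_attn)                            # step 3
        edited.append(new_image)
    return edited
```

Because each iteration updates the 3D attention model before the next frame is rendered, later key frames see the edits made to earlier ones, which is what enforces the multi-view consistency.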

AOG paradigm
