Merge pull request #70 from arpg/scenesense_update

Scenesense update

donceykong committed Mar 24, 2024
2 parents 6bd886d + 56f01f0 commit bbd0e05
Showing 5 changed files with 57 additions and 8 deletions.
Binary file added img/scenesense/example_results_h1.png
Binary file modified img/scenesense/example_results_h2.png
Binary file added img/scenesense/framework.png
65 changes: 57 additions & 8 deletions scenesense.md
@@ -39,7 +39,7 @@ arxiv_link_text: "ArXiv"
# bibtex_link_text: "BiTex"
---

<!-- This is the js script that generates random points on page borders (trying to make a visual nod to diffusion) -->
<script>
// Generate a random integer between min and max (inclusive)
function getRandomInt(min, max) {
@@ -80,17 +80,66 @@ arxiv_link_text: "ArXiv"
}
</script>


<div style="text-align:center;">
We present <b><i>SceneSense</i></b>, a novel generative 3D diffusion model for synthesizing 3D occupancy information from observations. SceneSense uses a running occupancy map and a single RGB-D camera to generate predicted geometry around the platform, even when that geometry is occluded or out of view. The architecture of our framework ensures that the generative model never overwrites observed free or occupied space, making SceneSense a low-risk addition to any robotic planning stack.
</div>
<br>

<div style="overflow: auto; text-align: center; width: 80%; margin: 0 auto;">
<img src="/img/scenesense/example_results_h1.png" alt="Photo example results" style="display: inline-block; margin-right: 10px; width: 40%;" height="600">
<img src="/img/scenesense/example_results_h2.png" alt="Photo example results" style="display: inline-block; margin-left: 10px; width: 40%;" height="600">
</div>

<br>
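The guarantee that observed space is never overwritten can be pictured as a masked merge over the occupancy grid: only voxels the sensor has never observed take the model's prediction. The sketch below is a minimal illustration, assuming a numpy voxel grid with UNKNOWN/FREE/OCCUPIED states; the names and function are ours for illustration, not taken from the SceneSense code.

```python
import numpy as np

UNKNOWN, FREE, OCCUPIED = 0, 1, 2  # hypothetical voxel states

def merge_prediction(observed, predicted):
    """Composite a generated occupancy grid into a running map.

    Only voxels the sensor has never observed (UNKNOWN) take the
    model's prediction; observed FREE/OCCUPIED voxels are kept as-is,
    so the generative model can never overwrite real measurements.
    """
    merged = observed.copy()
    unknown_mask = observed == UNKNOWN
    merged[unknown_mask] = predicted[unknown_mask]
    return merged

# Tiny 1-D "grid" for readability; a real map would be a 3D array.
observed  = np.array([FREE, UNKNOWN, OCCUPIED, UNKNOWN])
predicted = np.array([OCCUPIED, OCCUPIED, FREE, FREE])
print(merge_prediction(observed, predicted))  # [1 2 2 1]
```

Because the merge is a pure function of the masks, a planner consuming the merged map sees exactly the sensor's free/occupied labels wherever measurements exist.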

## Method

Our occupancy in-painting method ensures that observed space remains intact while SceneSense predictions are integrated. Drawing inspiration from in-painting techniques in image diffusion and guided image synthesis, our approach continuously incorporates known occupancy information during inference. To perform occupancy in-painting, we select a portion of the occupancy map for diffusion and generate masks for its occupied and unoccupied voxels. These masks guide the diffusion process, restricting updates to the relevant voxels while noise is reintroduced at each step. This iterative process, depicted below, improves the accuracy of scene predictions while preventing the model from altering observed geometry.

<div style="overflow: auto; text-align: center;">
<img src="/img/scenesense/framework.png" alt="SceneSense Framework" style="margin-right: auto; margin-left: auto;" height="500">
</div>

<br>
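The mask-guided loop described above can be sketched as a RePaint-style reverse-diffusion pass: at every denoising step the model updates all voxels, then the observed voxels are replaced with a re-noised copy of the measurements at the current noise level. This is a toy sketch under assumed components, not the SceneSense implementation — `denoise_step` and `noise_to_level` are stand-ins for the learned model and the forward-noising schedule.

```python
import numpy as np

def noise_to_level(x0, t, rng):
    """Forward-noise clean voxels x0 to step t (toy 10-step linear schedule)."""
    alpha = 1.0 - t / 10.0
    return alpha * x0 + (1 - alpha) * rng.standard_normal(x0.shape)

def denoise_step(x_t, t):
    """Stand-in for the learned reverse-diffusion model."""
    return 0.9 * x_t  # dummy update; a real model predicts less-noisy voxels

def inpaint(observed, known_mask, steps=10, seed=0):
    """Mask-guided diffusion over a local occupancy grid.

    Each reverse step updates every voxel, then voxels covered by
    `known_mask` (observed free or occupied space) are overwritten with
    a re-noised copy of the observations at the current noise level, so
    only unobserved voxels are truly generated.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(observed.shape)       # start from pure noise
    for t in range(steps, 0, -1):
        x = denoise_step(x, t)                    # model prediction everywhere
        x_known = noise_to_level(observed, t - 1, rng)
        x = np.where(known_mask, x_known, x)      # keep observed geometry
    return x

grid = np.array([1.0, -1.0, 0.0, 0.0])            # toy occupancy logits
known = np.array([True, True, False, False])
out = inpaint(grid, known)
print(np.allclose(out[:2], grid[:2]))             # True: observed voxels survive
```

At the final step the re-noising level reaches zero, so the observed voxels exit the loop exactly equal to their measured values, which is the property the framework relies on.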

## Presentation Video

<div style="text-align:center;">
<video width="80%" controls>
<source src="/video/scenesense/iros_video.mp4" type="video/mp4">
Your browser does not support the video tag.
</video>
</div>

<br>

## Citation

```bib
@misc{reed2024scenesense,
title={SceneSense: Diffusion Models for 3D Occupancy Synthesis from Partial Observation},
author={Alec Reed and Brendan Crowe and Doncey Albin and Lorin Achey and Bradley Hayes and Christoffer Heckman},
year={2024},
eprint={2403.11985},
archivePrefix={arXiv},
primaryClass={cs.RO}
}
```

<!-- For styling above Bibtex -->
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/prism/1.19.0/themes/prism-okaidia.min.css"
integrity="sha512-pGi87NmT0VeSbmZBK40y3wF4H2DlpCYc5lrO/3F/RPhnwn262NReW3jFtG2iZWhbpoWT5MDzBzawpOri+jcUTw==" crossorigin="anonymous" />

<script src="https://cdnjs.cloudflare.com/ajax/libs/prism/1.19.0/prism.min.js"
integrity="sha512-9ndS8HgVHWQq2A/kpIxygbIZQ7oljc9/AvoEv8SQDy192nAuCGSdk7OdAfCZLDkbRJLZMsrV0NXycMSLLNTWCw==" crossorigin="anonymous">
</script>

<script src="https://cdnjs.cloudflare.com/ajax/libs/prism/1.19.0/plugins/autolinker/prism-autolinker.min.js"
integrity="sha512-/uypNVmpEQdCQLYz3mq7J2HPBpHkkg23FV4i7/WSUyEuTJrWJ2uZ3gXx1IBPUyB3qbIAY+AODbanXLkIar0NBQ==" crossorigin="anonymous">
</script>

<script src="https://cdn.jsdelivr.net/npm/prismjs-bibtex@2.1.0/prism-bibtex.js"
integrity="sha256-A5GMUmGHpY8mVpfcaRLQFeHtmdjZLumKBOMpf81FXX0="
crossorigin="anonymous" referrerpolicy="no-referrer">
</script>
Binary file added video/scenesense/iros_video.mp4