2026.04.15- #67 -

## ZED X Nano

- A compact camera from StereoLabs
- Wrist-mount camera for manipulation
- 1920 × 1200 global-shutter sensor with RGB + depth images (120 FPS)
- GMSL2 camera interface -> supports cable runs up to 15 m and locking connectors
- Vibration-proof IMU
- Zero-copy path from sensor to GPU
- Sub-millimeter accuracy achievable with the Neural Depth Engine
- https://www.businesswire.com/news/home/20260413750144/en/Ouster-Launches-Stereolabs-ZED-X-Nano-A-Wrist-Mount-Stereo-Camera-Built-for-Robotic-Manipulation-and-Physical-AI
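
To put the sub-millimeter claim in context, here is a minimal sketch of the textbook stereo relations: depth `Z = f * B / d` and the first-order depth error `dZ = Z^2 / (f * B) * dd`. The focal length, baseline, and disparity error below are hypothetical values chosen for illustration; they are not ZED X Nano specifications, and the Neural Depth Engine itself is not modeled.

```python
def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Pinhole stereo relation: Z = f * B / d."""
    return focal_px * baseline_m / disparity_px


def depth_error(focal_px: float, baseline_m: float, depth_m: float,
                disparity_err_px: float) -> float:
    """First-order depth uncertainty: dZ = Z^2 / (f * B) * dd."""
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_err_px


# Hypothetical parameters for illustration only, NOT ZED X Nano specifications.
FOCAL_PX = 1000.0        # focal length in pixels (assumed)
BASELINE_M = 0.05        # stereo baseline in meters (assumed)
DISPARITY_ERR_PX = 0.1   # sub-pixel matching error (assumed)

for z in (0.1, 0.3, 1.0):
    err_mm = depth_error(FOCAL_PX, BASELINE_M, z, DISPARITY_ERR_PX) * 1000.0
    print(f"depth {z:.1f} m -> ~{err_mm:.2f} mm depth error")
```

With these assumed numbers the error is about 0.02 mm at 0.1 m and about 2 mm at 1 m. Because the error grows with the square of the distance, sub-millimeter depth is most plausible at the short working distances typical of a wrist-mounted camera.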

<img width="560" height="315" alt="Image" src="https://github.com/user-attachments/assets/a009ab3f-88cf-4099-9f42-967a9edb02f9" />

## Boxer: Robust Lifting of Open-World 2D Bounding Boxes to 3D

- Open-vocabulary 2D detector (OWLv2) -> BoxerNet lifts each 2D detection to a 3D oriented bounding box -> Temporal fusion
- Requires camera intrinsics and the gravity direction
- Optional input: Semi-dense depth.
- Temporal fusion -> Hungarian algorithm
- https://facebookresearch.github.io/boxer/

<img width="1152" height="408" alt="Image" src="https://github.com/user-attachments/assets/5771e9e1-8552-424e-b57a-aa47b795cc24" />

![](https://facebookresearch.github.io/boxer/images/boxer_system.jpg)
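
The temporal-fusion step relies on Hungarian matching, which can be sketched as a frame-to-frame association of 3D boxes. The following is a minimal pure-Python sketch, not Boxer's implementation: the cost (Euclidean distance between box centers), the 0.5 m gate, and the function names `hungarian` and `associate` are all assumptions made for illustration.

```python
import math


def hungarian(cost):
    """Minimum-cost assignment for an n x m cost matrix with n <= m.

    Returns match, where match[i] is the column assigned to row i.
    """
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    u = [0.0] * (n + 1)          # row potentials
    v = [0.0] * (m + 1)          # column potentials
    p = [0] * (m + 1)            # p[j]: row matched to column j (1-based, 0 = free)
    way = [0] * (m + 1)          # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:       # reached a free column
                break
        while j0 != 0:           # flip matches along the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    match = [-1] * n
    for j in range(1, m + 1):
        if p[j] != 0:
            match[p[j] - 1] = j - 1
    return match


def associate(tracks, detections, max_dist=0.5):
    """Match track centers to detection centers; drop pairs beyond max_dist (assumed gate)."""
    if not tracks or not detections:
        return []
    cost = [[math.dist(t, d) for d in detections] for t in tracks]
    transposed = len(tracks) > len(detections)
    if transposed:               # hungarian() expects rows <= columns
        cost = [list(col) for col in zip(*cost)]
    pairs = []
    for r, c in enumerate(hungarian(cost)):
        t, d = (c, r) if transposed else (r, c)
        if math.dist(tracks[t], detections[d]) <= max_dist:
            pairs.append((t, d))
    return sorted(pairs)


tracks = [(0.0, 0.0, 1.0), (2.0, 0.0, 1.0)]
detections = [(2.1, 0.0, 1.0), (0.1, 0.0, 1.0), (5.0, 5.0, 5.0)]
print(associate(tracks, detections))
```

Unmatched detections (here the third one) would typically start new tracks, and matched pairs would be fused over time; how Boxer handles either case is not covered in these notes.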

## SLAM and VIO in Egocentric Data: Where Long-Horizon Tracking Breaks

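- Explains, from a SLAM/VIO perspective, why long-horizon data is hard to produce when developing egocentric VLA models
- 1. Accuracy matters a great deal for long-horizon tasks
- 2. Degenerate cases are quite hard to solve
- 3. Head motion is very fast, which makes tracking difficult
- 4. The IMU is very important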
- https://www.fpvlabs.ai/essays/slam-and-vio-in-egocentric-data-where-long-horizon-tracking-breaks 
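
One way to see why accuracy becomes critical over long horizons is to look at how a small uncorrected sensor error compounds. The sketch below is not taken from the essay: it double-integrates a constant accelerometer bias, as IMU-only dead reckoning would, and the bias value of 0.01 m/s² is a hypothetical figure.

```python
def position_drift(accel_bias: float, duration_s: float, dt: float = 0.01) -> float:
    """Double-integrate a constant accelerometer bias (semi-implicit Euler).

    Returns the accumulated position error in meters after duration_s seconds.
    """
    steps = int(round(duration_s / dt))
    velocity = 0.0
    position = 0.0
    for _ in range(steps):
        velocity += accel_bias * dt   # bias leaks into velocity linearly
        position += velocity * dt     # and into position quadratically
    return position


BIAS = 0.01  # m/s^2, hypothetical uncorrected accelerometer bias

for t in (1, 10, 60):
    print(f"{t:>3} s -> {position_drift(BIAS, t):.3f} m of drift")
```

The drift follows roughly `0.5 * bias * t^2`, so a bias that is negligible over one second amounts to about 18 m after a minute in this toy setting. This is why inertial measurements need to be continually corrected by visual constraints, and why degenerate visual conditions hurt long sequences the most.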

<img width="644" height="408" alt="Image" src="https://github.com/user-attachments/assets/c92f0827-dfb2-43eb-b70c-8ee7a7e27da6" />

## Rust robotics

- Python robotics code ported to Rust!
- 'Have a bite of Rust'
- https://github.com/rsasaki0109/rust_robotics

## EUPE - Efficient Universal Perception Encoder

- Meta's new foundation vision encoder
- https://github.com/facebookresearch/EUPE

![](https://github.com/facebookresearch/EUPE/blob/main/assets/teaser.png)


## Google Gemini ER1.6

- https://deepmind.google/blog/gemini-robotics-er-1-6/
- A step toward agentic physical AI
- Improves on Gemini 3.0 Flash and the previous Gemini ER model
- This model specializes in reasoning capabilities critical for robotics, including visual and spatial understanding, task planning and success detection. It acts as the high-level reasoning model for a robot, capable of executing tasks by natively calling tools like Google Search to find information, vision-language-action models (VLAs) or any other third-party user-defined functions.
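
The high-level-reasoner pattern described in the quote can be sketched as a planner that dispatches steps to registered tools. This is a generic illustration only, not the Gemini API: `run_plan`, the tool names, and the stand-in lambdas are all hypothetical, and the plan here is hard-coded rather than produced by a model.

```python
from typing import Callable, Dict, List, Tuple

ToolFn = Callable[[str], str]


def run_plan(plan: List[Tuple[str, str]], tools: Dict[str, ToolFn]) -> List[str]:
    """Execute (tool_name, argument) steps in order and collect each result."""
    results = []
    for name, arg in plan:
        if name not in tools:
            raise KeyError(f"unknown tool: {name}")
        results.append(tools[name](arg))
    return results


# Hypothetical stand-ins for a search tool and a VLA policy.
tools: Dict[str, ToolFn] = {
    "search": lambda query: f"search results for '{query}'",
    "vla": lambda instruction: f"executed: {instruction}",
}

# In a real system the reasoning model would generate this plan.
plan = [
    ("search", "how to open a jar"),
    ("vla", "twist the lid counter-clockwise"),
]
print(run_plan(plan, tools))
```

A success-detection step, which the quote also mentions, would sit after each tool call and decide whether to continue, retry, or replan.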

<img width="1920" height="1080" alt="Image" src="https://github.com/user-attachments/assets/1832f3dd-d170-4286-9d49-7e6e025f1322" />

<img width="2068" height="1200" alt="Image" src="https://github.com/user-attachments/assets/c0ad8ef4-05f9-41f2-a986-f5458de82491" />
