UniSHARP:
Universal Sharp Monocular View Synthesis

Meixi Song¹ · Dizhe Zhang¹,* · Hao Ren¹,² · Ruiyang Zhang¹,³ · Bo Du⁴ · Ming-Hsuan Yang⁵ · Lu Qi¹,⁴,*
¹Insta360 Research · ²Sun Yat-sen University · ³Beihang University · ⁴Wuhan University · ⁵University of California, Merced

UniSHARP extends SHARP-style photorealistic monocular view synthesis to universal camera systems. Given a single image from a perspective, wide-FoV, fisheye, or panoramic camera, UniSHARP predicts a 3D Gaussian representation and renders high-quality novel views.

🔨 Installation

Clone this repository and enter the project directory:

git clone https://github.com/Insta360-Research-Team/UniSHARP.git
cd UniSHARP

Create a fresh conda environment:

conda create -n unisharp python=3.12 -y
conda activate unisharp

Install PyTorch for your CUDA version. The code was smoke-tested with PyTorch 2.8 and torchvision 0.23:

pip install torch==2.8.0 torchvision==0.23.0 torchaudio==2.8.0

Install the remaining Python dependencies:

pip install -r requirements.txt

🧩 External Dependencies

UniK3D

UniSHARP uses UniK3D for universal camera ray and feature prediction. From the repository root, clone the official repository into UniSHARP/UniK3D:

git clone https://github.com/lpiccinelli-eth/UniK3D.git UniK3D

3DGEER

Fisheye rendering depends on the GEER CUDA rasterizer from 3DGEER. From the repository root, clone the repository into UniSHARP/3dgeer:

git clone https://github.com/boschresearch/3dgeer.git 3dgeer

The GEER rasterizer is required for the fisheye rendering paths. If you only use perspective or panoramic inference, it may not be needed.
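To confirm the directory layout described above, a small check can be run from the repository root. This is a minimal sketch: the helper name `missing_dependencies` is hypothetical, and only the directory names `UniK3D` and `3dgeer` come from the clone commands above.

```python
from pathlib import Path

# Directory names taken from the clone commands above.
REQUIRED_DIRS = ("UniK3D",)
FISHEYE_ONLY_DIRS = ("3dgeer",)


def missing_dependencies(repo_root, need_fisheye=True):
    """Return the expected dependency directories that are absent under repo_root."""
    root = Path(repo_root)
    wanted = REQUIRED_DIRS + (FISHEYE_ONLY_DIRS if need_fisheye else ())
    return [name for name in wanted if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_dependencies(".")
    print("missing:", missing if missing else "none")
```

Pass `need_fisheye=False` to skip the 3dgeer check when only perspective or panoramic inference is planned.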

🖼️ Dataset

The released dataset is hosted on Hugging Face:

Dataset: Insta360-Research/OmniRooms
Training manifests: Insta360-Research/OmniRooms/manifests/train
Validation manifests: Insta360-Research/OmniRooms/manifests/validation

OmniRooms is a simulated panoramic dataset designed for 3D reconstruction, particularly 3DGS tasks. It consists of 16 large indoor scenes, each containing multiple rooms, and 300k RGB images with corresponding depth, covering both small and large pose movements. The panoramas are collected in AirSim; OmniRooms-Wide is derived by projecting them into 130-degree equidistant fisheye views. For each anchor point on a 0.5 m voxel grid, we render one central camera and 29 cameras randomly sampled within a local axis-aligned 30 cm cube centered on the source camera. To isolate translation-induced synthesis, all cameras share a fixed orientation. Each frame is rendered as a 1024 x 2048 equirectangular projection (ERP) image.
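The sampling scheme and the fisheye derivation described above can be sketched in a few lines. This is an illustration under stated assumptions, not the dataset generation code: uniform sampling inside the cube, the source camera taken to be the central camera, a square fisheye image whose image circle touches the border, and the axis convention (y down, z forward) are all assumptions, and every function name is hypothetical.

```python
import math
import random

CUBE_SIDE_M = 0.30       # side of the local sampling cube (from the text)
NUM_EXTRA_CAMERAS = 29   # cameras sampled around each central camera
FISHEYE_FOV_DEG = 130.0  # full field of view of OmniRooms-Wide


def sample_camera_positions(anchor, rng):
    """Return the central position plus 29 positions inside the 30 cm cube.

    Uniform sampling is an assumption; the text only says "randomly sampled".
    """
    half = CUBE_SIDE_M / 2.0
    positions = [tuple(anchor)]
    for _ in range(NUM_EXTRA_CAMERAS):
        positions.append(tuple(c + rng.uniform(-half, half) for c in anchor))
    return positions


def equidistant_pixel_to_ray(u, v, size):
    """Map a pixel of a square equidistant fisheye image to a unit ray.

    Equidistant model: radius r = f * theta, with theta measured from the
    optical axis (+z). Returns None outside the 130-degree image circle.
    """
    theta_max = math.radians(FISHEYE_FOV_DEG / 2.0)
    focal = (size / 2.0) / theta_max  # assumed: image circle touches the border
    dx, dy = u - size / 2.0, v - size / 2.0
    r = math.hypot(dx, dy)
    theta = r / focal
    if theta > theta_max + 1e-9:
        return None
    if r == 0.0:
        return (0.0, 0.0, 1.0)
    scale = math.sin(theta) / r
    return (dx * scale, dy * scale, math.cos(theta))


def ray_to_erp_pixel(ray, height=1024, width=2048):
    """Return the (column, row) of the ERP pixel hit by a unit ray."""
    x, y, z = ray
    lon = math.atan2(x, z)                   # longitude in [-pi, pi]
    lat = math.asin(max(-1.0, min(1.0, y)))  # latitude in [-pi/2, pi/2]
    col = (lon / (2.0 * math.pi) + 0.5) * width
    row = (lat / math.pi + 0.5) * height
    return col, row
```

Chaining `equidistant_pixel_to_ray` and `ray_to_erp_pixel` gives, for each fisheye pixel, the panorama location to sample from. Because all cameras share a fixed orientation, no rotation is applied between the two steps.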

The code supports the following data sources and manifest aliases:

RealEstate10K
HM3D
OmniRooms
OmniRooms-Wide
WildRGB-D
DL3DV
ScanNet++ Fisheye
Replica and Tanks and Temples (validation-only protocols)

Training manifests use the names released under manifests/train:

dataset_manifests/
├── re10k_train_chunks.txt            
├── hm3d_train_scenes.txt            
├── omnirooms.txt              
├── wildrgbd_train_scenes.txt         
├── dl3dv_train_scenes.txt            
└── scanetpp_fisheye_train_scenes.txt

Validation manifests use the names released under manifests/validation:

validation_manifests/
├── re10k.txt                      
├── dl3dv.txt                         
├── hm3d.txt                          
├── omnirooms.txt                      
├── omnirooms_wide.txt              
├── wildrgbd.txt                     
├── scanetpp_fisheye.txt              
├── replica.txt                       
└── tat.txt
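For inspecting a manifest before training or validation, a small reader can help. The sketch below assumes each manifest is a plain-text file with one entry per line. The README does not specify the file format, so treat that as an assumption; `read_manifest` is a hypothetical helper.

```python
from pathlib import Path


def read_manifest(path):
    """Return the non-empty lines of a manifest file, stripped of whitespace.

    Assumes one entry (chunk or scene name) per line, which is not
    confirmed by the README.
    """
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    manifest = Path("validation_manifests/omnirooms.txt")
    if manifest.is_file():
        print(len(read_manifest(manifest)), "entries")
```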

🤝 Checkpoints

Training initializes the UniSHARP heads from scratch and loads the original pretrained UniK3D weights through the UniK3D loader. By default, the official launcher does not resume from a previous UniSHARP checkpoint.

Released UniSHARP checkpoints are available at Insta360-Research/Unisharp. Place a checkpoint anywhere on disk and pass the path to validation or inference:

CHECKPOINT=/path/to/pretrained_model.pt

🚀 Training

Use the official gt-override training launcher:

bash scripts/train.sh

Training outputs are saved under:

outputs/<run_name>/
├── config.json
├── losses.csv
├── step_XXXXXXX.pt
└── vis/
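To pick a checkpoint from a run directory for validation or inference, the highest step number can be located programmatically. This is a minimal sketch that assumes checkpoint files follow the `step_XXXXXXX.pt` naming shown above, with the step encoded as decimal digits; `latest_checkpoint` is a hypothetical helper, not part of the repository.

```python
import re
from pathlib import Path

# Matches the step_XXXXXXX.pt naming shown in the output tree above.
STEP_PATTERN = re.compile(r"^step_(\d+)\.pt$")


def latest_checkpoint(run_dir):
    """Return the step_*.pt file with the highest step number, or None."""
    best_step, best_path = -1, None
    for path in Path(run_dir).iterdir():
        match = STEP_PATTERN.match(path.name)
        if match and int(match.group(1)) > best_step:
            best_step, best_path = int(match.group(1)), path
    return best_path
```

Comparing the parsed integer rather than the file name keeps the result correct even if the zero padding differs between files.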

📊 Validation

Run validation with a checkpoint:

bash scripts/validate_unisharp.sh /path/to/step_XXXXXXX.pt

📒 Inference

Run single-image inference:

python scripts/infer_unisharp.py \
  --checkpoint /path/to/step_XXXXXXX.pt \
  --image /path/to/image.jpg \
  --out-dir outputs/inference

Run inference on a directory of images or an image list:

python scripts/infer_unisharp.py \
  --checkpoint /path/to/step_XXXXXXX.pt \
  --image-dir /path/to/images \
  --out-dir outputs/inference

If calibrated camera parameters are available, pass them through a JSON file. Without this file, the script predicts rays with UniK3D and fits the camera parameters automatically.
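To give an intuition for what fitting camera parameters from predicted rays can mean, the sketch below recovers pinhole intrinsics from pixel coordinates and ray directions by linear least squares, using the pinhole relation u = fx * x/z + cx and v = fy * y/z + cy. It is an illustration only: the fitting routine in the repository is not described in this README and may differ, and the names `fit_line` and `fit_pinhole` are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares fit of ys ~ slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0.0:
        raise ValueError("need rays with at least two distinct directions")
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


def fit_pinhole(pixels, rays):
    """Fit fx, fy, cx, cy from (u, v) pixels and (x, y, z) rays with z > 0."""
    x_norm = [x / z for x, _, z in rays]
    y_norm = [y / z for _, y, z in rays]
    fx, cx = fit_line(x_norm, [u for u, _ in pixels])
    fy, cy = fit_line(y_norm, [v for _, v in pixels])
    return {"fx": fx, "fy": fy, "cx": cx, "cy": cy}
```

The returned dictionary uses the same keys as the `intrinsics` object of the perspective camera JSON in this section. Wide-FoV and fisheye cameras need a model with distortion terms, which this pinhole sketch does not cover.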

Example perspective camera JSON:

{
  "camera": "perspective",
  "intrinsics": {
    "fx": 820.0,
    "fy": 820.0,
    "cx": 512.0,
    "cy": 384.0
  }
}

python scripts/infer_unisharp.py \
  --checkpoint /path/to/step_XXXXXXX.pt \
  --image /path/to/perspective.jpg \
  --camera-json /path/to/perspective_camera.json
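When calibration comes from another tool, the perspective JSON can be generated rather than written by hand. The sketch below reproduces the layout of the example above; the helper names are hypothetical, and `focal_from_fov` is an added convenience for calibrations given as a horizontal field of view, using the standard pinhole relation fx = (W / 2) / tan(hfov / 2).

```python
import json
import math


def focal_from_fov(image_width, hfov_deg):
    """Pinhole focal length in pixels from image width and horizontal FoV."""
    return (image_width / 2.0) / math.tan(math.radians(hfov_deg) / 2.0)


def perspective_camera(fx, fy, cx, cy):
    """Build the perspective camera dictionary in the layout shown above."""
    return {
        "camera": "perspective",
        "intrinsics": {
            "fx": float(fx),
            "fy": float(fy),
            "cx": float(cx),
            "cy": float(cy),
        },
    }


def write_camera_json(camera, path):
    """Write the camera dictionary to path as indented JSON."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(camera, handle, indent=2)
```

The resulting file can then be passed to `--camera-json` as in the command above.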

Example Fisheye624 camera JSON:

{
  "camera": "fisheye",
  "camera_params": [820.0, 820.0, 512.0, 384.0, 0.01, -0.001, 0.0, 0.0]
}

python scripts/infer_unisharp.py \
  --checkpoint /path/to/step_XXXXXXX.pt \
  --image /path/to/fisheye.jpg \
  --camera-json /path/to/fisheye_camera.json

For batched inference, the JSON can also contain per-image entries:

{
  "default": {
    "camera": "perspective",
    "intrinsics": [820.0, 820.0, 512.0, 384.0]
  },
  "images": {
    "panorama.jpg": {
      "camera": "panorama"
    },
    "fisheye.jpg": {
      "camera": "fisheye",
      "camera_params": [820.0, 820.0, 512.0, 384.0, 0.01, -0.001, 0.0, 0.0]
    }
  }
}
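A batched camera file like the one above implies a lookup per image. The sketch below shows one plausible resolution rule: match on the image file name, then fall back to the `default` entry. How `infer_unisharp.py` resolves entries is not stated in this README, so the rule and the helper names are assumptions.

```python
import json
from pathlib import Path


def load_camera_config(path):
    """Load a camera JSON file such as the batched example above."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def resolve_camera(config, image_path):
    """Return the per-image entry if present, else "default", else None.

    Matching on the bare file name is an assumption about the script.
    """
    name = Path(image_path).name
    images = config.get("images", {})
    if name in images:
        return images[name]
    return config.get("default")
```

With the example above, `panorama.jpg` would resolve to the panorama entry, while any unlisted image would receive the default perspective camera.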

🙏 Acknowledgement

This project builds on open-source work from:

SHARP for monocular Gaussian view synthesis
UniK3D for universal camera geometry and features
3DGEER for generic-camera Gaussian rasterization
gsplat for Gaussian splatting utilities

📝 Citation

@article{song2026unisharp,
  title={UniSHARP: Universal Sharp Monocular View Synthesis},
  author={Song, Meixi and Zhang, Dizhe and Ren, Hao and Zhang, Ruiyang and Du, Bo and Yang, Ming-Hsuan and Qi, Lu},
  journal={arXiv},
  year={2026}
}
