A PyTorch implementation of Neural Radiance Fields (NeRF) for synthesizing novel views of complex 3D scenes from a set of 2D images.
This project implements the core concepts from the paper "NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis" by Mildenhall et al. The implementation includes:
- Positional Encoding: High-frequency encoding of spatial coordinates and viewing directions
- Volume Rendering: Ray marching and alpha compositing for image synthesis
- View-Dependent Effects: Modeling view-dependent appearance using viewing directions
- 3D Scene Reconstruction: Training on multi-view images to learn implicit 3D representations
- 2D image fitting with positional encoding
- Full 3D NeRF implementation with volume rendering
- View-dependent color prediction
- Ray generation and sampling (see the sketch after this list)
- Hierarchical volume sampling
- Training utilities and visualization tools
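The sketch below illustrates how pinhole-camera rays and stratified depth samples are typically generated in a NeRF-style pipeline. It is a minimal, self-contained example; the helper names (`get_rays`, `sample_depths`) and conventions are illustrative and may differ from the functions in `vision/nerf.py`.

```python
import torch

def get_rays(height, width, focal, cam2world):
    """Per-pixel ray origins and directions for a pinhole camera
    (NeRF convention: camera looks down -z, y is up in camera space)."""
    i, j = torch.meshgrid(
        torch.arange(width, dtype=torch.float32),
        torch.arange(height, dtype=torch.float32),
        indexing="xy",
    )
    dirs = torch.stack(
        [(i - 0.5 * width) / focal, -(j - 0.5 * height) / focal, -torch.ones_like(i)],
        dim=-1,
    )
    # Rotate directions into world space and broadcast the camera origin to every pixel.
    rays_d = torch.sum(dirs[..., None, :] * cam2world[:3, :3], dim=-1)
    rays_o = cam2world[:3, 3].expand(rays_d.shape)
    return rays_o, rays_d

def sample_depths(near, far, num_samples):
    """Stratified sampling: one uniformly jittered depth per bin between near and far."""
    bin_edges = torch.linspace(near, far, num_samples + 1)[:-1]
    return bin_edges + torch.rand(num_samples) * (far - near) / num_samples
```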
Option 1: create the environment from conda/environment.yml:

cd conda
mamba env create -f environment.yml  # or use conda
conda activate nerf_proj
cd ..
pip install -e .

Option 2: create a fresh environment and install from conda/requirements.txt:

conda create -n nerf_proj python=3.11.0
conda activate nerf_proj
cd conda
pip install -r requirements.txt
cd ..
pip install -e .

Example: training a NeRF model on a set of multi-view images:

from vision.training import train_nerf
from vision.nerf import render_image_nerf
# Train the model
model, encode = train_nerf(
    images=train_images,
    tform_cam2world=camera_poses,
    cam_intrinsics=intrinsics,
    testpose=test_pose,
    testimg=test_image,
    height=height,
    width=width,
    near_thresh=2.0,
    far_thresh=6.0,
    device=device,
    num_frequencies=6,
    depth_samples_per_ray=64,
    lr=5e-4,
    num_iters=1000,
)
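After training, the imported `render_image_nerf` can be used to render novel views with the trained model; the `nerf_local.ipynb` and `nerf_colab.ipynb` notebooks walk through the full end-to-end training and evaluation flow.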
├── src/
│   └── vision/
│       ├── encoding.py          # Positional encoding and 2D model
│       ├── nerf.py              # NeRF model and rendering functions
│       ├── training.py          # Training pipeline
│       └── utils.py             # Utility functions
├── tests/                       # Unit tests
│   ├── test_encoding.py         # Tests for positional encoding
│   └── test_nerf.py             # Tests for NeRF core functions
├── conda/
│   ├── environment.yml          # Conda environment specification
│   └── requirements.txt         # Pip requirements
├── nerf_colab.ipynb             # Google Colab notebook for end-to-end training and evaluation
├── nerf_local.ipynb             # Local Jupyter notebook for end-to-end training and evaluation
├── output/                      # Training outputs, saved models, and demo videos
│   ├── nerf_model.pth           # Trained model checkpoint
│   └── view_dependence_yes.mp4  # 360° demo video
└── README.md
Maps input coordinates to a higher-dimensional space using sinusoidal functions:
γ(p) = [sin(2^0 π p), cos(2^0 π p), ..., sin(2^(L-1) π p), cos(2^(L-1) π p)]
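A minimal sketch of this encoding (argument names are illustrative; the project's `encoding.py` may differ, e.g. in whether the raw input is concatenated to the encoded features):

```python
import math
import torch

def positional_encoding(x, num_frequencies=6, include_input=True):
    """Encode each coordinate p as
    [sin(2^0 pi p), cos(2^0 pi p), ..., sin(2^(L-1) pi p), cos(2^(L-1) pi p)].

    x: tensor of shape (..., D) -> output of shape (..., 2 * L * D)
    (+ D extra channels if the raw input is kept).
    """
    features = [x] if include_input else []
    for level in range(num_frequencies):
        freq = (2.0 ** level) * math.pi
        features.append(torch.sin(freq * x))
        features.append(torch.cos(freq * x))
    return torch.cat(features, dim=-1)
```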
- 8-layer MLP with skip connections
- Inputs: encoded 3D position (and optionally encoded viewing direction)
- Outputs: RGB color and volume density (sigma)
- View-dependent rendering for realistic reflections and specularities
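The snippet below sketches such an architecture under the assumptions listed above (class, attribute, and layer names are illustrative, not the project's exact `nerf.py` implementation):

```python
import torch
import torch.nn as nn

class TinyNeRF(nn.Module):
    """Illustrative 8-layer NeRF MLP with a skip connection and a view-dependent color head."""

    def __init__(self, pos_dim, dir_dim, hidden=256, skip_at=4):
        super().__init__()
        self.skip_at = skip_at
        layers = []
        in_dim = pos_dim
        for i in range(8):
            if i == skip_at:
                in_dim += pos_dim  # re-inject the encoded position (skip connection)
            layers.append(nn.Linear(in_dim, hidden))
            in_dim = hidden
        self.layers = nn.ModuleList(layers)
        self.sigma_head = nn.Linear(hidden, 1)   # volume density (sigma)
        self.feature = nn.Linear(hidden, hidden)
        self.color_head = nn.Sequential(         # RGB conditioned on the viewing direction
            nn.Linear(hidden + dir_dim, hidden // 2), nn.ReLU(),
            nn.Linear(hidden // 2, 3), nn.Sigmoid(),
        )

    def forward(self, x_enc, d_enc):
        h = x_enc
        for i, layer in enumerate(self.layers):
            if i == self.skip_at:
                h = torch.cat([h, x_enc], dim=-1)
            h = torch.relu(layer(h))
        sigma = torch.relu(self.sigma_head(h))
        rgb = self.color_head(torch.cat([self.feature(h), d_enc], dim=-1))
        return rgb, sigma
```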
Classical volume rendering with numerical quadrature:
C(r) = Σ_i T(t_i) * α(t_i) * c(t_i)
where T(t_i) = Π_{j<i} (1 - α(t_j)) is the accumulated transmittance and α(t_i) = 1 - exp(-σ(t_i) Δt_i) is the alpha value at sample t_i, with Δt_i the distance to the next sample along the ray.
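A sketch of this compositing step for a batch of rays, assuming per-sample colors, densities, and shared sample depths (function and argument names are illustrative):

```python
import torch

def composite(rgb, sigma, depths):
    """Alpha-composite per-sample colors along each ray.

    rgb:    (num_rays, num_samples, 3) colors c(t_i)
    sigma:  (num_rays, num_samples)    densities sigma(t_i)
    depths: (num_samples,)             sample depths t_i along each ray
    """
    # Distances between adjacent samples; the last interval is treated as effectively infinite.
    deltas = depths[1:] - depths[:-1]
    deltas = torch.cat([deltas, torch.full((1,), 1e10)], dim=0)

    # alpha(t_i) = 1 - exp(-sigma(t_i) * delta_i)
    alpha = 1.0 - torch.exp(-sigma * deltas)

    # T(t_i) = prod_{j<i} (1 - alpha(t_j)): transmittance up to (but excluding) sample i.
    trans = torch.cumprod(1.0 - alpha + 1e-10, dim=-1)
    trans = torch.cat([torch.ones_like(trans[..., :1]), trans[..., :-1]], dim=-1)

    weights = trans * alpha                            # T(t_i) * alpha(t_i)
    color = (weights[..., None] * rgb).sum(dim=-2)     # C(r) = sum_i T(t_i) * alpha(t_i) * c(t_i)
    return color, weights
```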
The trained model can:
- Synthesize photorealistic novel views from arbitrary camera positions
- Capture view-dependent effects like specular highlights and reflections
- Reconstruct 3D geometry implicitly through volume density
A sample 360° novel view synthesis video is available in output/view_dependence_yes.mp4, demonstrating the model's ability to render realistic views with view-dependent lighting effects.
This project was originally based on a computer vision course assignment. The overall structure, testing infrastructure, and dataset format were provided by the course staff.
My contributions include:
- Implementing all core NeRF components: positional encoding, MLP architecture, ray generation, volume rendering, and compositing
- Developing the complete training pipeline with view-dependent rendering support
- Adding visualization utilities and interactive Jupyter notebooks for experimentation
- Rewriting documentation, README, and packaging to make the project standalone
- Refactoring code structure and removing course-specific markers for clarity
This project is for educational and research purposes.