CUDA Path Tracer
University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 3
- (TODO) YOUR NAME HERE
- Tested on: (TODO) Windows 22, i7-2222 @ 2.22GHz 22GB, GTX 222 222MB (Moore 2222 Lab)
(TODO: Your README)
DO NOT leave the README to the last minute! It is a crucial part of the project, and we will not be able to grade you without a good README.
Instructions (delete me)
This is now due Tuesday, September 29, at midnight (extended from Thursday, September 24).
Summary: In this project, you'll implement a CUDA-based path tracer capable of rendering globally-illuminated images very quickly. Since in this class we are concerned with working in GPU programming, performance, and the generation of actual beautiful images (and not with mundane programming tasks like I/O), this project includes base code for loading a scene description file, described below, and various other things that generally make up a framework for previewing and saving images.
The core renderer is left for you to implement. Finally, note that, while this base code is meant to serve as a strong starting point for a CUDA path tracer, you are not required to use it if you don't want to. You may also change any part of the base code as you please. This is YOUR project.
Recommendation: Every image you save should automatically get a different filename. Don't delete all of them! For the benefit of your README, keep a bunch of them around so you can pick a few to document your progress at the end.
- src/: C++/CUDA source files.
- scenes/: Example scene description files.
- img/: Renders of example scene description files. (These probably won't match precisely with yours.)
- external/: Includes and static libraries for 3rd party libraries.
Running the code
The main function requires a scene description file. Call the program with one as an argument.
If you are using Visual Studio, you can set this in the Debugging > Command Arguments section in the Project properties. Make sure you get the path right - read the console for errors.
- Esc to save an image and exit.
- Space to save an image. Watch the console for the output filename.
- W/A/S/D and R/F move the camera. Arrow keys rotate.
Ask on the mailing list for clarifications.
In this project, you are given code for:
- Loading and reading the scene description format
- Sphere and box intersection functions
- Support for saving images
- Working CUDA-GL interop for previewing your render while it's running
- A function which generates random screen noise (instead of an actual render).
You will need to implement the following features:
- Raycasting from the camera into the scene through an imaginary grid of pixels
- Implement simple antialiasing (by jittering rays within each pixel).
- Diffuse surfaces (using provided cosine-weighted scatter function) [PBRT 8.3].
- Perfectly specular-reflective (mirrored) surfaces (e.g. using
  - See notes on diffuse/specular in scatterRay and on imperfect specular below.
- Stream compaction optimization, using:
- NEWLY ADDED: Work-efficient stream compaction using shared memory across multiple blocks. (See GPU Gems 3, Chapter 39.)
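The first two required features (raycasting through a pixel grid and jittered antialiasing) can be sketched together. What follows is a minimal host-side C++ illustration, not the base code's implementation: `Vec3`, `Camera`, its field names, and `primaryRayDirection` are all hypothetical, and the project itself would use glm types inside a CUDA kernel. It assumes FOVY is the vertical half-angle, as in the scene file format.

```cpp
#include <cmath>

// Hypothetical minimal vector type; the project itself uses glm::vec3.
struct Vec3 { float x, y, z; };

static Vec3 add(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
static Vec3 scale(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
static Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}
static Vec3 normalize(Vec3 a) {
    float len = std::sqrt(a.x * a.x + a.y * a.y + a.z * a.z);
    return {a.x / len, a.y / len, a.z / len};
}

// Hypothetical camera description (field names are assumptions).
struct Camera {
    int resX, resY;     // image resolution in pixels
    float fovyHalfDeg;  // vertical field-of-view half-angle, in degrees
    Vec3 view, up;      // unit view direction and approximate up vector
};

// Direction of the primary ray through pixel (px, py).
// jitterX and jitterY are uniform random numbers in [0, 1); passing 0.5 for
// both sends the ray through the pixel center (no antialiasing).
Vec3 primaryRayDirection(const Camera& cam, int px, int py,
                         float jitterX, float jitterY) {
    const float kPi = 3.14159265358979f;
    Vec3 right = normalize(cross(cam.view, cam.up));
    Vec3 trueUp = cross(right, cam.view);

    // Half-extents of the image plane at distance 1 from the eye.
    float halfH = std::tan(cam.fovyHalfDeg * kPi / 180.0f);
    float halfW = halfH * cam.resX / cam.resY;

    // Map the jittered pixel position to [-1, 1] on both axes
    // (y is flipped so that py = 0 is the top row).
    float sx = 2.0f * (px + jitterX) / cam.resX - 1.0f;
    float sy = 1.0f - 2.0f * (py + jitterY) / cam.resY;

    Vec3 dir = add(cam.view,
                   add(scale(right, sx * halfW), scale(trueUp, sy * halfH)));
    return normalize(dir);
}
```

Drawing a fresh jitter pair each iteration and averaging the results over iterations is what produces the antialiasing.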
You are also required to implement at least 2 of the following features. If you find other good references for these features, share them! Extra credit: implement more features on top of the 2 required ones, with point value up to +20/100 at the grader's discretion (based on difficulty and coolness).
- NOW REQUIRED - NOT AN EXTRA: Work-efficient stream compaction (see above).
- These 2 smaller features:
  - Refraction (e.g. glass/water) [PBRT 8.2] with Fresnel effects using Schlick's approximation or more accurate methods [PBRT 8.5]. You can use glm::refract for Snell's law.
  - Physically-based depth-of-field (by jittering rays within an aperture) [PBRT 6.2.3].
  - Recommended but not required: non-perfect specular surfaces. (See below.) (INSTRUCTOR TODO: require in the future?)
- Texture mapping [PBRT 10.4].
- Bump mapping [PBRT 9.3]. (INSTRUCTOR TODO: combine with texture mapping, and possibly add displacement mapping.)
- Direct lighting (by taking a final ray directly to a random point on an emissive object acting as a light source). Or more advanced [PBRT 15.1.1].
- Some method of defining object motion, and motion blur by averaging samples at different times in the animation.
- Subsurface scattering [PBRT 5.6.2, 11.6].
- Arbitrary mesh loading and rendering (e.g. obj files). You can find these online or export them from your favorite 3D modeling application. With approval, you may use third-party OBJ loading code to bring the data into C++.
  - You can use the triangle intersection function
This 'extra features' list is not comprehensive. If you have a particular idea you would like to implement (e.g. acceleration structures, etc.), please contact us first.
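For the refraction feature, Schlick's approximation is small enough to show in full. This is a minimal sketch in plain C++; the function names `schlickReflectance` and `chooseReflection` are hypothetical, and how the result is wired into scatterRay is left open.

```cpp
#include <cmath>

// Schlick's approximation to the Fresnel reflectance.
// cosTheta: cosine of the angle between the incident ray and the surface
//           normal, in [0, 1].
// n1, n2:   indices of refraction on the incident and transmitted sides.
float schlickReflectance(float cosTheta, float n1, float n2) {
    float r0 = (n1 - n2) / (n1 + n2);
    r0 *= r0;  // reflectance at normal incidence
    return r0 + (1.0f - r0) * std::pow(1.0f - cosTheta, 5.0f);
}

// Probabilistically choose between reflecting and refracting.
// u is a uniform random number in [0, 1).
bool chooseReflection(float cosTheta, float n1, float n2, float u) {
    return u < schlickReflectance(cosTheta, n1, n2);
}
```

For air to glass (1.0 to 1.5) at normal incidence this gives a reflectance of 0.04, rising to 1 at grazing angles. Total internal reflection is not covered by this sketch and would need a separate check.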
For each extra feature, you must provide the following analysis:
- Overview write-up of the feature
- Performance impact of the feature
- If you did something to accelerate the feature, what did you do and why?
- Compare your GPU version of the feature to a HYPOTHETICAL CPU version (you don't have to implement it!) Does it benefit or suffer from being implemented on the GPU?
- How might this feature be optimized beyond your current implementation?
Base Code Tour
You'll be working in the following files. Look for important parts of the code:
CHECKITOUT. You'll have to implement parts labeled with
(But don't let these constrain you - you have free rein!)
- src/pathtrace.cu: path tracing kernels, device functions, and calling code
  - pathtraceInit initializes the path tracer state - it should copy scene data (e.g. geometry, materials) from
  - pathtraceFree frees memory allocated by
  - pathtrace performs one iteration of the rendering - it handles kernel launches, memory copies, transferring some data, etc.
    - See comments for a low-level path tracing recap.
- src/intersections.h: ray intersection functions
  - sphereIntersectionTest, which take in a ray and a geometry object and return various properties of the intersection.
- src/interactions.h: ray scattering functions
  - calculateRandomDirectionInHemisphere: a cosine-weighted random direction in a hemisphere. Needed for implementing diffuse surfaces.
  - scatterRay: this function should perform all ray scattering, and will call calculateRandomDirectionInHemisphere. See comments for details.
- src/main.cpp: you don't need to do anything here, but you can change the program to save .hdr image files, if you want (for postprocessing).
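To make "cosine-weighted random direction in a hemisphere" concrete, here is one common way to generate such a direction. It is a sketch of the general technique in plain C++, not a copy of the provided calculateRandomDirectionInHemisphere; `Vec3` and `cosineSampleHemisphere` are hypothetical names, and the provided function may build its tangent frame differently.

```cpp
#include <cmath>

// Hypothetical minimal vector type; the project itself uses glm::vec3.
struct Vec3 { float x, y, z; };

static Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}
static Vec3 normalize(Vec3 a) {
    float len = std::sqrt(a.x * a.x + a.y * a.y + a.z * a.z);
    return {a.x / len, a.y / len, a.z / len};
}

// Returns a unit direction in the hemisphere around `normal` (unit length),
// with probability density proportional to cos(theta).
// u1 and u2 are uniform random numbers in [0, 1).
Vec3 cosineSampleHemisphere(Vec3 normal, float u1, float u2) {
    const float kTwoPi = 6.28318530717959f;
    float sinTheta = std::sqrt(u1);         // inverts the CDF sin^2(theta) = u1
    float cosTheta = std::sqrt(1.0f - u1);
    float phi = kTwoPi * u2;

    // Pick a helper axis that cannot be parallel to the normal.
    Vec3 helper = (std::fabs(normal.x) < 0.57735f) ? Vec3{1.0f, 0.0f, 0.0f}
                                                   : Vec3{0.0f, 1.0f, 0.0f};
    Vec3 t = normalize(cross(normal, helper));  // first tangent
    Vec3 b = cross(normal, t);                  // second tangent

    float a = std::cos(phi) * sinTheta;
    float c = std::sin(phi) * sinTheta;
    return {cosTheta * normal.x + a * t.x + c * b.x,
            cosTheta * normal.y + a * t.y + c * b.y,
            cosTheta * normal.z + a * t.z + c * b.z};
}
```

Because the density already follows the cosine term, a diffuse bounce sampled this way can multiply the path's color by the surface color without an extra cosine factor.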
Generating random numbers
thrust::default_random_engine rng(hash(index));
thrust::uniform_real_distribution<float> u01(0, 1);
float result = u01(rng);
There is a convenience function for generating a random engine using a combination of index, iteration, and depth as the seed:
thrust::default_random_engine rng = random_engine(iter, index, depth);
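The idea behind seeding from iteration, index, and depth can be illustrated on the host with the standard library. This is a sketch under assumptions, not the base code: `hashSeed`, `makeSeededEngine`, and `nextUniform01` are hypothetical names, `std::minstd_rand` stands in for `thrust::default_random_engine`, and the bit-mixing hash and the way the three values are packed are one plausible choice that may differ from what the convenience function actually does.

```cpp
#include <cstdint>
#include <random>

// A widely used 32-bit integer mixing hash. It spreads nearby inputs
// (e.g. neighboring pixel indices) across the seed space.
inline std::uint32_t hashSeed(std::uint32_t a) {
    a = (a + 0x7ed55d16u) + (a << 12);
    a = (a ^ 0xc761c23cu) ^ (a >> 19);
    a = (a + 0x165667b1u) + (a << 5);
    a = (a + 0xd3a2646cu) ^ (a << 9);
    a = (a + 0xfd7046c5u) + (a << 3);
    a = (a ^ 0xb55a4f09u) ^ (a >> 16);
    return a;
}

// Builds an engine whose seed depends on the iteration, the pixel/path index,
// and the bounce depth, so each (iter, index, depth) triple gets its own stream.
// The packing below is an assumption made for illustration.
std::minstd_rand makeSeededEngine(std::uint32_t iter, std::uint32_t index,
                                  std::uint32_t depth) {
    std::uint32_t seed =
        hashSeed((1u << 31) | (depth << 22) | iter) ^ hashSeed(index);
    return std::minstd_rand(seed);
}

// Draws one uniform float from the engine.
float nextUniform01(std::minstd_rand& rng) {
    std::uniform_real_distribution<float> u01(0.0f, 1.0f);
    return u01(rng);
}
```

Seeding only from the index would give every iteration the same random numbers per pixel, so the image would never converge; including the iteration and depth avoids that.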
Imperfect specular lighting
In path tracing, like diffuse materials, specular materials are simulated using a probability distribution instead of computing the strength of a ray bounce based on angles.
Equations 7, 8, and 9 of GPU Gems 3, Chapter 20 give the formulas for generating a random specular ray. (Note that there is a typographical error: χ in the text = ξ in the formulas.)
Also see the notes in scatterRay for probability splits between diffuse/specular/other material types.
See also: PBRT 8.2.2.
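As a rough illustration of sampling a specular lobe, the sketch below assumes the usual Phong-lobe importance sampling form, θ = arccos(ξ1^(1/(n+1))) and φ = 2πξ2, where n is the specular exponent (SPECX in the scene format). Check it against the referenced equations before relying on it. `Vec3` and `samplePhongLobe` are hypothetical names, and the direction is produced in a local frame whose +Z axis would be aligned with the perfect mirror direction.

```cpp
#include <cmath>

// Hypothetical minimal vector type; the project itself uses glm::vec3.
struct Vec3 { float x, y, z; };

// Samples a unit direction around the local +Z axis with density proportional
// to cos^n(theta), where n is the specular exponent.
// xi1 and xi2 are uniform random numbers in [0, 1].
Vec3 samplePhongLobe(float specularExponent, float xi1, float xi2) {
    const float kTwoPi = 6.28318530717959f;
    // cos(theta) = xi1^(1 / (n + 1)); larger exponents push this toward 1,
    // i.e. a tighter highlight.
    float cosTheta = std::pow(xi1, 1.0f / (specularExponent + 1.0f));
    float sinTheta = std::sqrt(std::fmax(0.0f, 1.0f - cosTheta * cosTheta));
    float phi = kTwoPi * xi2;
    return {std::cos(phi) * sinTheta, std::sin(phi) * sinTheta, cosTheta};
}
```

The returned vector still has to be rotated so that +Z coincides with the mirror reflection of the incoming ray, and samples that end up below the surface need to be handled (for example, rejected).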
Handling Long-Running CUDA Threads
By default, your GPU driver will probably kill a CUDA kernel if it runs for more than 5 seconds. There's a way to disable this timeout. Just beware of infinite loops - they may lock up your computer.
The easiest way to disable TDR for CUDA programming, assuming you have the NVIDIA Nsight tools installed, is to open the Nsight Monitor, click on "Nsight Monitor options", and under "General" set "WDDM TDR enabled" to false. This will change the registry setting for you. Close and reboot. Any change to the TDR registry setting won't take effect until you reboot. (Source: Stack Overflow.)
Notes on GLM
This project uses GLM for linear algebra.
On NVIDIA cards pre-Fermi (pre-DX12), you may have issues with mat4-vec4
multiplication. If you have one of these cards, be careful! If you have issues,
you might need to grab
multiplyMV from the
Fall 2014 project.
Let us know if you need to do this.
Scene File Format
This project uses a custom scene description format. Scene files are flat text
files that describe all geometry, materials, lights, cameras, and render
settings inside of the scene. Items in the format are delimited by new lines,
and comments can be added using C-style // comments.
Materials are defined in the following fashion:
- MATERIAL (material ID) //material header
- RGB (float r) (float g) (float b) //diffuse color
- SPECX (float specx) //specular exponent
- SPECRGB (float r) (float g) (float b) //specular color
- REFL (bool refl) //reflectivity flag, 0 for no, 1 for yes
- REFR (bool refr) //refractivity flag, 0 for no, 1 for yes
- REFRIOR (float ior) //index of refraction for Fresnel effects
- EMITTANCE (float emittance) //the emittance strength of the material. Material is a light source iff emittance > 0.
Cameras are defined in the following fashion:
- CAMERA //camera header
- RES (float x) (float y) //resolution
- FOVY (float fovy) //vertical field of view half-angle. The horizontal angle is calculated from this and the resolution
- ITERATIONS (float iterations) //how many iterations to refine the image
- DEPTH (int depth) //maximum depth (number of times the path will bounce)
- FILE (string filename) //file to output render to upon completion
- EYE (float x) (float y) (float z) //camera's position in worldspace
- VIEW (float x) (float y) (float z) //camera's view direction
- UP (float x) (float y) (float z) //camera's up vector
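The FOVY comment says the horizontal angle is derived from the vertical half-angle and the resolution. A minimal sketch of that calculation, assuming square pixels; the function name `horizontalHalfAngleDeg` is hypothetical:

```cpp
#include <cmath>

// Horizontal field-of-view half-angle (degrees) from the vertical half-angle
// (degrees) and the image resolution, assuming square pixels.
float horizontalHalfAngleDeg(float fovyHalfDeg, int resX, int resY) {
    const float kPi = 3.14159265358979f;
    // Half-height of the image plane at distance 1 from the eye.
    float halfH = std::tan(fovyHalfDeg * kPi / 180.0f);
    // Scale by the aspect ratio to get the half-width.
    float halfW = halfH * static_cast<float>(resX) / static_cast<float>(resY);
    return std::atan(halfW) * 180.0f / kPi;
}
```

For a square image the two half-angles are equal; for a 2:1 image with FOVY 45, the horizontal half-angle is atan(2), about 63.4 degrees.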
Objects are defined in the following fashion:
- OBJECT (object ID) //object header
- (cube OR sphere OR mesh) //type of object, can be either "cube", "sphere", or "mesh". Note that cubes and spheres are unit sized and centered at the origin.
- material (material ID) //material to assign this object
- TRANS (float transx) (float transy) (float transz) //translation
- ROTAT (float rotationx) (float rotationy) (float rotationz) //rotation
- SCALE (float scalex) (float scaley) (float scalez) //scale
Two examples are provided in the
scenes/ directory: a single emissive sphere,
and a simple Cornell box made using cubes for walls and lights and a sphere in
Third-Party Code Policy
- Use of any third-party code must be approved by asking on our Google Group.
- If it is approved, all students are welcome to use it. Generally, we approve use of third-party code that is not a core part of the project. For example, for the path tracer, we would approve using a third-party library for loading models, but would not approve copying and pasting a CUDA function for doing refraction.
- Third-party code MUST be credited in README.md.
- Using third-party code without its approval, including using another student's code, is an academic integrity violation, and will, at minimum, result in you receiving an F for the semester.
Please see: TIPS FOR WRITING AN AWESOME README
- Sell your project.
- Assume the reader has a little knowledge of path tracing - don't go into detail explaining what it is. Focus on your project.
- Don't talk about it like it's an assignment - don't say what is and isn't "extra" or "extra credit." Talk about what you accomplished.
- Use this to document what you've done.
- DO NOT leave the README to the last minute! It is a crucial part of the project, and we will not be able to grade you without a good README.
- This is a renderer, so include images that you've made!
- Be sure to back your claims for optimization with numbers and comparisons.
- If you reference any other material, please provide a link to it.
- You will not be graded on how fast your path tracer runs, but getting close to real-time is always nice!
- If you have a fast GPU renderer, it is very good to showcase this with a video to show interactivity. If you do so, please include a link!
- Stream compaction helps most after a few bounces. Print and plot the effects of stream compaction within a single iteration (i.e. the number of unterminated rays after each bounce) and evaluate the benefits you get from stream compaction.
- Compare scenes which are open (like the given Cornell box) and closed (i.e. no light can escape the scene). Again, compare the performance effects of stream compaction! Remember, stream compaction only affects rays which terminate, so what might you expect?
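For the stream compaction analysis, it helps to be clear about what is being counted. The sketch below is a host-side analogue using the standard library, not a GPU implementation: `PathSegment`, its fields, and `compactLivePaths` are hypothetical, and on the GPU this step would be your own work-efficient compaction kernel.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical per-ray state; field names are assumptions.
struct PathSegment {
    int pixelIndex;
    int remainingBounces;  // 0 means the path has terminated
};

// Moves the still-live paths among the first numPaths entries to the front
// (preserving their order) and returns how many of them are live.
// Terminated paths stay in the tail, so their results can still be gathered.
std::size_t compactLivePaths(std::vector<PathSegment>& paths,
                             std::size_t numPaths) {
    auto first = paths.begin();
    auto last = first + static_cast<std::ptrdiff_t>(numPaths);
    auto liveEnd = std::stable_partition(
        first, last,
        [](const PathSegment& p) { return p.remainingBounces > 0; });
    return static_cast<std::size_t>(liveEnd - first);
}
```

Recording the returned count after every bounce of one iteration gives the "unterminated rays per bounce" series to plot, and the next bounce only needs to launch work for that many paths.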
If you have modified any of the
CMakeLists.txt files at all (aside from the
SOURCE_FILES), you must test that your project can build in Moore
100B/C. Beware of any build issues discussed on the Google Group.
- Open a GitHub pull request so that we can see that you have finished.
The title should be "Submission: YOUR NAME".
- In the body of the pull request, include a link to your repository.
- Send an email to the TA (gmail: kainino1+cis565@) with:
  - Subject: in the form of [CIS565] Project N: PENNKEY.
  - Direct link to your pull request on GitHub.
  - Estimate the amount of time you spent on the project.
  - If there were any outstanding problems, or if you did any extra work, briefly explain.
  - Feedback on the project itself, if any.
- [PBRT] Physically Based Rendering, Second Edition: From Theory To Implementation. Pharr, Matt and Humphreys, Greg. 2010.