GPU Programming with C++ and CUDA, First Edition

This is the code repository for GPU Programming with C++ and CUDA, First Edition, published by Packt.

Uncover effective techniques for writing efficient GPU-parallel C++ applications

Paulo Motta

Free PDF       Graphic Bundle       Amazon      

About the book


Written by Paulo Motta, a senior researcher with decades of experience, this comprehensive GPU programming book is an essential guide for leveraging the power of parallelism to accelerate your computations. The first section introduces the concept of parallelism and provides practical advice on how to think about and utilize it effectively. Starting with a basic GPU program, you then gain hands-on experience in managing the device. This foundational knowledge is then expanded by parallelizing the program to illustrate how GPUs enhance performance.

The second section explores GPU architecture and implementation strategies for parallel algorithms, and offers practical insights into optimizing resource usage for efficient execution. In the final section, you will explore advanced topics such as utilizing CUDA streams. You will also learn how to package and distribute GPU-accelerated libraries for the Python ecosystem, extending the reach and impact of your work.

Combining expert insight with real-world problem solving, this book is a valuable resource for developers and researchers aiming to harness the full potential of GPU computing. The blend of theoretical foundations, practical programming techniques, and advanced optimization strategies it offers is sure to help you succeed in the fast-evolving field of GPU programming.

Key Learnings

  • Manage GPU devices and accelerate your applications
  • Apply parallelism effectively using CUDA and C++
  • Choose between existing libraries and custom GPU solutions
  • Package GPU code into libraries for use with Python
  • Explore advanced topics such as CUDA streams
  • Implement optimization strategies for resource-efficient execution

Chapters

  1. Introduction to Parallel Programming
  2. Setting Up Your Development Environment
  3. Hello CUDA
  4. Hello Again, but in Parallel
  5. A Closer Look into the World of GPUs
  6. Parallel Algorithms with CUDA
  7. Performance Strategies
  8. Overlaying Multiple Operations
  9. Exposing Your Code to Python
  10. Exploring Existing GPU Models

Requirements for this book

You should be comfortable writing computer programs in C++, and basic knowledge of operating systems will help you understand some of the more advanced concepts, since we have to manage device communication.

Software/hardware covered in the book:
  • NVIDIA GPU, or access to a cloud-based VM with an NVIDIA GPU
  • CUDA Toolkit 12
  • Docker 27.0
  • VS Code 1.92
  • CMake 3.16
  • g++ 9.4
  • Python 3.8
  • Nsight Compute 2023.3

Operating system requirements:
  • Ubuntu Linux 20 or later, with the NVIDIA video driver installed
In Chapter 2, we discuss options for configuring the development environment. Some of the software that we need is installed automatically if you elect to use the Docker-based development environment.
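If you set up the environment manually instead, a quick sanity check like the following can confirm the tools above are on your PATH (this script is an illustration, not from the book; version numbers on your machine may differ from those listed):

```shell
#!/bin/sh
# Print one line per tool: its name and the first line of its --version
# output, or MISSING if the tool is not installed.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    printf '%-8s %s\n' "$1" "$("$1" --version 2>/dev/null | head -n 1)"
  else
    printf '%-8s MISSING\n' "$1"
  fi
}

for tool in nvcc g++ cmake python3 docker; do
  check_tool "$tool"
done
```

Running it inside the Docker-based development environment should show every tool present; on a fresh host, MISSING lines tell you what still needs to be installed.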

Code conventions

We use the following conventions:

  1. camelCase for names of functions, kernels, and variables
  2. PascalCase (CamelCase with an uppercase first letter) for structs
  3. snake_case for file names
  4. When two functions perform the same computation on the CPU and the GPU, we use the same name with a Cpu/Kernel suffix, e.g., computeSomethingCpu / computeSomethingKernel
  5. When we allocate buffers with similar names, we use an h_ prefix for the host side and a d_ prefix for the device side.
    • float* h_A;
    • float* d_A;
  6. When comparing results, we add a _CPU or _GPU suffix to the variable name.
    • float* h_C_GPU; // the C array or matrix computed on the GPU and copied back to the host
    • float* h_C_CPU; // the C array or matrix computed on the CPU
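Putting the conventions together, a minimal sketch might look like this (the addVectors kernel and the sizes here are hypothetical examples, not code from the book):

```cuda
#include <cassert>
#include <cstdio>
#include <cuda_runtime.h>

// Convention 4: same name, Kernel suffix for the GPU version.
__global__ void addVectorsKernel(const float* d_A, const float* d_B,
                                 float* d_C, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) d_C[i] = d_A[i] + d_B[i];
}

// Convention 4: same name, Cpu suffix for the CPU version.
void addVectorsCpu(const float* h_A, const float* h_B, float* h_C, int n) {
  for (int i = 0; i < n; ++i) h_C[i] = h_A[i] + h_B[i];
}

int main() {
  const int n = 1 << 10;
  const size_t bytes = n * sizeof(float);

  // Convention 5: h_ prefix for host buffers; convention 6: _CPU/_GPU
  // suffixes for the two result copies we compare.
  float* h_A = new float[n];
  float* h_B = new float[n];
  float* h_C_CPU = new float[n];
  float* h_C_GPU = new float[n];
  for (int i = 0; i < n; ++i) { h_A[i] = i; h_B[i] = 2.0f * i; }

  // Convention 5: d_ prefix for device buffers.
  float *d_A, *d_B, *d_C;
  cudaMalloc(&d_A, bytes);
  cudaMalloc(&d_B, bytes);
  cudaMalloc(&d_C, bytes);
  cudaMemcpy(d_A, h_A, bytes, cudaMemcpyHostToDevice);
  cudaMemcpy(d_B, h_B, bytes, cudaMemcpyHostToDevice);

  addVectorsKernel<<<(n + 255) / 256, 256>>>(d_A, d_B, d_C, n);
  cudaMemcpy(h_C_GPU, d_C, bytes, cudaMemcpyDeviceToHost);

  addVectorsCpu(h_A, h_B, h_C_CPU, n);
  for (int i = 0; i < n; ++i) assert(h_C_CPU[i] == h_C_GPU[i]);
  std::printf("results match\n");

  cudaFree(d_A); cudaFree(d_B); cudaFree(d_C);
  delete[] h_A; delete[] h_B; delete[] h_C_CPU; delete[] h_C_GPU;
  return 0;
}
```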

Get to know the author

Paulo Motta completed his PhD in Computer Science, with an emphasis on parallel systems, at PUC-Rio in 2011. He is currently a Senior Research Software Development Engineer at Microsoft and a postdoctoral researcher on quantum walk simulations with Hiperwalk at the National Scientific Computing Laboratory in Brazil. Paulo is a senior member of the IEEE Computer Society with over 25 years' experience in software development and 9 years' experience as a university professor.
