GPU Programming with C++ and CUDA, First Edition

This is the code repository for GPU Programming with C++ and CUDA, First Edition, published by Packt.

Uncover effective techniques for writing efficient GPU-parallel C++ applications

Paulo Motta

Free PDF       Graphic Bundle       Amazon      

About the book


Written by Paulo Motta, a senior researcher with decades of experience, this comprehensive GPU programming book is an essential guide for leveraging the power of parallelism to accelerate your computations. The first section introduces the concept of parallelism and provides practical advice on how to think about and utilize it effectively. Starting with a basic GPU program, you then gain hands-on experience in managing the device. This foundational knowledge is then expanded by parallelizing the program to illustrate how GPUs enhance performance.

The second section explores GPU architecture and implementation strategies for parallel algorithms, and offers practical insights into optimizing resource usage for efficient execution. In the final section, you will explore advanced topics such as utilizing CUDA streams. You will also learn how to package and distribute GPU-accelerated libraries for the Python ecosystem, extending the reach and impact of your work.

Combining expert insight with real-world problem solving, this book is a valuable resource for developers and researchers aiming to harness the full potential of GPU computing. The blend of theoretical foundations, practical programming techniques, and advanced optimization strategies it offers is sure to help you succeed in the fast-evolving field of GPU programming.

Key Learnings

  • Manage GPU devices and accelerate your applications
  • Apply parallelism effectively using CUDA and C++
  • Choose between existing libraries and custom GPU solutions
  • Package GPU code into libraries for use with Python
  • Explore advanced topics such as CUDA streams
  • Implement optimization strategies for resource-efficient execution

Chapters

  1. Introduction to Parallel Programming
  2. Setting Up Your Development Environment
  3. Hello CUDA
  4. Hello Again, but in Parallel
  5. A Closer Look into the World of GPUs
  6. Parallel Algorithms with CUDA
  7. Performance Strategies
  8. Overlaying Multiple Operations
  9. Exposing Your Code to Python
  10. Exploring Existing GPU Models

Requirements for this book

You should be comfortable writing computer programs in C++, and basic knowledge of operating systems will help you understand some of the more advanced concepts, since we have to manage device communication.

Software/hardware covered in the book:
  • NVIDIA GPU, or access to a cloud-based VM with an NVIDIA GPU
  • CUDA Toolkit 12
  • Docker 27.0
  • VS Code 1.92
  • CMake 3.16
  • g++ 9.4
  • Python 3.8
  • Nsight Compute 2023.3

Operating system requirements:
  • Ubuntu Linux 20 or later, with the NVIDIA video driver installed
In Chapter 2, we discuss options for configuring the development environment. Some of the software that we need is installed automatically if you elect to use the Docker-based development environment.
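If you set up the environment manually instead, a quick sanity check like the following can confirm the tools above are on your PATH (this script is an illustration, not from the book; version numbers on your machine may differ from those listed):

```shell
#!/bin/sh
# Print one line per tool: its name and the first line of its --version
# output, or MISSING if the tool is not installed.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    printf '%-8s %s\n' "$1" "$("$1" --version 2>/dev/null | head -n 1)"
  else
    printf '%-8s MISSING\n' "$1"
  fi
}

for tool in nvcc g++ cmake python3 docker; do
  check_tool "$tool"
done
```

Running it inside the Docker-based development environment should show every tool present; on a fresh host, MISSING lines tell you what still needs to be installed.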

Code conventions

We use the following conventions:

  1. camelCase for names of functions, kernels, and variables
  2. PascalCase (CamelCase with an uppercase first letter) for structs
  3. snake_case for file names
  4. When two functions perform the same computation on the CPU and the GPU, we use the same name with a Cpu/Kernel suffix, e.g., computeSomethingCpu / computeSomethingKernel
  5. When we allocate buffers with similar names, we use an h_ prefix for the host side and a d_ prefix for the device side.
    • float* h_A;
    • float* d_A;
  6. When comparing results, we add a _CPU or _GPU suffix to the variable name.
    • float* h_C_GPU; // the C array or matrix computed on the GPU and copied back to the host
    • float* h_C_CPU; // the C array or matrix computed on the CPU
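Putting the conventions together, a minimal sketch might look like this (the addVectors kernel and the sizes here are hypothetical examples, not code from the book):

```cuda
#include <cassert>
#include <cstdio>
#include <cuda_runtime.h>

// Convention 4: same name, Kernel suffix for the GPU version.
__global__ void addVectorsKernel(const float* d_A, const float* d_B,
                                 float* d_C, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) d_C[i] = d_A[i] + d_B[i];
}

// Convention 4: same name, Cpu suffix for the CPU version.
void addVectorsCpu(const float* h_A, const float* h_B, float* h_C, int n) {
  for (int i = 0; i < n; ++i) h_C[i] = h_A[i] + h_B[i];
}

int main() {
  const int n = 1 << 10;
  const size_t bytes = n * sizeof(float);

  // Convention 5: h_ prefix for host buffers; convention 6: _CPU/_GPU
  // suffixes for the two result copies we compare.
  float* h_A = new float[n];
  float* h_B = new float[n];
  float* h_C_CPU = new float[n];
  float* h_C_GPU = new float[n];
  for (int i = 0; i < n; ++i) { h_A[i] = i; h_B[i] = 2.0f * i; }

  // Convention 5: d_ prefix for device buffers.
  float *d_A, *d_B, *d_C;
  cudaMalloc(&d_A, bytes);
  cudaMalloc(&d_B, bytes);
  cudaMalloc(&d_C, bytes);
  cudaMemcpy(d_A, h_A, bytes, cudaMemcpyHostToDevice);
  cudaMemcpy(d_B, h_B, bytes, cudaMemcpyHostToDevice);

  addVectorsKernel<<<(n + 255) / 256, 256>>>(d_A, d_B, d_C, n);
  cudaMemcpy(h_C_GPU, d_C, bytes, cudaMemcpyDeviceToHost);

  addVectorsCpu(h_A, h_B, h_C_CPU, n);
  for (int i = 0; i < n; ++i) assert(h_C_CPU[i] == h_C_GPU[i]);
  std::printf("results match\n");

  cudaFree(d_A); cudaFree(d_B); cudaFree(d_C);
  delete[] h_A; delete[] h_B; delete[] h_C_CPU; delete[] h_C_GPU;
  return 0;
}
```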

Get to know the author

Paulo Motta completed his PhD in Computer Science, with an emphasis on parallel systems, at PUC-Rio in 2011. He is currently a Senior Research Software Development Engineer at Microsoft and a postdoctoral researcher on quantum walk simulations with Hiperwalk at the National Scientific Computing Laboratory in Brazil. Paulo is a senior member of the IEEE Computer Society with over 25 years' experience in software development and 9 years' experience as a university professor.
