# Next Steps


In this section, we outline how to analyze and optimize GPU applications using NVIDIA's profiling tools, and discuss patterns and considerations for further performance improvements.

## Profiling and Optimization

The instructions for first steps in profiling and optimizing GPU applications are summarized in the [Profiling & Optimization](./profiling-optimization.ipynb) notebook.

## Computational Patterns

When selecting a GPU programming approach, it is crucial to identify the application's *computational patterns* and check how well each candidate approach supports them:
* Can the pattern be implemented at all?
* Can it be implemented concisely?
* Can it be implemented efficiently?

One common computational pattern is the *reduction*, which is challenging to implement efficiently on GPUs because many threads must combine their partial results into a single value.
The [reductions](./reductions.ipynb) notebook compares the degree of support the different GPU programming approaches offer for this pattern.
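
To make the pattern concrete, here is a minimal sketch of a dot product written as a sum reduction. It assumes OpenMP target offloading as the programming approach, which is only one of several options and not necessarily how the notebook implements it; the function name `dot` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Dot product of x and y, expressed as a sum reduction (illustrative sketch).
double dot(const std::vector<double>& x, const std::vector<double>& y) {
    assert(x.size() == y.size());

    // Raw pointers are used because array sections in map clauses need them.
    const double* px = x.data();
    const double* py = y.data();
    const std::size_t n = x.size();

    double sum = 0.0;
    // reduction(+ : sum) gives every thread a private partial sum and
    // combines all partial sums into `sum` when the loop finishes.
    // map(to : ...) copies the inputs to the device before the loop runs.
    #pragma omp target teams distribute parallel for reduction(+ : sum) \
        map(to : px[0:n], py[0:n]) map(tofrom : sum)
    for (std::size_t i = 0; i < n; ++i) {
        sum += px[i] * py[i];
    }
    return sum;
}
```

Without OpenMP enabled at compile time, the pragma is ignored and the loop runs serially on the host, which is convenient for checking correctness. Approaches that lack a built-in reduction construct typically require a hand-written combination of atomics or tree-based partial sums instead.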

## Beyond 1D

Many algorithms use multidimensional iteration spaces and data structures, not just 1D.
When choosing a GPU programming approach, consider:
* Can multidimensional iteration spaces be parallelized intuitively?
* Can the thread hierarchy be multidimensional? This can improve data reuse for neighborhood-based access patterns.
* Is there support for multidimensional data structures?

This repository includes a 2D matrix addition example, parallelized with the techniques discussed in this workshop, which can serve as a starting point.
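
As a rough illustration of the first and third questions, the following sketch adds two row-major matrices over a 2D iteration space. It is not the repository's example; it assumes OpenMP target offloading, and the name `add_2d` and the flattened storage layout are choices made here for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// c = a + b for matrices with ny rows and nx columns, stored row-major
// in flat vectors (illustrative sketch).
void add_2d(const std::vector<double>& a, const std::vector<double>& b,
            std::vector<double>& c, std::size_t nx, std::size_t ny) {
    const std::size_t n = nx * ny;
    assert(a.size() == n && b.size() == n && c.size() == n);

    const double* pa = a.data();
    const double* pb = b.data();
    double* pc = c.data();

    // collapse(2) merges both loops into one iteration space of nx * ny
    // iterations, so the parallelism is not limited to the outer loop.
    #pragma omp target teams distribute parallel for collapse(2) \
        map(to : pa[0:n], pb[0:n]) map(from : pc[0:n])
    for (std::size_t y = 0; y < ny; ++y) {
        for (std::size_t x = 0; x < nx; ++x) {
            const std::size_t i = y * nx + x;  // row-major linear index
            pc[i] = pa[i] + pb[i];
        }
    }
}
```

Note that `collapse` flattens the iteration space rather than exposing a 2D thread hierarchy. Approaches such as CUDA or HIP let you launch 2D thread blocks directly, which is what enables the data reuse mentioned above for neighborhood-based access patterns.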

## Interoperability

Another important consideration is interoperability.
Although the details are beyond the scope of this workshop, many GPU programming approaches provide interfaces to and from CUDA or HIP on their respective platforms.

## Multi-GPU

Scaling workloads to multiple GPUs and even multiple GPU-equipped nodes is crucial in many HPC applications.
This requires mechanisms for targeting different GPUs on a node and interoperability with distributed memory solutions such as MPI.
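
A minimal single-node sketch of targeting different GPUs might look as follows. It assumes OpenMP target offloading, where `omp_get_num_devices()` reports the available devices and the `device` clause selects one; the helper names `Range`, `chunk`, and `scale_multi_device` are hypothetical. For simplicity, the devices are used one after another here, whereas a real code would launch the chunks asynchronously.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Half-open index range [begin, end) assigned to one device.
struct Range {
    std::size_t begin;
    std::size_t end;
};

// Split n elements into `parts` contiguous chunks of nearly equal size;
// the first n % parts chunks receive one extra element.
Range chunk(std::size_t n, std::size_t parts, std::size_t p) {
    assert(parts > 0 && p < parts);
    const std::size_t base = n / parts;
    const std::size_t extra = n % parts;
    const std::size_t begin = p * base + std::min(p, extra);
    return {begin, begin + base + (p < extra ? 1 : 0)};
}

// Scale v in place, giving each available device one chunk.
void scale_multi_device(std::vector<double>& v, double factor) {
    std::size_t devices = 1;  // fall back to a single (host) device
#ifdef _OPENMP
    devices = static_cast<std::size_t>(std::max(1, omp_get_num_devices()));
#endif
    double* pv = v.data();
    for (std::size_t d = 0; d < devices; ++d) {
        const Range r = chunk(v.size(), devices, d);
        const std::size_t begin = r.begin;
        const std::size_t len = r.end - r.begin;
        const int dev = static_cast<int>(d);
        // Only this device's chunk is transferred to and from device `dev`.
        #pragma omp target teams distribute parallel for device(dev) \
            map(tofrom : pv[begin:len])
        for (std::size_t i = begin; i < begin + len; ++i) {
            pv[i] *= factor;
        }
    }
}
```

Across nodes, the same decomposition idea applies one level up: each MPI rank owns a part of the data and drives the GPUs of its node.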

## Next Step

Proceed to the [programming challenge](./programming-challenge.ipynb) notebook to apply what you've learned.