# Objectives

- To learn the main avenues and developer resources for GPU computing
  - Where CUDA C fits in the big picture

<hr style="height:2px">

# 3 Ways to Accelerate Applications

![alt tag](img/3.png)
<hr style="height:2px">

# Libraries: Easy, High-Quality Acceleration

- **Ease of use**: Using libraries enables GPU acceleration without in-depth knowledge of GPU programming
- **“Drop-in”**: Many GPU-accelerated libraries follow standard APIs, thus enabling acceleration with minimal code changes
- **Quality**: Libraries offer high-quality implementations of functions encountered in a broad range of applications
<hr style="height:2px">

# GPU Accelerated Libraries

![alt tag](img/5.png)
<hr style="height:2px">

# Vector Addition in Thrust


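```c++
#include <thrust/copy.h>
#include <thrust/device_vector.h>
#include <thrust/functional.h>
#include <thrust/transform.h>

// Inside a function: hostInput1 and hostInput2 are host float arrays
// holding inputLength elements each.
// Allocate device storage for both inputs and the output.
thrust::device_vector<float> deviceInput1(inputLength);
thrust::device_vector<float> deviceInput2(inputLength);
thrust::device_vector<float> deviceOutput(inputLength);

// Copy the host inputs to the device.
thrust::copy(hostInput1, hostInput1 + inputLength,
             deviceInput1.begin());
thrust::copy(hostInput2, hostInput2 + inputLength,
             deviceInput2.begin());

// Element-wise addition on the device.
thrust::transform(deviceInput1.begin(), deviceInput1.end(),
                  deviceInput2.begin(), deviceOutput.begin(),
                  thrust::plus<float>());
```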
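
Thrust's interface is modeled on the C++ standard library, which is part of what makes it feel "drop-in". For comparison, here is a rough CPU-only sketch of the same element-wise addition using `std::transform` and `std::plus`. The helper name `addVectors` is made up for illustration and is not part of the teaching kit.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical helper (not from the slides): element-wise sum of two
// equally sized host vectors, written with the standard library only.
std::vector<float> addVectors(const std::vector<float>& a,
                              const std::vector<float>& b) {
  assert(a.size() == b.size());
  std::vector<float> out(a.size());
  // Same call shape as thrust::transform: two input ranges, one output
  // range, and a binary functor applied to each pair of elements.
  std::transform(a.begin(), a.end(), b.begin(), out.begin(),
                 std::plus<float>());
  return out;
}
```

Calling `addVectors({1, 2, 3}, {10, 20, 30})` yields `{11, 22, 33}`. Moving from this version to the Thrust version mostly means swapping the `std::` containers and algorithms for their `thrust::` counterparts and adding the host-to-device copies.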

<hr style="height:2px">

# Compiler Directives: Easy, Portable Acceleration

- **Ease of use**: Compiler takes care of details of parallelism management and data movement
- **Portable**: The code is generic, not tied to any particular type of hardware, and the same directives can be used from multiple languages
- **Uncertain**: Performance of code can vary across compiler versions
<hr style="height:2px">

# OpenACC

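```cpp
// Offload the loop: copy both inputs to the device, copy the output back.
// A pragma must sit on one logical line, hence the backslash continuations.
#pragma acc parallel loop \
    copyin(input1[0:inputLength], input2[0:inputLength]) \
    copyout(output[0:inputLength])
for (int i = 0; i < inputLength; ++i) {
  output[i] = input1[i] + input2[i];
}
```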
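
To show the directive in context, here is a minimal sketch that wraps the loop in a complete function. The function name `vecAddAcc` and its signature are assumptions made for this example. The directive is written as one logical line using backslash continuations. A compiler without OpenACC support enabled typically ignores the pragma (possibly with a warning) and runs the loop serially, so the result is the same either way.

```cpp
#include <cassert>

// Hypothetical wrapper (name and signature are illustrative only).
// With OpenACC enabled, the loop is offloaded to the accelerator;
// otherwise the pragma is ignored and the loop runs on the CPU.
void vecAddAcc(const float* input1, const float* input2, float* output,
               int inputLength) {
  assert(inputLength >= 0);
#pragma acc parallel loop \
    copyin(input1[0:inputLength], input2[0:inputLength]) \
    copyout(output[0:inputLength])
  for (int i = 0; i < inputLength; ++i) {
    output[i] = input1[i] + input2[i];
  }
}
```

The `copyin` and `copyout` clauses are where the compiler learns which arrays to move to and from the device. Everything else about the parallel execution is left to the compiler, which is the source of both the ease of use and the performance uncertainty.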
<hr style="height:2px">

# Programming Languages: Most Performant and Flexible Acceleration

- **Performance**: The programmer has the most control over parallelism and data movement
- **Flexible**: The computation does not need to fit into a limited set of library patterns or directive types
- **Verbose**: The programmer often needs to express more details
<hr style="height:2px">

# GPU Programming Languages

![alt tag](img/10.png)
<hr style="height:2px">

# CUDA C

![alt tag](img/11.png)
<hr style="height:2px">

<footer>
<cite> GPU NVIDIA Teaching Kit - University of Illinois </cite>
</footer>