# A complete Hipfort application

In the **Fortran Refresher** section we covered the essentials of the Fortran language: how to use subroutines, functions, pointers, and modules, and how to call C code from Fortran. If any of this is unfamiliar, it may be useful to review that section first.

Recall from the **GPU Computing Fundamentals** section that every accelerated application has the same basic design:

1. At program launch, compute devices are discovered and initialized.
2. Memory spaces are allocated on the compute device.
3. Kernels are prepared.
4. Memory is copied from the host to the compute device.
5. Kernels are run to perform whatever compute operation is required.
6. The output from kernel runs is copied back from the compute device to the host. IO may then occur before the next iteration.
  
**Steps 4-6** are repeated as many times as necessary until the program is done. Then:

7. Memory is deallocated.
8. Resources are released and the program exits.


In this section we walk through each of these steps as part of a complete Hipfort example, using 2D tensor addition as the basic algorithm. For 2D tensors **A**, **B**, and **C**, each of size (M,N), the following relationship holds at each index (i,j):

$$
A(i,j)+B(i,j)=C(i,j)
$$

In the prior **Fortran Refresher** section we used CPU code in Fortran and C to compute the answer $C(i)$ for 1D tensor addition. In this example we use a HIP kernel on the GPU to compute the answer $C(i,j)$ at every location in **C**.
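As a point of comparison for the GPU version, the 2D relationship can be sketched on the CPU in C. This is a minimal sketch under stated assumptions, not the kernel developed in this section: the names `idx2d` and `tensor_add_2d` are hypothetical, the element type is assumed to be `float`, and the tensors are assumed to be stored column-major (the Fortran layout) with zero-based indices, so that element (i,j) sits at offset `i + j*M`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: offset of element (i, j) in a tensor with m rows,
   stored column-major (Fortran order), using zero-based indices. */
static size_t idx2d(size_t i, size_t j, size_t m) {
    return i + j * m;
}

/* Hypothetical CPU reference: C(i,j) = A(i,j) + B(i,j) for tensors of size (m, n).
   Assumes a, b, and c each point to at least m*n floats. */
void tensor_add_2d(const float *a, const float *b, float *c,
                   size_t m, size_t n) {
    assert(a != NULL && b != NULL && c != NULL);

    /* The outer loop runs over columns so that memory is visited contiguously. */
    for (size_t j = 0; j < n; ++j) {
        for (size_t i = 0; i < m; ++i) {
            size_t k = idx2d(i, j, m);
            c[k] = a[k] + b[k];
        }
    }
}
```

Each (i,j) location is computed independently of every other location, which is why this algorithm maps naturally onto a GPU kernel. A reference like this can also be used to check the output copied back from the compute device.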


## Resource discovery

## Memory allocation and de-allocation

## Memory copies

## Kernel source and launch

## Resource cleanup

## Object-oriented types for memory safety