# Modules

We saw in the chapter on _Subroutines_ that it is possible to break a program into re-usable self-contained units. However, when they are included in the same file as the main program they are not truly independent. We could create a subroutine to perform a complex calculation, for example, but it would still have to be compiled alongside a potentially unrelated function to do some plotting. _Modules_ solve this problem by allowing units to be packaged, tested, compiled and distributed separately, which makes it possible to build a toolbox of useful functions. For those with experience of other programming languages, Fortran modules resemble _Python_ modules but with some features of Python and C++ _Classes_.

## Inside a Module

A _Module_ can contain variable declarations, which have scope throughout the module, and also units such as subroutines, which can _also_ contain declarations with restricted scope. It cannot contain _bare_ executable statements. The best way to demonstrate what a module looks like is with a simple example:

```fortran
Module my_module
  Implicit none
  ! Ensure that all declared items have scope restricted to this module only
  ! unless stated otherwise.
  Private
  
  ! Allow the 'add_vars' subroutine to be accessed from outside (public).
  Public   :: add_vars   
  
  ! These values are accessible within the module but not from outside (private).
  Integer  :: my_counter = 0
  
  ! A module can 'contain' multiple units, which can either be public or private.
  contains
    integer function add_vars(a, b)
      Implicit none
      
      ! Dummy arguments
      Integer, intent(in)  :: a
      Integer, intent(in)  :: b
      
      ! Increment the access counter and print it out
      my_counter = my_counter + 1
      print *, 'Number of times function called: ', my_counter
      
      ! Add two values and return the result
      add_vars = a + b

    end function add_vars
end module my_module
```

A module always begins with `Module` and ends with `End module`. Within it we can place declaration statements like `Implicit none`, but not executable code. This is why the variable `my_counter` has to be declared and initialised in the same step. The `Private` statement is recommended at the beginning of a module as it restricts the scope of everything in the module to that module unless it is made explicitly `Public`.

The `contains` statement signifies the point of the module where declarations end and the payload of program units begins. In this case we have a single function that adds two values and returns the result, very similar to the example we saw earlier. We have made it public so that it can be accessed from the main program, otherwise it wouldn't be very useful!

The module can be accessed by another program unit by using the `use` statement, which must always come before the `implicit none` statement. This is the simplest example:

```fortran
program moduletest
  use my_module
  Implicit none
end program
```

A fully working example can be found in the following location:

| Program        | Directory              | Purpose                                               |
| -------------- | ---------------------- | ----------------------------------------------------- |
| **moduletest** | `src/modules/example1` | Call a function defined in a module.                  |

This program calls our `add_vars()` function with a range of values. Whilst extremely simple, it demonstrates a potentially very useful feature of modules: _persistence_. The module itself remains in scope the whole time the main program is running, and because the `my_counter` variable belongs to the scope of the module, not the function, its value persists between function calls. This allows a Fortran module to mimic the behaviour of a `Class`, in that it can encapsulate both data and functionality in one package.
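The persistence described above can be sketched in a single file. This is a minimal illustration, not the code in `src/modules/example1`: the names `counter_sketch` and `persistence_sketch` are hypothetical, and the real example may call the function with different values.

```fortran
module counter_sketch
  implicit none
  private
  public :: add_vars

  ! Module variable: belongs to the module, so it keeps its value between calls
  integer :: my_counter = 0

contains

  integer function add_vars(a, b)
    integer, intent(in) :: a, b

    ! Increment and report the call counter
    my_counter = my_counter + 1
    print *, 'Number of times function called: ', my_counter

    ! Set the function result
    add_vars = a + b
  end function add_vars

end module counter_sketch

program persistence_sketch
  use counter_sketch
  implicit none
  integer :: i, total

  ! Call the function three times; the counter should report 1, 2 and 3
  do i = 1, 3
    total = add_vars(i, 10)
    print *, i, ' + 10 = ', total
  end do
end program persistence_sketch
```

The result is stored in `total` before printing because a function that itself prints should not be called from inside a `print` statement.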

Whilst it is possible to call a bare subroutine from a different file that hasn't been wrapped in a module, this leads to extra complications: for the compiler to check the call, we need to define an explicit _interface_ for the subroutine so that the main program knows what number and type of arguments the subroutine is expecting. Using modules means interfaces are generated automatically and we don't need to worry about them. When you compile code that includes a module, a `.mod` file is automatically created that contains details of the interface. This file is queried when code containing an appropriate `use` statement is compiled.
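For comparison, here is a rough sketch of what a hand-written interface looks like when a subroutine is not wrapped in a module. The subroutine `add_and_store` and the program name are hypothetical, invented only for this illustration. The `interface` block has to repeat the argument declarations and must be kept in step with the subroutine by hand.

```fortran
program interface_sketch
  implicit none

  ! Hand-written explicit interface for a subroutine defined elsewhere
  interface
    subroutine add_and_store(a, b, total)
      integer, intent(in)  :: a, b
      integer, intent(out) :: total
    end subroutine add_and_store
  end interface

  integer :: answer

  ! The compiler can now check the number and type of the arguments
  call add_and_store(2, 3, answer)
  print *, 'answer = ', answer
end program interface_sketch

! A bare (external) subroutine, not wrapped in a module
subroutine add_and_store(a, b, total)
  implicit none
  integer, intent(in)  :: a, b
  integer, intent(out) :: total

  total = a + b
end subroutine add_and_store
```

Placing `add_and_store` in a module and adding a `use` statement would remove the need for the `interface` block altogether.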

### Exercise 1 - _Modules_

 * The variable `my_counter` is declared in the module alongside the function `add_vars()`. Can you access it from the main program? If not, why not? Can you modify the code so that it can be accessed?
 * Remove the initialisation of the variable `my_counter` (setting it to zero). Do the results change? Do you trust them?
 * When large numbers of module variables need to be initialised, potentially with values read from a file or passed via user input, it is often neater to create a separate function to do this. Create another function called `init()` or similar, that sets `my_counter` to zero and modify the main program so that this function is called before `add_vars()` is ever used. Try setting the initial value to something other than zero.

## Common Blocks

Those familiar with Fortran 77 may use common blocks to transfer persistent values between program units. This is not recommended as it makes debugging more difficult. Because values in a common block bypass the defined entry and exit points of subroutines and functions, it can be hard to determine where bad values are set, and the program units that use them become dependent upon external code. They can no longer be self-contained. Modules can be used to replicate the concept of persistent values across program units, but using them as exact duplicates of common blocks is discouraged as it simply replicates the problems. Rewriting the code to take full advantage of modules will reduce problems of testing and extensibility in future.
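One possible way of taking fuller advantage of a module, rather than using it as a drop-in common block, is to keep the shared value private and only allow it to be changed through a routine that can validate it. The following is a sketch under that assumption; the names `sim_settings`, `set_time_step` and `get_time_step` are hypothetical.

```fortran
module sim_settings
  implicit none
  private
  public :: set_time_step, get_time_step

  ! Shared value, hidden from code outside the module
  real :: time_step = 0.0

contains

  ! The only place where the value can be changed, so bad values
  ! can be caught (or a breakpoint set) in one location
  subroutine set_time_step(new_step)
    real, intent(in) :: new_step

    if (new_step > 0.0) then
      time_step = new_step
    else
      print *, 'Ignoring non-positive time step: ', new_step
    end if
  end subroutine set_time_step

  real function get_time_step()
    get_time_step = time_step
  end function get_time_step

end module sim_settings

program settings_sketch
  use sim_settings
  implicit none

  call set_time_step(0.5)
  call set_time_step(-1.0)   ! Rejected by the check in the module
  print *, 'Time step in use: ', get_time_step()
end program settings_sketch
```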

# Array Operations

Efficient use of arrays is of crucial importance in time-critical, iterative operations. The burden on the software developer is to remember how Fortran stores arrays (in column-major order) and in which circumstances Fortran will be forced to make a copy of an array, or part of an array, thus taking time and additional memory. Although an array may be two dimensional it is stored as a linear series of values, column by column. Accessing contiguous memory locations (one following the next) is generally faster than jumping around, so arranging your code to work in this manner helps it run as fast as possible. Some examples of good practice are shown below.

## Looping

Remember the example of a two dimensional array discussed in the _Arrays_ section. An array that looks like this:

|[]()|[]()|
|----|----|
| 1  | 2  |
| 3  | 4  |
| 5  | 6  |

...is stored in memory like this:

|[]()|[]()|[]()|[]()|[]()|[]()|
|----|----|----|----|----|----|
| 1  | 3  | 5  | 2  | 4  | 6  |

In order to access memory in a contiguous manner, we need to traverse the data column by column, rather than row by row.

```fortran
Program array_loop
  Implicit none
  ! Declare a two dimensional array
  Integer  :: arrName(3, 2)
  
  ! Declare loop counters
  Integer  :: x, y
  
  ! Assign some values
  arrName = reshape([1, 3, 5, 2, 4, 6], shape(arrName))
  
  ! Loop over arrays and print out values
  do y = 1, 2    ! Loop over columns
    do x = 1, 3  ! Loop over column slice of contiguous memory
      print *, 'value = ', arrName(x, y)
    end do
  end do

End program
```

Looping over such a small array takes too short a time to show a real difference, but you can find larger examples using a `(50000, 4000)` array in the following directory:

| Program         | Directory             | Purpose                                                    |
| --------------- | --------------------- | ---------------------------------------------------------- |
| **efficient**   | `src/arrays/example1` | Loop over a two dimensional array in an efficient manner   |
| **inefficient** | `src/arrays/example1` | Loop over a two dimensional array in an inefficient manner |
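If you don't have those examples to hand, the comparison can be approximated with the sketch below. It is not the code from `src/arrays/example1`: the program name, the array size (smaller than the one quoted above, to keep memory use modest) and the use of the `system_clock` intrinsic for timing are all assumptions made for this illustration. Timings will vary between machines and compilers, and an optimising compiler may reorder the loops itself.

```fortran
program loop_order_sketch
  use, intrinsic :: iso_fortran_env, only: int64
  implicit none

  ! Hypothetical sizes, chosen to keep memory use modest
  integer, parameter   :: n_rows = 4000, n_cols = 4000
  integer, allocatable :: grid(:, :)
  integer              :: x, y
  integer(int64)       :: t_start, t_end, clock_rate

  allocate(grid(n_rows, n_cols))
  grid = 0

  ! Efficient: the inner loop runs down a column (first index)
  call system_clock(t_start, clock_rate)
  do y = 1, n_cols
    do x = 1, n_rows
      grid(x, y) = grid(x, y) + x
    end do
  end do
  call system_clock(t_end)
  print *, 'column by column (s): ', real(t_end - t_start) / real(clock_rate)

  ! Inefficient: the inner loop runs along a row (second index)
  call system_clock(t_start)
  do x = 1, n_rows
    do y = 1, n_cols
      grid(x, y) = grid(x, y) + y
    end do
  end do
  call system_clock(t_end)
  print *, 'row by row (s):       ', real(t_end - t_start) / real(clock_rate)

  ! Use the result, to discourage the compiler from removing the loops
  print *, 'check value: ', grid(n_rows, n_cols)
end program loop_order_sketch
```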

### Exercise 2 - _Looping_

 * Run these two examples. How much longer does the inefficient version take?
 * Bearing in mind how two-dimensional arrays are stored, what would you expect to happen to the ratio of the speeds of the two methods if the number of columns were increased relative to the number of rows for an array with the same number of elements? Try it and see.

## Slicing

Arrays do not have to be accessed element by element. It is possible to extract or insert a full or partial row or column, and this technique is called _slicing_. An example of extracting a column slice of an array is shown below:

```fortran
Integer :: arrName(12, 15)
Integer :: arrSlice(12)

arrSlice = arrName(:, 12)
```
This snippet of code defines a two dimensional array of twelve rows by fifteen columns, and a one dimensional array of twelve elements. One of the columns in the 2-D array (the twelfth) is then extracted and assigned to the 1-D array in a single statement. The shorthand `:` means _all the elements in this dimension_.

This technique is useful for assigning to or reading from arrays an entire dimension at a time, and is not limited to the first dimension, as the following example demonstrates:

```fortran
Program slice
  Implicit none
  ! Declare a two dimensional array
  Integer  :: arrName(12, 15)
  
  ! Declare two one-dimensional arrays to hold slices of the array
  Integer  :: columnSlice(12), rowSlice(15)
  
  ! Array indexing variables
  Integer  :: x, y
  
  ! Loop over array, column by column, and populate it with values
  do y = 1, 15
    do x = 1, 12
      arrName(x, y) = (x*100) + y
    end do
  end do

  ! Extract a column and row using the slicing technique
  columnSlice = arrName(:, 10)
  rowSlice = arrName(7, :)

  ! Print out the two slices
  print *, 'columnSlice:'
  print "(12i5)", columnSlice
  print *, 'rowSlice:'
  print "(15i5)", rowSlice

End program
```

The same rules of contiguous memory apply: extracting a row is more expensive than extracting a column of the same size, because the elements of a row are not adjacent in memory. When using this technique it is important to consider how the data is stored so that it can be accessed as efficiently as possible. The good news is that using slices instead of explicit loops to access data gives the compiler the freedom to optimise the code as best it can.

Slices are not limited to being one-dimensional as the following example demonstrates:

```fortran
Program slice2
  Implicit none
  ! Declare a three dimensional array
  Integer  :: arrName(10, 10, 10)

  ! Declare a two dimensional slice
  Integer  :: arrSlice(4, 3)

  ! Array index variables
  Integer  :: x, y, z
  
  ! Loop over array in efficient manner, with highest dimensions on
  ! the outside of the loop and lowest on the inside.
  do z = 1, 10
    do y = 1, 10
      do x = 1, 10
        arrName(x, y, z) = (x*100) + (y*10) + z
      end do
    end do
  end do

  ! Extract a 4x3 slice
  arrSlice = arrName(3:6, 7:9, 5)

  ! Print out the slice
  do x = 1, 4
    print "(3i6)", arrSlice(x, :)
  end do

End program
```

Notice that in this example we are specifying upper and lower limits to the slice to be extracted, for example `(3:6)` instead of `(:)`. It is harder to visualise three-dimensional and higher arrays in order to determine how to most efficiently access the elements in memory, but the basic rule is that the lowest dimension (the first in the array declaration) should be on the inside of any loops, and the highest dimension on the outside, with the remaining following in order.

### Exercise 3 - _Slicing_

 * Ensure you understand how slices work by calculating what the list of elements in a one dimensional slice _should_ be when extracted from a two dimensional array and check that this is what is printed.
 * Modify the three dimensional example to move the 2-D 'window' of extracted elements around the 3-D space.
 * Make sure you can extract a 1-D slice from the 3-D array.

## Passing as Arguments

In earlier versions of Fortran the only way to pass an array to a subroutine or function was in the same manner as passing a variable, by specifying the name in the argument list. For example:

```fortran
call myfunction(myvariable, myarray)
```

This resulted not in a copy of the array being passed to the function, but simply a memory location corresponding to the start of the array. No explicit information such as the dimensions or range of the array elements was passed along with it, so if you wanted to be able to loop over the array in the function you had to pass this information along with it as separate arguments:

```fortran
call myfunction(myvariable, myarray, num_dimensions, x_size, y_size, z_size)
```

Since Fortran had no concept of the array as anything other than a memory location it was impossible to do such things as slicing, or even querying the size of a dimension. The burden of making proper use of the array and not accidentally accessing memory outside the array was entirely on the author. Needless to say, this meant code was hard to debug and could easily lead to mistakes.

In Modern Fortran, passing arrays is much more flexible.

### Assumed Size Arrays

Passing an array as _assumed size_ is very similar to old style array passing, in that the function receives a memory location and has no conception of the array size or dimensions. It is mainly used when having to interface with old-style Fortran code, and the main difference is that the array can be received with formal dummy arguments in the function, allowing us to specify `intent`:

```fortran
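Integer function myfunction(x_size, y_size, myarray)
  Implicit none
  
  ! Dimension sizes must be passed in by the caller
  Integer,intent(in)  :: x_size, y_size
  
  ! Dummy argument for receiving an array of assumed size
  Integer,intent(in)  :: myarray(x_size, *)
  
  ! Array indices
  Integer             :: x, y
  
  ! Loop over array
  do y = 1, y_size
    do x = 1, x_size
      print *, 'value = ', myarray(x, y)
    end do
  end do
  
  ! Return a placeholder value so that the function result is defined
  myfunction = 0
  
End function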
```

Note that the final dimension size does not need to be explicitly specified and can be replaced by a `*`. The sizes of all the other dimensions must still be given, and the number of comma-separated entries tells the compiler how many dimensions the array has.

Advantages of using assumed size arrays include the fact that only the memory location needs to be transferred to the function, which keeps the call lightweight by avoiding time- and memory-consuming array copying. The disadvantage is that the function still knows nothing about the array beyond where it starts, so it must still receive the dimension sizes from the calling code.
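To make the caller's side of this concrete, here is a small self-contained sketch. The function `sum_grid` and the program name are hypothetical; the array values are borrowed from the looping example earlier on this page. Note that the calling code has to supply both dimension sizes itself, and nothing stops it from supplying the wrong ones.

```fortran
program assumed_size_sketch
  implicit none

  integer           :: grid(3, 2)
  integer           :: total
  integer, external :: sum_grid

  grid = reshape([1, 3, 5, 2, 4, 6], shape(grid))

  ! The sizes are passed by hand; the function cannot query them
  total = sum_grid(3, 2, grid)
  print *, 'Sum of elements: ', total
end program assumed_size_sketch

integer function sum_grid(x_size, y_size, values)
  implicit none
  integer, intent(in) :: x_size, y_size
  integer, intent(in) :: values(x_size, *)
  integer             :: x, y

  ! Accumulate the total, looping column by column
  sum_grid = 0
  do y = 1, y_size
    do x = 1, x_size
      sum_grid = sum_grid + values(x, y)
    end do
  end do
end function sum_grid
```

For this input the total is 21, the sum of the six values.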

### Assumed Shape Arrays

_Assumed shape_ arrays provide much more flexibility than assumed size arrays. Instead of just passing the memory location of the array, additional information such as the number of dimensions and size is passed alongside, meaning we don't need to do it explicitly. This does come with additional burdens on the author which will be discussed as they arise.

To pass an array as _assumed shape_ we must know the number and size of dimensions - we can't just pass on a memory location we have received. The array is received by the subroutine with the following dummy argument:

```fortran
Integer function myfunction(my_array)
  Implicit none
  
  ! Dummy argument for receiving an array as assumed shape
  Integer,intent(in)  :: my_array(:, :)
  
  ! Return the total number of elements as a placeholder result
  myfunction = size(my_array)
End function
```

Note that instead of specifying the size of the dimensions (as we had to with assumed size arrays), we only need to specify the _number_ of dimensions (two in this case), with the size replaced by the range symbol, `:`. We also don't need to pass the size of the dimensions as arguments, as they are automatically included as part of `my_array`, which is no longer simply a memory location, but an _array descriptor_ containing extra information. The best way to demonstrate this is with a functional example. Try running it to see what happens...

```fortran
Program assumed_shape
  Implicit none
  
  ! Declare our function
  Integer, external :: process_arr
  
  ! Declare an array to store our data
  Integer           :: data_store(20, 30)
  
  ! Declare return value from our processing
  Integer           :: my_result
  
  ! Array indices
  Integer           :: x, y
  
  do y = 1, 30
    do x = 1, 20
      data_store(x, y) = (x*100) + y
    end do
  end do
  
  ! Process this array in some way
  my_result = process_arr(data_store)
  
  ! Print out the result
  print *, my_result

End program

Integer function process_arr(my_array)
  Implicit none
  
  ! Dummy argument to receive array as assumed shape
  Integer, intent(in)  :: my_array(:, :)
  
  ! Because my_array is not just a memory location, there are intrinsic
  ! functions we can use to query it.
  print *, 'size of first dimension         = ', size(my_array, 1)
  print *, 'size of second dimension        = ', size(my_array, 2)
  print *, 'lower bound of first dimension  = ', lbound(my_array, 1)
  print *, 'upper bound of first dimension  = ', ubound(my_array, 1)
  print *, 'lower bound of second dimension = ', lbound(my_array, 2)
  print *, 'upper bound of second dimension = ', ubound(my_array, 2)
  
  ! Return an arbitrary value as the function result
  process_arr = my_array(2, 5)

End function
```

Did you get an error similar to **explicit interface required**? This demonstrates some of the extra work that needs to be performed in order to handle assumed shape arrays. We briefly touched on _interfaces_ when talking about _Modules_ at the top of this page, but now we can see an example of why such functionality is required. The main program needs to know what arguments the function receives, and importantly what type they are, so the compiler can perform the necessary checks at compile time to make sure there are no errors. A bare function like this doesn't provide an explicit interface, so the best way to create one is to wrap it in a _module_, which generates the interface automatically.

| Program           | Directory             | Purpose                                                    |
| ----------------- | --------------------- | ---------------------------------------------------------- |
| **assumed_shape** | `src/arrays/example2` | Demonstrate assumed shape array intrinsics using a module  |

This example contains exactly the same code as in the above example, with the exception that the function is wrapped in a module, rather than _raw_. Notice that using assumed shape arrays allows us to examine the structure of the array with _intrinsic functions_, so we don't need to pass in the dimensions and size explicitly. The combination of assumed shape arrays and modules means we can create completely self-contained functions that don't need to know about the data structures in the code from which they are called. This is a big step towards creating a toolbox of re-usable functions.
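As a rough idea of what the module-wrapped version looks like, here is a shortened sketch. It is not a copy of the code in `src/arrays/example2`: the module name `array_tools` and the program name are hypothetical, and only two of the intrinsic queries are shown.

```fortran
module array_tools
  implicit none
  private
  public :: process_arr

contains

  integer function process_arr(my_array)
    ! Dummy argument received as assumed shape
    integer, intent(in) :: my_array(:, :)

    ! The array descriptor lets us query the array directly
    print *, 'size of first dimension  = ', size(my_array, 1)
    print *, 'size of second dimension = ', size(my_array, 2)

    ! Return an arbitrary value as the function result
    process_arr = my_array(2, 5)
  end function process_arr

end module array_tools

program assumed_shape_module
  ! The use statement provides the explicit interface automatically
  use array_tools
  implicit none

  integer :: data_store(20, 30)
  integer :: my_result
  integer :: x, y

  do y = 1, 30
    do x = 1, 20
      data_store(x, y) = (x*100) + y
    end do
  end do

  my_result = process_arr(data_store)
  print *, my_result
end program assumed_shape_module
```

There is no `external` declaration and no hand-written interface here; the `use` statement is enough. The value returned is element `(2, 5)` of the array, which the fill loop sets to `2*100 + 5 = 205`.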

### Exercise 4 - _Assumed Shape Arrays_

 * Using the intrinsics such as _lbound_ and _size_, write a nested `do` loop that iterates over the array and prints out the values.
 * Using your knowledge of array slicing, pass only a section of the `data_store` array into the `process_arr` function. Are the results what you expect? Can you see how this would allow you to operate only on subsets of your total data or work on multiple sections in parallel?
