# Modules, pragmas and cabal

## Outline

* Modules
  - Prelude module

  - Working with base modules 

  - Creating our own modules

  - Compiling Haskell programs

* Language pragmas

* Using Cabal
  - Introduction
  
  - Managing packages with cabal
  
  - Building a project with cabal

In this lesson, we will learn how you can work with Haskell modules, pragmas and Cabal. These are the tools necessary to build a full-fledged Haskell application so that we can finally build a complete project. We will see that modules will allow us to manage code better and let us reuse code that others have built! Pragmas will let us tweak the behavior of Haskell and how it should work. Lastly, Cabal will help us manage a complete project and all its possible dependencies!

## Modules

**What are they?**

Modules are Haskell files that contain a module definition statement at the beginning of the file, followed by some Haskell code. For example,

```haskell
module Tree where
```
Here the name `Tree` represents the name of the module. It is also good practice to name the Haskell file after the name of the module. 

These modules can be imported into other Haskell files. If done so, some or all of the code from the module becomes available to the Haskell file. 

If nothing is specified, as in the code snippet above, all function and type definitions will be exposed by this module. 

In case you want to restrict access to some functionality, you can specify what is exported by making it explicit as

```haskell
module Tree (function1, function2, MyDataType) where
```
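
As a minimal sketch of an export list in action, the hypothetical `Tree` module below exports a type and two functions while keeping a helper private. All names in it (`Tree`, `insert`, `toList`, `size`) are illustrative assumptions, not part of any library.

```haskell
-- Tree.hs (hypothetical example module)
module Tree (Tree (..), insert, toList) where

-- A simple binary search tree. Tree (..) exports the type with its constructors.
data Tree a = Leaf | Node (Tree a) a (Tree a)

-- Exported: insert a value, ignoring duplicates.
insert :: Ord a => a -> Tree a -> Tree a
insert x Leaf = Node Leaf x Leaf
insert x t@(Node l v r)
  | x < v     = Node (insert x l) v r
  | x > v     = Node l v (insert x r)
  | otherwise = t

-- Exported: list the elements in ascending order.
toList :: Tree a -> [a]
toList Leaf         = []
toList (Node l v r) = toList l ++ [v] ++ toList r

-- Not exported: files that import Tree cannot call this helper.
size :: Tree a -> Int
size = length . toList
```

A file that imports this module could use `insert` and `toList`, but a call to `size` would be rejected by the compiler because the name is not in the export list.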

We will show how to import them with a practical example in the section **Working with Base modules**.

**Why use them?**

Modules are nice because they allow you to split your code into multiple files and group it by related functionality.

The advantage of this is that you have a better overview of your code and can manage it more easily, which helps prevent spaghetti code.

Additionally, when compiling, Haskell treats each module as having its own scope. This means you can compile each module with different compiler settings. 

We will explain this more in the **Language pragmas** section.

Another use for them is when you need to use some functions, types and / or type classes that are defined in existing Haskell modules.

You can then import these modules and use the functionality that they offer in your code. 

### Prelude module

When working in GHCi, some functions are available by default, for instance `head`, `sum` and `length`.

This means you did not have to import or install anything to use them.

This is because those functions are part of the standard Haskell module called **Prelude** that is imported by default. 

The word prelude means an introduction to something more important, which is the code and modules you will write.

You can find a list of all the functions contained in **Prelude** on **Hackage** [(1)](https://hackage.haskell.org/package/base-4.17.0.0/docs/Prelude.html). We will explain in lesson 15 what Hackage is and how you can use it in general.

<div class="alert alert-block alert-info">
    To view all the imported modules in a current GHCi session, use <code>:show imports</code>
</div>

Some of the most used **Prelude** functions we have already seen are:

- `head` (gives you the first element of the list)

- `tail` (gives you the elements of the list, except the first one)

- `sum` (sums elements of a list)

- `length` (gives the length of a list)

- `print` (converts a value to a string and prints it to the terminal)

On the Hackage link of the prelude module provided above, you will also find the type signatures of all these functions and more.

### Working with Base modules

Some functions in Haskell need to be imported via modules that are already locally available, besides the standard **Prelude** module. 

We call the combination of all these already locally available modules the **Base** modules (of which the prelude is one). Haskell contains many convenient modules for you to try out!

For an overview of what the Base consists of, take a look at the following foldable list [(2)](https://hackage.haskell.org/package/base-4.17.0.0/docs/index.html).

As this is an introduction to Haskell, the most important modules in this list right now are 

* Data
* Numeric
* Prelude
* System
* Text

Let's say you want to run a Haskell program from the command line, give it some input parameters, and then read those parameters inside the program. 

This is a common pattern, and it is available in the Base module `System`.

More precisely, you can do this with the function `getArgs` that is part of the `System.Environment` module.

```haskell
import System.Environment 

main :: IO ()
main = do
  args <- getArgs
  print $ head args
```

<div class="alert alert-block alert-warning">
The above code will not properly work in this Jupyter Notebook. This is because the <code>getArgs</code> function looks for arguments from a command line! At the end of the lesson, you will be able to execute this code locally with cabal or GHC.  
</div>
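
Note that `head` fails at runtime when the program is started without any arguments. A minimal sketch of a more defensive variant is shown below; the helper name `describeArgs` is an assumption made for illustration.

```haskell
import System.Environment (getArgs)

-- Hypothetical helper: turn the argument list into a message
-- without ever calling head on an empty list.
describeArgs :: [String] -> String
describeArgs []      = "No arguments given"
describeArgs (x : _) = "First argument: " ++ x

main :: IO ()
main = do
  args <- getArgs
  putStrLn (describeArgs args)
```

Pattern matching on the list covers both cases explicitly, so the program prints a message instead of crashing when no argument is supplied.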

When we import a module as above, all of its types and functions that it exposes get imported. We can also specify precisely which functions and types we want to import.

For the example above, we could have imported only the function `getArgs` with the following statement instead of the whole module
```haskell
import System.Environment (getArgs)
```
If there are more functions and types we wish to import, we separate them with a comma. 
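
The following sketch shows two import styles side by side: an import list with several names, and a qualified import (qualified imports are covered later in this lesson). The helper `digitGroups` is a hypothetical name used only for this example.

```haskell
import Data.Char (isDigit, toUpper)   -- only these two names are imported
import qualified Data.List as L       -- all names, reachable only with the L. prefix

-- Hypothetical helper: keep the digits of each word and join the results.
digitGroups :: [String] -> String
digitGroups ws = L.intercalate "-" (map (filter isDigit) ws)

main :: IO ()
main = do
  putStrLn (map toUpper "modules")        -- MODULES
  putStrLn (digitGroups ["a1b2", "c3"])   -- 12-3
```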

Now you might be asking yourself how to know which module to import and which function to use. A good starting point is to search the web for what you want to achieve. 

For our example of the `getArgs` function, you could for instance search Google for: `haskell get command line arguments`.

Once you find a module or function that fits your needs, you can search further for a good explanation and example. 

You can also use the Haskell web services **Hackage** and **Hoogle** to learn more about them. Again, we will discuss these web services in detail in lesson 15.

### Base modules Data.Char, Data.List and Data.Array

As an example, we will go through three often used modules that are also locally available for you to use in Haskell. We will briefly discuss what these might be used for to provide some insight into their structure.

The **Data.Char** module defines functions that deal with characters. Some of the often used functions are:

- `isDigit` (checks if a character is a digit)

- `isPunctuation` (checks if a character is a punctuation mark)

- `toUpper` and `toLower` (convert a character to upper case or to lower case)

- `ord` and `chr` (convert a character to its Unicode code point, which matches the ASCII code for ASCII characters, and vice versa)

For some example code we could have

In [None]:
import Data.Char

isDigit '1'
isPunctuation '.'
toUpper 'a'
ord 'a'

This module is great for parsing characters and changing them.
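
As a small sketch of how these functions combine in practice, the helpers below filter and transform strings. The names `digitsOnly`, `shout` and `nextChar` are assumptions made for this example.

```haskell
import Data.Char (chr, isDigit, isPunctuation, ord, toUpper)

-- Keep only the digit characters of a string.
digitsOnly :: String -> String
digitsOnly = filter isDigit

-- Remove punctuation, then convert everything to upper case.
shout :: String -> String
shout = map toUpper . filter (not . isPunctuation)

-- Move to the following character by going through its code point.
nextChar :: Char -> Char
nextChar c = chr (ord c + 1)

main :: IO ()
main = do
  putStrLn (digitsOnly "a1b22")   -- 122
  putStrLn (shout "hi, there!")   -- HI THERE
  print (nextChar 'a')            -- 'b'
```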

The **Data.List** module defines functions that deal with lists. Some of the functions we have not covered yet are:

- `sort` (sorts a list in ascending order, provided its elements can be compared to each other)

- `splitAt` (splits a list into a tuple of two lists, where the user defines the length of the first list)

- `scanl` (is similar to foldl, but it shows the value of the accumulator on each iteration instead of the result)

- `scanr` (is the right-to-left dual of scanl. The order of parameters on the accumulating function are reversed compared to scanl.)

For some example code, we could have

In [None]:
import Data.List (sort)

myList = [4,2,1,5,3]

sort myList
splitAt 3 myList

scanl (-) 5 [1..4]
scanr (-) 5 [1..4]
foldr (-) 5 [1..4]

<div class="alert alert-block alert-info">
    The scan functions are similar to the fold functions. The difference is that the fold functions focus on the result of the fold (a single return value), while the scan functions show how that result was reached by returning the intermediate values (a list of values). So, the last element of a <code>scanl</code> list is the result of a <code>foldl</code>, and the first element of a <code>scanr</code> list is the result of a <code>foldr</code> over the same arguments.
</div>

Here's a more precise breakdown of how the scan functions work in the above example
```haskell
scanl (-) 5 [1,2,3,4] = 5 : scanl (-) ((-) 5 1) [2,3,4]
                      = 5 : ((-) 5 1) : scanl (-) ((-) ((-) 5 1) 2) [3,4]
                      = 5 : ((-) 5 1) : ((-) ((-) 5 1) 2) : scanl (-) ((-) ((-) ((-) 5 1) 2) 3) [4]
                      = 5 : ((-) 5 1) : ((-) ((-) 5 1) 2) : ((-) ((-) ((-) 5 1) 2) 3) : scanl (-) ((-) ((-) ((-) ((-) 5 1) 2) 3) 4) []
                      = 5 : ((-) 5 1) : ((-) ((-) 5 1) 2) : ((-) ((-) ((-) 5 1) 2) 3) : ((-) ((-) ((-) ((-) 5 1) 2) 3) 4) : []
                      = 5 : 4 : 2 : (-1) : (-5) : []
                      = [5, 4, 2, -1, -5]      

scanr (-) 5 [1,2,3,4] = scanr (-) ((-) 4 5) [1,2,3] ++ [5]
                      = scanr (-) ((-) 3 ((-) 4 5)) [1,2] ++ ((-) 4 5) : [5]
                      = scanr (-) ((-) 2 ((-) 3 ((-) 4 5))) [1] ++ ((-) 3 ((-) 4 5)) : ((-) 4 5) : [5]
                      = scanr (-) ((-) 1 ((-) 2 ((-) 3 ((-) 4 5)))) [] ++ ((-) 2 ((-) 3 ((-) 4 5))) : ((-) 3 ((-) 4 5)) : ((-) 4 5) : [5]
                      = ((-) 1 ((-) 2 ((-) 3 ((-) 4 5)))) : ((-) 2 ((-) 3 ((-) 4 5))) : ((-) 3 ((-) 4 5)) : ((-) 4 5) : [5]
                      = [3, -2, 4, -1, 5]
```
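
The relationship between scans and folds can also be checked directly. The sketch below uses the same arguments as the example above; the names `scanlEndsWithFoldl` and `scanrStartsWithFoldr` are chosen here for illustration.

```haskell
xs :: [Int]
xs = [1 .. 4]

-- The last element of scanl equals the result of foldl.
scanlEndsWithFoldl :: Bool
scanlEndsWithFoldl = last (scanl (-) 5 xs) == foldl (-) 5 xs

-- The first element of scanr equals the result of foldr.
scanrStartsWithFoldr :: Bool
scanrStartsWithFoldr = head (scanr (-) 5 xs) == foldr (-) 5 xs

main :: IO ()
main = do
  print (scanl (-) 5 xs)       -- [5,4,2,-1,-5]
  print (scanr (-) 5 xs)       -- [3,-2,4,-1,5]
  print scanlEndsWithFoldl     -- True
  print scanrStartsWithFoldr   -- True
```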

The **Data.Array** module defines functions that deal with arrays. Some of the often used functions are:

- `listArray` (construct an array from a pair of bounds and a list of values in index order)

- `(!)` (access the value at the given index in an array)

- `indices` (the list of indices of an array in ascending order)

- `elems` (the list of elements of an array in index order)

- `(//)` (constructs an array identical to the first argument, except that it has been updated by the associations in the right argument)

For some example code, we could have

In [None]:
import Data.Array

myArray :: Array Int Int
myArray = listArray (1,3) [4,5,6]

myArray ! 3
indices myArray 
elems myArray
myArray // [(2,10)]
elems $ myArray // [(2,10)]

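The main benefit of arrays over lists is constant-time indexing with `(!)`, whereas indexing a list with `(!!)` takes time proportional to the index. The **Data.Array** module also gives you helper functions that work with arrays.

Keep in mind that these arrays are immutable, so `(//)` builds a new array instead of updating the old one in place. If you need better performance, you can use the unboxed `UArray` array type from the **Data.Array.Unboxed** module.

A `UArray` uses similar syntax to an `Array` and will generally be more efficient in terms of both time and space, but it only supports certain element types such as `Int`, `Double` and `Char`. 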
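
A minimal sketch of the same operations on a `UArray` is shown below. The array `squares` is a made-up example value.

```haskell
import Data.Array.Unboxed (UArray, elems, listArray, (!), (//))

-- An unboxed array holding the first five square numbers, indexed from 0 to 4.
squares :: UArray Int Int
squares = listArray (0, 4) [i * i | i <- [0 .. 4]]

main :: IO ()
main = do
  print (squares ! 3)                    -- 9
  print (elems (squares // [(0, 100)]))  -- [100,1,4,9,16]
  print (elems squares)                  -- [0,1,4,9,16], the original is unchanged
```

Apart from the import and the type signature, the code reads the same as it would for a boxed `Array`.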

<div class="alert alert-block alert-info">
    In all of the above modules, we saw that their names start with <code>Data.</code>, which places each module in the <code>Data</code> hierarchy. In practice, this is achieved by keeping the module files in a directory called <code>Data</code>. So in general, a module name with dots <code>A.B.C</code> corresponds to the UNIX file path <code>A/B/C.hs</code>. 
</div>

### Creating your own module

Modules are just plain Haskell files that can be imported in other Haskell files, and it is easy to create a module on your own. 

In the following section, we will show you how to set up a module. Later, we will show how you can use it via Cabal.

Let's say we want another version of the Prelude function `sum` that returns an error for the input of the empty list, instead of the value 0 that the Prelude `sum` returns.

First, we create a Haskell file that we call `Sum.hs` and write a module statement at the beginning of the file:
```haskell
module Sum where
```

With this statement, we define the name of our module, which should start with an upper case letter.

It is good practice that the name of the module is the same as the name of the file, though this is not mandatory.

Then we define our own parameterized type `MyData` that can be used for displaying an error message. 

We will learn more about proper error handling in lesson 14. After that, we define our `sum` function.

```haskell
module Sum where

data MyData a b = Error a | Result b deriving Show

sum :: Num a => [a] -> MyData String a
sum [] = Error "List is empty"
sum xs = Result $ Prelude.sum xs
```

Notice that in the definition of our `sum` function, we use the prelude version of `sum` that we access by `Prelude.sum`.

Now, if we were in another Haskell file and wanted to import our **Sum** module, we would have to do this with a qualified import to avoid a name collision between the two `sum` functions.

```haskell
import qualified Sum as SumModule

Prelude.sum []       -- 0
Prelude.sum [1..3]   -- 6

SumModule.sum []     -- Error "List is empty" 
SumModule.sum [1..3] -- Result 6
```

If you do not want to rename the **Sum** module where it is used, you can simply write `import qualified Sum`, and then access the function with `Sum.sum`.

If you define your sum function with another name that does not match any function name from Prelude, you can use a simple import statement such as `import Sum`. 

This would work if our function name were e.g. `sum1`. Then you could use both function names directly. 

```haskell
import Sum

sum []      -- 0
sum [1..3]  -- 6

sum1 []     -- Error "List is empty" 
sum1 [1..3] -- Result 6
```
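
Another way to avoid the name collision, sketched below in a single file, is to hide the Prelude version with a `hiding` clause. This is an alternative to the approaches described above, not a replacement for them. The `Eq` instance on `MyData` is added here only to make the results easy to compare.

```haskell
import Prelude hiding (sum)   -- import Prelude without its sum function
import qualified Prelude      -- keep Prelude.sum reachable by its qualified name

data MyData a b = Error a | Result b deriving (Show, Eq)

-- Our own sum, now the only unqualified sum in scope.
sum :: Num a => [a] -> MyData String a
sum [] = Error "List is empty"
sum xs = Result (Prelude.sum xs)

main :: IO ()
main = do
  print (sum ([] :: [Int]))      -- Error "List is empty"
  print (sum [1 .. 3 :: Int])    -- Result 6
```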

### Compiling Haskell programs

In this section, we will show how you can compile a simple Haskell file. Later, we will show how you can compile a more complex project using Cabal.

You can use GHC to compile a Haskell program, that is, a Haskell file with your main logic. This main file can import other Haskell files that define modules as well.

To compile something with GHC, your main file can have any name, but it must contain a `main :: IO ()` function, which is the entry point of the program. 

When the program is run, this is the function that will be evaluated. For example, consider the file `Main.hs`
```haskell
module Main where

main :: IO ()
main = do 
        print "Hello World"
```

You compile your program from the command line like this:
```
ghc Main.hs
```

This will output three new files
```
 1) Main
 2) Main.hi
 3) Main.o
```

The first is the executable that we can run. On a Unix system, this can be done with the command `./Main`. 

The second and third files are intermediate files produced while compiling the `Main` executable.

The `Main.hi` file is an interface file of the `Main.hs` file seen as a module. The `Main.o` file is an object file that contains the compiled code of this module. 

Together, they can be reused by GHC when it compiles the program again. 

This speeds up recompilation when you make a small change in a big project with many modules. 

In that case, GHC can reuse the old `.o` and `.hi` files of all unchanged modules to build the new executable.

The code from the additional Haskell files is included automatically if the files are sitting in the same directory as the Main.hs file.

Also, all the files have to have module declaration statements and the Main.hs file has to import them for them to be included in the compile process.

<div class="alert alert-block alert-info">
    GHCi, the Haskell REPL, allows you to load a Haskell file with the <code>:l</code> command. There, it does not matter whether the file has a main function or not. Once the file is loaded into GHCi, you can call any of the functions or use any of the types defined in the file and test whether they work as you expected. If you load a <code>Main.hs</code> file into GHCi that imports some user-defined modules, they will also be loaded, as in the compilation process.
</div>

## Language pragmas

Language pragmas, which enable language extensions, are a way to add or alter functionality of the Haskell language that is not there by default. 

To add a pragma to a Haskell file, declare `{-# LANGUAGE pragma_name #-}` at the top of the file.

Though pragmas extend your Haskell file as modules do, they do not bring any new functions to Haskell. Instead, they change the features of the language itself.

Let's look at an example where we use the **OverloadedStrings** pragma. In lesson 13 you will learn about bytestrings and how to display them.

When this pragma is added to your file, it allows you to write bytestrings as normal string literals without having to use the `pack` function from the **Data.ByteString.Char8** module. 

In [None]:
{-# LANGUAGE OverloadedStrings #-}

-- Char8 provides pack :: String -> ByteString (Data.ByteString.pack expects [Word8])
import qualified Data.ByteString.Char8 as BS

bytestring1 :: BS.ByteString
bytestring1 = "1"  -- would be a compile error without the pragma

-- This code compiles with or without the pragma
bytestring2 :: BS.ByteString
bytestring2 = BS.pack "2" 

main :: IO ()
main = do
  print bytestring1
  print bytestring2

main

This extension lets you define values of various string-like types just by writing a string literal. It works for any type that is an instance of the `IsString` type class. 

Besides the `ByteString` this will also work for the `Text` type that you will learn about also in lesson 13. 
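
The extension is not limited to library types. As a rough sketch, the code below gives a user-defined type an `IsString` instance so that string literals can be used for it as well. The type `Name` and the function `greeting` are hypothetical names for this example.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.String (IsString (..))

-- A hypothetical wrapper type around String.
newtype Name = Name String deriving (Show, Eq)

-- Tells the compiler how to turn a string literal into a Name.
instance IsString Name where
  fromString = Name

greeting :: Name -> String
greeting (Name n) = "Hello, " ++ n

main :: IO ()
main = putStrLn (greeting "Ada")  -- the literal "Ada" is read as a Name
```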

We see that with this pragma we can make it easier to code because we do not have to use the `BS.pack` function.

But sometimes we need to add a pragma because the solution we want to implement is not possible without it.

Let's say we have a requirement to define two user types with record syntax that should both contain a field called `name`.

We can accomplish this if we add the `DuplicateRecordFields` language extension to our code, which allows duplicate field names.

In [None]:
{-# LANGUAGE DuplicateRecordFields #-}

data UserAge = UserAge { name :: String
                       , age :: Int }

data UserHeight = UserHeight { name :: String
                             , height :: Int }                     
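
With two fields called `name` in scope, the constructor tells the compiler which record is meant. The sketch below builds and pattern matches such records; the functions `describeAge` and `describeHeight` are assumptions made for illustration.

```haskell
{-# LANGUAGE DuplicateRecordFields #-}

data UserAge = UserAge { name :: String
                       , age :: Int }

data UserHeight = UserHeight { name :: String
                             , height :: Int }

-- The UserAge constructor in the pattern selects the right name field.
describeAge :: UserAge -> String
describeAge UserAge { name = n, age = a } = n ++ " is " ++ show a ++ " years old"

describeHeight :: UserHeight -> String
describeHeight UserHeight { name = n, height = h } = n ++ " is " ++ show h ++ " cm tall"

main :: IO ()
main = do
  putStrLn (describeAge UserAge { name = "Ann", age = 30 })
  putStrLn (describeHeight UserHeight { name = "Bob", height = 180 })
```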

There are many other pragmas that you can add to your code. Some of them are:

- **NoImplicitPrelude**: This language extension prevents the Prelude module from being imported by default.<br>
In Plutus, the Cardano smart contract language, we prefer to use a custom Prelude that uses strict functions by default. 

- **TemplateHaskell**: Provides tools for Haskell meta-programming, which means that the code generates other code. It is also used in Plutus.

- **ViewPatterns**: Allows for more sophisticated pattern matching.
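
To give an idea of what **ViewPatterns** offers, the sketch below applies a function to the argument inside the pattern and matches on the result. The function `isYes` is a made-up example.

```haskell
{-# LANGUAGE ViewPatterns #-}

import Data.Char (toLower)

-- The pattern (f -> p) applies f to the argument and matches the result against p.
isYes :: String -> Bool
isYes (map toLower -> "yes") = True
isYes _                      = False

main :: IO ()
main = do
  print (isYes "YES")  -- True
  print (isYes "no")   -- False
```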

A list of all language extensions can be found on Hackage [(3)](https://hackage.haskell.org/package/template-haskell-2.19.0.0/docs/Language-Haskell-TH.html#g:5).

## Using cabal

### Introduction

**What is cabal?**

It is a Haskell package management tool that can be used from the command line, similar to **pip** for Python or **npm** for JavaScript. 

The name **cabal** stands for *Common Architecture for Building Applications and Libraries*.

**What is a Haskell package?**

A Haskell **package** is a collection of Haskell files that define modules. The modules contain code that offers related functionality. 

The reasons why a package contains multiple modules are: 
- a developer might not need all the functionality from the package and wants to import only certain modules

- it is easier to maintain the package if the code is contained in multiple modules instead of one 

**Why do we need cabal?**

Imagine we have a file that contains the blood molar concentrations of vitamin C, vitamin E and copper for a person. 

The units are `mol/L` and the numbers are written in scientific notation.
```
Vitamin C: 40e-6
Vitamin E: 50e-6
Copper:    15e-6
```

Now when you read the file and extract the strings that contain the numbers, you could parse them as `Double` values with the `read` function, but floating-point numbers cannot represent most decimal fractions exactly.

After some searching on Google, you find out there is a module called **Data.Scientific** that represents numbers in scientific notation with arbitrary precision.

The parsing can still be done with the `read` function:
```haskell
read "40e-6" :: Scientific
```

But when you try to import this module, you realize that you get an error.

The reason for this is that the package **scientific** which contains this module is not installed by default when you install **GHC**.

Only **packages** that contain commonly used modules come with the default installation of Haskell. Some of them are **base**, **text** and **time**.

For other packages that you might need, **cabal** can help you to install them on your OS.

### Managing packages with cabal

**How to check if a module / package is available by default?**

To check whether a module can be imported in Haskell, simply start GHCi and type `import module_name`. 

If the package containing the module is not installed, you will get an error. 

You can also use the TAB button for auto-completion. For instance, type `import Data.A` and hit TAB. You will get a list of all modules that start with `Data.A`. 

For packages, you can display the list of installed packages on your OS with the command:
```
cabal list --installed
```

**How to install a Haskell package?**

When **cabal** installs a package it downloads it from [Hackage](https://hackage.haskell.org/). As said before we will talk about Hackage in lesson 15.

To install a package on your OS, you can use the command:
```
cabal install package_name
```

Cabal may complain if the package is a library that does not provide an executable. In this case, use the command:
```
cabal install --lib package_name
```

If you download a tar.gz file of a package, you can install the package locally with the command:
```
cabal install ./package_name.tar.gz
```

**How to remove a Haskell package?**

Currently, **cabal** does not know how to uninstall packages.

You can install the **cabal-uninstall** package that lets you remove Haskell packages.
```
cabal-uninstall package_name
```

### Building a project with cabal

We have now learned how to use **cabal** to manage Haskell packages. But **cabal** can also be used for another purpose.

You can use it to build Haskell projects instead of compiling the source code with the **ghc** command.

Cabal uses a `.cabal` file that describes the building process of a project.

The reasons for building projects with **cabal** instead of **ghc** are:
- It can handle a project structure that contains files in multiple folders.

- In the cabal file you can specify for each folder separately: 
  - which of the installed packages you want to import.

  - which packages cabal should add as build dependencies.<br>
    They will be used in the compiling process but not installed on the OS.

  - if it contains the main executable file, library files or test files.

- You can create a package with cabal and distribute it to other developers.

**Creating new project**

To create a new project in cabal, create an empty folder, move into it and use one of the following commands:

Creates a simple project:
```
cabal init
```

Creates a project by asking you multiple questions where you can choose from a set of parameters:
```
cabal init --interactive
```
**NOTE**: Initially, you should say "no" for a simple project. Otherwise, the command runs as in the first case.

The commands above will create a cabal file and some folders with files depending on the options you specified.

**Explaining the cabal file**

Let's have a look at the contents of an example cabal file: 
```
cabal-version:      2.4
name:               test
version:            0.1.0.0

-- A short (one-line) description of the package.
-- synopsis:

-- A longer description of the package.
-- description:

-- A URL where users can report bugs.
-- bug-reports:

-- The license under which the package is released.
-- license:
author:             Luka Kurnjek
maintainer:         luka.kurnjek@iohk.io

-- A copyright notice.
-- copyright:
-- category:
extra-source-files: CHANGELOG.md

executable test
    main-is:          Main.hs

    -- Modules included in this executable, other than Main.
    -- other-modules:

    -- LANGUAGE extensions used by modules in this package.
    -- other-extensions:
    build-depends:    base 
    hs-source-dirs:   app
    default-language: Haskell2010
```

The fields in the beginning are self-explanatory. You can remove any comments that start with `--`. 

The name of your project is set to the name of the folder in which you created the project. 

This name will also be used for the name of the package if you want to build and distribute it. 

In case a package with the name you set already exists on Hackage, you will be notified. 

After the initial fields, the `extra-source-files` field lists additional files, such as `CHANGELOG.md`, that are shipped with the source distribution of the package.

Then come the sections that define the parts of the project, each with its own build instructions. Some of the more often used ones are:
- `executable`: defines data for the executable part of the project

- `library`: defines data for the supporting libraries of the project

- `test-suite`: defines data for the testing part of the project

The `executable` section first defines the name of the executable and then its configuration options:
- `main-is` defines the Haskell file that gets executed when the program is started (usually Main.hs). 

- `build-depends` defines the packages needed to build this project (the `base` package is listed by default)

- `hs-source-dirs` defines the name of the folder where the main executable file resides

- `default-language` specifies the Haskell language standard you want to use (for example `Haskell2010`).

Other configuration options can also be added to the executable section, such as:
- `import` pulls in the fields of a `common` stanza defined earlier in the cabal file

- `ghc-options` defines which compiler flags to use when building this code

Cabal supports equality and inequality operators for comparing package versions. 

For instance when you define the `build-depends` you can set conditions for library versions. 

You could for instance write: 
```
build-depends: base ^>=4.14.3.0
```

The caret operator `^>=` treats `^>= x.y.z` as identical to `>= x.y.z && < x.(y + 1)`. 

So in our case, we could also write the above statement as: 
```
base >= 4.14.3.0 && < 4.15
``` 
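
To make the rule concrete, the sketch below models it in Haskell with the `Data.Version` module from **base**. This is only an illustration of the formula above and not how cabal itself is implemented. The function name `caretSatisfies` and its argument layout (major `x`, minor `y`, remaining components) are assumptions.

```haskell
import Data.Version (Version, makeVersion)

-- Hypothetical model of `^>= x.y.rest`:
-- the version must be >= x.y.rest and < x.(y + 1).
caretSatisfies :: Int -> Int -> [Int] -> Version -> Bool
caretSatisfies x y rest v =
  v >= makeVersion (x : y : rest) && v < makeVersion [x, y + 1]

main :: IO ()
main = do
  -- Checks against the bound ^>=4.14.3.0
  print (caretSatisfies 4 14 [3, 0] (makeVersion [4, 14, 3, 0]))  -- True
  print (caretSatisfies 4 14 [3, 0] (makeVersion [4, 14, 9]))     -- True
  print (caretSatisfies 4 14 [3, 0] (makeVersion [4, 15, 0]))     -- False
  print (caretSatisfies 4 14 [3, 0] (makeVersion [4, 13, 2]))     -- False
```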

For the `library` and `test-suite` sections, the same options and rules apply as for the `executable` section.

Common folder naming practices for sections are:
- `src` or `lib` for `library` section

- `test` for `test-suite` section

**Building and running your project**

All the commands in this section need to be run inside the project folder at the top level.

To build your project, you run the command:
```
cabal build
```

To run your project, you run the command:
```
cabal exec project_name
```

You can do both actions also with one command:
```
cabal run
```

## That's it for today!