# Basic I/O

## Outline

* Intro to impure functions

* Basic IO actions

* Composing IO actions

* The do block

* Recap

## Intro to impure functions

As we saw in lesson 1, Haskell is a purely functional language. This means that for every function, **only** its arguments determine what the outcome will be.

But this is quite limiting for a program in the real world where we want an application to process data from outside the program and provide results back. 

Pure functions cannot access data from outside the program, because the final result would then be determined not only by the function arguments but also by that outside data.

Still we want programs to be interactive, and they should definitely change things outside the scope of the program.

The interactions and changes outside the scope of a function are called **side effects**. 

Say we have a function that fetches the time and logs it into a file. 

Each time we call this function we give it no inputs and get no outputs back. 

We only observe the side effect of the file that is changed. 

This function is called **impure**. As a recap:

| Pure | Impure |
| --- | --- |
| Always produces the same result when given the same parameters | May produce different results for the same parameters |
| Never has side effects | May have side effects |
| Never alters state | May alter the global state of the program, system, or world |
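
To make the distinction concrete, here is a minimal sketch of a pure function next to the time-logging example described above. The names `square` and `logTime` and the file name `time.log` are made up for illustration, and the sketch assumes the `time` package (which normally ships with GHC) is available. The `IO` type and the `>>` and `>>=` operators it uses are explained later in this lesson.

```haskell
import Data.Time.Clock (getCurrentTime)

-- Pure: the result depends only on the argument.
square :: Int -> Int
square x = x * x

-- Impure: no inputs and no meaningful output, but each time it is performed
-- it appends the current time to the (hypothetical) file "time.log".
logTime :: IO ()
logTime = getCurrentTime >>= \now -> appendFile "time.log" (show now ++ "\n")

main :: IO ()
main = print (square 4) >> logTime
```

`square 4` evaluates to `16` every time, while the contents of `time.log` differ from run to run.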

## Basic IO actions

Haskell strictly separates pure code from the impure interactions with things outside the program. 

To still let our Haskell programs alter the world outside the program, Haskell provides **IO actions**, whose type is written `IO a`.

Like pure functions, definitions that produce IO actions can take arguments, so we sometimes call them **IO functions** or **impure functions**.

Strictly speaking, though, all functions in Haskell are pure: a function that returns an IO action purely maps its arguments to that action, and the side effects happen only when the action is performed.

The type of every such impure function ends in the `IO` wrapper followed by a type parameter `a`.

This tells us that the action may first perform some computation with possible side effects and then return something of type `a` wrapped in the `IO` context.

Inside IO actions it is possible to call pure functions and/or other IO actions. But you can never call an IO action inside of a pure function.

### Printing messages to the user

Let's first look at the `putStrLn` function, which takes a string as input and writes it to StdOut (standard output), which is displayed in the terminal if we run the code from there.
```haskell
putStrLn :: String -> IO ()
```
This function returns a value of type `IO ()`, that is, an IO action parameterized by `()`.

The `()` type, called unit, is the empty tuple. Since it has zero elements, it has exactly one possible value, also written `()`, which carries no information.

This indicates that the IO action does not return any meaningful result and only performs side effects.

If you want to use this function in a compiled Haskell program, you have to call it from an IO action. In Jupyter cells, which behave like the Haskell REPL, you can also call it directly.

We usually have an IO action called `main` in our Haskell programs that is of type `IO ()`. Inside of it we can call the `putStrLn` function.

In [None]:
-- An IO action that prints "Hello World" when performed.
main :: IO ()
main = putStrLn "Hello World"
main

If you load a Haskell file in GHCi with the `:l` command, the file does not have to contain a `main` action.

If you try to compile a single Haskell file into an executable and it does not contain `main`, you get a compilation error.

By default, `main` is the action that gets performed when you run a compiled program.

Below is an example in which we call a pure function inside an IO action that also takes a parameter.

In [None]:
-- A pure function that prepends the string "Hello " to another string
addHello :: String -> String
addHello string = "Hello " ++ string

-- Using the putStrLn function in the definition of an IO action.
newPutStrLn :: String -> IO ()
newPutStrLn name = putStrLn (addHello name)

newPutStrLn "John"

Since we used the IO action `putStrLn` inside the function `newPutStrLn`, it has to be marked as an IO action too, which is indicated by its return type `IO ()`.

We see that we can **never** escape the `IO` wrapper once we have used an IO action inside of a function.
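
The following sketch illustrates this rule. The helper names `shout`, `printShout` and `broken` are hypothetical, and the rejected definition is left commented out so that the block still compiles.

```haskell
import Data.Char (toUpper)

-- A pure helper: upper-cases a string and appends "!".
shout :: String -> String
shout s = map toUpper s ++ "!"

-- Fine: a pure function used inside an IO action.
printShout :: String -> IO ()
printShout s = putStrLn (shout s)

-- Not allowed: the type claims to be pure, but the body is an IO action.
-- Uncommenting these two lines makes the compiler reject the program,
-- because putStrLn returns IO () and not ().
-- broken :: String -> ()
-- broken s = putStrLn (shout s)

main :: IO ()
main = printShout "hello"
```

The pure function `shout` can be used anywhere, but as soon as `putStrLn` appears in a definition, the `IO` wrapper has to appear in its type.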

### Retrieving input data from the user

We can retrieve some data from the user with the `getLine` action.
```haskell
getLine :: IO String
```
It reads a line of input from the user via StdIn (standard input) and returns it as a string.

In the code below we see how the program asks the user for some input.

In [None]:
main :: IO String
main = getLine

main

### Reading from the file system

The last example we will show here is the `listDirectory` function, which is located in the module `System.Directory`.
```haskell
listDirectory :: FilePath -> IO [FilePath]
type FilePath = String
```
It takes a string that represents the file path to a folder and returns, wrapped in `IO`, a list of strings naming the files and folders contained in that folder.

Modules are bundles of Haskell functions that are not available by default unless we import them. We will talk more about this in lesson 11.

Here we show a simple example of how to import this function from its module and use it. 

In [None]:
import System.Directory (listDirectory)

-- The "." input refers to the current jupyter working directory (week02)
main :: IO [String]
main = listDirectory "."

main

## Composing IO actions

Now that we can use actions to perform useful side effects outside our program, we would like to be able to compose two IO actions.

We could try to use another `IO a` as an input to get a function of type `IO a -> IO b`. 

In [None]:
-- An IO action that prints "Hello" when performed.
someIOType :: IO ()
someIOType = putStrLn "Hello"

-- A naive and wrong attempt to combine two IO () actions: the argument is ignored
combineIO :: IO () -> IO ()
combineIO io = putStrLn "world"

combineIO someIOType

This program works but it does not print the *Hello* string. The `someIOType` action is not being performed.

This is because the input `IO ()` is not being used in the body of the function, so it is not being evaluated. 

But how can we perform this IO action on the right-hand side of the definition? To solve this, we introduce two new operators: 
- the **sequence** operator given by `>>`

- the **bind** operator given by `>>=`

These operators allow for the composition of IO actions. Their properties are: 

| `>>` | `>>=` |
| --- | --- |
| Composes two IO actions | Composes two IO actions |
| Discards the result of the first action | Forwards the result of the first action to the second |

We first look at an example of how the `>>` operator works.

In [None]:
-- A new expression that is of type IO ()
printHello :: IO ()
printHello = putStrLn "Hello " 

-- A new expression that is of type IO ()
printWorld :: IO ()
printWorld = putStrLn "World"

-- Combining the above IO actions
main :: IO ()
main = printHello >> printWorld

main

Here, the `printHello` action first performs its side effect, and then the `printWorld` action performs its side effect.

The arrows of the `>>` operator point in the direction of the flow, that is, they indicate the order in which the actions are performed.
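
With `>>` we can also repair the naive `combineIO` from earlier. The following is a sketch of one possible fix, not the only one: the argument is now used on the right-hand side, so it is performed before the second message is printed.

```haskell
-- An IO action that prints "Hello" when performed.
someIOType :: IO ()
someIOType = putStrLn "Hello"

-- The argument `io` is now part of the resulting action:
-- first `io` is performed, then "world" is printed.
combineIO :: IO () -> IO ()
combineIO io = io >> putStrLn "world"

main :: IO ()
main = combineIO someIOType
```

Performing `main` now prints `Hello` followed by `world` on separate lines.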

We could say the type signature for the sequence operator is:
```haskell
(>>) :: IO a -> IO b -> IO b
```
It takes two IO actions, performs both in order, discards the result of the first one and returns the result of the second one.

But the actual type signature is:
```haskell
(>>) :: Monad m => m a -> m b -> m b
```
where `m` stands for any type that is an instance of the `Monad` type class, and `IO` is one such instance. We will talk about this type class in lesson 19.

Next we look at an example of how the `>>=` operator works.

In [None]:
printHelloName :: String -> IO ()
printHelloName name = putStrLn ("Hello " ++ name)

main :: IO ()
main = getLine >>= printHelloName

We see how we get the input string with the `getLine` action and then pass it on to the `printHelloName` function.

**IMPORTANT:** The bind operator also removes the `IO` context from the `IO String` value and passes only a value of type `String` forward.

Here too, we could say the type signature for the bind operator is:
```haskell
(>>=) :: IO a -> (a -> IO b) -> IO b
```
It takes in an IO action that it performs and hands the result without the IO context to a function that produces another IO action.

The actual type signature is:
```haskell
(>>=) :: Monad m => m a -> (a -> m b) -> m b
```
and here too `m` stands for any instance of the `Monad` type class.

Another example of the bind operator can be seen below.

In [None]:
import System.Directory (listDirectory)

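-- An IO action that returns the entries of the current directory
getFiles :: IO [FilePath]
getFiles = listDirectory "." 

-- A function that takes a list of file paths and prints the first one
-- (head fails on an empty list, so this assumes a non-empty directory)
printFirstFile :: [FilePath] -> IO ()
printFirstFile = print . head 

-- Combining the above IO actions, forwarding the result of the first to the second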
main :: IO ()
main = getFiles >>= printFirstFile

main

Here we used `print`, which also returns an IO action, instead of `putStrLn` to show the first element of the list of file paths. 
```haskell
print :: Show a => a -> IO ()
```
This function converts a value of any type that has a `Show` instance to a string and sends it to StdOut, which saves us from writing the conversion ourselves.
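
As a rough mental model, `print` behaves like `putStrLn` composed with `show`. The sketch below defines a hypothetical `myPrint` this way, for illustration only, to show the difference between printing a string with `putStrLn` and with `print`.

```haskell
-- A hypothetical re-implementation of print, for illustration only.
myPrint :: Show a => a -> IO ()
myPrint x = putStrLn (show x)

main :: IO ()
main =
  putStrLn "file.txt"             -- prints: file.txt
    >> print "file.txt"           -- prints: "file.txt"  (show adds the quotes)
    >> myPrint "file.txt"         -- prints: "file.txt"
    >> myPrint [1, 2, 3 :: Int]   -- prints: [1,2,3]
```

This is why the file name in the previous example appears in quotes: `print` shows the `String` value as Haskell source text, while `putStrLn` writes its characters directly.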

## The do block

Now that we know how to compose actions, we are going to generalize this with the so-called **do block**. 

To motivate this useful syntax, we look at a few examples. 

First, consider the composition of many actions via the sequence operator `>>`.

In [None]:
--  IO ()              IO String           IO ()
putStrLn "Action one" >> getLine >> putStrLn "Action three"

Here each action is performed **sequentially**. The first and last actions print “Action one” and “Action three” respectively. 

The middle action waits for a line of input and prints nothing to the console; its result is discarded by `>>`.

Let's look at how this pattern would work for the bind operator `>>=`.

In [None]:
import System.Directory (listDirectory)

-- An IO action that returns the entries of the current directory
getFiles :: IO [FilePath]
getFiles = listDirectory "." 

-- A function that takes a list of file paths and prints the first one
printFirstFile :: [FilePath] -> IO ()
printFirstFile = print . head 

-- A function that takes a list of file paths and prints the last one
printLastFile :: [FilePath] -> IO ()
printLastFile = print . last 

main :: IO ()
main = getFiles >>= printFirstFile >>= printLastFile

main

The compiler gives an error because the types of the composed functions do not match! 

The function `printFirstFile` returns an `IO ()`, so `>>=` forwards a `()`, while the `printLastFile` function expects a `[FilePath]`.

Somehow we need to manage how the inputs and outputs of these functions are used. 

Ideally, we would like the `printLastFile` function to use the output of the `getFiles` action. 

This can be achieved with lambda functions, which we saw in lesson 5. As a small recap, they take the form:

In [None]:
(\x -> 2*x + 1) 1 -- Here the lambda function (\x -> 2*x + 1) is applied to the argument 1

Using this, we can combine the actions while making the results **progressively** available to the later actions inline. 

We do this by chaining the actions in order inside a lambda function. The following example achieves what we wanted:

In [None]:
import System.Directory (listDirectory)

-- An IO action that returns the entries of the current directory
getFiles :: IO [FilePath]
getFiles = listDirectory "." 

-- A function that takes a list of file paths and prints the first one
printFirstFile :: [FilePath] -> IO ()
printFirstFile = print . head 

-- A function that takes a list of file paths and prints the last one
printLastFile :: [FilePath] -> IO ()
printLastFile = print . last 

main :: IO ()
main = getFiles >>= (\x1 -> printFirstFile x1 >> printLastFile x1)

main

Here we first forward the output of the `getFiles` action, denoted by `x1`, into a lambda function. 

Then in the lambda function we call `printFirstFile` and `printLastFile` and chain them together with `>>`.

Notice that with this construction, all inputs become available to functions that need to use them. 

Clearly, this construction becomes cumbersome and impractical as the number of actions grows!

That's why, to make it easier, this whole construction can be written as a **do block**. This is just syntactic sugar for the construction above. 

This block looks like imperative programming, but at its core it is still functional, since it is built from the same operators and lambda functions. 

Let's re-write the previous main function in a do block:

In [None]:
-- The introduction of the do block
main :: IO ()
main = do
   x1 <- getFiles
   printFirstFile x1
   printLastFile x1

main

In a do block, each line can perform some action. In the above example, the block is introduced by the `do` keyword, followed by the first action `getFiles`. 

This action is performed, and its result is bound to the variable `x1` using the arrow `<-`. 

Then the next action, `printFirstFile`, is performed with the variable `x1` as input. 

Lastly, the action `printLastFile` is performed, also with `x1` as input. 

This do block is fundamentally the same as the previous example that used a lambda function. 
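
To see the correspondence on a slightly longer example, the sketch below writes the same action twice, once as a do block and once with `>>=`, `>>` and a lambda. The names `summary`, `reportDo` and `reportDesugared` are made up for this illustration.

```haskell
import System.Directory (listDirectory)

-- A pure helper that describes how many entries a directory listing has.
summary :: [FilePath] -> String
summary files = show (length files) ++ " entries"

-- Written as a do block.
reportDo :: IO ()
reportDo = do
  files <- listDirectory "."
  putStrLn "Entries in the current directory:"
  putStrLn (summary files)

-- The same action written by hand with >>=, >> and a lambda.
reportDesugared :: IO ()
reportDesugared =
  listDirectory "." >>= \files ->
    putStrLn "Entries in the current directory:" >>
    putStrLn (summary files)

main :: IO ()
main = reportDo >> reportDesugared
```

Each `x <- action` line corresponds to `action >>= \x -> ...`, and each line without an arrow corresponds to `action >> ...`.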

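With these do blocks, we can write clearly structured code that performs multiple side effects. 

Inside a do block, you can also use `let` bindings to define variables for the rest of the block. (`where` bindings can be used as well, but they attach to the enclosing definition and not to the do block itself.)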

In [None]:
main :: IO ()
main = do fileList <- getFiles
          let elemOne = fileList !! 1 -- take the 2nd element from the list
          print elemOne

main 

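The `let` keyword can also be understood in terms of lambda functions in the de-sugared code (the language actually translates it to a `let ... in` expression, which behaves the same here):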

In [None]:
main :: IO ()
main = getFiles >>= (\x1 -> (\x2 -> print x2) (x1 !! 1))

main

The `let` keyword also allows you to define functions besides variables. The previous example can be re-written to:

In [None]:
main :: IO ()
main = do fileList <- getFiles
          let elemOne xs = xs !! 1 -- elemOne is now a function
          print $ elemOne fileList

main 

### Nesting do-blocks

It may not appear obvious, but do-blocks can be nested. Here is an example:

In [None]:
import Data.Char ( isDigit )

main :: IO ()
main = do
    putStrLn "What is your age:"
    ageString <- getLine
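    -- An empty input would pass `all isDigit` and crash `read`, so reject it explicitly
    let validAge = not (null ageString) && all isDigit ageString
    if validAge
    then do
        let age = read ageString :: Int
            msg = "In 10 years you will be " ++ show (age + 10) ++ " years old."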
        putStrLn msg 
    else do
        putStrLn "Your age should contain only digits from 0-9."
        main

We import the `isDigit` function, which checks whether a character is a digit or not.
```haskell
isDigit :: Char -> Bool
```

Then we ask the user to enter their age, and after that we use an if-else expression whose branches are two different do-blocks.

If the input string contains only digits, we print a message saying how old the user will be in 10 years.

If it does not contain only digits, we notify the user and start the program from the beginning by calling `main` again. 

You may also have noticed that in the `then` do-block we define two variables but use the `let` keyword only once.

This is allowed in Haskell if the definitions have the same indentation and directly follow one another. Then you need to write `let` only for the first variable.
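
The validation can also be written with `readMaybe` from `Text.Read`, which returns `Nothing` instead of crashing when the string is not a number. This is an alternative sketch, not the approach used in the example above; the names `parseAge` and `respond` are hypothetical, and the sample inputs are hard-coded so that the block runs without user input.

```haskell
import Text.Read (readMaybe)

-- Parse a non-negative age; Nothing signals invalid input.
parseAge :: String -> Maybe Int
parseAge s = case readMaybe s of
  Just n | n >= 0 -> Just n
  _               -> Nothing

-- Each branch of the case expression is its own nested do block.
respond :: String -> IO ()
respond input = case parseAge input of
  Just age -> do
    let msg = "In 10 years you will be " ++ show (age + 10) ++ " years old."
    putStrLn msg
  Nothing -> do
    putStrLn ("Invalid age: " ++ show input)
    putStrLn "Your age should contain only digits from 0-9."

main :: IO ()
main = do
  respond "32"
  respond "abc"
```

Keeping the parsing in the pure function `parseAge` means it can be checked without performing any IO, while `respond` only decides which messages to print.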

## Recap

In this section, we have introduced the concept of IO actions. 

- Haskell code comes in two kinds: pure functions and impure functions, the latter of which are called IO actions.

- Impure functions are marked by the `IO` wrapper in their type and may perform useful side effects. 

- Once you use a function or variable type that is wrapped in IO, the function inside of which you use it has to be wrapped in IO. 

- IO actions can be composed using two operators, the sequence operator denoted by `>>` and the bind operator denoted by `>>=`.

- The do block lets you easily compose multiple actions and can be nested.