# ECS713P Functional Programming

## Notebook 1

## A first taste of Haskell

This is a Jupyter notebook running Haskell code. It is an interactive document. You can take a copy, edit it and save it for yourself. As you edit it, you can save checkpoints. But you must download a copy when you finish your session. The copy you now see is in the temporary storage associated with a Docker image. 

If this is the first time you've used a Jupyter Notebook then you may want to take the short user interface tour. Click on Help above. Also accessible from Help is a table of keyboard shortcuts, general help on using notebooks, and a guide to the html-like language the text-like cells are written in. Start with [Basic writing and formatting syntax](https://help.github.com/en/github/writing-on-github/basic-writing-and-formatting-syntax).

There are two main types of cells. This is a markdown cell containing text in the markdown formatting language, which is a bit like HTML in the sense of doing much the same job but with an almost completely different syntax. See: [markdown guide](https://www.markdownguide.org/basic-syntax/) for a slightly different guide.

The other main type of cell contains Haskell code. Here is a simple example. It is just an expression. Run the cell by selecting it and then pressing the run button, or by selecting it and then SHIFT+ENTER. This will evaluate the expression, and the result will be printed below the cell. Change this expression and run it again to see a different result. 

In [None]:
3+5

Congratulations, you have written and run your first piece of Haskell!

In general, you can put a bunch of things into a cell. When you run it, you will get the result of evaluating the different expressions. 

In [None]:
True && False
"This"++" is a String"

A typical Haskell program consists of a sequence of declarations (or definitions). The simplest form of declaration (definition) is just:

`<identifier> = <expression>`

If you run the code below, you won't get an output. We've made a declaration; we haven't computed a value. 

In [None]:
x = 7

But now we can use `x` in expressions (to represent the value `7`).

In [None]:
x+3

**BEWARE** This is not the same as an assignment. You can't change the value of `x`. Try running the cell below. As before, it runs immediately and nothing is printed. 

In [None]:
x = x+1

Now try evaluating `x`. You will see that the kernel runs, and keeps running. Press the square stop button above to interrupt it. 

In [None]:
x

The problem is that we just declared `x` to be `x+1`. That puts the kernel into an infinite loop if we ask it what `x` is. 

More interestingly we can declare functions: 

In [None]:
inc y = y+1

In [None]:
inc 4

Haskell is a strongly typed language and we can only use functions on particular data types. `inc` only works for numbers. 

In [None]:
inc "a string"
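The error from the cell above comes from the type of `inc`. As a small sketch (the explicit type signature and the name `incremented` are additions for illustration, not part of the original notebook), this is the type that GHC infers for `inc`: it accepts any numeric type, and `String` is not one.

```haskell
-- The same definition as above, now with an explicit type signature.
-- "Num a =>" says that inc works for any numeric type a.
inc :: Num a => a -> a
inc y = y + 1

-- inc can be used at different numeric types (here Double) ...
incremented :: [Double]
incremented = [inc 4, inc 2.5]

-- ... but not on a String, because String is not a numeric type:
-- inc "a string"   -- rejected by the type checker
```

You can ask for the type of any function yourself with `:type inc`.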

## A substantial example

Now we are going to move onto a more substantial example of a Haskell program. This illustrates the programming style. 

Let's suppose we have an address book, and we want to extract all the phone numbers in it. We need to be able to pass the book to Haskell in some format, so let's suppose we have saved it in a standard format to a file. Here is an example, saved from the standard Mac address book. This is just one card to give us a simple test: 

```
BEGIN:VCARD
VERSION:3.0
PRODID:-//Apple Inc.//Address Book 6.1.2//EN
N:Doe;John;;;
FN:Doe John
ORG:Queen Mary;
EMAIL;type=INTERNET;type=HOME;type=pref:JohnDoe@nogmail.com
TEL;type=CELL;type=VOICE;type=pref:0751 234567
TEL;type=HOME;type=VOICE:020 7123 4567
item1.ADR;type=HOME;type=pref:;;42 Nowhere St;London;;E1 0XX;
item1.X-ABADR:gb
X-ABUID:85152BB5-BFB5-45DA-853A-BA021C7A0FC8:ABPerson
END:VCARD
```

Notice that this is basically just a text file, split into lines, with a bit of data on each line. 

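Haskell deliberately keeps reading input separate from ordinary computation, which makes reading a file look harder than in other languages. So here is a bit of magic: the `<-` arrow runs `readFile` and binds the contents of the file to the name `vcard`. 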

In [None]:
vcard <- readFile "john.vcard"

In [None]:
vcard

We can see that we have the entry above as a string with escaped newline characters. Let's just check its type. 

In [None]:
:type vcard

The task is to extract the phone numbers from this string. Before we jump in let's think about the strategy. How might we do it? One option is to identify a format for phone numbers, and to recognise substrings that have that format. But that is quite hard. There are lots of formats for phone numbers, and we might have false negatives and even some false positives as a result. Moreover, this is structured data. Let's use that.

First, the card is split into lines, with one kind of information on each line. 

**Stage 1** split the card into lines

Then we notice that all of the lines containing a phone number start with the string `TEL`.

**Stage 2** extract the lines containing the phone numbers (those beginning with `TEL`)

Finally, we notice that the phone number itself comes between the first colon (`:`) and the end of the line.

**Stage 3** extract the phone number from each line. 

By this stage we should be done. 
    

There's a little bit more to do before we start. Let's think about the data involved. We've written about this almost as if we were making changes to some piece of data that we start out with. But that's not really the case. 

**Stage 1** converts a String into some kind of list of lines (or list of Strings). We're not changing the data we have. We are producing a new piece of data. 

**Stage 2** extracts some of the entries of the list. That could be interpreted as changing the list (without modifying the entries). But that can be a dangerous thing to do. If any other code is using the same list, it will now produce a different result. That may not be what you want. It's safer to produce a new list just containing the entries you want. 

**Stage 3** extracts the phone numbers one at a time from a list of lines. Again we could interpret it as a modification (this time a modification of each of the entries in a list without changing the shape of the list). But again that is often a dangerous thing to do, particularly if the list might be shared. A further hint that it would be a good idea to produce a new list just containing the phone numbers comes from the thought that intuitively we are starting out with a list of vcard lines, and producing a list of phone numbers. So the input and output have conceptually different types. This means you should probably produce new data, not modify the old.  

So we are going to write our program by producing a function to carry out each stage, producing a new piece of data at each stage, and then pipeline our data through them. Haskell is designed for this. 
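To illustrate the point about producing new data rather than modifying the old, here is a minimal sketch with made-up data (the names `original`, `evens` and `labels` are hypothetical). Each step builds a new list, and the list we started with is left untouched.

```haskell
-- A list we want to keep unchanged (illustrative data).
original :: [Int]
original = [1, 2, 3, 4, 5, 6]

-- filter builds a new list; it does not modify original.
evens :: [Int]
evens = filter even original

-- map also builds a new list, here with a different element type.
labels :: [String]
labels = map show evens
```

After these declarations `original` still has all six entries, so any other code that uses it is unaffected.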

## Stage 1: extracting a list of lines

This is the easiest of the three stages. Haskell has a standard function to do this. It is called `lines`.

In [None]:
linesOfCard = lines vcard

We'll print the value of `linesOfCard`, and then check its type. 

In [None]:
linesOfCard

In [None]:
:type linesOfCard

This looks very like the String we started with, but look closely and there are differences. The type is `[String]`, not `String`. This means that it is a list of Strings. The value starts with a `[`, not a `"`, and all the escaped newlines have disappeared. It is now a list with 13 entries, one for each line of the card. 

In [None]:
length linesOfCard

Let's look at the eighth line (numbering starts at 0). 

In [None]:
linesOfCard!!7

So that is indeed a line of the card containing a phone number. 
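If you want to experiment with `lines` and `!!` without the file, a small in-memory string works just as well. This is only a sketch; the names `sample`, `sampleLines` and `secondLine` are made up for the example.

```haskell
-- A three-line string in the same style as the card (illustrative data).
sample :: String
sample = "BEGIN:VCARD\nVERSION:3.0\nEND:VCARD\n"

-- lines splits at each newline and drops the newline characters.
sampleLines :: [String]
sampleLines = lines sample

-- (!!) indexes from 0, so index 1 is the second line.
secondLine :: String
secondLine = sampleLines !! 1
```

Evaluating `sampleLines` should show a list of three Strings with no `\n` left in any of them.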

## Stage 2: extract the list of lines containing the phone numbers

Once again we are going to use some standard Haskell functions. 

`filter` extracts all the elements of a list with a particular property. 

In [None]:
filter even [1,2,3,4,5,6]

The property we are interested in is that the first three characters of the line are `"TEL"`. We'll write a little function that checks that. We use the function `take`, which takes the first however many characters of a string (or more generally entries of a list). 

In [None]:
take 3 "abcdef"

In [None]:
take 3 (linesOfCard!!7)

In [None]:
isTELLine line = take 3 line == "TEL"

We simply check whether the result of taking the first three characters of the line is the String `"TEL"`. Let's verify that it works. 

In [None]:
linesOfCard !! 7 
isTELLine (linesOfCard !! 7)

Try again with different lines to make sure. 

We can now use this with `filter`. (Haskell variable and function names must begin with a lowercase letter, which is why the result is called `tELLinesOfCard`.) 

In [None]:
tELLinesOfCard = filter isTELLine linesOfCard

In [None]:
tELLinesOfCard
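As an aside, the standard library function `isPrefixOf` from `Data.List` performs the same check as `take 3 line == "TEL"` without having to count characters. The sketch below is an alternative, not the notebook's own method; `isTELLinePrefix`, `sampleLines` and `telLines` are hypothetical names.

```haskell
import Data.List (isPrefixOf)

-- Hypothetical alternative to isTELLine, using a library function.
isTELLinePrefix :: String -> Bool
isTELLinePrefix line = "TEL" `isPrefixOf` line

-- A few lines in the style of the card (illustrative data).
sampleLines :: [String]
sampleLines =
  [ "FN:Doe John"
  , "TEL;type=CELL;type=VOICE;type=pref:0751 234567"
  , "TEL;type=HOME;type=VOICE:020 7123 4567"
  ]

-- Keeps only the two lines that start with "TEL".
telLines :: [String]
telLines = filter isTELLinePrefix sampleLines
```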

## Stage 3: extract the phone numbers from each line

We are going to start by writing a function that extracts the phone number from a single line. Then we are going to see how to apply this to each entry in our list of lines.

Start by looking at a line containing a phone number. 

In [None]:
tELLinesOfCard !! 0

When you run this cell, the notebook may also suggest using `head`, a special function for the first element of a list. Let's ignore that. 

If we look at this line, the phone number is between the first colon (`:`) and the end of the line.

We produce a small function to check whether a character is not a colon.

In [None]:
isNotColon c = c /= ':'

In [None]:
isNotColon 'a'
isNotColon ':'

Note that characters are delimited by single quotes. 

We use the function `dropWhile` to remove everything up to the first colon. 

In [None]:
dropWhile isNotColon (tELLinesOfCard !! 0)

We still have the `:` at the beginning of the String. To get rid of this we can just take the `tail` of the String. This is everything except the first character. 

In [None]:
tail $ dropWhile isNotColon (tELLinesOfCard !! 0)

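There is an odd `$` sign there. `f $ x` means the same as `f x`, but everything to the right of the `$` is treated as the argument of the function on the left, so we are pipelining values in from the right through the function on the left. It is just a way of writing function application without brackets.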
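To make the role of `$` concrete, here is a small sketch that writes the same expression twice, once with brackets and once with `$`. The names `telLine`, `withBrackets` and `withDollar` are made up for the example.

```haskell
isNotColon :: Char -> Bool
isNotColon c = c /= ':'

-- One line in the style of the card (illustrative data).
telLine :: String
telLine = "TEL;type=HOME;type=VOICE:020 7123 4567"

-- With brackets: the argument of tail is written in parentheses.
withBrackets :: String
withBrackets = tail (dropWhile isNotColon telLine)

-- With $: everything to the right of $ is the argument of tail.
withDollar :: String
withDollar = tail $ dropWhile isNotColon telLine
```

The two definitions give the same String; `$` only saves a pair of brackets.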

But now we need to package this operation as a function. That just means replacing the example with a parameter in the context of a declaration.

In [None]:
getTELFromLine line = tail $ dropWhile isNotColon line

We're nearly there. All we need to do now is apply this to every appropriate line. In order to do this we use the function `map`, which does exactly that. 

In [None]:
map (*2) [1,2,3,4]

In [None]:
map getTELFromLine tELLinesOfCard

That's it. We now have the phone numbers. 

## Putting it together

Let's put this all together so that we have a single function taking a vcard into its list of phone numbers. All we've done is pipeline three operations together. 

In [None]:
getTELFromVcard card = map getTELFromLine $ filter isTELLine $ lines card

In [None]:
vcard 
getTELFromVcard vcard

This code means we take the vcard and pipe it successively through the following operations (reading the definition from right to left):
- `lines`: split into lines
- `filter isTELLine`: extract the lines holding phone numbers
- `map getTELFromLine`: get the telephone number from each line. 
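The three operations listed above can also be written with each intermediate result given a name. This is only a sketch of the same pipeline in a different layout; `getTELFromVcardSteps` and the names in the `where` clause are hypothetical, and the helper functions are repeated so that the cell runs on its own.

```haskell
isNotColon :: Char -> Bool
isNotColon c = c /= ':'

isTELLine :: String -> Bool
isTELLine line = take 3 line == "TEL"

getTELFromLine :: String -> String
getTELFromLine line = tail $ dropWhile isNotColon line

-- The same pipeline, with one named value per stage.
getTELFromVcardSteps :: String -> [String]
getTELFromVcardSteps card = numbers
  where
    cardLines = lines card                    -- stage 1: split into lines
    telLines  = filter isTELLine cardLines    -- stage 2: keep the TEL lines
    numbers   = map getTELFromLine telLines   -- stage 3: extract each number
```

Each name in the `where` clause is a new piece of data, which matches the plan of producing new data at every stage.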

Now putting the entire program together we have: 

In [None]:
isTELLine line = take 3 line == "TEL"
getTELFromLine line = tail $ dropWhile (/= ':') line
getTELFromVcard card = map getTELFromLine $ filter isTELLine $ lines card

I've made one small change here. I've replaced the function `isNotColon` with a nifty bit of shorthand `(/= ':')`.
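One thing to be aware of is that `tail` raises an error when it is given an empty list, which would happen here if a `TEL` line contained no colon. A minimal sketch of a more forgiving variant, assuming we are happy to get an empty String back in that case, uses `drop 1` instead. The name `getTELFromLineSafe` and the test values are hypothetical.

```haskell
-- Hypothetical variant of getTELFromLine.
-- drop 1 removes the leading colon, and returns "" for an empty String
-- where tail would raise an error.
getTELFromLineSafe :: String -> String
getTELFromLineSafe line = drop 1 $ dropWhile (/= ':') line

-- A line with a colon behaves as before.
withColon :: String
withColon = getTELFromLineSafe "TEL;type=HOME;type=VOICE:020 7123 4567"

-- A line without a colon gives the empty String instead of an error.
withoutColon :: String
withoutColon = getTELFromLineSafe "TEL;type=HOME"
```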

Haskell encourages this kind of programming. We've taken a piece of data and applied a lot of functions to it to get the end result we want. What we haven't done at all in this program is read data in or print it out. Haskell encourages you to read and print in separate bits of code, as here: 

In [None]:
showTELFromVcardFile filename = 
  do
    vcard <- readFile filename        -- read data in here
    let tels = getTELFromVcard vcard  -- manipulate it
    print tels                        -- print result  

In [None]:
showTELFromVcardFile "john.vcard"

This has a different type: it is an `IO` action rather than a plain value. 

In [None]:
:type showTELFromVcardFile "john.vcard"
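One benefit of keeping the reading and printing separate is that `getTELFromVcard` can be tried out on a String written directly in the code, with no file involved. The sketch below does that; `sampleCard` and `sampleTels` are made-up names, and the card is a shortened copy of the example above.

```haskell
isTELLine :: String -> Bool
isTELLine line = take 3 line == "TEL"

getTELFromLine :: String -> String
getTELFromLine line = tail $ dropWhile (/= ':') line

getTELFromVcard :: String -> [String]
getTELFromVcard card = map getTELFromLine $ filter isTELLine $ lines card

-- An in-memory card; unlines joins the lines with newline characters.
sampleCard :: String
sampleCard = unlines
  [ "BEGIN:VCARD"
  , "FN:Doe John"
  , "TEL;type=CELL;type=VOICE;type=pref:0751 234567"
  , "TEL;type=HOME;type=VOICE:020 7123 4567"
  , "END:VCARD"
  ]

-- The phone numbers found in sampleCard, computed without any I/O.
sampleTels :: [String]
sampleTels = getTELFromVcard sampleCard
```

Compare `:type sampleTels` with the type of `showTELFromVcardFile "john.vcard"` to see the difference between a plain value and an `IO` action.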