# CIS 194: Homework 2

Something has gone terribly wrong!
- Files you will need: Log.hs, error.log, sample.log
- Files you should submit: LogAnalysis.hs

## Log file parsing

We’re really not sure what happened, but we did manage to recover
the log file error.log. It seems to consist of a different log message
on each line. Each line begins with a character indicating the type of
log message it represents:
- 'I' for informational messages,
- 'W' for warnings, and
- 'E' for errors.

The error message lines then have an integer indicating the severity
of the error, with 1 being the sort of error you might get around to
caring about sometime next summer, and 100 being epic, catastrophic
failure. 

All the types of log messages then have an integer time stamp
followed by textual content that runs to the end of the line. Here is a
snippet of the log file including an informational message followed
by a level 2 error message:

```
I 147 mice in the air, I’m afraid, but you might catch a bat, and
E 2 148 #56k istereadeat lo d200ff] BOOTMEM
```

It’s all quite confusing; clearly we need a program to sort through
this mess. We’ve come up with some data types to capture the structure of this log file format:

```haskell
data MessageType = Info
                 | Warning
                 | Error Int
  deriving (Show, Eq)

type TimeStamp = Int

data LogMessage = LogMessage MessageType TimeStamp String
                | Unknown String
  deriving (Show, Eq)
```

Note that LogMessage has two constructors: one to represent normally formatted log messages, and one to represent anything else that does
not fit the proper format.
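
As a rough illustration of the two constructors, here is how a few lines might look as values. The types are repeated from the declarations above so the sketch loads on its own; the names `infoExample`, `errorExample`, and `unknownExample` and the shortened message texts are made up for this example.

```haskell
-- Types repeated from Log.hs so this sketch loads on its own.
data MessageType = Info
                 | Warning
                 | Error Int
  deriving (Show, Eq)

type TimeStamp = Int

data LogMessage = LogMessage MessageType TimeStamp String
                | Unknown String
  deriving (Show, Eq)

-- The line "I 147 mice in the air": message type, time stamp, text.
infoExample :: LogMessage
infoExample = LogMessage Info 147 "mice in the air"

-- The line "E 2 148 BOOTMEM": the severity is stored inside Error.
errorExample :: LogMessage
errorExample = LogMessage (Error 2) 148 "BOOTMEM"

-- A line that does not fit the format is kept verbatim.
unknownExample :: LogMessage
unknownExample = Unknown "not a log line at all"
```

Note that the severity is not a separate field of LogMessage; it travels with the Error constructor, so Info and Warning messages cannot carry one by accident.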

We’ve provided you with a module Log.hs containing these data
type declarations, along with some other useful functions. 

The first few lines of LogAnalysis.hs should look like this:

```haskell
module LogAnalysis where
import Log
```

which sets up your file as a module named LogAnalysis, and imports the module from Log.hs so you can use the types and functions
it provides.

In [1]:
:l Log

In [2]:
parseMessage :: String -> LogMessage
parseMessage str = case words str of
  ("I" : time : msg) -> parseNormal Info time msg
  ("W" : time : msg) -> parseNormal Warning time msg
  ("E" : level : time : msg) -> parseError level time msg
  _ -> Unknown str
  where
    parseNormal t time msg = case reads time of
      [(ts, "")] -> LogMessage t ts (unwords msg)
      _ -> Unknown str

    parseError level time msg = case (reads level, reads time) of
      ([(lv, "")], [(ts, "")]) -> LogMessage (Error lv) ts (unwords msg)
      _ -> Unknown str

In [3]:
parseMessage "E 2 562 help help" == LogMessage (Error 2) 562 "help help"

True

In [4]:
parseMessage "I 29 la la la" == LogMessage Info 29 "la la la"

True

In [5]:
parseMessage "This is not in the right format" == Unknown "This is not in the right format"

True
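
The parser above leans on two Prelude behaviours that are easy to miss, so here is a small sketch of them in isolation. The helpers `readInt` and `respace` are hypothetical names introduced only for this illustration; they are not part of Log.hs.

```haskell
-- Hypothetical helper (not part of Log.hs): read a whole token as an Int.
readInt :: String -> Maybe Int
readInt s = case reads s of
  [(n, "")] -> Just n   -- one parse that consumed the entire token
  _         -> Nothing  -- no parse at all, or characters left over

-- Hypothetical helper: split on whitespace, then rejoin with single spaces.
respace :: String -> String
respace = unwords . words
```

With these definitions, `readInt "562"` is `Just 562`, while `readInt "56k"` and `readInt "x"` are both `Nothing`, which is why a malformed time stamp or severity falls through to Unknown. Because the message text is rebuilt with `unwords`, runs of spaces in the original line collapse to one: `respace "help   help"` is `"help help"`.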

In [6]:
parse :: String -> [LogMessage]
parse = map parseMessage . lines

In [7]:
testParse parse 10 "error.log"



## Putting the logs in order

Unfortunately, due to the error messages being generated by multiple
servers in multiple locations around the globe, a lightning storm, a
failed disk, and a bored yet incompetent programmer, the log messages are horribly out of order. 

Until we do some organizing, there
will be no way to make sense of what went wrong! We’ve designed a
data structure that should help—a binary search tree of LogMessages:

```haskell
data MessageTree = Leaf
                 | Node MessageTree LogMessage MessageTree
```

Note that MessageTree is a recursive data type: the Node constructor itself takes two children as arguments, representing the left and right subtrees, as well as a LogMessage. Here, Leaf represents the empty tree.

A MessageTree should be sorted by timestamp: that is, the timestamp of the LogMessage in any Node should be greater than the timestamp of every LogMessage in its left subtree, and less than the timestamp of every LogMessage in its right subtree.

Unknown messages should not be stored in a MessageTree since they lack a timestamp.
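
One way to make the ordering rule concrete is a small checker, sketched below. The helpers `timestamps` and `isOrdered` are hypothetical and not part of Log.hs, and the types are repeated so the block loads on its own. The sketch relies on the fact that a binary search tree satisfies the rule exactly when its left-to-right traversal is strictly increasing.

```haskell
-- Types repeated from Log.hs so this sketch loads on its own.
data MessageType = Info | Warning | Error Int
  deriving (Show, Eq)

type TimeStamp = Int

data LogMessage = LogMessage MessageType TimeStamp String
                | Unknown String
  deriving (Show, Eq)

data MessageTree = Leaf
                 | Node MessageTree LogMessage MessageTree

-- Hypothetical helper: time stamps in left-to-right (in-order) order.
-- Unknown messages should never be in a tree; if one is, it is skipped.
timestamps :: MessageTree -> [TimeStamp]
timestamps Leaf = []
timestamps (Node l (LogMessage _ ts _) r) = timestamps l ++ [ts] ++ timestamps r
timestamps (Node l (Unknown _) r) = timestamps l ++ timestamps r

-- Hypothetical check: every time stamp is smaller than the one after it.
isOrdered :: MessageTree -> Bool
isOrdered t = and (zipWith (<) ts (drop 1 ts))
  where ts = timestamps t
```

The comparison is strict, matching the wording of the rule. A log with repeated time stamps would need `<=` here instead.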

In [8]:
insert :: LogMessage -> MessageTree -> MessageTree
insert (Unknown _) tree = tree
insert msg Leaf = Node Leaf msg Leaf
-- Smaller time stamps go left; equal or larger ones go right.
insert msg@(LogMessage _ x _) (Node l m@(LogMessage _ y _) r)
  | x < y = Node (insert msg l) m r
  | otherwise = Node l m (insert msg r)

In [9]:
build :: [LogMessage] -> MessageTree
build = foldr insert Leaf

In [10]:
inOrder :: MessageTree -> [LogMessage]
inOrder Leaf = []
inOrder (Node l m r) = inOrder l ++ [m] ++ inOrder r

In [11]:
msgs <- testParse parse 10 "error.log"
inOrder (build msgs)
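
To see the build-then-traverse idea without the log file, here is a minimal sketch on a handful of made-up messages. The types and the three functions are repeated from Log.hs and the cells above so the block loads on its own; `sample` and `sorted` are hypothetical names used only here.

```haskell
-- Types and functions repeated so this sketch loads on its own.
data MessageType = Info | Warning | Error Int
  deriving (Show, Eq)

type TimeStamp = Int

data LogMessage = LogMessage MessageType TimeStamp String
                | Unknown String
  deriving (Show, Eq)

data MessageTree = Leaf
                 | Node MessageTree LogMessage MessageTree

insert :: LogMessage -> MessageTree -> MessageTree
insert (Unknown _) tree = tree
insert msg Leaf = Node Leaf msg Leaf
insert msg@(LogMessage _ x _) (Node l m@(LogMessage _ y _) r)
  | x < y     = Node (insert msg l) m r
  | otherwise = Node l m (insert msg r)

build :: [LogMessage] -> MessageTree
build = foldr insert Leaf

inOrder :: MessageTree -> [LogMessage]
inOrder Leaf = []
inOrder (Node l m r) = inOrder l ++ [m] ++ inOrder r

-- Made-up sample data, deliberately out of order.
sample :: [LogMessage]
sample =
  [ LogMessage Info 5 "five"
  , Unknown "not a log line"
  , LogMessage (Error 70) 9 "nine"
  , LogMessage Warning 2 "two"
  ]

-- Building the tree and reading it back left to right sorts by time stamp.
sorted :: [LogMessage]
sorted = inOrder (build sample)
```

Here `sorted` holds the messages with time stamps 2, 5, and 9, in that order, and the Unknown entry is gone. Since `build` uses `foldr`, the last element of the list is inserted first and becomes the root of the tree.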



## Log file postmortem

Write a function whatWentWrong which takes an unsorted list of LogMessages and returns the text of
every error message with a severity of 50 or greater,
sorted by timestamp.

In [15]:
whatWentWrong :: [LogMessage] -> [String]
whatWentWrong = map extractMsg . filter isSevereError . inOrder . build
  where
    isSevereError (LogMessage (Error level) _ _) = level >= 50
    isSevereError _ = False

    extractMsg (LogMessage _ _ msg) = msg
    extractMsg _ = ""

In [13]:
testWhatWentWrong parse whatWentWrong "sample.log"

["Way too many pickles","Bad pickle-flange interaction detected","Flange failed!"]
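
As a cross-check on a tree-based answer, the same result can be sketched with the library sort instead of a MessageTree. This is a swapped-in technique, not the approach the assignment asks for: `severeSorted` and `sampleMsgs` are hypothetical names, the sample data is made up, and `sortOn` comes from Data.List.

```haskell
import Data.List (sortOn)

-- Types repeated from Log.hs so this sketch loads on its own.
data MessageType = Info | Warning | Error Int
  deriving (Show, Eq)

type TimeStamp = Int

data LogMessage = LogMessage MessageType TimeStamp String
                | Unknown String
  deriving (Show, Eq)

-- Hypothetical alternative: keep (time stamp, text) pairs for errors of
-- severity 50 or more, sort the pairs by time stamp, then drop the stamps.
severeSorted :: [LogMessage] -> [String]
severeSorted msgs = map snd (sortOn fst severe)
  where
    -- Elements that do not match the pattern are skipped, not an error.
    severe = [ (ts, txt) | LogMessage (Error lvl) ts txt <- msgs, lvl >= 50 ]

-- Made-up sample data.
sampleMsgs :: [LogMessage]
sampleMsgs =
  [ LogMessage (Error 80) 9 "late"
  , LogMessage Info 1 "ignored"
  , LogMessage (Error 10) 2 "minor"
  , LogMessage (Error 50) 3 "early"
  , Unknown "junk"
  ]
```

For this data `severeSorted sampleMsgs` is `["early","late"]`. The two approaches should agree whenever time stamps are distinct; with repeated time stamps the relative order of the tied messages may differ, because `sortOn` is stable and the tree insertion is not guaranteed to be.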

## Optional

For various reasons we are beginning to suspect that the recent mess was caused by a single, egotistical hacker. Can you figure out who did it?

In [16]:
testWhatWentWrong parse whatWentWrong "error.log"

["Mustardwatch opened, please close for proper functioning!","All backup mustardwatches are busy","Depletion of mustard stores detected!","Hard drive failure: insufficient mustard","All backup mustardwatches are busy","Twenty seconds remaining until out-of-mustard condition","Ten seconds remaining until out-of-mustard condition","Empty mustard reservoir! Attempting to recover...","Recovery failed! Initiating shutdown sequence"]