## File Reading

This document outlines techniques for efficiently reading and parsing a<br>
file of double-precision floating-point numbers (`Double`) in Haskell. It also<br>
outlines an API for querying and working with the contents of the file.<p>

The following code makes use of a number of libraries and ideas which<br>
can be found here: 
<a href="https://wiki.haskell.org/Numeric_Haskell:_A_Vector_Tutorial#An_example:_filling_a_vector_from_a_file">
Numeric_Haskell: A Vector Tutorial</a>. Some of the information<br>
found there is a bit outdated; in particular, `readDouble` is now <a href="https://github.com/wrengr/bytestring-lexing#changes-version-050-2015-05-06-vs-043-2013-03-21">deprecated</a><br>
in favor of `readDecimal`. This change leaves a few loose ends,<br>
which I hope to cover here.<p>
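One such loose end, as a rough illustration: `readDecimal` on its own accepts only unsigned numbers without an exponent. The sketch below, which assumes `bytestring-lexing` 0.5 or later, combines `readSigned` and `readExponential` from the same module to read signed numbers with an optional exponent. The helper name `readDouble'` is hypothetical and not part of the library.

```haskell
import qualified Data.ByteString.Char8 as B
import Data.ByteString.Lex.Fractional (readDecimal, readExponential, readSigned)

-- Hypothetical helper: accepts an optional sign and an optional exponent.
readDouble' :: B.ByteString -> Maybe (Double, B.ByteString)
readDouble' = readSigned readExponential

main :: IO ()
main = do
    -- Plain readDecimal: fails on the leading minus sign.
    print (readDecimal (B.pack "-42\n") :: Maybe (Double, B.ByteString))
    -- Plain readDecimal: stops in front of the exponent and leaves it unparsed.
    print (readDecimal (B.pack "15e2\n") :: Maybe (Double, B.ByteString))
    -- Combined reader: consumes sign, digits and exponent.
    print (readDouble' (B.pack "-15e2\n"))
```

Each reader returns the parsed value together with the unconsumed remainder of the input, so `readDouble'` can be dropped into an unfolding parser wherever `readDecimal` would otherwise be used.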

Essential to what follows are these libraries:
* Data.ByteString.Lex.Fractional
* Data.ByteString.Char8
* Data.Vector.Unboxed
* System.Environment

### Importing and Parsing

After creating a test data file with one million numbers in `bash` via `seq 1 1000000 > data`,<br>
we are ready to import the necessary libraries, read the file, and parse it. The code below<br>
then takes the additional step of summing the contents of the file.<br>
```
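{-# LANGUAGE BangPatterns #-}
import qualified Data.ByteString.Char8 as L
import qualified Data.ByteString.Lex.Fractional as L
import qualified Data.Vector.Unboxed as U
import System.Environment

main :: IO ()
main = do
    -- Read the whole file into memory as a strict ByteString.
    !s <- L.readFile "./data"
    print . U.sum . parse $ s

-- Parse newline-separated unsigned decimals into an unboxed vector.
parse :: L.ByteString -> U.Vector Double
parse = U.unfoldr step
  where
     step !s = case L.readDecimal s of
        Nothing       -> Nothing
        -- Skip the one-byte separator; unlike tail, drop 1 is safe on empty input.
        Just (!k, !t) -> Just (k, L.drop 1 t)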
```
Notice the `{-# LANGUAGE BangPatterns #-}` pragma and the corresponding `!`<br>
prepended to the argument and results of the parser's `step` function. The bangs force<br>
each value as soon as it is produced, so no unevaluated thunks pile up during the unfold;<br>
the file itself is already read strictly by `readFile`. The two modules imported as `L`<br>
export no clashing names and so can share the same qualifier. To facilitate data handling,<br>
the contents are stored as an unboxed vector of doubles, `U.Vector Double`.
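
The parser above expects a single separator byte after each number and a hard-coded path. As a minimal sketch, the variant below skips any run of whitespace between numbers and takes the path from the command line via `getArgs`. The name `parseSpaced` and the usage message are illustrative assumptions, not part of the original program.

```haskell
{-# LANGUAGE BangPatterns #-}
import qualified Data.ByteString.Char8 as B
import Data.ByteString.Lex.Fractional (readDecimal)
import Data.Char (isSpace)
import qualified Data.Vector.Unboxed as U
import System.Environment (getArgs)

-- Hypothetical variant of `parse`: tolerates blank lines, repeated
-- separators, and a missing trailing newline.
parseSpaced :: B.ByteString -> U.Vector Double
parseSpaced = U.unfoldr step
  where
    -- Drop leading whitespace, then try to read one unsigned decimal.
    step !s = case readDecimal (B.dropWhile isSpace s) of
        Nothing       -> Nothing
        Just (!k, !t) -> Just (k, t)

main :: IO ()
main = do
    args <- getArgs
    case args of
        -- Exactly one argument: treat it as the path of the data file.
        [path] -> B.readFile path >>= print . U.sum . parseSpaced
        _      -> putStrLn "usage: FileToVector FILE"
```

Like the original, this sketch stops silently at the first token that is not an unsigned decimal, so malformed input yields a shorter vector rather than an error.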

### Compilation

While the parser above can be run in the interactive environment,<br>
a substantial speedup is gained by compiling with the `-Odph` flag.<br>
Naming the above code `FileToVector.hs`, I compile it via<br>
`ghc -Odph --make FileToVector.hs`<br>
and run it via `time ./FileToVector`.<p>
Comparison with `ghci` gives:
* uncompiled (`ghci`): 0.75 secs, 686,023,208 bytes
* compiled: real 0m0.072s
