In [1]:
{-# LANGUAGE DataKinds, FlexibleContexts, QuasiQuotes, TemplateHaskell #-}

In [2]:
import Frames
import Graphics.Rendering.Chart
import Data.Foldable

The goal is to add a new type to the "universe of types" that Frames considers when inferring column types. In this case, we want a column to be inferred as a Gender column when its values are the strings "male" and "female".
The code is adapted from https://github.com/acowley/Frames/blob/master/demo/TutorialUsers.hs

In [16]:
{-# LANGUAGE DataKinds, DeriveDataTypeable, TypeFamilies, TypeOperators, OverloadedStrings #-}
module TitanicTypes where
import Control.Monad (mzero)
import qualified Data.Char as C
import Data.Readable (Readable(fromText))
import qualified Data.Text as T
import Data.Typeable
import qualified Data.Vector as V
import Frames.InCore (VectorFor)
import Frames

data GenderT = Male | Female deriving (Enum,Eq,Ord,Show,Typeable)

type instance VectorFor GenderT = V.Vector

instance Readable GenderT where
  fromText "male" = return Male
  fromText "female" = return Female
  fromText _ = mzero

type MyColumns = GenderT ': CommonColumns
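
To see the parsing idea in isolation, here is a minimal sketch that uses only `base`. It swaps `Text` and `MonadPlus` for plain `String` and `Maybe`, and it lowercases the input before matching, which the instance above does not do. The names `parseGender` and `allGender` are hypothetical helpers for illustration and are not part of Frames; the "every sampled cell must parse" rule is an assumption about how inference behaves, not a description of the library's internals.

```hs
import Data.Char (toLower)
import Data.Maybe (isJust)

data GenderT = Male | Female deriving (Enum, Eq, Ord, Show)

-- Hypothetical stand-in for the Readable instance: Nothing plays the role of mzero.
parseGender :: String -> Maybe GenderT
parseGender s = case map toLower s of
  "male"   -> Just Male
  "female" -> Just Female
  _        -> Nothing

-- Assumption: a column qualifies as GenderT only if every sampled cell parses.
allGender :: [String] -> Bool
allGender = all (isJust . parseGender)
```

With this sketch, a sample such as `["male", "female"]` qualifies as a gender column, while a sample containing any other string does not.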

Following along with the Frames tutorial, we apparently need the following Template Haskell splice to generate row types that use the extended universe:

In [20]:
import Control.Applicative
import qualified Control.Foldl as L
import qualified Data.Foldable as F
import Data.Proxy (Proxy(..))
import Frames
import Frames.CSV (colQ, readTableOpt, rowGen, RowGen(..))
import qualified Pipes.Prelude as P
import TitanicTypes


tableTypes' rowGen { rowTypeName = "U2"
                   , tablePrefix = "u2"
                   , columnUniverse = $(colQ ''MyColumns) }
            "data/train.csv"

We can use the `tableTypes` function from the `Frames` library to generate a `Row` type for our CSV file. After that, we can write a loader function that reads the file into a `Frame`.

In [3]:
tableTypes "Row" "data/train.csv"

loadRows :: IO (Frame Row)
loadRows = inCoreAoS $ readTable "data/train.csv"

We can inspect the **row** type:

In [4]:
:i Row

It gives us the following:
```hs
type Row =
    Record
        '[ "PassengerId" :-> Int
         , "Survived"    :-> Bool
         , "Pclass"      :-> Int
         , "Name"        :-> Text
         , "Sex"         :-> Text
         , "Age"         :-> Int
         , "SibSp"       :-> Int
         , "Parch"       :-> Int
         , "Ticket"      :-> Int
         , "Fare"        :-> Double
         , "Cabin"       :-> Text
         , "Embarked"    :-> Text
         ]
```

In [5]:
rows <- loadRows

In [6]:
rec1 = head $ toList rows
rec1

{PassengerId :-> 4, Survived :-> True, Pclass :-> 1, Name :-> "Futrelle, Mrs. Jacques Heath (Lily May Peel)", Sex :-> "female", Age :-> 35, SibSp :-> 1, Parch :-> 0, Ticket :-> 113803, Fare :-> 53.1, Cabin :-> "C123", Embarked :-> "S"}

This output is hard to read, as it is just the plain-text `Show` representation of the record. We could try to render it as an HTML table instead; `showFields`, used below, gives us each field as a string to build on.
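
One way such a table could be assembled is sketched here. This is an illustration under assumptions rather than the notebook's actual rendering code: `escapeHtml` and `renderRowTable` are hypothetical helpers, the column names are passed in by hand, and the fields are assumed to arrive as a plain list of strings.

```hs
-- Escape the characters that are special in HTML text content.
escapeHtml :: String -> String
escapeHtml = concatMap esc
  where
    esc '<' = "&lt;"
    esc '>' = "&gt;"
    esc '&' = "&amp;"
    esc '"' = "&quot;"
    esc c   = [c]

-- Hypothetical helper: build a one-row HTML table from column names
-- and already-rendered field strings.
renderRowTable :: [String] -> [String] -> String
renderRowTable headers fields =
  concat
    [ "<table>"
    , "<tr>", concatMap (cell "th") headers, "</tr>"
    , "<tr>", concatMap (cell "td") fields, "</tr>"
    , "</table>"
    ]
  where
    cell tag s = "<" ++ tag ++ ">" ++ escapeHtml s ++ "</" ++ tag ++ ">"
```

The `fields` argument is meant to be the list returned by `showFields rec1`. In IHaskell, wrapping the resulting string with `html` from `IHaskell.Display` should display it as a table, though that step is an assumption and is not part of the original notebook.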

In [7]:
import Graphics.Rendering.Chart.Easy
import IHaskell.Display
import Data.Vinyl.Core
import Data.Vinyl



mapM_ print $ showFields rec1

"4"
"True"
"1"
"\"Futrelle, Mrs. Jacques Heath (Lily May Peel)\""
"\"female\""
"35"
"1"
"0"
"113803"
"53.1"
"\"C123\""
"\"S\""