This tutorial uses the Titanic dataset, which is available on [OpenML](https://www.openml.org/search?type=data&sort=runs&id=40945&status=active). Place it in the same directory as this Jupyter Notebook.

The first step is to import our data:

In [28]:
import tablebase

Titanic_Data = tablebase.CsvTable("Titanic.csv")

In line one we imported tablebase. Next we made an object called **Titanic_Data** that contains our imported CSV file (*Titanic.csv*). Now, let's print out the data by using the **display** method:

In [29]:
Titanic_Data.display()

     id   pclass   survived                                                                                name      sex      age   sibsp   parch               ticket       fare             cabin   embarked      boat   body                                         home.dest
      1        1          1                                                        Allen,Miss. Elisabeth Walton   female       29       0       0                24160   211.3375                B5          S         2      ?                                       St Louis,MO
      2        1          1                                                       Allison,Master. Hudson Trevor     male   0.9167       1       2               113781     151.55           C22 C26          S        11      ?                     Montreal,PQ / Chesterville,ON
      3        1          0                                                         Allison,Miss. Helen Loraine   female        2       1       2               113781     151.55 

As you might have guessed, the **fare** is in 1912 British pounds. We can convert it to USD by multiplying the **fare** column by 4.87. We can do this with the **expand** method.

In [30]:
Titanic_Data.expand("fare", "0 if @fare@ == '?' else float(@fare@) * 4.87")
Titanic_Data.display()

     id   pclass   survived                                                                                name      sex      age   sibsp   parch               ticket                 fare             cabin   embarked      boat   body                                         home.dest
      1        1          1                                                        Allen,Miss. Elisabeth Walton   female       29       0       0                24160          1029.213625                B5          S         2      ?                                       St Louis,MO
      2        1          1                                                       Allison,Master. Hudson Trevor     male   0.9167       1       2               113781    738.0485000000001           C22 C26          S        11      ?                     Montreal,PQ / Chesterville,ON
      3        1          0                                                         Allison,Miss. Helen Loraine   female        2       1       2   

As you can see, the code above replaces the **fare** column with the old **fare** multiplied by 4.87. We had to include the conditional because some fares are recorded as `?`, which means unknown; the expression sets those to 0.
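The expression passed to **expand** reads like an ordinary Python conditional evaluated for each row, with `@fare@` standing in for that row's value. As a rough illustration only (this is plain Python, not tablebase code, and `convert_fare` and `POUNDS_TO_USD` are hypothetical names), the per-row logic could be sketched as:

```python
# Hypothetical stand-alone version of the per-row fare conversion.
# This does not use tablebase; it only mirrors the expression's logic.
POUNDS_TO_USD = 4.87  # conversion factor used in this tutorial

def convert_fare(raw):
    """Return the fare in USD, or 0 when the fare is unknown ('?')."""
    if raw == "?":
        return 0
    return float(raw) * POUNDS_TO_USD

# Fares are handled as strings, as the float() call in the expression suggests.
fares = ["211.3375", "151.55", "?"]
converted = [convert_fare(f) for f in fares]
print(converted)
```

Long decimals such as `738.0485000000001` in the output above are ordinary floating-point rounding; wrapping the result in `round(..., 2)` would trim them. Keep in mind that mapping `?` to 0 means unknown fares count as zero in any later average.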

Now that the data is in the form we want, we can create filters to split it by gender:

In [31]:
Male_Data = Titanic_Data.filter("@sex@ == 'male'")
Female_Data = Titanic_Data.filter("@sex@ == 'female'")

First we made an object with only male passengers by filtering for records where the column **sex** is **male**. We then did the same thing for the female passengers by filtering for records where the column **sex** is **female**. Now let's calculate the mean survival rate of both male and female passengers. First we need to make a simple mean function:

In [32]:
def mean(data: list):
    return sum(float(item) for item in data) / len(data)
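Python's standard library offers the same calculation. As an optional alternative, assuming the column values arrive as numeric strings such as `'0'` and `'1'` (which the `float(item)` call above suggests), a sketch using `statistics.fmean` could look like this; `mean_of_strings` is a hypothetical name:

```python
from statistics import fmean

def mean_of_strings(data):
    """Average a list of numeric strings; return None for an empty list."""
    values = [float(item) for item in data]
    if not values:
        return None  # fmean raises an error on empty input, so guard first
    return fmean(values)

print(mean_of_strings(["1", "0", "0", "1"]))  # 0.5
```

The empty-list guard matters when a filter matches no rows; the simple version above would raise a `ZeroDivisionError` in that case.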

We can now find the mean for both men and women and print the results to the console.

In [33]:
print("Average survival for men:", mean(Male_Data.get_col("survived")))
print("Average survival for women:", mean(Female_Data.get_col("survived")))

Average survival for men: 0.19098457888493475
Average survival for women: 0.7274678111587983
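Because **survived** holds 1 for passengers who survived and 0 for those who did not, its mean is the fraction of the group that survived. To make the filter-then-average pattern concrete without tablebase, here is a minimal sketch using the standard `csv` module. The rows in `SAMPLE` are made up for illustration and are not real Titanic records, and `survival_rate` is a hypothetical helper:

```python
import csv
import io

# Tiny made-up sample using two of the dataset's column names.
# These are NOT real Titanic rows.
SAMPLE = """sex,survived
male,0
male,1
female,1
female,1
male,0
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))

def survival_rate(rows, sex):
    """Fraction of passengers of the given sex whose 'survived' value is 1."""
    group = [int(r["survived"]) for r in rows if r["sex"] == sex]
    return sum(group) / len(group)

print("men:", survival_rate(rows, "male"))
print("women:", survival_rate(rows, "female"))
```

In this toy sample one of three men and both women survived, so the two rates are one third and 1.0.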


What if we want the survival rate of women in first or second class? We can filter by the column **pclass**:

In [34]:
print("Average survival for women in a class less than 3:", mean(Female_Data.filter("int(@pclass@) < 3").get_col("survived")))

Average survival for women in a class less than 3: 0.932
