## Setting up the environment

In [None]:
ENV["OS_AUTH_URL"]="https://keystone-yeg.cloud.cybera.ca:5000/v2.0"
ENV["OS_TENANT_NAME"]="julia_workshop"
ENV["OS_PROJECT_NAME"]="julia_workshop"
ENV["OS_USERNAME"]=""
ENV["OS_PASSWORD"]=""

include(joinpath("..", "src", "lib", "Config.jl"))

## Loading Modules

In [None]:
using FreqTables
using PlotlyJS
using MultivariateStats

## Fetching the dataset

In [None]:
titanic_clean = Dataset.fetch(:titanic_clean)

## Preparing data for visualization 

In [None]:
male = titanic_clean[titanic_clean[:Sex].=="male",:]
female = titanic_clean[titanic_clean[:Sex].=="female",:]

In [None]:
sex_ftable = freqtable(titanic_clean, :Sex)
survived_ftable = freqtable(titanic_clean, :Survived)
female_ftable = freqtable(female, :Survived)
male_ftable = freqtable(male, :Survived)
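
Each of these frequency tables is just a count of rows per distinct value. As a rough sketch of that idea in plain Julia (`count_values` is a hypothetical helper written for illustration, and the sample vector is made up rather than taken from the dataset):

```julia
# Count how often each distinct value occurs (what a one-way frequency table holds).
function count_values(xs)
    counts = Dict{eltype(xs), Int}()
    for x in xs
        counts[x] = get(counts, x, 0) + 1
    end
    return counts
end

sex = ["male", "female", "female", "male", "male"]   # made-up sample
sex_counts = count_values(sex)
println(sex_counts["male"], " male, ", sex_counts["female"], " female")
```

`freqtable` returns a richer named array, but the numbers passed to the pie charts later are counts of this kind.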

### Why use a JavaScript charting library?
- Easy to integrate with any application. It separates the visualization from
  the rest of the data wrangling and statistics code
- Dynamic and interactive
- Dynamic and interactive

### Why Plotly?

Go see for yourselves: https://plot.ly/javascript/

- It’s open source and built on D3.js and stack.gl
- D3.js is one of the most widely used charting libraries
- It is based on a declarative JSON schema
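
To make the "declarative JSON" point concrete: a Plotly figure is nested data, a list of traces plus a layout, with no drawing commands. The sketch below builds such a spec from plain Julia dictionaries and serializes it with a tiny hand-written helper. `to_json` is a hypothetical name used only for illustration (it does not escape special characters; a real JSON package would), and the counts are placeholders, not values from the dataset.

```julia
# Minimal JSON serializer for the handful of types used in this sketch.
to_json(x::AbstractString) = "\"" * x * "\""
to_json(x::Real) = string(x)
to_json(x::AbstractVector) = "[" * join(map(to_json, x), ",") * "]"
function to_json(d::AbstractDict)
    # Sort keys so the output is deterministic.
    entries = [to_json(string(k)) * ":" * to_json(d[k]) for k in sort(collect(keys(d)))]
    return "{" * join(entries, ",") * "}"
end

# A pie trace described declaratively: only data and options.
trace = Dict("type" => "pie",
             "labels" => ["Male", "Female"],
             "values" => [60, 40])   # placeholder counts
layout = Dict("height" => 400)
figure = Dict("data" => [trace], "layout" => layout)

println(to_json(figure))
```

The `PlotlyJS` trace and `Layout` constructors used in the rest of this section build this kind of structure for you.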

## Let's hit it 

In section 02 we used `StatsPlots` to create some quick visualizations and now we shall try to replicate them using `PlotlyJS`.

In [None]:
trace1 = PlotlyJS.pie(;values=[sex_ftable["male"],sex_ftable["female"]],labels=["Male","Female"])
PlotlyJS.plot([trace1], Layout(height=400))

Whoa!! Doesn't it look wonderful? Now look at the top right corner of the plot: you'll see options to save the plot as a PNG or to save and edit the plot in the cloud.

Hover over the pie chart to view each slice's value, percentage and label. Do you want to show less? Try modifying the pie chart to show only the percentage and label.

Hint: what is `hoverinfo`?
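
If you get stuck: `hoverinfo` is a trace attribute whose value is a list of flags joined with `+`. For pie traces the flags include `label`, `value` and `percent`. The sketch below shows one possible answer as a plain dictionary, so it runs without any plotting package. The counts are placeholders standing in for the frequency table values.

```julia
# Flags to show on hover; Plotly expects them joined with "+".
hover_flags = ["label", "percent"]
hoverinfo = join(hover_flags, "+")

# Placeholder counts standing in for sex_ftable["male"] and sex_ftable["female"].
counts = [60, 40]
percent = round.(100 .* counts ./ sum(counts); digits=1)

pie_spec = Dict("type" => "pie",
                "labels" => ["Male", "Female"],
                "values" => counts,
                "hoverinfo" => hoverinfo)

println(pie_spec["hoverinfo"])
println(percent)
```

With `PlotlyJS`, the same attribute should be passable as a keyword to the trace constructor, for example `hoverinfo="label+percent"` inside `PlotlyJS.pie(...)`.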

In [None]:
trace1 = PlotlyJS.pie(;values=[survived_ftable[0],survived_ftable[1]],labels=["Dead","Survived"])
PlotlyJS.plot([trace1], Layout(height=400))

In [None]:
titanic_clean = delete!(titanic_clean,[1,4,9,11])

In [None]:
trace1 = PlotlyJS.box(;y=titanic_clean[:Age],x=titanic_clean[:Sex])
layout = Layout(;yaxis=attr(title="Age"),title="Age Distribution by Gender")
PlotlyJS.plot([trace1], layout)

The centre line in the box plot represents the median value, but what if we wanted to show the mean instead? 

Hint: What is `boxmean`?
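
Why would the mean be worth showing? On skewed data the two statistics can sit quite far apart. A small sketch with a made-up sample (not the Titanic ages), using Julia's `Statistics` standard library:

```julia
using Statistics

ages = [2.0, 19.0, 22.0, 24.0, 28.0, 30.0, 71.0]   # made-up sample

println("median = ", median(ages))   # the centre line of the box
println("mean   = ", mean(ages))     # pulled upwards by the 71.0 outlier
```

In Plotly, setting the box trace attribute `boxmean` to `true` draws the mean as an additional dashed line inside the box.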

In [None]:
trace1 = PlotlyJS.histogram(;x=titanic_clean[:Age])
layout = Layout(;yaxis=attr(title="Frequency of Bucket"), xaxis=attr(title="Distribution of Age"),title="Distribution of Passenger Ages on Titanic")
PlotlyJS.plot([trace1], layout)
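
The "Frequency of Bucket" on the y-axis is the number of passengers whose age falls into each bin. A minimal sketch of that binning step in plain Julia, assuming equal-width bins (`bin_counts` is a hypothetical helper, and the ages are made up):

```julia
# Count values into equal-width bins [lo, lo+width), [lo+width, lo+2*width), ...
function bin_counts(xs, lo, width, nbins)
    counts = zeros(Int, nbins)
    for x in xs
        i = floor(Int, (x - lo) / width) + 1
        if 1 <= i <= nbins
            counts[i] += 1
        end
    end
    return counts
end

ages = [4, 18, 22, 25, 29, 31, 38, 45, 62]   # made-up sample
println(bin_counts(ages, 0, 20, 4))          # bins: 0-19, 20-39, 40-59, 60-79
```

Plotly chooses the bins automatically in the cell above. Histogram attributes such as `nbinsx` and `xbins` let you influence that choice.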


In [None]:
@time titanic_array_survived = array(titanic_clean[:Survived])
@time titanic_array = array(titanic_clean[:,[:Age,:Fare]])

In [None]:
using DataFrames, Colors

# One scatter trace per distinct value of Survived
nms = unique(titanic_clean[:Survived])
colors = [RGB(0.89, 0.1, 0.1), RGB(0.21, 0.50, 0.72), RGB(0.28, 0.68, 0.3)]
data = GenericTrace[]
for (i, nm) in enumerate(nms)
    df = titanic_clean[titanic_clean[:Survived] .== nm, :]
    # df = df[1:100,:]
    x = df[:Age]
    y = df[:Fare]
    trace = PlotlyJS.scatter(;name=nm, mode="markers",
                             marker_size=5, marker_color=colors[i], marker_line_width=0,
                             x=x, y=y)
    push!(data, trace)
end

layout = Layout(autosize=true, title="Scatter Plot - Age Vs Fare",
                xaxis=attr(showbackground=true,
                           backgroundcolor="rgb(230, 230,230)",
                           title="Age"),
                yaxis=attr(showbackground=true,
                           backgroundcolor="rgb(230, 230,230)",
                           title="Fare Price"))
PlotlyJS.plot(data, layout)

In [None]:
function clustering_alpha_shapes()
    @eval using DataFrames, Colors
    # load data
    nms = unique(titanic_clean[:Survived])
    colors = [RGB(0.89, 0.1, 0.1), RGB(0.21, 0.50, 0.72), RGB(0.28, 0.68, 0.3)]
    data = GenericTrace[]
    for (i, nm) in enumerate(nms)
        df = titanic_clean[titanic_clean[:Survived] .== nm, :]
        x=df[:Age]
        y=log(df[:Fare])
        z=df[:Pclass]
        trace = PlotlyJS.scatter3d(;name=nm, mode="markers",
                           marker_size=3, marker_color=colors[i], marker_line_width=0,
                           x=x, y=y, z=z)
        push!(data, trace)
    end
    # notice the nested attrs to create complex JSON objects
    layout = Layout(width=800, height=550, autosize=false, title="3D Scatter plot",
                    scene=attr(xaxis=attr(showbackground=true,
                                          backgroundcolor="rgb(230, 230,230)",
                                          title="Age"),
                               yaxis=attr(showbackground=true,
                                          backgroundcolor="rgb(230, 230,230)",
                                          title="Log of Fare Price"),
                               zaxis=attr(showbackground=true,
                                          backgroundcolor="rgb(230, 230,230)",
                                          title="Class")))
    PlotlyJS.plot(data, layout)
end
clustering_alpha_shapes()
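
The 3D plot puts the logarithm of the fare on the y-axis, which spreads out the many cheap tickets that would otherwise be squashed against the few very expensive ones. Two details are worth knowing if you adapt this cell. On current Julia versions the transform has to be broadcast with a dot (`log.(x)`). A fare of zero also maps to `-Inf`, which cannot be placed on the axis. The sketch below uses made-up fares and shows `log1p` as one possible workaround. That is an assumption for illustration, not what the notebook does.

```julia
fares = [0.0, 7.25, 71.28, 512.33]   # made-up sample, including a zero fare

log_fares   = log.(fares)     # log(0.0) is -Inf
log1p_fares = log1p.(fares)   # log(1 + x), finite at zero

println(log_fares)
println(log1p_fares)
```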

In [None]:
Dataset.list()

In [None]:
test_array = Dataset.fetch(:titanic_test_predictions)

In [None]:
#Plotting test set predictions

function clustering_alpha_shapes()
    @eval using DataFrames, Colors, PlotlyJS
    # load data
    nms = unique(test_array[:comp])
    colors = [RGB(0.89, 0.1, 0.1), RGB(0.21, 0.50, 0.72), RGB(0.28, 0.68, 0.3)]
    data = GenericTrace[]
    tracename = ["Correct prediction", "False negative (Should be alive)", "False positive (Should be dead)"]
    for (i, nm) in enumerate(nms)
        df = test_array[test_array[:comp] .== nm, :]
        x=df[:Age]
        y=(df[:Fare])
        z=df[:Pclass]
        # Use the descriptive label as the legend name (a keyword may only be passed once)
        trace = PlotlyJS.scatter3d(;name=tracename[i], mode="markers",
                           marker_size=3, marker_color=colors[i], marker_line_width=0,
                           x=x, y=y, z=z)
        push!(data, trace)
    end

    # The y-axis shows the raw fare here (no log transform), so label it accordingly
    layout = Layout(width=800, height=550, autosize=false, title="Titanic Survival",
                    scene=attr(xaxis=attr(showbackground=true,
                                          backgroundcolor="rgb(230, 230,230)",
                                          title="Age"),
                               yaxis=attr(showbackground=true,
                                          backgroundcolor="rgb(230, 230,230)",
                                          title="Fare Price"),
                               zaxis=attr(showbackground=true,
                                          backgroundcolor="rgb(230, 230,230)",
                                          title="Class")))
    PlotlyJS.plot(data, layout)
end
clustering_alpha_shapes()
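
One thing to watch in the cell above: the legend label is picked by position (`tracename[i]`), and `unique` returns values in the order they first appear in `comp`. If the data is ordered differently, the labels can end up attached to the wrong group. Looking the label up by value avoids that. The sketch below assumes `comp` encodes 0 = correct, 1 = false negative, -1 = false positive. That encoding is a guess made for illustration and should be checked against the actual dataset.

```julia
# Hypothetical encoding of the comparison column; verify against the real data.
labels = Dict(0  => "Correct prediction",
              1  => "False negative (Should be alive)",
              -1 => "False positive (Should be dead)")

comp = [1, 0, 0, -1, 0, 1]   # made-up sample; note that 1 appears first

for value in unique(comp)
    n = count(x -> x == value, comp)
    println(labels[value], ": ", n, " rows")
end
```

Here `unique(comp)` starts with `1`, so a positional lookup would have labelled the false negatives as "Correct prediction".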

In [None]:
#Testing multiple subplots

using Colors

# Draw the same points in two side-by-side 3D scenes
nms = unique(test_array[:comp])
colors = [RGB(0.89, 0.1, 0.1), RGB(0.21, 0.50, 0.72), RGB(0.28, 0.68, 0.3)]
data = GenericTrace[]
tracename = ["Correct prediction", "False negative (Should be alive)", "False positive (Should be dead)"]
for (i, nm) in enumerate(nms)
    df = test_array[test_array[:comp] .== nm, :]
    df = df[1:10,:]   # keep the first 10 rows of each group to make the test quick
    x = df[:Age]
    y = df[:Fare]
    z = df[:Pclass]

    # Left-hand subplot (a keyword such as `name` may only be passed once)
    trace = PlotlyJS.scatter3d(;name=tracename[i], mode="markers",
                               marker_size=3, marker_color=colors[i], marker_line_width=0,
                               x=x, y=y, z=z, scene="scene1")
    push!(data, trace)

    # Right-hand subplot, same data
    trace2 = PlotlyJS.scatter3d(;name=tracename[i], mode="markers",
                                marker_size=3, marker_color=colors[i], marker_line_width=0,
                                x=x, y=y, z=z, scene="scene2")
    push!(data, trace2)
end

layout = Layout(width=800, height=550, autosize=false, title="Titanic Survival",
                scene1=attr(domain=attr(x=[0.0,0.5], y=[0.5,1.0]), xaxis=attr(title="Age")),
                scene2=attr(domain=attr(x=[0.5,1], y=[0.5,1.0])))

#    layout = Layout(width=800, height=550, autosize=false, title="Titanic Survival",
 #                   scene=attr(xaxis=attr(title = "Age"), 
 #                              yaxis=attr(showbackground=true, backgroundcolor="rgb(230, 230,230)", title = "Log of Fare Price"),
  #                             zaxis=attr(showbackground=true, backgroundcolor="rgb(230, 230,230)", title = "Class"),       
   #                             xaxis2=attr(showbackground=true, backgroundcolor="rgb(230, 230,230)", title = "Class", domain=[0.55,1]),
    #                            yaxis2=attr(showbackground=true, backgroundcolor="rgb(230, 230,230)", title = "Class", anchor="x2"),
     #                           zaxis2=attr(showbackground=true, backgroundcolor="rgb(230, 230,230)", title = "Class", anchor="x2")
      #                          )
       #             )
    PlotlyJS.plot(data, layout)
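
The `domain` entries in the layout are what place each scene: `x` and `y` give the fraction of the figure's width and height that the scene occupies, from 0 to 1. If you want more than two columns, computing those ranges by hand gets tedious. Here is a minimal sketch of a helper that does it (`column_domains` is a hypothetical name, not part of `PlotlyJS`):

```julia
# Split the horizontal range [0, 1] into n equal columns with a small gap between them.
function column_domains(n; gap=0.05)
    width = (1 - gap * (n - 1)) / n
    return [[(i - 1) * (width + gap), (i - 1) * (width + gap) + width] for i in 1:n]
end

println(column_domains(2; gap=0.0))   # [[0.0, 0.5], [0.5, 1.0]]
```

With `gap=0.0` and two columns this reproduces the `x` domains used for `scene1` and `scene2` above.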


## How to get data to JSON?

Set the path where you want to save the JSON-formatted JavaScript file.

In [None]:
results_js_path = joinpath(Config.Path.results,"titanic_survived.js")

Create the directory to store the results if it does not already exist.

In [None]:
if !ispath(Config.Path.results)
  mkdir(Config.Path.results)
end
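
A side note on the check above: `mkdir` fails if the parent directory is missing, and it errors if the directory already exists, which is why the cell guards it with `ispath`. Julia's `mkpath` covers both cases in one call. The sketch below demonstrates this in a throwaway temporary directory. The nested path is hypothetical and unrelated to `Config.Path.results`.

```julia
# Use a throwaway directory so the sketch does not touch real project paths.
base = mktempdir()
results_dir = joinpath(base, "results", "plots")   # hypothetical nested path

mkpath(results_dir)   # creates "results" and "plots" in one go
mkpath(results_dir)   # calling it again is a no-op, not an error

println(isdir(results_dir))
```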

To get data written to a file in JSON format we created a utility function in Julia:

http://juliabox.cloud.cybera.ca/edit/titanic-julia/src/lib/DataFrameUtil.jl

In [None]:
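# Pass the data frame loaded earlier in this notebook (no variable named `titanic` is defined here)
write_js(results_js_path, titanic_clean, [:Survived], append=true)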
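
The source of `write_js` is not shown in this notebook, so the following is only a guess at the general idea: write each selected column to a `.js` file as a JavaScript variable holding an array, so that a web page can load the data with a `<script>` tag. `write_js_sketch` is a hypothetical stand-in with an invented signature, it handles numeric columns only, and the values are made up.

```julia
# Hypothetical stand-in for write_js: writes `var <name> = [...];` per column.
function write_js_sketch(path, columns::AbstractDict; append=false)
    open(path, append ? "a" : "w") do io
        for name in sort(collect(keys(columns)))
            items = join(string.(columns[name]), ",")
            println(io, "var ", name, " = [", items, "];")
        end
    end
end

path = joinpath(mktempdir(), "titanic_survived.js")
write_js_sketch(path, Dict("Survived" => [0, 1, 1, 0]))   # made-up values
print(read(path, String))
```

Opening the file in append mode, as the `append=true` keyword in the real call suggests, would let several calls add variables to the same file.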

## Exercises: 

Create another pie chart with some of the features you're interested in.

In [None]:
trace1 = PlotlyJS.pie(;values=[survived_ftable[0],survived_ftable[1]],labels=["Dead","Survived"])
PlotlyJS.plot([trace1], Layout(height=400))