hyperplan-io · asauray · Mar 28, 2019 · Mar 20, 2019 · Mar 21, 2019 · Mar 21, 2019
diff --git a/README.md b/README.md
@@ -2,6 +2,8 @@
 Foundaml is a service that enables machine learning predictions to be stored, predicted and associated to their labels.
 Data is the core problem of machine learning. Foundaml helps you to manage your machine learning pipeline and to develop successful machine learning projects.
 
+[Getting started](https://foundaml.github.io/server/)
+
 ## Predict
 Foundaml does not execute algorithms on its own. It needs to be paired with other software, such as TensorFlow Serving, to be able to generate predictions.
 

diff --git a/build.sbt b/build.sbt
@@ -48,14 +48,14 @@ lazy val root = (project in file("."))
     micrositeName := "FoundaML",
     micrositeDescription := "Pipeline for machine learning algorithms",
     micrositeAuthor := "FoundaML contributors",
-    micrositeOrganizationHomepage := "https://github.com/antoinesauray/foundaml-server",
-    micrositeGitterChannelUrl := "antoinesauray/foundaml-server",
-    micrositeGithubOwner := "antoinesauray",
-    micrositeGithubRepo := "foundaml-server",
+    micrositeOrganizationHomepage := "https://github.com/foundaml/server",
+    micrositeGitterChannelUrl := "foundaml/server",
+    micrositeGithubOwner := "foundaml",
+    micrositeGithubRepo := "server",
     micrositeFavicons := Seq(
       microsites.MicrositeFavicon("favicon.png", "512x512")
     ),
-    micrositeUrl := "https://antoinesauray.github.io",
-    micrositeBaseUrl := "/foundaml-server"
+    micrositeUrl := "https://foundaml.github.io",
+    micrositeBaseUrl := "/server"
   )
   .enablePlugins(MicrositesPlugin)
diff --git a/src/main/resources/microsite/img/Foundaml.png b/src/main/resources/microsite/img/Foundaml.png
diff --git a/src/main/tut/getting_started.md b/src/main/tut/getting_started.md
@@ -7,6 +7,48 @@ title:  "Getting Started"
 
 # Getting Started
 
+FoundaML will help you industrialize your machine learning projects. It sits between your clients (web apps, mobile apps, etc.) and your algorithms (heuristics or machine learning models).
 
-# Not ready yet
+![hello](img/Foundaml.png)
 
+FoundaML has four key concepts.
+
+### Projects
+A project is a set of algorithms working on the same data. 
+The principle here is that you can compare and switch algorithms only if they operate on the same data.
+
+When you use FoundaML, you begin by defining your project with the data that it will work on and the objective it will pursue. For the moment, FoundaML supports the following type of problem:
+
+* Classification
+
+FoundaML supports a set of generic feature types that you can combine to describe the input of any algorithm. These include:
+
+* Double or Float
+* Integer
+* String
+
+### Algorithms
+
+An algorithm is code, running on another instance, that computes values from data. Algorithms in the same project work on the same data. If necessary, they can reprocess it in a preprocessing pipeline ([an example with TensorFlow Transform](https://github.com/tensorflow/transform)). This can be useful if you need to normalize your data or try different word embeddings on your NLP problem.
+
+
+FoundaML can support various backends. Currently, it supports the following API:
+
+* TensorFlow Serving API
+
+To perform the transformation from the project features to the algorithm input features, FoundaML needs **features transformers**. The same operation is required when converting the algorithm labels to the project labels, using **label transformers**.
+
+
+### Predictions
+
+A prediction belongs to a project and an algorithm. The algorithm that is executed depends on the project policy (unless the client specifies one explicitly). You can choose between the following policies:
+
+* **No Algorithm** (by default, a project will deny predictions until an algorithm is created)
+* **DefaultAlgorithm** means executing the same algorithm all the time
+* **RoundRobin** allows you to specify a weight for each algorithm that you created in your project. This is helpful for A/B testing.
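+
+To give an intuition for weighted routing, here is a minimal Python sketch. It assumes a weighted random choice, which may not be how FoundaML implements **RoundRobin**. The `pick_algorithm` helper, the algorithm ids and the weights are all hypothetical.
+
+```python
+import random
+
+
+def pick_algorithm(weights, rng=random):
+    """Pick an algorithm id with probability proportional to its weight."""
+    ids = list(weights)
+    return rng.choices(ids, weights=[weights[i] for i in ids], k=1)[0]
+
+
+# Hypothetical A/B split: roughly 80% of predictions go to "algo-a".
+weights = {"algo-a": 0.8, "algo-b": 0.2}
+chosen = pick_algorithm(weights, random.Random(0))
+```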
+
+
+### Examples
+Each prediction comes with a set of URLs that allow you to tag it as correct or incorrect. This will help you generate a labeled dataset as well as evaluate your algorithm in real time.
+
+Let's now move on to the [Titanic Example](https://foundaml.github.io/server/the_titanic.html)
diff --git a/src/main/tut/the_titanic.md b/src/main/tut/the_titanic.md
@@ -0,0 +1,270 @@
+---
+layout: page
+position: 3
+section: home
+title:  "Example: The Titanic"
+---
+
+# The Titanic (Work in progress)
+The Titanic challenge is a popular Kaggle contest. It serves as a good example of how to use FoundaML to solve a classification problem.
+
+# Creating the project
+To solve this problem, we need to list the data on which our algorithms will operate. The dataset has the following columns.
+
+* PassengerId
+* Survived
+* Pclass
+* Name
+* Sex
+* Age
+* SibSp
+* Parch
+* Ticket
+* Fare
+* Cabin
+* Embarked
+
+The algorithm should predict whether or not the person survived, so `Survived` is the label rather than an input feature. This is a classification problem. We can now create the project with a curl request.
+
+```
+curl -X POST \
+  http://localhost:8080/projects/ \
+  -H 'Content-Type: application/json' \
+  -H 'cache-control: no-cache' \
+  -d '{
+	"id": "kaggle-titanic",
+    "name": "Kaggle Titanic",
+    "configuration": {
+        "problem": {
+            "class": "Classification"
+        },
+        "features": {
+        	"featuresClasses": [
+        		{
+        		  "name": "passengerId",
+        		  "featureClass": "IntFeature",
+        		  "description": "The unique identifier of the passenger"
+        		},
+        		{
+        		  "name": "pClass",
+        		  "featureClass": "IntFeature",
+        		  "description": "Class of travel"
+        		},
+        		{
+        		  "name": "name",
+        		  "featureClass": "StringFeature",
+        		  "description": "Name of passenger"
+        		},
+        		{
+        	      "name": "sex",
+        		  "featureClass": "StringFeature",
+        		  "description": "Gender"
+        		},
+        		{
+        		  "name": "age",
+        		  "featureClass": "IntFeature",
+        		  "description": "Age"
+        		},
+        		{
+        		  "name": "sibSp",
+        		  "featureClass": "IntFeature",
+        		  "description": "Number of Sibling/Spouse aboard"
+        		},
+        		{
+        		  "name": "pArch",
+        		  "featureClass": "IntFeature",
+        		  "description": "Number of Parent/Child aboard"
+        		},
+        		{
+        		  "name": "ticket",
+        		  "featureClass": "StringFeature",
+        		  "description": "The ticket identifier"
+        		},
+        		{
+        		  "name": "fare",
+        		  "featureClass": "StringFeature",
+        		  "description": "Which fare"
+        		},
+        		{
+        		  "name": "cabin",
+        		  "featureClass": "StringFeature",
+        		  "description": "Which cabin"
+        		},
+        		{
+        		  "name": "embarked",
+        		  "featureClass": "StringFeature",
+        		  "description": "The port in which a passenger has embarked. C - Cherbourg, S - Southampton, Q = Queenstown"
+        		}
+        	]
+        },
+        "labels": [
+          "survived",
+          "notSurvived"
+        ]
+    }
+}'
+```
+
+# Our first algorithm, a simple heuristic
+Our first algorithm will be this [simple heuristic](https://github.com/foundaml/titanic-heuristic).
+It does not really matter at this point what we compute.
+
+We will use the [TensorFlow Serving API](https://www.tensorflow.org/tfx/serving/api_rest) for our algorithm. I suggest you read about it before you continue this example.
+
+### Features transformation
+This JSON object represents the mapping between our project features and the TensorFlow Serving API.
+
+```
+"featuresTransformer": {
+	"signatureName": "",
+	"fields": [
+		"passenger_id",
+		"p_class",
+		"name",
+		"sex",
+		"age",
+		"sib_sp",
+		"p_arch",
+		"ticket",
+		"fare",
+		"cabin",
+		"embarked"
+	]
+}
+```
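+
+As a rough illustration, the Python sketch below pairs each entry of `fields` with the project feature at the same position. Positional pairing is an assumption made for this example, and `to_named_inputs` is a hypothetical helper, not part of FoundaML.
+
+```python
+fields = [
+    "passenger_id", "p_class", "name", "sex", "age", "sib_sp",
+    "p_arch", "ticket", "fare", "cabin", "embarked",
+]
+
+# Feature values of the first passenger, in project order.
+data = [1, 3, "Braund Mr. Owen Harris", "male", 22, 1, 0,
+        "A/5 21171", "7.25", "", "S"]
+
+
+def to_named_inputs(fields, data):
+    """Pair each algorithm input name with the value at the same position."""
+    if len(fields) != len(data):
+        raise ValueError("fields and data must have the same length")
+    return dict(zip(fields, data))
+
+
+named_inputs = to_named_inputs(fields, data)
+```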
+### Labels transformation
+It is possible that the output of the algorithm does not exactly match the output of our project. We can define a transformation that maps one to the other. 
+```
+"labelsTransformer": {
+    "fields": {
+	    "survived": "survived",
+	    "did_not_survived": "notSurvived"
+	}
+}
+```
+The keys of the `fields` object are the outputs of the algorithm. Each value is the project label that the output is mapped to.
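+
+The mapping can be pictured with this short Python sketch. The shape of the algorithm output (a dictionary from output name to probability) is an assumption, and `to_project_labels` is a hypothetical helper.
+
+```python
+labels_transformer_fields = {
+    "survived": "survived",
+    "did_not_survived": "notSurvived",
+}
+
+
+def to_project_labels(algorithm_output, fields):
+    """Rename algorithm outputs to project labels; unknown outputs raise KeyError."""
+    return {fields[name]: value for name, value in algorithm_output.items()}
+
+
+# Hypothetical algorithm output for one passenger.
+algorithm_output = {"survived": 0.0, "did_not_survived": 1.0}
+project_labels = to_project_labels(algorithm_output, labels_transformer_fields)
+```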
+
+## Adding our algorithm to the project
+We can summarize the information above with this HTTP request.
+
+```
+curl -X POST \
+  http://localhost:8080/algorithms/ \
+  -H 'Content-Type: application/json' \
+  -H 'cache-control: no-cache' \
+  -d '
+  {
+	"id": "tf-kaggle-titanic-1",
+	"projectId": "kaggle-titanic",
+	"backend": {
+	  "class": "TensorFlowBackend",
+	  "host": "127.0.0.1",
+	  "port": 3000,
+	  "featuresTransformer": {
+	    "signatureName": "",
+	    "fields": [
+	  	  "passenger_id",
+		  "p_class",
+		  "name",
+		  "sex",
+		  "age",
+		  "sib_sp",
+		  "p_arch",
+		  "ticket",
+		  "fare",
+		  "cabin",
+		  "embarked"
+		 ]
+	},
+	"labelsTransformer": {
+	  "fields": {
+	    "survived": "survived",
+	    "did_not_survived": "notSurvived"
+	  }
+	}
+  }
+}'
+```
+
+# Start predicting labels
+We should have a working algorithm by now, so we can start making predictions. Let's take the first sample of our Kaggle dataset.
+
+```
+curl -X POST \
+  http://localhost:8080/predictions \
+  -H 'Content-Type: application/json' \
+  -H 'cache-control: no-cache' \
+  -d '{
+	"projectId": "kaggle-titanic",
+	"algorithmId": "tf-kaggle-titanic-1",
+	"features": {
+		"class": "CustomFeatures",
+		"data": [
+			1,
+			3,
+			"Braund Mr. Owen Harris",
+			"male",
+			22,
+			1,
+			0,
+			"A/5 21171",
+			"7.25",
+			"",
+			"S"
+		]
+	}
+}'
+```
+
+If our algorithm is correctly configured, we will get a response like the one below.
+
+```
+{
+    "id": "5c93c052-be9e-4b1c-bfda-fbd3c0514966",
+    "projectId": "kaggle-titanic",
+    "algorithmId": "tf-kaggle-titanic-1",
+    "features": {
+        "data": [
+            1,
+            3,
+            "Braund Mr. Owen Harris",
+            "male",
+            22,
+            1,
+            0,
+            "A/5 21171",
+            "7.25",
+            "",
+            "S"
+        ],
+        "class": "CustomFeatures"
+    },
+    "labels": {
+        "labels": [
+            {
+                "id": "4e1574f9-5376-434d-ada0-b74ed18ca50c",
+                "label": "survived",
+                "probability": 0,
+                "correctExampleUrl": "/examples?predictionId=5c93c052-be9e-4b1c-bfda-fbd3c0514966&labelId=4e1574f9-5376-434d-ada0-b74ed18ca50c&isCorrect=true",
+                "incorrectExampleUrl": "/examples?predictionId=5c93c052-be9e-4b1c-bfda-fbd3c0514966&labelId=4e1574f9-5376-434d-ada0-b74ed18ca50c&isIncorrect=true",
+                "class": "ClassificationLabel"
+            },
+            {
+                "id": "4f9e4773-a31d-4d49-a63e-360947060363",
+                "label": "notSurvived",
+                "probability": 1,
+                "correctExampleUrl": "/examples?predictionId=5c93c052-be9e-4b1c-bfda-fbd3c0514966&labelId=4f9e4773-a31d-4d49-a63e-360947060363&isCorrect=true",
+                "incorrectExampleUrl": "/examples?predictionId=5c93c052-be9e-4b1c-bfda-fbd3c0514966&labelId=4f9e4773-a31d-4d49-a63e-360947060363&isIncorrect=true",
+                "class": "ClassificationLabel"
+            }
+        ]
+    },
+    "examples": []
+}
+```
+
+Notice that we predicted the passenger would not survive with a probability of 1! Pay attention to `correctExampleUrl` and `incorrectExampleUrl`. These links are relative to the root of the FoundaML server.
+
+They allow you to label a prediction correct or incorrect. If you can have humans validate your predictions, this is extremely valuable.
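+
+Because the links are relative, a client has to resolve them against the server root before calling them. The Python sketch below assumes the server runs at `http://localhost:8080`, as in the requests above. `example_url` is a hypothetical helper, and the HTTP method to use on the resulting URL is not covered here.
+
+```python
+from urllib.parse import parse_qs, urljoin, urlsplit
+
+BASE_URL = "http://localhost:8080"  # assumed server root
+
+
+def example_url(base_url, relative_url):
+    """Resolve a relative example link against the server root."""
+    return urljoin(base_url, relative_url)
+
+
+correct_example_url = (
+    "/examples?predictionId=5c93c052-be9e-4b1c-bfda-fbd3c0514966"
+    "&labelId=4f9e4773-a31d-4d49-a63e-360947060363&isCorrect=true"
+)
+full_url = example_url(BASE_URL, correct_example_url)
+
+# The query string carries the prediction and label identifiers.
+params = parse_qs(urlsplit(full_url).query)
+```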
+
+You should also be aware that each prediction and example gets published to a streaming platform (only Amazon Kinesis at the moment). This allows you to evaluate your algorithms in real time.