fix(demo): fix celeba example docs, logic, code (#145)
* fix(demo): fix celeba example docs, logic, code

* fix(demo): fix celeba example docs, logic, code
hanxiao committed Oct 19, 2021
1 parent 0be69a4 commit bc8b36e
Showing 10 changed files with 88 additions and 76 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -40,7 +40,7 @@ and production.

## Install

Make sure you have Python 3.7+ and one of Pytorch, Keras or PaddlePaddle installed on Linux/MacOS.
Make sure you have Python 3.7+ and one of Pytorch (>=1.9), Tensorflow (>=2.5) or PaddlePaddle installed on Linux/MacOS.

```bash
pip install finetuner
Binary file added docs/get-started/celeba-labeler.gif
128 changes: 65 additions & 63 deletions docs/get-started/celeba.md
@@ -1,26 +1,48 @@
# Finetuning Pre-Trained ResNet on CelebA Dataset
# Finetuning Pretrained ResNet for Celebrity Face Search

In this example, we want to "tune" the pre-trained [ResNet](https://arxiv.org/abs/1512.03385) on [CelebA dataset](https://mmlab.ie.cuhk.edu.hk/projects/CelebA.html), the ResNet model has pre-trained weights on ImageNet.
```{tip}
For this example, you will need a GPU machine for the best experience.
```

In this example, we want to "tune" the pre-trained [ResNet](https://arxiv.org/abs/1512.03385) on the [CelebA dataset](https://mmlab.ie.cuhk.edu.hk/projects/CelebA.html). Note that the original weights of the ResNet model were trained on ImageNet.

Precisely, "tuning" means:
- we set up a Jina search pipeline and will look at the top-K visually similar result;
- we accept or reject the results based on their quality;
- we let the model to remember our feedback and produce better search result.
The Finetuner works in the following steps:
- first, we spawn the Labeler, which lets us inspect the top-K visually similar celebrity face images retrieved by the original ResNet;
- then, in the Labeler UI we accept or reject the results based on their similarity;
- finally, the feedback is collected at the backend by the Tuner, which "tunes" the ResNet and produces better search results.

Hopefully the procedure converges after several rounds, and we end up with a tuned embedding model for better celebrity face search.

## Build embedding model
## Prepare CelebA data

Let's import pre-trained ResNet as our {ref}`embedding model<embedding-model>` using any of the following frameworks.
Let's first make sure you have downloaded all the images [`img_align_celeba.zip`](https://drive.google.com/file/d/0B7EVK8r0v71pZjFTYXZWM3FlRnM/view?usp=sharing&resourcekey=0-dYn9z10tMJOBAkviAcfdyQ) and [`IdentityCelebA.txt`](https://drive.google.com/file/d/1_ee_0u7vcNLOfNLegJRHmolfH5ICW-XS/view?usp=sharing) locally.

````{tab} PyTorch
```{caution}
Beware that the original CelebA dataset is 1.3GB. For this example we do not need the full dataset; a smaller version containing 1000 images from the original dataset is enough. You can [download it from here](https://static.jina.ai/celeba/celeba-img.zip).
```

Note that Finetuner accepts Jina `DocumentArray`/`DocumentArrayMemmap`, so we first load the CelebA data into this format using a generator:

```python
from jina.types.document.generators import from_files


def data_gen():
    for d in from_files('/Users/jina/Downloads/img_align_celeba/*.jpg', size=100, to_dataturi=True):
        d.convert_image_datauri_to_blob(color_axis=0)  # channel-first blob; color_axis=0 is not needed for the Keras/TF backend
        yield d
```
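As a quick sanity check, you can pull a single `Document` out of the generator and inspect its blob shape. This is only a sketch and assumes the images have been extracted to the path used above:

```python
# Peek at the first Document produced by data_gen (same path assumption as above).
d = next(data_gen())
print(d.blob.shape)  # channel-first, e.g. (3, 218, 178) for the aligned CelebA images
```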

## Load the pretrained model

Let's import a pretrained ResNet50 as our base model. ResNet50 is implemented in PyTorch, Keras and Paddle. You can choose whichever framework you feel most comfortable with:

````{tab} PyTorch
```python
import torchvision
model = torchvision.models.resnet50(pretrained=True)
```
````
````{tab} Keras
```python
@@ -37,73 +59,53 @@ model = paddle.vision.models.resnet50(pretrained=True)
```
````

## Prepare data

Now prepare CelebA data for the Finetuner. Note that Finetuner accepts Jina `DocumentArray`/`DocumentArrayMemmap`, so we first convert them into this format.

Let's first make sure you have downloaded all the images `img_align_celeba.zip` (unzip) and `IdentityCelebA.txt` locally.

Since each celebrity has multiple facial images, we first create a `defaultdict` and group these images by their identity:

```python
from collections import defaultdict

DATA_PATH = '~/[YOUR-DIRECTORY]/img_align_celeba/'
IDENTITY_PATH = '~/[YOUR-DIRECTORY]/identity_CelebA.txt'


def group_imgs_by_identity():
    grouped = defaultdict(list)
    with open(IDENTITY_PATH, 'r') as f:
        for line in f:
            img_file_name, identity = line.split()
            grouped[identity].append(img_file_name)
    return grouped
```

Then we create a data generator that yields every image as a `Document` object:

```python
from jina import Document

def train_generator():
    for identity, imgs in group_imgs_by_identity().items():
        for img in imgs:
            d = Document(uri=DATA_PATH + img)
            d.convert_image_uri_to_blob(color_axis=0)
            d.convert_uri_to_datauri()
            yield d
```


## Put together

Finally, let's feed the model and the data into the Finetuner:
Finally, let's start the Finetuner. Note that we freeze the weights of the original ResNet and tune only the last linear layer, which leverages Tailor underneath:

```python
rv = fit(
```{code-block} python
---
emphasize-lines: 5, 8
---
import finetuner
finetuner.fit(
    model=model,
    interactive=True,
    train_data=train_generator,
    train_data=data_gen,
    freeze=True,
    to_embedding_model=True,
    input_size=(3, 224, 224),
    output_dim=512, # Chop-off the last fc layer and add a trainable linear layer.
    output_dim=100
)
```

Note how we specify `interactive=True` and `to_embedding_model=True` in the above code to activate the Labeler and Tailor, respectively.

`input_size` is not required when you are using Keras as the backend.
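For instance, if you picked the Keras base model in the earlier tab, a minimal sketch of the same call could look like the following; the only difference is that `input_size` is dropped (the Keras model here is an assumption, mirroring the PyTorch setup above):

```python
import finetuner

finetuner.fit(
    model=model,              # a Keras ResNet50, e.g. from the Keras tab above
    interactive=True,         # spawn the Labeler UI
    train_data=data_gen,
    freeze=True,              # keep the pretrained weights fixed
    to_embedding_model=True,  # let Tailor convert the classifier into an embedding model
    output_dim=100,           # dimension of the trainable linear head
)
```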

## Label interactively

You can now label the data by mouse/keyboard. The model will get trained and improved as you are labeling.
After running the script, the browser will open the Labeler UI. You can now label the data with your mouse and keyboard. The model gets trained and improves as you label. If you are running this example on a CPU machine, each labeling round can take up to 20 seconds.

```{figure} celeba-labeler.gif
:align: center
```

From the backend you will see model's training procedure:
On the backend, you should see the training progress in the terminal.

```bash
Flow@22900[I]:🎉 Flow is ready to use!
```console
Flow@6620[I]:🎉 Flow is ready to use!
🔗 Protocol: HTTP
🏠 Local access: 0.0.0.0:52621
🔒 Private network: 172.18.1.109:52621
🌐 Public address: 94.135.231.132:52621
💬 Swagger UI: http://localhost:52621/docs
📚 Redoc: http://localhost:52621/redoc
JINA@22900[I]:Finetuner is available at http://localhost:52621/finetuner
🏠 Local access: 0.0.0.0:61622
🔒 Private network: 172.18.1.109:61622
🌐 Public address: 94.135.231.132:61622
💬 Swagger UI: http://localhost:61622/docs
📚 Redoc: http://localhost:61622/redoc
UserWarning: ignored unknown argument: ['thread']. (raised from /Users/hanxiao/Documents/jina/jina/helper.py:685)
⠴ Working... ━╸━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0:00:00 estimating... JINA@6620[I]:Finetuner is available at http://localhost:61622/finetuner
⠏ Working... ━━╸━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0:00:00 0.0 step/s UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:180.) (raised from /Users/hanxiao/Documents/trainer/finetuner/labeler/executor.py:53)
⠦ DONE ━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0:00:06 1.4 step/s 11 steps done in 6 seconds
⠙ DONE ━━╸━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0:00:03 0.3 step/s T: Loss= 0.75
```
7 changes: 6 additions & 1 deletion docs/get-started/covid-qa.md
@@ -1,4 +1,9 @@
# Finetuning Bi-LSTM on Text
# Finetuning Bi-LSTM for Question-Answering

```{tip}
This example is inspired by [`jina hello chatbot`](https://docs.jina.ai/get-started/hello-world/covid-19-chatbot/). We strongly recommend you check out that demo first before going through this tutorial.
```


In this example, we want to "tune" the 32-dim embedding vectors from a bidirectional LSTM on Covid19 QA data, the same dataset that we are using in `jina hello chatbot`.
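To make the setup concrete, a minimal sketch of such an embedding model is shown below; the vocabulary size, hidden size and framework are assumptions for illustration, not necessarily what the tutorial uses:

```python
import torch
import torch.nn as nn

class BiLSTMEmbedder(nn.Module):
    """Toy bidirectional LSTM that maps token-id sequences to 32-dim embeddings."""
    def __init__(self, vocab_size=5000, embed_dim=64, hidden_dim=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, bidirectional=True, batch_first=True)

    def forward(self, x):                  # x: (batch, seq_len) of token ids
        out, _ = self.lstm(self.embed(x))
        return out.mean(dim=1)             # (batch, 2 * hidden_dim) = 32-dim embedding
```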

6 changes: 5 additions & 1 deletion docs/get-started/fashion-mnist.md
@@ -1,4 +1,8 @@
# Finetuning MLP on Image
# Finetuning MLP for Fashion Image Search

```{tip}
This example is inspired by [`jina hello fashion`](https://docs.jina.ai/get-started/hello-world/fashion/). We strongly recommend you check out that demo first before going through this tutorial.
```

In this example, we want to "tune" the 32-dim embedding vectors from a 2-layer MLP on the Fashion-MNIST image data, the same dataset that we are using in `jina hello fashion`.
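For reference, a 2-layer MLP embedding model of this kind can be sketched as below; the layer sizes and framework are assumptions for illustration only:

```python
import torch.nn as nn

# Toy example: flatten a 28x28 Fashion-MNIST image and embed it into 32 dims.
mlp = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 128),
    nn.ReLU(),
    nn.Linear(128, 32),
)
```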

1 change: 1 addition & 0 deletions docs/index.md
@@ -252,6 +252,7 @@ Learn more about {term}`Tailor`, {term}`Tuner` and {term}`Labeler`.
get-started/fashion-mnist
get-started/covid-qa
get-started/celeba
```


5 changes: 3 additions & 2 deletions finetuner/labeler/executor.py
@@ -51,8 +51,9 @@ def embed(self, docs: DocumentArray, parameters: Dict, **kwargs):
            self._embed_model.eval()
            da_input = torch.from_numpy(da.blobs)
            docs_input = torch.from_numpy(docs.blobs)
            da.embeddings = self._embed_model(da_input).detach().numpy()
            docs.embeddings = self._embed_model(docs_input).detach().numpy()
            with torch.inference_mode():
                da.embeddings = self._embed_model(da_input).detach().numpy()
                docs.embeddings = self._embed_model(docs_input).detach().numpy()
        elif f_type == 'paddle':
            import paddle

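The change above wraps the embedding computation in `torch.inference_mode()`, which skips autograd bookkeeping and reduces memory use and latency while labeling. A minimal, self-contained sketch of the same pattern (the toy model and shapes are assumptions, unrelated to the executor's real model):

```python
import torch

model = torch.nn.Linear(8, 4)
model.eval()
x = torch.randn(2, 8)
with torch.inference_mode():
    emb = model(x).detach().numpy()  # no autograd graph is built here
print(emb.shape)  # (2, 4)
```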
2 changes: 1 addition & 1 deletion finetuner/labeler/ui/js/components/image-match-card.vue.js
@@ -16,7 +16,7 @@ const imageMatchCard = {
<div class="card-body">
<div class="image-matches-container">
<div class="col compact-img" v-for="(match, matchIndex) in doc.matches">
<div class="w-100" v-bind:class="{ 'positive-match': match.tags.finetuner_label }">
<div class="d-flex justify-content-center" v-bind:class="{ 'positive-match': match.tags.finetuner_label }">
<img v-bind:src="getContent(match)" class="img-thumbnail img-fluid"
v-on:click="toggleRelevance(match)">
</div>
2 changes: 1 addition & 1 deletion finetuner/labeler/ui/js/main.js
@@ -45,7 +45,7 @@ const app = new Vue({
advanced_config: {
pos_value: {text: 'Positive label', value: 1, type: 'number'},
neg_value: {text: 'Negative label', value: -1, type: 'number'},
epochs: {text: 'Epochs', value: 10, type: 'number'},
epochs: {text: 'Epochs', value: 1, type: 'number'},
sample_size: {text: 'Match pool', value: 1000, type: 'number'},
model_path: {text: 'Model save path', value: 'tuned-model', type: 'text'}
},
11 changes: 5 additions & 6 deletions finetuner/labeler/ui/main.css
@@ -122,6 +122,7 @@ main {
footer {
align-self: center;
margin-top: auto;
font-weight: lighter;
}

footer a {
@@ -317,10 +318,6 @@ footer a {
border: none;
}

.image-matches-container .positive-match {
background: transparent;
}

.positive-match:before {
background: url(./img/check.svg);
content: '';
@@ -349,12 +346,14 @@ footer a {

.image-card .card-header .img-thumbnail {
margin-left: .25rem;
min-width: 30%;
min-height: 20vh;
width: auto;
padding: 0;
}

.image-card .card-body .img-thumbnail {
width: 100%;
width: auto;
max-height: 20vh;
padding: 0;
}

