docs: add finetuner docs (#771)
* docs: add finetuner docs

* docs: add finetuner instruction

* docs: add finetuner instruction

* docs: add finetuner instruction

* docs: add finetuner instruction

* docs: add finetuner instruction

* docs: add finetuner instruction

* docs: improve narratives

* docs: minor revision

* docs: minor revision

* docs: address comment

* docs: table width

* docs: fix table

* docs: minor revision

Co-authored-by: Isabelle Mohr <retrospect@protonmail.com>

* docs: minor revision

Co-authored-by: Isabelle Mohr <retrospect@protonmail.com>

* docs: minor revision

Co-authored-by: Isabelle Mohr <retrospect@protonmail.com>

* docs: address comment

* docs: restructure

* docs: intersphinx

* docs: typo

* docs: add image preview

* docs: fix image link

* docs: fix typo

* docs: add model path example

* docs: add model path example

* docs: improve narratives

Co-authored-by: Isabelle Mohr <retrospect@protonmail.com>
ZiniuYu and violenil committed Jul 20, 2022
1 parent 0ff4e25 commit bc6b72e
Showing 4 changed files with 238 additions and 5 deletions.
2 changes: 2 additions & 0 deletions docs/conf.py
@@ -80,6 +80,8 @@
html_show_sourcelink = False
html_favicon = '_static/favicon.png'

intersphinx_mapping = {'docarray': ('https://docarray.jina.ai/', None), 'finetuner': ('https://finetuner.jina.ai/', None)}

latex_documents = [(master_doc, f'{slug}.tex', project, author, 'manual')]
man_pages = [(master_doc, slug, project, [author], 1)]
texinfo_documents = [
1 change: 0 additions & 1 deletion docs/index.md
@@ -178,7 +178,6 @@ It means the client and the server are now connected. Well done!
user-guides/client
user-guides/server
user-guides/faq
```

```{toctree}
187 changes: 187 additions & 0 deletions docs/user-guides/finetuner.md
@@ -0,0 +1,187 @@
(Finetuner)=
# Fine-tune Models

Although CLIP-as-service provides a list of pre-trained models, you can also fine-tune your own models.
This guide will show you how to use [Finetuner](https://finetuner.jina.ai) to fine-tune models and use them in CLIP-as-service.

For installation and basic usage of Finetuner, please refer to the [Finetuner documentation](https://finetuner.jina.ai).
You can also [learn more details about fine-tuning CLIP](https://finetuner.jina.ai/tasks/text-to-image/).

## Prepare Training Data

Finetuner accepts training data and evaluation data in the form of {class}`~docarray.array.document.DocumentArray`.
The training data for CLIP is a list of (text, image) pairs.
Each pair is stored in a {class}`~docarray.document.Document` which wraps two [`chunks`](https://docarray.jina.ai/fundamentals/document/nested/) with `image` and `text` modalities, respectively.
You can push the resulting {class}`~docarray.array.document.DocumentArray` to the cloud using the {meth}`~docarray.array.document.DocumentArray.push` method.
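
For instance, a single (text, image) pair can be wrapped as follows (a minimal sketch with a placeholder description and image URL; the complete script for this tutorial's dataset appears further below):

```python
from docarray import Document

# One training pair: a text chunk and an image chunk inside a parent Document
pair = Document(
    chunks=[
        Document(content='a description of the product', modality='text'),
        Document(uri='https://example.com/product.jpeg', modality='image'),
    ]
)
```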

We use [fashion captioning dataset](https://github.com/xuewyang/Fashion_Captioning) as a sample dataset in this tutorial.
The following are examples of descriptions and image URLs from the dataset, along with a preview of each image.

| Description | Image URL | Preview |
|---------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------|
| subtly futuristic and edgy this liquid metal cuff bracelet is shaped from sculptural rectangular link | [https://n.nordstrommedia.com/id/sr3/<br/>58d1a13f-b6b6-4e68-b2ff-3a3af47c422e.jpeg](https://n.nordstrommedia.com/id/sr3/58d1a13f-b6b6-4e68-b2ff-3a3af47c422e.jpeg) | <img src="https://n.nordstrommedia.com/id/sr3/58d1a13f-b6b6-4e68-b2ff-3a3af47c422e.jpeg?raw=true" width=100px> |
| high quality leather construction defines a hearty boot one-piece on a tough lug sole | [https://n.nordstrommedia.com/id/sr3/<br/>21e7a67c-0a54-4d09-a4a4-6a0e0840540b.jpeg](https://n.nordstrommedia.com/id/sr3/21e7a67c-0a54-4d09-a4a4-6a0e0840540b.jpeg) | <img src="https://n.nordstrommedia.com/id/sr3/21e7a67c-0a54-4d09-a4a4-6a0e0840540b.jpeg?raw=true" width=100px> |
| this shimmering tricot knit tote is traced with decorative whipstitching and diamond cut chain the two hallmark of the falabella line | [https://n.nordstrommedia.com/id/sr3/<br/>1d8dd635-6342-444d-a1d3-4f91a9cf222b.jpeg](https://n.nordstrommedia.com/id/sr3/1d8dd635-6342-444d-a1d3-4f91a9cf222b.jpeg) | <img src="https://n.nordstrommedia.com/id/sr3/1d8dd635-6342-444d-a1d3-4f91a9cf222b.jpeg?raw=true" width=100px> |
| ... | ... | ... |

You can use the following script to transform the first three entries of the dataset to a {class}`~docarray.array.document.DocumentArray` and push it to the cloud using the name `fashion-sample`.

```python
from docarray import Document, DocumentArray

train_da = DocumentArray(
[
Document(
chunks=[
Document(
content='subtly futuristic and edgy this liquid metal cuff bracelet is shaped from sculptural rectangular link',
modality='text',
),
Document(
uri='https://n.nordstrommedia.com/id/sr3/58d1a13f-b6b6-4e68-b2ff-3a3af47c422e.jpeg',
modality='image',
),
],
),
Document(
chunks=[
Document(
content='high quality leather construction defines a hearty boot one-piece on a tough lug sole',
modality='text',
),
Document(
uri='https://n.nordstrommedia.com/id/sr3/21e7a67c-0a54-4d09-a4a4-6a0e0840540b.jpeg',
modality='image',
),
],
),
Document(
chunks=[
Document(
content='this shimmering tricot knit tote is traced with decorative whipstitching and diamond cut chain the two hallmark of the falabella line',
modality='text',
),
Document(
uri='https://n.nordstrommedia.com/id/sr3/1d8dd635-6342-444d-a1d3-4f91a9cf222b.jpeg',
modality='image',
),
],
),
]
)
train_da.push('fashion-sample')
```
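
To double-check that the upload succeeded, you can pull the {class}`~docarray.array.document.DocumentArray` back by name (a quick sanity check, assuming you are logged in to the same account you pushed from):

```python
from docarray import DocumentArray

# Pull the DocumentArray we just pushed and inspect its contents
da = DocumentArray.pull('fashion-sample')
da.summary()
```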

The full dataset has been converted to `clip-fashion-train-data` and `clip-fashion-eval-data` and pushed to the cloud, so it can be used directly in Finetuner.

## Start Finetuner

You may now create and run a fine-tuning job after logging in to the Jina ecosystem.

```python
import finetuner

finetuner.login()
run = finetuner.fit(
model='openai/clip-vit-base-patch32',
run_name='clip-fashion',
train_data='clip-fashion-train-data',
eval_data='clip-fashion-eval-data', # optional
epochs=5,
learning_rate=1e-5,
loss='CLIPLoss',
cpu=False,
)
```

After the job has started, you may use {meth}`~finetuner.run.Run.status` to check its status.

```python
import finetuner

finetuner.login()
run = finetuner.get_run('clip-fashion')
print(run.status())
```

When the status is `FINISHED`, you can download the tuned model to your local machine.

```python
import finetuner

finetuner.login()
run = finetuner.get_run('clip-fashion')
run.save_artifact('clip-model')
```

You should now have a zip file named `clip-fashion.zip`, containing the tuned model, under the folder `clip-model`.
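
You can also unzip the artifact from Python (a small sketch, assuming the zip file sits under `clip-model` as described above):

```python
import zipfile

# Extract the downloaded artifact next to the zip file
with zipfile.ZipFile('clip-model/clip-fashion.zip', 'r') as zf:
    zf.extractall('clip-model')
```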

## Use the Model

After unzipping the model from the previous step, you get a folder with the following structure:

```text
.
└── clip-fashion/
├── config.yml
├── metadata.yml
├── metrics.yml
└── models/
├── clip-text/
│ ├── metadata.yml
│ └── model.onnx
├── clip-vision/
│ ├── metadata.yml
│ └── model.onnx
└── input-map.yml
```

Since the tuned model generated by Finetuner contains richer information such as metadata and config, we now transform it into the simpler structure used by CLIP-as-service.

* Firstly, create a new folder named `clip-fashion-cas` (or a name of your choice). This folder will store the models used by CLIP-as-service.

* Secondly, copy the textual model `clip-fashion/models/clip-text/model.onnx` into the folder `clip-fashion-cas` and rename it to `textual.onnx`.

* Similarly, copy the visual model `clip-fashion/models/clip-vision/model.onnx` into the folder `clip-fashion-cas` and rename it to `visual.onnx`.
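
The copy-and-rename steps above can also be scripted (a small sketch using the paths from this tutorial):

```python
import os
import shutil

# Create the target folder and copy the two ONNX models under their new names
os.makedirs('clip-fashion-cas', exist_ok=True)
shutil.copy('clip-fashion/models/clip-text/model.onnx', 'clip-fashion-cas/textual.onnx')
shutil.copy('clip-fashion/models/clip-vision/model.onnx', 'clip-fashion-cas/visual.onnx')
```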

This is the expected structure of `clip-fashion-cas`:

```text
.
└── clip-fashion-cas/
├── textual.onnx
└── visual.onnx
```

In order to use the fine-tuned model, create a custom YAML file `finetuned_clip.yml` like the one below. Learn more about [Flow YAML configuration](https://docs.jina.ai/fundamentals/flow/yaml-spec/) and [`clip_server` YAML configuration](https://clip-as-service.jina.ai/user-guides/server/#yaml-config).

```yaml
jtype: Flow
version: '1'
with:
port: 51000
executors:
- name: clip_o
uses:
jtype: CLIPEncoder
metas:
py_modules:
- clip_server.executors.clip_onnx
with:
name: ViT-B/32
model_path: 'clip-fashion-cas' # path to clip-fashion-cas
replicas: 1
```

```{warning}
Note that Finetuner currently only supports the ViT-B/32 CLIP model. The model name should match the fine-tuned model, or you will get incorrect output.
```

You can now start the `clip_server` with the fine-tuned model to get a performance boost:

```bash
python -m clip_server finetuned_clip.yml
```
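
Once the server is up, you can query it from the client side to confirm the fine-tuned model is serving (a minimal sketch; assumes the `clip-client` package is installed and the server listens on port 51000 as configured above):

```python
from clip_client import Client

# Connect to the locally running clip_server and encode a sample query
c = Client('grpc://0.0.0.0:51000')
vec = c.encode(['a sleek leather boot on a lug sole'])
print(vec.shape)  # (1, 512) for ViT-B/32
```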

That's it, enjoy 🚀
53 changes: 49 additions & 4 deletions docs/user-guides/server.md
@@ -75,6 +75,23 @@ OpenAI has released 9 models so far. `ViT-B/32` is used as the default model in all
| ViT-L/14 |||| 768 | 933 | 3.66 | 2.04 |
| ViT-L/14@336px |||| 768 | 934 | 3.74 | 2.23 |

### Use custom model

You can also use your own model in the ONNX runtime by specifying the model name and the path to the model directory in the YAML file.
The model directory should have the following structure:

```text
.
└── custom-model/
├── textual.onnx
└── visual.onnx
```
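
Before wiring the directory into the server, you can quickly check that both ONNX files load (an optional sanity check, assuming `onnxruntime` is installed; the paths match the layout above):

```python
import onnxruntime as ort

# Both models should create an inference session without errors
for path in ('custom-model/textual.onnx', 'custom-model/visual.onnx'):
    sess = ort.InferenceSession(path, providers=['CPUExecutionProvider'])
    print(path, '->', [i.name for i in sess.get_inputs()])
```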

You may wonder how to produce such a model.
Fortunately, you can simply use [Finetuner](https://finetuner.jina.ai) to fine-tune your model on a custom dataset.
Finetuner is a cloud service that makes fine-tuning simple and fast.
By moving the process into the cloud, it handles all related complexity and infrastructure, making models performant and production-ready.
{ref}`Click here for detailed instructions<Finetuner>`.

## YAML config

@@ -230,11 +247,11 @@ executors:

For all backends, you can set the following parameters via `with`:

| Parameter               | Description                                                                                                                    |
|-------------------------|--------------------------------------------------------------------------------------------------------------------------------|
| `name`                  | Model weights, default is `ViT-B/32`. Supports all OpenAI released pretrained models.                                           |
| `num_worker_preprocess` | The number of CPU workers for image & text preprocessing, default 4.                                                            |
| `minibatch_size`        | The size of a minibatch for CPU preprocessing and GPU encoding, default 64. Reduce it if you encounter OOM on the GPU.          |

There are also runtime-specific parameters listed below:

@@ -252,6 +269,7 @@
| Parameter    | Description                                                  |
|--------------|--------------------------------------------------------------|
| `device`     | `cuda` or `cpu`. Default is `None`, which means auto-detect. |
| `model_path` | The path to the custom CLIP model, default `None`.           |
````

@@ -278,6 +296,33 @@ executors:
- executors/clip_torch.py
```

To use a custom model in the ONNX runtime, one can do:

```{code-block} yaml
---
emphasize-lines: 9-11
---
jtype: Flow
version: '1'
with:
port: 51000
executors:
- name: clip_o
uses:
jtype: CLIPEncoder
with:
name: ViT-B/32
model_path: 'custom-model'
metas:
py_modules:
- executors/clip_onnx.py
```

```{warning}
The model name should match the fine-tuned model, or you will get incorrect output.
```

### Executor config

The full list of configs for Executor can be found via `jina executor --help`. The most important one is probably `replicas`, which **allows you to run multiple CLIP models in parallel** to achieve horizontal scaling.
