# Search Similar Images

Given an example image, can we find similar images without the need for any labels? Leveraging Jina, we have the advantage that
we don't need any labels or textual information about the images in order to build a search for similar images.

In this tutorial we are going to create an image search system that retrieves similar images. We are going to
use the test split of the [Dogs vs. Cats](https://www.kaggle.com/c/dogs-vs-cats/data?select=test1.zip) dataset, which we
will subsequently refer to as the pets dataset. It contains 12.5K images of cats and dogs. We can define our
problem as: given an image of a cat or a dog, retrieve similar images of cats or dogs respectively.

Jina searches semantically, and the results will vary depending on the neural network that we use for image encoding. Since our
task is to search for similar images, we will consider visually-similar images as semantically-related.

## Build the Flow

The solution uses a simple pipeline that can be subdivided into two steps: **Index** and **Query**.

### Index

To search something out of the full dataset, we first need to index the data. This means that we store the embeddings
of all the images from the dataset in some form of storage. Each image can be read as a numpy array, which is then
fed to the neural network of our choice. This neural network encodes the input images into some latent space which we call
"embeddings". We then use an **Indexer** to store these embeddings in memory.

### Query

Once the data is indexed, i.e. our database is built, we simply need to feed our query (an image or set of
images) to the model to encode it into embeddings, and then use the **Indexer** to retrieve the matching images. The matching
can be based on any type of metric, but without going deeper into this, we will focus only on the Euclidean distance between
two embeddings (corresponding to two images).
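
To make the metric concrete, here is a minimal sketch of the Euclidean distance between two embeddings, computed with numpy on made-up vectors:

```python
import numpy as np

# Two made-up embeddings of the same dimensionality
emb_a = np.array([0.1, 0.9, 0.3])
emb_b = np.array([0.2, 0.8, 0.5])

# Euclidean distance: the smaller the value, the more similar the images
print(np.linalg.norm(emb_a - emb_b))
```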

We will use the **SimpleIndexer** Executor as
our indexer (the one that stores and retrieves data). This Executor also returns the matching `Document` when we make
a query. The search part is done using the built-in `match` function of `DocumentArrayMemmap`. To encode the images into
embeddings we will use our own Executor, which uses the pre-trained `ResNet101` model.

## Flow Overview

We have one Flow defined for this tutorial. However, it handles requests to `/index` and `/search` differently by
defining different endpoints using `requests` decorators. Below we see the Flow, which consists of an `Encoder` to encode
the images as the first step, followed by an `Indexer` to store/retrieve data.

```{figure} ../../../.github/images/image_search_flow.svg
:align: center
```

## Insights

Our first task is to wrap the image data as `Document`s and form a `DocumentArray`. This can be done easily with the
following code snippet. `from_files` creates an iterator over a list of image paths and yields `Document`s:

```python
from jina import DocumentArray
from jina.types.document.generators import from_files

# Location and format of the extracted pets dataset (adjust to your setup)
image_dir = 'data/pets'
image_format = 'jpg'

docs_array = DocumentArray(from_files(f'{image_dir}/*.{image_format}'))
```
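
As a quick sanity check, you can peek at what `from_files` produced. Each `Document` stores the path of its image in the `uri` attribute; no pixels are loaded at this point:

```python
print(len(docs_array))    # number of images found
print(docs_array[0].uri)  # path of the first image
```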

Once the images are loaded, our next step is to encode them into embeddings. As stated earlier, you can use
Executors from [Jina Hub](https://hub.jina.ai) off-the-shelf or you can define an Executor of your own in
just a few steps. For this tutorial we will write our own Executor:

```python
from flash.image import ImageEmbedder
from jina import DocumentArray, Executor, requests


class FlashImageEncoder(Executor):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        # Pre-trained ResNet101 backbone from lightning-flash; the weights
        # are downloaded automatically when the Executor is instantiated
        self._embedder = ImageEmbedder(backbone='resnet101')

    @requests
    def predict(self, docs: DocumentArray, **kwargs):
        # Encode every image (referenced by its uri) and store the vector
        # in the `embedding` attribute of the respective Document
        embeds = self._embedder.predict(docs.get_attributes('uri'))
        for doc, embed in zip(docs, embeds):
            doc.embedding = embed.numpy()
```

To build an Encoder Executor we inherit the base `Executor` and use a decorator
to define endpoints. As this `requests` decorator is empty, the function will be called regardless of the
endpoint invoked, i.e., on both the `/index` and `/search` endpoints. We
leverage [lightning-flash](https://github.com/PyTorchLightning/lightning-flash) to use the pre-trained `ResNet101` model for
getting the embeddings. You can replace this model with any other pre-trained model of your choice. When this
Executor is instantiated, the pre-trained weights are downloaded automatically. The `predict` function takes in
the `DocumentArray` and extracts embeddings, each of which is then stored in the `embedding` attribute of the
respective `Document`.
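
Because an Executor is a plain Python class, you can also try the encoder outside of a Flow. A minimal sketch, assuming the `docs_array` from above and enough memory to hold the model:

```python
encoder = FlashImageEncoder()
encoder.predict(docs_array)

# Each Document now carries an embedding vector
print(docs_array[0].embedding.shape)
```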

Finally comes the storage/retrieval step. We do this with an **Indexer** Executor. You can use any of the
available indexers on [Jina Hub](https://hub.jina.ai) or define your own. To create an **Indexer** you need two
endpoints: `/index` and `/search`. For this tutorial we will define a `SimpleIndexer`, which is [also available on Jina
Hub](https://hub.jina.ai/executor/zb38xlt4).

```python
from jina import DocumentArrayMemmap, DocumentArray, Executor, requests


class SimpleIndexer(Executor):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        # Persist indexed Documents in a memory-mapped store inside the workspace
        self._dam = DocumentArrayMemmap(self.workspace)

    @requests(on='/index')
    def index(self, docs: DocumentArray, **kwargs):
        # Store the encoded Documents
        self._dam.extend(docs)

    @requests(on='/search')
    def search(self, docs: DocumentArray, **kwargs):
        # Attach the nearest indexed Documents to each query Document
        docs.match(self._dam)
```

`SimpleIndexer` stores all the Documents with a memory map when invoked via the `/index` endpoint. During the search
Flow, it matches the query `Document` with the indexed `Document`s using the built-in `match` function
of `DocumentArrayMemmap`.
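
To see `match` in isolation, here is a self-contained toy example with random embeddings. The sizes are made up, and a `DocumentArrayMemmap` behaves the same way as the plain `DocumentArray` used here:

```python
import numpy as np
from jina import Document, DocumentArray

# A toy "index" of five Documents and a single toy "query"
index = DocumentArray(Document(embedding=np.random.rand(8)) for _ in range(5))
query = DocumentArray([Document(embedding=np.random.rand(8))])

query.match(index, metric='euclidean')  # the metric we chose earlier

# Each query Document now carries its neighbours, closest first
print(len(query[0].matches))
print(query[0].matches[0].scores)
```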

## Putting it all together in a Flow

So far we have seen the individual components of the Flow and how to define them. Next comes putting all of this together in a Flow:

```python
from jina import Flow

f = (
    Flow(protocol='http', port_expose=12345)  # the port is arbitrary; the client must use the same one
    .add(uses=FlashImageEncoder, name='encoder')
    .add(uses=SimpleIndexer, name='indexer', workspace='workspace')
)

with f:
    f.post('/index', docs_array, request_size=64)  # index the pets dataset in batches
    f.block()  # keep the server running to answer queries
```

### Query from Python

Keeping the server running, we can start a simple client to make a query:

```python
from jina import Client, Document


def print_matches(resp):
    # Print the uri of the top-3 matches for the query image
    for match in resp.docs[0].matches[:3]:
        print(match.uri)


c = Client(host='localhost', protocol='http', port_expose=12345)  # same protocol and port as the Flow
c.post('/search', Document(uri='path/to/an/image/'), on_done=print_matches)
```

## Results

The returned response contains the matching `Document`s, which in turn contain the `uri`s of the images. Below we can see the
matching images returned for the query:

```{figure} image-search.png
:align: center
```
