feat(types): add visualization to doc type #1884

Merged
merged 4 commits into from
Feb 6, 2021
1 change: 1 addition & 0 deletions .github/images/four-symbol-docs.svg
56 changes: 44 additions & 12 deletions README.md
@@ -86,8 +86,8 @@ jina hello-world --help

| | |
| --- |---|
| 🥚 | [CRUD Functions](#crud-functions) |
| 🐣 | [Document](#document) • [Flow](#flow) • [Visualize](#visualize) • [Feed Data](#feed-data) • [Fetch Result](#fetch-result) • [Add Logic](#add-logic) • [Inter & Intra Parallelism](#inter--intra-parallelism) • [Decentralize](#decentralized-flow) • [Asynchronous](#asynchronous-flow) |
| 🥚 | [CRUD Functions](#crud-functions) • [Document](#document) • [Flow](#flow) |
| 🐣 | [Feed Data](#feed-data) • [Fetch Result](#fetch-result) • [Add Logic](#add-logic) • [Inter & Intra Parallelism](#inter--intra-parallelism) • [Decentralize](#decentralized-flow) • [Asynchronous](#asynchronous-flow) |
| 🐥 | [Customize Encoder](#customize-encoder) • [Test Encoder](#test-encoder-in-flow) • [Parallelism & Batching](#parallelism--batching) • [Add Data Indexer](#add-data-indexer) • [Compose Flow from YAML](#compose-flow-from-yaml) • [Search](#search) • [Evaluation](#evaluation) • [REST Interface](#rest-interface) |

#### CRUD Functions
@@ -180,22 +180,48 @@ with f:

Get the vibe? Now we are talking! Let's learn more about the basic concepts and features in Jina.



#### Document
<a href="https://mybinder.org/v2/gh/jina-ai/jupyter-notebooks/main?filepath=basic-construct-document.ipynb"><img align="right" src="https://github.com/jina-ai/jina/blob/master/.github/badges/run-badge.svg?raw=true"/></a>

`Document` is [Jina's primitive data type](https://hanxiao.io/2020/11/22/Primitive-Data-Types-in-Neural-Search-System/#primitive-types). It can contain text, image, array, embedding, URI, and accompanied by rich meta information. It can be recurred both vertically and horizontally to have nested documents and matched documents. To construct a Document, one can use:
`Document` is [Jina's primitive data type](https://hanxiao.io/2020/11/22/Primitive-Data-Types-in-Neural-Search-System/#primitive-types). It can contain text, image, array, embedding, or URI content, and can be accompanied by rich meta information. To construct a Document, use:

```python
import numpy
from jina import Document

doc1 = Document(content=text_from_file, mime_type='text/x-python') # a text document containing Python code
doc2 = Document(content=numpy.random.random([10, 10])) # an ndarray document
doc1.chunks.append(doc2) # doc2 is now a sub-document of doc1
```

A Document can recurse both vertically and horizontally, giving nested documents (`chunks`) and matched documents (`matches`). To inspect this recursive structure, call `.plot()`. In JupyterLab/Notebook, Document objects are rendered automatically.

<table>
<tr>
<td>

```python
import numpy as np
from jina import Document

d0 = Document(id='🐲', embedding=np.array([0, 0]))
d1 = Document(id='🐦', embedding=np.array([1, 0]))
d2 = Document(id='🐢', embedding=np.array([0, 1]))
d3 = Document(id='🐯', embedding=np.array([1, 1]))

d0.chunks.append(d1)
d0.chunks[0].chunks.append(d2)
d0.matches.append(d3)

d0.plot() # simply `d0` on Jupyter
```

</td>
<td>
<img src="https://github.com/jina-ai/jina/blob/master/.github/images/four-symbol-docs.svg?raw=true"/>
</td>
</tr>
</table>
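
The two directions of recursion can also be pictured without Jina. The sketch below is only an illustration: `Node` is a hypothetical stand-in written for this example (it is not Jina's `Document` class), with `chunks` for the vertical direction and `matches` for the horizontal one. It rebuilds the four-document structure and prints it as an indented tree.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for a Document; not part of Jina's API."""
    id: str
    chunks: List['Node'] = field(default_factory=list)   # vertical: nested sub-documents
    matches: List['Node'] = field(default_factory=list)  # horizontal: matched documents


def render(node: Node, depth: int = 0, label: str = 'doc') -> List[str]:
    """Return one indented line per node, visiting chunks before matches."""
    lines = [f"{'  ' * depth}{label}: {node.id}"]
    for chunk in node.chunks:
        lines.extend(render(chunk, depth + 1, 'chunk'))
    for match in node.matches:
        lines.extend(render(match, depth + 1, 'match'))
    return lines


d0, d1, d2, d3 = (Node(i) for i in ('d0', 'd1', 'd2', 'd3'))
d0.chunks.append(d1)            # d1 is nested inside d0
d0.chunks[0].chunks.append(d2)  # d2 is nested inside d1
d0.matches.append(d3)           # d3 is a match of d0

tree = render(d0)
print('\n'.join(tree))
```

Chunks deepen the tree, while matches hang off a document at the same level as its chunks.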

<details>
<summary>Click here to see more about MultimodalDocument</summary>

@@ -256,7 +282,7 @@ Interested readers can refer to [`jina-ai/example`: how to build a multimodal se
#### Flow
<a href="https://mybinder.org/v2/gh/jina-ai/jupyter-notebooks/main?filepath=basic-create-flow.ipynb"><img align="right" src="https://github.com/jina-ai/jina/blob/master/.github/badges/run-badge.svg?raw=true"/></a>

Jina provides a high-level [Flow API](https://101.jina.ai) to simplify building search/index workflows. To create a new Flow:
Jina provides a high-level Flow API to simplify building CRUD workflows. To create a new Flow:

```python
from jina import Flow
@@ -265,14 +291,20 @@ f = Flow().add()

This creates a simple Flow with one [Pod](https://101.jina.ai). You can chain multiple `.add()`s in a single Flow.

#### Visualize
<a href="https://mybinder.org/v2/gh/jina-ai/jupyter-notebooks/main?filepath=basic-visualize-a-flow.ipynb"><img align="right" src="https://github.com/jina-ai/jina/blob/master/.github/badges/run-badge.svg?raw=true"/></a>

To visualize the Flow, simply chain it with `.plot('my-flow.svg')`. If you are using a Jupyter notebook, the Flow object will be displayed inline *without* `plot`:
To visualize the Flow, simply chain it with `.plot('my-flow.svg')`. If you are using a Jupyter notebook, the Flow object will be displayed inline *without* `plot`.

<img src="https://github.com/jina-ai/jina/blob/master/.github/simple-flow0.svg?raw=true"/>

`Gateway` is the entrypoint of the Flow.
`Gateway` is the entrypoint of the Flow.


| | |
| --- |---|
| 🥚 | [CRUD Functions](#crud-functions) • [Document](#document) • [Flow](#flow) |
| 🐣 | [Feed Data](#feed-data) • [Fetch Result](#fetch-result) • [Add Logic](#add-logic) • [Inter & Intra Parallelism](#inter--intra-parallelism) • [Decentralize](#decentralized-flow) • [Asynchronous](#asynchronous-flow) |
| 🐥 | [Customize Encoder](#customize-encoder) • [Test Encoder](#test-encoder-in-flow) • [Parallelism & Batching](#parallelism--batching) • [Add Data Indexer](#add-data-indexer) • [Compose Flow from YAML](#compose-flow-from-yaml) • [Search](#search) • [Evaluation](#evaluation) • [REST Interface](#rest-interface) |

#### Feed Data
<a href="https://mybinder.org/v2/gh/jina-ai/jupyter-notebooks/main?filepath=basic-feed-data.ipynb"><img align="right" src="https://github.com/jina-ai/jina/blob/master/.github/badges/run-badge.svg?raw=true"/></a>
@@ -546,8 +578,8 @@ That's all you need to know for understanding the magic behind `hello-world`. No

| | |
| --- |---|
| 🥚 | [CRUD Functions](#crud-functions) |
| 🐣 | [Document](#document) • [Flow](#flow) • [Visualize](#visualize) • [Feed Data](#feed-data) • [Fetch Result](#fetch-result) • [Add Logic](#add-logic) • [Inter & Intra Parallelism](#inter--intra-parallelism) • [Decentralize](#decentralized-flow) • [Asynchronous](#asynchronous-flow) |
| 🥚 | [CRUD Functions](#crud-functions) • [Document](#document) • [Flow](#flow) |
| 🐣 | [Feed Data](#feed-data) • [Fetch Result](#fetch-result) • [Add Logic](#add-logic) • [Inter & Intra Parallelism](#inter--intra-parallelism) • [Decentralize](#decentralized-flow) • [Asynchronous](#asynchronous-flow) |
| 🐥 | [Customize Encoder](#customize-encoder) • [Test Encoder](#test-encoder-in-flow) • [Parallelism & Batching](#parallelism--batching) • [Add Data Indexer](#add-data-indexer) • [Compose Flow from YAML](#compose-flow-from-yaml) • [Search](#search) • [Evaluation](#evaluation) • [REST Interface](#rest-interface) |


23 changes: 4 additions & 19 deletions jina/flow/base.py
@@ -10,16 +10,15 @@
import uuid
from collections import OrderedDict, defaultdict
from contextlib import ExitStack
from typing import Optional, Union, Tuple, List, Set, Dict, TextIO, TypeVar
from urllib.request import Request, urlopen
from typing import Optional, Union, Tuple, List, Set, Dict, TextIO

from .builder import build_required, _build_flow, _optimize_flow, _hanging_pods
from .. import __default_host__
from ..clients import Client, WebSocketClient
from ..enums import FlowBuildLevel, PodRoleType, FlowInspectType
from ..excepts import FlowTopologyError, FlowMissingPodError, RuntimeFailToStart
from ..excepts import FlowTopologyError, FlowMissingPodError
from ..helper import colored, \
get_public_ip, get_internal_ip, typename, ArgNamespace
get_public_ip, get_internal_ip, typename, ArgNamespace, download_mermaid_url
from ..jaml import JAML, JAMLCompatible
from ..logging import JinaLogger
from ..parsers import set_client_cli_parser, set_gateway_parser, set_pod_parser
@@ -604,7 +603,7 @@ def plot(self, output: str = None,
pass

if output:
op_flow._download_mermaid_url(url, output)
download_mermaid_url(url, output)
elif not showed:
op_flow.logger.info(f'flow visualization: {url}')

@@ -627,20 +626,6 @@ def _mermaid_to_url(self, mermaid_str, img_type) -> str:

return f'https://mermaid.ink/{img_type}/{encoded_str}'
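
The return statement above shows only the final URL shape. As a rough illustration of how such a URL could be assembled, the sketch below base64-encodes a Mermaid graph definition and appends it to the `mermaid.ink` endpoint. The base64 step and the standalone function name `mermaid_to_url` are assumptions made for this example; the hunk does not show how `encoded_str` is computed.

```python
import base64


def mermaid_to_url(mermaid_str: str, img_type: str = 'svg') -> str:
    """Build a mermaid.ink URL; assumes plain base64 of the UTF-8 graph text."""
    encoded_str = base64.b64encode(mermaid_str.encode('utf-8')).decode('ascii')
    return f'https://mermaid.ink/{img_type}/{encoded_str}'


graph = 'graph LR\n  gateway --> pod0'
url = mermaid_to_url(graph)
print(url)
```

Decoding the part after the image type recovers the original graph text, which is a quick way to check what was sent to the rendering service.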

def _download_mermaid_url(self, mermaid_url, output) -> None:
    """
    Rendering the current flow as a jpg image, this will call :py:meth:`to_mermaid` and it needs internet connection
    :param path: the file path of the image
    :param kwargs: keyword arguments of :py:meth:`to_mermaid`
    :return:
    """
    try:
        req = Request(mermaid_url, headers={'User-Agent': 'Mozilla/5.0'})
        with open(output, 'wb') as fp:
            fp.write(urlopen(req).read())
    except:
        self.logger.error('can not download image, please check your graph and the network connections')

@build_required(FlowBuildLevel.GRAPH)
def to_swarm_yaml(self, path: TextIO):
"""
16 changes: 16 additions & 0 deletions jina/helper.py
@@ -19,6 +19,7 @@
from itertools import islice
from types import SimpleNamespace
from typing import Tuple, Optional, Iterator, Any, Union, List, Dict, Set, Sequence, Iterable
from urllib.request import Request, urlopen

import numpy as np

@@ -773,3 +774,18 @@ def change_env(key, val):
def is_yaml_filepath(val) -> bool:
    r = r'^[/\w\-\_\.]+.ya?ml$'
    return re.match(r, val.strip()) is not None


def download_mermaid_url(mermaid_url, output) -> None:
    """
    Download the image rendered at ``mermaid_url`` and save it to ``output``; this needs an internet connection

    :param mermaid_url: the URL of the rendered Mermaid image
    :param output: the file path where the image is written
    """
    try:
        req = Request(mermaid_url, headers={'User-Agent': 'Mozilla/5.0'})
        with open(output, 'wb') as fp:
            fp.write(urlopen(req).read())
    except Exception:
        # imported lazily inside the handler, as in the original change
        from jina.logging import default_logger
        default_logger.error('can not download image, please check your graph and the network connections')
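
The helper above logs on any failure and returns nothing, so a caller cannot tell whether the file was written. One possible variant, sketched here under assumptions, catches only network and file-system errors and reports success as a boolean. The name `download_to_file`, the `timeout` parameter and the return value are illustrative and not part of Jina; the standard `logging` module stands in for Jina's logger.

```python
import logging
from urllib.request import Request, urlopen

logger = logging.getLogger(__name__)


def download_to_file(url: str, output: str, timeout: float = 10.0) -> bool:
    """Fetch ``url`` and write the body to ``output``; return True on success."""
    try:
        req = Request(url, headers={'User-Agent': 'Mozilla/5.0'})
        # Read the whole response first so a failed request leaves no empty file.
        with urlopen(req, timeout=timeout) as resp:
            data = resp.read()
        with open(output, 'wb') as fp:
            fp.write(data)
        return True
    except OSError as ex:
        # URLError and HTTPError are subclasses of OSError, so this covers
        # network failures as well as problems writing the output file.
        logger.error('cannot download %s: %s', url, ex)
        return False
```

Reading the response before opening the output file is a deliberate choice in this sketch: the version in the diff opens the file first, so a failed request can leave an empty file behind.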