CLIPTextEncoder

CLIPTextEncoder is a text encoder that wraps the text embedding functionality of the CLIP model from Hugging Face transformers.

It takes Documents with text stored in the text attribute as inputs, and stores the resulting embedding in the embedding attribute.

The CLIP model was originally proposed in Learning Transferable Visual Models From Natural Language Supervision, and is trained to embed images and text in the same latent space. The corresponding image encoder is CLIPImageEncoder; using both encoders together works well in multi-modal or cross-modal search applications.
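
Because text and images share one latent space, cross-modal search comes down to comparing vectors, typically by cosine similarity. The sketch below illustrates that comparison only; the three-dimensional vectors are made-up stand-ins for real CLIP embeddings, and `cosine_similarity` and `rank_by_similarity` are hypothetical helpers, not part of this Executor.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, candidates):
    """Return candidate indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query, c) for c in candidates]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)


# Toy vectors standing in for one text embedding and three image embeddings.
text_embedding = [1.0, 0.0, 0.0]
image_embeddings = [
    [0.0, 1.0, 0.0],   # orthogonal to the query
    [2.0, 0.1, 0.0],   # nearly parallel to the query
    [-1.0, 0.0, 0.0],  # opposite direction
]

ranking = rank_by_similarity(text_embedding, image_embeddings)
print(ranking)
```

In a real application the query vector would come from CLIPTextEncoder and the candidates from CLIPImageEncoder, and an indexer would usually replace the brute-force loop.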

Usage

Here's a simple example of how to use CLIPTextEncoder in a Flow:

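from jina import Flow, Document

# Run the encoder as a containerized Executor pulled from Jina Hub
f = Flow().add(uses='jinahub+docker://CLIPTextEncoder')

def print_result(resp):
    # The encoder stores its output in the Document's embedding attribute
    doc = resp.docs[0]
    print(f'Embedded "{doc.text}" to {doc.embedding.shape[0]}-dimensional vector')

with f:
    doc = Document(text='your text')
    # Send the Document through the Flow; the callback receives the response
    f.post(on='/foo', inputs=doc, on_done=print_result)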

Note that, run this way, the Executor downloads the model every time it starts up. To reuse cached model files, mount the cache directory that the model uses into the container by modifying the Flow definition like this:

f = Flow().add(
    uses='jinahub+docker://CLIPTextEncoder',
    volumes='/your/home/dir/.cache/huggingface:/root/.cache/huggingface'
)
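
The `volumes` value is a `host_path:container_path` pair. As a minimal sketch, the string can be built from the current user's home directory instead of being hard-coded. `huggingface_cache_volume` is a hypothetical helper, and it assumes the host keeps the Hugging Face cache in the default `.cache/huggingface` folder under the home directory; the container path is the one used in the Flow definition above.

```python
import os


def huggingface_cache_volume(home_dir=None):
    """Build a 'host_path:container_path' string for the `volumes` argument."""
    # Assumption: the host cache lives under <home>/.cache/huggingface.
    if home_dir is None:
        home_dir = os.path.expanduser('~')
    host_path = os.path.join(home_dir, '.cache', 'huggingface')
    # Container path as used in the Flow definition in this README.
    container_path = '/root/.cache/huggingface'
    return f'{host_path}:{container_path}'


volume = huggingface_cache_volume('/your/home/dir')
print(volume)
```

The result can then be passed as `volumes=huggingface_cache_volume()` when adding the Executor to the Flow.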

With GPU

This encoder also offers a GPU version under the gpu tag. To use it, pass device='cuda' as an initialization parameter and gpus='all' when adding the containerized Executor to the Flow. See the Executor on GPU section of the Jina documentation for more details.

Here's how you would modify the example above to use a GPU:

f = Flow().add(
    uses='jinahub+docker://CLIPTextEncoder/gpu',
    uses_with={'device': 'cuda'},
    gpus='all',
    volumes='/your/home/dir/.cache/huggingface:/root/.cache/huggingface'
)
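
The CPU and GPU variants differ only in the image tag and two extra arguments, so one way to switch between them is to assemble the keyword arguments in a small function. This is a sketch; `encoder_kwargs` is a hypothetical helper name, and the values it returns are the ones shown in the examples above.

```python
def encoder_kwargs(use_gpu, cache_volume=None):
    """Assemble keyword arguments for Flow().add() for the CPU or GPU image."""
    kwargs = {'uses': 'jinahub+docker://CLIPTextEncoder'}
    if use_gpu:
        # The GPU image is published under the gpu tag.
        kwargs['uses'] += '/gpu'
        kwargs['uses_with'] = {'device': 'cuda'}
        kwargs['gpus'] = 'all'
    if cache_volume is not None:
        # Optional 'host_path:container_path' mount for the model cache.
        kwargs['volumes'] = cache_volume
    return kwargs


cpu_kwargs = encoder_kwargs(use_gpu=False)
gpu_kwargs = encoder_kwargs(
    use_gpu=True,
    cache_volume='/your/home/dir/.cache/huggingface:/root/.cache/huggingface',
)
print(gpu_kwargs)
```

With Jina installed, the Flow would then be created as `Flow().add(**gpu_kwargs)`.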
