# Dependencies and imports

In [1]:
# Requirements.
%pip install "build>=1.0" "torch>=2.0" "transformers" "../../libs/buildlib" "pytest"

Processing /home/lmerrick/Code/embed_text_container_service/libs/buildlib
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Installing backend dependencies ... done
  Preparing metadata (pyproject.toml) ... done
Building wheels for collected packages: buildlib
  Building wheel for buildlib (pyproject.toml) ... done
  Created wheel for buildlib: filename=buildlib-1.0.0-py3-none-any.whl size=2117 sha256=5b2702eb6b89aeed6ffd9d2f22cbdac1a21d736651598e8d961a2e1ec0f7c797
  Stored in directory: /tmp/pip-ephem-wheel-cache-2upxhuv4/wheels/c5/f3/ea/faa799a07226ed5f08d2769708ad11b5735aa5e19906b6c1c9
Successfully built buildlib
Installing collected packages: buildlib
  Attempting uninstall: buildlib
    Found existing installation: buildlib 1.0.0
    Uninstalling buildlib-1.0.0:
      Successfully uninstalled buildlib-1.0.0
Successfully installed buildlib-1.0.0
Note: you may need to restart the kernel to 

In [2]:
import shutil
from pathlib import Path

import buildlib as buildlib
import pytest
from build import ProjectBuilder
from transformers.models.bert.modeling_bert import BertModel
from transformers.models.bert.tokenization_bert_fast import BertTokenizerFast




# Introduction

To create a custom text embedding service, we need to build a Docker image and deploy it. Between building and deploying, we'll also cover how to test locally to speed up the development cycle.

# Building

To build an image for our custom service, we put custom data (a `data/` directory) and logic (an `embed.py`, plus a `requirements.txt` specifying any dependencies) into the `build/` directory, and then use `buildlib` to run the build process (`buildlib.build()`).
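
Because the build expects a particular layout, a quick pre-flight check can catch a missing file before the Docker build starts. The sketch below is a hypothetical helper (`missing_build_inputs` is not part of `buildlib`), and the list of expected entries is an assumption based on the files this walkthrough places in `build/`.

```python
from pathlib import Path
from typing import List

# Assumed layout, based on the files this walkthrough places in `build/`.
EXPECTED_ENTRIES = ("data", "embed.py", "requirements.txt", "config.py")


def missing_build_inputs(build_dir: Path) -> List[str]:
    """Return the expected entries that are absent from `build_dir`."""
    return [name for name in EXPECTED_ENTRIES if not (build_dir / name).exists()]
```

Calling `missing_build_inputs(BUILD_DIR)` just before the build should return an empty list once every step has been run.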

## Clear out `build/`

Let's start with a clean slate by deleting and recreating an empty `build/`.

In [3]:
BUILD_DIR = Path(".").resolve().parents[1] / "build"
# ignore_errors=True lets this cell run even if `build/` does not exist yet.
shutil.rmtree(BUILD_DIR, ignore_errors=True)
BUILD_DIR.mkdir()

## Prepare the data

For this example, we will use the [`e5-base-v2`](https://huggingface.co/intfloat/e5-base-v2) model from Microsoft. We'll demonstrate how to preload the model weights directly into our Docker image for simpler deployment.

Rather than just using the [`transformers`](https://pypi.org/project/transformers/) library directly, we'll also demonstrate how to include a custom library in the Docker image by building a [pure Python wheel](https://packaging.python.org/en/latest/guides/distributing-packages-using-setuptools/#pure-python-wheels) and placing it in `data/` alongside the model weights. The library we use for this demonstration is `embed_lib/`, a small example library we wrote to call the E5 model via `transformers`.

### Put model weights into `build/data/`

In [4]:
# Downloading the model weights.
MODEL_NAME = "intfloat/e5-base-v2"
DATA_DIR = BUILD_DIR / "data"
TOKENISER_DIR = DATA_DIR / "tokenizer"
MODEL_DIR = DATA_DIR / "model"

print(f"Downloading {MODEL_NAME} and saving tokenizer and weights to {DATA_DIR}")

tokenizer = BertTokenizerFast.from_pretrained(MODEL_NAME)
model = BertModel.from_pretrained(MODEL_NAME)
assert isinstance(model, BertModel)  # This is for typechecking.
tokenizer.save_pretrained(TOKENISER_DIR)
model.save_pretrained(MODEL_DIR)

# Validate that our saved files work by loading from them.
tokenizer = BertTokenizerFast.from_pretrained(TOKENISER_DIR)
model = BertModel.from_pretrained(MODEL_DIR)

Downloading intfloat/e5-base-v2 and saving tokenizer and weights to /home/lmerrick/Code/embed_text_container_service/build/data


### Put packaged code into `build/data/`

In [5]:
# Package our Python package into a wheel in the `data/` directory.
wheel_filename = ProjectBuilder(Path("") / "embed_lib").build(
    distribution="wheel", output_directory=DATA_DIR
)
print(f"Built {wheel_filename}")

running bdist_wheel
running build
running build_py
running egg_info
writing src/embed_lib.egg-info/PKG-INFO
writing dependency_links to src/embed_lib.egg-info/dependency_links.txt
writing requirements to src/embed_lib.egg-info/requires.txt
writing top-level names to src/embed_lib.egg-info/top_level.txt
reading manifest file 'src/embed_lib.egg-info/SOURCES.txt'
writing manifest file 'src/embed_lib.egg-info/SOURCES.txt'
installing to build/bdist.linux-aarch64/wheel
running install
running install_lib
creating build/bdist.linux-aarch64/wheel
creating build/bdist.linux-aarch64/wheel/embed_lib
copying build/lib/embed_lib/_batch_iter_util.py -> build/bdist.linux-aarch64/wheel/embed_lib
copying build/lib/embed_lib/__init__.py -> build/bdist.linux-aarch64/wheel/embed_lib
copying build/lib/embed_lib/e5.py -> build/bdist.linux-aarch64/wheel/embed_lib
running install_egg_info
Copying src/embed_lib.egg-info to build/bdist.linux-aarch64/wheel/embed_lib-1.0.0-py3.8.egg-info
running install_scripts
c
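
Since the approach relies on the wheel being pure Python, it may be worth confirming that from the filename before copying it into the image. The sketch below is a hypothetical helper (not part of `build` or `buildlib`) that inspects the ABI and platform tags of a wheel filename; a pure Python wheel carries `none` and `any` there.

```python
def is_pure_python_wheel(wheel_filename: str) -> bool:
    """Check the ABI and platform tags of a wheel filename (name only, no directory)."""
    if not wheel_filename.endswith(".whl"):
        return False
    # Format: {name}-{version}[-{build}]-{python tag}-{abi tag}-{platform tag}.whl
    parts = wheel_filename[: -len(".whl")].split("-")
    if len(parts) < 5:
        return False
    abi_tag, platform_tag = parts[-2], parts[-1]
    return abi_tag == "none" and platform_tag == "any"


print(is_pure_python_wheel("embed_lib-1.0.0-py3-none-any.whl"))  # True
```

`ProjectBuilder.build` returns the path of the built file, so in this notebook the check could be applied as `is_pure_python_wheel(Path(wheel_filename).name)`.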

## Prepare the logic

To actually use our custom package, load our model weights, and perform text embedding, we need to implement the core embedding logic. This is done by implementing `get_embed_fn()` in a file called `embed.py`. `get_embed_fn` should load the model weights and return a function that maps a `Sequence[str]` to a single 2-D `numpy` array of dtype `np.float32`. Below we give an example.

### Implement `get_embed_fn` inside `build/embed.py`

In [6]:
%%writefile ../../build/embed.py
import logging
from typing import Callable
from typing import cast
from typing import Sequence

import embed_lib.e5
import numpy as np
from transformers.models.bert.modeling_bert import BertModel
from transformers.models.bert.tokenization_bert_fast import BertTokenizerFast

MAX_BATCH_SIZE = 4


def get_embed_fn(logger: logging.Logger) -> Callable[[Sequence[str]], np.ndarray]:
    # Load the model into memory.
    logger.info("[get_embed_fn] Loading model from disk to memory")
    tokenizer = BertTokenizerFast.from_pretrained(
        "/root/data/tokenizer", local_files_only=True
    )
    model = cast(
        BertModel, BertModel.from_pretrained("/root/data/model", local_files_only=True)
    )
    e5_model = embed_lib.e5.E5Model(tokenizer, model)

    def _embed(texts: Sequence[str]) -> np.ndarray:
        result_tensor = embed_lib.e5.embed(
            e5_model=e5_model,
            texts=texts,
            batch_size=MAX_BATCH_SIZE,
            normalize=True,
            progress_bar=False,
        )
        result_array = result_tensor.numpy().astype(np.float32)
        return result_array

    return _embed

Writing ../../build/embed.py
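
Before building, it can help to check an embedding function against the output contract described above. The following is a minimal sketch: `check_embed_fn` and `fake_embed` are hypothetical names, and the expected shape of one row per input text with `embedding_dim` columns is an assumption about what the service requires.

```python
from typing import Callable, Sequence

import numpy as np


def check_embed_fn(
    embed_fn: Callable[[Sequence[str]], np.ndarray], embedding_dim: int
) -> None:
    """Raise AssertionError if `embed_fn` breaks the assumed output contract."""
    texts = ["first text", "second text"]
    result = embed_fn(texts)
    assert isinstance(result, np.ndarray), "output must be a numpy array"
    assert result.dtype == np.float32, "output dtype must be float32"
    # Assumed shape: one row per input text, `embedding_dim` columns.
    assert result.shape == (len(texts), embedding_dim), "unexpected output shape"


# Stand-in embedder so the check runs without any model weights.
def fake_embed(texts: Sequence[str]) -> np.ndarray:
    return np.zeros((len(texts), 768), dtype=np.float32)


check_embed_fn(fake_embed, embedding_dim=768)
```

The same check could be pointed at the function returned by `get_embed_fn` in an environment where the model files are available at the paths it loads from.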


### Add a `build/requirements.txt`

We also need to specify the requirements for our embedding logic. During the build, we will populate the `BUILD_ROOT` environment variable, which enables you to include custom packages (like your `embed_lib` wheel) in your `requirements.txt` by absolute filepath.

In [7]:
%%writefile ../../build/requirements.txt

${BUILD_ROOT}/data/embed_lib-1.0.0-py3-none-any.whl

Writing ../../build/requirements.txt
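
pip expands environment variables written as `${NAME}` in requirements files, which is what makes the line above resolve to an absolute path at build time. To preview the result, the sketch below performs the same substitution with the standard library. The value `/root` is only an assumed example; the build sets the real `BUILD_ROOT`.

```python
from string import Template

requirement_line = "${BUILD_ROOT}/data/embed_lib-1.0.0-py3-none-any.whl"

# "/root" is an assumed value for illustration; the build sets the real one.
resolved = Template(requirement_line).substitute(BUILD_ROOT="/root")
print(resolved)  # /root/data/embed_lib-1.0.0-py3-none-any.whl
```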


## Configure

In [8]:
%%writefile ../../build/config.py
from service_config import Configuration

USER_CONFIG = Configuration(embedding_dim=768, max_batch_size=6)


Writing ../../build/config.py
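
The `embedding_dim` value presumably has to match the width of the vectors the model produces. As a sanity check, the sketch below reads `hidden_size` from the `config.json` that `save_pretrained` writes next to the weights. `read_hidden_size` is a hypothetical helper, and treating the hidden size as the embedding dimension is an assumption about how `embed_lib` pools the model output.

```python
import json
from pathlib import Path


def read_hidden_size(model_dir: Path) -> int:
    """Read `hidden_size` from the `config.json` stored alongside the weights."""
    config = json.loads((model_dir / "config.json").read_text())
    return int(config["hidden_size"])
```

In this notebook, `read_hidden_size(MODEL_DIR)` could be compared against the `embedding_dim` passed to `Configuration`.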


## Build!

Now that all the pieces are in place inside `build/`, we can trigger a build via `buildlib`.

In [9]:
# We now have all the pieces we need to build our service!
list(BUILD_DIR.iterdir()) + list(DATA_DIR.iterdir())

[PosixPath('/home/lmerrick/Code/embed_text_container_service/build/requirements.txt'),
 PosixPath('/home/lmerrick/Code/embed_text_container_service/build/embed.py'),
 PosixPath('/home/lmerrick/Code/embed_text_container_service/build/config.py'),
 PosixPath('/home/lmerrick/Code/embed_text_container_service/build/data'),
 PosixPath('/home/lmerrick/Code/embed_text_container_service/build/data/embed_lib-1.0.0-py3-none-any.whl'),
 PosixPath('/home/lmerrick/Code/embed_text_container_service/build/data/model'),
 PosixPath('/home/lmerrick/Code/embed_text_container_service/build/data/tokenizer')]

In [11]:
# If you are building on a Mac with an arm CPU, you may want to build natively
# for local testing (e.g. perf tests will look really slow otherwise), but remember
# to rebuild to amd64 CPU architecture below before deploying.
buildlib.build(build_dir=BUILD_DIR, platform="linux/arm64", tag="latest_arm64")

# buildlib.build(build_dir=BUILD_DIR, platform="linux/amd64", tag="latest")

#0 building with "default" instance using docker driver

#1 [internal] load .dockerignore
#1 transferring context: 2B done
#1 DONE 0.0s

#2 [internal] load build definition from Dockerfile
#2 transferring dockerfile: 1.72kB done
#2 DONE 0.0s

#3 [internal] load metadata for docker.io/library/python:3.8.16-bullseye
#3 DONE 0.8s

#4 [ 1/11] FROM docker.io/library/python:3.8.16-bullseye@sha256:e411647c253b75948394a343b13ff32b5674687df0c54187445d12ee9de2b106
#4 DONE 0.0s

#5 [internal] load build context
#5 transferring context: 439.07MB 1.3s done
#5 DONE 1.3s

#6 [ 2/11] WORKDIR /root
#6 CACHED

#7 [ 3/11] COPY ./libs ./libs
#7 CACHED

#8 [ 4/11] COPY ./service_api ./service_api
#8 CACHED

#9 [ 5/11] COPY ./service_embed_loop ./service_embed_loop
#9 CACHED

#10 [ 6/11] COPY ./services_common_code ./services_common_code
#10 CACHED

#11 [ 7/11] RUN python -m venv --copies --prompt api create venv_api     && python -m venv --copies --prompt embed_loop create venv_embed_loop     && ./venv_api
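
To avoid hard-coding the platform for local builds, one option is to derive it from the host CPU. The sketch below is a hypothetical helper (not part of `buildlib`) that maps the value reported by `platform.machine()` to the two platform strings used above.

```python
import platform
from typing import Optional

# Common `platform.machine()` values mapped to Docker platform strings.
_DOCKER_PLATFORMS = {
    "arm64": "linux/arm64",
    "aarch64": "linux/arm64",
    "x86_64": "linux/amd64",
    "amd64": "linux/amd64",
}


def docker_platform_for_host(machine: Optional[str] = None) -> str:
    """Return the Docker platform matching `machine` (defaults to this host)."""
    name = (machine or platform.machine()).lower()
    try:
        return _DOCKER_PLATFORMS[name]
    except KeyError:
        raise ValueError(f"Unrecognized CPU architecture: {name!r}") from None
```

A local build could then pass `platform=docker_platform_for_host()`, while the build intended for deployment keeps the explicit `"linux/amd64"` mentioned in the comment above.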

# Test locally

In [12]:
test_path = BUILD_DIR.parent / "testing" / "tests" / "test_end_to_end.py"
# with buildlib.run_container_context(tag="latest_arm64"):  # NOTE: You can specify the tag.
with buildlib.run_container_context():
    pytest.main([str(test_path)])

platform linux -- Python 3.8.18, pytest-7.4.2, pluggy-1.3.0
rootdir: /home/lmerrick/Code/embed_text_container_service
plugins: anyio-3.6.2
collected 1 item

../../testing/tests/test_end_to_end.py .                                 [100%]



# Deploy

Now that we have built our image and made sure it passes local tests, we can deploy our service.

In [13]:
# TODO : Fill in this section.