Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
18 changes: 18 additions & 0 deletions content/guides/use-case/genai-video-transcription/_index.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
---
description: Learn how to containerize a generative AI (GenAI) app that does video transcription.
keywords: python, generative ai, genai, llm, pinecone, openai, whisper, langchain
title: Generative AI - video transcription
toc_min: 1
toc_max: 2
---

In this guide you'll explore and run a containerized generative AI (GenAI) application that uses remote third-party services, like OpenAI and Pinecone, to transcribe videos and answer questions about them. You'll then learn how to containerize GenAI applications.

> **Acknowledgment**
>
> Docker would like to thank [David Cardozo](https://www.davidcardozo.com/) for
> his contribution to this guide.

Start by exploring and running an existing containerized GenAI application.

{{< button text="Explore and run a GenAI app" url="run.md" >}}
256 changes: 256 additions & 0 deletions content/guides/use-case/genai-video-transcription/containerize.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,256 @@
---
title: Containerize a GenAI video transcription app
keywords: python, generative ai, genai, llm, pinecone, openai, whisper, langchain
description: Learn how to containerize generative AI video transcription applications.
---
## Overview

In this section you're going to walk through the process of containerizing a video transcription app that uses generative AI.

## Prerequisites

Work through the steps of the [Explore and run a genAI video transcription app](run.md) section to get and configure the example application used in this section.

## Step 1: Prepare the example application

Before you walk through the process of creating Docker assets, delete the
current assets in the example application. This section only explores the
yt-whisper application, so delete the existing `Dockerfile` inside the
`docker-genai/yt-whisper/` directory.

You should now have the following contents inside the `docker-genai/yt-whisper/` directory.

```text
├── docker-genai/yt-whisper/
│ ├── scripts/
│ ├── tests/
│ ├── yt_whisper/
│ ├── README.md
│ ├── poetry.lock
│ └── pyproject.toml
```

## Step 2: Initialize Docker assets

Use `docker init` to create the necessary Docker assets to containerize your application. Inside the `docker-genai/yt-whisper/` directory, run the `docker init` command in a terminal. `docker init` provides some default configuration, but you'll need to answer a few questions about your application. For example, this application uses Streamlit to run. Refer to the following docker init example and use the same answers for your prompts.

```console
docker init
Welcome to the Docker Init CLI!

This utility will walk you through creating the following files with sensible defaults for your project:
- .dockerignore
- Dockerfile
- compose.yaml
- README.Docker.md

Let's get started!

? What application platform does your project use? Python
? What version of Python do you want to use? 3.11
? What port do you want your app to listen on? 8503
? What is the command to run your app? streamlit run yt_whisper/app.py --server.port=8503 --server.address=0.0.0.0
```

You should now have the following contents in your `docker-genai/yt-whisper/` directory.

```text
├── docker-genai/yt-whisper/
│ ├── scripts/
│ ├── tests/
│ ├── yt_whisper/
│ ├── README.md
│ ├── poetry.lock
│ ├── pyproject.toml
│ ├── README.Docker.md
│ ├── .dockerignore
│ ├── compose.yaml
│ └── Dockerfile
```

## Step 3: Explore and update the Docker assets

`docker init` creates Docker assets to help you get started. Depending on your application, you may need to modify the assets. In the following sections, you'll explore and update the Docker assets.

### Explore and update the Dockerfile

1. Open the `docker-genai/yt-whisper/Dockerfile` in a code or text editor.
2. Inspect the contents of the `Dockerfile`. First notice that the `Dockerfile`
is more extensive than the Dockerfile from
[Explore and run a genAI video transcription app](run.md). It's more
extensive because `docker init` implements several [best practices](../../../develop/develop-images/dockerfile_best-practices.md)
for creating production images. Next, notice the instructions specify a
`requirements.txt` file for the packages. This particular application uses
Poetry, so you'll need to update the relevant instructions.
3. Update the `Dockerfile` for Poetry. The following is the updated
`Dockerfile`.
```dockerfile{hl_lines=["36-40"]}
# syntax=docker/dockerfile:1

# Comments are provided throughout this file to help you get started.
# If you need more help, visit the Dockerfile reference guide at
# https://docs.docker.com/go/dockerfile-reference/

ARG PYTHON_VERSION=3.11
FROM python:${PYTHON_VERSION}-slim as base

# Prevents Python from writing pyc files.
ENV PYTHONDONTWRITEBYTECODE=1

# Keeps Python from buffering stdout and stderr to avoid situations where
# the application crashes without emitting any logs due to buffering.
ENV PYTHONUNBUFFERED=1

WORKDIR /app

# Create a non-privileged user that the app will run under.
# See https://docs.docker.com/go/dockerfile-user-best-practices/
ARG UID=10001
RUN adduser \
--disabled-password \
--gecos "" \
--home "/nonexistent" \
--shell "/sbin/nologin" \
--no-create-home \
--uid "${UID}" \
appuser

# Download dependencies as a separate step to take advantage of Docker's caching.
# Leverage a cache mount to /root/.cache/pip to speed up subsequent builds.
# Leverage a bind mount to requirements.txt to avoid having to copy them into
# into this layer.
RUN --mount=type=cache,target=/root/.cache/pip \
--mount=type=bind,source=pyproject.toml,target=pyproject.toml \
--mount=type=bind,source=poetry.lock,target=poetry.lock \
--mount=type=bind,source=yt_whisper,target=yt_whisper \
--mount=type=bind,source=README.md,target=README.md \
python -m pip install .

# Switch to the non-privileged user to run the application.
USER appuser

# Copy the source code into the container.
COPY . .

# Expose the port that the application listens on.
EXPOSE 8503

# Run the application.
CMD streamlit run yt_whisper/app.py --server.port=8503 --server.address=0.0.0.0
```
4. Save and close the file.

### Explore and update the .dockerignore file

One new file that you haven't explored yet is the `.dockerignore` file. The `.dockerignore` file excludes files and directories from the image. For more details, see [.dockerignore files](../../../build/building/context.md#dockerignore-files).

1. Open the `docker-genai/yt-whisper/.dockerignore` file in a code or text
editor.
2. Inspect the contents of the `.dockerignore` file. Notice that it has
`README.md`, but you just specified copying this file in the previous
`Dockerfile`.
3. Update the `.dockerignore` file and remove `README.md`. The following is the
updated `.dockerignore` file.
```text
# Include any files or directories that you don't want to be copied to your
# container here (e.g., local build artifacts, temporary files, etc.).
#
# For more help, visit the .dockerignore file reference guide at
# https://docs.docker.com/go/build-context-dockerignore/

**/.DS_Store
**/__pycache__
**/.venv
**/.classpath
**/.env
**/.git
**/.gitignore
**/.project
**/.settings
**/.toolstarget
**/.vs
**/.vscode
**/*.*proj.user
**/*.dbmdl
**/*.jfm
**/bin
**/charts
**/docker-compose*
**/compose*
**/Dockerfile*
**/node_modules
**/npm-debug.log
**/obj
**/secrets.dev.yaml
**/values.dev.yaml
LICENSE
```
4. Save and close the file.

### Explore and update the Compose file

1. Open the `docker-genai/yt-whisper/compose.yaml` file in a code or text
editor.
2. Inspect the contents of the `compose.yaml` file. Notice that the environment
variables file isn't specified. Also, notice that only one service named `server` is specified.
3. Update the `compose.yaml` file and add the environment variables file. The
following is the updated `compose.yaml` file.
```yaml{hl_lines=["16-17"]}
# Comments are provided throughout this file to help you get started.
# If you need more help, visit the Docker Compose reference guide at
# https://docs.docker.com/go/compose-spec-reference/

# Here the instructions define your application as a service called "server".
# This service is built from the Dockerfile in the current directory.
# You can add other services your application may depend on here, such as a
# database or a cache. For examples, see the Awesome Compose repository:
# https://github.com/docker/awesome-compose
services:
server:
build:
context: .
ports:
- 8503:8503
env_file:
- ../.env
# ...
```

## Step 4: Run the containerized generative AI application

To run the application, in a terminal, change directory to the `docker-genai/yt-whisper/` directory and run the following command.

```console
docker compose up --build
```

Docker builds the image and runs it as a container. Once the container starts, you can access the yt-whisper application by opening a web browser and navigating to [localhost:8503](http://localhost:8503).

Only the service for the yt-whisper app is started. You should be unable to access the bot at [localhost:8504](http://localhost:8504).

Stop the container by pressing `CTRL`+`C` in the terminal.

To start both the bot and yt-whisper services, you can use the exisiting Compose file in the root of `docker-genai/` directory. Change directory to `docker-genai/` and run the following command to bring up both services.

```console
docker compose up --build
```

Stop the containers by pressing `CTRL`+`C` in the terminal.

## Summary

In this section you learned how to containerize and run a generative AI application.

Related information:
* [docker init CLI reference](../../../engine/reference/commandline/init.md)
* [Dockerfile reference](/engine/reference/builder/)
* [Compose overview](../../../compose/_index.md)
* [.dockerignore files](../../../build/building/context.md#dockerignore-files)

## Next steps

* Try to containerize the docker-bot in the example application using `docker
init`.
* See the [Python language-specific guide](../../../language/python/_index.md) to learn how to configure CI/CD and deploy Python apps locally to Kubernetes.
* See the [GenAI Stack](https://github.com/docker/genai-stack) demo applications for more examples of containerized generative AI applications.
Loading