# Containerizing our components

We have now written all the services of our pipeline, along with the gRPC server and client that allow the components to communicate. We are now ready to containerize the components! This simply means writing a Dockerfile for each of the components as well as for the client. The Dockerfiles are all quite similar, so we will go through the creation of only one of them: the data component's Dockerfile.

There are 7 things we need to do in order for our application to be successfully containerized. We need to:
1) define the base image
2) set the working directory
3) copy the requirements.txt file into the container
4) install the dependencies
5) copy the application code into the container
6) expose the port the app runs on
7) define the command for running the application

If all 7 steps are included in the Dockerfile, our application should be successfully containerized. To simplify the writing of the Dockerfiles, we have first reorganized the project's files. After changing the file structure, it is important to update all the import statements and test that the application still runs before moving on. The testing can be done in the same way as in the previous chapter.

The new file structure is arranged so that each folder contains all the files that the corresponding container is going to need. If you are unsure which files belong in which container, you can always look at the import statements in the code. Each container is an isolated environment, so you need to make sure that every file that gets imported is also present in the container.
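Checking the import statements by hand gets tedious as the folders grow. The following is a minimal sketch of how that check could be automated with Python's standard `ast` module. The function names `imported_modules` and `classify_imports` are hypothetical helpers introduced for illustration, and the sketch assumes a flat folder of `.py` files like the data folder described here.

```python
import ast
from pathlib import Path


def imported_modules(source):
    """Return the top-level module names imported by a piece of Python source."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import a.b" contributes "a"
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # "from a.b import c" contributes "a"; relative imports are skipped
            names.add(node.module.split(".")[0])
    return names


def classify_imports(folder):
    """Hypothetical helper: for every .py file directly inside `folder`, split
    its imports into modules found in that folder and modules found elsewhere."""
    root = Path(folder)
    # Modules the folder itself provides: its .py files and its sub-folders
    local = {p.stem for p in root.glob("*.py")}
    local |= {p.name for p in root.iterdir() if p.is_dir()}
    report = {}
    for path in sorted(root.glob("*.py")):
        modules = imported_modules(path.read_text(encoding="utf-8"))
        report[path.name] = {
            "in_folder": sorted(modules & local),
            "elsewhere": sorted(modules - local),
        }
    return report
```

Calling `classify_imports` on a service folder returns one entry per file. Everything listed under `"elsewhere"` should be either part of the Python standard library, a package listed in requirements.txt, or a file that still needs to be moved into the folder.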

Now that we have a clear file structure, we can begin writing the Dockerfile.

1) Define the base image
In a Dockerfile, the base image is the image on which your own Docker image is built. It is specified using the FROM instruction at the beginning of the Dockerfile and serves as the starting point for your custom image. The base image typically includes a minimal operating system and any pre-installed software, libraries, or dependencies that your application needs to run. In our case, since we are containerizing a Python application, we get the following:

```dockerfile
FROM python:3.10-slim
```
2) Set the working directory
Setting the working directory in a Dockerfile using the WORKDIR instruction specifies the directory within the Docker container where commands will be executed. It essentially sets the context for any subsequent instructions in the Dockerfile, such as COPY, RUN, and CMD. By setting a working directory, you ensure that the application files and operations are organized within a specific path inside the container. This helps maintain a clean and predictable environment for running your application.

```dockerfile
WORKDIR /app
```
3) Copy the requirements.txt file into the container
The requirements.txt file must be present in the image before pip can install the Python dependencies it lists. Copying it on its own, before the rest of the code, also lets Docker reuse the cached dependency layer on later builds as long as requirements.txt has not changed.

```dockerfile
COPY ./requirements.txt .
```

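4) Install the dependencies
The RUN instruction executes pip while the image is being built, so every container started from the image has the same libraries and packages installed. The --no-cache-dir flag tells pip not to keep its download cache, which keeps the image smaller.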

```dockerfile
RUN pip install --no-cache-dir -r requirements.txt
```

5) Copy the application code into the container
The instruction COPY . . adds your application code to the Docker image so that the container can execute the application. It copies all files from the current directory on the host machine to the working directory inside the container. The Dockerfile is placed inside the data folder and the image is built from there, so the first "." corresponds to the data folder and all its contents. The second "." refers to the working directory inside the container.

```dockerfile
COPY . .
```

6) Expose the port the app runs on
When completing this step, make sure that you expose the same port that the service is configured to listen on in the server file. In our case, that is port 8080 for the data server.

```dockerfile
EXPOSE 8080
```

7) Define the command for running the application
We need to specify the default command to run when the container starts. This command defines the container's primary process, so that starting the container automatically runs your application.

```dockerfile
CMD ["python", "data_service_server.py"]
```

Now we have completed all 7 steps of defining the Dockerfile, leaving us with the final file:

```dockerfile
# Use an official Python runtime as a parent image
FROM python:3.10-slim

# Set the working directory inside the container
WORKDIR /app

# Copy requirements.txt first to leverage Docker cache
COPY ./requirements.txt .

# Install any dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Copy the rest of the application code
COPY . .

# Expose the port the app runs on
EXPOSE 8080

# Command to run the data service server
CMD ["python", "data_service_server.py"]
```

Most of these lines also apply to the other services. The only things you need to change are the port number and the name of the file you want to run! Also, since the client does not listen on a port, you can remove the EXPOSE line from the client's Dockerfile.
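Because only the port and the file name differ between services, the Dockerfiles could also be generated from a single template. The following is a rough sketch of that idea, not part of the project itself: `render_dockerfile` is a hypothetical helper, and `client.py` in the usage note is an assumed file name.

```python
# Template mirroring the Dockerfile above; only the EXPOSE line and the
# file name in CMD vary between services.
DOCKERFILE_TEMPLATE = """\
FROM python:3.10-slim
WORKDIR /app
COPY ./requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
{expose}CMD ["python", "{entrypoint}"]
"""


def render_dockerfile(entrypoint, port=None):
    """Hypothetical helper: return Dockerfile text for one service.

    entrypoint: name of the Python file the container should run.
    port: port the service listens on, or None to omit EXPOSE (the client).
    """
    expose = f"EXPOSE {port}\n" if port is not None else ""
    return DOCKERFILE_TEMPLATE.format(expose=expose, entrypoint=entrypoint)
```

For example, `render_dockerfile("data_service_server.py", 8080)` reproduces the instructions of the data service's Dockerfile without the comments, while `render_dockerfile("client.py")` leaves out the EXPOSE line.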

## Testing

To test the containerized applications, run the following command in the directory containing the Dockerfile:

```shell
docker build -t {desired name of image} .
```

You can pick the name of the image freely, but it is a good idea to pick something similar to the name of the service. If the image is built successfully, you can start a container from it either through Docker Desktop or using the command:

```shell
docker run {name of image}
```
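Once a container is running, one quick way to confirm that the server inside it is reachable is to try opening a TCP connection to its port. The sketch below uses only Python's standard `socket` module; `port_is_open` is a hypothetical helper. It assumes the port has been published to the host, for example by adding `-p 8080:8080` to `docker run`, since EXPOSE on its own does not make the port reachable from the host.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts
        return False
```

With the data service container running and its port published, `port_is_open("localhost", 8080)` should return `True`. This only shows that something is listening on the port, so it complements rather than replaces running the client against the service.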









