# Writing Your Own Docker Images
  
Once you are able to manage images and containers, it’s time to create your own. In chapter 3, you’ll build your own images using Dockerfiles. Dockerfiles are text files that include everything needed for Docker to build an image. You’ll learn how to create images and will get an introduction to all the essential Dockerfile instructions like FROM, RUN, COPY, and more. By the end of this chapter, you’ll have insight into how Docker makes images and be able to create optimized Docker images from scratch.

## Resources
  
**Notebook Syntax**
  
<span style='color:#7393B3'>NOTE:</span>  
- Denotes additional information deemed to be *contextually* important
- Colored in blue, HEX #7393B3
  
<span style='color:#E74C3C'>WARNING:</span>  
- Significant information that is *functionally* critical  
- Colored in red, HEX #E74C3C
  
---
  
**Links**
  
[Docker Website](https://www.docker.com)  
  
---
  
**Notable Functions**
  
<table>
  <tr>
    <th>Index</th>
    <th>Operator</th>
    <th>Use</th>
  </tr>
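  <tr>
    <td>1</td>
    <td><code>docker build</code></td>
    <td>Builds an image from a Dockerfile; <code>-t</code> gives the image a name and optional version</td>
  </tr>
  <tr>
    <td>2</td>
    <td><code>FROM</code></td>
    <td>Sets the image to start from</td>
  </tr>
  <tr>
    <td>3</td>
    <td><code>RUN</code></td>
    <td>Runs a shell command while building the image</td>
  </tr>
  <tr>
    <td>4</td>
    <td><code>COPY</code></td>
    <td>Copies local files or folders into the image</td>
  </tr>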
</table>
  
---
  
**Language and Library Information**  
  
CLI (Command Line Interface)
  
---
  
**Miscellaneous Notes**
  
NaN

## Creating your own Docker images
  
Now that we know how to work with Docker images and containers, it's time to create our own images.
  
**Creating images with Dockerfiles**
  
Docker images are the recipes or blueprints for Docker containers. To create this blueprint, we must write down a list of instructions in what is called a Dockerfile. A Dockerfile is a text file containing all the commands we would run in the command line to install the software we need, with the addition of some Docker-specific syntax. By default, Docker looks for a file named `Dockerfile`, so that is what we should call it.
  
<center><img src='../_images/creating-your-own-docker-images.png' alt='img' width='740'></center>
  
**Starting a Dockerfile**
  
Just like following a recipe, Docker runs the lines in a Dockerfile from top to bottom. A Dockerfile starts with the `FROM` instruction (only comments, parser directives, and `ARG` instructions may come before it). This instruction tells Docker which image to start from. We can base our images on any other image: Postgres, Ubuntu, another image we made ourselves, or even the hello-world image. As with pulling an image, if we want to start from a specific version, we can specify the version right after the image name, separating the two with a colon.
  
<center><img src='../_images/creating-your-own-docker-images1.png' alt='img' width='740'></center>
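
As a minimal sketch of the `FROM` instruction described above, the shell snippet below writes a one-line Dockerfile. The folder name `from_demo` and the version `22.04` are assumptions chosen for illustration; they do not come from the slide.

```shell
# Hypothetical scratch folder, so that no existing Dockerfile is overwritten.
mkdir -p from_demo

# Write a Dockerfile whose only instruction starts from a specific Ubuntu version.
# The part after the colon is the version; 22.04 is an assumed example.
cat > from_demo/Dockerfile <<'EOF'
FROM ubuntu:22.04
EOF

# Print the file to check its contents.
cat from_demo/Dockerfile
```

Leaving out the colon and version (`FROM ubuntu`) starts from the image's default version instead.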
  
**Building a Dockerfile**
  
With the `FROM` instruction, we can create the most basic Dockerfile; we can then create an image from this Dockerfile using the `docker build` command. The `docker build` command is followed by the location of the folder containing the Dockerfile we want to build. If our Dockerfile is in the current folder, this is simply a dot. When running `docker build`, the last line of the output shows the ID, or hash, Docker assigns to the new image. The hash starts by indicating its type, sha256 in this case. This is followed by the unique hash, which starts with a67f for the example on the slide.
  
<center><img src='../_images/creating-your-own-docker-images2.png' alt='img' width='740'></center>
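
The build step can be sketched as follows. The folder name `build_demo` is a hypothetical choice, and the check for the `docker` command is only there so the snippet also runs on machines without Docker; it is not part of the course material.

```shell
# Hypothetical folder holding the most basic Dockerfile.
mkdir -p build_demo
echo "FROM ubuntu" > build_demo/Dockerfile

# Build only where the docker CLI is installed.
# The argument is the folder containing the Dockerfile; from inside
# that folder the equivalent command would be: docker build .
if command -v docker >/dev/null 2>&1; then
    docker build build_demo || echo "docker build failed (is the Docker daemon running?)"
else
    echo "docker not found; skipping the build"
fi
```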
  
**Naming our image**
  
If we want to give our image a more recognizable name, we can use the `-t` (tag) flag followed by the name we want to give our image. If we also want to give our image a version, we can add a colon and the version after the image name. In both cases, we end the `docker build` command with a dot, indicating our Dockerfile is in the current working directory. Once Docker has successfully built our image from our Dockerfile, we can run and use it just like the images we downloaded from Docker Hub.
  
<center><img src='../_images/creating-your-own-docker-images3.png' alt='img' width='740'></center>
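
A small sketch of naming an image while building it. The image name `my_image`, the version `v1`, and the folder `tag_demo` are hypothetical values, and the `docker` availability check is an addition so the snippet can run anywhere.

```shell
# Hypothetical folder with a minimal Dockerfile to build.
mkdir -p tag_demo
echo "FROM ubuntu" > tag_demo/Dockerfile

# Name and version are joined with a colon, as in name:version.
image_name="my_image"   # hypothetical name
image_version="v1"      # hypothetical version
image_ref="${image_name}:${image_version}"

# -t assigns the name (and version) to the image that is built.
if command -v docker >/dev/null 2>&1; then
    docker build -t "$image_ref" tag_demo || echo "docker build failed (is the Docker daemon running?)"
else
    echo "docker not found; would run: docker build -t $image_ref tag_demo"
fi
```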
  
**Customizing images**
  
Now that we can create a very basic image from a Dockerfile, the next step is to start customizing our image. To do this, we use the `RUN` instruction, which lets us run any valid shell command while building an image. To make an image that runs a Python data analysis, we start from the ubuntu image, which has Ubuntu installed, by specifying the `FROM ubuntu` instruction, followed by `RUN apt-get update`. `apt-get` is a package manager that enables us to install all kinds of software. The `apt-get update` command refreshes its package lists so it knows the most up-to-date version of all the software it can install for us. Using another `RUN` instruction on the following line, we install Python using `RUN apt-get install python3`. As we can see at the bottom of the slide, some shell commands require user input. While a Docker image is building, it is not possible to manually give any input to the commands Docker runs. Instead, we can pass the `-y` flag to `apt-get install` so that it answers yes to its prompts and doesn't need any input.
  
<center><img src='../_images/creating-your-own-docker-images4.png' alt='img' width='740'></center>
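
Put together, the instructions above give a three-line Dockerfile. The sketch below writes it into a hypothetical `run_demo` folder; only the folder name is an assumption.

```shell
# Hypothetical scratch folder for this example.
mkdir -p run_demo

# Each RUN line is executed as a shell command while the image is built.
cat > run_demo/Dockerfile <<'EOF'
FROM ubuntu
RUN apt-get update
RUN apt-get install -y python3
EOF

# Print the file to check its contents.
cat run_demo/Dockerfile
```

Without `-y`, `apt-get install` would wait for a confirmation that nobody can give during a build.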
  
**Building a non-trivial Dockerfile**
  
Once we add `RUN` instructions to our Dockerfiles, we'll notice that building a Dockerfile can take anywhere from seconds to tens of minutes, because Docker actually runs the commands specified with `RUN`. For example, building a Dockerfile with `FROM ubuntu` and `RUN apt-get update` will take about as long as running `apt-get update` ourselves on Ubuntu, which is 2 seconds for the example on the slide.
  
<center><img src='../_images/creating-your-own-docker-images5.png' alt='img' width='740'></center>
  
**Summary**
  
Here is a summary of the new commands and instructions you can refer back to when completing the exercises.
  
<center><img src='../_images/creating-your-own-docker-images6.png' alt='img' width='740'></center>
  
**Let's practice!**
  
Now that we've seen how to write a basic Dockerfile, let's give it a go ourselves!

### Building your first image
  
Let's build your first image! We've created a Dockerfile for you, and you can see it in your current working directory using the `ls` command. You can look at its content using `cat Dockerfile` or using `nano`.
  
---
  
1. Using the terminal, enter the command to build an image from the Dockerfile in your current working directory.
2. Well done! While it's possible to build an image without naming it, we usually want to give our image a name. Using the terminal, enter the command to build an image called `my_first_image` from the Dockerfile in your current working directory.

In [None]:
%%sh
docker build .
docker build -t my_first_image .

Nicely built! Now that you know how to build an image from a Dockerfile, let's try adding some instructions to a Dockerfile.

### Working in the command-line
  
A Dockerfile is just a text file, and creating or editing it can be done using any text editor. However, since the default way to work with Docker is through the Command Line Interface, it's convenient to also edit Dockerfiles using the command line. Let's refresh our memory on how to navigate the file system and create or edit a Dockerfile with the command line.
  
---
  
1. Create a file called `Dockerfile` in the current working directory.
- Use `touch Dockerfile`; the `touch` command will create an empty file for you.
- Or use `nano Dockerfile`, which opens an empty file in the `nano` text editor. Save it using `CTRL+S`, after which you can exit with `CTRL+X`.
2. Now that you've created a new file let's add a line of text to it.
- Open the file using `nano Dockerfile`.
- Add `FROM ubuntu` to the start of the file.
- Use `CTRL+S` to save your changes.
- Followed by `CTRL+X` to exit `nano`.
3. Using `nano` to edit a file is often the most intuitive way; however, you can also use `echo` combined with the append redirection operator (`>>`) to append to files without opening them. Let's use `echo` to append `RUN apt-get update` to our Dockerfile.
- Type the first part of the command, `echo "RUN apt-get update"`, which prints the text between the quotes. Don't press enter yet.
- Then add the `>>` operator, which redirects the output and appends it to a file.
- Follow it with `Dockerfile` so that the output of `echo` is appended to the Dockerfile.
- Now execute the command by pressing the enter key.
4. Well done! You successfully created and made changes to a file. Often while working in the shell, you want to quickly check the contents of a file without making changes to it. This is easily done using the `cat` command.
- Check the contents of the Dockerfile using the `cat` command, passing the filename as its argument.

In [None]:
%%sh
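# Create an empty file
touch Dockerfile
# Interactive step: add "FROM ubuntu" in nano, then save and exit
nano Dockerfile
# Append a RUN instruction without opening the file
echo "RUN apt-get update" >> Dockerfile
# Print the contents of the file
cat Dockerfile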
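
For comparison, here is a sketch of the same file written without an interactive editor. It relies on the difference between `>` (create or overwrite) and `>>` (append); the folder name `redirect_demo` is a hypothetical choice so that the exercise's own Dockerfile is left alone.

```shell
# Hypothetical scratch folder for this example.
mkdir -p redirect_demo

# A single > creates the file, or overwrites it if it already exists.
echo "FROM ubuntu" > redirect_demo/Dockerfile

# A double >> appends to the end of the file instead of overwriting it.
echo "RUN apt-get update" >> redirect_demo/Dockerfile

# Print the file and count its lines.
cat redirect_demo/Dockerfile
wc -l < redirect_demo/Dockerfile
```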

Well done! Now that you've refreshed your memory on how to work with `touch`, `nano`, and `cat`, let's get back to using Docker and writing Dockerfiles.

### Editing a Dockerfile
  
Let's get familiar with the `RUN` instruction. We've created a Dockerfile for you. You can look at its content using `cat Dockerfile` or using `nano`. Like before, the Dockerfile already has a `FROM` instruction, but you'll be adding a `RUN` instruction this time.
  
---
  
1. Add the correct instruction to the end of the Dockerfile so that the `mkdir my_app` shell command is run when building the Dockerfile.
2. Using the terminal, run the command to build an image called `my_app` from the Dockerfile in your current working directory.

In [None]:
%%sh
echo "RUN mkdir my_app" >> Dockerfile
docker build -t my_app .

Nicely done! Now that you know how to build and edit Dockerfiles, let's create one from scratch.

### Creating your own Dockerfile
  
While it's possible to download images for many use cases, often an image is not exactly what you need. In that case, all you need to do is create a new image based on an existing image that comes close to your use case. Let's go through every step to create a Dockerfile from scratch that builds on top of the existing ubuntu image, add instructions, and then build it into an image.
  
---
  
1. Create a file called Dockerfile in the current working directory.
2. Add the first instruction to the Dockerfile so that it will build on top of the `ubuntu` image.
3. Add instructions to the Dockerfile so that it runs `apt-get update` and `apt-get install -y python3` when building the Dockerfile.
4. Using the terminal, run the command to build an image called `my_python_image` from the Dockerfile in your current working directory.

In [None]:
%%sh
touch Dockerfile
echo "FROM ubuntu" >> Dockerfile
echo "RUN apt-get update" >> Dockerfile; echo "RUN apt-get install -y python3" >> Dockerfile
docker build -t my_python_image .

From start to finish, well done! Now that we can create an image with python3, we'll see how we can add code to our images.

## Managing files in your image
  
We've seen the basics of building images; now we'll learn how to add files to our images.
  
**COPYing files into an image**
  
The `RUN` instruction allowed us to execute shell commands while building an image, but we can't use it to copy files from our local file system into the image we're building. To copy local files into our image, we use the `COPY` instruction. The `COPY` instruction needs two parameters: the first is the path of the file we want to copy, including the file's name; the second is the destination path inside the image. We can choose whether to end the destination path with a filename. If the destination ends with a slash instead of a filename, the file keeps its original name.
  
<center><img src='../_images/managing-files-in-your-image.png' alt='img' width='740'></center>
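
The two destination styles can be sketched like this. The folder `copy_file_demo`, the empty `pipeline.py` file, and the new name `run_pipeline.py` are hypothetical and serve only as an illustration.

```shell
# Hypothetical scratch folder with one local file to copy.
mkdir -p copy_file_demo
touch copy_file_demo/pipeline.py

cat > copy_file_demo/Dockerfile <<'EOF'
FROM ubuntu
# Destination ends with a slash: the file keeps its name (/app/pipeline.py).
COPY pipeline.py /app/
# Destination ends with a filename: the file is copied under that name.
COPY pipeline.py /app/run_pipeline.py
EOF

# Print the file to check its contents.
cat copy_file_demo/Dockerfile
```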
  
**COPYing folders**
  
If we don't specify a filename in the source path, then instead of just a single file, the entire contents of the folder will be copied, including sub-folders. For example, if we have a folder called pipeline_v3 with two files and a sub-folder with one file, we can copy both files and the subfolder with its file using the `COPY` instruction ending in pipeline_v3/.
  
<center><img src='../_images/managing-files-in-your-image1.png' alt='img' width='740'></center>
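
To make the folder case concrete, the sketch below recreates a `pipeline_v3` folder with two files and one sub-folder, next to a Dockerfile that copies all of it. The individual file names and the outer folder `copy_folder_demo` are assumptions.

```shell
# Hypothetical project layout: two files plus a sub-folder with one file.
mkdir -p copy_folder_demo/pipeline_v3/tests
touch copy_folder_demo/pipeline_v3/pipeline.py
touch copy_folder_demo/pipeline_v3/requirements.txt
touch copy_folder_demo/pipeline_v3/tests/test_pipeline.py

# The source path names only the folder, so its whole contents,
# including the sub-folder, end up under /app/ in the image.
cat > copy_folder_demo/Dockerfile <<'EOF'
FROM ubuntu
COPY pipeline_v3/ /app/
EOF

# List every file that this COPY instruction would pick up.
find copy_folder_demo/pipeline_v3 -type f | sort
```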
  
**Copy files from a parent directory**
  
It is not possible to copy files from a parent directory when building a Dockerfile, because `COPY` can only reach files inside the folder passed to `docker build` (the build context). For example, let's say we are in the projects folder when we run `docker build`. A `COPY` instruction in the Dockerfile that tries to copy the `init.py` file from the parent directory of the current directory into the image will fail with the not found message we can see on the slide.
  
<center><img src='../_images/managing-files-in-your-image2.png' alt='img' width='740'></center>
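
One possible workaround, sketched under assumed folder names, is to bring the file into the folder we build from before running the build. The layout `parent_demo/projects` and the contents of `init.py` are hypothetical.

```shell
# Hypothetical layout: init.py lives one level above the projects folder.
mkdir -p parent_demo/projects
echo "print('init')" > parent_demo/init.py

# A Dockerfile in projects/ containing "COPY ../init.py /app/" would fail,
# because ../init.py is outside the folder passed to docker build.

# Workaround: copy the file next to the Dockerfile first.
cp parent_demo/init.py parent_demo/projects/init.py

cat > parent_demo/projects/Dockerfile <<'EOF'
FROM ubuntu
COPY init.py /app/
EOF

# Both the Dockerfile and init.py are now inside the projects folder.
ls parent_demo/projects
```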
  
**Downloading files**
  
Another common way to include files in an image is to download them during the image build. While there is an instruction that allows us to do this, the `ADD` instruction, best practice is to use `RUN` instructions and shell commands to download and unzip files. First, use `curl` to download the file to a directory in the image. Then, if it is an archive, unzip it using the `unzip` command. Finally, once we don't need the zip file anymore, we can remove it with the `rm` command.
  
<center><img src='../_images/managing-files-in-your-image3.png' alt='img' width='740'></center>
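
A sketch of the three steps as separate `RUN` instructions. The URL and the `/pipeline_final.zip` path are taken from the exercise later in this chapter; the folder name `download_demo` and the install line for `curl` and `unzip` are assumptions about what the base image needs.

```shell
# Hypothetical scratch folder for this example.
mkdir -p download_demo

cat > download_demo/Dockerfile <<'EOF'
FROM ubuntu
RUN apt-get update
RUN apt-get install -y curl unzip
# Download the archive, unpack it, then delete the archive.
RUN curl https://assets.datacamp.com/production/repositories/6082/datasets/31a5052c6a5424cbb8d939a7a6eff9311957e7d0/pipeline_final.zip -o /pipeline_final.zip
RUN unzip /pipeline_final.zip
RUN rm /pipeline_final.zip
EOF

# Print the file to check its contents.
cat download_demo/Dockerfile
```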
  
**Downloading files efficiently**
  
Any instruction in a Dockerfile that downloads files will add to the size of the image, even if the files are removed in a later instruction. To ensure images don't become unnecessarily big, we should download, unzip, and remove the original file in a single `RUN` instruction. This can be done by chaining the commands using a backslash and a double ampersand. The backslash lets a shell command span multiple lines, which keeps our Dockerfile readable. The double ampersand tells the shell to run the commands one after the other, continuing only if the previous command succeeded. Combining them allows us to create a single `RUN` instruction that is still easy to read over multiple lines. By following this best practice for downloading and unpacking archives, we keep our image as small as possible, making it easier to share and faster to download.
  
<center><img src='../_images/managing-files-in-your-image4.png' alt='img' width='740'></center>
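
The chained form can be sketched as below, reusing the URL from the exercise later in this chapter. The folder name `single_run_demo` is hypothetical, and combining the two `apt-get` commands with `&&` is an extra illustration of the same technique.

```shell
# Hypothetical scratch folder for this example.
mkdir -p single_run_demo

# The quoted 'EOF' keeps the backslashes exactly as written in the Dockerfile.
cat > single_run_demo/Dockerfile <<'EOF'
FROM ubuntu
RUN apt-get update && apt-get install -y curl unzip
# Download, unzip, and remove the archive in one RUN instruction.
RUN curl https://assets.datacamp.com/production/repositories/6082/datasets/31a5052c6a5424cbb8d939a7a6eff9311957e7d0/pipeline_final.zip -o /pipeline_final.zip \
    && unzip /pipeline_final.zip \
    && rm /pipeline_final.zip
EOF

# Print the file to check its contents.
cat single_run_demo/Dockerfile
```

Because the archive is removed in the same instruction that downloads it, it is not stored in the resulting image.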
  
**Summary**
  
Here is a summary of the new commands and instructions you can refer back to when completing the exercises.
  
<center><img src='../_images/managing-files-in-your-image5.png' alt='img' width='740'></center>
  
**Let's practice!**
  
Now that we know how to handle files in images, let's get some practice!

### Copying files into an image
  
You've created an Ubuntu and python3-based image to run your data pipeline. Update your Dockerfile so your image includes the pipeline.py file in which you defined the pipeline.
  
---
  
1. To the end of the Dockerfile, add the Docker instruction that copies the `pipeline.py` file in your current working directory (/home/repl) to the /app folder in the image you want to build.

In [None]:
%%sh
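# COPY sources are resolved relative to the folder we build from (here /home/repl), so a relative path is used
echo "COPY pipeline.py /app/" >> Dockerfile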

Nice addition! Copying is just the first step of managing files in an image. Let's try some harder file management next.

### Copying folders
  
After creating an ubuntu and python3 image with your pipeline python code in it, you realize you actually need your entire pipeline_v3 project in the Docker image to be able to install its dependencies. There is a Dockerfile in the current working directory to start from that already has python3 installed.
  
---
  
1. Add the instruction to copy all pipeline_v3 project files into the `/app` directory in your Docker image. You can find the files in the `/pipeline_v3/` directory, which is in the current working directory on your local machine.
2. Using the terminal, run the command to build an image called `pipeline_v3` from the Dockerfile in your current working directory.

In [None]:
%%sh
echo "COPY pipeline_v3/ /app/" >> Dockerfile
docker build -t pipeline_v3 .

Nice one! You've copied files and folders, now let's try downloading them.

### Working with downloaded files
  
Your previous image worked, and you were able to finalize your pipeline python code! You can now create the next version of your image. Let's create a Dockerfile from scratch, add instructions and then build it.
  
---
  
1. Create a file called Dockerfile in the current working directory.
2. Add the first instruction to the Dockerfile so that it will build on top of the `ubuntu` image, then add instructions so that it runs `apt-get update` and `apt-get install -y python3 curl unzip`.
3. Add instructions to the Dockerfile to:
- Download the zip file from `https://assets.datacamp.com/production/repositories/6082/datasets/31a5052c6a5424cbb8d939a7a6eff9311957e7d0/pipeline_final.zip` to `/pipeline_final.zip`.
- Unzip the file.
- Remove the zip file.
- You can use three separate instructions or combine them into a single instruction to keep your image smaller.
4. Using the terminal, run the command to build an image called `pipeline` from the Dockerfile in your current working directory.

In [None]:
%%sh
# Create file
touch Dockerfile
# Adding instructions to get the image
echo "FROM ubuntu" >> Dockerfile
echo "RUN apt-get update" >> Dockerfile
echo "RUN apt-get install -y python3 curl unzip" >> Dockerfile
# Append the RUN instructions that download, unzip, and remove the archive.
# Separate echo calls avoid `echo -e`, which is not portable across sh implementations.
echo "RUN curl https://assets.datacamp.com/production/repositories/6082/datasets/31a5052c6a5424cbb8d939a7a6eff9311957e7d0/pipeline_final.zip -o /pipeline_final.zip" >> Dockerfile
echo "RUN unzip /pipeline_final.zip" >> Dockerfile
echo "RUN rm /pipeline_final.zip" >> Dockerfile
# Alternatively:
# 1. Open the Dockerfile using `nano Dockerfile`.
# 2. Add `RUN curl https://assets.datacamp.com/production/repositories/6082/datasets/31a5052c6a5424cbb8d939a7a6eff9311957e7d0/pipeline_final.zip -o /pipeline_final.zip` to the end of the Dockerfile.
# 3. Add `RUN unzip /pipeline_final.zip` to the end of the Dockerfile.
# 4. Add `RUN rm /pipeline_final.zip` to the end of the Dockerfile.
# 5. Save the file and exit nano using CTRL+S and CTRL+X
# Build pipeline
docker build -t pipeline .

Great job! You've shown that you can work with files when building your own images.