<a href="https://colab.research.google.com/github/micah-shull/AI_Agents/blob/main/125_Containerization.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

As a data scientist moving into building **AI agents** or more production-grade systems, learning about **containerization** and **CI/CD** will make your workflows much more reliable and easier to scale. Let’s break things down step by step.

---

## 1. Why Containerization Matters

Containerization is about **packaging your code with all its dependencies** so it runs consistently across different machines.

* **The Problem**: Your model/code works on your laptop, but when deployed on a server it fails because of version mismatches (Python, CUDA, TensorFlow, etc.).
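* **The Solution (Docker)**: Docker packages your code into a *container image*. Containers started from that image feel like lightweight virtual machines, but they share the host's kernel instead of booting their own OS, and they run your code the same way everywhere.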

### Core Docker Concepts

* **Image**: A blueprint (e.g., Ubuntu + Python + your packages + your code).
* **Container**: A running instance of an image.
* **Dockerfile**: A script defining how to build an image.
* **Docker Hub**: A registry to share images (like GitHub for code).

---

## 2. CI/CD (Continuous Integration / Continuous Deployment)

CI/CD is about **automating testing and deployment**:

* **CI (Continuous Integration)**: Every time you push code to GitHub (or GitLab, etc.), tests automatically run (e.g., unit tests, linting, model checks).
* **CD (Continuous Deployment)**: If the tests pass, your code can automatically be deployed (for example, to a server, Kubernetes cluster, or cloud service).

For AI projects, this means:

* Your model training pipeline can be automatically tested.
* Data preprocessing code is validated.
* Deployment pipelines ensure your API or agent is always up-to-date.
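
To make the CI idea concrete, here is a minimal sketch of what a pipeline does on each push: run a list of checks in order and stop at the first failure, so the deployment step only happens when everything passes. The `run_checks` helper and the stand-in commands are hypothetical; a real pipeline would call `pytest`, a linter, and so on.

```bash
#!/usr/bin/env bash
# Hypothetical helper: run each check in order and stop at the first failure,
# which is what a CI job does before it allows a deployment step.
run_checks() {
  for check in "$@"; do
    echo "running: $check"
    if ! sh -c "$check"; then
      echo "FAILED: $check"
      return 1
    fi
  done
  echo "all checks passed"
}

# Stand-in commands; in a real project these might be "pytest" or a linter.
run_checks "true" "test 2 -gt 1" && echo "deploy step would run"
run_checks "true" "false" "echo never reached" || echo "deploy step skipped"
```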

---

## 3. How They Connect

* Docker ensures **your code runs reliably**.
* CI/CD ensures **your code is always tested and deployed automatically**.
* Together, they allow you to:

  * Train models reproducibly.
  * Deploy APIs/agents without “it works on my machine” issues.
  * Collaborate with engineers more smoothly.

---

## 4. Where to Start

Here’s a roadmap tailored for a data scientist:

1. **Learn Docker basics**

   * Write a simple `Dockerfile` for a Python script.
   * Build and run a container.
   * Push/pull images from Docker Hub.

2. **Use Docker in your ML workflow**

   * Containerize a training script with dependencies (NumPy, PyTorch, etc.).
   * Test running it on a different machine.

3. **Intro to CI/CD (with GitHub Actions)**

   * Set up a GitHub repo with a small Python script.
   * Add a GitHub Actions workflow that runs `pytest` on every push.

4. **Combine Docker + CI/CD**

   * Use GitHub Actions to build and push a Docker image automatically.
   * Deploy that image to a cloud service (AWS, GCP, Azure, or even Hugging Face Spaces).

5. **Level up: Orchestration & Scaling**

   * Once comfortable: learn Kubernetes (for scaling containers).
   * Tools like **MLflow, Airflow, or Prefect** integrate nicely with CI/CD pipelines.
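
Step 3 of the roadmap could look roughly like the sketch below, which writes a minimal GitHub Actions workflow that runs `pytest` on every push. The `demo-repo` directory and the `ci.yml` file name are placeholders; the only fixed convention is that workflow files live under `.github/workflows/`.

```bash
#!/usr/bin/env bash
# Sketch: create a minimal GitHub Actions workflow that runs pytest on push.
# "demo-repo" and "ci.yml" are placeholder names chosen for this example.
repo_dir="${TMPDIR:-/tmp}/demo-repo"
mkdir -p "$repo_dir/.github/workflows"

cat > "$repo_dir/.github/workflows/ci.yml" <<'EOF'
name: ci
on: [push]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.10"
      - run: pip install pytest
      - run: pytest
EOF

echo "wrote $repo_dir/.github/workflows/ci.yml"
```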

---

## 5. Recommended Resources

* **Docker**:

  * [Docker Official Tutorial](https://docs.docker.com/get-started/)
  * "Docker for Data Scientists" blog posts / YouTube walkthroughs.

* **CI/CD**:

  * [GitHub Actions docs](https://docs.github.com/en/actions)
  * "CI/CD for Machine Learning" by Google Cloud or MLOps blogs.

* **MLOps Angle**:

  * *Practical MLOps* (book by Noah Gift)
  * Courses on full-stack ML deployment.






## 1. **Image** – The Blueprint

* **What it is**:
  An image is like a *recipe* or a *frozen snapshot* of an environment. It includes:

  * A base operating system (Ubuntu, Debian, Alpine, etc.)
  * Installed software (Python, CUDA, scikit-learn, PyTorch, etc.)
  * Your own code and files

* **Analogy**: Think of an **image as a cake recipe**. Once the recipe is written, you can bake (run) as many cakes (containers) as you want.

* **Example for ML**:
  You create an image that has:

  ```text
  Ubuntu + Python 3.10 + NumPy + Pandas + scikit-learn + your preprocessing script
  ```

  Anyone who pulls this image will have exactly that setup, no matter their computer.

---

## 2. **Container** – A Running Instance of an Image

* **What it is**:
  A container is like **a virtualized environment** running on your machine. It’s *based on an image*, but it’s live, with processes running inside.

* **Key idea**: You can run **multiple containers from the same image**. They are lightweight, isolated, and can be started/stopped in seconds.

* **Analogy**: If the image is the *recipe*, then the container is the *actual cake* you bake. You can bake many cakes from the same recipe.

* **Example for ML**:

  * Run a Jupyter Notebook inside a container.
  * Run an inference API in a container.
  * Run your preprocessing pipeline in one container and training in another.

---

## 3. **Dockerfile** – The Recipe

* **What it is**:
  A `Dockerfile` is just a text file with step-by-step instructions for building an image.

* **Structure** (simplified):

  ```dockerfile
  # Start with a base image
  FROM python:3.10-slim  

  # Install dependencies
  RUN pip install numpy pandas scikit-learn  

  # Copy your code into the container
  COPY my_script.py /app/my_script.py  

  # Set the working directory
  WORKDIR /app  

  # Command to run when the container starts
  CMD ["python", "my_script.py"]
  ```

* **Analogy**: The `Dockerfile` is your **recipe card**. If you hand it to someone, they can bake the same cake (build the same image).

* **Why it matters**:

  * Reproducibility: Same build steps every time.
  * Automation: Images can be rebuilt automatically in CI/CD pipelines.
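
As a rough sketch of how a Dockerfile turns into a running container, the script below assembles a build context and, if Docker is reachable, builds and runs it. It uses a trimmed version of the Dockerfile above (the `pip install` line is left out to keep the build fast); the script contents, the `docker-demo` directory, and the `my-script` tag are made up for illustration.

```bash
#!/usr/bin/env bash
# Sketch: assemble a build context, then build and run the image.
# The script body, directory name, and image tag are placeholders.
ctx="${TMPDIR:-/tmp}/docker-demo"
mkdir -p "$ctx"

printf 'print("hello from the container")\n' > "$ctx/my_script.py"

cat > "$ctx/Dockerfile" <<'EOF'
FROM python:3.10-slim
COPY my_script.py /app/my_script.py
WORKDIR /app
CMD ["python", "my_script.py"]
EOF

# Build and run only if a Docker daemon is reachable; otherwise just report.
if command -v docker >/dev/null 2>&1 && docker info >/dev/null 2>&1; then
  docker build -t my-script "$ctx"   # Dockerfile -> image
  docker run --rm my-script          # image -> container (removed on exit)
else
  echo "Docker not available; build context is in $ctx"
fi
```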

---

## 4. **Docker Hub** – The Image Registry

* **What it is**:
  A cloud service (like GitHub but for images) where you can **store and share Docker images**.

* **How it works**:

  * You can `docker pull ubuntu` → this grabs an official Ubuntu image from Docker Hub.
  * You can `docker push myusername/myproject` → this uploads your custom image so teammates can use it.

* **Analogy**: Like GitHub for code, but instead of `git clone`, you use `docker pull`.

* **Why it matters for DS/ML**:

  * Share reproducible training environments with your team.
  * Publish inference servers for others to deploy.
  * Use pre-built images (e.g., `pytorch/pytorch:latest`) instead of installing everything yourself.
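
The push/pull cycle can be sketched as a dry run that only prints the commands involved, since a real push needs a Docker Hub account. `myusername/myproject` comes from the text above; the `0.1` tag, the local image name `my-script`, and the `publish_commands` helper are assumptions for illustration.

```bash
#!/usr/bin/env bash
# Sketch: print (not run) the commands that publish a local image.
# Arguments: $1 = local image name, $2 = remote reference (user/repo:tag).
publish_commands() {
  local_image="$1"
  remote_ref="$2"
  echo "docker login"
  echo "docker tag $local_image $remote_ref"
  echo "docker push $remote_ref"
  echo "docker pull $remote_ref"   # what a teammate would run
}

publish_commands my-script myusername/myproject:0.1
```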

---

✅ **How they connect**:

* You **write a Dockerfile** to define how to build an **image**.
* You **run a container** from that image.
* You can **share the image** via **Docker Hub** so others can run it too.



Let’s first tackle the **OS side of things** so Docker makes more sense. Since you’ve only worked with **macOS**, the world of **Linux distributions** (like Ubuntu) might feel unfamiliar, but it’s actually simpler than it looks.

---

## 1. Operating Systems in Context

* **macOS**: User-friendly, polished, and optimized for Apple hardware.
* **Windows**: Dominant in desktop business and gaming.
* **Linux**: Open-source family of operating systems, widely used in servers, cloud computing, and data science.

👉 **Docker usually runs Linux inside containers**, even on Mac or Windows. Containers are built on Linux kernel features, and most of the tooling you will deploy (Python packages, ML frameworks, cloud services) targets Linux first.

---

## 2. What is Ubuntu?

* **Ubuntu** is one of the most popular **Linux distributions (distros)**.
* It’s free, open-source, and very widely supported.
* Known for being user-friendly compared to other Linux distros.
* Maintained by Canonical, with new releases every 6 months and a long-term support (LTS) release every two years (22.04, 24.04, ...).

### Why Ubuntu is popular in data/AI:

* Many ML libraries (PyTorch, TensorFlow, CUDA drivers) officially support Ubuntu first.
* Cloud providers (AWS, GCP, Azure) all offer ready-made Ubuntu images when you spin up servers.
* Tons of documentation and community support.

---

## 3. Other Common Linux Distros in Docker

When you see a `FROM ...` line in a Dockerfile, it usually starts with one of these bases:

* **Debian**: Very stable, widely used. Ubuntu is based on Debian.
* **Alpine**: Super lightweight Linux distro (\~5 MB). Good for small containers, but sometimes tricky with scientific packages: it uses musl instead of glibc, so some prebuilt Python wheels are unavailable and packages may have to compile from source.
* **CentOS / Rocky Linux**: Common in enterprise environments, though less popular in ML projects.
* **Fedora**: Cutting edge, updated frequently.

👉 As a data scientist, **Ubuntu or Debian** bases are your best bet (since ML packages install smoothly).

---

## 4. Why You Don’t Notice This on macOS

On your Mac:

* You run `brew install python` or `pip install pandas` and it just works.
* Under the hood, macOS is itself a Unix system and Linux is Unix-like, so the two share many commands and conventions.
* Docker hides complexity by letting you “pretend” you’re on Ubuntu, even if you’re running macOS or Windows.

So, when you run a container based on Ubuntu, your Mac doesn’t turn into Ubuntu — Docker just runs a tiny virtualized Ubuntu environment *inside* your Mac.

---

## 5. How It All Fits Together in Docker

* **Base OS (Ubuntu/Debian/Alpine)**: Defines the foundation (like the kitchen you start cooking in).
* **Your Python + packages**: Installed on top of that OS.
* **Your code**: Copied into the container.

That’s why many Dockerfiles start with something like:

```dockerfile
FROM ubuntu:22.04
```

or, more ML-friendly:

```dockerfile
FROM python:3.10-slim
```

(which is itself built on Debian).



# Docker Setup

Here’s a dead-simple, copy-pasteable walkthrough that works great on a Mac (Intel or Apple Silicon).

---

# 0) Prereqs (1 minute)

1. Install **Docker Desktop for Mac** if you haven’t yet.
2. Verify it’s running:

```bash
docker --version
docker run --rm hello-world
```

In [None]:
Hello from Docker!
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
 1. The Docker client contacted the Docker daemon.
 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
    (arm64v8)
 3. The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
 4. The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.

To try something more ambitious, you can run an Ubuntu container with:
 $ docker run -it ubuntu bash

Share images, automate workflows, and more with a free Docker ID:
 https://hub.docker.com/

For more examples and ideas, visit:
 https://docs.docker.com/get-started/

Once you run `docker run -it ubuntu bash` (the command suggested in the output above), you’re “inside” your first container, congrats 🚀. Let’s break down your questions, then I’ll show you how to **peek inside images and containers** in different ways.

---

## 1. What is a **Bash Shell**?

* **Shell** = A command-line interface that lets you interact with the OS by typing commands.
* **Bash** = “Bourne Again SHell,” the most common Linux shell.
* When you typed:

  ```bash
  docker run -it ubuntu bash
  ```

  you told Docker:

  * Start a container from the `ubuntu` image.
  * Run `bash` inside it.
  * Give me an **interactive terminal** (`-it`) so I can type commands.

So now your terminal is connected to the shell *inside the container*, not your Mac.

---

## 2. Is this really inside the container?

Yes ✅

Let’s dissect your output:

```
root@ae3e71e3a2ab:/#
```

* `root` = You are the root (admin) user inside the container.
* `ae3e71e3a2ab` = The container’s hostname, which defaults to the short container ID.
* `/` = Current directory (the root of the Linux filesystem inside).

```
Linux ae3e71e3a2ab 6.10.14-linuxkit ...
```

* This shows the kernel (your Mac is running a lightweight Linux VM under the hood because macOS doesn’t run Linux natively).
* The container is running in **isolation** — its own filesystem, processes, etc.

```
bin  boot  dev  etc  home ...
```

* That’s a **full Linux filesystem**!
* Completely separate from your Mac files (unless you explicitly mount them in).
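
A script can make the same "am I inside a container?" check. The sketch below relies on a heuristic: Docker places a `/.dockerenv` file at the root of a container's filesystem (it shows up in the `ls -a` listing later in this walkthrough). The `in_docker` name and its optional root argument are hypothetical, and other container runtimes may not create this file.

```bash
#!/usr/bin/env bash
# Heuristic sketch: look for the /.dockerenv marker file.
# The optional argument lets you point the check at another root directory.
in_docker() {
  root="${1:-}"
  [ -f "$root/.dockerenv" ]
}

if in_docker; then
  echo "looks like a Docker container"
else
  echo "no /.dockerenv here; probably not a Docker container"
fi
```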

---

## 3. Peeking Inside Images and Containers

Here are some tricks you’ll love:

### 🔎 See all images you’ve pulled

```bash
docker images
```

You should see:

```
REPOSITORY   TAG       IMAGE ID       CREATED       SIZE
ubuntu       latest    9cbed7541129   ...           77MB
hello-world  latest    ...
```

### 🔎 See all containers (running and stopped)

```bash
docker ps -a
```

* `docker ps` → only running containers.
* `-a` → also show exited ones.

### 🔎 Inspect a running container

While you’re inside `ubuntu`, open a new terminal tab and run:

```bash
docker ps
```

Note the `CONTAINER ID`. Then:

```bash
docker exec -it <container_id> bash
```

This drops you into another shell **inside the same container**.

### 🔎 Explore the filesystem

Inside the container:

```bash
ls /
ls /etc
cat /etc/os-release
```

That last one shows the exact distro version (Ubuntu 24.04 LTS in the session shown in this notebook).

### 🔎 List processes inside the container

```bash
ps aux
```

Usually you’ll see only a handful of processes, since the container is minimal.

### 🔎 Exit the container

Just type:

```bash
exit
```

You’ll return to your Mac’s shell.

---

## 4. Mental Model

* **Image** = template (like the `ubuntu:latest` filesystem).
* **Container** = a live instance of that image (with its own files, processes, and networking).
* **Bash shell** = your way of “stepping inside” and controlling it.

When you did `ls`, you weren’t looking at your Mac’s `/`, you were looking at **Ubuntu’s `/` inside the container**.




In [None]:
root@ae3e71e3a2ab:/# uname -a
Linux ae3e71e3a2ab 6.10.14-linuxkit #1 SMP Thu Aug 14 19:26:13 UTC 2025 aarch64 aarch64 aarch64 GNU/Linux
root@ae3e71e3a2ab:/# ls
bin   dev  home  media  opt   root  sbin  sys  usr
boot  etc  lib   mnt    proc  run   srv   tmp  var



What you’re seeing there **is the Ubuntu filesystem** inside your container. Let’s break it down:

---

## 1. Is this an Ubuntu instance?

Yes.
When you ran:

```bash
docker run -it ubuntu bash
```

Docker pulled the official `ubuntu:latest` image from Docker Hub. That image is basically a **compressed snapshot of the Ubuntu root filesystem**.

When the container starts, it gives you a shell (`bash`) into that filesystem. So yes — you’re sitting inside a minimal Ubuntu instance.

---

## 2. Do these folders *make up Ubuntu*?

Yes, those folders are the **building blocks of a Linux system**. Together they *are* the operating system’s environment.

Here’s what some of them mean (simplified):

* `/bin` – essential command binaries (`ls`, `cp`, `cat`)
* `/boot` – files needed to boot a Linux system (not really used in containers)
* `/dev` – device files (like hard drives, USBs)
* `/etc` – configuration files (e.g., `/etc/passwd` for users, `/etc/apt/` for package manager configs)
* `/home` – user home directories (`/home/micah` if this were a normal desktop)
* `/lib` – shared libraries needed by binaries in `/bin` and `/sbin`
* `/media` & `/mnt` – mount points for external drives or filesystems
* `/opt` – optional software (sometimes used for manually installed apps)
* `/proc` & `/sys` – virtual filesystems that expose kernel and process information
* `/root` – the home directory for the root user (you’re `root` inside the container)
* `/sbin` – system binaries (like `ifconfig`, `init`)
* `/srv` – service data (rarely used in containers)
* `/tmp` – temporary files
* `/usr` – user programs, libraries, and docs (`/usr/bin`, `/usr/lib`)
* `/var` – variable files (logs, caches, databases, etc.)

👉 Notice what’s *missing*: a desktop, a GUI, drivers. This is a **server-style, minimal Ubuntu** — just enough to run commands and programs.

---

## 3. So what’s the difference from “real Ubuntu”?

* On a desktop Ubuntu machine:

  * Same folders, plus extras like GUI packages, user accounts, hardware drivers.
* Inside Docker:

  * You get a **minimal root filesystem**.
  * It shares the Linux kernel of Docker’s VM (not macOS’s own kernel), so it doesn’t need to include one.
  * That’s why the image is only \~77 MB instead of gigabytes.

So yes: those directories *are Ubuntu* — just the stripped-down, container-friendly version.

---

✅ Quick check for fun: inside your container, run:

```bash
cat /etc/os-release
```
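
Because `/etc/os-release` is plain `KEY=value` text, a script can pull out single fields. The sketch below assumes a hypothetical helper named `os_field` and runs it against a sample file whose values are copied from this notebook's session, so it works even on a machine without that file.

```bash
#!/usr/bin/env bash
# Sketch: print one field from an os-release style file.
# $1 = field name, $2 = file (defaults to the real /etc/os-release).
os_field() {
  sed -n "s/^$1=//p" "${2:-/etc/os-release}" | tr -d '"'
}

# Sample file with values copied from the session in this notebook.
sample="${TMPDIR:-/tmp}/os-release.sample"
cat > "$sample" <<'EOF'
PRETTY_NAME="Ubuntu 24.04.3 LTS"
VERSION_ID="24.04"
ID=ubuntu
ID_LIKE=debian
EOF

os_field PRETTY_NAME "$sample"   # Ubuntu 24.04.3 LTS
os_field ID_LIKE "$sample"       # debian
```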




In [None]:
root@ae3e71e3a2ab:/# cat /ect/os-release
cat: /ect/os-release: No such file or directory
root@ae3e71e3a2ab:/# cat /etc/os-release
PRETTY_NAME="Ubuntu 24.04.3 LTS"
NAME="Ubuntu"
VERSION_ID="24.04"
VERSION="24.04.3 LTS (Noble Numbat)"
VERSION_CODENAME=noble
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=noble
LOGO=ubuntu-logo

Poking around inside a container is *the best* way to build intuition about both Linux **and** Docker. Since containers are disposable, you can explore, break things, and just restart without fear. Let’s do a little guided tour.

---

# 🧭 1. Confirm Where You Are

Inside your running Ubuntu container, try:

```bash
whoami
pwd
```

* `whoami` → shows your current user (`root`)
* `pwd` → “print working directory” (you’ll start in `/`)


In [None]:
root@ae3e71e3a2ab:/# whoami
root
root@ae3e71e3a2ab:/# pwd
/


# 📂 2. Explore the Filesystem

```bash
ls /
ls /bin
ls /etc | head -n 10
```

* `ls` lists files/folders.
* `head -n 10` shows just the first 10 lines so you don’t flood the screen.



In [None]:
root@ae3e71e3a2ab:/# ls /
bin   dev  home  media  opt   root  sbin  sys  usr
boot  etc  lib   mnt    proc  run   srv   tmp  var
root@ae3e71e3a2ab:/# ls /bin
'['                        expiry             mount               skill
 addpart                   expr               mountpoint          slabtop
 apt                       factor             mv                  sleep
 apt-cache                 faillog            namei               snice
 apt-cdrom                 fallocate          nawk                sort
 apt-config                false              newgrp              split
 apt-get                   fgrep              nice                stat
 apt-key                   find               nisdomainname       stdbuf
 apt-mark                  findmnt            nl                  stty
 arch                      flock              nohup               su
 awk                       fmt                nproc               sum
 b2sum                     fold               nsenter             sync
 base32                    free               numfmt              tabs
 base64                    getconf            od                  tac
 basename                  getent             pager               tail
 basenc                    getopt             partx               tar
 bash                      gpasswd            passwd              taskset
 bashbug                   gpgv               paste               tee
 captoinfo                 grep               pathchk             tempfile

In [None]:
root@ae3e71e3a2ab:/# ls /etc | head -n 10
alternatives
apt
bash.bashrc
bindresvport.blacklist
cloud
cron.d
cron.daily
debconf.conf
debian_version
default

# ⚙️ 3. Play with Processes

See what’s running inside the container:

```bash
ps aux
```

Usually you’ll see just a couple of processes: your `bash` shell and `ps` itself.
Compare that to your Mac, where hundreds of processes run at once — containers are *super minimal*.


In [None]:
root@ae3e71e3a2ab:/# ps aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.0   4296  3548 pts/0    Ss   20:33   0:00 bash
root        14  100  0.0   7628  3468 pts/0    R+   21:09   0:00 ps aux

# 🧑‍💻 4. Make & Edit Files

Create a new file:

```bash
echo "Hello from inside the container!" > myfile.txt
cat myfile.txt
```

Make a directory and move the file:

```bash
mkdir mydir
mv myfile.txt mydir/
ls mydir
```

In [None]:
root@ae3e71e3a2ab:/# mkdir mydir
root@ae3e71e3a2ab:/# ls
bin   dev  home  media  mydir  proc  run   srv  tmp  var
boot  etc  lib   mnt    opt    root  sbin  sys  usr

In [None]:
root@ae3e71e3a2ab:/# touch myfile.txt
root@ae3e71e3a2ab:/# ls
bin   dev  home  media  mydir       opt   root  sbin  sys  usr
boot  etc  lib   mnt    myfile.txt  proc  run   srv   tmp  var
root@ae3e71e3a2ab:/# mv myfile.txt mydir/
root@ae3e71e3a2ab:/# ls
bin   dev  home  media  mydir  proc  run   srv  tmp  var
boot  etc  lib   mnt    opt    root  sbin  sys  usr
root@ae3e71e3a2ab:/# cd mydir
root@ae3e71e3a2ab:/mydir# ls
myfile.txt

In [None]:
root@ae3e71e3a2ab:/mydir# cd ..
root@ae3e71e3a2ab:/# ls
bin   dev  home  media  mydir  proc  run   srv  tmp  var
boot  etc  lib   mnt    opt    root  sbin  sys  usr
root@ae3e71e3a2ab:/# ls mydir
myfile.txt
root@ae3e71e3a2ab:/#

# 📦 5. Use the Package Manager (apt)

Ubuntu uses `apt` for installing software. Try:

```bash
apt-get update        # refresh package list
apt-get install -y curl
```

Then use it:

```bash
curl https://httpbin.org/get
```

(That fetches a small JSON response from the internet.)

👉 Reminder: if you break the container by installing/removing stuff, no worries — just `exit` and start a fresh one with `docker run -it ubuntu bash`.



In [None]:
root@ae3e71e3a2ab:/# apt-get update
Get:1 http://ports.ubuntu.com/ubuntu-ports noble InRelease [256 kB]
Get:2 http://ports.ubuntu.com/ubuntu-ports noble-updates InRelease [126 kB]
Get:3 http://ports.ubuntu.com/ubuntu-ports noble-backports InRelease [126 kB]
Get:4 http://ports.ubuntu.com/ubuntu-ports noble-security InRelease [126 kB]
Get:5 http://ports.ubuntu.com/ubuntu-ports noble/universe arm64 Packages [19.0 MB]
Get:6 http://ports.ubuntu.com/ubuntu-ports noble/restricted arm64 Packages [113 kB]
Get:7 http://ports.ubuntu.com/ubuntu-ports noble/multiverse arm64 Packages [274 kB]
Get:8 http://ports.ubuntu.com/ubuntu-ports noble/main arm64 Packages [1776 kB]
Get:9 http://ports.ubuntu.com/ubuntu-ports noble-updates/multiverse arm64 Packages [39.2 kB]
Get:10 http://ports.ubuntu.com/ubuntu-ports noble-updates/universe arm64 Packages [1832 kB]
Get:11 http://ports.ubuntu.com/ubuntu-ports noble-updates/restricted arm64 Packages [3011 kB]
Get:12 http://ports.ubuntu.com/ubuntu-ports noble-updates/main arm64 Packages [1818 kB]
Get:13 http://ports.ubuntu.com/ubuntu-ports noble-backports/main arm64 Packages [48.8 kB]
Get:14 http://ports.ubuntu.com/ubuntu-ports noble-backports/universe arm64 Packages [35.6 kB]
Get:15 http://ports.ubuntu.com/ubuntu-ports noble-security/multiverse arm64 Packages [20.2 kB]
Get:16 http://ports.ubuntu.com/ubuntu-ports noble-security/universe arm64 Packages [1110 kB]
Get:17 http://ports.ubuntu.com/ubuntu-ports noble-security/restricted arm64 Packages [2882 kB]
Get:18 http://ports.ubuntu.com/ubuntu-ports noble-security/main arm64 Packages [1477 kB]
Fetched 34.1 MB in 2s (15.9 MB/s)

In [None]:
root@ae3e71e3a2ab:/# apt-get install -y curl
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following additional packages will be installed:
  ca-certificates krb5-locales libbrotli1 libcurl4t64 libgssapi-krb5-2 libk5crypto3
  libkeyutils1 libkrb5-3 libkrb5support0 libldap-common libldap2 libnghttp2-14
  libpsl5t64 librtmp1 libsasl2-2 libsasl2-modules libsasl2-modules-db libssh-4 openssl
  publicsuffix

In [None]:
root@ae3e71e3a2ab:/# curl https://httpbin.org/get
{
  "args": {},
  "headers": {
    "Accept": "*/*",
    "Host": "httpbin.org",
    "User-Agent": "curl/8.5.0",
    "X-Amzn-Trace-Id": "Root=1-68c099e1-36e4032f09d572110efc442f"
  },
  "origin": "70.171.32.158",
  "url": "https://httpbin.org/get"
}
root@ae3e71e3a2ab:/#

# 🔎 6. Learn About the Kernel & Hardware

```bash
uname -a
df -h
free -m
```

* `uname -a` → system/kernel info (you’ll see Docker’s LinuxKit kernel).
* `df -h` → disk usage.
* `free -m` → memory usage.


In [None]:
root@ae3e71e3a2ab:/# uname -a
Linux ae3e71e3a2ab 6.10.14-linuxkit #1 SMP Thu Aug 14 19:26:13 UTC 2025 aarch64 aarch64 aarch64 GNU/Linux
root@ae3e71e3a2ab:/# df -h
Filesystem      Size  Used Avail Use% Mounted on
overlay         224G  1.2G  212G   1% /
tmpfs            64M     0   64M   0% /dev
shm              64M     0   64M   0% /dev/shm
/dev/vda1       224G  1.2G  212G   1% /etc/hosts
tmpfs           2.0G     0  2.0G   0% /proc/scsi
tmpfs           2.0G     0  2.0G   0% /sys/firmware

In [None]:
root@ae3e71e3a2ab:/# free -m
               total        used        free      shared  buff/cache   available
Mem:            3919         446        3122           0         490        3472
Swap:           1023           0        1023




# 🧹 7. Exiting and Returning

When you’re done:

```bash
exit
```

Because `bash` is the container’s main process (PID 1), exiting it stops the container and you’re back on your Mac.
List all containers you’ve run (even stopped ones):

```bash
docker ps -a
```

---

✅ This kind of exploration will make you *way* more comfortable in Linux.



🔥 This is the perfect way to get comfortable with Linux **and** Docker at the same time. Let’s treat the container like your personal sandbox and go through the **core terminal commands** that you’ll use constantly when working inside containers (and in real servers too).

---

# 🧭 Navigation & Orientation

These help you figure out *where you are* in the filesystem.

```bash
pwd        # print working directory (your current location)
ls         # list files in current dir
ls -l      # long listing (shows permissions, owner, size, date)
ls -a      # show hidden files (like .bashrc)
cd /etc    # change directory to /etc
cd ~       # go to home directory (~ expands to /root for the root user)
cd -       # go back to the last directory
```
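
As a small illustration of `cd -`, the sketch below hops between two directories inside a subshell (so your own shell stays where it is) and shows that the previous location is kept in `$OLDPWD`. The choice of `/tmp` and `/` is arbitrary.

```bash
#!/usr/bin/env bash
# Sketch: "cd -" returns to the previous directory, which the shell
# remembers in $OLDPWD. The parentheses run everything in a subshell.
(
  cd /tmp
  cd /
  echo "now in:     $PWD"
  echo "previous:   $OLDPWD"
  cd - >/dev/null            # jump back to /tmp (output silenced)
  echo "after cd -: $PWD"
)
```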

👉 Try: `cd /usr/bin && ls | head -n 20` — this shows the first 20 programs in `/usr/bin`.



In [None]:
root@ae3e71e3a2ab:/# pwd
/
root@ae3e71e3a2ab:/# ls
bin  boot  dev  etc  home  lib  media  mnt  mydir  opt  proc  root  run  sbin  srv  sys  tmp  usr  var
root@ae3e71e3a2ab:/# ls -l
total 60
lrwxrwxrwx   1 root root    7 Apr 22  2024 bin -> usr/bin
drwxr-xr-x   2 root root 4096 Apr 22  2024 boot
drwxr-xr-x   5 root root  360 Sep  9 19:59 dev
drwxr-xr-x   1 root root 4096 Sep  9 21:15 etc
drwxr-xr-x   3 root root 4096 Aug  5 02:19 home
lrwxrwxrwx   1 root root    7 Apr 22  2024 lib -> usr/lib
drwxr-xr-x   2 root root 4096 Aug  5 02:12 media
drwxr-xr-x   2 root root 4096 Aug  5 02:12 mnt
drwxr-xr-x   2 root root 4096 Sep  9 21:16 mydir
drwxr-xr-x   2 root root 4096 Aug  5 02:12 opt
dr-xr-xr-x 210 root root    0 Sep  9 19:59 proc
drwx------   2 root root 4096 Aug  5 02:18 root
drwxr-xr-x   4 root root 4096 Aug  5 02:19 run
lrwxrwxrwx   1 root root    8 Apr 22  2024 sbin -> usr/sbin
drwxr-xr-x   2 root root 4096 Aug  5 02:12 srv
dr-xr-xr-x  11 root root    0 Sep  9 19:59 sys
drwxrwxrwt   1 root root 4096 Sep  9 21:15 tmp
drwxr-xr-x   1 root root 4096 Aug  5 02:12 usr
drwxr-xr-x   1 root root 4096 Aug  5 02:18 var

In [None]:
root@ae3e71e3a2ab:/# ls -a
.   .dockerenv  boot  etc   lib    mnt    opt   root  sbin  sys  usr
..  bin         dev   home  media  mydir  proc  run   srv   tmp  var

In [None]:
root@ae3e71e3a2ab:/# cd /etc
root@ae3e71e3a2ab:/etc# cd ~
root@ae3e71e3a2ab:~# cd -
/etc
root@ae3e71e3a2ab:/etc# cd ~
root@ae3e71e3a2ab:~# pwd
/root

You’ve stumbled on **Linux file permissions**. That `drwx------` stuff is telling you **what kind of file it is** and **who can access it**. Let’s decode it step by step using your example.

---

## 1. The Structure of `ls -l` Output

Each line looks like this:

```
drwx------   2 root root 4096 Aug  5 02:18 root
```

Breaking it down:

* `drwx------` → permissions (what we’ll decode below)
* `2` → number of hard links (for a directory, typically 2 plus the number of subdirectories)
* `root root` → owner user and group (here both are `root`)
* `4096` → size in bytes (for a directory this is the space its entry list occupies, commonly one 4096-byte block, not the size of its contents)
* `Aug 5 02:18` → last modified date
* `root` → the name of the file/directory
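
The field breakdown above can be reproduced mechanically. This sketch splits the sample line on whitespace and labels each piece; the `describe_ls_line` helper is hypothetical, and the approach only suits simple names without spaces, since parsing `ls` output is not robust in general.

```bash
#!/usr/bin/env bash
# Sketch: split one "ls -l" line into the fields described above.
describe_ls_line() {
  # Word-split the line into positional parameters $1..$9.
  set -- $1
  echo "permissions: $1"
  echo "links:       $2"
  echo "owner/group: $3 $4"
  echo "size:        $5"
  echo "modified:    $6 $7 $8"
  echo "name:        $9"
}

describe_ls_line 'drwx------   2 root root 4096 Aug  5 02:18 root'
```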

---

## 2. First Character = File Type

The very first character tells you **what kind of thing** it is:

* `-` → regular file
* `d` → directory
* `l` → symbolic link (shortcut/pointer)
* `c` → character device
* `b` → block device
* `p` → named pipe
* `s` → socket

So in your list:

* `d...` = directory
* `l...` = symlink (`sbin -> usr/sbin`)
* `-...` = plain file

---

## 3. The Next 9 Characters = Permissions

They come in 3 groups (user, group, others):

```
drwx------   →  d = directory
                rwx = owner permissions
                --- = group permissions
                --- = others permissions
```

* `r` = read
* `w` = write
* `x` = execute (for files: run as program; for directories: “cd into”)

### Examples from your output:

* `drwx------` → a directory that **only the owner (root) can read, write, and enter**.
* `drwxr-xr-x` → a directory where:

  * Owner: read/write/execute
  * Group: read/execute
  * Others: read/execute
    (This is very common for system dirs like `/usr`.)
* `lrwxrwxrwx` → a symlink; the permissions shown on the link itself are not used, because access is decided by the permissions of the target it points to.
* `drwxrwxrwt` → special one — that `t` at the end means **sticky bit** (used for `/tmp`).

  * It allows anyone to write files there, but only the file’s owner (or root) can delete them.

---

## 4. Special Bits (`s` and `t`)

* `t` = **sticky bit** → common in `/tmp` so people can’t delete each other’s files.
* `s` = **setuid/setgid** → programs that run with the owner/group’s privileges.
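
To see the sticky bit appear, this sketch sets mode `1777` (the leading `1` is the sticky bit, `777` is read/write/execute for everyone) on a scratch directory, which mirrors the permissions of `/tmp`. The `perm_string` helper name is made up for the example.

```bash
#!/usr/bin/env bash
# Sketch: set and clear the sticky bit on a scratch directory.
perm_string() {
  # First 10 characters of "ls -ld": file type plus the 9 permission bits.
  ls -ld "$1" | cut -c1-10
}

scratch=$(mktemp -d)
chmod 1777 "$scratch"
perm_string "$scratch"    # drwxrwxrwt

chmod 0755 "$scratch"     # back to an ordinary directory
perm_string "$scratch"    # drwxr-xr-x
rmdir "$scratch"
```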

---

## 5. Try It Yourself

Inside your container, test this:

```bash
mkdir -p mydir        # -p avoids an error if mydir already exists
touch mydir/test.txt
ls -ld mydir
chmod 700 mydir
ls -ld mydir
chmod 755 mydir
ls -ld mydir
```

* `700` → `drwx------` (only you can use it)
* `755` → `drwxr-xr-x` (anyone can look inside, but only you can modify)
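
The octal numbers map onto the `rwx` letters digit by digit: each digit is the sum of read (4), write (2), and execute (1). The sketch below spells that out with a hypothetical `mode_to_rwx` function; it deliberately ignores the special `s` and `t` bits.

```bash
#!/usr/bin/env bash
# Sketch: translate an octal mode such as 755 into the rwx string
# that "ls -l" shows (without the leading file-type character).
mode_to_rwx() {
  out=""
  # Put a space after every character so the loop sees one digit at a time.
  for digit in $(echo "$1" | sed 's/./& /g'); do
    case "$digit" in
      0) out="${out}---" ;;
      1) out="${out}--x" ;;
      2) out="${out}-w-" ;;
      3) out="${out}-wx" ;;
      4) out="${out}r--" ;;
      5) out="${out}r-x" ;;
      6) out="${out}rw-" ;;
      7) out="${out}rwx" ;;
      *) echo "not an octal digit: $digit" >&2; return 1 ;;
    esac
  done
  echo "$out"
}

mode_to_rwx 700   # rwx------
mode_to_rwx 755   # rwxr-xr-x
```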

---

✅ So, all those `drwxr-xr-x` lines you see in `/` are just Linux telling you **“this is a directory, and here’s who can read/write/execute it.”**



# 📂 Working with Files & Directories

Creating, copying, moving, deleting.

```bash
touch test.txt             # create empty file
echo "hello world" > a.txt # write text into a file
cat a.txt                  # print file contents
mkdir mydir                # make a directory
cp a.txt mydir/            # copy a file
mv a.txt mydir/b.txt       # move/rename a file
ls mydir                   # see your files
rm mydir/b.txt             # remove a file
rmdir mydir                # remove a directory (fails unless it is empty)
rm -r mydir                # remove a directory and everything in it
```

👉 Safe experiment: create a folder called `playground`, put a few files inside, then delete it.
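
One way the playground experiment might go is sketched below. It works in a fresh temporary directory so nothing else is touched; the file names are arbitrary.

```bash
#!/usr/bin/env bash
# Sketch of the experiment: build a scratch folder, fill it, delete it.
cd "$(mktemp -d)" || exit 1

mkdir playground
echo "first"  > playground/one.txt
echo "second" > playground/two.txt
cp playground/one.txt playground/one_copy.txt
ls playground

# rmdir only removes empty directories, so it refuses here.
rmdir playground 2>/dev/null || echo "rmdir refuses: directory not empty"
rm -r playground             # removes the directory and its contents
[ -d playground ] || echo "playground is gone"
```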


In [None]:
root@ae3e71e3a2ab:/mydir# ls
a.txt  myfile.txt  test.txt
root@ae3e71e3a2ab:/mydir# cp a.txt b.txt
root@ae3e71e3a2ab:/mydir# ls
a.txt  b.txt  myfile.txt  test.txt
root@ae3e71e3a2ab:/mydir# cp test.txt test2.txt
root@ae3e71e3a2ab:/mydir# ls
a.txt  b.txt  myfile.txt  test.txt  test2.txt


# 📖 Viewing File Contents

Useful when debugging configs or logs.

```bash
cat /etc/os-release          # print whole file
head -n 5 /etc/passwd        # first 5 lines
tail -n 5 /etc/passwd        # last 5 lines
less /etc/passwd             # scroll through file (q to quit)
```
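
`/etc/passwd` is a good file to practice on because every line has the same seven colon-separated fields (name, password placeholder, UID, GID, comment, home directory, login shell). The sketch below prints just the user name and shell; `users_and_shells` is a hypothetical helper, and the sample lines are copied from this notebook's session.

```bash
#!/usr/bin/env bash
# Sketch: show user name (field 1) and login shell (field 7).
# $1 = file to read (defaults to the real /etc/passwd).
users_and_shells() {
  cut -d: -f1,7 "${1:-/etc/passwd}"
}

sample="${TMPDIR:-/tmp}/passwd.sample"
cat > "$sample" <<'EOF'
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
ubuntu:x:1000:1000:Ubuntu:/home/ubuntu:/bin/bash
EOF

users_and_shells "$sample"
# root:/bin/bash
# daemon:/usr/sbin/nologin
# ubuntu:/bin/bash
```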



In [None]:
root@ae3e71e3a2ab:/mydir# head -n 5 /etc/passwd
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
bin:x:2:2:bin:/bin:/usr/sbin/nologin
sys:x:3:3:sys:/dev:/usr/sbin/nologin
sync:x:4:65534:sync:/bin:/bin/sync
root@ae3e71e3a2ab:/mydir#

In [None]:
root@ae3e71e3a2ab:/mydir# tail -n 5 /etc/passwd
list:x:38:38:Mailing List Manager:/var/list:/usr/sbin/nologin
irc:x:39:39:ircd:/run/ircd:/usr/sbin/nologin
_apt:x:42:65534::/nonexistent:/usr/sbin/nologin
nobody:x:65534:65534:nobody:/nonexistent:/usr/sbin/nologin
ubuntu:x:1000:1000:Ubuntu:/home/ubuntu:/bin/bash
root@ae3e71e3a2ab:/mydir#


# ⚙️ System Info & Processes

Check what’s going on inside the container.

```bash
whoami            # current user
uname -a          # kernel info
ps aux            # list processes
top               # interactive process viewer (q to quit)
df -h             # disk space usage
free -m           # memory usage
```

👉 Notice how few processes are running compared to your Mac — containers are super lightweight.
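
Some of the same facts are available from Python, which is handy when a script needs to log where it is running. This is a small sketch with an illustrative function name (`system_summary`); it covers only part of what `uname`, `df`, and friends report.

```python
import os
import platform
import shutil


def system_summary(path: str = "/") -> dict:
    """Collect a few of the facts that uname and df report."""
    usage = shutil.disk_usage(path)
    return {
        "kernel": platform.release(),                        # part of `uname -a`
        "machine": platform.machine(),                       # CPU architecture
        "pid": os.getpid(),                                  # this script's process ID
        "disk_total_gib": round(usage.total / 2**30, 1),     # compare with `df -h`
        "disk_free_gib": round(usage.free / 2**30, 1),
    }


if __name__ == "__main__":
    for key, value in system_summary().items():
        print(f"{key}: {value}")
```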



In [None]:
top - 21:47:08 up  1:22,  0 user,  load average: 0.30, 0.24, 0.21
Tasks:   2 total,   1 running,   1 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.1 us,  0.0 sy,  0.0 ni, 99.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :   3919.4 total,   3115.3 free,    453.1 used,    490.9 buff/cache
MiB Swap:   1024.0 total,   1024.0 free,      0.0 used.   3466.3 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
    1 root      20   0    4296   3548   3036 S   0.0   0.1   0:00.88 bash
 3002 root      20   0    8732   4800   2752 R   0.0   0.1   0:00.06 top



In [None]:
root@ae3e71e3a2ab:/mydir# df -h
Filesystem      Size  Used Avail Use% Mounted on
overlay         224G  1.2G  212G   1% /
tmpfs            64M     0   64M   0% /dev
shm              64M     0   64M   0% /dev/shm
/dev/vda1       224G  1.2G  212G   1% /etc/hosts
tmpfs           2.0G     0  2.0G   0% /proc/scsi
tmpfs           2.0G     0  2.0G   0% /sys/firmware

In [None]:
root@ae3e71e3a2ab:/mydir# free -m
               total        used        free      shared  buff/cache   available
Mem:            3919         454        3114           0         490        3465
Swap:           1023           0        1023


# 📦 Installing & Managing Software

Ubuntu uses `apt` as its package manager.

```bash
apt-get update                  # refresh package list
apt-get install -y nano         # install a simple text editor
nano myfile.txt                 # open file in nano (ctrl+O to save, ctrl+X to exit)
apt-get install -y curl         # install curl (for HTTP requests)
curl https://httpbin.org/get    # fetch test data
```



In [None]:
root@ae3e71e3a2ab:/mydir# apt-get install -y nano
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
Suggested packages:
  hunspell
The following NEW packages will be installed:
  nano
0 upgraded, 1 newly installed, 0 to remove and 1 not upgraded.
Need to get 281 kB of archives.
After this operation, 909 kB of additional disk space will be used.

In [None]:
root@ae3e71e3a2ab:/mydir# nano myfile.txt
root@ae3e71e3a2ab:/mydir#
root@ae3e71e3a2ab:/mydir# ls
a.txt  b.txt  myfile.txt  mynanofile.txt  test.txt  test2.txt
root@ae3e71e3a2ab:/mydir# pwd
/mydir
root@ae3e71e3a2ab:/mydir# ls
a.txt  b.txt  myfile.txt  mynanofile.txt  test.txt  test2.txt
root@ae3e71e3a2ab:/mydir# cat mynanofile.txt
here is my new text doc i created using nano wow!!
now how do i get out of here??

# 🔐 Permissions & Ownership (Important!)

Every file has **permissions** and **owners**. Check them with `ls -l`.

```bash
ls -l
-rw-r--r--  1 root root   12 Sep  9 12:34 a.txt
```

Breakdown:

* `rw- r-- r--` → permissions (owner can read/write, group can read, others can read)
* `root root` → owner user & group

Commands to play with:

```bash
chmod 600 a.txt     # only owner can read/write
chmod 644 a.txt     # owner read/write, everyone else read
chown root:root a.txt   # change owner (needs root)
```
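
The numbers passed to `chmod` are octal: each digit is the sum of read (4), write (2), and execute (1) for owner, group, and others. As a small sketch, Python's `stat.filemode` converts a mode into the string `ls -l` prints; the helper name `mode_string` is illustrative.

```python
import os
import stat
import tempfile


def mode_string(path: str) -> str:
    """Return the permission string that `ls -l` shows for a path."""
    return stat.filemode(os.stat(path).st_mode)


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as tmp:
        os.chmod(tmp.name, 0o644)       # same as: chmod 644 <file>
        print(mode_string(tmp.name))
        os.chmod(tmp.name, 0o600)       # same as: chmod 600 <file>
        print(mode_string(tmp.name))
```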


In [None]:
root@ae3e71e3a2ab:/mydir# ls -l
total 12
-rw-r--r-- 1 root root 12 Sep  9 21:38 a.txt
-rw-r--r-- 1 root root 12 Sep  9 21:42 b.txt
-rw-r--r-- 1 root root  0 Sep  9 21:16 myfile.txt
-rw-r--r-- 1 root root 82 Sep  9 21:50 mynanofile.txt
-rw-r--r-- 1 root root  0 Sep  9 21:37 test.txt
-rw-r--r-- 1 root root  0 Sep  9 21:42 test2.txt











# 🧹 Cleanup & Exit

When you’re done experimenting:

```bash
history   # see your past commands
clear     # clear the screen
exit      # leave the container
```

---

✅ These are the “bread and butter” commands you’ll use in **every container and server session**. Once you get muscle memory for these, Linux stops feeling foreign.



# Make a Tiny Project

Here’s how to do it step by step:

---

## 1. Make a project folder

On your Mac (outside Docker), in your terminal:

```bash
mkdir hello-docker
cd hello-docker
```

In [None]:
root@ae3e71e3a2ab:/# mkdir hello-docker
root@ae3e71e3a2ab:/# ls
bin   dev  hello-docker  lib    mnt    opt   root  sbin  sys  usr
boot  etc  home          media  mydir  proc  run   srv   tmp  var


## 2. Create the Python file

Use your favorite editor (VS Code, nano, or even `touch` + `open`).

Create a file called **`hello.py`**:

```bash
nano hello.py
```

Paste this inside:

```python
import sys, platform, os

def main():
    name = os.getenv("NAME", "world")
    print(f"Hello, {name} 👋")
    print("Python:", platform.python_version())
    print("OS:", platform.platform())
    print("Args:", sys.argv[1:])

if __name__ == "__main__":
    main()
```

Save (Ctrl+O in nano) and exit (Ctrl+X).



In [None]:
root@ae3e71e3a2ab:/hello-docker# nano hello.py
root@ae3e71e3a2ab:/hello-docker# ls
hello.py
root@ae3e71e3a2ab:/hello-docker# cat hello.py
import sys, platform, os

def main():
    name = os.getenv("NAME", "world")
    print(f"Hello, {name} 👋")
    print("Python:", platform.python_version())
    print("OS:", platform.platform())
    print("Args:", sys.argv[1:])

if __name__ == "__main__":
    main()


## 3. Create a `requirements.txt`

For now it can be empty, but it’s good practice:

```bash
touch requirements.txt
```



In [None]:
root@ae3e71e3a2ab:/hello-docker# touch requirements.txt
root@ae3e71e3a2ab:/hello-docker# ls
hello.py  requirements.txt
root@ae3e71e3a2ab:/hello-docker# ls -l
total 4
-rw-r--r-- 1 root root 267 Sep  9 21:58 hello.py
-rw-r--r-- 1 root root   0 Sep  9 21:59 requirements.txt
root@ae3e71e3a2ab:/hello-docker# ls -a
.  ..  hello.py  requirements.txt
root@ae3e71e3a2ab:/hello-docker#

## 4. Create a `.dockerignore` (keeps junk out of your image)

```bash
nano .dockerignore
```

Paste:

```dockerignore
__pycache__/
*.pyc
.env
.git
```

Save (Ctrl+O, Enter) and exit (Ctrl+X).

Now, when you run `docker build`, Docker will skip those files.


In [None]:
root@ae3e71e3a2ab:/hello-docker# nano .dockerignore
root@ae3e71e3a2ab:/hello-docker# ls
hello.py  requirements.txt
root@ae3e71e3a2ab:/hello-docker# ls -a
.  ..  .dockerignore  hello.py  requirements.txt
root@ae3e71e3a2ab:/hello-docker#



## 1. What does `.dockerignore` do?

* When you run `docker build`, Docker needs to send your **project folder** (called the *build context*) into the Docker daemon.
* By default, **everything in that folder** gets sent — all files, subfolders, caches, logs, secrets, Git history, etc.
* `.dockerignore` tells Docker: *“Skip these files, they don’t belong in the image.”*

**Why this matters**:

* **Smaller images**: With a broad `COPY . .`, everything in the build context lands in the image, and you don’t want your `.git` folder (which can reach hundreds of MB) inside.
* **Faster builds**: Less stuff to copy → quicker build times.
* **Security**: Keeps sensitive files (like `.env`) out of your container.
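
To make the filtering concrete, here is a simplified sketch of how ignore patterns could be applied to paths in the build context. It is not Docker's implementation: it skips `!` exceptions and other details, and the function names are made up for illustration. It does reflect one real rule worth knowing: patterns are matched relative to the root of the build context, so `*.pyc` covers only top-level files while `**/*.pyc` covers any depth.

```python
from fnmatch import fnmatchcase


def _match_segments(pattern_parts: list[str], path_parts: list[str]) -> bool:
    """Match a pattern against a path one directory segment at a time."""
    if not pattern_parts:
        return True  # pattern used up: the path is the match or lies beneath it
    if pattern_parts[0] == "**":
        # ** may absorb zero or more leading path segments
        return any(
            _match_segments(pattern_parts[1:], path_parts[i:])
            for i in range(len(path_parts) + 1)
        )
    if not path_parts:
        return False
    return fnmatchcase(path_parts[0], pattern_parts[0]) and _match_segments(
        pattern_parts[1:], path_parts[1:]
    )


def is_ignored(path: str, patterns: list[str]) -> bool:
    """Return True if a context-relative path matches any ignore pattern."""
    path_parts = path.strip("/").split("/")
    for pattern in patterns:
        pattern = pattern.strip()
        if not pattern or pattern.startswith("#"):
            continue  # skip blank lines and comments
        if _match_segments(pattern.strip("/").split("/"), path_parts):
            return True
    return False


if __name__ == "__main__":
    rules = ["__pycache__/", "*.pyc", ".env", ".git", "**/*.log"]
    for candidate in ["hello.py", ".env", ".git/config", "src/old.pyc", "src/run.log"]:
        print(candidate, is_ignored(candidate, rules))
```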

---

## 2. Why the period before the filename?

In Linux/Unix systems:

* Files or folders that start with `.` are **hidden files**.
* They don’t show up with a plain `ls` — you need `ls -a` to see them.
* Hidden files are usually **configuration files** (like `.bashrc`, `.gitignore`, `.dockerignore`).

So yes, `.dockerignore` is a **hidden file** by convention.
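
The hidden-file rule is purely a naming convention, which a few lines of Python can demonstrate. This is a sketch with an illustrative helper name; unlike `ls -a`, `iterdir()` never lists the special `.` and `..` entries.

```python
from pathlib import Path


def split_hidden(directory: Path) -> tuple[list[str], list[str]]:
    """Return (visible, hidden) entry names, roughly `ls` versus the extras in `ls -a`."""
    names = sorted(p.name for p in directory.iterdir())
    visible = [n for n in names if not n.startswith(".")]
    hidden = [n for n in names if n.startswith(".")]
    return visible, hidden


if __name__ == "__main__":
    visible, hidden = split_hidden(Path.cwd())
    print("visible:", visible)
    print("hidden:", hidden)
```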

---

## 3. Why is `.env` in the ignore list?

* `.env` files typically store **environment variables**, such as:

  ```
  API_KEY=supersecret123
  DATABASE_URL=postgres://user:pass@host/db
  ```
* You use them to configure apps **without hardcoding secrets into code**.
* If you copied `.env` into your Docker image, you’d risk:

  * Accidentally publishing secrets if you push the image to Docker Hub.
  * Baking secrets into the image permanently, making them hard to rotate.

**Best practice**:

* Keep `.env` local.
* Pass env variables at runtime using `-e` or `--env-file`:

  ```bash
  docker run --rm -e API_KEY=supersecret123 myimage
  ```
* That way, secrets stay outside the image.
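
On the application side, the code inside the container just reads the variable at startup. A minimal sketch, assuming a hypothetical helper named `require_env` that fails fast with a clear message when the secret was not passed in:

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or fail with a clear message."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


if __name__ == "__main__":
    try:
        api_key = require_env("API_KEY")
        print("API key loaded, length:", len(api_key))  # never print the secret itself
    except RuntimeError as error:
        print(error)
```

Failing at startup is usually easier to debug than a confusing authentication error deep inside a request.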

---

## 4. Analogy

Think of `.dockerignore` like a **packing checklist** for a trip.

* By default, Docker tries to shove your *entire house* (project folder) into the suitcase (the image build context).
* `.dockerignore` says: *“Don’t bring the garage junk, trash, or sensitive documents.”*
* You end up with a **lighter, safer suitcase**.

---

✅ So, `.dockerignore` = cleaner, safer, faster builds.
✅ The `.` = hidden config file convention.
✅ `.env` excluded = keeps your secrets out of the image.



Let’s build a **professional `.dockerignore` template** that covers the most common cases for **data science / ML projects**. It keeps your build context small, your builds fast, and your images free of secrets or junk.

---

# 📦 Recommended `.dockerignore` for Data/ML Projects

```dockerignore
# Patterns match from the build-context root; the **/ prefix matches at any depth.

# Python cache
**/__pycache__/
**/*.pyc
**/*.pyo
**/*.pyd

# Virtual environments
venv/
env/
.venv/

# Jupyter Notebook checkpoints
**/.ipynb_checkpoints/

# OS files
**/.DS_Store
**/Thumbs.db

# Git and version control
.git
.gitignore
.gitattributes

# Logs & debug
**/*.log
**/*.out
**/*.err

# Data & models (better to mount these at runtime!)
data/
datasets/
**/*.csv
**/*.tsv
**/*.parquet
**/*.h5
**/*.hdf5
**/*.pth
**/*.pt
**/*.ckpt
**/*.joblib
**/*.pkl

# Large results
outputs/
results/
checkpoints/

# Environment & config (secrets should not be baked in!)
.env
**/*.secret
**/*.key
```

---

# 🧠 Why These Matter

* **Caches & compiled code**: (`__pycache__`, `.pyc`) → don’t waste image space.
* **Virtual envs**: (`venv/`) → Docker already has its own environment.
* **Jupyter**: `.ipynb_checkpoints/` → just clutter.
* **OS junk**: `.DS_Store` (Mac), `Thumbs.db` (Windows).
* **Git stuff**: `.git` history isn’t needed inside the container.
* **Logs**: don’t need them in your base image.
* **Datasets/models**: often huge → better mounted at runtime with `-v` or downloaded in a pipeline.
* **Secrets**: `.env`, API keys, certificates → never bake into images.

---

# ⚡ Best Practice Workflow

1. **Code & requirements** → baked into the image.
2. **Data & models** → mounted or downloaded at runtime.
3. **Secrets/config** → passed in as env vars:

   ```bash
   docker run --rm -e API_KEY="$API_KEY" myimage
   ```

---

✅ With this `.dockerignore`, your builds will be **lean, secure, and reproducible** — the hallmark of good container practice.


## 5. Create a Dockerfile

```bash
nano Dockerfile
```

Paste:

```dockerfile
FROM python:3.11-slim

WORKDIR /app

COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt || true

COPY hello.py ./

CMD ["python", "hello.py"]
```




## 6. Build the Docker image

Run this in the project folder:

```bash
docker build -t hello-docker:0.1 .
```

The **Dockerfile** is one of the most important pieces in Docker, and it’s a bit different from `requirements.txt`, though they work together. Let’s break it down.

---

## 1. What is a Dockerfile?

* A **Dockerfile** is a **recipe** that tells Docker how to build an **image**.
* Each line is an **instruction**.
* When you run `docker build`, Docker reads this file line by line and builds your image in layers.

Think of it like a **blueprint** for your project environment.
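
The "layers" idea explains why Dockerfiles usually copy `requirements.txt` before the application code. The following toy model is only a sketch of the caching principle, not Docker's actual algorithm: each layer's cache key depends on the previous key, on the instruction text, and (for `COPY`) on the contents of the copied files. The names `layer_keys` and `first_rebuilt_layer` are invented for this example.

```python
import hashlib

DOCKERFILE = [
    "FROM python:3.11-slim",
    "WORKDIR /app",
    "COPY requirements.txt ./",
    "RUN pip install --no-cache-dir -r requirements.txt",
    "COPY hello.py ./",
    'CMD ["python", "hello.py"]',
]


def layer_keys(instructions: list[str], file_contents: dict[str, str]) -> list[str]:
    """Return one cache key per instruction; each key depends on all earlier layers."""
    keys = []
    parent = ""
    for instruction in instructions:
        digest = hashlib.sha256()
        digest.update(parent.encode())
        digest.update(instruction.encode())
        words = instruction.split()
        if words[0] == "COPY":
            # A COPY layer also depends on the contents of the copied files.
            for source in words[1:-1]:
                digest.update(file_contents.get(source, "").encode())
        parent = digest.hexdigest()
        keys.append(parent)
    return keys


def first_rebuilt_layer(old: list[str], new: list[str]):
    """Return the index of the first layer whose key changed, or None if all match."""
    for index, (before, after) in enumerate(zip(old, new)):
        if before != after:
            return index
    return None


if __name__ == "__main__":
    base = {"requirements.txt": "", "hello.py": "print('v1')"}
    edited = {**base, "hello.py": "print('v2')"}
    changed = first_rebuilt_layer(layer_keys(DOCKERFILE, base), layer_keys(DOCKERFILE, edited))
    print("First layer rebuilt after editing hello.py:", DOCKERFILE[changed])
```

In this model, editing `hello.py` leaves the `pip install` layer untouched, while editing `requirements.txt` invalidates it and everything after it.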

---

## 2. Dockerfile vs. `requirements.txt`

* **`requirements.txt`**: lists Python packages you need (`numpy==1.26.4`, `pandas==2.2.1`, etc.).
* **Dockerfile**: defines the *entire environment* — not just Python, but also:

  * Which operating system (Ubuntu, Debian, Alpine)
  * Which Python version
  * System packages (e.g. `apt-get install git wget`)
  * Copying your source code into the image
  * What command to run when the container starts

So `requirements.txt` is **one input** to the Dockerfile, but the Dockerfile is **the master recipe**.

---

## 3. Your Dockerfile (Explained)

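Here’s your current file. The comments sit on their own lines because a Dockerfile only treats `#` as a comment at the start of a line; a trailing `# ...` would be passed to the instruction as an argument and break the build.

```dockerfile
# Start from a base image with Python 3.11 installed
FROM python:3.11-slim

# Set the working directory inside the container
WORKDIR /app

# Copy requirements into the image
COPY requirements.txt ./
# Install dependencies (if any). `|| true` ignores install failures,
# so drop it in real projects where a failed install should stop the build.
RUN pip install --no-cache-dir -r requirements.txt || true

# Copy your code into the container
COPY hello.py ./

# Default command when the container runs
CMD ["python", "hello.py"]
```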

---

## 4. What Usually Goes Into a Dockerfile?

Typical sections for a real ML/AI project:

1. **Base image**

   ```dockerfile
   FROM python:3.11-slim
   ```

   (Or `FROM pytorch/pytorch:2.3.0-cuda11.8-cudnn8-runtime` if you need GPUs.)

2. **System dependencies**

   ```dockerfile
   RUN apt-get update && apt-get install -y \
       git wget curl build-essential \
    && rm -rf /var/lib/apt/lists/*
   ```

3. **Python dependencies**

   ```dockerfile
   COPY requirements.txt .
   RUN pip install --no-cache-dir -r requirements.txt
   ```

4. **Copy your project code**

   ```dockerfile
   COPY . .
   ```

5. **Environment variables (optional)**

   ```dockerfile
   ENV PYTHONUNBUFFERED=1
   ENV MODEL_PATH=/app/models/model.pkl
   ```

6. **Entrypoint/command**

   ```dockerfile
   CMD ["python", "main.py"]
   ```
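
The `ENV` values from item 5 become ordinary environment variables inside the container, so application code can read them at startup. A minimal sketch, assuming a hypothetical `Settings` class and `load_settings` helper (neither is part of any library):

```python
import os
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Settings:
    model_path: Path
    unbuffered: bool


def load_settings(env=None) -> Settings:
    """Build settings from environment variables, with defaults matching the Dockerfile."""
    env = os.environ if env is None else env
    return Settings(
        model_path=Path(env.get("MODEL_PATH", "/app/models/model.pkl")),
        # Python treats any non-empty PYTHONUNBUFFERED value as "on"
        unbuffered=env.get("PYTHONUNBUFFERED", "") != "",
    )


if __name__ == "__main__":
    print(load_settings())
```

Because the values come from the environment, `docker run -e MODEL_PATH=...` can override the image default without a rebuild.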

---

## 5. Setting Versions

Yes ✅ — the Dockerfile is where you **pin versions** to guarantee reproducibility. Examples:

* Pin the **base image**:

  ```dockerfile
  FROM python:3.11.5-slim
  ```
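* Pin system dependencies (the version string here is illustrative; it must exist in the base image’s package repository):

  ```dockerfile
  RUN apt-get update && apt-get install -y libgl1=1.3.2-1
  ```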
* Pin Python libraries (via `requirements.txt`):

  ```text
  numpy==1.26.4
  pandas==2.2.1
  scikit-learn==1.5.0
  ```

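That way, 6 months later, you (or a teammate) can rebuild the image and get a **near-identical environment**. Image tags can be re-pointed and unpinned transitive dependencies can drift, so for strict reproducibility also pin the base image by digest and pin every dependency.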
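
A quick way to catch unpinned packages is to scan `requirements.txt` before building. The check below is a simplified sketch: the regular expression and the function name `unpinned_requirements` are assumptions for illustration, and it does not handle environment markers, URLs, or `-r` includes.

```python
import re

# package name, optional [extras], then an exact == version
PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]+\])?==[^=\s]+$")


def unpinned_requirements(text: str) -> list[str]:
    """Return requirement lines that do not use an exact `==` pin."""
    problems = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if line and not PINNED.match(line):
            problems.append(line)
    return problems


if __name__ == "__main__":
    example = "numpy==1.26.4\npandas>=2.2\nrequests\n"
    print(unpinned_requirements(example))
```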

---

## 6. Analogy

* `requirements.txt` = the **grocery list** (ingredients).
* `Dockerfile` = the **cooking recipe** (what kitchen you’re in, how you prepare food, and what to cook when you serve).

---

✅ So yes: the Dockerfile is where you define the **full reproducible environment** for your project, and `requirements.txt` is just one piece of it.







## 1. Images vs. Containers

* **Image** = the *blueprint* (like a frozen snapshot of a filesystem + metadata).
* **Container** = a *running instance* of that image.

So when you do:

```bash
docker build -t hello-docker:0.1 .
```

You’re creating a **new image** on your machine, tagged `hello-docker:0.1`.

When you do:

```bash
docker run hello-docker:0.1
```

You spin up a **new container** based on that image.

---

## 2. Where You Are Right Now

You mentioned earlier you were inside an Ubuntu container (`docker run -it ubuntu bash`).

That means:

* You’re working **inside a container** that was started from the `ubuntu:latest` image.
* Inside that container, you created files (`hello.py`, `requirements.txt`, `Dockerfile`).

So when you type:

```bash
docker build -t hello-docker:0.1 .
```

you’re asking Docker (the daemon) to build an image from your **current directory** (`.`).
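But here’s the catch: Docker is running on your Mac, not inside the container you’re in. A stock `ubuntu` image doesn’t even include the `docker` CLI, so inside the container the command fails with `command not found`.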

👉 Unless you’ve mounted your current project folder into that Ubuntu container, your Mac’s Docker daemon doesn’t know about the files you created inside the container.

---

## 3. The Usual Workflow

Most of the time, you don’t develop *inside* a container. Instead, you:

1. Create project files (`hello.py`, `Dockerfile`, `.dockerignore`) on your host machine (Mac).
2. Run `docker build ...` from your Mac terminal (outside any container).
3. Docker builds the image.
4. You run containers from that image.

That way, you never lose files when containers stop, and Docker has direct access to your build context (your project folder).

---

## 4. Why You’re Confused Right Now

* You’re *inside* an Ubuntu container and creating files there.
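* But that container is “sandboxed.” Its files live only in that container’s writable layer, so they are lost when the container is removed (`docker rm`, or `--rm` at run time) unless you copy them out, commit, or mount a host folder.
* `docker build` won’t work inside that container: the stock `ubuntu` image has no Docker CLI, and even with one installed it could not reach your Mac’s Docker daemon without extra setup.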

---

## 5. How to Fix / Best Practice

If you want your image to be **usable outside this temporary Ubuntu container**, do this:

1. Exit the Ubuntu container:

   ```bash
   exit
   ```

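2. On your Mac, go to the project folder (`hello-docker/`) where you saved your `hello.py`, `Dockerfile`, `.dockerignore`, and `requirements.txt`. If those files exist only inside the container, copy them out first with `docker cp <container_id>:/hello-docker ./hello-docker`.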

3. Build the image there:

   ```bash
   docker build -t hello-docker:0.1 .
   ```

4. Run it:

   ```bash
   docker run --rm hello-docker:0.1
   ```

Now you’ve got a proper image stored on your Mac’s Docker engine — reusable and portable.

---

✅ So: Yes, there is already a container running Ubuntu. But building `hello-docker:0.1` is about creating a **new, reusable image** on your host machine. That’s the real power of Docker — once built, you can share that image anywhere.





## 7. Run the container

```bash
docker run --rm hello-docker:0.1
```

You should see something like:

```
Hello, world 👋
Python: 3.11.9
OS: Linux-6.10.14-linuxkit-aarch64-with-glibc2.36
Args: []
```

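Pass in env vars + args. Anything after the image name replaces the `CMD`, so the full command is repeated here (switching the Dockerfile to `ENTRYPOINT ["python", "hello.py"]` would let you pass only `foo bar`):

```bash
docker run --rm -e NAME="Micah" hello-docker:0.1 python hello.py foo bar
```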

Output:

```
Hello, Micah 👋
Python: 3.11.9
OS: Linux-...
Args: ['foo', 'bar']
```
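
To see how the `NAME` variable and the extra arguments reach the script without involving Docker, the greeting logic can be pulled into a pure function. This is a sketch that mirrors `hello.py`; `build_greeting` is an illustrative name, not part of the original script.

```python
import os
import sys


def build_greeting(environ, argv) -> list[str]:
    """Return the greeting and args lines that hello.py would print."""
    name = environ.get("NAME", "world")   # same default as hello.py
    args = argv[1:]                       # everything after the script name
    return [f"Hello, {name} 👋", f"Args: {args}"]


if __name__ == "__main__":
    for line in build_greeting(os.environ, sys.argv):
        print(line)
```

Passing a plain dict and list instead of `os.environ` and `sys.argv` makes the behavior easy to check in a unit test.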

---

✅ Boom — you’ve now built your **first custom image + container** 🎉. This is the core workflow:

* Write code →
* Dockerize with a `Dockerfile` →
* Build →
* Run anywhere.

---



Moving **in and out of containers** is one of the most important things to get comfortable with in Docker. Let’s break it down step by step.

---

## 1. How to Exit a Container

If you’re currently inside a running container (you see a prompt like `root@<id>:/#`), you’ve got a few options:

* **Graceful exit (stops the container)**

  ```bash
  exit
  ```

  or
  press `Ctrl+D`

  → This ends your shell. Because that shell is the container’s main process, the container **stops** and you’re back on your Mac terminal.

* **Detach (leave it running in background)**
  Press `Ctrl+P` then `Ctrl+Q` (in sequence).
  → This leaves the container running, but you’re no longer attached to it.

---

## 2. Seeing What’s Running

Back on your Mac terminal:

```bash
docker ps        # shows running containers
docker ps -a     # shows all containers (running or stopped)
```

You’ll see something like:

```
CONTAINER ID   IMAGE     COMMAND   CREATED        STATUS
ae3e71e3a2ab   ubuntu    "bash"    5 minutes ago  Exited (0) ...
```

---

## 3. Getting Back Into a Container

You have a few ways:

* **Restart and attach to a stopped container**

  ```bash
  docker start -ai <container_id>
  ```

  (`-a` attach, `-i` interactive)

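* **Open a new shell in a running container**

  ```bash
  docker exec -it <container_id> bash
  ```

  → This opens a *new shell* inside it, alongside the main process. Exiting this shell does not stop the container, which makes it the safer way to inspect a long-running container.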

---

## 4. Running a Fresh Container

Sometimes it’s cleaner to just start a new one:

```bash
docker run -it ubuntu bash
```

→ Gives you a brand new Ubuntu shell every time.

---

## 5. Mental Model

* **Image** = frozen template (like `ubuntu:latest`).
* **Container** = a *running or stopped instance* of an image.
* **exit** = ends your shell; the container stops if that shell was its main process (detaching with `Ctrl+P`, `Ctrl+Q` leaves it running).
* **docker ps -a** = shows where your containers are.
* **docker start / exec** = gets you back in.

---

✅ So, to answer your question simply:

* To get out: `exit`
* To get back in: `docker ps -a` → copy container ID → `docker start -ai <id>` (if stopped) or `docker exec -it <id> bash` (if running).



In [None]:
micahshull@Micahs-iMac ~ % docker ps
CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES
micahshull@Micahs-iMac ~ % docker ps -a
CONTAINER ID   IMAGE     COMMAND   CREATED       STATUS                      PORTS     NAMES
ae3e71e3a2ab   ubuntu    "bash"    3 hours ago   Exited (0) 17 seconds ago             peaceful_feistel

You exited, and since that container was running `bash` as its main process, typing `exit` ended that process and the container **stopped**.

Let’s unpack what you’re seeing:

---

## 1. `docker ps`

```bash
docker ps
```

Shows only **currently running** containers.
👉 Output was empty, so no containers are running right now.

---

## 2. `docker ps -a`

```bash
docker ps -a
```

Shows **all containers** (running *and* stopped).

You see:

```
ae3e71e3a2ab   ubuntu    "bash"    3 hours ago   Exited (0) 17 seconds ago   peaceful_feistel
```

* **CONTAINER ID**: `ae3e71e3a2ab` → unique ID for this container.
* **IMAGE**: `ubuntu` → the image it was created from.
* **COMMAND**: `"bash"` → the process it was running.
* **STATUS**: `Exited (0)` → stopped cleanly (0 = success).
* **NAMES**: `peaceful_feistel` → Docker gives containers random names if you don’t provide one.
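
If you ever need to read that exit code in a script, the STATUS text is regular enough to parse. A small sketch, assuming the status strings look like the ones shown above; `exit_code` is an illustrative helper name.

```python
import re

STATUS = re.compile(r"^Exited \((?P<code>\d+)\)")


def exit_code(status: str):
    """Return the exit code from a STATUS value like 'Exited (0) 17 seconds ago', else None."""
    match = STATUS.match(status)
    return int(match.group("code")) if match else None


if __name__ == "__main__":
    for status in ["Exited (0) 17 seconds ago", "Up 5 minutes"]:
        print(repr(status), "->", exit_code(status))
```

A result of `None` means the container has not exited (for example, it is still up).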

---

## 3. Restarting That Container

Since it stopped, you can start it again:

```bash
docker start -ai ae3e71e3a2ab
```

* `-a` = attach so you see its output.
* `-i` = interactive (so you can type commands).

This will drop you back into your Ubuntu shell.

---

## 4. Opening a New Shell Inside It

If you want to open *another* shell without restarting:

```bash
docker exec -it ae3e71e3a2ab bash
```

This works only if the container is **still running**. Since yours is exited, you’d need `docker start` first.

---

## 5. Running a Fresh Container

You can always just start a new one (clean slate):

```bash
docker run -it ubuntu bash
```

Every `docker run` from an image creates a **brand new container**.

---

## 6. Mental Model

* **Image**: blueprint (`ubuntu`)
* **Container**: instance of that image (`peaceful_feistel`)
* **Exited**: container isn’t gone, it’s stopped; its filesystem is kept until you remove it with `docker rm`
* **Run vs Start**:

  * `docker run` → makes a new container from an image
  * `docker start` → restarts an existing stopped container
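
The run-versus-start distinction can be captured in a toy model. This sketch only imitates the behavior described above and says nothing about how Docker is implemented; `ToyDocker` and `ToyContainer` are invented names.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ToyContainer:
    container_id: int
    image: str
    status: str = "running"
    files: set = field(default_factory=set)


class ToyDocker:
    """A toy model of run/start semantics; not how Docker is implemented."""

    def __init__(self):
        self._ids = count(1)
        self.containers = {}

    def run(self, image: str) -> ToyContainer:
        """Like `docker run`: always creates a brand new container from the image."""
        container = ToyContainer(next(self._ids), image)
        self.containers[container.container_id] = container
        return container

    def stop(self, container_id: int) -> None:
        """Main process exits: the container stops but keeps its files."""
        self.containers[container_id].status = "exited"

    def start(self, container_id: int) -> ToyContainer:
        """Like `docker start`: restarts an existing container with its files intact."""
        container = self.containers[container_id]
        container.status = "running"
        return container


if __name__ == "__main__":
    docker = ToyDocker()
    first = docker.run("ubuntu")
    first.files.add("/mydir/a.txt")
    docker.stop(first.container_id)
    fresh = docker.run("ubuntu")
    print("fresh container files:", fresh.files)
    print("restarted container files:", docker.start(first.container_id).files)
```

The fresh container starts empty, while the restarted one still has the file created earlier.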

---

✅ You’re in control now: you can either **restart that stopped container** or **spin up a new one** anytime.



