A small tool I wrote to stop lying to myself about containers.
This is not a product.
This is not a runtime. This is not a replacement for Docker, containerd, runc, or Kubernetes.
This is a learning and diagnostic tool written after too many production incidents where the abstractions disappeared and only the Linux kernel was left to argue with.
If you are looking for something ergonomic, this is the wrong place. If you want to understand why things break, keep reading.
I have spent enough time in production to notice a pattern:
Everything looks clean on Day-1.
Everything is YAML, dashboards, and green checks.
Day-2 is different.
Day-2 is:
- OOM kills with no obvious cause
- services that ignore SIGTERM
- containers that refuse to shut down
- zombie processes accumulating quietly
- “it works locally” but not under load
When those things happen, Docker stops helping. Kubernetes stops helping. Logs stop helping.
What’s left is always the same:
- processes
- signals
- namespaces
- cgroups
- kernel behavior
At some point I had to admit something uncomfortable:
I could use containers very well,
but I could not always explain their failures without hand-waving.
So instead of reading more blog posts, I decided to build the smallest possible thing that would force me to confront the kernel directly.
That thing is procjail.
Containers are not magic.
A “container” is just:
- a Linux process
- started with a modified view of the system (namespaces)
- constrained by kernel-enforced resource limits (cgroups)
- often running as PID 1, whether it wants to or not
Everything else is tooling layered on top of that fact.
procjail removes the layers.
procjail:
- creates new Linux namespaces explicitly
  - PID
  - mount
  - UTS
- mounts /proc so process visibility is real
- optionally applies cgroup v2 memory limits
- launches a target program as PID 1
- forwards signals explicitly
- enforces a graceful shutdown window
- lets the kernel do exactly what the kernel does
Nothing more.
Every behavior in procjail can be explained by pointing to:
- a syscall
- a namespace
- a cgroup file
- a kernel rule
If it can’t be explained that way, it doesn’t belong here.
This is important.
procjail does not:
- pull images
- manage registries
- set up networking
- configure overlay filesystems
- hide /proc or /sys
- paper over kernel behavior
Those are Day-1 conveniences.
This project is about Day-2 failures.
If there is one reason this tool exists, it is this:
PID 1 is not a normal process.
When a process runs as PID 1:
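- default signal handling changes: the kernel delivers a signal to PID 1 only if PID 1 installed a handler for it
- ignored signals stay ignored: an unhandled SIGTERM is dropped, and only SIGKILL or SIGSTOP sent from a parent namespace gets through regardless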
- zombie reaping becomes your responsibility
- exiting PID 1 tears down the entire environment
Most applications are not written with this in mind. Most engineers don’t notice until production breaks.
procjail forces you to be PID 1.
There is no init system. There is no supervisor. There is only you and the kernel.
I expected to:
- confirm how namespaces are wired together
- better understand cgroup memory limits
- see how signals propagate
That all happened.
I did not expect how little code it takes to reproduce real container failure modes.
A few syscalls. A few mounts. One badly behaved process.
And suddenly:
- SIGTERM doesn’t shut things down
- children outlive their parents
- memory limits kill processes abruptly
- debugging requires reading /proc directly
It became very clear very quickly that containers are thin abstractions.
That’s not a criticism. It’s a warning.
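Reading /proc directly is less exotic than it sounds. As a minimal sketch (parseStatus and procStatus are names I made up for this example), the following walks /proc/<pid>/status and prints every process whose state is Z, which is how accumulated zombies show up:

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// procStatus holds the few fields of /proc/<pid>/status this sketch uses.
type procStatus struct {
	Name  string
	State string // for example "Z (zombie)" or "S (sleeping)"
	PPid  string
}

// parseStatus extracts Name, State, and PPid from the text of a status file.
func parseStatus(text string) procStatus {
	var st procStatus
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		key, val, ok := strings.Cut(sc.Text(), ":")
		if !ok {
			continue
		}
		val = strings.TrimSpace(val)
		switch key {
		case "Name":
			st.Name = val
		case "State":
			st.State = val
		case "PPid":
			st.PPid = val
		}
	}
	return st
}

func main() {
	paths, _ := filepath.Glob("/proc/[0-9]*/status")
	for _, p := range paths {
		data, err := os.ReadFile(p)
		if err != nil {
			continue // the process exited between the glob and the read
		}
		st := parseStatus(string(data))
		if strings.HasPrefix(st.State, "Z") {
			fmt.Printf("zombie: %s name=%s ppid=%s\n", p, st.Name, st.PPid)
		}
	}
}
```

The PPid column is the useful part: it names the process that should have called wait and did not.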
By default, procjail remounts / as read-only.
Not because it is secure, but because it is educational.
It exposes an assumption many applications make:
“Of course I can write to /tmp or /var/log.”
In real production environments, that assumption is often wrong.
When something breaks because the filesystem is read-only, the lesson is immediate and concrete.
There is an escape hatch (--rw), but the failure is the point.
When a cgroup hits its memory limit and the kernel cannot reclaim enough to get back under it, the kernel does not negotiate. It kills processes.
procjail applies memory limits explicitly through cgroup v2 so you can:
- trigger OOM kills
- observe exit codes
- correlate behavior with kernel enforcement
This mirrors real infrastructure incidents almost uncomfortably well.
Build it:
go build ./cmd/procjail
Run a program inside it:
sudo ./procjail /bin/sh
Apply a memory limit:
sudo ./procjail --memory 64M ./your-binary
Make the filesystem writable (if you must):
sudo ./procjail --rw ./your-binary
Examples
The examples in this repository are not “hello world”.
They are deliberately chosen to demonstrate:
- PID namespace behavior
- signal handling
- shutdown semantics
- what happens when things go wrong
They are meant to be run, observed, and reasoned about — not copied into production.
Who this is for
This project is for:
- engineers who debug production incidents
- people who have been paged at 3am
- anyone who wants fewer surprises from containers
- people who prefer truth over convenience
If you are early in your journey, this might feel harsh.
That’s okay. Linux is harsh.
Who this is not for
This is not for:
- people looking for a container runtime
- people who want abstractions to disappear
- people who want ergonomic defaults
- people who want safety rails
Those things already exist — and they are good at what they do.
This is about understanding what’s underneath them.
Closing thoughts
I am not advocating that everyone stop using Docker or Kubernetes.
I am advocating that we stop treating them as magic.
The kernel is doing the real work.
When things fail, it is the kernel you are debugging — whether you like it or not.
procjail exists to make that impossible to forget.
License
MIT.
Use it, break it, learn from it.
Just don’t lie to yourself about what it’s doing.