procjail

A small tool I wrote to stop lying to myself about containers.

Preface

This is not a product.

This is not a runtime. This is not a replacement for Docker, containerd, runc, or Kubernetes.

This is a learning and diagnostic tool written after too many production incidents where the abstractions disappeared and only the Linux kernel was left to argue with.

If you are looking for something ergonomic, this is the wrong place. If you want to understand why things break, keep reading.

Why I wrote this

I have spent enough time in production to notice a pattern:

Everything looks clean on Day-1.
Everything is YAML, dashboards, and green checks.

Day-2 is different.

Day-2 is:

OOM kills with no obvious cause
services that ignore SIGTERM
containers that refuse to shut down
zombie processes accumulating quietly
“it works locally” but not under load

When those things happen, Docker stops helping. Kubernetes stops helping. Logs stop helping.

What’s left is always the same:

processes
signals
namespaces
cgroups
kernel behavior

At some point I had to admit something uncomfortable:

I could use containers very well,
but I could not always explain their failures without hand-waving.

So instead of reading more blog posts, I decided to build the smallest possible thing that would force me to confront the kernel directly.

That thing is procjail.

The core idea

Containers are not magic.

A “container” is just:

a Linux process
started with a modified view of the system (namespaces)
constrained by kernel-enforced resource limits (cgroups)
often running as PID 1, whether it wants to or not

Everything else is tooling layered on top of that fact.

procjail removes the layers.
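
To make "a process started with a modified view of the system" concrete, here is a minimal sketch of how a Go program can ask the kernel for new PID, mount, and UTS namespaces when it starts a child. This is an illustration, not procjail's actual source: the function name `jailedCommand` is hypothetical, and the sketch only builds the command so that it can be inspected without root.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"syscall"
)

// jailedCommand builds (but does not start) a command that would run in
// fresh PID, mount, and UTS namespaces. The name is hypothetical.
// Starting the command requires root or CAP_SYS_ADMIN.
func jailedCommand(path string, args ...string) *exec.Cmd {
	cmd := exec.Command(path, args...)
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
	cmd.SysProcAttr = &syscall.SysProcAttr{
		// Each flag asks clone(2) for one new namespace.
		Cloneflags: syscall.CLONE_NEWPID | syscall.CLONE_NEWNS | syscall.CLONE_NEWUTS,
	}
	return cmd
}

func main() {
	cmd := jailedCommand("/bin/sh")
	fmt.Printf("would start %s with clone flags %#x\n", cmd.Path, cmd.SysProcAttr.Cloneflags)
	// Calling cmd.Run() here (as root) would make /bin/sh PID 1 of the new
	// PID namespace.
}
```

Until /proc is mounted again from inside the new namespaces, tools such as ps still show the host's process list, because a /proc mount reflects the PID namespace of the process that mounted it.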

What procjail does

procjail:

creates new Linux namespaces explicitly:
  - PID
  - mount
  - UTS
mounts /proc so process visibility is real
optionally applies cgroup v2 memory limits
launches a target program as PID 1
forwards signals explicitly
enforces a graceful shutdown window
lets the kernel do exactly what the kernel does

Nothing more.

Every behavior in procjail can be explained by pointing to:

a syscall
a namespace
a cgroup file
a kernel rule

If it can’t be explained that way, it doesn’t belong here.
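
The signal forwarding and graceful shutdown window listed above can be sketched as follows. This is a hedged illustration, not procjail's implementation: `runWithGrace` is a hypothetical name, the 5 second window in `main` is an arbitrary choice, and the supervisor here is assumed to sit outside the child rather than inside its namespace.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"os/signal"
	"syscall"
	"time"
)

// runWithGrace starts the named program, forwards SIGTERM and SIGINT to it,
// and sends SIGKILL if it is still running once the grace period after the
// first forwarded signal has elapsed. It returns the child's exit error.
func runWithGrace(grace time.Duration, name string, args ...string) error {
	cmd := exec.Command(name, args...)
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
	if err := cmd.Start(); err != nil {
		return err
	}

	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, syscall.SIGTERM, syscall.SIGINT)
	defer signal.Stop(sigs)

	done := make(chan error, 1)
	go func() { done <- cmd.Wait() }()

	var deadline <-chan time.Time // nil (never ready) until a signal arrives
	for {
		select {
		case err := <-done:
			return err
		case s := <-sigs:
			_ = cmd.Process.Signal(s) // forward the signal explicitly
			if deadline == nil {
				deadline = time.After(grace)
			}
		case <-deadline:
			_ = cmd.Process.Kill() // grace window expired: SIGKILL
		}
	}
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: grace <program> [args...]")
		os.Exit(2)
	}
	err := runWithGrace(5*time.Second, os.Args[1], os.Args[2:]...)
	fmt.Println("child finished:", err)
}
```

Run it against a program that ignores SIGTERM, send the supervisor a SIGTERM, and the child should disappear only when the window expires. That is the same shape as a container that "refuses to shut down" until its runtime gives up and kills it.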

What procjail very intentionally does not do

This is important.

procjail does not:

pull images
manage registries
set up networking
configure overlay filesystems
hide /proc or /sys
paper over kernel behavior

Those are Day-1 conveniences.

This project is about Day-2 failures.

PID 1 is the entire point

If there is one reason this tool exists, it is this:

PID 1 is not a normal process.

When a process runs as PID 1:

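default signal actions stop applying: a signal with no installed handler is dropped, so an unhandled SIGTERM does nothing
ignored signals stay ignored, and only SIGKILL or SIGSTOP sent from an ancestor namespace can force the issue
orphaned processes are reparented to you, so zombie reaping becomes your responsibility
when PID 1 exits, the kernel sends SIGKILL to every other process in the PID namespace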

Most applications are not written with this in mind. Most engineers don’t notice until production breaks.

procjail forces you to be PID 1.

There is no init system. There is no supervisor. There is only you and the kernel.
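
The zombie reaping duty described above comes down to one loop around wait4(2). The sketch below is an assumption about how a minimal PID 1 might do it, not code taken from procjail, and `reapChildren` is a hypothetical name. The demo in `main` assumes /bin/true exists.

```go
package main

import (
	"fmt"
	"syscall"
	"time"
)

// reapChildren collects every child that has already exited, without
// blocking, and returns their PIDs. A PID 1 would typically call this each
// time it receives SIGCHLD.
func reapChildren() []int {
	var reaped []int
	for {
		var status syscall.WaitStatus
		pid, err := syscall.Wait4(-1, &status, syscall.WNOHANG, nil)
		if err == syscall.EINTR {
			continue // interrupted by a signal: retry
		}
		if err != nil || pid <= 0 {
			// pid == 0: children exist but none has exited yet.
			// ECHILD: there are no children at all.
			return reaped
		}
		reaped = append(reaped, pid)
	}
}

func main() {
	// Demo: start a short-lived child and deliberately never wait on it.
	pid, err := syscall.ForkExec("/bin/true", []string{"true"}, nil)
	if err != nil {
		fmt.Println("could not start demo child:", err)
		return
	}
	time.Sleep(200 * time.Millisecond) // by now the child is a zombie
	fmt.Printf("started %d, reaped %v\n", pid, reapChildren())
}
```

One design caveat: waiting on -1 collects any child, so mixing this loop with os/exec's Wait in the same process can steal an exit status the other caller was expecting.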

What I expected to learn

I expected to:

confirm how namespaces are wired together
better understand cgroup memory limits
see how signals propagate

That all happened.

What I did not expect

I did not expect how little code it takes to reproduce real container failure modes.

A few syscalls. A few mounts. One badly behaved process.

And suddenly:

SIGTERM doesn’t shut things down
children outlive their parents
memory limits kill processes abruptly
debugging requires reading /proc directly

It became very clear very quickly that containers are thin abstractions.

That’s not a criticism. It’s a warning.
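
"Reading /proc directly" is less exotic than it sounds. As a small sketch, the hypothetical helper `procState` below pulls the State field out of the text of /proc/<pid>/status, which is where a zombie shows up as "Z (zombie)".

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// procState extracts the "State:" field from the text of a
// /proc/<pid>/status file, for example "Z (zombie)" or "S (sleeping)".
// The boolean is false when no State line is present.
func procState(status string) (string, bool) {
	for _, line := range strings.Split(status, "\n") {
		if strings.HasPrefix(line, "State:") {
			return strings.TrimSpace(strings.TrimPrefix(line, "State:")), true
		}
	}
	return "", false
}

func main() {
	pid := "self" // inspect this process unless a PID is given
	if len(os.Args) > 1 {
		pid = os.Args[1]
	}
	data, err := os.ReadFile("/proc/" + pid + "/status")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	state, ok := procState(string(data))
	fmt.Printf("pid=%s state=%q found=%v\n", pid, state, ok)
}
```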

Why the root filesystem is read-only by default

By default, procjail remounts / as read-only.

Not because it’s secure. But because it is educational.

It exposes an assumption many applications make:

“Of course I can write to /tmp or /var/log.”

In real production environments, that assumption is often wrong.

When something breaks because the filesystem is read-only, the lesson is immediate and concrete.

There is an escape hatch (--rw), but the failure is the point.
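
For illustration, here is one way a read-only root can be set up and then observed from the application side. Both function names are hypothetical and the remount shown is an assumption, not necessarily the call procjail makes. It uses a bind remount, which changes the flags of that one mount rather than of the underlying filesystem.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"syscall"
)

// remountRootReadOnly flips the current mount of / to read-only using a
// bind remount. It needs CAP_SYS_ADMIN and is meant to be called inside a
// new mount namespace. Assumption: one possible approach, shown for
// illustration only; main does not call it.
func remountRootReadOnly() error {
	flags := uintptr(syscall.MS_REMOUNT | syscall.MS_BIND | syscall.MS_RDONLY)
	return syscall.Mount("", "/", "", flags, "")
}

// probeWrite tries to create and remove a scratch file in dir and reports
// what happened: "writable", "read-only", or "error: <details>".
func probeWrite(dir string) string {
	f, err := os.CreateTemp(dir, "probe-*")
	switch {
	case err == nil:
		f.Close()
		os.Remove(f.Name())
		return "writable"
	case errors.Is(err, syscall.EROFS):
		return "read-only"
	default:
		return "error: " + err.Error()
	}
}

func main() {
	// The two paths applications most often assume they can write to.
	for _, dir := range []string{"/tmp", "/var/log"} {
		fmt.Printf("%-8s %s\n", dir, probeWrite(dir))
	}
}
```

Because the remount is not recursive, separate mounts below / (a tmpfs on /tmp, for example) keep their own flags, so probing each path is more reliable than assuming.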

Resource limits are not suggestions

When a cgroup hits its memory limit and the kernel cannot reclaim enough memory, it does not negotiate. It kills processes.

procjail applies memory limits explicitly through cgroup v2 so you can:

trigger OOM kills
observe exit codes
correlate behavior with kernel enforcement

This mirrors real infrastructure incidents almost uncomfortably well.
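
As a sketch of the bookkeeping around a cgroup v2 memory limit, the helpers below convert a flag value such as 64M into the byte count written to the cgroup's memory.max file, and read the oom_kill counter from the text of its memory.events file. The function names are hypothetical, and treating M as a binary multiple (1024 * 1024) is an assumption about how the --memory flag is interpreted.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseMemory converts a size such as "64M" or "1G" into bytes.
// Assumption: suffixes are binary multiples (K = 1024 bytes).
func parseMemory(s string) (int64, error) {
	t := strings.ToUpper(strings.TrimSpace(s))
	mult := int64(1)
	switch {
	case strings.HasSuffix(t, "K"):
		mult = 1 << 10
	case strings.HasSuffix(t, "M"):
		mult = 1 << 20
	case strings.HasSuffix(t, "G"):
		mult = 1 << 30
	}
	if mult != 1 {
		t = t[:len(t)-1] // drop the suffix before parsing the number
	}
	n, err := strconv.ParseInt(t, 10, 64)
	if err != nil || n <= 0 {
		return 0, fmt.Errorf("invalid memory size %q", s)
	}
	return n * mult, nil
}

// oomKillCount reads the oom_kill counter from the text of a cgroup v2
// memory.events file. It returns -1 when the counter is not present.
func oomKillCount(events string) int64 {
	for _, line := range strings.Split(events, "\n") {
		fields := strings.Fields(line)
		if len(fields) == 2 && fields[0] == "oom_kill" {
			if n, err := strconv.ParseInt(fields[1], 10, 64); err == nil {
				return n
			}
		}
	}
	return -1
}

func main() {
	limit, err := parseMemory("64M")
	fmt.Println("bytes for memory.max:", limit, err)
	// Sample memory.events content, made up for this demo.
	sample := "low 0\nhigh 0\nmax 12\noom 1\noom_kill 1\n"
	fmt.Println("oom_kill:", oomKillCount(sample))
}
```

A process removed by the OOM killer dies from SIGKILL, which a shell reports as exit status 137 (128 + 9). Comparing that exit status with a rising oom_kill counter is what ties the failure back to kernel enforcement.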

How to use procjail

Build it:

go build ./cmd/procjail


Run a program inside it:

sudo ./procjail /bin/sh


Apply a memory limit:

sudo ./procjail --memory 64M ./your-binary


Make the filesystem writable (if you must):

sudo ./procjail --rw ./your-binary

Examples

The examples in this repository are not “hello world”.

They are deliberately chosen to demonstrate:

PID namespace behavior
signal handling
shutdown semantics
what happens when things go wrong

They are meant to be run, observed, and reasoned about — not copied into production.


Who this is for

This project is for:

engineers who debug production incidents
people who have been paged at 3am
anyone who wants fewer surprises from containers
people who prefer truth over convenience

If you are early in your journey, this might feel harsh.
That’s okay. Linux is harsh.

Who this is not for

This is not for:

people looking for a container runtime
people who want abstractions to keep the kernel out of sight
people who want ergonomic defaults
people who want safety rails

Those things already exist — and they are good at what they do.

This is about understanding what’s underneath them.

Closing thoughts

I am not advocating that everyone stop using Docker or Kubernetes.

I am advocating that we stop treating them as magic.

The kernel is doing the real work.
When things fail, it is the kernel you are debugging — whether you like it or not.

procjail exists to make that impossible to forget.

License

MIT.
Use it, break it, learn from it.

Just don’t lie to yourself about what it’s doing.

Emmanuel326/procjail
