A small Go tool for process isolation and cgroup-based resource limits, built to understand container Day-2 failures.

Emmanuel326/procjail

procjail

A small tool I wrote to stop lying to myself about containers.


Preface

This is not a product.

This is not a runtime. This is not a replacement for Docker, containerd, runc, or Kubernetes.

This is a learning and diagnostic tool written after too many production incidents where the abstractions disappeared and only the Linux kernel was left to argue with.

If you are looking for something ergonomic, this is the wrong place. If you want to understand why things break, keep reading.


Why I wrote this

I have spent enough time in production to notice a pattern:

Everything looks clean on Day-1.
Everything is YAML, dashboards, and green checks.

Day-2 is different.

Day-2 is:

  • OOM kills with no obvious cause
  • services that ignore SIGTERM
  • containers that refuse to shut down
  • zombie processes accumulating quietly
  • “it works locally” but not under load

When those things happen, Docker stops helping. Kubernetes stops helping. Logs stop helping.

What’s left is always the same:

  • processes
  • signals
  • namespaces
  • cgroups
  • kernel behavior

At some point I had to admit something uncomfortable:

I could use containers very well,
but I could not always explain their failures without hand-waving.

So instead of reading more blog posts, I decided to build the smallest possible thing that would force me to confront the kernel directly.

That thing is procjail.


The core idea

Containers are not magic.

A “container” is just:

  • a Linux process
  • started with a modified view of the system (namespaces)
  • constrained by kernel-enforced resource limits (cgroups)
  • often running as PID 1, whether it wants to or not

Everything else is tooling layered on top of that fact.

procjail removes the layers.
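To make the list above concrete, here is a minimal sketch of "a process started with a modified view of the system". It is not procjail's actual source: `jailedCommand` and `jailFlags` are hypothetical names, and the sketch only requests the three namespaces (it does not mount /proc or touch cgroups). It is Linux-only and needs root or CAP_SYS_ADMIN to succeed.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"syscall"
)

// jailFlags asks clone(2) for new PID, mount, and UTS namespaces.
const jailFlags = syscall.CLONE_NEWPID | syscall.CLONE_NEWNS | syscall.CLONE_NEWUTS

// jailedCommand is a hypothetical helper: it prepares (but does not start)
// a command whose process becomes PID 1 of a fresh PID namespace.
func jailedCommand(path string, args ...string) *exec.Cmd {
	cmd := exec.Command(path, args...)
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
	cmd.SysProcAttr = &syscall.SysProcAttr{Cloneflags: jailFlags}
	return cmd
}

func main() {
	// Inside the new PID namespace the shell should see itself as PID 1.
	cmd := jailedCommand("/bin/sh", "-c", "echo my pid is $$")
	if err := cmd.Run(); err != nil {
		fmt.Fprintln(os.Stderr, "run failed (root or CAP_SYS_ADMIN is required):", err)
	}
}
```

Run without privileges, the clone call is refused and the program only prints the error, which is itself a small lesson in what the kernel is guarding.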


What procjail does

procjail:

  • creates new Linux namespaces explicitly
    • PID
    • mount
    • UTS
  • mounts a fresh /proc so process listings reflect the new PID namespace
  • optionally applies cgroup v2 memory limits
  • launches a target program as PID 1
  • forwards signals explicitly
  • enforces a graceful shutdown window
  • lets the kernel do exactly what the kernel does

Nothing more.

Every behavior in procjail can be explained by pointing to:

  • a syscall
  • a namespace
  • a cgroup file
  • a kernel rule

If it can’t be explained that way, it doesn’t belong here.
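As an illustration of "forwards signals explicitly" and "enforces a graceful shutdown window", the sketch below shows one way a supervisor could do both. The function names and the millisecond grace parameter are assumptions made for this example, not procjail's real interface.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"os/signal"
	"syscall"
	"time"
)

// superviseWithGrace starts argv, forwards every signal received on sigs to
// the child, and sends SIGKILL if the child is still alive graceMillis after
// the first forwarded signal. It reports whether SIGKILL was needed.
func superviseWithGrace(argv []string, graceMillis int, sigs <-chan os.Signal) (bool, error) {
	cmd := exec.Command(argv[0], argv[1:]...)
	if err := cmd.Start(); err != nil {
		return false, err
	}
	done := make(chan error, 1)
	go func() { done <- cmd.Wait() }()

	forced := false
	var deadline <-chan time.Time // nil (never fires) until a signal arrives
	for {
		select {
		case <-done:
			return forced, nil
		case s := <-sigs:
			_ = cmd.Process.Signal(s) // explicit forwarding
			if deadline == nil {
				deadline = time.After(time.Duration(graceMillis) * time.Millisecond)
			}
		case <-deadline:
			forced = true
			_ = cmd.Process.Kill() // grace window expired
			deadline = nil
		}
	}
}

// termAfterStart is a demo helper: it behaves as if SIGTERM arrived
// immediately after the child was started.
func termAfterStart(argv []string, graceMillis int) (bool, error) {
	sigs := make(chan os.Signal, 1)
	sigs <- syscall.SIGTERM
	return superviseWithGrace(argv, graceMillis, sigs)
}

func main() {
	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, syscall.SIGTERM, syscall.SIGINT)
	forced, err := superviseWithGrace([]string{"sleep", "1"}, 500, sigs)
	fmt.Println("forced kill:", forced, "error:", err)
}
```

A child that handles or honours SIGTERM exits inside the window. A child that ignores it is killed when the window closes, which is the behavior worth observing.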


What procjail very intentionally does not do

This is important.

procjail does not:

  • pull images
  • manage registries
  • set up networking
  • configure overlay filesystems
  • hide /proc or /sys
  • paper over kernel behavior

Those are Day-1 conveniences.

This project is about Day-2 failures.


PID 1 is the entire point

If there is one reason this tool exists, it is this:

PID 1 is not a normal process.

When a process runs as PID 1:

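  • default signal handling changes: a signal with no installed handler is dropped instead of triggering its default action
  • SIGTERM therefore does nothing unless the program handles it; only SIGKILL and SIGSTOP sent from outside the namespace are always delivered
  • zombie reaping becomes your responsibility, because orphaned processes are reparented to you
  • exiting PID 1 tears down the entire environment: the kernel kills every other process in the namespace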

Most applications are not written with this in mind. Most engineers don’t notice until production breaks.

procjail forces you to be PID 1.

There is no init system. There is no supervisor. There is only you and the kernel.
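For reference, this is roughly what "reaping" means in code. It is a minimal sketch of the duty a PID 1 inherits, under the assumption that it is called whenever SIGCHLD arrives; `reapZombies` is a name chosen for this example.

```go
package main

import (
	"fmt"
	"syscall"
)

// reapZombies collects every child that has already exited, without
// blocking, and returns the PIDs it reaped.
func reapZombies() []int {
	var reaped []int
	for {
		var status syscall.WaitStatus
		pid, err := syscall.Wait4(-1, &status, syscall.WNOHANG, nil)
		if err == syscall.EINTR {
			continue // interrupted by a signal, try again
		}
		if err != nil || pid <= 0 {
			// ECHILD: there are no children at all.
			// pid == 0: children exist but none has exited yet.
			return reaped
		}
		reaped = append(reaped, pid)
	}
}

func main() {
	fmt.Println("reaped:", reapZombies())
}
```

A program that never makes this call while running as PID 1 keeps every orphan's exit status in the process table until the namespace is torn down.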


What I expected to learn

I expected to:

  • confirm how namespaces are wired together
  • better understand cgroup memory limits
  • see how signals propagate

That all happened.


What I did not expect

I did not expect how little code it takes to reproduce real container failure modes.

A few syscalls. A few mounts. One badly behaved process.

And suddenly:

  • SIGTERM doesn’t shut things down
  • children outlive their parents
  • memory limits kill processes abruptly
  • debugging requires reading /proc directly

It became very clear very quickly that containers are thin abstractions.

That’s not a criticism. It’s a warning.
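Reading /proc directly often starts with /proc/<pid>/status, whose SigIgn and SigCgt fields are hex bitmasks of ignored and caught signals. The sketch below decodes them to answer "does this process handle SIGTERM at all?". The helper names are made up for this example; bit n-1 of the mask stands for signal n.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// maskHas reports whether signal number sig is set in a hex mask as printed
// in /proc/<pid>/status (SigBlk, SigIgn, SigCgt).
func maskHas(hexMask string, sig int) (bool, error) {
	mask, err := strconv.ParseUint(strings.TrimSpace(hexMask), 16, 64)
	if err != nil {
		return false, err
	}
	if sig < 1 || sig > 64 {
		return false, fmt.Errorf("signal %d out of range", sig)
	}
	return mask&(uint64(1)<<uint(sig-1)) != 0, nil
}

// statusField returns the value of one "Key:\tvalue" line from
// /proc/<pid>/status. pid may also be "self".
func statusField(pid, key string) (string, error) {
	f, err := os.Open("/proc/" + pid + "/status")
	if err != nil {
		return "", err
	}
	defer f.Close()
	sc := bufio.NewScanner(f)
	for sc.Scan() {
		if k, v, ok := strings.Cut(sc.Text(), ":"); ok && k == key {
			return strings.TrimSpace(v), nil
		}
	}
	if err := sc.Err(); err != nil {
		return "", err
	}
	return "", fmt.Errorf("%s not found", key)
}

func main() {
	for _, key := range []string{"SigIgn", "SigCgt"} {
		v, err := statusField("self", key)
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			return
		}
		has, _ := maskHas(v, 15) // 15 is SIGTERM on Linux
		fmt.Printf("%s=%s includes SIGTERM: %v\n", key, v, has)
	}
}
```

If SIGTERM appears in neither mask for a process running as PID 1, a polite shutdown request will have no effect on it.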


Why the root filesystem is read-only by default

By default, procjail remounts / as read-only.

Not because it’s secure. But because it is educational.

It exposes an assumption many applications make:

“Of course I can write to /tmp or /var/log.”

In real production environments, that assumption is often wrong.

When something breaks because the filesystem is read-only, the lesson is immediate and concrete.

There is an escape hatch (--rw), but the failure is the point.
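One way to surface the assumption early is to probe the directories an application expects to write to before it does real work. This is a sketch of that idea, not something procjail ships: `probeWritable` is a hypothetical helper, and the directory list is only an example.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"syscall"
)

// probeWritable tries to create and then remove a temporary file in dir.
// It returns nil if the directory accepts writes.
func probeWritable(dir string) error {
	f, err := os.CreateTemp(dir, ".write-probe-*")
	if err != nil {
		return err
	}
	name := f.Name()
	f.Close()
	return os.Remove(name)
}

func main() {
	for _, dir := range []string{"/tmp", "/var/log"} {
		err := probeWritable(dir)
		switch {
		case err == nil:
			fmt.Println(dir, "is writable")
		case errors.Is(err, syscall.EROFS):
			fmt.Println(dir, "is on a read-only filesystem")
		default:
			fmt.Println(dir, "is not writable:", err)
		}
	}
}
```

Run under a read-only root, the probe should report EROFS instead of letting the failure appear later in the middle of a request.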


Resource limits are not suggestions

When a cgroup hits its memory limit and the kernel cannot reclaim enough memory to get back under it, the kernel does not negotiate. It kills processes.

procjail applies memory limits explicitly through cgroup v2 so you can:

  • trigger OOM kills
  • observe exit codes
  • correlate behavior with kernel enforcement

This mirrors real infrastructure incidents almost uncomfortably well.
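The cgroup v2 files involved are small enough to sketch. The code below assumes a cgroup directory that already exists, and it assumes that a value such as 64M means 64 × 1024 × 1024 bytes, which is a guess about the flag rather than documented behavior. The function names are invented for this example; `memory.max`, `cgroup.procs`, and the `oom_kill` counter in `memory.events` are the kernel's own interface.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// parseSize converts strings like "64M" to bytes. Treating K/M/G as powers
// of 1024 is an assumption made for this sketch.
func parseSize(s string) (int64, error) {
	s = strings.TrimSpace(s)
	if s == "" {
		return 0, fmt.Errorf("empty size")
	}
	mult := int64(1)
	switch s[len(s)-1] {
	case 'K', 'k':
		mult = 1 << 10
	case 'M', 'm':
		mult = 1 << 20
	case 'G', 'g':
		mult = 1 << 30
	}
	if mult != 1 {
		s = s[:len(s)-1]
	}
	n, err := strconv.ParseInt(s, 10, 64)
	if err != nil || n <= 0 {
		return 0, fmt.Errorf("invalid size %q", s)
	}
	return n * mult, nil
}

// limitMemory writes the limit to memory.max and moves pid into the cgroup.
// cgroupDir must already exist under the cgroup v2 mount.
func limitMemory(cgroupDir string, limit int64, pid int) error {
	max := filepath.Join(cgroupDir, "memory.max")
	if err := os.WriteFile(max, []byte(strconv.FormatInt(limit, 10)), 0o644); err != nil {
		return err
	}
	procs := filepath.Join(cgroupDir, "cgroup.procs")
	return os.WriteFile(procs, []byte(strconv.Itoa(pid)), 0o644)
}

// oomKills extracts the oom_kill counter from the contents of memory.events.
func oomKills(events string) (int, error) {
	for _, line := range strings.Split(events, "\n") {
		fields := strings.Fields(line)
		if len(fields) == 2 && fields[0] == "oom_kill" {
			return strconv.Atoi(fields[1])
		}
	}
	return 0, fmt.Errorf("oom_kill not found")
}

func main() {
	n, err := parseSize("64M")
	fmt.Println(n, err) // 67108864 <nil>
	kills, err := oomKills("low 0\nhigh 0\nmax 3\noom 1\noom_kill 1\n")
	fmt.Println(kills, err) // 1 <nil>
}
```

A process killed this way dies from SIGKILL, which a shell reports as exit status 137. Comparing that status with the oom_kill counter is the correlation step described above.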


How to use procjail

Build it:

go build ./cmd/procjail


Run a program inside it:

sudo ./procjail /bin/sh


Apply a memory limit:

sudo ./procjail --memory 64M ./your-binary


Make the filesystem writable (if you must):

sudo ./procjail --rw ./your-binary

Examples

The examples in this repository are not “hello world”.

They are deliberately chosen to demonstrate:

  • PID namespace behavior
  • signal handling
  • shutdown semantics
  • what happens when things go wrong

They are meant to be run, observed, and reasoned about — not copied into production.


Who this is for

This project is for:

  • engineers who debug production incidents
  • people who have been paged at 3am
  • anyone who wants fewer surprises from containers
  • people who prefer truth over convenience

If you are early in your journey, this might feel harsh.
That’s okay. Linux is harsh.

Who this is not for

This is not for:

  • people looking for a container runtime
  • people who want abstractions to disappear
  • people who want ergonomic defaults
  • people who want safety rails

Those things already exist — and they are good at what they do.

This is about understanding what’s underneath them.

Closing thoughts

I am not advocating that everyone stop using Docker or Kubernetes.

I am advocating that we stop treating them as magic.

The kernel is doing the real work.
When things fail, it is the kernel you are debugging — whether you like it or not.

procjail exists to make that impossible to forget.

License

MIT.
Use it, break it, learn from it.

Just don’t lie to yourself about what it’s doing.
