Let's say I have a very simple Python app for saying hello to people:

In [1]:
# NOTE: this is just a utility function for printing this file
# and others throughout the course of the notebook, this is not
# the code for the app itself!!
from pygments import highlight
from pygments.formatters import Terminal256Formatter
from pygments.lexers import PythonLexer

def print_file(fname):
    with open(fname, "r") as f:
        text = f.read()
    print(highlight(text, PythonLexer(), Terminal256Formatter()), end="")

print_file("app/__main__.py")

import argparse


def greet():
    parser = argparse.ArgumentParser()
    parser.add_argument("name")
    args = parser.parse_args()
    print(f"Hello {args.name}")


if __name__ == "__main__":
    greet()


To ensure this app can be deployed reproducibly, we'll build it into an Apptainer container image:

In [2]:
print_file("app.def")

Bootstrap: docker
From: python:3.10.12-slim-bullseye
Stage: build
%files
. /opt/app

%post
python -m pip install -e /opt/app


In [3]:
! apptainer build -f app.sif app.def

INFO:    User not listed in /etc/subuid, trying root-mapped namespace
INFO:    The %post section will be run under fakeroot
INFO:    Starting build...
Getting image source signatures
Copying blob 63cd35141f3a skipped: already exists
Copying blob 7d676dc8a994 skipped: already exists
Copying blob 14726c8f7834 skipped: already exists
Copying blob 03bdd165d0d2 skipped: already exists
Copying blob 428bad6fa242 skipped: already exists
Copying config 84be3abda9 done
Writing manifest to image destination
Storing signatures
2023/08/26 11:51:42  info unpack layer: sha256:14726c8f78342865030f97a

Now we can run our app inside the container, which has taken care of building all of our surprisingly complex dependencies:

In [4]:
! apptainer exec app.sif python /opt/app/app Thom

Hello Thom


Now suppose we want to submit a batch of greetings across the computing grid. We could deploy this local Apptainer image via Condor using the following submit file:

In [5]:
print_file("app.sub")

universe = vanilla

executable = /usr/local/bin/python
arguments = "/opt/app/app Thom"
transfer_executable = False

MY.SingularityImage = "$ENV(PWD)/app.sif"
requirements = (HAS_SINGULARITY=?=True)

output = app-$(ProcId).out
error = app-$(ProcId).err
log = app-$(ProcId).log

request_cpus = 1
request_memory = 10

queue 4


In [6]:
! condor_submit app.sub

Submitting job(s)....
4 job(s) submitted to cluster 410084.


In [7]:
import os, time

# hacky way to wait for all the jobs to complete
while True:
    for i in range(4):
        if not os.path.exists(f"app-{i}.out"):
            break
        with open(f"app-{i}.out", "r") as f:
            if not f.read():
                break
    else:
        break
    print("Waiting...")
    time.sleep(2)

print("Jobs complete! Results:\n")
for i in range(4):
    with open(f"app-{i}.out", "r") as f:
        print(f"Job {i}: ", f.read())
    for suffix in ["out", "err", "log"]:
        os.remove(f"app-{i}.{suffix}")

Waiting...


Waiting...
Waiting...
Waiting...
Waiting...
Jobs complete! Results:

Job 0:  Hello Thom

Job 1:  Hello Thom

Job 2:  Hello Thom

Job 3:  Hello Thom
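
Since the same polling loop gets reused later on, it could be factored into a small helper. The following is only a sketch: the function name `wait_for_outputs` and its `poll_interval` and `timeout` parameters are made up for illustration. Like the loop above, it treats a non-empty output file as a proxy for job completion, so a job that fails without writing anything to stdout would run into the timeout rather than being detected.

```python
import os
import time


def wait_for_outputs(paths, poll_interval=2.0, timeout=600.0):
    """Block until every file in `paths` exists and is non-empty.

    Raises TimeoutError if that has not happened within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        # Files that are missing or still empty are considered pending.
        pending = [
            p for p in paths
            if not os.path.exists(p) or os.path.getsize(p) == 0
        ]
        if not pending:
            return
        if time.monotonic() > deadline:
            raise TimeoutError(f"still waiting on: {pending}")
        time.sleep(poll_interval)
```

With this in place, the waiting step would reduce to something like `wait_for_outputs([f"app-{i}.out" for i in range(4)])`.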



Once we're happy with our application, we can follow the steps [here](https://computing.docs.ligo.org/guide/dhtc/containers/#publishing) to make this container available at `/cvmfs/singularity.opensciencegrid.org` so that any other users who want to send greetings can just create submit files that point at our container and be good to go!

So far so good. But what if I'm a user who thinks that the built-in greeting is too formal? Rather than saying "Hello," I'd like to just say "Hi." I would clone the repo, `git checkout -b less-formal-greeting`, and make the trivial code change:

In [8]:
with open("app/__main__.py", "r") as f:
    script = f.read()
script = script.replace("Hello", "Hi")
with open("app/__main__.py", "w") as f:
    f.write(script)

Now let's re-run our local container with the newly updated code (we'll keep using our local container, but in principle you would point to the one at `/cvmfs`):

In [9]:
! apptainer exec app.sif python /opt/app/app Thom

Hello Thom


But wait, it's still being too formal! That's because we edited our local copy of the Python code, but the code that lives inside the container is still the code we copied into it at build time. We could rebuild the container, copying in the new code and re-installing the package, but the build itself took around a minute. That seems like a lot of time to waste for such a simple change.

Luckily, there's an even simpler, and cleaner, fix for this. We installed our application _editably_ (that's the `-e` in the `pip install` command in the Apptainer def file), so the installed package points at the source tree in `/opt/app` rather than at a separate copy. That means we can just mount our local copy of the code into the container at that location and our changes will automatically be reflected.

In [10]:
! apptainer exec -B .:/opt/app app.sif python /opt/app/app Thom

Hi Thom
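
If it's ever unclear which copy of the code is being picked up, one way to check is to ask Python where it would load a package from. This is just a sketch: `module_origin` is a hypothetical helper, and it assumes the project is importable under the name `app`, which the notebook doesn't actually show.

```python
import importlib.util


def module_origin(name):
    """Return the file path Python would load top-level module `name` from.

    Returns None if the module can't be found.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin
```

Run inside the container, `module_origin("app")` should report a path under `/opt/app` if the editable install behaves as described. Note that Python also searches the current directory in some invocations, so it's safest to run the check from a directory that doesn't itself contain an `app/` folder.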


But if we want to see how these changes look when we distribute them with Condor, we have no way of mounting our local directory to the appropriate place inside the container:

In [11]:
! condor_submit app.sub

Submitting job(s)....
4 job(s) submitted to cluster 410085.


In [12]:
while True:
    for i in range(4):
        if not os.path.exists(f"app-{i}.out"):
            break
        with open(f"app-{i}.out", "r") as f:
            if not f.read():
                break
    else:
        break
    print("Waiting...")
    time.sleep(2)

print("Jobs complete! Results:\n")
for i in range(4):
    with open(f"app-{i}.out", "r") as f:
        print(f"Job {i}: ", f.read())
    for suffix in ["out", "err", "log"]:
        os.remove(f"app-{i}.{suffix}")

Waiting...
Waiting...
Waiting...
Waiting...
Waiting...
Waiting...
Waiting...
Waiting...
Waiting...
Waiting...
Jobs complete! Results:

Job 0:  Hello Thom

Job 1:  Hello Thom

Job 2:  Hello Thom

Job 3:  Hello Thom



As noted [here](https://htcondor.readthedocs.io/en/latest/admin-manual/singularity-support.html), there is a Condor submit argument called `container_target_dir` that would support such functionality, but it's not currently enabled on LDG. That means the options for enabling development of distributed applications inside containers are (roughly in order of desirability):

1. Removing the `MY.SingularityImage` argument from our submit file, setting the `executable` to `apptainer`, then specifying the desired bind flags ourselves as `arguments`, probably through some wrapper. This will probably be the easiest path in the short term, but I'd prefer to use Condor's built-in Apptainer support since I'm sure there are lots of little things that can go wrong that they've thought of that I never will.
2. Maintaining base images in `/cvmfs/singularity.opensciencegrid.org` that have the necessary dependencies installed and take care of other, more intensive environment setup, then having local def files that bootstrap from these images and do the comparatively lightweight install of the local package. Besides the annoyance of running an extra `apptainer build` command every time you make a minor change to the code, this risks becoming really messy as folks start needing to make changes to the base containers. In this scenario, they edit the def files of the base images, rebuild them locally, then edit the `Bootstrap` header of the package def file to point to these newly built images. Then they need to be sure to change the header back before pushing their code, and meanwhile the final built image has lost a lot of the "self-contained" nature that containers are designed for in the first place.
3. Just rebuilding the entire container for every minor change. This keeps the def files clean, but for most practical environments would be so unwieldy that to me it doesn't represent a realistic option.
4. Using build args to establish where in the container the package code should get copied, so that folks can point to their home directory (which automatically gets bound in at run time). This would make everyone's build of the container, and even builds from different locations for the same user, completely different, and it would be impossible to build a single container that all users could use, which is sort of the whole idea to begin with.
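
To make option 1 a bit more concrete, here is a rough sketch of a wrapper that writes out a submit file calling `apptainer exec` directly with the bind flags we want. Everything about it is illustrative: `make_submit_text` is a hypothetical helper, `/usr/bin/apptainer` is an assumed location for the binary on the execute nodes, and the host directory being bound is assumed to be visible from those nodes (e.g. via a shared filesystem). The remaining submit commands are carried over from `app.sub` above.

```python
def make_submit_text(image, binds, command, n_jobs=1,
                     apptainer="/usr/bin/apptainer"):
    """Return HTCondor submit-file text that calls `apptainer exec` directly.

    image: path to the .sif image
    binds: mapping of host directory -> directory inside the container
    command: list of tokens to run inside the container
    apptainer: ASSUMED location of the apptainer binary on execute nodes
    """
    tokens = ["exec"]
    for src, dst in binds.items():
        tokens += ["-B", f"{src}:{dst}"]
    tokens += [image, *command]

    for tok in tokens:
        # Keep the sketch simple: no quoting logic, so reject any token
        # that would need escaping in the submit file's arguments string.
        if not tok or any(c.isspace() or c in "\"'" for c in tok):
            raise ValueError(f"unsupported token: {tok!r}")

    args = " ".join(tokens)
    lines = [
        "universe = vanilla",
        f"executable = {apptainer}",
        f'arguments = "{args}"',
        "transfer_executable = False",
        "requirements = (HAS_SINGULARITY=?=True)",
        "output = app-$(ProcId).out",
        "error = app-$(ProcId).err",
        "log = app-$(ProcId).log",
        "request_cpus = 1",
        "request_memory = 10",
        f"queue {n_jobs}",
    ]
    return "\n".join(lines) + "\n"
```

For example, `make_submit_text("/home/user/app.sif", {"/home/user/app-repo": "/opt/app"}, ["python", "/opt/app/app", "Thom"], n_jobs=4)` (with made-up paths) would produce a submit file whose `arguments` line mirrors the `apptainer exec -B` call used locally. This sidesteps Condor's built-in container handling entirely, which is exactly the trade-off described in option 1.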