If we want to create temporary files for an experiment, we need to make sure that they get cleaned up and removed. The `tempfile.TemporaryFile` object will do this mostly automatically, but much of the time (for our NetCDF experiments at least) we need more than the file handle that `TemporaryFile` returns. We need an actual named file, so that we can give the filename to the NetCDF library. `tempfile.NamedTemporaryFile` gives us this, but if we pass `delete=False` so that the file outlives its handle, we're responsible for closing the open file handles and deleting the file ourselves. This *seems* easy to do, but in reality it's easy to leak file handles when mixing different ways of opening the same file.

Here's an example:

In [1]:
!df -h /dev/sda1

Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1       110G   78G   28G  75% /


We start out with 78GB used on the disk. Let's create a 1GB temporary file.

In [2]:
import os
import tempfile

fd, filename = tempfile.mkstemp(suffix='.bin', dir=os.getcwd())
print(filename)
f = open(filename, 'wb')
for i in range(1024 ** 2):
    f.write(b'\0' * 1024)


/home/james/code/git/netcdf-tutorial/notebooks/tmp2avl21.bin


In [3]:
!df -h /dev/sda1

Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1       110G   79G   27G  75% /


We see that we've used up an extra 1GB of disk space, which is what we expect. Now suppose we've run some experiment, we're done with our file, and we want to get rid of it. We'll close the file descriptor `fd` and remove the path.

In [4]:
os.close(fd)
os.remove(filename)

In [5]:
!df -h /dev/sda1

Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1       110G   79G   27G  75% /


What the?! We just removed the file, but we're still using the disk space. Did it not get removed?

In [6]:
!ls -lh $filename

ls: cannot access /home/james/code/git/netcdf-tutorial/notebooks/tmp2avl21.bin: No such file or directory


No, it's definitely gone. What's going on? It turns out that `mkstemp` returns an open file descriptor, but we *also* have an open file object from our call to `open()`. Removing the path only deletes the file's name. As long as *either* of these handles is still open, the operating system keeps the data around and the disk space doesn't get freed.
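The same effect can be reproduced on a small scale without watching `df`. This is a minimal sketch, assuming a POSIX system such as Linux (on Windows, removing a file that is still open normally fails). The variable names are made up for illustration, and the file goes in the default temporary directory rather than the current one.

```python
import os
import tempfile

fd, path = tempfile.mkstemp(suffix='.bin')
handle = open(path, 'wb')       # a second, independent handle on the same file
handle.write(b'\0' * 1024)
handle.flush()                  # push the buffered bytes out to the OS

os.close(fd)                    # close the descriptor that mkstemp returned
os.remove(path)                 # delete the file's name

path_exists = os.path.exists(path)         # the name is gone
info = os.fstat(handle.fileno())           # but the open handle still reaches the data
size_after_remove = info.st_size           # the bytes are still allocated to the file
links_after_remove = info.st_nlink         # typically 0 on local Linux filesystems

handle.close()                  # only now can the OS reclaim the space
```

The file has no name left, yet `os.fstat` on the open handle still reports its full size, which is why `df` keeps counting it.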

Notice that `f` is still open.

In [7]:
f

<_io.BufferedWriter name='/home/james/code/git/netcdf-tutorial/notebooks/tmp2avl21.bin'>

In [8]:
f.close()

In [9]:
!df -h /dev/sda1

Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1       110G   78G   28G  75% /


So once we properly close the file object, the disk space gets reclaimed by the operating system. The fix is pretty obvious when the code is laid out like this, but in an earlier version of the code I had ignored the file descriptor returned from `mkstemp`, like so:

```python
_, filename = tempfile.mkstemp(suffix='.bin', dir=os.getcwd())
```

I was like, "I don't need the file descriptor, because I'm just going to open it later on anyways with the NetCDF library". So I threw away the fd. But it turns out that the descriptor doesn't just disappear. What `mkstemp` returns is a plain integer that refers to a file the operating system has opened for our process, and throwing away the integer does nothing to close that file. Python's garbage collector will never close it either, so it stays open until something calls `os.close()` on it or the process exits. So the disk space essentially never got released until I restarted the interpreter. Not OK.
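If `mkstemp` is what you want to use, there are two simple ways to keep its descriptor from leaking. The sketch below shows both. The helper names `make_closed_tempfile` and `write_zeros_via_fd` are made up for illustration, and the files go in the default temporary directory rather than the current one.

```python
import os
import tempfile


def make_closed_tempfile(suffix='.bin'):
    """Create a named temporary file and return only its path (hypothetical helper)."""
    fd, filename = tempfile.mkstemp(suffix=suffix)
    os.close(fd)  # release the descriptor straight away; the file itself stays on disk
    return filename


def write_zeros_via_fd(n_bytes, suffix='.bin'):
    """Write n_bytes zero bytes through the mkstemp descriptor (hypothetical helper)."""
    fd, filename = tempfile.mkstemp(suffix=suffix)
    # os.fdopen wraps the existing descriptor instead of opening a second handle,
    # so leaving the with block closes the only handle on the file.
    with os.fdopen(fd, 'wb') as handle:
        handle.write(b'\0' * n_bytes)
    return filename


name_only = make_closed_tempfile()
written = write_zeros_via_fd(1024)
written_size = os.path.getsize(written)

# Nothing is holding either file open, so removing the paths frees the space.
os.remove(name_only)
os.remove(written)
```

The first helper suits the case where another library (NetCDF, say) will open the file by name. The second suits the case where you write the file yourself.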

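The correct way to handle this is with a context manager and a `with` statement. This ensures that the file gets closed when the block exits, even if something goes wrong inside it. Because we pass `delete=False`, the file stays on disk afterwards, and all that's left for us to do is remove the path.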

In [10]:
# delete=False keeps the file on disk after the with block closes it
with tempfile.NamedTemporaryFile(suffix='.bin', dir=os.getcwd(), delete=False) as f:
    # Write 1GB of zeros, 1KB at a time
    for i in range(1024 ** 2):
        f.write(b'\0' * 1024)

In [11]:
!df -h /dev/sda1

Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1       110G   79G   27G  75% /


In [12]:
os.remove(f.name)

In [13]:
!df -h /dev/sda1

Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1       110G   78G   28G  75% /
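For experiments that only need a filename to hand to another library, the close-then-remove pattern can be wrapped in a small context manager so that the cleanup can't be forgotten. This is a minimal sketch, not part of `tempfile` itself. The name `temporary_filename` and its parameters are made up for illustration.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def temporary_filename(suffix='.bin', dir=None):
    """Yield the path of a closed temporary file and delete it on exit (hypothetical helper)."""
    fd, filename = tempfile.mkstemp(suffix=suffix, dir=dir)
    os.close(fd)  # we only want the name, so don't keep the descriptor open
    try:
        yield filename
    finally:
        # Runs even if the body raises; tolerate the file having been removed already.
        if os.path.exists(filename):
            os.remove(filename)


with temporary_filename() as path:
    # Any code that takes a filename can use `path` here.
    with open(path, 'wb') as handle:
        handle.write(b'\0' * 1024)
    size_inside = os.path.getsize(path)

still_there = os.path.exists(path)  # checked after the block has cleaned up
```

Any handles opened on `path` inside the block still need to be closed by whoever opened them. The helper only takes care of the descriptor from `mkstemp` and of the file's name.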
