# Cleanup

We generated a lot of intermediate files that we might not need anymore. I recommend that you do **not** run this script in one go, but cell by cell, so you can decide what you would like to keep. This script will always keep:

* The neural net files
* The denoised tomograms

In [1]:
import os
import subprocess
import sys
from glob import glob
from os.path import exists, join


In [25]:
# Do you really want to remove redundant files?
remove = False

# Do you also want to remove the even/odd/full stacks (i.e. tilt series)?
removestacks = False

# Do you also want to remove the even/odd raw half-tomograms?
remove_half_tomos = False

### Raw frames
Remove the original .tif files from /frames. The full aligned frames are in /frames/full, so the originals are no longer needed, and removing them saves space on the cluster. By default they are kept, but it is good practice to set `remove` to True, which removes these now unnecessary files in the cell below.

In [None]:
if remove:
    !\rm frames/*.tif
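
Before setting `remove` to True, it can help to see how much space a pattern would actually free. The following is a minimal sketch, assuming the notebook runs from the project root; `summarize` is a hypothetical helper that is not part of the original notebook, and it only reads file sizes without deleting anything.

```python
import os
from glob import glob


def summarize(pattern):
    """Return (number of files, total size in bytes) for a glob pattern."""
    files = [f for f in glob(pattern) if os.path.isfile(f)]
    return len(files), sum(os.path.getsize(f) for f in files)


# Nothing is deleted here; this only reports what `!\rm frames/*.tif` would hit.
n_files, n_bytes = summarize('frames/*.tif')
print(f'{n_files} files, {n_bytes / 1e9:.2f} GB would be removed')
```

The same call works for any of the patterns used further down, for example `summarize('frames/even/*.mrc')`.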

### Single frames

We have already turned our even and odd frames into stacks, so the single-frame files are completely redundant. Note that the cell below also removes the single frames in frames/full.

In [None]:
if remove:
    !\rm frames/even/*.mrc
    !\rm frames/odd/*.mrc
    !\rm frames/full/*.mrc
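
If you prefer to stay in Python rather than shell out with `!\rm`, the same removal can be sketched with `glob` and `os.remove`. This is an illustrative alternative, not the notebook's own method: `remove_matching` is a hypothetical helper, and it defaults to a dry run so that nothing is deleted unless you ask for it.

```python
import os
from glob import glob


def remove_matching(pattern, dry_run=True):
    """Delete regular files matching `pattern` and return the matched paths.

    With dry_run=True (the default) the files are only listed, not deleted.
    """
    matched = sorted(f for f in glob(pattern) if os.path.isfile(f))
    if not dry_run:
        for f in matched:
            os.remove(f)
    return matched


# Redefined here only so the sketch is self-contained; in the notebook,
# reuse the `remove` flag set near the top instead.
remove = False
for sub in ('even', 'odd', 'full'):
    hits = remove_matching(f'frames/{sub}/*.mrc', dry_run=not remove)
    print(f'frames/{sub}: {len(hits)} .mrc files',
          'removed' if remove else 'would be removed')
```

Unlike the shell command, this does not complain when a pattern matches nothing, which makes rerunning the cell harmless.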

### Training data
We can also remove the training data, as it takes up as much space as a full tomogram! Even if you want to retrain the network, it is pretty fast to make a new training set from the half-tomograms. So it is absolutely fine to remove the training data **unless** it is important that you have the **exact same** training data (e.g. when you want to compare the effect of a different number of training iterations).

In [None]:
if remove:
    !\rm train_data/train_data.npz

### Half-tomograms & IMOD files
The tomogram folders still contain the raw stacks and IMOD files. We can remove these. If you plan on retraining the network (e.g. with a different number of iterations or number of training volumes/slices), keep these files, as it will save you reconstruction time. The IMOD files in the full folder are kept, as they take up very little space and might be needed if you ever want to remake the tomograms.

**Note that this keeps the tomostack.st files.**

In [24]:
import shutil

if remove_half_tomos:
    # Approach: list everything in each half-tomogram folder, drop
    # tomostack.st from that list (this does not delete the file), then
    # delete whatever remains. Folders are removed together with their contents.
    for half in ('even', 'odd'):
        folder = f'frames/{half}/tomogram/'
        items = glob(folder + '*')
        try:
            items.remove(folder + 'tomostack.st')
        except ValueError:
            pass  # no tomostack.st in this folder
        for item in items:
            if os.path.isfile(item):
                os.remove(item)
            elif os.path.isdir(item):
                # rmtree also handles nested subfolders and hidden files,
                # which os.remove/os.rmdir on a one-level glob would not
                shutil.rmtree(item)


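
To try the keep-list logic safely before pointing it at real data, the same idea can be run on a throwaway directory. The sketch below is an illustration under stated assumptions: `clean_dir`, its `keep` and `dry_run` parameters, and the file names in the example are all hypothetical and not part of the original notebook.

```python
import shutil
import tempfile
from pathlib import Path


def clean_dir(folder, keep=('tomostack.st',), dry_run=True):
    """Remove everything directly inside `folder` except names in `keep`.

    Returns the sorted names that were (or, in a dry run, would be) removed.
    """
    targets = sorted(p for p in Path(folder).iterdir() if p.name not in keep)
    if not dry_run:
        for p in targets:
            if p.is_dir() and not p.is_symlink():
                shutil.rmtree(p)  # removes nested subfolders as well
            else:
                p.unlink()
    return [p.name for p in targets]


# Exercise the helper on a temporary directory with made-up contents.
with tempfile.TemporaryDirectory() as tmp:
    tomo = Path(tmp)
    (tomo / 'tomostack.st').touch()
    (tomo / 'test.txt').touch()
    (tomo / 'test' / 'nested').mkdir(parents=True)
    print(clean_dir(tomo, dry_run=True))   # lists targets, deletes nothing
    clean_dir(tomo, dry_run=False)         # actually deletes them
    print(sorted(p.name for p in tomo.iterdir()))  # what survived
```

Once the dry-run listing looks right, the helper could be called on `frames/even/tomogram` and `frames/odd/tomogram` with `dry_run=False`.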


### Tilt series
We still have the full motion-corrected tilt series for the even/odd/full tomograms. You should already have the _full_ one on your local device from when you reconstructed the full tomogram, so you don't really need these anymore. You might want to keep them if you plan on changing the reconstruction in IMOD at a later point (e.g. you might want to try with goldremover, or with different SIRT parameters).

In [None]:
if removestacks:
    !\rm frames/even/tomogram/tomostack.st
    !\rm frames/odd/tomogram/tomostack.st