# How to delete butler collections. STAFF ONLY

Users should be encouraged to use the space they need rather than try to delete things from the butler.

In [None]:
import os
import getpass
from lsst.daf.butler import Butler, DatasetType, CollectionType, Datastore

In [None]:
config = 'dp02'
butler = Butler(config)

## Identify collections to delete

In [None]:
# my_outputCollection = 'u/melissagraham/coadd_recreation_nb'
my_outputCollection = 'u/melissagraham/custom_coadd_window1_test1'

List all existing collections whose names contain `my_outputCollection`.

These were all made in draft_Create_Custom_Coadds.ipynb.

The collection 'u/melissagraham/coadd_recreation_nb' was created as a CHAINED collection.

In [None]:
for c in sorted(butler.registry.queryCollections()):
    if c.find(my_outputCollection) > -1:
        print(c)
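
The substring test above also matches any collection whose name merely contains `my_outputCollection` somewhere in the middle. A stricter filter, sketched below, keeps only the collection itself and the names nested under it. The helper name `collections_under` is hypothetical and not part of the butler API, and two of the example names are invented purely to show what gets excluded.

```python
def collections_under(names, root):
    """Return the sorted names equal to `root` or nested under `root/`."""
    root = root.rstrip('/')
    prefix = root + '/'
    return sorted(n for n in names if n == root or n.startswith(prefix))


# Example names. The last two are hypothetical, made up to show near-misses
# that a plain substring test would also pick up.
example = [
    'u/melissagraham/custom_coadd_window1_test1',
    'u/melissagraham/custom_coadd_window1_test1/20230215T021158Z',
    'u/melissagraham/custom_coadd_window1_test10',
    'u/someone/copy_of/u/melissagraham/custom_coadd_window1_test1',
]
for name in collections_under(example, 'u/melissagraham/custom_coadd_window1_test1'):
    print(name)
```

With a live butler, the same helper could be fed `butler.registry.queryCollections()` and `my_outputCollection`.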

In [None]:
del butler

## Delete collections and runs

I will need a butler with write permissions.

Instantiate it with only `my_outputCollection`, for safety.

In [None]:
tmpButler = Butler(config, collections=my_outputCollection, writeable=True)

### delete collections

`tmpButler.registry.removeCollection` works for removing a CHAINED collection, which is what this one is.

**WARNING:** the following would remove the deepCoadd you have already made!

In [None]:
# tmpButler.registry.removeCollection('u/melissagraham/coadd_recreation_nb')
tmpButler.registry.removeCollection('u/melissagraham/custom_coadd_window1_test1')
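
Removing a CHAINED collection also discards its list of child collections, so it may be worth recording that list before running the removal above. A minimal sketch, assuming `Registry.getCollectionType` and `Registry.getCollectionChain` are available in the installed `lsst.daf.butler` version (newer releases may expose the same information elsewhere). The helper name `chain_children` is hypothetical. The type is compared by enum member name so the function itself does not need to import the stack.

```python
def chain_children(registry, name):
    """Return the child collections of a CHAINED collection, in chain order.

    `registry` is expected to behave like `butler.registry`.
    """
    ctype = registry.getCollectionType(name)
    if ctype.name != 'CHAINED':
        raise ValueError(f'{name!r} is a {ctype.name} collection, not CHAINED')
    return list(registry.getCollectionChain(name))


# Usage sketch (read-only, but needs a butler, so left commented out):
# children = chain_children(tmpButler.registry, my_outputCollection)
# print(children)
```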

Check:

In [None]:
for c in sorted(tmpButler.registry.queryCollections()):
    if c.find(my_outputCollection) > -1:
        print(c)

### delete runs

But `removeCollection` does not work for removing a RUN collection and its results.

This does not work: <br>
`tmpButler.registry.removeCollection('u/melissagraham/coadd_recreation_nb/20220610T171249Z')`
<br>

And neither does: <br>
`tmpButler.registry.removeCollection('u/melissagraham/coadd_recreation_nb/TestWindow1')`
<br>

Instead, do this:

In [None]:
# tmpButler.removeRuns(['u/melissagraham/coadd_recreation_nb/TestWindow1'])
# tmpButler.removeRuns(['u/melissagraham/coadd_recreation_nb/TestWindow2'])
# tmpButler.removeRuns(['u/melissagraham/coadd_recreation_nb/TestWindow3'])

tmpButler.removeRuns(['u/melissagraham/custom_coadd_window1_test1/20230215T021158Z'])
tmpButler.removeRuns(['u/melissagraham/custom_coadd_window1_test1/20230215T164015Z'])
tmpButler.removeRuns(['u/melissagraham/custom_coadd_window1_test1/20230215T165818Z'])
tmpButler.removeRuns(['u/melissagraham/custom_coadd_window1_test1/20230215T172824Z'])
tmpButler.removeRuns(['u/melissagraham/custom_coadd_window1_test1/20230215T175539Z'])
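
The rule used here (CHAINED collections go through `registry.removeCollection`, RUN collections through `removeRuns`) can be sketched as a helper that sorts names by collection type before anything is deleted. It relies only on `registry.getCollectionType`; the helper name `split_by_type` and the variable `matching_names` in the usage comment are hypothetical.

```python
def split_by_type(registry, names):
    """Group collection names by the name of their collection type."""
    groups = {}
    for name in names:
        type_name = registry.getCollectionType(name).name
        groups.setdefault(type_name, []).append(name)
    return groups


# Usage sketch with a writeable butler (commented out because it deletes data).
# `matching_names` stands for whatever list of collection names was selected.
# groups = split_by_type(tmpButler.registry, matching_names)
# for chain in groups.get('CHAINED', []):
#     tmpButler.registry.removeCollection(chain)  # chains first, as in this notebook
# if groups.get('RUN'):
#     tmpButler.removeRuns(groups['RUN'])
```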

See? No output, nothing left.

In [None]:
for c in sorted(tmpButler.registry.queryCollections()):
    if c.find(my_outputCollection) > -1:
        print(c)

### bulk removal

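To remove every matching run in one pass, uncomment the loop below. It assumes that only RUN collections still match, which is the case here because the CHAINED collection was removed above.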

In [None]:
# for c in sorted(tmpButler.registry.queryCollections()):
#     if c.find(my_outputCollection) > -1:
#         print('Removing: ', c)
#         tmpButler.removeRuns([c])
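
A slightly more defensive version of the loop above, as a sketch: it collects the matching names, keeps only RUN collections, and deletes nothing unless `dry_run=False` is passed. The function name and the `dry_run` flag are assumptions for illustration. The only butler calls are `registry.queryCollections`, `registry.getCollectionType`, and `removeRuns`, all of which appear in this notebook.

```python
def remove_matching_runs(butler, root, dry_run=True):
    """Remove the RUN collections whose names contain `root`.

    Returns the list of run names that were (or would be) removed.
    """
    registry = butler.registry
    runs = [
        name for name in sorted(registry.queryCollections())
        if root in name and registry.getCollectionType(name).name == 'RUN'
    ]
    for name in runs:
        print('Would remove:' if dry_run else 'Removing:', name)
    if runs and not dry_run:
        butler.removeRuns(runs)
    return runs


# Usage sketch: inspect the dry-run output before deleting anything.
# remove_matching_runs(tmpButler, my_outputCollection)
# remove_matching_runs(tmpButler, my_outputCollection, dry_run=False)
```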

See? All gone:

In [None]:
for c in sorted(tmpButler.registry.queryCollections()):
    if c.find(my_outputCollection) > -1:
        print(c)

In [None]:
del tmpButler

<br>
<br>
<br>

## No, none of this was the way.

Just leaving these explorations here, in case.

From the sqlalche.me link, an "Integrity Error" is "Exception raised when the relational integrity of the database is affected, e.g. a foreign key check fails."

Let's explore more to try and figure out why I can't delete that collection.

Double check the type of collection it is.

In [None]:
# tmpButler.registry.getCollectionType('u/melissagraham/coadd_recreation_nb/20220610T171249Z')

Since this is a RUN collection, it should be removable, according to the documentation: https://pipelines.lsst.io/py-api/lsst.daf.butler.Registry.html#lsst.daf.butler.Registry.removeCollection

The above also specifies: 
_"If this is a RUN collection, all datasets and quanta in it are also fully removed. This requires that those datasets be removed (or at least trashed) from any datastores that hold them first."_

OK so we must first remove the dataset from the datastore: https://pipelines.lsst.io/modules/lsst.daf.butler/datastores.html

The above says the default configuration values can be inspected at `$DAF_BUTLER_DIR/python/lsst/daf/butler/configs` 

In [None]:
# os.system('ls $DAF_BUTLER_DIR/python/lsst/daf/butler/configs')

In [None]:
# filename_datastore_yaml = '$DAF_BUTLER_DIR/python/lsst/daf/butler/configs/datastore.yaml'

In [None]:
# os.system('more '+filename_datastore_yaml)

There is a `lsst.daf.butler.Datastore.remove` which will _"Indicate to the Datastore that a Dataset can be removed"_. https://pipelines.lsst.io/py-api/lsst.daf.butler.Datastore.html#lsst.daf.butler.Datastore.remove

So now figure out how to use that `Datastore.remove` function on the datasets for my collection.

> **STOP HERE:** not sure that messing with the datastores is the way to go, check with Clare... 

In [None]:
# tmpDatastore = Datastore( ??? )

In [None]:
# tmpButler.datastore.remove( ??? )

In [None]:
# del tmpButler