# Figure out how to delete butler collections.

Deleting collections is a necessary component of draft_Create_Custom_Coadds.ipynb.

Users will want to know how to clean up their butler collections.

In [1]:
import os
import getpass
from lsst.daf.butler import Butler, DatasetType, CollectionType, Datastore

In [2]:
config = 'dp02'
butler = Butler(config)









## Identify collections to delete

In [3]:
my_outputCollection = 'u/melissagraham/coadd_recreation_nb'

List all the existing collections whose names contain that string.

These were all made in draft_Create_Custom_Coadds.ipynb.

In [4]:
for c in sorted(butler.registry.queryCollections()):
    if my_outputCollection in c:
        print(c)

u/melissagraham/coadd_recreation_nb/20220610T171249Z
u/melissagraham/coadd_recreation_nb/20220610T182343Z
u/melissagraham/coadd_recreation_nb/20220610T184028Z
u/melissagraham/coadd_recreation_nb/20220610T185057Z
u/melissagraham/coadd_recreation_nb/20220610T190249Z
u/melissagraham/coadd_recreation_nb/20220610T190623Z
u/melissagraham/coadd_recreation_nb/20220615T205142Z
u/melissagraham/coadd_recreation_nb/20220615T212442Z
u/melissagraham/coadd_recreation_nb/20220622T190819Z
u/melissagraham/coadd_recreation_nb/20220622T193047Z
u/melissagraham/coadd_recreation_nb/20220622T194232Z
u/melissagraham/coadd_recreation_nb/20220622T194815Z
u/melissagraham/coadd_recreation_nb/20220622T195818Z
u/melissagraham/coadd_recreation_nb/20220622T235340Z
u/melissagraham/coadd_recreation_nb/20220623T023518Z
u/melissagraham/coadd_recreation_nb/20220623T024126Z
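
The substring test above also matches any collection that merely contains the name somewhere in the middle. A minimal sketch of a stricter prefix match follows. `select_by_prefix` is a hypothetical helper, not part of the butler API, and the third sample name is made up to show a near-miss.

```python
from typing import Iterable, List


def select_by_prefix(names: Iterable[str], prefix: str) -> List[str]:
    """Return the sorted names that equal `prefix` or sit under `prefix/`."""
    root = prefix.rstrip('/')
    return sorted(n for n in names if n == root or n.startswith(root + '/'))


# Two names copied from the listing above, plus one made-up near-miss.
sample = [
    'u/melissagraham/coadd_recreation_nb/20220610T182343Z',
    'u/melissagraham/coadd_recreation_nb/20220610T171249Z',
    'u/someone/copy_of_u/melissagraham/coadd_recreation_nb/x',  # hypothetical
]
my_runs = select_by_prefix(sample, 'u/melissagraham/coadd_recreation_nb')
print(my_runs)
```

In this notebook the call would presumably be `select_by_prefix(butler.registry.queryCollections(), my_outputCollection)`.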


I want to delete all of those old mistakes. Let's start with the first one.

In [5]:
butler.registry.removeCollection('u/melissagraham/coadd_recreation_nb/20220610T171249Z')

ReadOnlyDatabaseError: Cannot delete from read-only table dc2_20210215.collection.

<br>

OK, based on the above, it seems the butler I created is read-only, so I can't delete what I want.

Let's try again, this time using a temporary butler with write permissions.

In [6]:
del butler
tmpButler = Butler(config, writeable=True)

In [7]:
tmpButler.registry.removeCollection('u/melissagraham/coadd_recreation_nb/20220610T171249Z')

IntegrityError: (psycopg2.errors.ForeignKeyViolation) update or delete on table "dataset" violates foreign key constraint "fkey_dataset_location_dataset_id_dataset_id" on table "dataset_location"
DETAIL:  Key (id)=(fc6ffba8-6b69-4a31-a0d9-1b3ef1882cee) is still referenced from table "dataset_location".

[SQL: DELETE FROM dc2_20210215.collection WHERE dc2_20210215.collection.name IN (%(name_1_1)s)]
[parameters: {'name_1_1': 'u/melissagraham/coadd_recreation_nb/20220610T171249Z'}]
(Background on this error at: https://sqlalche.me/e/14/gkpj)

<br>

OK interesting, this is different from the previous error. 

From the sqlalche.me link, an `IntegrityError` is an "Exception raised when the relational integrity of the database is affected, e.g. a foreign key check fails."

Let's explore more to try and figure out why I can't delete that collection.

Double check the type of collection it is.

In [8]:
tmpButler.registry.getCollectionType('u/melissagraham/coadd_recreation_nb/20220610T171249Z')

<CollectionType.RUN: 1>
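
Since there are many collections to clean up, the same check can be repeated for each of them. This is a small sketch, not an established recipe: `collection_types` is a hypothetical helper, and the only butler call it relies on is the `getCollectionType` method used in the cell above.

```python
from typing import Dict, Iterable


def collection_types(registry, names: Iterable[str]) -> Dict[str, str]:
    """Map each collection name to the name of its collection type.

    `registry` is any object with a getCollectionType(name) method,
    such as tmpButler.registry in this notebook.
    """
    result = {}
    for name in names:
        ctype = registry.getCollectionType(name)
        # CollectionType is an enum, so .name gives e.g. 'RUN';
        # fall back to str() for anything without a .name attribute.
        result[name] = getattr(ctype, 'name', str(ctype))
    return result
```

Any name that does not map to `'RUN'` would need separate handling before deletion.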

Since this is a RUN collection, it should be removable according to the `Registry.removeCollection` documentation: https://pipelines.lsst.io/py-api/lsst.daf.butler.Registry.html#lsst.daf.butler.Registry.removeCollection

That documentation also specifies:
_"If this is a RUN collection, all datasets and quanta in it are also fully removed. This requires that those datasets be removed (or at least trashed) from any datastores that hold them first."_
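
The documentation quoted above implies an order of operations: datasets leave the datastore first, then the RUN collection leaves the registry. One possible sketch is below. It assumes the installed `lsst.daf.butler` provides a `Butler.removeRuns(names, unstore=True)` method that does both steps for RUN collections; that assumption should be checked against the installed version before anything is deleted. `remove_runs` and its `dry_run` flag are hypothetical names used only for illustration.

```python
from typing import Iterable, List


def remove_runs(butler, names: Iterable[str], dry_run: bool = True) -> List[str]:
    """Remove RUN collections and their datasets; return the names handled.

    ASSUMPTION: `butler` is a writeable Butler that offers
    removeRuns(names, unstore=True). With dry_run=True nothing is deleted.
    """
    names = list(names)
    # Refuse anything that is not a RUN collection.
    for name in names:
        ctype = butler.registry.getCollectionType(name)
        if getattr(ctype, 'name', str(ctype)) != 'RUN':
            raise ValueError(f'{name} is not a RUN collection')
    if not dry_run:
        # ASSUMPTION: unstore=True also deletes the stored files.
        butler.removeRuns(names, unstore=True)
    return names
```

Defaulting to `dry_run=True` is a deliberate choice here: deletion is irreversible, so the first call only validates the names.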

OK, so we must first remove the datasets from the datastore: https://pipelines.lsst.io/modules/lsst.daf.butler/datastores.html

That page says the default configuration values can be inspected at `$DAF_BUTLER_DIR/python/lsst/daf/butler/configs`.

In [9]:
os.system('ls $DAF_BUTLER_DIR/python/lsst/daf/butler/configs')

datastores
datastore.yaml
dimensions.yaml
registry.yaml
repo_transfer_formats.yaml
storageClasses.yaml


0

In [10]:
filename_datastore_yaml = '$DAF_BUTLER_DIR/python/lsst/daf/butler/configs/datastore.yaml'

In [11]:
os.system('more '+filename_datastore_yaml)

::::::::::::::
/opt/lsst/software/stack/stack/miniconda3-py38_4.9.2-4.0.0/Linux64/daf_butler/g23f6d281cf+3bb374b9e1/python/lsst/daf/butler/configs/datastore.yaml
::::::::::::::
datastore:
  # Use file datastore as a default for a default butler
  cls: lsst.daf.butler.datastores.fileDatastore.FileDatastore


0
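
The trailing `0` in the two outputs above is the exit status returned by `os.system`, not part of the listing. A sketch of a pure-Python alternative, which returns the listing and file contents as values, is shown below. `list_dir`, `read_text`, and the `DEMO_CONFIG_DIR` variable are hypothetical names, and the temporary directory only stands in for the real configs folder.

```python
import os
import tempfile
from typing import List


def list_dir(path_template: str) -> List[str]:
    """Expand environment variables in the path and list it, sorted."""
    return sorted(os.listdir(os.path.expandvars(path_template)))


def read_text(path_template: str) -> str:
    """Expand environment variables in the path and return the file's text."""
    with open(os.path.expandvars(path_template)) as handle:
        return handle.read()


# Demonstration with a temporary directory standing in for the configs folder.
with tempfile.TemporaryDirectory() as tmp:
    os.environ['DEMO_CONFIG_DIR'] = tmp  # hypothetical variable for the demo
    with open(os.path.join(tmp, 'datastore.yaml'), 'w') as handle:
        handle.write('datastore:\n  cls: lsst.daf.butler.datastores.fileDatastore.FileDatastore\n')
    demo_listing = list_dir('$DEMO_CONFIG_DIR')
    demo_text = read_text('$DEMO_CONFIG_DIR/datastore.yaml')

print(demo_listing)
print(demo_text)
```

In the notebook, `list_dir('$DAF_BUTLER_DIR/python/lsst/daf/butler/configs')` and `read_text(filename_datastore_yaml)` would be the corresponding calls.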

There is a method, `lsst.daf.butler.Datastore.remove`, which will _"Indicate to the Datastore that a Dataset can be removed"_: https://pipelines.lsst.io/py-api/lsst.daf.butler.Datastore.html#lsst.daf.butler.Datastore.remove

So now I need to figure out how to use `Datastore.remove` on the datasets in my collection.
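
Before removing anything, it may help to see which datasets a run actually holds. The read-only sketch below counts them by dataset type. It assumes that `registry.queryDatasets` accepts `...` (Ellipsis) to mean "all dataset types" and that each returned reference exposes `datasetType.name`; both should be verified against the installed `lsst.daf.butler`. `count_datasets_by_type` is a hypothetical helper.

```python
from collections import Counter


def count_datasets_by_type(registry, run: str) -> Counter:
    """Count the datasets in one collection, grouped by dataset type name.

    ASSUMPTION: registry.queryDatasets(..., collections=run) returns an
    iterable of references that each have a .datasetType.name attribute.
    """
    refs = registry.queryDatasets(..., collections=run)
    return Counter(ref.datasetType.name for ref in refs)
```

This only queries the registry, so it should be safe to run even with the read-only butler.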

> **STOP HERE:** not sure that messing with the datastores is the way to go, check with Clare... 

In [12]:
# tmpDatastore = Datastore( ??? )

In [13]:
# tmpButler.datastore.remove( ??? )

In [14]:
del tmpButler