# Additional operations with bagit

This notebook uses the same samples as the previous one, but
demonstrates additional actions like validating bags and checking for errors.

In [None]:
import bagit

## Content Validation: is_valid()

One benefit of this packaging approach is its simplicity: a bag exists only as files on a disk or server,
and no specialized software is needed to view or decompress them. The approach also lets you,
as a digital curator, librarian, or archivist, receive, store, and preserve digital assets even when
you do not have full information about what the assets are or how they might be used. In the words
of the BagIt spec, the contents of a bag are "opaque": you can verify that the content is unchanged
whether or not you can display it or process it with software, and whether or not it is subject to
rights management or proprietary restrictions.

The specification and structure of a BagIt bag make it possible to check the contents
without "seeing" them, because we can check whether the bag is **complete** and
whether it is **valid**.

* A **complete** bag has all of the required elements of a bag: a BagIt declaration (`bagit.txt`),
a payload (the `data` directory), and at least one payload manifest (the list of files and checksums,
located in the top-level directory and usually named something like `manifest-sha256.txt`).
Every file listed in a manifest must be present, and every payload file must be listed in a manifest.
* A **valid** bag is a complete bag in which every payload file's checksum can be recalculated
and matches the checksum listed in the manifest, indicating that the contents have not changed.
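The difference between the two checks can be sketched with the standard library alone. The helper below, `check_payload`, is hypothetical and not part of `bagit`. It assumes a SHA-256 manifest whose lines have the form `<checksum> <path>`, and it skips parts of a real validation, such as the BagIt declaration, tag manifests, and payload files that are missing from the manifest.

```python
import hashlib
from pathlib import Path


def check_payload(bag_dir, manifest_name="manifest-sha256.txt"):
    """Return (missing, mismatched) lists of manifest paths for a bag directory.

    Hypothetical helper for illustration only; bagit's own validation does more.
    """
    bag_dir = Path(bag_dir)
    missing, mismatched = [], []
    manifest = (bag_dir / manifest_name).read_text(encoding="utf-8")
    for line in manifest.splitlines():
        if not line.strip():
            continue
        # Each manifest line is "<checksum> <relative path>".
        expected, rel_path = line.split(maxsplit=1)
        target = bag_dir / rel_path
        if not target.is_file():
            missing.append(rel_path)      # a completeness problem
            continue
        found = hashlib.sha256(target.read_bytes()).hexdigest()
        if found != expected.lower():
            mismatched.append(rel_path)   # a validity problem
    return missing, mismatched
```

An empty `missing` list is one part of completeness. If `mismatched` is also empty, every listed file still matches its recorded checksum.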

To assess a bag, we can again use the `bagit` library, whose `Bag` objects have an `is_valid()` method.
This method checks whether the bag is valid in the sense described above. For demonstration, the next two cells use the `sample-bag-1-valid` folder, which is
an already-created bag included in the GitHub repo.

In [None]:
# load the bag
test_bag = bagit.Bag('sample-bag-1-valid/')

In [None]:
# check to see if the bag is valid
if test_bag.is_valid():
    print("yay :)")
else:
    print("boo :(")

- What output did you get above?

In [None]:
validity = test_bag.is_valid()

print(validity, type(validity))

- Note that `is_valid()` returns a Boolean value (`True`/`False`).
- In a script, this lets you test validity and then offer update or correction options.

In [None]:
# what if the bag is not valid?
not_a_valid_bag = bagit.Bag('sample-bag-2-invalid/')

not_a_valid_bag.is_valid()

The `validate()` method allows a closer look at the errors the validator finds: instead of returning `False`, it raises a `bagit.BagValidationError` that describes the problem.

In [None]:
try:
    not_a_valid_bag.validate()

except bagit.BagValidationError as error_msg:
    print(error_msg)

- Error-message reading activity: what sort of error comes up with the `sample-bag-2-invalid` bag?