## Filter results - find top containers with no location
The ArchivesSpace API is mostly built around getting and editing objects by their IDs. To do a bulk operation, you first need some way to arrive at a list of record IDs. You can get all IDs for a type of object in a repository, get all child IDs from a known item, or, if you have time, look at every item.
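As a rough sketch of the first option: ArchivesSpace list endpoints generally accept an `all_ids=true` query parameter that returns a bare list of numeric IDs instead of full records. The helper names (`all_ids_url`, `record_uris`), the host, and the ID list below are illustrative assumptions, not part of the API.

```python
def all_ids_url(host, repo_id, record_type):
    """Build the URL that asks for every ID of one record type."""
    return f"{host}/repositories/{repo_id}/{record_type}?all_ids=true"


def record_uris(repo_id, record_type, ids):
    """Turn a list of numeric IDs into per-record API paths."""
    return [f"/repositories/{repo_id}/{record_type}/{i}" for i in ids]


# Hypothetical host and a made-up ID list, just to show the shapes
print(all_ids_url("http://localhost:8089", 2, "top_containers"))
print(record_uris(2, "top_containers", [1, 2, 3]))
```

Each path produced by `record_uris` can then be appended to the host and fetched one record at a time.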

### Example: get a sample of top containers
First, we'll need to log in again:

In [None]:
# Log in again

# we need our request making tool, which we 
# can get by importing it like so:
import json
import requests

# first, we'll save bits of connection information (on the right)
# as variables (on the left)

USER = ''
PASS = ''
HOST = ''

# Here's our authentication function. It will return request headers
# containing a session token if it works, or the value False if it doesn't


def aspace_auth(host, username, password):
    auth = requests.post(host + '/users/' + username + '/login',
                        params={'password' : password})
    if auth.status_code == 200:
        token = auth.json()['session']
        headers = {'X-ArchivesSpace-Session': token}
        return headers
    else:
        return False

headers = aspace_auth(HOST, USER, PASS)
print(headers)
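Because `aspace_auth` returns `False` when the login fails, it can be worth stopping right away instead of sending later requests without a session. A minimal sketch, where `require_auth` is a hypothetical helper and not part of the workshop code:

```python
def require_auth(headers):
    """Return the headers unchanged, or raise if the login failed."""
    if not headers:
        raise RuntimeError("ArchivesSpace login failed: check HOST, USER and PASS")
    return headers


# Example with a made-up token; a failed login (False) would raise instead
print(require_auth({"X-ArchivesSpace-Session": "example-token"}))
```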

### API reference: top containers
https://archivesspace.github.io/archivesspace/api/#get-repositories-repo_id-top_containers

The link above points to the documentation for the API call that gets all top containers from a repository. I find the documentation a bit confusing here, but you must include some parameter to tell the API how many results you want (the results are paginated). Below I'm passing in a page number and the number of results per page as query string parameters in the format `?page=X&page_size=Y`.
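To make the query string format concrete, here is a small sketch that builds the same kind of URL from its parts. The function name `top_containers_url` and the host value are assumptions for illustration.

```python
from urllib.parse import urlencode


def top_containers_url(host, repo_id, page, page_size):
    """Build a paginated top_containers URL from its parts."""
    query = urlencode({"page": page, "page_size": page_size})
    return f"{host}/repositories/{repo_id}/top_containers?{query}"


# Hypothetical host; repository 2 matches the examples in this notebook
print(top_containers_url("http://localhost:8089", 2, 1, 250))
```

With `requests` you can get the same effect by leaving the query string off the URL and passing `params={'page': 1, 'page_size': 250}` to `requests.get`.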

In [None]:
# The format for the API call is /repositories/:repo_id/top_containers.
# You can fill in the details of the call below.

container_req = requests.get(HOST +'/repositories/2/top_containers?page=5&page_size=250',
                             headers=headers)

In [None]:
# We'll save the json we got back as a variable called "containers"
containers = container_req.json()


In [None]:
# now let's see what it looks like
containers.keys()


In [None]:
# let's get the total number of pages (we'll use this later)
containers['last_page']

In [None]:
# also, dump it here to get a better look
containers

### What does this mean?

Looks like we're getting a few facts back. We learn how many pages of results there are, what the first page of results is, and what page we're viewing now. Also included, under the `results` key, is a list of JSON objects that represent our containers.
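A short sketch of reading those facts out of a response. The `last_page` and `results` keys appear in this notebook; `first_page`, `this_page` and `total` are the key names I'd expect from ArchivesSpace, so treat them as assumptions and compare against your own `containers.keys()` output. The `sample` dictionary is made up.

```python
def describe_page(page_json):
    """Summarize the pagination facts in one page of API results."""
    results = page_json.get("results", [])
    return (f"page {page_json.get('this_page')} of {page_json.get('last_page')}, "
            f"{len(results)} records on this page, "
            f"{page_json.get('total')} records overall")


# A made-up response with the same shape as a real one
sample = {"first_page": 1, "last_page": 3, "this_page": 1,
          "total": 5, "results": [{}, {}]}
print(describe_page(sample))
```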

In [None]:
# Let's look at one container record for kicks
containers['results'][0]

### Filter for empty locations

Now we can use a loop to do something with the first list of containers. Let's check for top containers that have no location. Looking at the example record above, we can see that `container_locations` holds location information as a list. We'll check for any top containers where this list is empty.

In [None]:
# OK, container location is represented as a list.
# Below, we'll add the containers with no location to a new
# list called empty_location_containers

# start with an empty list
empty_location_containers = []

# for each container in our sample of containers
for container in containers['results']:
    # if the value of the container_locations key is an empty list
    if container['container_locations'] == []:
        # add it to our empty list
        empty_location_containers.append(container)
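The same filter can be written as a small function with a list comprehension. This sketch uses `.get()` so that a record missing the `container_locations` key altogether is also treated as having no location, which is a slightly broader test than the loop above. The function name `without_location` and the sample records are made up for illustration.

```python
def without_location(records):
    """Keep only records whose container_locations is empty or missing."""
    return [r for r in records if not r.get("container_locations")]


# Made-up records: one empty list, one with a location, one with no key
sample = [
    {"uri": "/repositories/2/top_containers/1", "container_locations": []},
    {"uri": "/repositories/2/top_containers/2",
     "container_locations": [{"status": "current"}]},
    {"uri": "/repositories/2/top_containers/3"},
]
print([r["uri"] for r in without_location(sample)])
```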

In [None]:
# check the length of our list
len(empty_location_containers)

In [None]:
empty_location_containers

In [None]:
# Maybe you want to save these records to a file.
# Glossing over the file writing syntax. Docs here:
# https://docs.python.org/3/tutorial/inputoutput.html#reading-and-writing-files

with open('empty_locations_sample.jsonl', 'w') as fh:
    for record in empty_location_containers:
        # Here we're writing our record, and a newline character
        # so we end up with one record per line
        fh.write(json.dumps(record) + '\n')
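To check the file afterwards, the records can be read back one line at a time. A minimal sketch; `read_jsonl` is a hypothetical helper name:

```python
import json


def read_jsonl(path):
    """Load a JSON Lines file into a list of Python objects."""
    records = []
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            # skip blank lines so a trailing newline doesn't cause an error
            if line:
                records.append(json.loads(line))
    return records
```

For example, `len(read_jsonl('empty_locations_sample.jsonl'))` should match `len(empty_location_containers)`.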

### Putting it all together

Now that we've done some exploration and pieced together a workflow to identify containers with no location, let's put it all together in one script.

### STOP HERE

In [None]:
# import libraries

import requests
import json

# first, we'll save bits of connection information (on the right)
# as variables (on the left)

USER = ''
PASS = ''
HOST = ''

# The last page of results (I'm cheating
# a bit because I already know what the last page is from our
# previous exercise). This takes a long time, so we'll pretend
# you have a lot fewer top containers

#LAST_PAGE = 120

LAST_PAGE = 3
PAGE_SIZE = 250

# Here's our authentication function. It will return request headers
# containing a session token if it works, or the value False if it doesn't


def aspace_auth(host, username, password):
    auth = requests.post(host + '/users/' + username + '/login',
                        params={'password' : password})
    if auth.status_code == 200:
        token = auth.json()['session']
        headers = {'X-ArchivesSpace-Session': token}
        return headers
    else:
        return False

headers = aspace_auth(HOST, USER, PASS)

# OK, let's get those empty containers!

# We'll start on page 1
page = 1

# While this condition is true (1 is less than or equal to 3 in this case)
while page <= LAST_PAGE:
    # Get a page of containers, notice how we're filling in
    # the value of page and page size each time.
    container_req = requests.get(HOST +'/repositories/2/top_containers?page='
                                 + str(page)
                                 + '&page_size=' 
                                 + str(PAGE_SIZE),
                                 headers=headers)
    
    # I put this in so there's some output and we don't get
    # too impatient
    if container_req.status_code <= 299:
        print("Got result page: " + container_req.url)
    
    # We're interested in the results
    containers = container_req.json()['results']
    
    # We'll open a file to write our empty containers to. The "with"
    # block closes the file for us when we're done with this page
    with open('page_' + str(page) + '_containers.jsonl', 'w') as fh:
        # We'll do our filter for each set of results
        for container in containers:
            # if the value of the container_locations key is an empty list
            if container['container_locations'] == []:
                # write it to our file
                fh.write(json.dumps(container) + '\n')
    
    # finally, add 1 to the page so the next time the loop
    # runs it will get the next page
    page = page + 1
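Instead of hard-coding `LAST_PAGE`, the loop can read `last_page` from each response and stop on its own. The sketch below shows one way to do that, assuming every page of results includes `last_page` and `results`. `iter_pages` and `fake_fetch` are hypothetical names; `fake_fetch` stands in for the real API call so the example runs without a server.

```python
def iter_pages(fetch_page, page_size=250):
    """Yield the results list from each page until the last page is reached."""
    page = 1
    while True:
        data = fetch_page(page, page_size)
        yield data["results"]
        # stop once the API tells us this was the final page
        if page >= data["last_page"]:
            break
        page += 1


def fake_fetch(page, page_size):
    """Stand-in for the API: serves 5 made-up records, 2 per page."""
    all_records = [{"id": n, "container_locations": []} for n in range(5)]
    start = (page - 1) * page_size
    return {"last_page": 3, "results": all_records[start:start + page_size]}


for results in iter_pages(fake_fetch, page_size=2):
    print(len(results))
```

In real use, `fetch_page` would be a small function that calls `requests.get` on the top containers endpoint with `params={'page': page, 'page_size': page_size}` and `headers=headers`, then returns `.json()`.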
    
