# UMedia API Documentation

## 1. Introduction
This tutorial introduces Python code you can use to access data and metadata from UMedia.

### 1.1. Installation of Python3

In this tutorial, we will illustrate how to use Python to obtain data from [UMedia](https://umedia.lib.umn.edu/). We will use Python3 (more specifically, version 3.7.4), which you can download from the [Python website](https://www.python.org/downloads/).

### 1.2. Required packages

Once you're ready to use Python, let's import the two required packages:

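* ```json``` is a built-in Python module that encodes and decodes JSON.
* ```requests``` is a third-party Python module that you can use to send all kinds of HTTP requests. It is not part of the standard library, so you may need to install it first (for example, with ```pip install requests```). In this tutorial, you will learn how to use ```requests``` to send simple HTTP requests in Python.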

In [None]:
import requests
import json

## 2. Formatting URLs

Creating URLs is an important step in using the UMedia API, because the URL tells the website which data we are requesting. If the API is the car, the URL is the steering wheel: without it, the API won't go in the right direction.

For more info on how to format URLs for UMedia, see the Libraries' [Digital Collections APIs page](https://www.lib.umn.edu/digital/apis).

## 3. Using the Packages

### 3.1. Download JSON from UMedia

Let's use the ```requests``` and ```json``` libraries, and their built-in functions, to download data from UMedia. We'll focus below on the "Digitizing Immigrant Letters" collection. 

Let's create a URL and store it in a variable called ```my_url```. We will concatenate strings with Python's ```+``` operator to combine the different elements of the URL:

In [None]:
base_url = "https://umedia.lib.umn.edu/search.json?"
my_filter = "facets%5Bcollection_name_s"
my_key = "%5D%5B%5D=Digitizing+Immigrant+Letters"

my_url = base_url + my_filter + my_key
print(my_url)
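
The ```%5B```, ```%5D```, and ```+``` sequences in this URL are the percent-encoded forms of ```[```, ```]```, and a space. As a small sketch (the original tutorial does not use ```urllib.parse```; it is brought in here only for illustration), we can decode the query string with the standard library to see the filter in a readable form:

```python
from urllib.parse import urlsplit, parse_qs

# The same URL that the concatenation above produces
my_url = (
    "https://umedia.lib.umn.edu/search.json?"
    "facets%5Bcollection_name_s%5D%5B%5D=Digitizing+Immigrant+Letters"
)

# Take the part after the "?" and decode its percent-encoded characters
query = urlsplit(my_url).query
decoded = parse_qs(query)

# Each parameter name maps to a list of its values
for name, values in decoded.items():
    print(name, "=", values)
```

The decoded parameter name is ```facets[collection_name_s][]``` and its value is the collection name with ordinary spaces.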

If you look at the results for the URL above, you see they only include the first 20 items from the search results. To access data for up to 50 items at a time through a single API call, we can add ```&rows=50``` at the end of the URL:

In [None]:
my_item_num = "&rows=50"

my_url_50 = my_url + my_item_num
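
If you would rather not type the percent-encoded characters by hand, one possible alternative is to let ```urlencode()``` from the standard library do the encoding. This is only a sketch: ```build_search_url``` is a hypothetical helper name, not something provided by UMedia or by the original tutorial.

```python
from urllib.parse import urlencode

BASE_URL = "https://umedia.lib.umn.edu/search.json?"


def build_search_url(collection_name, rows=None):
    """Return a UMedia search URL filtered to one collection (hypothetical helper)."""
    # A list of pairs keeps the parameters in a predictable order
    params = [("facets[collection_name_s][]", collection_name)]
    if rows is not None:
        params.append(("rows", rows))
    # urlencode() percent-encodes the brackets and turns spaces into "+"
    return BASE_URL + urlencode(params)


print(build_search_url("Digitizing Immigrant Letters"))
print(build_search_url("Digitizing Immigrant Letters", rows=50))
```

The two printed URLs should match ```my_url``` and ```my_url_50``` from the cells above.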

To keep things simple, we will use the ```my_url``` variable here, which includes 20 items. 

Now let's use the ```get()``` function in the ```requests``` package to access the data from our ```my_url``` variable: 

In [None]:
r = requests.get(my_url)

The variable called ```r``` includes our search results as a response object, which can be difficult to read and understand. You can view the body of the response as text by running ```print(r.text)```. 

We can use the ```loads()``` function in ```json``` to parse that JSON text into Python objects, which will be easier to work with:

In [None]:
data = json.loads(r.text)
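
The two steps above assume that the request succeeded. As a hedged sketch, the request and the parsing can be wrapped in one function that stops early when the server reports an error. ```fetch_json``` is a hypothetical helper name and the 30-second timeout is an arbitrary choice; ```raise_for_status()``` and ```json()``` are methods of the ```requests``` response object.

```python
import requests


def fetch_json(url, timeout=30):
    """Request a URL and return its parsed JSON body (hypothetical helper)."""
    # The timeout keeps the script from waiting forever on a stalled connection
    response = requests.get(url, timeout=timeout)
    # Raise requests.HTTPError for 4xx/5xx responses instead of parsing an error page
    response.raise_for_status()
    # Parse the JSON body into Python objects, much like json.loads(response.text)
    return response.json()
```

With this helper defined, ```data = fetch_json(my_url)``` would stand in for the two cells above.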

The information has been stored in the variable called ```data```, which is a Python list of Python dictionaries. 

Let's take a look at how many items are stored in the ```data``` variable. This can be done with the Python ```len()``` function:

In [None]:
print(len(data))

You can access an item in the data list by referring to its index: ```data[0]``` refers to the first item in the list, ```data[1]``` the second, and so on. We use a zero instead of a one for the first item in the list because Python (like many other programming languages) starts list indexes from zero.

In [None]:
data[0]

It's important to note that because our ```data``` list starts at the index ```data[0]```, it ends at ```data[19]```. For a list of 20 items we can't ask for ```data[20]```, because that would be the 21st item in the list (which doesn't exist)! Asking for ```data[20]``` raises an ```IndexError``` ("list index out of range").

In [None]:
# Python is comfortable with this
data[19]

In [None]:
# Python raises an IndexError for this
data[20]
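
If you want your script to keep running when an index is out of range, you can catch the error. The sketch below uses a made-up list of 20 placeholder items in place of the real search results, so it runs without contacting UMedia:

```python
# Made-up stand-in for the 20 items returned by the API
data = [{"title": "Letter %d" % n} for n in range(1, 21)]

print(len(data))          # number of items in the list
print(data[-1]["title"])  # negative indexes count from the end of the list

try:
    data[20]
except IndexError as err:
    # Reached because a 20-item list has no index 20
    print("No item at index 20:", err)
```

Using ```data[-1]``` is a handy way to get the last item without knowing how long the list is.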

The next concern would be: what specific data elements are stored in each item from the ```data``` list? 

To answer this question, we need to dive into the inner dictionary and explore what information is stored there. Each dictionary holds a set of keys and their values. We can tell it's a Python dictionary instead of a Python list because it's enclosed in curly brackets: 
```
{'id': 'p16022coll264:133', 'collection_name': 'Digitizing Immigrant Letters'}
```
In the example above, the first key is 'id' and its value is 'p16022coll264:133'. The second key is 'collection_name' with a value of 'Digitizing Immigrant Letters'.

We can use the following code to store all of the keys for the dictionary in ```data[0]``` in a variable called ```keys``` and print them out using a for-loop. 

In [None]:
keys = list(data[0].keys())

for i in range(len(keys)):
    print(keys[i])
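
Python also lets you loop over a dictionary directly, without building a list of keys or using indexes. The sketch below uses a small made-up item containing only the two keys from the example above; a real UMedia record has many more.

```python
# Made-up stand-in for one item; a real record has many more keys
item = {
    "id": "p16022coll264:133",
    "collection_name": "Digitizing Immigrant Letters",
}

# Looping over a dictionary yields its keys directly
for key in item:
    print(key)

# .items() yields each key together with its value
for key, value in item.items():
    print(key, "->", value)
```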

From the above results, we can see all of the keys that we can use to obtain values such as title, creator, transcript, page count, type, date added, etc. We can reference these keys to take a look at the title and the transcript of the first item: 

In [None]:
print(data[0]["title"])

In [None]:
print(data[0]["transcription"])

To print out the titles for all 20 items we grabbed from UMedia in the ```data``` list, we can use another for-loop: 

In [None]:
for i in range(len(data)):
    print(data[i]["title"])
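
The loop above assumes that every item has a ```title``` key. Whether every UMedia record includes every key is not something this tutorial confirms, so as a cautious sketch you can use the dictionary ```get()``` method, which returns a default value instead of raising a ```KeyError```. The sample records and the placeholder strings below are made up for illustration.

```python
# Made-up sample records; the second one has no "transcription" key
data = [
    {"title": "First letter", "transcription": "Dear brother..."},
    {"title": "Second letter"},
]

summaries = []
for item in data:
    # get() falls back to the second argument when the key is missing
    title = item.get("title", "(untitled)")
    transcription = item.get("transcription", "(no transcription)")
    summaries.append(title + " | " + transcription)

print("\n".join(summaries))
```

Looping with ```for item in data``` visits each dictionary in turn, so no index variable is needed.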

Now, we can work with our ```data``` variable to obtain more information about specific items in our UMedia search results. Let's consider how to save the transcripts for all of the items in our Python list to a text file.

We can do this with Python's built-in ```open()``` function. As we use it here, the function takes two arguments: the first specifies the name of the file and the second specifies the mode in which to open it. In our case, we want to write data to the file, so we'll use the mode "w+" in the second argument, which creates the file if it doesn't exist and overwrites it if it does. See [the Python documentation](https://docs.python.org/3/library/functions.html) for more information about the "open()" function and its usage. 

The following line of code does three things: 

1. Creates a file object called "f"
2. Creates a text file called "letters.txt" on your computer
3. Specifies that we want to write to the text file, ```"w+"```

In [None]:
f = open("letters.txt", "w+")

After opening (or creating) the file, we can write data to it using the ```write()``` function. We can use a for-loop to iteratively write out both the transcripts and titles for each item into our letters.txt file. 

In [None]:
for i in range(len(data)):
    f.write(data[i]["title"] + "\n" + data[i]["transcription"] + "\n\n")

There should now be a file called "letters.txt" in the same directory as this code. When you open it, you will see the title along with the transcript of each item. 

Remember to close a file every time you open one. Closing the file makes sure that any buffered data is actually written to disk, and it releases the file handle back to the operating system. The code to close your ```f``` file object is:

In [None]:
f.close()
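
A common way to make sure a file is always closed is the ```with``` statement, which closes the file for you when the block ends. The sketch below is one possible variant of the steps above: ```write_letters``` is a hypothetical helper name, the sample record is made up, and the explicit UTF-8 encoding is an assumption that suits text containing non-English characters.

```python
def write_letters(items, path):
    """Write each item's title and transcription to a text file (hypothetical helper)."""
    # The with-statement closes the file automatically, even if an error occurs
    with open(path, "w", encoding="utf-8") as f:
        for item in items:
            f.write(item["title"] + "\n" + item["transcription"] + "\n\n")


# Made-up sample record standing in for the real search results
sample = [{"title": "First letter", "transcription": "Dear brother..."}]
write_letters(sample, "letters_sample.txt")
```

With the real search results, the call would be ```write_letters(data, "letters.txt")```.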

### 3.2. Download Objects from UMedia

In this section, we will discuss how to use the API in Python to download digital objects from UMedia. We will focus here on how to download image files. 

To download images, we can use another built-in module in Python called ```shutil```. This module allows us to perform a variety of operations, such as copying and deletion, on local files. See [the Python documentation](https://docs.python.org/3/library/shutil.html) to learn more about ```shutil```. 


First import the tool:

In [None]:
import shutil

Let's create a URL for the image we want to download. As an example, we will use the image from the Libraries' documentation on [how to construct IIIF calls to UMedia](https://www.lib.umn.edu/digital/apis#iiif). 

In [None]:
image_url = "https://cdm16022.contentdm.oclc.org/digital/iiif/p16022coll208/4833/full/full/0/default.jpg"
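
The end of this URL follows the IIIF Image API pattern of region, size, rotation, and then quality and format (here ```full/full/0/default.jpg```). As a sketch, the URL can be assembled from its parts. ```build_iiif_url``` is a hypothetical helper name, and reading ```p16022coll208``` as a collection identifier and ```4833``` as an item number is an assumption based only on the shape of the example URL.

```python
IIIF_BASE = "https://cdm16022.contentdm.oclc.org/digital/iiif"


def build_iiif_url(collection, item_id, region="full", size="full",
                   rotation="0", quality="default", fmt="jpg"):
    """Assemble an IIIF image URL from its parts (hypothetical helper)."""
    # Pattern: {base}/{collection}/{item}/{region}/{size}/{rotation}/{quality}.{format}
    return (f"{IIIF_BASE}/{collection}/{item_id}/"
            f"{region}/{size}/{rotation}/{quality}.{fmt}")


print(build_iiif_url("p16022coll208", "4833"))
```

With the default arguments, the helper reproduces the ```image_url``` string above. Which other region, size, or rotation values the server accepts is covered by the IIIF documentation linked earlier.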

Next, we can use the ```requests``` package to request the image at that URL. 

In [None]:
resp = requests.get(image_url, stream = True)

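We need to set ```stream``` to ```True``` in ```get()```, which we didn't do previously when working with UMedia metadata. With streaming turned on, ```requests``` does not download the whole response body right away. Instead, it lets us read the binary image data from ```resp.raw``` and copy it straight into a file. 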

Next, we can open a local file to store the image that we requested above:

In [None]:
local_file = open('local_image.jpg', 'wb')

The second argument of the ```open()``` function is ```wb``` instead of ```w+``` because we need to write binary data to the file, indicated by the 'b' in 'wb'. 

Before writing the data to the 'local_image.jpg' file that we just created, we first need to set ```resp.raw.decode_content = True```. This tells the raw stream to decode any compression (such as gzip) that the server applied to the response as the data is read. Without it, the saved file may be empty or corrupted and fail to open. 

In [None]:
resp.raw.decode_content = True

Now, we can write the image data to the file. This is done with one line of code, using the ```copyfileobj()``` function from the ```shutil``` module. Don't forget to close the local file and delete the image response object before celebrating: 

In [None]:
# write the data into the local_file
shutil.copyfileobj(resp.raw, local_file)

# close the file so that all of the image data is saved to disk
local_file.close()

# remove the image response object
del resp
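
The download steps in this section can also be gathered into a single function. The version below is only a sketch, not the tutorial's own method: ```download_image``` is a hypothetical helper name, the timeout value is arbitrary, and the ```with``` statements are used so that both the response and the file are closed automatically.

```python
import shutil

import requests


def download_image(url, path, timeout=30):
    """Stream an image from a URL into a local file (hypothetical helper)."""
    # Both with-statements release their resources when their block ends
    with requests.get(url, stream=True, timeout=timeout) as resp:
        # Stop early if the server answered with an error status
        resp.raise_for_status()
        # Decode any compression applied to the response while reading it
        resp.raw.decode_content = True
        with open(path, "wb") as local_file:
            shutil.copyfileobj(resp.raw, local_file)
```

Calling ```download_image(image_url, "local_image.jpg")``` would then repeat the steps above in one line.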

After running the code above, you should find that the image called "local_image.jpg" is in the same directory as the code! Now you know how to download images from UMedia. 

This is the end of the tutorial. Thank you for reading! 