# Using the NMDC Runtime API

## Introduction

In this tutorial, I'll show you how you can use a Python notebook to interact with the NMDC Runtime API.

Specifically, I'll show you how you can use a Python notebook to (a) submit HTTP requests, (b) parse HTTP responses, and (c) authenticate an HTTP client.

## Getting help

If you have questions about the contents of this notebook, you can post them as [GitHub issues](https://github.com/microbiomedata/nmdc-runtime/issues/new) in the `microbiomedata/nmdc-runtime` GitHub repository, where this notebook resides. NMDC team members regularly review open issues. If you don't have a GitHub account, you can email your questions to the [NMDC Support Team](mailto:support@microbiomedata.org).

## 1. Install dependencies

Before you can access the NMDC Runtime API (which runs as an HTTP service), you'll need an HTTP client. A popular HTTP client for Python is called `requests`. You can install it on your computer by running the following cell:



In [2]:
%pip install requests
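
If you'd like to confirm that the installation worked before moving on, here is a minimal sketch that uses only the Python standard library. The `installed_version` helper is a name I made up for this example; it is not part of `requests` or of the NMDC Runtime.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version of a package, or None if it is not installed.

    Note: `installed_version` is a hypothetical helper written for this tutorial.
    """
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Prints a version string if `requests` is installed, or `None` if it is not.
print(installed_version("requests"))
```

If that cell prints `None`, try running the `%pip install requests` cell again.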

Now that the `requests` package is installed, you can use it to send HTTP requests to HTTP servers. For example, you can run the following cell to submit an HTTP GET request to an example HTTP server:

In [6]:
import requests

# Submit an HTTP GET request to an example HTTP server.
response = requests.get("https://jsonplaceholder.typicode.com/posts/1")

Now that you've submitted the HTTP request, the `response` variable contains information about the HTTP response the example HTTP server sent back. You can examine it by running the following cells:

In [7]:
# Get the HTTP status code from the response.
response.status_code

In [8]:
# Parse the response body as JSON.
response.json()

If the first of those cells outputs the number `200` and the second one outputs a Python dictionary having several keys (including `id` and `title`), you are good to go!

> If those cells did not produce that output, here are some troubleshooting tips: (1) check your Internet connection, (2) visit the URL from the example above in your web browser, and (3) review the [documentation](https://requests.readthedocs.io/en/latest/) of the `requests` package.
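
If you'd rather have your code perform those two checks for you, here is a minimal sketch. The `check_response` helper is a hypothetical name I made up for this example. It assumes only that the object you pass in has a `status_code` attribute and a `json()` method, which is true of the `response` objects that `requests` returns.

```python
def check_response(response):
    """Return a (succeeded, detail) tuple for a `requests`-style response object.

    Note: `check_response` is a hypothetical helper written for this tutorial.
    It relies only on the `status_code` attribute and the `json()` method.
    """
    if response.status_code != 200:
        return False, f"Unexpected HTTP status code: {response.status_code}"
    try:
        return True, response.json()
    except ValueError:
        # `json()` raises a subclass of `ValueError` when the body is not valid JSON.
        return False, "The response body is not valid JSON."
```

You could call it like this: `succeeded, detail = check_response(response)`. When `succeeded` is `True`, `detail` holds the parsed data; otherwise, it holds a short description of what went wrong.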

Now that you can access _an_ HTTP server, let's access the **NMDC Runtime API**.

## 2. Access an NMDC Runtime API endpoint

The NMDC Runtime API has a variety of API endpoints that you can send HTTP requests to.

> The full list of API endpoints is available in the NMDC Runtime API's [API documentation](https://api.microbiomedata.org/docs).

One of the API endpoints that I like to send HTTP requests to is `/studies`. That API endpoint responds with a list of all the studies that exist in the NMDC database!

You can run the following cell to send an HTTP GET request to that API endpoint:

In [9]:
response = requests.get("https://api.microbiomedata.org/studies")

Now that you have received an HTTP response from the endpoint, you can examine it like before. You can see the JSON data (in this case, a list of studies) by running the code in this cell:

In [11]:
response.json()

Whoa! That's a lot of output. Let's break it down.

You can run the following cell to see only its top-level properties:

In [16]:
response.json().keys()

The `meta` property contains data _about the response_, such as pagination parameters and search filter criteria.

The `results` property contains the requested data (in this case, a list of studies).

You can ignore the `group_by` property. According to the NMDC Runtime API's API documentation, `group_by` is not implemented yet.
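
When a response is too large to read comfortably, it can help to look at the shape of each top-level property before looking at its contents. Here is a minimal sketch of that idea. The `summarize_top_level` helper is a hypothetical name, and `example_payload` is a made-up stand-in that only mimics the three top-level keys described above; it is not real NMDC data.

```python
def summarize_top_level(payload):
    """Map each top-level key to a (type name, length) pair.

    The length is None for values that have no length, such as numbers.
    Note: `summarize_top_level` is a hypothetical helper written for this tutorial.
    """
    summary = {}
    for key, value in payload.items():
        length = len(value) if isinstance(value, (dict, list, str)) else None
        summary[key] = (type(value).__name__, length)
    return summary


# A made-up stand-in with the same three top-level keys as the real response.
example_payload = {"meta": {"placeholder": 1}, "results": [{}, {}], "group_by": []}
print(summarize_top_level(example_payload))
```

To inspect the real response, you could run `summarize_top_level(response.json())`.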

Let's display just the `meta` property:

In [17]:
response.json()["meta"]

According to the `meta` property, there are 32 studies in the database.

> Note: At the time of this writing, there are 32. When you run the cell, you may see a different number. The database is constantly changing.

Let's count the studies we received in the `results` list:

In [18]:
len(response.json()["results"])

The `results` list contains only 25 studies, not 32. That's because this endpoint uses [pagination](https://en.wikipedia.org/wiki/Pagination#In_Database), and the default page size happens to be 25.

You can customize the page size like this:

In [21]:
# Resend the same HTTP request, but include a higher page size than the default of 25.
response = requests.get("https://api.microbiomedata.org/studies?per_page=100")

# Count the studies in the `results` list.
len(response.json()["results"])

There they are!

You can use the `per_page` parameter to customize the number of items you want to receive per HTTP response.
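
If the number of items ever outgrows the largest page size you want to request, you could walk through the pages instead. Here is a minimal sketch of that approach. It rests on two assumptions that this notebook does not confirm: that the total number of items is stored under `meta["count"]`, and that the endpoint accepts a `page` parameter alongside `per_page`. Please check both against the API documentation before relying on them. The `collect_all_results` and `fetch_page` names are also made up for this example.

```python
import math


def collect_all_results(fetch_page, per_page=25):
    """Gather the `results` lists from every page into a single list.

    `fetch_page(page, per_page)` must return a parsed response dictionary.
    Assumption: the total item count is stored under `meta["count"]`.
    """
    first_page = fetch_page(1, per_page)
    total_count = first_page["meta"]["count"]  # Assumed key name.
    results = list(first_page["results"])

    # For example, 32 items at 25 items per page require ceil(32 / 25) = 2 pages.
    num_pages = math.ceil(total_count / per_page)
    for page in range(2, num_pages + 1):
        results.extend(fetch_page(page, per_page)["results"])
    return results


# Example usage with `requests` (not run here; the `page` parameter is an assumption):
#
#   def fetch_page(page, per_page):
#       url = "https://api.microbiomedata.org/studies"
#       return requests.get(url, params={"page": page, "per_page": per_page}).json()
#
#   all_studies = collect_all_results(fetch_page)
```

I pass the fetching step in as a function so that the paging logic can be tried out with stand-in data, without sending any HTTP requests.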

You can use other parameters to customize the response in other ways, too. For example, you can run the following cell to request only studies whose `ecosystem_category` value is `Aquatic`, and request that the API response contain at most two studies, sorted by name.

In [29]:
response = requests.get("https://api.microbiomedata.org/studies?filter=ecosystem_category:Aquatic&per_page=2&sort_by=name")

# Parse the response body once and keep the list of studies.
studies = response.json()["results"]

# Print the number of studies in the response.
print(len(studies))

# Print their names in the order in which they appear in the response.
for study in studies:
    print(study["name"])
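
Long query strings like the one above are easy to mistype. Here is a minimal sketch that builds the same URL from named parameters, using the standard library's `urlencode` function. The `build_url` helper and the `BASE_URL` constant are hypothetical names I made up for this example.

```python
from urllib.parse import urlencode

# Hypothetical constant for this example: the base URL used throughout this notebook.
BASE_URL = "https://api.microbiomedata.org"


def build_url(path, **params):
    """Build a full URL from an endpoint path and keyword query parameters.

    Note: `build_url` is a hypothetical helper written for this tutorial.
    """
    # Keep the colon unescaped so that filters such as `key:value` stay readable.
    query = urlencode(params, safe=":")
    return f"{BASE_URL}{path}?{query}" if query else f"{BASE_URL}{path}"


print(build_url("/studies", filter="ecosystem_category:Aquatic", per_page=2, sort_by="name"))
```

The `requests` package can also do this for you: `requests.get` accepts a `params` argument, so you can write `requests.get("https://api.microbiomedata.org/studies", params={"filter": "ecosystem_category:Aquatic", "per_page": 2, "sort_by": "name"})` and let it assemble the query string.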

**Congratulations!** You've used a Python notebook to retrieve data residing in the NMDC database, via the NMDC Runtime API. 🎉

## 3. Access a _protected_ NMDC Runtime API endpoint

TODO