# Python implementation of the EMu API

This notebook provides examples of how to use this library to query the EMu REST API. Axiell provides a full description of the API [here](https://help.emu.axiell.com/emurestapi/latest/).

In [None]:
from xmu import EMuAPI, contains, exact, exists, phonetic, range_, stemmed

Create an EMuAPI object by passing the path to a config file. An example config file is included in this directory.

In [None]:
api = EMuAPI(config_path="../emurestapi.toml")

Use the `search()` method to find a catalog record. This method accepts the following arguments:

- The first argument is the backend name of the EMu module
- The **select** parameter is a list of fields to return. IRN is always included in results and does not need to be specified.
- The **filter_** parameter is a dictionary of fields and values to match. If multiple field-value pairs are given, all must match.
- The **limit** parameter sets the number of records to return per page. In practice, limits of up to 1000 records per page are generally safe, but higher limits may produce file size errors, especially when many fields are returned.

In [None]:
resp = api.search(
    "ebibliography",
    select=["RefTitle"],
    filter_={"RefTitle": "dinosaur"},
    limit=1,
)

You can get the first record using the `first()` method on the response object:

In [None]:
rec = resp.first()
rec

Use the `retrieve()` method to get the same record by IRN:

In [None]:
resp = api.retrieve("ebibliography", rec["irn"], select=["RefTitle"])
resp.first()

Most searches return more than one record. We can see the total number of records matching a given search using the `hits` attribute:

In [None]:
resp = api.search(
    "ebibliography",
    select=["RefTitle"],
    filter_={"RefTitle": "dinosaur"},
    limit=1,
)
resp.hits

We can use a for loop to iterate through the results. When the `autopage` attribute on the `EMuAPI` object is set to "true" (as is the case for the config file shared here), the loop will page through *all* results, automatically making new requests to the API when necessary:

In [None]:
resp = api.search(
    "ebibliography",
    select=["RefTitle"],
    filter_={"RefTitle": "dinosaur"},
    limit=10,
)
records = []
for i, rec in enumerate(resp):
    records.append(rec)
    if i >= 10:
        print(f"Found the {len(records)}th record!")
        break
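The paging behavior described above can be illustrated without a live API. The sketch below is not how `xmu` implements autopaging; `iter_pages` and `fake_fetch` are hypothetical names, and the "API" is just a list sliced into pages. It shows why the loop can reach an eleventh record even though each page holds only ten: a second request is made only when the first page is exhausted.

```python
from typing import Callable, Iterator


def iter_pages(fetch_page: Callable[[int, int], list], limit: int) -> Iterator[dict]:
    """Yield records one at a time, requesting a new page only when needed."""
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        if not page:
            return
        yield from page
        if len(page) < limit:
            return  # a short page means there is nothing left to request
        offset += limit


# Stand-in for the API (an assumption for this sketch): 25 fake records
FAKE_RECORDS = [{"irn": i} for i in range(1, 26)]
requests_made = []


def fake_fetch(offset, limit):
    requests_made.append(offset)  # record each simulated request
    return FAKE_RECORDS[offset:offset + limit]


records = []
for i, rec in enumerate(iter_pages(fake_fetch, limit=10)):
    records.append(rec)
    if i >= 10:
        break

print(len(records), requests_made)
```

Because the generator is lazy, breaking out of the loop early also stops further requests from being made.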

We can view the request parameters, including the compiled query, using the `params` attribute on the `EMuAPIResponse` object:

In [None]:
resp.params

## Working with attachments

I was not able to figure out how to query or return fields from attached records, for example, a Collections Event record linked in Catalog. For the time being, attachments are loaded using the `DeferredAttachment` class, which populates them only when they are accessed directly. In the code below, the attachment field is initially populated with a stub:

In [None]:
resp = api.search(
    "ebibliography",
    select=["RefContributorsRef_tab"],
    filter_={"RefTitle": "dinosaur"},
    limit=1,
)
rec = resp.first()
rec

The attachment populates automatically when it is accessed:

In [None]:
rec["RefContributorsRole_grp"][0]["RefContributorsRef"]
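To make the deferred-loading idea concrete, here is a minimal sketch of a stub that fetches a linked record only on first access. It is an illustration of the general pattern, not the library's `DeferredAttachment` class; `LazyAttachment`, `fake_loader`, and the sample record are all hypothetical.

```python
class LazyAttachment:
    """Hypothetical stub that loads a linked record the first time it is read."""

    def __init__(self, module, irn, loader):
        self.module = module
        self.irn = irn
        self._loader = loader  # callable that fetches the full record
        self._record = None

    def resolve(self):
        # Fetch once, then reuse the cached record
        if self._record is None:
            self._record = self._loader(self.module, self.irn)
        return self._record

    def __getitem__(self, key):
        return self.resolve()[key]

    def __repr__(self):
        state = "loaded" if self._record is not None else "deferred"
        return f"LazyAttachment({self.module!r}, irn={self.irn}, {state})"


calls = []


def fake_loader(module, irn):
    # Stand-in for a request to the API (an assumption for this sketch)
    calls.append((module, irn))
    return {"irn": irn, "NamLast": "Smith"}


stub = LazyAttachment("eparties", 1234, fake_loader)
print(stub)             # no request has been made yet
print(stub["NamLast"])  # first access triggers the load
print(stub["irn"])      # served from the cached record
```

The trade-off of this pattern is one extra request per attachment that is actually read, in exchange for not loading attachments that are never used.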

## Complex filters

The API provides operators to perform most searches that can be done in the client. Each operator is implemented with its own function.

When multiple fields are included in a query, records must match on all fields:

In [None]:
resp = api.search(
    "ebibliography",
    select=["RefTitle", "RefJournalBookTitle"],
    filter_={"RefTitle": "dinosaur", "RefJournalBookTitle": "paleontology"},
    limit=1,
)
print(f"Matched {resp.hits:,} records")
resp.first()

When multiple values are passed to a single key, records must match on at least one term in the list. For example, the following query matches records with titles containing either Triassic, Jurassic, or Cretaceous:

In [None]:
resp = api.search(
    "ebibliography",
    select=["RefTitle", "RefJournalBookTitle"],
    filter_={"RefTitle": ["triassic", "jurassic", "cretaceous"]},
    limit=1,
)
print(f"Matched {resp.hits:,} records")
resp.first()
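The boolean logic in the two cells above (all fields must match, but any one term in a list is enough) can be sketched locally. This is only an illustration of the AND/OR rules, not of how the API matches text; `matches` is a hypothetical helper and the sample record is made up.

```python
def matches(record: dict, filter_: dict) -> bool:
    """Return True if every field matches at least one of its terms."""
    for field, terms in filter_.items():
        if isinstance(terms, str):
            terms = [terms]
        words = set(str(record.get(field, "")).lower().split())
        # OR within a field: any one term is enough
        if not any(term.lower() in words for term in terms):
            return False  # AND across fields: one failed field rejects the record
    return True


record = {
    "RefTitle": "A Jurassic dinosaur",
    "RefJournalBookTitle": "Journal of Paleontology",
}

print(matches(record, {"RefTitle": ["triassic", "jurassic", "cretaceous"]}))
print(matches(record, {"RefTitle": "dinosaur", "RefJournalBookTitle": "paleontology"}))
print(matches(record, {"RefTitle": "dinosaur", "RefJournalBookTitle": "geology"}))
```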

Use `contains()` to match records where each word in the query term appears in the specified field:

In [None]:
resp = api.search(
    "ebibliography",
    select=["RefPublicationCity"],
    filter_={"RefPublicationCity": contains("New York")},
    limit=10,
)
print(f"Matched {resp.hits:,} records")
resp.first()

Use `exact()` to restrict the search to fields that exactly match a query term:

In [None]:
resp = api.search(
    "ebibliography",
    select=["RefPublicationCity"],
    filter_={"RefPublicationCity": exact("New York")},
    limit=10,
)
print(f"Matched {resp.hits:,} records")
resp.first()

Use `exists()` with `True` to search for records where a field is not null:

In [None]:
resp = api.search(
    "ebibliography",
    select=["RefTitle"],
    filter_={"RefTitle": exists(True)},
    limit=10,
)
print(f"Matched {resp.hits:,} records")
resp.first()

Or use `exists()` with `False` to search for records where a field is null:

In [None]:
resp = api.search(
    "ebibliography",
    select=["RefTitle"],
    filter_={"RefTitle": exists(False)},
    limit=10,
)
print(f"Matched {resp.hits:,} records")
resp.first()

Use `phonetic()` to search for records that sound like the query term:

In [None]:
resp = api.search(
    "eparties",
    select=["NamLast"],
    filter_={"NamLast": phonetic("smith")},
    limit=1,
)
print(f"Matched {resp.hits:,} records")
resp.first()

Use `range_()` to search for records with values within a certain range. This function accepts keyword arguments for combinations of the four inequality operators. For example, to find publications published in 2024:

In [None]:
resp = api.search(
    "ebibliography",
    select=["RefDate"],
    filter_={"RefDate": range_(gte="2024-01-01", lt="2025-01-01")},
    limit=1,
)
print(f"Matched {resp.hits:,} records")
resp.first()
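The bounds for a calendar year follow a half-open pattern: on or after January 1 of the year, and strictly before January 1 of the next. A small helper can compute them; `year_bounds` is a hypothetical name, and the keys simply mirror the `gte` and `lt` keywords used in the cell above.

```python
from datetime import date


def year_bounds(year: int) -> dict:
    """Return half-open ISO date bounds covering one calendar year."""
    return {
        "gte": date(year, 1, 1).isoformat(),
        "lt": date(year + 1, 1, 1).isoformat(),
    }


bounds = year_bounds(2024)
print(bounds)
```

Assuming `range_()` takes these as keyword arguments, as in the example above, the result could be passed as `range_(**bounds)`.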

Use `stemmed()` to search for records that have the same stem as a query term:

In [None]:
resp = api.search(
    "ebibliography",
    select=["RefTitle"],
    filter_={"RefTitle": stemmed("dinosaur")},
    limit=1,
)
print(f"Matched {resp.hits:,} records")
resp.first()

Most of these operators can also be accessed using the search syntax from the EMu client:

- contains: `search term`
- exact: `2000` or `\^term\$` or `\^\"search term\"\$`
- is not null: `\+` or `\*`
- is null: `\!\+` or `\!\*`
- phonetic: `@term`
- range_: `>1` or `<=100` or `>2000 <2025`
- stemmed: `\~term`

Note that some operations that are possible in the client have not been implemented, including phrase searches (which are defined in the API) and case-sensitive searches (which are not).

Finally, the `select` and `filter_` arguments can also include the compiled values that the API expects. This can be useful for complex searches that do not lend themselves to using the functions above:

In [None]:
resp = api.search(
    "ebibliography",
    select=["data.RefTitle"],
    filter_={
        "OR": [
            {"data.RefTitle": {"contains": {"value": "dinosaur"}}},
            {"data.RefTitle": {"contains": {"value": "cretaceous"}}},
        ]
    },
    limit=1,
)
resp.first()
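When a compiled filter like the one above has many terms, it may be easier to build it programmatically. The sketch below reproduces the same dictionary structure shown in the previous cell; `any_contains` is a hypothetical helper and is not part of `xmu`.

```python
def any_contains(field: str, terms: list) -> dict:
    """Build an OR filter in the compiled form shown in the cell above."""
    return {
        "OR": [
            {f"data.{field}": {"contains": {"value": term}}}
            for term in terms
        ]
    }


filter_ = any_contains("RefTitle", ["dinosaur", "cretaceous"])
print(filter_)
```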