### xDD Document-store API Basic Usage

The xDD document-store currently supports a simple set of operations for retrieving full documents and individual pages from the xDD corpus.

**Note**: The document-store API uses simple API key authentication, sent in the `X-Api-Key` header, for the endpoints that return document content. Before running the examples below, place your API key in a file called `config.json` in this directory:
```json
{
    "API_KEY": "<your api key>"
}
```

In [None]:
import requests
import json
from IPython.display import FileLink

BASE_URL = "https://xdddev.chtc.io/documentstore-api"

with open("./config.json") as config:
    API_KEY = json.load(config)["API_KEY"]
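
If `config.json` is missing or still contains the template placeholder, the setup cell fails with a bare `FileNotFoundError` or the later requests are rejected. A minimal sketch of a more explicit loader follows; `load_api_key` is a hypothetical helper name, not part of the xDD tooling.

```python
import json
from pathlib import Path


def load_api_key(config_path: str = "./config.json") -> str:
    """Read API_KEY from a JSON config file, with explicit error messages."""
    path = Path(config_path)
    if not path.is_file():
        raise FileNotFoundError(f"Expected a config file at {path}")
    with path.open() as config:
        settings = json.load(config)
    api_key = settings.get("API_KEY", "")
    # "<your api key>" is the placeholder from the config.json template.
    if not api_key or api_key == "<your api key>":
        raise ValueError(f"Set a real API_KEY in {path}")
    return api_key
```

With this helper, the setup cell could use `API_KEY = load_api_key()` in place of the `with open(...)` block.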



#### List Documents

The `/documents` endpoint returns a paginated list of article metadata, alphabetically sorted by title.

In [None]:
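PAGE = 2
PER_PAGE = 25

# This example lists documents without sending the API key.
response = requests.get(f"{BASE_URL}/documents", params={"page": PAGE, "per_page": PER_PAGE})
response.raise_for_status()  # fail early instead of parsing an error body
sample_page = response.json()
sample_document = sample_page[0]  # reused by the examples below
print(json.dumps(sample_page, indent=2))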
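
Because the listing is paginated, walking the whole corpus means requesting pages until they run out. The sketch below assumes that the endpoint returns a JSON list and that a page shorter than `per_page` marks the end of the listing; neither is confirmed here, and the default `start_page=1` is also a guess. `fetch_documents_page` and `iter_documents` are hypothetical helper names.

```python
from typing import Callable, Iterator, List, Optional

BASE_URL = "https://xdddev.chtc.io/documentstore-api"


def fetch_documents_page(page: int, per_page: int) -> List[dict]:
    """Fetch one page of document metadata from the /documents endpoint."""
    import requests  # imported lazily so iter_documents can be driven by a stub

    response = requests.get(
        f"{BASE_URL}/documents",
        params={"page": page, "per_page": per_page},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()


def iter_documents(
    fetch_page: Callable[[int, int], List[dict]] = fetch_documents_page,
    per_page: int = 25,
    start_page: int = 1,  # assumption: the first page index is not documented here
    max_pages: Optional[int] = None,
) -> Iterator[dict]:
    """Yield document metadata page by page until a short page is returned."""
    page = start_page
    fetched = 0
    while max_pages is None or fetched < max_pages:
        documents = fetch_page(page, per_page)
        yield from documents
        fetched += 1
        # Assumption: a page with fewer than per_page items is the last one.
        if len(documents) < per_page:
            break
        page += 1
```

Passing `max_pages` keeps an exploratory run from requesting the entire corpus, for example `list(iter_documents(max_pages=2))`.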

#### Retrieve an individual document's metadata

The `/documents/{document_id}` endpoint returns the metadata for a single document.

In [None]:
sample_document_id = sample_document['id']
sample_document_metadata = requests.get(f'{BASE_URL}/documents/{sample_document_id}').json()

print(json.dumps(sample_document_metadata, indent=2))

#### Retrieve the full content of an individual document

The `/documents/{document_id}/content` endpoint returns the full PDF contents of a document. Note that the `X-Api-Key` header is required to retrieve PDF content.

In [None]:
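# Make the request first so that a failed call does not leave an empty file behind.
response = requests.get(
    f'{BASE_URL}/documents/{sample_document_id}/content',
    headers={'X-Api-Key': API_KEY}
)
response.raise_for_status()

with open('./sample.pdf', 'wb') as pdf_writer:
    pdf_writer.write(response.content)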

print(f"Document '{sample_document['title']}' downloaded at {FileLink('./sample.pdf')}")
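
The download writes the response body to disk without checking that it is a PDF. A minimal sketch of a guard follows, relying on the fact that PDF files begin with the bytes `%PDF-`; `write_pdf` is a hypothetical helper, and it assumes that any error body the API returns does not start with that marker.

```python
from pathlib import Path

PDF_MAGIC = b"%PDF-"  # every PDF file starts with this marker


def write_pdf(content: bytes, destination: str) -> Path:
    """Write content to destination only if it looks like a PDF."""
    if not content.startswith(PDF_MAGIC):
        # Show a short prefix of the body to help diagnose what was returned.
        raise ValueError(f"Response is not a PDF; body starts with {content[:40]!r}")
    path = Path(destination)
    path.write_bytes(content)
    return path
```

In the cell above this would be called as `write_pdf(response.content, './sample.pdf')` once the response has been received.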


#### Retrieve a single page from a given document

The `/documents/{document_id}/page/{page_num}` endpoint returns a single page of the specified document. 
The optional `content_type={pdf|webp|svg}` query parameter selects the format in which the page is returned.

In [None]:
PAGE_NUM = 0
# No content_type given; the response is saved as a PDF.
response = requests.get(
    f'{BASE_URL}/documents/{sample_document_id}/page/{PAGE_NUM}',
    headers={'X-Api-Key': API_KEY}
)
response.raise_for_status()
with open('./sample_page.pdf', 'wb') as pdf_writer:
    pdf_writer.write(response.content)

# The same page, requested as a WebP image.
response = requests.get(
    f'{BASE_URL}/documents/{sample_document_id}/page/{PAGE_NUM}',
    params={'content_type': 'webp'},
    headers={'X-Api-Key': API_KEY}
)
response.raise_for_status()
with open('./sample_page.webp', 'wb') as webp_writer:
    webp_writer.write(response.content)

print(f"Document page {PAGE_NUM} downloaded at {FileLink('./sample_page.pdf')}")
print(f"Also downloaded as webp at {FileLink('./sample_page.webp')}")
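
Since the page endpoint is called once per format, a small URL builder avoids repeating the path and catches a mistyped `content_type` before any request is sent. This is a sketch only: `page_url` and `SUPPORTED_CONTENT_TYPES` are hypothetical names, and the list of accepted values is taken from the description of the `content_type` parameter rather than from the API itself.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://xdddev.chtc.io/documentstore-api"
SUPPORTED_CONTENT_TYPES = ("pdf", "webp", "svg")  # as listed for content_type


def page_url(document_id: str, page_num: int, content_type: Optional[str] = None) -> str:
    """Build the URL for one page of a document, optionally in a given format."""
    url = f"{BASE_URL}/documents/{document_id}/page/{page_num}"
    if content_type is None:
        return url  # let the API choose its default format
    if content_type not in SUPPORTED_CONTENT_TYPES:
        raise ValueError(
            f"content_type must be one of {SUPPORTED_CONTENT_TYPES}, got {content_type!r}"
        )
    return f"{url}?{urlencode({'content_type': content_type})}"
```

For example, `requests.get(page_url(sample_document_id, PAGE_NUM, 'svg'), headers={'X-Api-Key': API_KEY})` would request the SVG rendering of the same page.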


#### Retrieve a highlighted snippet from a given document

The `/documents/{document_id}/page/{page_num}/snippet/{x0},{y0},{x1},{y1}` endpoint returns a single page of the specified document with the given region highlighted. 
The optional `content_type={pdf|webp|svg}` query parameter selects the format in which the snippet is returned.

In [None]:
PAGE_NUM = 0
SNIPPET = [25, 25, 725, 900]

snippet = ','.join(str(s) for s in SNIPPET)

# No content_type given; the response is saved as a PDF.
response = requests.get(
    f'{BASE_URL}/documents/{sample_document_id}/page/{PAGE_NUM}/snippet/{snippet}',
    headers={'X-Api-Key': API_KEY}
)
response.raise_for_status()
with open('./sample_snippet.pdf', 'wb') as pdf_writer:
    pdf_writer.write(response.content)

# The same snippet, requested as a WebP image.
response = requests.get(
    f'{BASE_URL}/documents/{sample_document_id}/page/{PAGE_NUM}/snippet/{snippet}',
    params={'content_type': 'webp'},
    headers={'X-Api-Key': API_KEY}
)
response.raise_for_status()
with open('./sample_snippet.webp', 'wb') as webp_writer:
    webp_writer.write(response.content)

print(f"Document snippet page={PAGE_NUM}; bounds={snippet} downloaded at {FileLink('./sample_snippet.pdf')}")
print(f"Also downloaded as webp at {FileLink('./sample_snippet.webp')}")
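
The snippet bounds are passed as four comma-separated values in the URL path, so a malformed box is easy to send by accident. The sketch below builds that path segment and checks the box first. It assumes that `x0 < x1` and `y0 < y1` are required, which the endpoint description does not state, and it makes no assumption about the coordinate units. `snippet_path` is a hypothetical helper name.

```python
from typing import Sequence


def snippet_path(document_id: str, page_num: int, bounds: Sequence[float]) -> str:
    """Build the /documents/.../snippet/{x0},{y0},{x1},{y1} path for a region."""
    if len(bounds) != 4:
        raise ValueError(f"bounds must hold x0, y0, x1, y1; got {len(bounds)} values")
    x0, y0, x1, y1 = bounds
    # Assumption: the box runs from its lower coordinates to its higher ones.
    if x0 >= x1 or y0 >= y1:
        raise ValueError(f"bounds must satisfy x0 < x1 and y0 < y1; got {list(bounds)}")
    box = ",".join(str(value) for value in bounds)
    return f"/documents/{document_id}/page/{page_num}/snippet/{box}"
```

The result is appended to the base URL, for example `f"{BASE_URL}{snippet_path(sample_document_id, PAGE_NUM, SNIPPET)}"`.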
