---
title: Storage
description: Documentation for how to call the Storage APIs using the Aryn SDK
---

import DocSetMetadata from '/snippets/docset_metadata.mdx';
import DocumentMetadata from '/snippets/document_metadata.mdx';
import Document from '/snippets/document.mdx';

This page documents how to call the Storage APIs using the Aryn SDK. All parameters are optional unless specified otherwise.

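## DocSet Functions

Functions for managing document sets (DocSets), which are collections of documents.

### Create DocSet
Create a new DocSet to store documents.
<AccordionGroup>
<Accordion title="Parameters">
<ParamField body="name" type="string" required>
  String name for the DocSet
</ParamField>

- Optional dictionary of additional properties
- Optional Schema object defining document properties
- Optional dictionary of prompts for the DocSet
</Accordion>

<Accordion title="Example">
```python
from aryn_sdk.client.client import Client

# Assumes the client can find your Aryn API key through its default configuration.
client = Client()

docset = client.create_docset(name="My DocSet")
docset_id = docset.docset_id
```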

</Accordion>

<Accordion title="Return Value">
A DocSetMetadata object containing:
<DocSetMetadata />
</Accordion>
</AccordionGroup>

### Get DocSet
Retrieve metadata for a DocSet.
<AccordionGroup>
<Accordion title="Parameters">
<ParamField path="docset_id" type="string" required>
  The unique identifier of the DocSet to retrieve
</ParamField>
</Accordion>

<Accordion title="Example">
```python
docset = client.get_docset(docset_id="your-docset-id")
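```
</Accordion>

<Accordion title="Return Value">
A DocSetMetadata object containing:
<DocSetMetadata />
</Accordion>

<Accordion title="Exceptions">
- `HTTPError 403`: "No Aryn API Key provided"
- `HTTPError 403`: "Invalid Aryn API key"
- `HTTPError 403`: "Expired Aryn API key"
- `HTTPError 404`: "DocSet not found"
- `HTTPError 5xx`: Internal Server Error
</Accordion>
</AccordionGroup>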
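
The exceptions documented for Get DocSet fall into three groups: authentication failures (403), missing resources (404), and server-side failures (5xx). The sketch below shows one way a caller might turn a status code into a decision about what to do next. It does not use the Aryn SDK; `ErrorAdvice` and `classify_status` are hypothetical names, and treating 5xx responses as retryable is a common convention assumed here, not something this page states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorAdvice:
    """Hypothetical helper type describing how to react to a failed call."""
    category: str
    retryable: bool


def classify_status(status_code: int) -> ErrorAdvice:
    """Map an HTTP status code to a suggested reaction (illustrative only)."""
    if status_code == 403:
        # Missing, invalid, or expired API key: retrying will not help.
        return ErrorAdvice("authentication", retryable=False)
    if status_code == 404:
        # The DocSet or document was not found; check the ID.
        return ErrorAdvice("not_found", retryable=False)
    if 500 <= status_code <= 599:
        # Assumption: server-side errors may be transient, so a retry is reasonable.
        return ErrorAdvice("server_error", retryable=True)
    if 400 <= status_code <= 499:
        # Any other client error, such as invalid request parameters.
        return ErrorAdvice("client_error", retryable=False)
    return ErrorAdvice("unknown", retryable=False)


advice = classify_status(404)
print(f"{advice.category} (retryable: {advice.retryable})")
```

How you obtain the status code depends on the exception type the SDK raises, which this page does not specify beyond the `HTTPError` label.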

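### List DocSets
List all DocSets in the account.
<AccordionGroup>
<Accordion title="Parameters">
- Number of items per page
- Token for pagination
- Filter DocSets by exact name match
</Accordion>

<Accordion title="Example">
```python
docsets = client.list_docsets().get_all()
for docset in docsets:
    print(f"DocSet: {docset.name}")
```
</Accordion>

<Accordion title="Return Value">
A paginated list of DocSetMetadata objects, each containing:
<DocSetMetadata />
</Accordion>

<Accordion title="Exceptions">
- `HTTPError 403`: "No Aryn API Key provided"
- `HTTPError 403`: "Invalid Aryn API key"
- `HTTPError 403`: "Expired Aryn API key"
- `HTTPError 5xx`: Internal Server Error
</Accordion>
</AccordionGroup>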

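### Delete DocSet
Delete a DocSet and all its documents.
<AccordionGroup>
<Accordion title="Parameters">
<ParamField path="docset_id" type="string" required>
  The unique identifier of the DocSet to delete
</ParamField>
</Accordion>

<Accordion title="Example">
```python
client.delete_docset(docset_id="your-docset-id")
```
</Accordion>

<Accordion title="Return Value">
The metadata of the deleted DocSet
</Accordion>

<Accordion title="Exceptions">
- `HTTPError 403`: "No Aryn API Key provided"
- `HTTPError 403`: "Invalid Aryn API key"
- `HTTPError 403`: "Expired Aryn API key"
- `HTTPError 404`: "DocSet not found"
- `HTTPError 5xx`: Internal Server Error
</Accordion>
</AccordionGroup>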

## Document Functions

Functions for managing individual documents within DocSets.

### List Documents
List all documents in a DocSet.
<AccordionGroup>
<Accordion title="Parameters">
<ParamField path="docset_id" type="string" required>
  ID of the DocSet containing the documents
</ParamField>

- Number of items per page
- Token for pagination
</Accordion>

<Accordion title="Example">
```python
docs = client.list_docs(docset_id="your-docset-id")
for doc in docs:
    print(f"Document: {doc.name}")
```
</Accordion>

<Accordion title="Return Value">
A paginated list of DocumentMetadata objects, each containing:
<DocumentMetadata />
</Accordion>

<Accordion title="Exceptions">
- `HTTPError 403`: "No Aryn API Key provided"
- `HTTPError 403`: "Invalid Aryn API key"
- `HTTPError 403`: "Expired Aryn API key"
- `HTTPError 404`: "DocSet not found"
- `HTTPError 400`: "Invalid filter parameters"
- `HTTPError 5xx`: Internal Server Error
</Accordion>
</AccordionGroup>

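### Get Document
Get a document by ID.
<AccordionGroup>
<Accordion title="Parameters">
<ParamField path="docset_id" type="string" required>
  The unique identifier of the DocSet containing the document
</ParamField>

<ParamField path="doc_id" type="string" required>
  The unique identifier of the document to retrieve
</ParamField>

- Boolean to include document elements
- Boolean to include binary data
</Accordion>

<Accordion title="Example">
```python
doc = client.get_doc(docset_id="your-docset-id", doc_id="your-doc-id")
```
</Accordion>

<Accordion title="Return Value">
A Document object containing:
<Document />
</Accordion>

<Accordion title="Exceptions">
- `HTTPError 403`: "No Aryn API Key provided"
- `HTTPError 403`: "Invalid Aryn API key"
- `HTTPError 403`: "Expired Aryn API key"
- `HTTPError 404`: "Document not found"
- `HTTPError 5xx`: Internal Server Error
</Accordion>
</AccordionGroup>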

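### Delete Document
Delete a document by ID.
<AccordionGroup>
<Accordion title="Parameters">
<ParamField path="docset_id" type="string" required>
  The unique identifier of the DocSet containing the document
</ParamField>

<ParamField path="doc_id" type="string" required>
  The unique identifier of the document to delete
</ParamField>
</Accordion>

<Accordion title="Example">
```python
client.delete_doc(docset_id="your-docset-id", doc_id="your-doc-id")
```
</Accordion>

<Accordion title="Return Value">
The metadata of the deleted document
</Accordion>

<Accordion title="Exceptions">
- `HTTPError 403`: "No Aryn API Key provided"
- `HTTPError 403`: "Invalid Aryn API key"
- `HTTPError 403`: "Expired Aryn API key"
- `HTTPError 404`: "Document not found"
- `HTTPError 5xx`: Internal Server Error
</Accordion>
</AccordionGroup>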

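### Get Document Binary
Get the binary content of a document.
<AccordionGroup>
<Accordion title="Parameters">
<ParamField path="docset_id" type="string" required>
  The unique identifier of the DocSet containing the document
</ParamField>

<ParamField path="doc_id" type="string" required>
  The unique identifier of the document to retrieve
</ParamField>

<ParamField body="file" required>
  The file object to write the binary content to
</ParamField>
</Accordion>

<Accordion title="Example">
```python
output = "output.pdf"
client.get_doc_binary(docset_id="your-docset-id", doc_id="your-doc-id", file=output)
```
</Accordion>

<Accordion title="Return Value">
The binary content of the document
</Accordion>

<Accordion title="Exceptions">
- `HTTPError 403`: "No Aryn API Key provided"
- `HTTPError 403`: "Invalid Aryn API key"
- `HTTPError 403`: "Expired Aryn API key"
- `HTTPError 404`: "Document not found"
- `HTTPError 5xx`: Internal Server Error
</Accordion>
</AccordionGroup>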

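## Properties Functions

Functions for managing document properties.

### Update Document Properties
Update properties of a document.
<AccordionGroup>
<Accordion title="Parameters">
<ParamField path="docset_id" type="string" required>
  The unique identifier of the DocSet containing the document
</ParamField>

<ParamField path="doc_id" type="string" required>
  The unique identifier of the document to update
</ParamField>

<ParamField body="operations" type="array" required>
  List of ReplaceOperation objects defining property updates
</ParamField>
</Accordion>

<Accordion title="Example">
```python
from aryn_sdk.types import ReplaceOperation

updates = [
    ReplaceOperation(
        path="/properties/status",
        value="reviewed"
    )
]
client.update_doc_properties(
    docset_id="your-docset-id",
    doc_id="your-doc-id",
    operations=updates
)
```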

</Accordion>

<Accordion title="Return Value">
The updated Document object containing:
<Document />
</Accordion>

<Accordion title="Exceptions">
- `HTTPError 403`: "No Aryn API Key provided"
- `HTTPError 403`: "Invalid Aryn API key"
- `HTTPError 403`: "Expired Aryn API key"
- `HTTPError 404`: "Document not found"
- `HTTPError 5xx`: Internal Server Error
</Accordion>
</AccordionGroup>
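
The `path` in the Update Document Properties example, `/properties/status`, looks like a slash-separated pointer into the document. As a minimal sketch of what a replace at such a path means, the helper below applies one to a plain dictionary. It is purely illustrative: `apply_replace` is a hypothetical function, the document shape is assumed, and whether the service requires the property to exist beforehand is not covered by this page.

```python
from copy import deepcopy
from typing import Any, Dict


def apply_replace(document: Dict[str, Any], path: str, value: Any) -> Dict[str, Any]:
    """Return a copy of `document` with the value at `path` set to `value`."""
    if not path.startswith("/"):
        raise ValueError("path must start with '/'")
    # "/properties/status" -> ["properties", "status"]
    keys = path[1:].split("/")
    updated = deepcopy(document)  # leave the caller's dictionary untouched
    target = updated
    for key in keys[:-1]:
        target = target[key]  # raises KeyError if an intermediate key is missing
    # This sketch does not check whether the final key already exists.
    target[keys[-1]] = value
    return updated


# Assumed document shape, for illustration only.
doc = {"properties": {"status": "draft", "title": "Q3 report"}}
updated = apply_replace(doc, "/properties/status", "reviewed")
print(updated["properties"])
```

Only the addressed value changes; sibling properties such as `title` are left as they were.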

### Extract Properties
Extract properties from the documents in a DocSet.
<AccordionGroup>
<Accordion title="Parameters">
<ParamField path="docset_id" type="string" required>
  The unique identifier of the DocSet containing the documents
</ParamField>

<ParamField body="schema" type="object" required>
  Schema object defining properties to extract
</ParamField>
</Accordion>

<Accordion title="Example">
```python
from aryn_sdk.types.schema import Schema, SchemaField

schema = Schema(fields=[
    SchemaField(name="category", field_type="string")
])
client.extract_properties(docset_id="your-docset-id", schema=schema)
```
</Accordion>

<Accordion title="Return Value">
A job status object containing:
- `exit_status`: The exit status of the job
</Accordion>

<Accordion title="Exceptions">
- `HTTPError 403`: "No Aryn API Key provided"
- `HTTPError 403`: "Invalid Aryn API key"
- `HTTPError 403`: "Expired Aryn API key"
- `HTTPError 404`: "DocSet not found"
- `HTTPError 5xx`: Internal Server Error
</Accordion>
</AccordionGroup>

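### Delete Properties
Delete properties from the documents in a DocSet.
<AccordionGroup>
<Accordion title="Parameters">
<ParamField path="docset_id" type="string" required>
  The unique identifier of the DocSet containing the documents
</ParamField>

<ParamField body="schema" type="object" required>
  Schema object defining properties to delete
</ParamField>
</Accordion>

<Accordion title="Example">
```python
# `schema` is a Schema object listing the properties to delete,
# such as the one built in the Extract Properties example.
client.delete_properties(docset_id="your-docset-id", schema=schema)
```
</Accordion>

<Accordion title="Return Value">
A job status object
</Accordion>

<Accordion title="Exceptions">
- `HTTPError 403`: "No Aryn API Key provided"
- `HTTPError 403`: "Invalid Aryn API key"
- `HTTPError 403`: "Expired Aryn API key"
- `HTTPError 404`: "DocSet not found"
- `HTTPError 5xx`: Internal Server Error
</Accordion>
</AccordionGroup>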