From d9e5f5da9ac4fcf45a8469cdc46139b48921e8f5 Mon Sep 17 00:00:00 2001
From: Paul Cornell <paul@unstructured.io>
Date: Tue, 15 Apr 2025 16:54:40 -0700
Subject: [PATCH] UI/API document elements: doc updates

---
 ui/document-elements.mdx | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/ui/document-elements.mdx b/ui/document-elements.mdx
index 48b122d6..938252d8 100644
--- a/ui/document-elements.mdx
+++ b/ui/document-elements.mdx
@@ -27,7 +27,8 @@ Here's an example of what an element might look like:
 
 Every element has a [type](#element-type); an [element_id](#element-id); the extracted `text`; and some [metadata](#metadata) which might 
 vary depending on the element type, file structure, and some additional settings that are applied during 
-[partitioning](/ui/partitioning), chunking, summarizing, and embedding.
+[partitioning](/ui/partitioning), [chunking](/ui/chunking), and [enriching](/ui/enriching/overview). Optionally, the element can also have 
+[embeddings](/ui/embedding) derived from the `text`; the length of `embeddings` depends on the embedding model that is used.
 
 ## Element type
 
@@ -43,18 +44,21 @@ Here are some examples of the element types your file might contain:
 | Element type        | Description                                                                                                                                          |
 |---------------------|------------------------------------------------------------------------------------------------------------------------------------------------------|
 | `Address`           | A text element for capturing physical addresses.                                                                                                     |
+| `CodeSnippet`       | A text element for capturing code snippets.                                                                                                          |
 | `EmailAddress`      | A text element for capturing email addresses.                                                                                                        |
 | `FigureCaption`     | An element for capturing text associated with figure captions.                                                                                       |
 | `Footer`            | An element for capturing document footers.                                                                                                           |
+| `FormKeysValues`    | An element for capturing key-value pairs in a form.                                                                                                  |
 | `Formula`           | An element containing formulas in a file.                                                                                                            |
 | `Header`            | An element for capturing document headers.                                                                                                           |
 | `Image`             | A text element for capturing image metadata.                                                                                                         |
 | `ListItem`          | `ListItem` is a `NarrativeText` element that is part of a list.                                                                                      |
 | `NarrativeText`     | `NarrativeText` is an element consisting of multiple, well-formulated sentences. This excludes elements such titles, headers, footers, and captions. |
 | `PageBreak`         | An element for capturing page breaks.                                                                                                                |
+| `PageNumber`        | An element for capturing page numbers.                                                                                                               |
 | `Table`             | An element for capturing tables.                                                                                                                     |
 | `Title`             | A text element for capturing titles.                                                                                                                 |
-| `UncategorizedText` | Base element for capturing free text from within files.                                                                                              |
+| `UncategorizedText` | Base element for capturing free text from within files. Applies to extracted text not associated with bounding boxes if the input is a PDF file.     |
 
 If you apply chunking, you will also see the `CompositeElement` type. 
 `CompositeElement` is a chunk formed from text (non-`Table`) elements. 
@@ -172,6 +176,7 @@ Documents can include additional file metadata, based on the specified source co
 - `date_created`
 - `date_modified`
 - `date_processed`
+- `permissions_data`
 - `record_locator`
 - `url`
 - `version`