diff --git a/sources/platform/actors/development/actor_definition/output_schema.md b/sources/platform/actors/development/actor_definition/output_schema.md
index b0b1452907..0020d8eec2 100644
--- a/sources/platform/actors/development/actor_definition/output_schema.md
+++ b/sources/platform/actors/development/actor_definition/output_schema.md
@@ -1,87 +1,23 @@
---
title: Output schema
sidebar_position: 3
-description: Output schema is designed to help Actor developers present the results to users in an attractive and comprehensive output UI.
+description: Learn how to define and present your output schema in a user-friendly output UI.
slug: /actors/development/actor-definition/output-schema
---
-**Output schema is designed to help Actor developers present the results to users in an attractive and comprehensive output UI.**
+# Output Schema Specification
----
-
- It is recommended to show the most important fields in a curated Overview visualization configured using output schema specification, while all available fields are automatically available in the "All fields" view.
-
-In the future, output schema will also help with strict output data format validation, which will make integrations more solid and easier to set up.
-
-## Specification version 1
-
-An Actor's output schema defines the structure and both API and visual representation of data produced by an Actor. Output configuration files have to be located in the `.actor` folder in the Actor's root directory.
-
-## How to organize files in the .actor folder: Two options
-
-**A)** all config options are being set in a **.actor/actor.json** file, e.g.:
-
-```json
-// file: .actor/actor.json
-{
- "actorSpecification": 1,
- "name": "this-is-book-library-scraper",
- "title": "Book Library scraper",
- "version": "1.0.0",
- "storages": {
- "dataset": {
- "actorSpecification": 1,
- "fields": {},
- "views": {
- "overview": {
- "title": "Overview",
- "transformation": {},
- "display": {}
- }
- }
- }
- }
-}
-```
-
-**B)** **.actor/actor.json** links to other sub-config files in the same folder, e.g.:
+**Learn how to define and present your output schema in a user-friendly output UI.**
-```json
-// file: .actor/actor.json
-{
- "actorSpecification": 1,
- "name": "this-is-book-library-scraper",
- "title": "Book Library scraper",
- "version": "1.0.0",
- "storages": {
- "dataset": "./dataset_schema.json"
- }
-}
-```
-
-```json
-// file: .actor/dataset_schema.json
-{
- "actorSpecification": 1,
- "fields": {},
- "views": {
- "overview": {
- "title": "Overview",
- "transformation": {},
- "display": {}
- }
- }
-}
-```
+---
-Both options are valid. The user can choose based on their own needs.
+The output schema defines the structure and representation of data produced by an Actor, both in the API and the visual user interface.
-## Basic Template
+## Example
-Imagine there is an Actor that calls `Actor.pushData()` to store data into dataset e.g.
+Let's consider an example Actor that calls `Actor.pushData()` to store data in the dataset:
-```javascript
-// file: main.js
+```javascript title="main.js"
import { Actor } from 'apify';
// Initialize the JavaScript SDK
await Actor.init();
@@ -90,28 +26,28 @@ await Actor.init();
* Actor code
*/
await Actor.pushData({
- ___EXAMPLE_NUMERIC_FIELD___: 10,
- ___EXAMPLE_PICTURE_URL_FIELD___: 'https://www.google.com/images/branding/googlelogo/2x/googlelogo_color_92x30dp.png',
- ___EXAMPLE_LINK_URL_FIELD___: 'https://google.com',
- ___EXAMPLE_TEXT_FIELD___: 'Google',
- ___EXAMPLE_BOOLEAN_FIELD___: true,
- ___EXAMPLE_DATE_FIELD___: new Date(),
- ___EXAMPLE_ARRAY_FIELD___: ['#hello', '#world'],
- ___EXAMPLE_OBJECT_FIELD___: {},
+ numericField: 10,
+ pictureUrl: 'https://www.google.com/images/branding/googlelogo/2x/googlelogo_color_92x30dp.png',
+ linkUrl: 'https://google.com',
+ textField: 'Google',
+ booleanField: true,
+ dateField: new Date(),
+ arrayField: ['#hello', '#world'],
+ objectField: {},
});
+
// Exit successfully
await Actor.exit();
```
-Let's say we are going to use a single file to set up an Actor's output tab UI. The following template can be used as a `.actor/actor.json` configuration.
+To set up the Actor's output tab UI using a single configuration file, use the following template for the `.actor/actor.json` configuration:
-```json
-// file: .actor/actor.json
+```json title=".actor/actor.json"
{
"actorSpecification": 1,
- "name": "___ENTER_ACTOR_NAME____",
- "title": "___ENTER_ACTOR_TITLE____",
+ "name": "Actor Name",
+ "title": "Actor Title",
"version": "1.0.0",
"storages": {
"dataset": {
@@ -121,48 +57,48 @@ Let's say we are going to use a single file to set up an Actor's output tab UI.
"title": "Overview",
"transformation": {
"fields": [
- "___EXAMPLE_PICTURE_URL_FIELD___",
- "___EXAMPLE_LINK_URL_FIELD___",
- "___EXAMPLE_TEXT_FIELD___",
- "___EXAMPLE_BOOLEAN_FIELD___",
- "___EXAMPLE_ARRAY_FIELD___",
- "___EXAMPLE_OBJECT_FIELD___",
- "___EXAMPLE_DATE_FIELD___",
- "___EXAMPLE_NUMERIC_FIELD___"
+ "pictureUrl",
+ "linkUrl",
+ "textField",
+ "booleanField",
+ "arrayField",
+ "objectField",
+ "dateField",
+ "numericField"
]
},
"display": {
"component": "table",
"properties": {
- "___EXAMPLE_PICTURE_URL_FIELD___": {
+ "pictureUrl": {
"label": "Image",
"format": "image"
},
- "___EXAMPLE_LINK_URL_FIELD___": {
+ "linkUrl": {
"label": "Link",
"format": "link"
},
- "___EXAMPLE_TEXT_FIELD___": {
+ "textField": {
"label": "Text",
"format": "text"
},
- "___EXAMPLE_BOOLEAN_FIELD___": {
+ "booleanField": {
"label": "Boolean",
"format": "boolean"
},
- "___EXAMPLE_ARRAY_FIELD___": {
+ "arrayField": {
"label": "Array",
"format": "array"
},
- "___EXAMPLE_OBJECT_FIELD___": {
+ "objectField": {
"label": "Object",
"format": "object"
},
- "___EXAMPLE_DATE_FIELD___": {
+ "dateField": {
"label": "Date",
"format": "date"
},
- "___EXAMPLE_NUMERIC_FIELD___": {
+ "numericField": {
"label": "Number",
"format": "number"
}
@@ -175,94 +111,38 @@ Let's say we are going to use a single file to set up an Actor's output tab UI.
}
```
-The template above defines the configuration for the default dataset output view. Under the **views** property, there is one view with the title **Overview**. The view configuration consists of two basic steps: 1) set up how to fetch the data, aka **transformation,** and 2) set up how to **display** the data fetched in step 1). The default behaviour is that the Output tab UI table will display **all the fields** from `transformation.fields` **in that same order**. Theoretically, there should be no need to set up `display.properties` at all. However, it can be customized in case it is visually worth setting up some specific display format or column labels. The customization is carried out by using one of the `transformation.fields` names inside `display.properties` and overriding either the label or the format, as demonstrated in the basic template above.
+The template above defines the configuration for the default dataset output view. Under the `views` property, there is one view titled _Overview_. The view configuration consists of two main steps:
+
+1. `transformation` - set up how to fetch the data.
+2. `display` - set up how to visually present the fetched data.
-A 2-step configuration (transform & display) was implemented to provide a way to fetch data in the format presented in both API and UI consistently. Consistency between API data and UI data is crucial for Actor end-users for them to experience the same results in both API and UI. Thus for the best end-user experience, we recommend overriding as few display properties as possible.
+The default behavior of the Output tab UI table is to display all fields from `transformation.fields` in the specified order. You can customize the display properties for specific formats or column labels if needed.
-Example of an Actor output UI generated using basic template:

-## Example with inline comments
+## Structure
+
+Output configuration files need to be located in the `.actor` folder within the Actor's root directory.
+
+You can organize files within the `.actor` folder in two ways.
+
+### Single configuration file
-```json5
-// file: .actor/actor.json
+```json title=".actor/actor.json"
{
- "actorSpecification": 1, // mandatory
- "name": "this-is-book-library-scraper", // mandatory, unique name of an Actor
- "title": "Book Library scraper", // mandatory, the human readable name of an Actor
- "version": "1.0.0", // mandatory
- "storages": { // mandatory
- "dataset": { // mandatory
- "actorSpecification": 1, // mandatory
- "fields": {}, // mandatory, but it can be an empty object for now
- "views": { // mandatory
- "overview": { // mandatory, but it does not have to be "overview", one can choose any name, multiple views are possible within views object
- "title": "Overview", // mandatory, one can choose any other title
- "transformation": { // mandatory
- "fields": [ // mandatory, fields property supports basic JSONPath selectors
- "isbn", // important, an order of fields in this array matches the order of columns in visualisation UI
- "picture",
- "title",
- "buyOnlineUrl",
- "author",
- "longBookDescription",
- "anObjectWithDeepStructure.pageCount",
- "anObjectWithDeepStructure.buyOnlineUrl",
- "anObjectWithDeepStructure.isRead",
- "anObjectWithDeepStructure.lastReadTime",
- "anArray",
- "anObject"
- ],
- "flatten": [ // optional, flattened objects are easily available for as display.properties keys
- "anObjectWithDeepStructure"
- ]
- },
- "display": { // mandatory
- "component": "table", // mandatory
- "properties": { // mandatory
- "isbn": { // optional, use transformation.fields values there as keys
- "label": "ISBN", // optional, define "label" only in case you would like to overide the basic field name capitalisation in table UI
- // "format": "text" // optional, "text" format is default, use only in case you would like to overide the default format settings
- },
- "picture": {
- "label": "Cover",
- "format": "image" // optional, in this case the format is overriden to show "image" instead of image link "text". "image" format only works with .jpeg, .png or other image format urls.
- },
- // "title": { // does not have to be specified, default behaviour will show the field correctly
- // "label": "Title",
- // "format": "text"
- // },
- "buyOnlineUrl": {
- "label": "URL",
- "format": "link"
- },
- // "author": {
- // "label": "Author",
- // "format": "text"
- // },
- "longBookDescription": {
- "label": "Description"
- },
- "anObjectWithDeepStructure.pageCount": { // use "." for sub-keys of flattened objects
- "label": "# pages",
- "format": "number"
- },
- "anObjectWithDeepStructure.isRead": {
- "label": "Have been read?",
- "format": "boolean"
- },
- "anObjectWithDeepStructure.lastReadTime": {
- "label": "Last read time",
- "format": "date"
- },
- "anObjectExample": {
- "label": "Some Object"
- },
- "anArrayExample": {
- "label": "Some Array"
- }
- }
- }
+ "actorSpecification": 1,
+ "name": "this-is-book-library-scraper",
+ "title": "Book Library scraper",
+ "version": "1.0.0",
+ "storages": {
+ "dataset": {
+ "actorSpecification": 1,
+ "fields": {},
+ "views": {
+ "overview": {
+ "title": "Overview",
+ "transformation": {},
+ "display": {}
}
}
}
@@ -270,56 +150,88 @@ Example of an Actor output UI generated using basic template:
}
```
-### Nested structures
+### Separate configuration files
+
+```json title=".actor/actor.json"
+{
+ "actorSpecification": 1,
+ "name": "this-is-book-library-scraper",
+ "title": "Book Library scraper",
+ "version": "1.0.0",
+ "storages": {
+ "dataset": "./dataset_schema.json"
+ }
+}
+```
+
+```json title=".actor/dataset_schema.json"
+{
+ "actorSpecification": 1,
+ "fields": {},
+ "views": {
+ "overview": {
+ "title": "Overview",
+ "transformation": {},
+ "display": {}
+ }
+ }
+}
+```
+
+Both methods are valid, so choose the one that best suits your needs.
+
+## Handle nested structures
-The most frequently used data formats present the data in a tabular format (Output tab table, Excel, CSV). In case an Actor produces nested JSON structures, there is a need to transform the nested data into a flat tabular format. You can flatten the data in following ways:
+The most frequently used data formats present the data in a tabular format (Output tab table, Excel, CSV). If your Actor produces nested JSON structures, you need to transform the nested data into a flat tabular format. You can flatten the data in the following ways:
-**1)** use `transformation.flatten` to flatten the nested structure of specified fields. Flatten transforms the nested object into a flat structure. e.g. with `flatten:["foo"]`, the object `{"foo": {"bar": "hello"}}` is turned into `{"foo.bar": "hello"}`. Once the structure is flattened, it is necessary to use the flattened property name in both `transformation.fields` and `display.properties`, otherwise, fields might not be fetched or configured properly in the UI visualization.
+- Use `transformation.flatten` to flatten the nested structure of specified fields. This transforms the nested object into a flat structure. For example, with `flatten:["foo"]`, the object `{"foo": {"bar": "hello"}}` is turned into `{"foo.bar": "hello"}`. Once the structure is flattened, you need to use the flattened property name in both `transformation.fields` and `display.properties`. Otherwise, fields might not be fetched or configured properly in the UI visualization.
-**2)** use `transformation.unwind` to deconstruct the nested children into parent objects.
+- Use `transformation.unwind` to deconstruct the nested children into parent objects.
-**3)** change the output structure in an Actor from nested to flat before the results are saved in the dataset.
+- Change the output structure in an Actor from nested to flat before the results are saved in the dataset.
## Dataset schema structure definitions
+The dataset schema structure defines the various components and properties that govern the organization and representation of the output data produced by an Actor. It specifies the structure of the data, the transformations to be applied, and the visual display configurations for the Output tab UI.
+
### DatasetSchema object definition
-| Property | Type | Required | Description |
-| ------------------ | ---------------------------- | -------- | -------------------------------------------------------------------------------------------------- |
-| actorSpecification | integer | true | Specifies the version of dataset schema structure document. Currently only version 1 is available. |
-| fields | JSONSchema compatible object | true | Schema of one dataset object. Use JsonSchema Draft 2020–12 or other compatible formats. |
-| views | DatasetView object | true | An object with a description of an API and UI views. |
+| Property | Type | Required | Description |
+| --- | --- | --- | --- |
+| `actorSpecification` | integer | true | Specifies the version of the dataset schema structure document. Currently, only version 1 is available. |
+| `fields` | JSON Schema compatible object | true | Schema of one dataset object. Use JSON Schema Draft 2020-12 or other compatible formats. |
+| `views` | DatasetView object | true | An object describing the API and UI views. |
### DatasetView object definition
-| Property | Type | Required | Description |
-| -------------- | ------------------------- | -------- | ----------------------------------------------------------------------------------------------------- |
-| title | string | true | The title is visible in UI in the Output tab as well as in the API. |
-| description | string | false | The description is only available in the API response. The usage of this field is optional. |
-| transformation | ViewTransformation object | true | The definition of data transformation is applied when dataset data are loaded from Dataset API. |
-| display | ViewDisplay object | true | The definition of Output tab UI visualization. |
+| Property | Type | Required | Description |
+| --- | --- | --- | --- |
+| `title` | string | true | The title is visible in the UI in the Output tab and in the API. |
+| `description` | string | false | The description is only available in the API response. |
+| `transformation` | ViewTransformation object | true | The definition of the data transformation applied when dataset data is loaded from the Dataset API. |
+| `display` | ViewDisplay object | true | The definition of Output tab UI visualization. |
### ViewTransformation object definition
-| Property | Type | Required | Description |
-| -------- | -------- | -------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| fields | string[] | true | Selects fields that are going to be presented in the output. The order of fields matches the order of columns in visualization UI. In case the fields value is missing, it will be presented as "undefined" in the UI. |
-| unwind | string | false | Deconstructs nested children into parent object, e.g.: with `unwind:["foo"]`, the object `{"foo": {"bar": "hello"}}` is turned into `{"bar": "hello"}`. |
-| flatten | string[] | false | Transforms nested object into flat structure. eg: with `flatten:["foo"]` the object `{"foo":{"bar": "hello"}}` is turned into `{"foo.bar": "hello"}`. |
-| omit | string | false | Removes the specified fields from the output. Nested fields names can be used there as well. |
-| limit | integer | false | The maximum number of results returned. Default is all results. |
-| desc | boolean | false | By default, results are sorted in ascending based on the write event into the dataset. desc:true param will return the newest writes to the dataset first. |
+| Property | Type | Required | Description |
+| --- | --- | --- | --- |
+| `fields` | string[] | true | Selects fields to be presented in the output. The order of fields matches the order of columns in the visualization UI. If a field value is missing, it will be presented as **undefined** in the UI. |
+| `unwind` | string | false | Deconstructs nested children into the parent object. For example, with `unwind:["foo"]`, the object `{"foo": {"bar": "hello"}}` is transformed into `{"bar": "hello"}`. |
+| `flatten` | string[] | false | Transforms a nested object into a flat structure. For example, with `flatten:["foo"]`, the object `{"foo":{"bar": "hello"}}` is transformed into `{"foo.bar": "hello"}`. |
+| `omit` | string | false | Removes the specified fields from the output. Nested field names can be used as well. |
+| `limit` | integer | false | The maximum number of results returned. Default is all results. |
+| `desc` | boolean | false | By default, results are sorted in ascending order based on the write event into the dataset. If `desc:true`, the newest writes to the dataset will be returned first. |
### ViewDisplay object definition
-| Property | Type | Required | Description |
-| ---------- | ------------------------------------------------------------------------------------------------------------------ | -------- | ---------------------------------------------------------------------------------------------------------------------------- |
-| component | string | true | Only component "table" is available. |
-| properties | Object | false | Object with keys matching the `transformation.fields` and ViewDisplayProperty as values. In case properties are not set the table will be rendered automatically with fields formatted as Strings, Arrays or Objects. |
+| Property | Type | Required | Description |
+| --- | --- | --- | --- |
+| `component` | string | true | Only the `table` component is available. |
+| `properties` | Object | false | An object with keys matching the `transformation.fields` and `ViewDisplayProperty` as values. If properties are not set, the table will be rendered automatically with fields formatted as `strings`, `arrays` or `objects`. |
### ViewDisplayProperty object definition
-| Property | Type | Required | Description |
-| -------- | ------------------------------------------------------- | -------- | ---------------------------------------------------------------------------------------------- |
-| label | string | false | In case the data are visualized as in Table view. The label will be visible table column's header. |
-| format | enum(text, number, date, link, boolean, image, array, object) | false | Describes how output data values are formatted in order to be rendered in the output tab UI. |
+| Property | Type | Required | Description |
+| --- | --- | --- | --- |
+| `label` | string | false | In the Table view, the label will be visible as the table column's header. |
+| `format` | One of