Merged
30 commits
6784670
feat: revamp airdrop docs
patricijabrecko Mar 24, 2025
ad88f55
Add missing pages
patricijabrecko Mar 24, 2025
f12d500
Update getting started
patricijabrecko Mar 24, 2025
0e73c47
Update overview
patricijabrecko Mar 24, 2025
1b6813e
Fix diagrams in overview
patricijabrecko Mar 24, 2025
3fc1c3f
Add end to overview diagram
patricijabrecko Mar 24, 2025
39d760e
Move some content from overview to how does it work
patricijabrecko Mar 24, 2025
439746c
Update connecting to the external system page
patricijabrecko Mar 24, 2025
7c49abd
Update Getting started
patricijabrecko Mar 25, 2025
1c23f3e
Updates
patricijabrecko Mar 25, 2025
626be24
Update how does it work section
patricijabrecko Mar 25, 2025
06cb02f
musings on state
GasperSenk Mar 25, 2025
9aa52c7
Update publish to marketplace
patricijabrecko Mar 25, 2025
b6e64a8
Add first steps of development
patricijabrecko Mar 26, 2025
dc4624f
Add first steps of development to navigation
patricijabrecko Mar 26, 2025
9b545f3
Move some things around
patricijabrecko Mar 26, 2025
2d8083a
Snap-in configuration
patricijabrecko Mar 26, 2025
fe2cad9
Split data extraction into sections
patricijabrecko Mar 27, 2025
f7648c7
Some text fixes
patricijabrecko Mar 28, 2025
316a82e
Add metadata and data extraction
patricijabrecko Mar 31, 2025
e8d766a
Shuffle things around
patricijabrecko Mar 31, 2025
2068b7d
Merge branch 'main' into extend-adaas-docs
patricijabrecko Apr 2, 2025
ef2f14c
Add information from SDK readme
patricijabrecko Apr 2, 2025
2e7f6a0
Fix metadata-extraction file formatting
patricijabrecko Apr 2, 2025
2720168
Merge branch 'main' into extend-adaas-docs
patricijabrecko Apr 3, 2025
95f4a2c
Minor text changes
patricijabrecko Apr 3, 2025
568b2a3
Fix public.yml
patricijabrecko Apr 3, 2025
1e6cdff
Fix public.yml again
patricijabrecko Apr 3, 2025
7421130
Merge branch 'main' into extend-adaas-docs
radovanjorgic Apr 4, 2025
abb979e
Some fixes and add supported object types
patricijabrecko Apr 4, 2025
9 changes: 6 additions & 3 deletions fern/docs/pages/airdrop/attachments-extraction.mdx
@@ -1,5 +1,4 @@
For the attachment extraction phase of the import process, the extractor has to upload each
attachment to DevRev's S3 using the `S3Interact` API.
In the attachment extraction phase, the snap-in has to upload each attachment to DevRev and associate it with its parent data object.

## Triggering event

@@ -29,7 +28,10 @@ with an event of type `EXTRACTION_ATTACHMENTS_DONE`.
If attachment extraction fails, the snap-in must respond to Airdrop with a message with an event of
type `EXTRACTION_ATTACHMENTS_ERROR`.

## Response from the snap-in
## Implementation

Attachment extraction is already provided by the SDK. If you need to customize it for your use case,
implement it in the [attachments-extraction.ts](https://github.com/devrev/adaas-template/blob/main/code/src/functions/extraction/workers/attachments-extraction.ts) file.

After uploading an attachment or a batch of attachments, the extractor also has to prepare and
upload a file specifying the extracted and uploaded attachments.
@@ -43,6 +45,7 @@ The uploaded artifact is structured like a normal artifact containing extracted
## Examples

Here is an example of an SSOR attachment file:

```json lines
{
"id": {
```
72 changes: 45 additions & 27 deletions fern/docs/pages/airdrop/data-extraction.mdx
@@ -27,27 +27,27 @@ The restarting is immediate (in case of `EXTRACTION_DATA_PROGRESS`) or delayed
(in case of `EXTRACTION_DATA_DELAY`).

Once the data extraction is done, the snap-in must respond to Airdrop with a message with an event of
type `EXTRACTION_DATA_DONE`.

If data extraction fails at any point, the snap-in must respond to Airdrop with a
message with an event of type `EXTRACTION_DATA_ERROR`.

## Response from the snap-in
## Implementation

During the data extraction phase, the snap-in uploads batches of extracted items (the recommended
batch size is 2000 items) formatted in JSONL (JSON Lines format), gzipped, and submitted as an
artifact to S3Interact (with tooling from `@devrev/adaas-sdk`).
Data extraction should be implemented in the [data-extraction.ts](https://github.com/devrev/adaas-template/blob/main/code/src/functions/extraction/workers/data-extraction.ts) file.

During the data extraction phase, the snap-in uploads batches of extracted items (with tooling from `@devrev/adaas-sdk`).

Each artifact is submitted with an `item_type`, defining a separate domain object from the
external system and matching the `record_type` in the provided metadata.
Item types defined when uploading extracted data must validate the declarations in the metadata file.

Extracted data must be normalized:

- Null values: All fields without a value should either be omitted or set to null.
  For example, if an external system represents missing values as `""` or `-1`,
  those must be set to null.
- Timestamps: Full-precision timestamps should be formatted as RFC3339 (`1972-03-29T22:04:47+01:00`),
  and dates should be just `2020-12-31`.
- References: references must be strings, not numbers or objects.
- Number fields must be valid JSON numbers (not strings).
- Multiselect fields must be provided as an array (not CSV).
@@ -58,17 +58,17 @@ All other fields are contained within the `data` attribute.

```json {2-4}
{
  "id": "2102e01F",
  "created_date": "1972-03-29T22:04:47+01:00",
  "modified_date": "1970-01-01T01:00:04+01:00",
  "data": {
    "actual_close_date": "1970-01-01T02:33:18+01:00",
    "creator": "b8",
    "owner": "A3A",
    "rca": null,
    "severity": "fatal",
    "summary": "Lorem ipsum"
  }
}
```
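
The normalization rules above can be sketched as a small mapping function. This is a minimal illustration and not part of `@devrev/adaas-sdk`: the `RawIssue` shape, its field names, and the helpers `toRfc3339` and `normalizeIssue` are hypothetical and stand in for whatever the external system returns.

```typescript
// Hypothetical raw record from an external system (not an SDK type).
interface RawIssue {
  id: number;
  created: number; // Unix epoch seconds
  modified: number; // Unix epoch seconds
  owner_id: number; // -1 when the issue has no owner
  summary: string; // "" when missing
  story_points: string; // number delivered as a string, "" when missing
  labels: string; // comma-separated values, "" when empty
}

// Normalized item: id and dates at the top level, everything else in `data`.
interface NormalizedItem {
  id: string;
  created_date: string;
  modified_date: string;
  data: Record<string, unknown>;
}

// Formats epoch seconds as an RFC3339 timestamp in UTC.
function toRfc3339(epochSeconds: number): string {
  return new Date(epochSeconds * 1000).toISOString();
}

function normalizeIssue(raw: RawIssue): NormalizedItem {
  return {
    id: String(raw.id),
    created_date: toRfc3339(raw.created),
    modified_date: toRfc3339(raw.modified),
    data: {
      // References must be strings; the -1 sentinel becomes null.
      owner: raw.owner_id === -1 ? null : String(raw.owner_id),
      // Empty strings become null.
      summary: raw.summary === "" ? null : raw.summary,
      // Number fields must be JSON numbers, not strings.
      story_points: raw.story_points === "" ? null : Number(raw.story_points),
      // Multiselect fields must be arrays, not CSV.
      labels: raw.labels === "" ? [] : raw.labels.split(",").map((label) => label.trim()),
    },
  };
}
```

Keeping the mapping in one pure function makes it easy to unit test against sample payloads before running a full import.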

@@ -86,14 +86,32 @@ You can also generate example data to show the format the data has to be normali
```bash
echo '{}' | chef-cli fuzz-extracted -r issue -m external_domain_metadata.json > example_issues.json
```

## Deploying and testing the snap-in
## State handling

Since each snap-in invocation is a separate runtime instance (with a maximum execution time of 12 minutes),
it does not know what has been previously accomplished or how many records have already been extracted.
To enable information passing between invocations and runs, support has been added for saving a limited amount
of data as the snap-in `state`. Snap-in `state` persists between phases in one sync run as well as between multiple sync runs.
You can access the `state` through the SDK's `adapter` object.

Once you have implemented data extraction, you should deploy your snap-in to your test organization and run an import.
A snap-in must consult its state to obtain information on when the last successful forward sync started.

To deploy the snap-in, run `make auth` and `make deploy` in the snap-in repository.
Then, activate the snap-in by running `devrev snap_in activate`.
- The snap-in's `state` is loaded at the start of each invocation and saved at its end.
- The snap-in's `state` must be a valid JSON object.
- Each sync direction (to DevRev and from DevRev) has its own `state` object that is not shared.
- The snap-in `state` should be smaller than 1 MB, which maps to approximately 500,000 characters.
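
As a rough guard against the size limit, the serialized state can be measured before it is saved. The sketch below is an assumption-laden illustration: it treats 1 MB as 1,000,000 bytes of UTF-8 encoded JSON, which may differ from how the platform measures the state, and the helper names `stateSizeInBytes` and `isStateWithinLimit` are hypothetical.

```typescript
// Assumption: 1 MB is interpreted as 1,000,000 bytes of UTF-8 encoded JSON.
const MAX_STATE_BYTES = 1000 * 1000;

// Returns the size of the state when serialized as UTF-8 encoded JSON.
function stateSizeInBytes(state: object): number {
  return new TextEncoder().encode(JSON.stringify(state)).length;
}

// True when the serialized state is below the limit.
function isStateWithinLimit(state: object, limit: number = MAX_STATE_BYTES): boolean {
  return stateSizeInBytes(state) < limit;
}
```

A check like this is mostly useful in tests, where it can catch a state that grows with the number of extracted records rather than staying a fixed-size cursor.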

After activation, you can create an import in the DevRev UI, which will initially reach the 'waiting for user input' stage.
During this phase, you can verify your data extraction implementation is working correctly.
Effective use of the state and breaking down the problem into smaller chunks are crucial for good performance and user experience. Without knowing what has been processed, the snap-in extracts the same data multiple times, using valuable API capacity and time, and possibly duplicates the data inside DevRev or the external application.

Relevant documentation can be found in the [Snap-in development](/snapin-development/locally-testing-snap-ins) section.
The snap-in starter template contains an [example](https://github.com/devrev/adaas-template/blob/main/code/src/functions/extraction/index.ts) of a simple state. Adding more data to the state can help with pagination and rate limiting by saving the point at which extraction was left off.
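
One way to save the point at which extraction was left off is to keep a page cursor in the state. The following is a minimal sketch under stated assumptions: `ExtractorState`, `FetchPage`, and `extractIssues` are hypothetical names, the page-based API is invented for illustration, and `maxPages` stands in for the time budget of a single invocation.

```typescript
// Hypothetical state shape; the starter template defines its own.
interface ExtractorState {
  issues: {
    completed: boolean;
    nextPage: number;
  };
}

const initialState: ExtractorState = {
  issues: { completed: false, nextPage: 1 },
};

// Stand-in for a call to the external system's paginated API.
type FetchPage = (page: number) => Promise<{ items: string[]; hasMore: boolean }>;

// Extracts pages until the data runs out or the page budget for this
// invocation is used up. The state is updated after every page, so the
// next invocation continues from `nextPage` instead of starting over.
async function extractIssues(
  state: ExtractorState,
  fetchPage: FetchPage,
  maxPages: number
): Promise<string[]> {
  const extracted: string[] = [];
  let pagesFetched = 0;

  while (!state.issues.completed && pagesFetched < maxPages) {
    const { items, hasMore } = await fetchPage(state.issues.nextPage);
    extracted.push(...items);
    pagesFetched += 1;

    if (hasMore) {
      state.issues.nextPage += 1;
    } else {
      state.issues.completed = true;
    }
  }

  return extracted;
}
```

Because the cursor only advances after a page has been fetched, an invocation that is cut short repeats at most one page rather than the whole extraction.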

To test the state in development, you can decrease the timeout between snap-in invocations.

```typescript
await spawn<DummyExtractorState>({
  ...,
  options: {
    timeout: 1 * 60 * 1000, // 1 minute in milliseconds
  },
});
```
27 changes: 27 additions & 0 deletions fern/docs/pages/airdrop/deploy-to-organization.mdx
@@ -0,0 +1,27 @@
Once you're ready to test your snap-in in a production environment, you can deploy the snap-in to your organization.

Follow these steps:

1. Copy `.env.example` to a new file named `.env` and fill in the required variables.
2. Deploy a draft version of your snap-in to your organization by using `make deploy`.
3. Install the snap-in in your DevRev organization by going to **Settings** > **Snap-ins** > **Install snap-in**.
4. Set up the connection under **Settings** > **Airdrops** > **Connections**.
5. Create an import at **Settings** > **Airdrops** > **Airdrop**.

This step is also a prerequisite for publishing the snap-in on the DevRev marketplace.

### Observability

To observe logs from your snap-in in your development environment:

```bash
devrev snap_in_package logs | jq
```

To open logs in your favorite editor:

```bash
devrev snap_in_package logs | code -
```

For more information, refer to [Debugging](/snapin-development/debugging).
71 changes: 37 additions & 34 deletions fern/docs/pages/airdrop/external-sync-units-extraction.mdx
@@ -1,52 +1,55 @@
In the external sync unit extraction phase, the extractor is expected to obtain a list of external
sync units that it can extract with the provided credentials and send it to Airdrop in its response.

An _external sync unit_ refers to a single unit in the external system that is being airdropped to DevRev.
In some systems, this is a project; in some it is a repository; in support systems it could be
called a brand or an organization.
What a unit of data is called and what it represents depends on the external system's domain model.
It usually combines contacts, users, work-like items, and comments into a unit of domain objects.

Some external systems may offer a single unit in their free plans,
while their enterprise plans may offer their clients to operate many separate units.

The external sync unit ID is the identifier of the sync unit (project, repository, or similar)
in the external system.
For GitHub, this would be the repository, for example `cli` in `github.com/devrev/cli`.

## Triggering event
In the external sync unit extraction phase, the snap-in is expected to obtain a list of external
sync units that it can extract from the external system API and send it to Airdrop in its response.

Contributor


Should we remove the extra newline?

We haven't defined what an "external sync unit" is. Lets define it or speak in more general terms.

Contributor Author


One paragraph above defines external sync unit :)

External sync unit extraction is executed only during the initial import.
It extracts external sync units available in the external system, so that the end user can choose
which external sync unit should be airdropped during the creation of an **Import** in the DevRev App.

Airdrop initiates the external sync unit extraction phase by starting the worker with a message
with an event of type `EXTRACTION_EXTERNAL_SYNC_UNITS_START`.
### Implementation

The snap-in must respond to Airdrop with a message with an event of type
`EXTRACTION_EXTERNAL_SYNC_UNITS_DONE`, which contains a list of external sync units as a payload,
or `EXTRACTION_EXTERNAL_SYNC_UNITS_ERROR` in case of an error.
This phase should be implemented in the [`external-sync-units-extraction.ts`](https://github.com/devrev/adaas-template/blob/main/code/src/functions/extraction/workers/external-sync-units-extraction.ts) file.

## Response from the snap-in
The snap-in should emit the list of external sync units in the given format:

```typescript
const externalSyncUnits: ExternalSyncUnit[] = [
  {
    id: "devrev",
    name: "devrev",
    description: "Demo external sync unit",
    item_count: 100,
  },
];
```

The snap-in provides the list of external sync units in the provided event message
`event_data.external_sync_units` containing the following fields:
- `id`: The unique identifier in the external system.
- `name`: The human-readable name in the external system.
- `description`: The short description if the external system provides it.
- `item_count`: The number of items (issues, tickets, comments or others) in the external system.
  Item count should be provided if it can be obtained in a lightweight manner, such as by calling an API endpoint.
  If there is no such way to get it (for example, if the items would need to be extracted to count them),
  then the item count should be `-1` to avoid blocking the import with long-running queries.
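
The field rules above can be sketched as a mapping from an external system's project record to an external sync unit. This is an illustration only: `RawProject` and its fields are hypothetical, and the local `ExternalSyncUnit` interface simply mirrors the four fields listed above rather than importing the SDK type.

```typescript
// Local mirror of the fields described above (stand-in for the SDK type).
interface ExternalSyncUnit {
  id: string;
  name: string;
  description: string;
  item_count: number;
}

// Hypothetical project record returned by an external system.
interface RawProject {
  key: string;
  title: string;
  summary?: string;
  issueCount?: number; // only present when the API reports it cheaply
}

function toExternalSyncUnit(project: RawProject): ExternalSyncUnit {
  return {
    id: project.key,
    name: project.title,
    description: project.summary ?? "",
    // Fall back to -1 when the count is not available without a full extraction.
    item_count: project.issueCount ?? -1,
  };
}
```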

Example:
```json
[
  {
    "id": "a-microservice-repository",
    "name": "A Microservice Repository",
    "description": "Our greatest microservice repo",
    "item_count": 232
  }
]
```
The snap-in must respond to Airdrop with a message, which contains a list of external sync units as a payload:

```typescript
await adapter.emit(ExtractorEventType.ExtractionExternalSyncUnitsDone, {
  external_sync_units: externalSyncUnits,
});
```

or an error:

```typescript
await adapter.emit(ExtractorEventType.ExtractionExternalSyncUnitsError, {
  error: {
    message: "Failed to extract external sync units. Lambda timeout.",
  },
});
```
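
The two responses can be combined in a single `try`/`catch`, so that exactly one of them is emitted. The sketch below is self-contained for illustration: the local `ExtractorEventType` enum and `Emitter` interface are stand-ins for the SDK's `ExtractorEventType` and `adapter`, and `fetchUnits` is a hypothetical function that calls the external system.

```typescript
// Local stand-ins for the SDK's `ExtractorEventType` and `adapter`, so the
// sketch is self-contained. In a snap-in, use the SDK's own exports instead.
enum ExtractorEventType {
  ExtractionExternalSyncUnitsDone = "EXTRACTION_EXTERNAL_SYNC_UNITS_DONE",
  ExtractionExternalSyncUnitsError = "EXTRACTION_EXTERNAL_SYNC_UNITS_ERROR",
}

interface ExternalSyncUnit {
  id: string;
  name: string;
  description: string;
  item_count: number;
}

interface Emitter {
  emit(eventType: ExtractorEventType, data?: Record<string, unknown>): Promise<void>;
}

async function extractExternalSyncUnits(
  adapter: Emitter,
  fetchUnits: () => Promise<ExternalSyncUnit[]> // hypothetical external API call
): Promise<void> {
  try {
    const externalSyncUnits = await fetchUnits();
    await adapter.emit(ExtractorEventType.ExtractionExternalSyncUnitsDone, {
      external_sync_units: externalSyncUnits,
    });
  } catch (error) {
    // Report the failure to Airdrop instead of letting the worker crash.
    const message = error instanceof Error ? error.message : String(error);
    await adapter.emit(ExtractorEventType.ExtractionExternalSyncUnitsError, {
      error: { message },
    });
  }
}
```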

To test your changes, start a new airdrop in the DevRev App. If external sync units extraction is successful, you should be prompted to choose an external sync unit from the list.