Merged
1 change: 0 additions & 1 deletion base_folder/src/functions/extraction/index.ts
@@ -6,7 +6,6 @@ export interface ExtractorState {}

export const initialState: ExtractorState = {};


function getWorkerPerExtractionPhase(event: FunctionInput) {
  let path;
  switch (event.payload.event_type) {
2 changes: 2 additions & 0 deletions devrev-snapin-template.plain
@@ -94,6 +94,8 @@

- The Function being implemented *MUST not* be called "extraction" or "loading".

- {% include "templates/should_not_modify_extraction_function.plain" %}

***Functional Requirements:***

- Implement The Function that only checks if The Function can be invoked.
1 change: 1 addition & 0 deletions docs/data-extraction-rules-for-emitting-events.md
@@ -5,6 +5,7 @@
Here are the rules when `EXTRACTION_DATA_START` or `EXTRACTION_DATA_CONTINUE` event is received by the snap-in:

- If there's a timeout ("onTimeout" method in "spawn" method), the snap-in should emit an event with type `EXTRACTION_DATA_PROGRESS` (no "data" payload).
- Note: Event `EXTRACTION_DATA_PROGRESS` **MUST NEVER** be emitted in the "task" method in The Worker Thread. The only place where `EXTRACTION_DATA_PROGRESS` can be emitted is the "onTimeout" method in The Worker Thread.
- If the extraction is successful and all the resources have been extracted, the snap-in should emit an event with type `EXTRACTION_DATA_DONE` (no "data" payload).
- In case of unexpected error, the snap-in should emit an event with type `EXTRACTION_DATA_ERROR` ("data" payload should contain attribute "error" - error message as a string).
- In case of rate limiting (status code 429) for "data extraction", the snap-in *must* emit an event with type `EXTRACTION_DATA_DELAY`. If emitting event of type `EXTRACTION_DATA_DELAY`, you *must* also provide a "delay" attribute, specifying the delay in seconds as an integer.
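The rules above can be sketched as a single mapping from the outcome of an extraction run to the event that should be emitted. This is only an illustration: `ExtractionOutcome`, `EmittedEvent` and `eventForOutcome` are hypothetical names, not part of any SDK, and placing `delay` inside the "data" payload is an assumption, since the rules only say that a "delay" attribute must be provided.

```typescript
// Hypothetical outcome of one extraction run (illustrative only, not an SDK type).
type ExtractionOutcome =
  | { kind: "timeout" } // reported from "onTimeout", never from "task"
  | { kind: "done" }
  | { kind: "error"; message: string }
  | { kind: "rate_limited"; delaySeconds: number };

// Hypothetical shape of the event to emit.
interface EmittedEvent {
  type:
    | "EXTRACTION_DATA_PROGRESS"
    | "EXTRACTION_DATA_DONE"
    | "EXTRACTION_DATA_ERROR"
    | "EXTRACTION_DATA_DELAY";
  data?: { error: string } | { delay: number };
}

function eventForOutcome(outcome: ExtractionOutcome): EmittedEvent {
  switch (outcome.kind) {
    case "timeout":
      // No "data" payload for progress events.
      return { type: "EXTRACTION_DATA_PROGRESS" };
    case "done":
      // No "data" payload for done events.
      return { type: "EXTRACTION_DATA_DONE" };
    case "error":
      // The error message is passed along as a plain string.
      return { type: "EXTRACTION_DATA_ERROR", data: { error: outcome.message } };
    case "rate_limited":
      // The delay must be an integer number of seconds, so round up.
      return {
        type: "EXTRACTION_DATA_DELAY",
        data: { delay: Math.ceil(outcome.delaySeconds) },
      };
  }
}
```

Keeping the mapping in one pure function makes it easy to check that `EXTRACTION_DATA_PROGRESS` is only ever produced for the timeout case.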
22 changes: 1 addition & 21 deletions docs/data-extraction.md
@@ -15,27 +15,7 @@ This state object is shared across all invocations and keeps track of where the

## Triggering event

Airdrop initiates data extraction by starting the snap-in with a message with event type
`EXTRACTION_DATA_START` when transitioning to the data extraction phase.

During the data extraction phase, the snap-in extracts data from an external system,
prepares batches of data and uploads them in the form of artifacts (files) to DevRev.

The snap-in must respond to Airdrop with a message with event type of `EXTRACTION_DATA_PROGRESS`,
together with an optional progress estimate when the maximum Airdrop snap-in runtime (13 minutes) has been reached.

If the extraction has been rate-limited by the external system and back-off is required, the snap-in
must respond to Airdrop with a message with event type `EXTRACTION_DATA_DELAY` and specifying
back-off time with `delay` attribute (in seconds as an integer).

In both cases, Airdrop starts the snap-in with a message with event type `EXTRACTION_DATA_CONTINUE`.
In case of `EXTRACTION_DATA_PROGRESS` the restarting is immediate,
meanwhile in case of `EXTRACTION_DATA_DELAY` the restarting is delayed for the given number of seconds.

Once the data extraction is done, the snap-in must respond to Airdrop with a message with event type `EXTRACTION_DATA_DONE`.

If data extraction failed in any moment of extraction, the snap-in must respond to Airdrop with a
message with event type `EXTRACTION_DATA_ERROR`.
Refer to the resource [data-extraction-rules-for-emitting-events.md](docs/data-extraction-rules-for-emitting-events.md) for the rules for emitting events for The Extraction Function.

## Implementation

5 changes: 5 additions & 0 deletions templates/data_fetching_return_expectation.plain
@@ -0,0 +1,5 @@
The Functions implemented *should* also return:
- `status_code` - a status code from The API.
- `api_delay` - Delay in seconds to wait before retrying the request to The API (if no rate limit, set to 0).
- `message` - Information about the call to The Function.
- Note: Every The Function should handle rate limiting.
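One way to picture the return expectation above is a small helper that turns an HTTP status into the three fields. This is a sketch under assumptions: `FetchResult` and `toFetchResult` are hypothetical names, reading the delay from a `Retry-After` header given in seconds is an assumption about The API, and the 60-second fallback is arbitrary.

```typescript
// Hypothetical result shape matching the three fields listed above.
interface FetchResult {
  status_code: number;
  api_delay: number;
  message: string;
}

// retryAfterHeader: raw "Retry-After" header value, or null if absent.
// Assumption: The API reports the delay as a whole number of seconds.
function toFetchResult(
  statusCode: number,
  retryAfterHeader: string | null,
  fallbackDelaySeconds = 60, // arbitrary fallback, chosen for illustration
): FetchResult {
  if (statusCode === 429) {
    const parsed =
      retryAfterHeader === null ? NaN : Number.parseInt(retryAfterHeader, 10);
    const apiDelay =
      Number.isFinite(parsed) && parsed >= 0 ? parsed : fallbackDelaySeconds;
    return {
      status_code: statusCode,
      api_delay: apiDelay,
      message: `Rate limited; retry after ${apiDelay} seconds.`,
    };
  }
  const ok = statusCode >= 200 && statusCode < 300;
  return {
    status_code: statusCode,
    api_delay: 0, // no rate limit, so the delay is 0
    message: ok
      ? "Request succeeded."
      : `Request failed with status ${statusCode}.`,
  };
}
```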
2 changes: 2 additions & 0 deletions templates/external_domain_metadata_boilerplate.plain
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,8 @@

- The External Domain Metadata JSON object *MUST* be imported without "as" statement in The Implementation Code.

- {% include "templates/should_not_modify_extraction_function.plain" %}

***Test Requirements:***

- Validate generated The External Domain Metadata JSON object with the chef-cli tool (The Chef CLI) using the command "chef-cli validate-metadata":
6 changes: 6 additions & 0 deletions templates/external_sync_unit_acceptance_test.plain
@@ -0,0 +1,6 @@
Test The Extraction Function using the resource [{{ resource_name }}](test_data/{{ resource_name }}).
- Expect The Callback Server to receive *a single* event with "event_type" "EXTRACTION_EXTERNAL_SYNC_UNITS_DONE".
- Let `external_sync_units_array = <event_from_callback_server>["event_data"]["external_sync_units"]` (array of objects).
- Expect `external_sync_units_array` to exist and be an array.
- Expect `len(external_sync_units_array) = {{ expected_external_sync_unit_count }}`.
- Out of `external_sync_units_array`, expect that there exists an element that has "name" equal to "{{ expected_external_sync_unit_name }}".
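The expectations above could be checked with a helper along these lines. It is a minimal sketch, not the generated test: `CallbackEvent` and `checkExternalSyncUnits` are hypothetical names, and the event shape is inferred only from the fields the template mentions.

```typescript
// Hypothetical shape of an event captured by The Callback Server.
interface CallbackEvent {
  event_type: string;
  event_data?: { external_sync_units?: unknown };
}

// Returns a list of failed expectations; an empty list means all checks passed.
function checkExternalSyncUnits(
  events: CallbackEvent[],
  expectedCount: number,
  expectedName: string,
): string[] {
  const failures: string[] = [];
  const event = events[0];
  if (events.length !== 1 || event === undefined) {
    failures.push(`expected a single event, got ${events.length}`);
    return failures;
  }
  if (event.event_type !== "EXTRACTION_EXTERNAL_SYNC_UNITS_DONE") {
    failures.push(`unexpected event_type: ${event.event_type}`);
  }
  const units = event.event_data?.external_sync_units;
  if (!Array.isArray(units)) {
    failures.push("external_sync_units is missing or not an array");
    return failures;
  }
  if (units.length !== expectedCount) {
    failures.push(`expected ${expectedCount} units, got ${units.length}`);
  }
  const hasName = units.some(
    (unit) =>
      typeof unit === "object" &&
      unit !== null &&
      (unit as { name?: unknown }).name === expectedName,
  );
  if (!hasName) {
    failures.push(`no unit with name "${expectedName}"`);
  }
  return failures;
}
```

Returning every failed expectation, rather than stopping at the first, gives a fuller picture when a test run goes wrong.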
2 changes: 2 additions & 0 deletions templates/initial_domain_mapping_boilerplate.plain
@@ -27,6 +27,8 @@

- The External Domain Metadata JSON object should not be modified. If there are discrepancies between The External Domain Metadata JSON object and The Initial Domain Mapping JSON object, assume The External Domain Metadata JSON object is correct and The Initial Domain Mapping JSON object needs to be adjusted.

- {% include "templates/should_not_modify_extraction_function.plain" %}

***Test Requirements:***

- Validate generated The Initial Domain Mapping JSON object with the chef-cli tool (The Chef CLI) using the command "chef-cli initial-mapping check -m <The External Domain Metadata JSON object file>":
3 changes: 3 additions & 0 deletions templates/internal_client.plain
@@ -0,0 +1,3 @@
The {{ external_system_name }} Internal Client is a TypeScript service that communicates with The API. These are the rules for The {{ external_system_name }} Internal Client:
- If we need to create a new request to The API, we must create a new method in The {{ external_system_name }} Internal Client.
- Communication with The API must be completely abstracted away from The Function. The Function must be able to initialize The {{ external_system_name }} Internal Client, call the relevant method from The {{ external_system_name }} Internal Client and get the response from The API.
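The rules above might look roughly like the following in TypeScript. Everything here is hypothetical: `ExampleInternalClient`, `Transport`, `listProjects` and the `/projects` path are placeholders for illustration, and the transport is injected so the sketch runs without a real network call.

```typescript
// Hypothetical low-level transport; a real client would wrap an HTTP library here.
type Transport = (path: string) => Promise<{ status: number; body: unknown }>;

// Hypothetical client: one method per request to The API.
class ExampleInternalClient {
  private readonly transport: Transport;

  constructor(transport: Transport) {
    this.transport = transport;
  }

  // Each new request to The API gets its own method on the client.
  async listProjects(): Promise<{ status_code: number; data: unknown }> {
    const response = await this.transport("/projects");
    return { status_code: response.status, data: response.body };
  }
}

// The Function only initializes the client and calls a method;
// it never builds URLs or handles HTTP details itself.
async function runFunction(transport: Transport): Promise<number> {
  const client = new ExampleInternalClient(transport);
  const result = await client.listProjects();
  return result.status_code;
}
```

Injecting the transport keeps The Function unaware of how requests are made, which is the abstraction the rules ask for.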
1 change: 1 addition & 0 deletions templates/should_not_modify_extraction_function.plain
@@ -0,0 +1 @@
The Extraction Function *must not* be modified.
2 changes: 2 additions & 0 deletions templates/spawn_method_instructions.plain
@@ -7,3 +7,5 @@

- Use The Initial Domain Mapping JSON object for initialDomainMapping parameter when spawning a new worker.
- Note: The Initial Domain Mapping JSON object should be read directly from the JSON file.

- The Extraction Function should not stringify error messages. If an error is thrown, it should be logged before throwing it.