do not merge - additional resources support #189

Draft · wants to merge 23 commits into base: main
Conversation

thomas-riccardi

Requirements for Contributing to this repository

  • Fill out the template below. Any pull request that does not include enough information to be reviewed in a timely manner may be closed at the maintainers' discretion.
  • The pull request must only fix one issue, or add one feature, at a time.
  • The pull request must update the test suite to demonstrate the changed functionality.
  • After you create the pull request, all status checks must pass before a maintainer reviews your contribution. For more details, please see CONTRIBUTING.

What does this PR do?

To migrate our Datadog organization from the Datadog US site to the Datadog EU1 site, we used datadog-sync-cli, but needed support for more resources.

This PR is opened not to be merged, but to help potential future users: this is not supported code, but we used it for our migration, and it can serve as a base after review.

We do not intend to work on this anymore: our migration is finished.

If this is not the right place for this, sorry for the inconvenience; don't hesitate to close.

Description of the Change

Warning: some resources don't have an official API, so we used the API the web frontend uses, with cookie authentication (and CSRF tokens).

Warning: we migrated what we could, following our own priorities; these are often partial migrations.

Info: we opened Datadog support tickets for missing APIs and small bugs in the API; some were already fixed (!), and some upstream fixes landed here from our tickets (logs pipelines order). More details in the individual commit messages.

See the README for more details.

additional resources

  • logs_facets
  • logs_views
  • metric_metadatas
  • incidents
  • incidents_integrations
  • incidents_todos
  • incident_org_settings
  • incidents_config_fields
  • incidents_config_notifications_templates
  • incidents_config_integrations_workflows
  • integrations_slack_channels

Alternate Designs

  • with official APIs for more resources, we would have avoided the cookie-auth hack

Possible Drawbacks

  • non-homogeneous model/resource naming
  • cookie auth / non-official API usage
  • no automated tests
  • we didn't fully understand everything the library does or how to develop new models, especially on the first ones or for advanced patterns

Verification Process

We tested and used this as-is for our migration.

Additional Notes

Some fixes are in datadog_sync/utils/resource_utils.py and could be useful upstream.

Release Notes

do not merge/release

Review checklist (to be filled by reviewers)

  • Feature or bug fix MUST have appropriate tests (unit, integration, etc...)
  • PR title must be written as a CHANGELOG entry (see why)
  • Files changes must correspond to the primary purpose of the PR as described in the title (small unrelated changes should have their own PR)
  • PR must have one changelog/ label attached. If applicable it should have the backward-incompatible label attached.
  • PR should not have do-not-merge/ label attached.
  • If applicable, the issue must have at least kind/ and severity/ labels attached.

needed for the hacky web frontend API usage (e.g. logs_facets)

when defining the config cookie_dogweb value, the client *switches* to
cookie auth *only*: the API doesn't support both API/APP key & cookie auth.
=> setting the cookie_dogweb config will break all standard
resource types.

also needs a csrf_token for the destination, passed as the _authentication_token key in
the request payload for write actions (see the sketch after this list).
- hardcoded source & target scopeid for now (they seem to be internal index ids, but
  which ones?)
- use x-csrf-token as the _authentication_token key in the request payload for
  write actions
- use dogweb cookie auth
- use x-csrf-token as the _authentication_token key in the request payload for
  write actions
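For future readers, a minimal sketch of what the cookie-auth hack looks like (illustrative only: the endpoint path, payload shape, and placeholder values are assumptions, not the actual datadog-sync-cli code):

```python
import requests

DD_SITE = "https://app.datadoghq.eu"        # destination (EU1) web frontend
DOGWEB_COOKIE = "<dogweb session cookie>"   # placeholder: taken from a logged-in browser session
CSRF_TOKEN = "<x-csrf-token value>"         # placeholder: taken from the same session

session = requests.Session()
# Cookie auth only: the frontend API does not accept API/APP key headers
# together with the dogweb cookie, so no DD-API-KEY headers are set here.
session.cookies.set("dogweb", DOGWEB_COOKIE)

# Write actions additionally require the CSRF token, passed as the
# `_authentication_token` key inside the request payload itself.
payload = {
    "_authentication_token": CSRF_TOKEN,
    "facet": {"path": "@http.status_code", "type": "number"},  # hypothetical body
}
resp = session.post(f"{DD_SITE}/api/v1/logs/facets", json=payload)  # hypothetical endpoint
resp.raise_for_status()
```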
- get all metric metadata with the standard v2 API, but it returns only ids
- hack the import_resource standard pattern to populate the dict that
  contains just 'id'; it's called during import just after
  get_resources()
- get & update of metric metadata use the v1 API, which doesn't have 'id' in
  its dict; hack it in everywhere for consistency with the v2/standard
  pattern
- creating metric metadata is *not* supported by the Datadog API; metadata
  can only be updated on an already existing metric: raise an error explaining
  that the user should first push data points to the metric, then rerun the
  script (see the sketch after this list)
  - use the standard pattern to get the destination resources (just ids
    here), and call update_resource() instead if the metric already exists
  - the initial diff won't be the actual diff: it says it will create everything
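A hedged sketch of that metric-metadata flow, using the documented v2 list endpoint and v1 metadata endpoints (simplified; the real logic lives in the resource class, and the error handling here is illustrative):

```python
import requests

BASE = "https://api.datadoghq.eu"
HEADERS = {"DD-API-KEY": "<api-key>", "DD-APPLICATION-KEY": "<app-key>"}

# The v2 list endpoint returns only ids (metric names), no metadata.
metrics = requests.get(f"{BASE}/api/v2/metrics", headers=HEADERS).json()["data"]

for m in metrics:
    metric_id = m["id"]
    # v1 get/update carry the metadata, but their payload has no 'id' field:
    # add it manually for consistency with the v2/standard pattern.
    meta = requests.get(f"{BASE}/api/v1/metrics/{metric_id}", headers=HEADERS).json()
    meta["id"] = metric_id

    # Creation is not supported: the metric must already exist on the
    # destination (i.e. have received data points), so we can only update.
    resp = requests.put(
        f"{BASE}/api/v1/metrics/{metric_id}", headers=HEADERS,
        json={k: v for k, v in meta.items() if k != "id"},
    )
    if resp.status_code == 404:
        raise RuntimeError(
            f"metric {metric_id} does not exist on the destination: "
            "push data points to it first, then rerun the script"
        )
```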
Only the base incident data is supported; related
resources (integrations (slack), todos (remediations), attachments) may
be handled later with dedicated resources.

The import is lossy: for example, the creation date becomes the sync
date, the timeline is lost, etc.

- the Datadog API documentation lists only a subset of accepted fields
  for creation; in practice it handles only a subset and ignores
  the others
  => auto-update just after create to sync more data (see the sketch after this list)
- attributes.notification_handles is ignored: too hard to reproduce
  properly, and it spams people during sync (or the dates are all wrong); we
  don't really care about that piece of information => skip it
- avoid a perpetual diff on old incidents without `attributes.visibility`:
  hardcode the `organization` value, it's what will be created on the
  destination organization anyway
- the incidents list API initially skipped old incidents without the
  `visibility` field; an exchange with Datadog support resulted in a
  production fix
- after a full import: it seems users not yet active on the destination
  organization are in fact *not* notified (active incident commanders
  are notified, and are mandatory according to the API documentation)
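The create-then-update dance, roughly (assumes the documented v2 incidents endpoints; attribute handling is heavily simplified):

```python
import requests

BASE = "https://api.datadoghq.eu"
HEADERS = {"DD-API-KEY": "<api-key>", "DD-APPLICATION-KEY": "<app-key>"}

def sync_incident(source_incident: dict) -> dict:
    attrs = dict(source_incident["attributes"])
    attrs.pop("notification_handles", None)         # skipped: would spam people during sync
    attrs.setdefault("visibility", "organization")  # avoid the perpetual diff on old incidents

    # POST accepts only a subset of the fields and silently ignores the rest...
    created = requests.post(
        f"{BASE}/api/v2/incidents", headers=HEADERS,
        json={"data": {"type": "incidents", "attributes": attrs}},
    ).json()["data"]

    # ...so PATCH immediately after creation to sync the remaining attributes.
    return requests.patch(
        f"{BASE}/api/v2/incidents/{created['id']}", headers=HEADERS,
        json={"data": {"type": "incidents", "id": created["id"], "attributes": attrs}},
    ).json()["data"]
```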
…mespace

if kube_namespace is not null, don't overwrite it: the existing data is better (see the sketch below).
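In code terms, something like this hypothetical merge helper (not the actual implementation):

```python
def merge_host_data(existing: dict, incoming: dict) -> dict:
    """Overlay incoming data, but keep a non-null kube_namespace:
    the value already present is better than the synced one."""
    merged = {**existing, **incoming}
    if existing.get("kube_namespace") is not None:
        merged["kube_namespace"] = existing["kube_namespace"]
    return merged
```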
…s() when last level is an array of non-nested objects
- undocumented API, but a standard v2 API used by the web frontend; works
  with API/APP key
- just one resource per org, so force update, ignore ids, etc.
- unique by attributes.names
- perpetual diff on 'metadata' for the ootb service & team:
  - PATCH is ok (maybe it ignores metadata?)
  - but the PATCH response contains `metadata: null`
  => `diffs` always shows it; that's ok, we can ignore those (see the sketch after this list)
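One way to silence that known-noisy diff would be to drop the nulled key before comparing (hypothetical helper, not part of this PR):

```python
def strip_nulled_metadata(resource: dict) -> dict:
    # The PATCH response comes back with `metadata: null` even when the field
    # was accepted, so exclude that case from the comparison to avoid the
    # perpetual diff on the ootb service & team.
    return {k: v for k, v in resource.items()
            if not (k == "metadata" and v is None)}
```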
Covers General>Integrations & Notifications>Rules

- API inconsistency: `attributes.triggers.variables.severity_values`
  and `attributes.triggers.variables.status_values` are `null` in read
  calls, but require an array in write calls
  => skipping them with non_nullable_attr (fixed to support lists too; see the sketch after this list)
- errors (probably because some workflows are hardcoded and not duplicable, but there is no obvious attribute to distinguish them):
  - Error: 400 Bad Request - {"errors":["a workflow like that already exists"]}
  - Error: 400 Bad Request - {"errors":["Invalid payload: 'name' is invalid"]}
  => ignoring those errors for now; manually fixed `Send all incident updates to a global channel` via the web frontend.
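A sketch of the null-skipping idea extended to lists, in the spirit of the fixed non_nullable_attr helper (name and exact semantics here are assumptions; the real helper is in datadog_sync/utils/resource_utils.py):

```python
def delete_null_attr(path: str, resource) -> None:
    """Drop the attribute at the dot-separated `path` when its value is None,
    recursing into lists so a path like
    'attributes.triggers.variables.severity_values' still resolves when an
    intermediate level is an array."""
    if isinstance(resource, list):
        for item in resource:
            delete_null_attr(path, item)
        return
    key, _, rest = path.partition(".")
    if not isinstance(resource, dict) or key not in resource:
        return
    if rest:
        delete_null_attr(rest, resource[key])
    elif resource[key] is None:
        del resource[key]
```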
- iterate on all incidents, then for each incident, iterate on its
  'todo' relationships (see the sketch below)
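Roughly (pagination elided; the relationship endpoint path is an assumption based on the commit message):

```python
import requests

BASE = "https://api.datadoghq.eu"
HEADERS = {"DD-API-KEY": "<api-key>", "DD-APPLICATION-KEY": "<app-key>"}

def get_all_incident_todos() -> list:
    todos = []
    incidents = requests.get(f"{BASE}/api/v2/incidents", headers=HEADERS).json()["data"]
    for incident in incidents:
        # Follow each incident's 'todo' relationship individually.
        resp = requests.get(
            f"{BASE}/api/v2/incidents/{incident['id']}/relationships/todos",
            headers=HEADERS,
        )
        todos.extend(resp.json().get("data", []))
    return todos
```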