x-pack/filebeat/input/entityanalytics/provider/okta: allow fine-grain control of API requests (elastic#36492)

This adds support for specifying which of users/devices to collect from
the Okta API endpoints in order to reduce network costs for users who do
not need a full set of entities.

This change does not alter how registered owners and registered users are
collected for devices: when the "devices" dataset is selected, these user
entities are still collected, because they are treated here as attributes
of the device rather than as part of the users dataset.

This change also removes the undocumented WantDevices config field as this
option is non-orthogonal with the dataset configuration.
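
As a reading aid for the commit message above, the following is a minimal sketch of how a `dataset` value maps onto the two collection paths. The helper names `wantUsers` and `wantDevices` are hypothetical and do not appear in the commit, which inlines the equivalent `switch` statements in `doFetchUsers` and `doFetchDevices`.

```go
package main

import (
	"fmt"
	"strings"
)

// wantUsers reports whether the users dataset should be collected.
// Hypothetical helper: the commit performs this check inline.
func wantUsers(dataset string) bool {
	switch strings.ToLower(dataset) {
	case "", "all", "users":
		return true
	default:
		return false
	}
}

// wantDevices reports whether the devices dataset should be collected.
// Per the commit message, device collection still gathers the registered
// owners and registered users of each device as device attributes.
func wantDevices(dataset string) bool {
	switch strings.ToLower(dataset) {
	case "", "all", "devices":
		return true
	default:
		return false
	}
}

func main() {
	for _, ds := range []string{"", "all", "users", "devices"} {
		fmt.Printf("dataset=%q users=%t devices=%t\n", ds, wantUsers(ds), wantDevices(ds))
	}
}
```

An empty value behaves like "all", so configurations that do not set `dataset` keep collecting both users and devices.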
efd6 authored and Scholar-Li committed Feb 5, 2024
1 parent c3e453c commit f0b743f
Showing 5 changed files with 208 additions and 150 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.next.asciidoc
@@ -209,6 +209,7 @@ https://github.com/elastic/beats/compare/v8.8.1\...main[Check the HEAD diff]
- Reduce HTTPJSON metrics allocations. {pull}36282[36282]
- Add support for a simplified input configuration when running under Elastic-Agent {pull}36390[36390]
- Make HTTPJSON response body decoding errors more informative. {pull}36481[36481]
+- Allow fine-grained control of entity analytics API requests for Okta provider. {issue}36440[36440] {pull}36492[36492]

*Auditbeat*

9 changes: 9 additions & 0 deletions x-pack/filebeat/docs/inputs/input-entity-analytics.asciidoc
@@ -545,6 +545,7 @@ Example configuration:
enabled: true
id: okta-1
provider: okta
+dataset: "all"
sync_interval: "12h"
update_interval: "30m"
okta_domain: "OKTA_DOMAIN"
@@ -570,6 +571,14 @@ Whether the input should collect device and device-associated user details
from the Okta API. Device details must be activated on the Okta account for
this option.

+[float]
+===== `dataset`
+
+The datasets to collect from the API. This can be one of "all", "users" or "devices",
+or may be left empty for the default behavior which is to collect all entities.
+When the `dataset` is set to "devices", some user entity data is collected in order
+to populate the registered users and registered owner fields for each device.
+
[float]
===== `sync_interval`

15 changes: 10 additions & 5 deletions x-pack/filebeat/input/entityanalytics/provider/okta/conf.go
@@ -6,6 +6,7 @@ package okta

import (
"errors"
+"strings"
"time"

"github.com/elastic/elastic-agent-libs/transport/httpcommon"
@@ -41,10 +42,10 @@ type conf struct {
OktaDomain string `config:"okta_domain" validate:"required"`
OktaToken string `config:"okta_token" validate:"required"`

-	// WantDevices indicates that device details
-	// should be collected. This is optional as
-	// the devices API is not necessarily activated.
-	WantDevices bool `config:"collect_device_details"`
+	// Dataset specifies the datasets to collect from
+	// the API. It can be ""/"all", "users", or
+	// "devices".
+	Dataset string `config:"dataset"`

// SyncInterval is the time between full
// synchronisation operations.
@@ -159,7 +160,11 @@ func (c *conf) Validate() error {
return errInvalidUpdateInterval
case c.SyncInterval <= c.UpdateInterval:
return errSyncBeforeUpdate
-	default:
+	}
+	switch strings.ToLower(c.Dataset) {
+	case "", "all", "users", "devices":
 		return nil
+	default:
+		return errors.New("dataset must be 'all', 'users', 'devices' or empty")
 	}
 }
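
The dataset check added to `Validate` in the hunk above can be exercised in isolation. The sketch below is a standalone approximation: `validateDataset` and `errInvalidDataset` are hypothetical names introduced for illustration, whereas the commit performs the check inline and constructs the error with `errors.New` at the point of return.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errInvalidDataset is a hypothetical sentinel carrying the same message
// as the error returned in the commit.
var errInvalidDataset = errors.New("dataset must be 'all', 'users', 'devices' or empty")

// validateDataset accepts the empty string, "all", "users" and "devices".
// The comparison is case-insensitive, but the value is not trimmed, so
// surrounding whitespace is rejected.
func validateDataset(dataset string) error {
	switch strings.ToLower(dataset) {
	case "", "all", "users", "devices":
		return nil
	default:
		return errInvalidDataset
	}
}

func main() {
	for _, ds := range []string{"", "ALL", "Users", "devices", " users", "groups"} {
		fmt.Printf("%q -> %v\n", ds, validateDataset(ds))
	}
}
```

Because the earlier interval `switch` in `Validate` no longer has a `default` branch, a configuration with valid intervals falls through to this dataset check rather than returning early.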
17 changes: 15 additions & 2 deletions x-pack/filebeat/input/entityanalytics/provider/okta/okta.go
@@ -12,6 +12,7 @@ import (
"io"
"net/http"
"net/url"
+"strings"
"time"

"github.com/hashicorp/go-retryablehttp"
@@ -338,6 +339,13 @@ func (p *oktaInput) runIncrementalUpdate(inputCtx v2.Context, store *kvstore.Sto
// any existing deltaLink will be ignored, forcing a full synchronization from Okta.
// Returns a set of modified users by ID.
func (p *oktaInput) doFetchUsers(ctx context.Context, state *stateStore, fullSync bool) ([]*User, error) {
+	switch strings.ToLower(p.cfg.Dataset) {
+	case "", "all", "users":
+	default:
+		p.logger.Debugf("Skipping user collection from API: dataset=%s", p.cfg.Dataset)
+		return nil, nil
+	}
+
var (
query url.Values
err error
@@ -418,7 +426,10 @@ func (p *oktaInput) doFetchUsers(ctx context.Context, state *stateStore, fullSyn
// synchronization from Okta.
// Returns a set of modified devices by ID.
func (p *oktaInput) doFetchDevices(ctx context.Context, state *stateStore, fullSync bool) ([]*Device, error) {
-	if !p.cfg.WantDevices {
+	switch strings.ToLower(p.cfg.Dataset) {
+	case "", "all", "devices":
+	default:
+		p.logger.Debugf("Skipping device collection from API: dataset=%s", p.cfg.Dataset)
 		return nil, nil
 	}

@@ -482,7 +493,9 @@ func (p *oktaInput) doFetchDevices(ctx context.Context, state *stateStore, fullS

 // Users are not stored in the state as they are in doFetchUsers. We expect
 // them to already have been discovered/stored from that call and are stored
-// associated with the device undecorated with discovery state.
+// associated with the device undecorated with discovery state. Or, if the
+// dataset is set to "devices", then we have been asked not to care about
+// this detail.
 batch[i].Users = append(batch[i].Users, users...)

next, err := okta.Next(h)
