Collect usage history (access logs) data #265

mabdh · 2022-08-23T07:31:15Z

Summary
As part of #171, one source for access monitoring is the provider access logs: analytics on which resources are actually being queried and how frequently. We need to figure out how Guardian can use access usage history data for data access monitoring.

Some questions this should help answer

How many times a resource is accessed
How many times sensitive resources are getting accessed
Identify what operations are done on a resource (read/write)
Identify users with excess access rights (access that is active but is not used)
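
To make the questions above concrete, here is a minimal sketch of how they could be answered once access logs are available as records. The `activity` and `grant` types and all function names are assumptions made for illustration; they are not Guardian's actual model.

```go
package main

import (
	"fmt"
	"sort"
)

// activity is a hypothetical, minimal access-log record.
type activity struct {
	ResourceID string
	AccountID  string
	Action     string // "read" or "write"
}

// grant is a hypothetical active access grant.
type grant struct {
	ResourceID string
	AccountID  string
}

// accessCounts returns how many times each resource was accessed.
func accessCounts(logs []activity) map[string]int {
	counts := make(map[string]int)
	for _, a := range logs {
		counts[a.ResourceID]++
	}
	return counts
}

// sensitiveAccessCount counts accesses to resources marked as sensitive.
func sensitiveAccessCount(logs []activity, sensitive map[string]bool) int {
	n := 0
	for _, a := range logs {
		if sensitive[a.ResourceID] {
			n++
		}
	}
	return n
}

// operationsOn returns the sorted, distinct actions seen on one resource.
func operationsOn(logs []activity, resourceID string) []string {
	seen := make(map[string]bool)
	for _, a := range logs {
		if a.ResourceID == resourceID {
			seen[a.Action] = true
		}
	}
	ops := make([]string, 0, len(seen))
	for op := range seen {
		ops = append(ops, op)
	}
	sort.Strings(ops)
	return ops
}

// unusedGrants returns grants that have no matching activity in the logs.
func unusedGrants(grants []grant, logs []activity) []grant {
	used := make(map[grant]bool)
	for _, a := range logs {
		used[grant{ResourceID: a.ResourceID, AccountID: a.AccountID}] = true
	}
	var unused []grant
	for _, g := range grants {
		if !used[g] {
			unused = append(unused, g)
		}
	}
	return unused
}

func main() {
	logs := []activity{
		{ResourceID: "orders", AccountID: "alice@example.com", Action: "read"},
		{ResourceID: "orders", AccountID: "alice@example.com", Action: "write"},
		{ResourceID: "payments", AccountID: "bob@example.com", Action: "read"},
	}
	grants := []grant{
		{ResourceID: "orders", AccountID: "alice@example.com"},
		{ResourceID: "payments", AccountID: "carol@example.com"},
	}
	sensitive := map[string]bool{"payments": true}

	fmt.Println(accessCounts(logs))
	fmt.Println(sensitiveAccessCount(logs, sensitive))
	fmt.Println(operationsOn(logs, "orders"))
	fmt.Println(unusedGrants(grants, logs))
}
```

In this sketch a grant counts as "unused" if no log entry matches its resource and account; a real check would probably also need a time window.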

Proposed solution

Track access logs to determine which user accesses which resource
Check which providers support usage history extraction
- BigQuery
- GCS
- Gcloud IAM
- Metabase
- Tableau
- Grafana
For providers that do not support direct log extraction, expose APIs so that access logs can be submitted to Guardian
Label resources to mark them as sensitive
Define a generic data model

Additional context
This issue can be closed once the following points are clearly answered:

Clarity on which providers support usage history extraction and which do not
Generic data model for how the usage history/activity logs will be persisted in Guardian
Approach for labeling a resource as sensitive
Clear approaches for collecting usage history
- From providers that support usage history extraction
- From providers that do not support usage history extraction
  - Expose an API to submit access logs
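
For the submission path, a rough sketch of what such an endpoint could look like is shown below. The payload fields, the `/activities` path, the `memoryStore`, and the handler name are assumptions for illustration only, not an existing Guardian API.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
	"sync"
	"time"
)

// SubmittedActivity is a hypothetical payload for externally submitted logs.
type SubmittedActivity struct {
	ProviderID string    `json:"provider_id"`
	ResourceID string    `json:"resource_id"`
	AccountID  string    `json:"account_id"`
	Action     string    `json:"action"`
	Type       string    `json:"type"`
	Timestamp  time.Time `json:"timestamp"`
}

// Validate rejects entries that cannot be attributed to a provider,
// resource, and account.
func (a SubmittedActivity) Validate() error {
	switch {
	case a.ProviderID == "":
		return errors.New("provider_id is required")
	case a.ResourceID == "":
		return errors.New("resource_id is required")
	case a.AccountID == "":
		return errors.New("account_id is required")
	}
	return nil
}

// memoryStore is an in-memory stand-in for real persistence.
type memoryStore struct {
	mu    sync.Mutex
	items []SubmittedActivity
}

func (s *memoryStore) add(batch []SubmittedActivity) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.items = append(s.items, batch...)
}

// submitHandler accepts a JSON array of activities via POST and stores
// the batch only if every entry is valid.
func submitHandler(s *memoryStore) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		var batch []SubmittedActivity
		if err := json.NewDecoder(r.Body).Decode(&batch); err != nil {
			http.Error(w, "invalid JSON body", http.StatusBadRequest)
			return
		}
		for i, a := range batch {
			if err := a.Validate(); err != nil {
				http.Error(w, fmt.Sprintf("activity %d: %v", i, err), http.StatusBadRequest)
				return
			}
		}
		s.add(batch)
		w.WriteHeader(http.StatusAccepted)
	}
}

func main() {
	store := &memoryStore{}
	body := `[{"provider_id":"p1","resource_id":"r1","account_id":"user@example.com",` +
		`"action":"read","type":"view","timestamp":"2022-09-02T12:00:00Z"}]`

	// Exercise the handler in-process without starting a server.
	req := httptest.NewRequest(http.MethodPost, "/activities", strings.NewReader(body))
	rec := httptest.NewRecorder()
	submitHandler(store)(rec, req)

	fmt.Println(rec.Code, len(store.items))
}
```

Rejecting the whole batch on the first invalid entry keeps the sketch simple; accepting partial batches would be an equally reasonable choice.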


rahmatrhd · 2022-09-02T04:28:57Z

import "time"

// ProviderActivity is a single activity (access log) entry collected from a provider.
type ProviderActivity struct {
	ID         string
	ProviderID string
	ResourceID string
	AccountID  string
	Timestamp  time.Time
	// Action correlates with grant role/permissions
	Action string // read | write | ...
	// Type is specific to the provider. It defines what kind of activity is being run
	Type     string // query | view | export | etc...
	Metadata map[string]interface{}
}

// example
bqQueryLog := ProviderActivity{
	ID:         "123",
	ProviderID: "<provider-id>",
	ResourceID: "<resource-id>",
	AccountID:  "user@example.com",
	Timestamp:  time.Date(2022, time.September, 2, 12, 0, 0, 0, time.UTC), // 2022-09-02T12:00:00Z
	Action:     "read",
	Type:       "query",
	Metadata:   map[string]interface{}{ /* ... */ },
}

I think this is what the activity log entity should look like, at a minimum.
Note: this is only a first proposal; I might update it in the future.

I will check which kinds of `Type` each provider supports in its activity logs.

mabdh added the access label Aug 23, 2022

mabdh mentioned this issue Aug 23, 2022

Data access monitoring #171

Open

mabdh assigned rahmatrhd Aug 23, 2022

mabdh changed the title ~~feat: collecting usage history (access logs) data~~ Data access monitoring: Collecting usage history (access logs) data Aug 23, 2022

mabdh added the roadmap label Aug 23, 2022

rahmatrhd added the enhancement New feature or request label Aug 25, 2022

ravisuhag changed the title ~~Data access monitoring: Collecting usage history (access logs) data~~ Collect usage history (access logs) data Sep 21, 2022

ravisuhag linked a pull request Nov 30, 2022 that will close this issue

feat: introduce provider activity #331

Merged

rahmatrhd closed this as completed in #331 Jan 3, 2023
