[RFC] Asset Integrations & Entity Store RFC - Stage 2 #2233

Open
wants to merge 34 commits into
base: main

Contributor

@SourinPaul SourinPaul commented Jul 12, 2023

Stage1 checklist:

  • .yml files
  • Examples of Source Documents
  • Examples of Mappings
  • New concerns from stage1

Key change-log:

  • Given the security use cases and acknowledging overall PII exposure, I have removed the two user.phone.* fields from stage 1 of this proposal.

Including examples of source documents and mappings
Adding level keys
I am removing the 'phone' fields from the proposal to reduce the risk of PII exposure.
@SourinPaul SourinPaul marked this pull request as ready for review July 13, 2023 22:38
@SourinPaul SourinPaul requested a review from a team as a code owner July 13, 2023 22:38
@ebeahan ebeahan changed the title Stage1 2215 assetintegration rfc [RFC] Asset Integrations & Entity Store RFC - Stage 1 Jul 14, 2023
@ebeahan ebeahan added the RFC label Jul 14, 2023
Member

@ebeahan ebeahan left a comment

Left some initial comments on a first review. As a general comment, I suggest reviewing what can be used in the existing schema to avoid adding overlapping fields as much as possible.

For example, is user.profile.organization necessary when we have organization.name and organization.id fields? Reusing existing fields avoids adding more fields, and it also allows users to query across potentially other data sources that also populate the organization.* fields.

Resolved review threads (outdated): rfcs/text/0041-asset-integration.md (2), rfcs/text/0041/asset.yml (3), rfcs/text/0041/user.yml (5)
SourinPaul and others added 11 commits July 19, 2023 10:25
Co-authored-by: Eric Beahan <eric.beahan@elastic.co>
@SourinPaul
Contributor Author

Hey @SourinPaul - I'm back from leave and looking this over. I'd still like to have a discussion about whether security is ok with the concepts of:

  • asset.ean - elastic asset name, a unique URI identifier for this instance across all assets (composition subject to further refinement and change)
  • asset.parents, asset.children, asset.references: keyword lists of EAN values that map to other assets that this asset is related to, in simple parent/child/reference terms

@tommyers-elastic do we have any other requests?

@jasonrhodes @chrisdistasio and I met to review the above inputs and agreed on the following as the next steps:

  1. Exclude asset.ean, asset.parents, and asset.children from this RFC proposal to move the RFC forward.
  2. Use cases that require us to persist a composite entity identifier (in addition to asset.id) and entity relationship metadata are common to both Entity Analytics (in security) and service mapping or inventory (in o11y). We will continue to collaborate on introducing these fields in future RFCs.
  3. Once internal research and development work is complete across workstreams and we have confidence in the necessary fields, we will propose a subsequent RFC to extend the ECS schema.

With the above, I'll take the following next steps:

  • Capture resolution to the concerns in Stage 1
  • Request a review approval from @chrisdistasio for this PR to move RFC to Stage 2.

Please lmk if I'm overlooking anything.

cc: @ebeahan

Captured agreements to stage 1 concern.
@jasonrhodes
Member

LGTM 👍

@SourinPaul
Contributor Author

I have updated stage 1 RFC artifacts per my prior comment.

@jasonrhodes @chrisdistasio please give this PR a formal approval so @ebeahan can help move the RFC to stage 2. Thank you!

@SourinPaul
Contributor Author

@ebeahan, resurfacing this so we can move this to Stage 2. Thanks for your help!

@@ -1,7 +1,7 @@
# 0041: Asset Integration
<!-- Leave this ID at 0000. The ECS team will assign a unique, contiguous RFC number upon merging the initial stage of this RFC. -->

- Stage: **0 (strawperson)** <!-- Update to reflect target stage. See https://elastic.github.io/ecs/stages.html -->
- Stage: **1 (Draft)** <!-- Update to reflect target stage. See https://elastic.github.io/ecs/stages.html -->
Member

Are we still targeting stage 1 here @SourinPaul? You mentioned stage 2 in a different conversation.

Contributor Author

@ebeahan, thanks for the ping. We are currently in stage 1 and targeting stage 2.

Do I need to update this section? Please advise, or feel free to update before merging.

@ebeahan ebeahan requested a review from a team February 21, 2024 03:35
@ebeahan
Member

ebeahan commented Feb 22, 2024

Before merging, can @elastic/sec-deployment-and-devices and/or @trisch-me also review these changes?

@ebeahan ebeahan changed the title [RFC] Asset Integrations & Entity Store RFC - Stage 1 [RFC] Asset Integrations & Entity Store RFC - Stage 2 Feb 22, 2024
an infrastructure. These fields can be nested under other objects that
identifies an asset such as host, user, network, and cloud schemas.
reusable:
top_level: false
Member

Why should asset not be used on the top level?

Contributor

Probably because they are tied to the main object and don't have meaning on their own. But good question.

@tinnytintin10 tinnytintin10 Apr 16, 2024

@MikePaquette @oatkiller @jaredburgettelastic, do you have any insights on why asset should not be used at the top level?

IMO, the asset fieldset should and can be used at the top level. There are valid use cases where an asset needs to be represented independently of any specific context like host, user, network, or cloud. For example, consider an IT asset management system (some CMDB) that tracks all the assets in an organization. This includes not only physical assets like workstations and servers but also mobile devices and other assets that might not fit neatly into the host, user, network, or cloud fieldsets.

Also, based on our use cases, we often need to query or analyze assets across different contexts (e.g., find all assets owned by a specific user, regardless of whether they're associated with a host, network, or cloud). This would be easier to achieve if asset is a top-level field.

With this in mind, I propose that we use the asset fieldset at the top level. This would give us the flexibility to represent assets that are not directly associated with a host, user, network, or cloud environment.
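
To make the cross-context query argument concrete, here is a minimal sketch of the Elasticsearch query bodies involved, written as plain Python dictionaries. The field name `owner` and the list of parent fieldsets are assumptions for illustration only; the RFC may name or scope them differently.

```python
from typing import Any

# Assumption: an "owner" field exists on the asset fieldset, and asset is
# reused under these four parents. Both are illustrative, not from the RFC.
OWNER_FIELD = "owner"
PARENT_FIELDSETS = ["host", "user", "network", "cloud"]


def nested_owner_query(owner: str) -> dict[str, Any]:
    """Query needed when asset only exists nested under other fieldsets.

    Every parent has to be listed explicitly, so the query grows with each
    new fieldset that reuses asset.
    """
    return {
        "bool": {
            "should": [
                {"term": {f"{parent}.asset.{OWNER_FIELD}": owner}}
                for parent in PARENT_FIELDSETS
            ],
            "minimum_should_match": 1,
        }
    }


def top_level_owner_query(owner: str) -> dict[str, Any]:
    """Query needed when asset is available as a top-level fieldset."""
    return {"term": {f"asset.{OWNER_FIELD}": owner}}
```

With a top-level fieldset the query stays a single `term` clause no matter how many asset kinds are added later, which is the flexibility argued for above.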

@jaredburgettelastic jaredburgettelastic May 5, 2024

@tinnytintin10 agreed. Because of the open source nature of ECS, we should not limit ourselves to the vision of solution (Security & Observability) use cases only, because any Elastic user may want to map data into the asset schema in ways we haven't considered.

@norrietaylor
Member

norrietaylor commented Mar 7, 2024

@trisch-me, could you please review this from an Otel perspective? Should we be planning a contribution to Semantic Conventions?

@@ -0,0 +1,170 @@
---
- name: user.profile.id
Contributor

Are we not creating profile as a separate namespace because it is tailored to, and should only be used within, the user namespace?

multi_fields:
- type: text
example: first.last@elk.elastic.co
description: Array of additional user identities (usually email addresses).
Contributor

Shouldn't it have the following?

normalize:
  - array

level: extended
type: keyword
example: Regular
description: Further classification type for the user account.
Contributor

How does it correlate to just user.profile.type?
Should we add more detail to the description to resolve this question for ECS users?

type: keyword
example: US - Washington - Distributed
description: Assigned location for the user account.
- name: user.profile.mobile_phone
Contributor

I thought it was stated that these two fields would be removed?

level: extended
type: date
example: June 5, 2023 @ 18:25:57.000
description: Date account was activated.
Contributor

The date format should be stated in the description.
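
As a hedged illustration of why the format matters: the example value in the field definition reads like a UI display string, while Elasticsearch `date` fields accept ISO 8601 strings by default. The sketch below converts one to the other. The helper name and the assumption that the source timestamp is in UTC are mine, not something the RFC states.

```python
from datetime import datetime, timezone

# Display-style value copied from the example in the field definition.
DISPLAY_VALUE = "June 5, 2023 @ 18:25:57.000"
# strptime pattern matching that display style.
DISPLAY_FORMAT = "%B %d, %Y @ %H:%M:%S.%f"


def to_iso8601_utc(display_value: str) -> str:
    """Parse a display-style timestamp and render it as ISO 8601.

    Assumption: the timestamp is in UTC; the example does not say.
    """
    parsed = datetime.strptime(display_value, DISPLAY_FORMAT)
    return parsed.replace(tzinfo=timezone.utc).isoformat(timespec="milliseconds")


print(to_iso8601_utc(DISPLAY_VALUE))  # 2023-06-05T18:25:57.000+00:00
```

Stating the expected format in the description (and using an ISO 8601 example) would remove the need for this kind of guesswork by integration authors.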

@trisch-me
Contributor

From the OTel point of view, we don't have assets as a namespace there, and now is not a good time to start adding a new namespace.
For user, however, we can already start following our upcoming guidelines (first check OTel, then adopt in ECS), because there is an open PR for user. It is still under discussion and might take some time, so it shouldn't prevent us from continuing work in ECS. As soon as it's merged, we can add user.profile there if we think it should be in the next ECS releases and we need it for our product.

To open a PR there, we would need a use-case story explaining why we should have those fields.

@tinnytintin10 tinnytintin10 left a comment

@tehilashn @eyalkraft @oren-zohar @kfirpeled @romulets, I have taken a first look at some of the suggestions here that I believe will impact some of the experiences we want to build out. See my thoughts and feedback below. I will soon take a second pass-through of the remaining asset and user fields. In the meantime, please review and provide your thoughts.

- cloud
type: group
fields:
- name: category

@MikePaquette @oatkiller @jaredburgettelastic Have we considered establishing a more detailed taxonomy up-front for asset categorization, which will include a set of allowed values (like we do with the event fieldset) that could expand as needed?

A detailed taxonomy could significantly enhance the schema’s flexibility and precision. For instance, consider the following:

[Figure: Detailed Asset Taxonomy / Categorization Mind Map (view in Whimsical)]

| Field Name | Description |
| --- | --- |
| asset.category | The top-level classification of assets, reflecting the primary nature of the asset groups, like Software, Infrastructure, or Identity. See mind-map for more details |
| asset.subcategory | A division within a category that encompasses a range of assets sharing common characteristics, such as 'Applications' under 'Software', or 'Compute' and 'Storage' under 'Infrastructure'. See mind-map for more details |
| asset.type | A specific classification of assets within a subcategory, characterizing the primary function or purpose, like 'Operating System' or 'Development Tools' under the 'Applications' subcategory. See mind-map for more details |
| asset.subtype | The most granular classification that provides specific details about the asset, including the exact version, model, or configuration, such as 'Microsoft Windows 10' or 'Ubuntu Server 20.04' under the 'Operating System' type, or 'AWS EC2' under the 'Virtual Machine' type. See mind-map for more details |

This level of granularity/detail in our schema/taxonomy will enable us to deliver experiences tailored to an asset's specific classification—like specific security suggestions for workstations versus cloud storage vs cloud compute assets (i.e., allow our product to behave more intelligently). Precise asset classification will also enable us to provide sophisticated billing models in our serverless offering by allowing us to track assets more accurately (in a transparent and explainable way we can expose to users for billing). And of course, as new technologies emerge that we need to track, this schema can evolve without overhauling the existing framework, ensuring the product adapts to future developments.

That being said, over-flexibility at higher classification levels could lead to inconsistency and an unpredictable product experience. Therefore, IMO for asset.category and asset.subcategory, a defined set of allowed values (similar to what we do with event categorization fields) is crucial to maintain a standardized, navigable, and intuitive interface. This will ensure that as users interact with different assets, the experience remains coherent and aligned with the overall experience we aim to provide:

| Field Name | Allowed Values | Rationale |
| --- | --- | --- |
| asset.category | Yes | Ensures consistency and predictability at the highest level of asset classification. This aids in defining standard operational procedures and analytics across the organization. |
| asset.subcategory | Yes | Provides a controlled expansion of categories, ensuring relevant details are captured while maintaining uniformity in data segmentation for reliable analysis and reporting. |
| asset.type | No | Allows for specific and varied asset identification within subcategories, accommodating unique and diverse organizational assets without modifying the overarching schema. |
| asset.subtype | No | Permits detailed and nuanced asset distinctions, reflecting the granular variations and characteristics specific to the asset types for in-depth management and tracking. |
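
The split between allowlisted and free-form levels described above could be checked along these lines. This is only a sketch: the allowed values are limited to the examples named in this comment (the full taxonomy lives in the mind map and is not reproduced here), and the function name is hypothetical.

```python
# Assumption: only the example values mentioned in the comment are listed.
# Subcategories for "Identity" are not given in the text, so it is left empty.
ALLOWED_SUBCATEGORIES: dict[str, set[str]] = {
    "Software": {"Applications"},
    "Infrastructure": {"Compute", "Storage"},
    "Identity": set(),
}


def validate_asset_classification(asset: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means the asset is valid.

    asset.category and asset.subcategory are checked against allowed values.
    asset.type and asset.subtype are deliberately left free-form.
    """
    errors: list[str] = []
    category = asset.get("category")
    subcategory = asset.get("subcategory")

    if category not in ALLOWED_SUBCATEGORIES:
        errors.append(f"asset.category {category!r} is not an allowed value")
    elif subcategory is not None and subcategory not in ALLOWED_SUBCATEGORIES[category]:
        errors.append(
            f"asset.subcategory {subcategory!r} is not allowed under {category!r}"
        )
    return errors
```

For example, an asset with category `Infrastructure`, subcategory `Compute`, type `Virtual Machine`, and subtype `AWS EC2` passes, while any category outside the allowlist is rejected regardless of its lower-level values.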

FYI @dimadavid @r4zr32d3k1l, as I believe decisions here will have an impact on our ultimate UX for the asset inventory experience we want to build out

@jaredburgettelastic jaredburgettelastic May 5, 2024

In general, I agree with the sentiment of "establishing a more detailed taxonomy" 👍

As for the fields, I wonder if we should instead leverage the existing field names for categorization that ECS already provides. These are (in largest-to-smallest bucket order): kind, category, type, outcome, and can be found here.

The only one of those that doesn't fit the use case of assets would be outcome, and therefore it may be appropriate to incorporate part of your proposal, leaving us with kind, category, type, and subtype.

As for whether each categorization bucket has a list of allowed values, I'm curious to hear the thoughts of others. I believe I agree with your assessment that the highest values should be allowlisted (in my proposal, that would be kind and category, while in yours that would be category and subcategory). The counterpoint, though, is that I believe the primary reason events categorization had allowlisted values is because events data can be seen as metadata, while asset data is domain-specific, and we don't know every desired domain up front. If we went with the taxonomy currently defined in the linked mind-map, a new ECS RFC would have to be created to update those allowed values if an Elastic user wanted to begin considering their production cutting machine (as a random example) as an asset, and have that data be ECS-compliant. And maybe that's fine! But food for thought. Would the alternative, that solutions define and standardize on their own allowlisted values for these fields and ECS remains agnostic to them, be a more desirable approach?

type: keyword
example: sourin.paul@elastic.co
description: The primary user entity who owns the 'Host' asset
- name: priority

@oatkiller @jaredburgettelastic, is there a clear distinction between asset priority and asset criticality? If so, what's the difference? See below:

  - name: priority
    level: extended
    type: keyword
    example: Priority 1
    description: A priority classification for the asset obtained from outside this
      system, such as from external CMDB or Directory service.
  - name: criticality
    level: extended
    type: keyword
    example: Critical
    description: A business criticality classification assigned to the asset.

The only difference seems to be where the context (how important this asset is to the organization/security team) came from, either from an external source or natively provided in our system. I don't think this warrants separate mapping and I also think it will be confusing for users.

cc @r4zr32d3k1l @dimadavid

example: workstation
description: "A sub-classification of assets. Possible values for host assets:
workstation, S3,Compute. Possible values for host assets: (NULL/ TBD)"
- name: id

Should we include multiple forms of asset identifiers (IDs) in the schema to capture both an asset's native/external ID (such as the ARN for an AWS resource) and a unique internal identifier (some UUID or a hash value of some sort we come up with) to account for assets that might have native/external IDs that could lead to collisions or lack of uniqueness? I could see this being an issue with k8s components, some IPs we have limited metadata on, etc.

This is a technical consideration, so I don't have a strong opinion one way or the other. However, we should keep two things in mind:

  1. If we end up going with only one ID value, we should ensure (where possible/relevant) we use an asset's native ID (like an AWS asset's ARN) as the value for this field.
  2. If we do end up going with a dual ID approach (an internal UUID and native ID), then we should capture these two data points separately in a clear/easy-to-understand field. Ideally, from a UX perspective, we don't surface this UUID but instead, the ID the user is more familiar with (native ID).

@jaredburgettelastic jaredburgettelastic May 7, 2024

Experience and external standards suggest that just a singular "ID" field, without context, is not enough when IDs are exposed beyond some internal bound. With that said, I'd go even further: I don't feel we should only limit ourselves to "native"/"internal" or surrogate/non-surrogate IDs.

Some concrete examples to make this case:

  • Microsoft uses SIDs for Windows Server, which is made up of not only an identifier, but also what authority assigned that identifier (the authority being the "context" mentioned above).
  • AWS EC2 instances have an ARN, but in most AWS APIs you don't use the ARN to identify that EC2 instance. Instead, you use its "Instance ID", which would look something like "i-01e8de571c1ea7903". But you can compute the ARN if you have more information on where that instance is hosted (provided the region, account id, and instance ID). And both of these are useful depending on the context, as you'd need the ARN if you were writing an AWS policy, but the instance ID if you were trying to stop the instance. So ID context is helpful (in this case, the identifier type such as "instance ID" vs "ARN", is the "context" mentioned above).

Although prior art in ECS doesn't match this (such as the ECS host.id field), my recommendation would be for asset.id to represent something like an array of identifiers, or subfields, having context data, such as (at least) a type. An example document might look like:

{
  "asset": {
    "id": {
      "arn": "arn:aws:ec2:us-east-1:123456789012:instance/i-01e8de571c1ea7903",
      "instance_id": "i-01e8de571c1ea7903"
    }
  }
}

or even

{
  "asset": {
    "ids": [
      {
        "id_type": "arn",
        "id_value": "arn:aws:sns:us-east-1:123456789012:example-sns-topic-name"
      }
    ] 
  }
}
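
The point that an ARN can be computed from the instance ID plus its hosting context, and the two candidate document shapes, can be sketched as follows. Function names are hypothetical and the `ids`/`id_type`/`id_value` keys simply mirror the examples above. Note that EC2 instance ARNs carry an `instance/` resource-type prefix before the instance ID.

```python
from typing import Any, Optional


def ec2_instance_arn(region: str, account_id: str, instance_id: str,
                     partition: str = "aws") -> str:
    """Build an EC2 instance ARN from its hosting context."""
    return f"arn:{partition}:ec2:{region}:{account_id}:instance/{instance_id}"


def as_subfields(ids: dict[str, str]) -> dict[str, Any]:
    """Shape 1: one subfield per identifier type under asset.id."""
    return {"asset": {"id": dict(ids)}}


def as_typed_array(ids: dict[str, str]) -> dict[str, Any]:
    """Shape 2: an array of {id_type, id_value} objects under asset.ids."""
    return {
        "asset": {
            "ids": [{"id_type": t, "id_value": v} for t, v in ids.items()]
        }
    }


def find_id(document: dict[str, Any], id_type: str) -> Optional[str]:
    """Look up one identifier by type in the typed-array shape."""
    for entry in document["asset"]["ids"]:
        if entry["id_type"] == id_type:
            return entry["id_value"]
    return None


instance_id = "i-01e8de571c1ea7903"
ids = {
    "arn": ec2_instance_arn("us-east-1", "123456789012", instance_id),
    "instance_id": instance_id,
}
```

The subfield shape makes each identifier directly addressable by field name, while the array shape accepts identifier types that were not anticipated in the schema, at the cost of a lookup like `find_id`.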

Member

On the one hand, I like how rich it becomes. On the other hand, I wonder how easy it would be to correlate data.

For example, how easy would it be to write an ES|QL query joining documents?

<!--
Stage 1: If the changes include field additions or modifications, please create a folder titled as the RFC number under rfcs/text/. This will be where proposed schema changes as standalone YAML files or extended example mappings and larger source documents will go as the RFC is iterated upon.
-->

This proposal extends the existing ECS field set to store inventory metadata for hosts and users from external application repositories. Using ECS to store such fields will improve metadata querying and retrieval across various use cases.

This description primarily focuses on hosts and users. Have we considered the benefits of expanding the asset schema to encompass a broader range of assets from the start?

An asset schema that enables us to capture a broader range of assets is essential for developing new asset-centric features and experiences, such as the proposed asset inventory experience/workflow.

An extensible schema will also be crucial for specific use cases like our proposed enhancements to SIEM to enable better Cloud Detection and Response (CDR), where accurately modeling and representing a diverse array of cloud assets beyond hosts and users is a crucial requirement for the experiences we want to deliver. For example, today, in our SIEM, we have threat detection rules for AWS and other CSPs that detect malicious activity related to non-host and user entities. When these detection rules trigger, our current alert flyout will only highlight host and user assets as being present, even though the detection rule explicitly mentions other assets too (ex., RDS database, SecurityGroup, etc.); having a standardized schema is one of the first steps in addressing enhancements like this.

```


#### AzureAD Hosts

@oatkiller @jaredburgettelastic do you have any sample AzureAD data on hand? If so, can you help out with this section or should we drop it?

@tinnytintin10

@jaredburgettelastic @oatkiller, While working on concrete examples for my ask in this epic, I realized there might be an issue with using the term "asset" to model the wide variety of entities present in our logs.

The core concern is that "asset" implies ownership or inherent value to the user, which isn't always the case for some resources in the logs. For example, a network flow log or cloud trail log might reference an external IP address belonging to a third party or even a malicious actor. Modeling these with the "asset" field set could be misleading and cause confusion. In short, while every asset is an entity, not all entities are assets.

Instead, I propose that we consistently use the term "entity" across our entire security solution, from the data model being proposed here to the in-product experiences (e.g., "entity inventory" instead of "asset inventory"). The term "entity" is more versatile and can accurately represent both user/organization-owned resources and external resources that might appear in logs but aren't necessarily owned by the organization/user.

This change would provide several key benefits:

  • Clarity: By using "entity" throughout, we avoid the ownership ambiguity of "asset" and present a clearer, more accurate representation of the data being modeled. The inventory experience we build will provide the curated experiences users need to manage the resources that belong to them.
  • Consistency: As we're committed to the name "entity analytics" for our risk analytics capability, pairing that with an "asset inventory" could be confusing.
  • Alignment: We've been grappling with the "asset" vs. "entity" question, often starting sentences with "asset/entity." I think a unified term will foster better cross-team collaboration and data correlation.

My ask for you:

  • Does my underlying point make sense? (i.e., it could be misleading to model all resources using the term "asset")
  • Is swapping the field set name to entity feasible?
  • If swapping the name isn't feasible/possible at this stage, how would you propose we tackle the issue I highlighted?

@jaredburgettelastic

jaredburgettelastic commented May 22, 2024

@tinnytintin10 thank you for the input!

@tommyers-elastic on the Observability Solution mentioned that they are also shifting away from the term "assets" and toward "entities", so this aligns well from that perspective. However, I'm unsure of:

  • whether there are any Observability features or integrations already implemented that align with this RFC in its current state
  • whether the future plans for Observability are shifting completely away from fields defined as asset. This schema defined in their most recent implementation suggests that they are indeed moving away from that term. However, note that that schema does not adhere to this RFC more generally.

@tommyers-elastic could you provide input from the perspective of Observability?

I know of only one feature in the Security Solution today that adheres to the asset portion of this RFC, which is "Asset Criticality". There are two pieces to this: one is an index that stores asset criticality information, and the other is an enrichment on alert documents that adds the host/user.asset.criticality field to each alert if one is assigned for that host/user.

The asset criticality feature is currently behind an advanced setting, so there could be wiggle room there in our ability to change the data structures. However, from a nomenclature perspective, "Asset Criticality" explicitly applies to resources that are owned by the customer/Elastic Security user, because they must explicitly assign those classifications. I don't know if our product partners have a desire to change that nomenclature in the Security Solution platform. @MikePaquette could you please provide input from a product perspective on this portion?

cc @oatkiller
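
For readers unfamiliar with the asset criticality enrichment described above, a rough sketch of the idea follows. It is not the Security Solution's implementation: the in-memory lookup keyed by entity type and name, the criticality values, and the function name are all assumptions made for illustration.

```python
import copy
from typing import Any

# Hypothetical stand-in for the index that stores asset criticality records.
CRITICALITY_BY_ENTITY: dict[tuple[str, str], str] = {
    ("host", "web-01"): "Critical",
}


def enrich_alert(alert: dict[str, Any],
                 criticality_by_entity: dict[tuple[str, str], str]) -> dict[str, Any]:
    """Return a copy of the alert with host/user.asset.criticality added.

    The field is only set when a criticality has been assigned to that
    host or user; otherwise the alert is left untouched for that entity.
    """
    enriched = copy.deepcopy(alert)
    for entity_type in ("host", "user"):
        entity = enriched.get(entity_type)
        if not entity or "name" not in entity:
            continue
        level = criticality_by_entity.get((entity_type, entity["name"]))
        if level is not None:
            entity.setdefault("asset", {})["criticality"] = level
    return enriched
```

Because the enrichment writes to `host.asset.*` and `user.asset.*`, renaming the fieldset to `entity` would change the shape of these alert documents, which is the compatibility question raised above.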

@oren-zohar

@tinnytintin10, I completely agree with your underlying point that using the term "asset" to model the wide variety of entities in our logs can be misleading, as it implies ownership or inherent value to the user, which isn't always the case.

However, I suggest keeping "Asset Inventory" as is. The asset inventory integration focuses on managed entities, aligning with your definition of "assets." Additionally, the collected assets could still be indexed into an "entity" ECS, which makes sense in my opinion. This distinction helps maintain clarity and consistency for user/organization-owned resources.

What do you think about this?

@lauravoicu

@tinnytintin10 you are probably aware of this, but to add to @oren-zohar's point from an InfoSec perspective: in InfoSec we are using the term asset, and this comes from the ISO definition of an asset: "An asset is an item, thing or entity that has potential or actual value to an organisation" (although other definitions of an asset would imply ownership, this one does not address ownership explicitly). I acknowledge terminology is extremely important, and there are of course many different inputs to this, hope this helps!

@chrisdistasio

@jaredburgettelastic the o11y team has chosen to use "entity" (instead of asset). I'm not aware of any o11y feature(s) that use asset in the context of this RFC.

cc: @tommyers-elastic
