Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[backend] use representatives in distrib graphs, including 'restricted' case (#6319) #6323

Merged
merged 14 commits into from Apr 4, 2024

Conversation

labo-flg
Copy link
Member

@labo-flg labo-flg commented Mar 12, 2024

Proposed changes

  • use representative in all distribution queries, removing a lot of lengthy duplicated query code.
  • when distribution data gives restricted entities (as a side of a relationship), use 'restricted' representative
  • refactor frontend code as much as possible in this context

Related issues

Closes #6319

Further comments

A lot of changes but all the same.
In essence, I'm now using the representative field defined for every StixObject and StixRelationShip (and more, see opencti.graphql) in the frontend queries for Distribution widgets.
Backend side, distribution data will now send a dummy entity info, with "restricted" as representative, and no longer filter out these items (this is solving #6319).

Refactoring:

  • defaultValue and other related utility functions, used everywhere in the app, have been moved to a specific file defaultRepresentatives out of Graph.js
  • defaultValue and defaultSecondaryValue have been renamed to getMainRepresentative and getSecondaryRepresentative to avoid confusion with the settings default value concept.
  • getMainRepresentative and getSecondaryRepresentative first try to use the representative of the input object, then fallback to the previous possibilities
  • new hook useDistributionGraphData to factorize common code in distribution graphs (utilities to turn query data into chart props)

A lot of other places in the code still use lengthy queries with repeated ... on XXX instead of using representative { main }. I'm not addressing all of them here.

Ultimately, we shall be able to simplify getMainRepresentative and getSecondaryRepresentative to only use representative when possible, and other fields for the objects that do not define it (and a side question is... shall we define a representative for every object in the platform? that would make things lot easier).

Copy link

codecov bot commented Mar 12, 2024

Codecov Report

Attention: Patch coverage is 68.42105% with 12 lines in your changes are missing coverage. Please review.

Project coverage is 67.80%. Comparing base (7b0bcf3) to head (2c306f9).
Report is 22 commits behind head on master.

❗ Current head 2c306f9 differs from pull request most recent head cfea2e9. Consider uploading reports for the commit cfea2e9 to get more accurate results

Files Patch % Lines
...latform/opencti-graphql/src/database/middleware.js 71.87% 9 Missing ⚠️
...ncti-graphql/src/database/entity-representative.js 60.00% 2 Missing ⚠️
...ti-platform/opencti-graphql/src/database/engine.js 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6323      +/-   ##
==========================================
+ Coverage   67.69%   67.80%   +0.10%     
==========================================
  Files         532      532              
  Lines       65010    65101      +91     
  Branches     5445     5491      +46     
==========================================
+ Hits        44008    44139     +131     
+ Misses      21002    20962      -40     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@labo-flg labo-flg marked this pull request as draft March 13, 2024 09:25
@labo-flg
Copy link
Member Author

labo-flg commented Mar 14, 2024

If you ask for the 10 first element matching a relationship distribution on target/source, you get the first 10. If you see only 3 it means 7 of them are not accessible to you (due to markings for instance).
This issue is more about a confusing display, that might be perceived as buggy.

Showing the first 10 that YOU can see is not suitable : it means a dashboard is not presenting the same information to different users and that's not the point (dashboard are meant to be shared, etc.).

The solution we have in mind will require more work:

  • Change the relationshipDistribution query API, to return only what's required for display and not the fully resolved entities (id, entity_type, representative)
  • if user has no access over one of the entity, we put <restricted> as representative

This is a breaking change to the API but will allow us to simplify the query handling frontend side, where today we must make ... on fragments for each entity_type possible, with a magic function to deduce the label to display.

@labo-flg
Copy link
Member Author

If you ask for the 10 first element matching a relationship distribution on target/source, you get the first 10. If you see only 3 it means 7 of them are not accessible to you (due to markings for instance). This issue is more about a confusing display, that might be perceived as buggy.

Showing the first 10 that YOU can see is not suitable : it means a dashboard is not presenting the same information to different users and that's not the point (dashboard are meant to be shared, etc.).

The solution we have in mind will require more work:

  • Change the relationshipDistribution query API, to return only what's required for display and not the fully resolved entities (id, entity_type, representative)
  • if user has no access over one of the entity, we put <restricted> as representative

This is a breaking change to the API but will allow us to simplify the query handling frontend side, where today we must make ... on fragments for each entity_type possible, with a magic function to deduce the label to display.

This initial idea implies we update the API at a high level, namely the Distribution type in our graphql spec.
To limit the disruption, I suggest we actually use our current API without change: in fact, Distribution.entity can be null.
Let's assume that the item is not accessible when the entity is null, but we keep sending the computed stat for this object id.

@labo-flg
Copy link
Member Author

labo-flg commented Apr 2, 2024

This initial idea implies we update the API at a high level, namely the Distribution type in our graphql spec.
To limit the disruption, I suggest we actually use our current API without change: in fact, Distribution.entity can be null.
Let's assume that the item is not accessible when the entity is null, but we keep sending the computed stat for this object id.

In fact, no need to get around this problem... we need to use the representative even if it's a big refactoring. It's why it has been designed in the first place.

I'll use the representative in all queries and refactor to make it work. And use representative 'restricted' when needed.

@labo-flg labo-flg changed the title [backend] expand entity resolution when converting aggregate distribution (#6319) [backend]use representatives in distrib graphs, including 'restricted' case (#6319) Apr 2, 2024
@labo-flg labo-flg changed the title [backend]use representatives in distrib graphs, including 'restricted' case (#6319) [backend] use representatives in distrib graphs, including 'restricted' case (#6319) Apr 3, 2024
@labo-flg labo-flg marked this pull request as ready for review April 3, 2024 07:07
@lndrtrbn lndrtrbn self-requested a review April 3, 2024 07:18
@labo-flg
Copy link
Member Author

labo-flg commented Apr 3, 2024

Note that Observed Data now use the representative, which is NOT the observed value but the time range for this observation.
image

As It's a container, there might be several entities in it... I'm unsure how we can improve this.

@labo-flg labo-flg merged commit fc0366c into master Apr 4, 2024
5 of 6 checks passed
@labo-flg labo-flg deleted the issue/6319 branch April 4, 2024 06:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Relationship Distribution gives inconsistent results when changing the limit parameter
2 participants