From 8aa5d5375a806da71c7d9c2471f7e85595ef914c Mon Sep 17 00:00:00 2001
From: Sven Strittmatter
Date: Wed, 25 Nov 2020 15:03:26 +0100
Subject: [PATCH 1/2] Proposing finding attributes hash

Signed-off-by: Sven Strittmatter
---
 docs/adr/adr_0007.adoc | 47 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 47 insertions(+)
 create mode 100644 docs/adr/adr_0007.adoc

diff --git a/docs/adr/adr_0007.adoc b/docs/adr/adr_0007.adoc
new file mode 100644
index 0000000000..902c64ef6d
--- /dev/null
+++ b/docs/adr/adr_0007.adoc
@@ -0,0 +1,47 @@
+[[ADR-0007]]
+= ADR-0007: Proposal How to Mark Findings With Hashes to Find Duplicates
+
+[cols="h,d",grid=rows,frame=none,stripes=none,caption="Status",%autowidth]
+|====
+// Use one of the ADR status parameter based on status
+// Please add a cross reference link to the new ADR on 'superseded' ADR.
+// e.g.: {adr_suposed_by} <>
+| Status
+| PROPOSED
+
+| Date
+| 2020-11-25
+
+| Author(s)
+| Sven Strittmatter
+// ...
+|====
+
+NOTE: The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in https://tools.ietf.org/html/rfc2119[RFC 2119].
+
+== Context
+
+We need the possibility to find duplicate findings. One use case is that we want to accept a finding and want to ignore the same finding in the future.
+
+=== Assumptions
+
+* The execution order of _hooks_ is unspecified.
+* The information if a finding's hash is a duplicate MUST NOT be stored or maintained in the _SCB_ S3 storage.
+* The _SCB_ MUST NOT remove findings: _read-write-hooks_ may alter them, but never delete or filter them out.
+** Maybe a _read-hooks_ MAY decide to not store a finding into an external system.
+
+== Decision
+
+* We generate a hash over each finding so we can compare findings by the hash and identify duplicates.
+* This hash MUST be mutable and MAY be altered by _read-write-hooks_ because we do not want to introduce an exceptions to what a _read-write-hooks_ can alter.
+* The _parser_ MUST generate the initial hash of a finding over some attributes of it.
+** Each _scanner_ MUST have a default set of hashed attributes.
+** This set of hashed attributes MAY be overwritten.
+* Each _read-write-hooks_ MUST update the hash as last step because the _hook_ MAY changed a hashed attribute.
+
+We implement the hashing step in the _parser_ first with feature flag to evaluate this proposal.
+
+== Consequences
+
+* We don't need to introduce an ordering for the _read-write-hooks_.
+* The duplicate detection/handling MUST be done in another service with its own data storage because we have a stable hash not until the _read-hooks_ will beexecuted and these MUST NOT alter the data in _SCB_ itself. But the _read-hooks_ MAY decide to not store data into an external system.

From b894a978bdc29da093d120672890a9c9c647df97 Mon Sep 17 00:00:00 2001
From: Robert Seedorff
Date: Mon, 7 Dec 2020 09:50:30 +0100
Subject: [PATCH 2/2] Update adr_0007.adoc

---
 docs/adr/adr_0007.adoc | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/docs/adr/adr_0007.adoc b/docs/adr/adr_0007.adoc
index 902c64ef6d..bce6589803 100644
--- a/docs/adr/adr_0007.adoc
+++ b/docs/adr/adr_0007.adoc
@@ -32,10 +32,10 @@ We need the possibility to find duplicate findings. One use case is that we want
 
 == Decision
 
-* We generate a hash over each finding so we can compare findings by the hash and identify duplicates.
-* This hash MUST be mutable and MAY be altered by _read-write-hooks_ because we do not want to introduce an exceptions to what a _read-write-hooks_ can alter.
-* The _parser_ MUST generate the initial hash of a finding over some attributes of it.
-** Each _scanner_ MUST have a default set of hashed attributes.
+* We generate a hash for each finding so we can compare findings by the hash and identify duplicates.
+* This hash MUST be mutable and MAY be altered by _read-write-hooks_ because we don't want to introduce an exception to what a _read-write-hooks_ can alter.
+* The _parser_ MUST generate the initial hash of a finding from some of its attributes (e.g. name, location, category, ...).
+** Each _scanner_ MUST define a default set of attributes used for the hashing.
 ** This set of hashed attributes MAY be overwritten.
 * Each _read-write-hooks_ MUST update the hash as last step because the _hook_ MAY changed a hashed attribute.
 
@@ -44,4 +44,4 @@ We implement the hashing step in the _parser_ first with feature flag to evaluat
 == Consequences
 
 * We don't need to introduce an ordering for the _read-write-hooks_.
-* The duplicate detection/handling MUST be done in another service with its own data storage because we have a stable hash not until the _read-hooks_ will beexecuted and these MUST NOT alter the data in _SCB_ itself. But the _read-hooks_ MAY decide to not store data into an external system.
+* The duplicate detection/handling MUST be done in another service with its own data storage. This is because we have no stable hash until the _read-hooks_ are executed, and these MUST NOT alter the data in _SCB_ itself. But the _read-hooks_ MAY decide to not store data into an external system.
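
The Decision section of the ADR can be illustrated with a minimal sketch. It is not part of either patch and is not the project's implementation. It assumes SHA-256 over a canonical JSON serialization, a finding field called `hash`, and a default attribute set taken from the examples the ADR names (name, location, category). The function names `finding_hash` and `run_read_write_hook` are hypothetical.

```python
import hashlib
import json

# Hypothetical default attribute set for one scanner; the ADR only lists
# "name, location, category" as examples and lets this set be overwritten.
DEFAULT_HASHED_ATTRIBUTES = ("name", "location", "category")


def finding_hash(finding, attributes=DEFAULT_HASHED_ATTRIBUTES):
    """Return a SHA-256 hex digest over the selected attributes of a finding."""
    # Attributes missing from the finding are serialized as null.
    selected = {key: finding.get(key) for key in attributes}
    # Sorted keys and fixed separators give a canonical serialization, so the
    # digest does not depend on the key order of the input dict.
    canonical = json.dumps(selected, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def run_read_write_hook(finding, hook, attributes=DEFAULT_HASHED_ATTRIBUTES):
    """Apply a hook to a copy of the finding, then refresh the hash as the last step."""
    updated = hook(dict(finding))
    # The hook may have changed a hashed attribute, so the hash is recomputed.
    updated["hash"] = finding_hash(updated, attributes)
    return updated
```

Because every hook recomputes the hash after its own changes, the result does not depend on the order in which hooks run. It also means the hash is only final once the last read-write hook has finished, which is the reason the ADR places duplicate detection in a separate service.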