From 8aa5d5375a806da71c7d9c2471f7e85595ef914c Mon Sep 17 00:00:00 2001
From: Sven Strittmatter <Sven.Strittmatter@iteratec.com>
Date: Wed, 25 Nov 2020 15:03:26 +0100
Subject: [PATCH 1/2] Proposing finding attributes hash

Signed-off-by: Sven Strittmatter <Sven.Strittmatter@iteratec.com>
---
 docs/adr/adr_0007.adoc | 47 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 47 insertions(+)
 create mode 100644 docs/adr/adr_0007.adoc
diff --git a/docs/adr/adr_0007.adoc b/docs/adr/adr_0007.adoc
new file mode 100644
index 0000000000..902c64ef6d
--- /dev/null
+++ b/docs/adr/adr_0007.adoc
@@ -0,0 +1,47 @@
+[[ADR-0007]]
+= ADR-0007: Proposal How to Mark Findings With Hashes to Find Duplicates
+
+[cols="h,d",grid=rows,frame=none,stripes=none,caption="Status",%autowidth]
+|====
+// Use one of the ADR status parameter based on status
+// Please add a cross reference link to the new ADR on 'superseded' ADR.
+// e.g.: {adr_suposed_by} <<ADR-0000>>
+| Status
+| PROPOSED
+
+| Date
+| 2020-11-25
+
+| Author(s)
+| Sven Strittmatter <Sven.Strittmatter@iteratec.com>
+// ...
+|====
+
+NOTE: The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in https://tools.ietf.org/html/rfc2119[RFC 2119].
+
+== Context
+
+We need a way to detect duplicate findings. One use case: once a finding has been accepted, we want to ignore the same finding in the future.
+
+=== Assumptions
+
+* The execution order of _hooks_ is unspecified.
+* Whether a finding's hash is a duplicate MUST NOT be stored or maintained in the _SCB_ S3 storage.
+* The _SCB_ MUST NOT remove findings: _read-write-hooks_ MAY alter them, but never delete or filter them out.
+** A _read-hook_ MAY decide not to store a finding in an external system.
+
+== Decision
+
+* We generate a hash over each finding so we can compare findings by the hash and identify duplicates.
+* This hash MUST be mutable and MAY be altered by _read-write-hooks_ because we do not want to introduce an exceptions to what a _read-write-hooks_ can alter.
+* The _parser_ MUST generate the initial hash of a finding over some attributes of it.
+** Each _scanner_ MUST have a default set of hashed attributes.
+** This set of hashed attributes MAY be overwritten.
+* Each _read-write-hooks_ MUST update the hash as last step because the _hook_ MAY changed a hashed attribute.
+
+We first implement the hashing step in the _parser_ behind a feature flag to evaluate this proposal.
+
+== Consequences
+
+* We don't need to introduce an ordering for the _read-write-hooks_.
+* The duplicate detection/handling MUST be done in another service with its own data storage because we have a stable hash not until the _read-hooks_ will beexecuted and these MUST NOT alter the data in _SCB_ itself. But the _read-hooks_ MAY decide to not store data into an external system.

From b894a978bdc29da093d120672890a9c9c647df97 Mon Sep 17 00:00:00 2001
From: Robert Seedorff <Robert.Seedorff@iteratec.com>
Date: Mon, 7 Dec 2020 09:50:30 +0100
Subject: [PATCH 2/2] Update adr_0007.adoc

---
 docs/adr/adr_0007.adoc | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/docs/adr/adr_0007.adoc b/docs/adr/adr_0007.adoc
index 902c64ef6d..bce6589803 100644
--- a/docs/adr/adr_0007.adoc
+++ b/docs/adr/adr_0007.adoc
@@ -32,10 +32,10 @@ We need the possibility to find duplicate findings. One use case is that we want
 
 == Decision
 
-* We generate a hash over each finding so we can compare findings by the hash and identify duplicates.
-* This hash MUST be mutable and MAY be altered by _read-write-hooks_ because we do not want to introduce an exceptions to what a _read-write-hooks_ can alter.
-* The _parser_ MUST generate the initial hash of a finding over some attributes of it.
-** Each _scanner_ MUST have a default set of hashed attributes.
+* We generate a hash for each finding so we can compare findings by the hash and identify duplicates.
+* This hash MUST be mutable and MAY be altered by _read-write-hooks_ because we don't want to introduce an exception to what a _read-write-hook_ can alter.
+* The _parser_ MUST generate the initial hash of a finding from some of its attributes (e.g. name, location, category, ...).
+** Each _scanner_ MUST define a default set of attributes used for the hashing.
 ** This set of hashed attributes MAY be overwritten.
 * Each _read-write-hooks_ MUST update the hash as last step because the _hook_ MAY changed a hashed attribute.
 
@@ -44,4 +44,4 @@ We implement the hashing step in the _parser_ first with feature flag to evaluat
 == Consequences
 
 * We don't need to introduce an ordering for the _read-write-hooks_.
-* The duplicate detection/handling MUST be done in another service with its own data storage because we have a stable hash not until the _read-hooks_ will beexecuted and these MUST NOT alter the data in _SCB_ itself. But the _read-hooks_ MAY decide to not store data into an external system.
+* The duplicate detection/handling MUST be done in another service with its own data storage. This is because we have no stable hash until the _read-hooks_ are executed, and these MUST NOT alter the data in _SCB_ itself. However, the _read-hooks_ MAY decide not to store data in an external system.
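
Reviewer note (not part of the patch): the Decision section could be sketched roughly as below. This is only an illustration under assumptions, not the proposed implementation. The ADR does not name a hash function, a serialization, or a field name, so SHA-256, canonical JSON, the `hash` key, the dict representation of a finding, and all function names here are hypothetical. The default attribute set simply reuses the examples given in the ADR (name, location, category).

```python
import hashlib
import json

# Hypothetical default attribute set for one scanner; the ADR only says that
# each scanner defines such a set and that it MAY be overwritten.
DEFAULT_HASHED_ATTRIBUTES = ("name", "location", "category")


def finding_hash(finding, hashed_attributes=DEFAULT_HASHED_ATTRIBUTES):
    """Return a SHA-256 hex digest over the selected attributes of a finding."""
    # Missing attributes are hashed as None so the selection stays explicit.
    selected = {key: finding.get(key) for key in sorted(hashed_attributes)}
    # Canonical JSON (sorted keys, fixed separators) keeps the digest stable.
    canonical = json.dumps(selected, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def parse(raw_findings, hashed_attributes=DEFAULT_HASHED_ATTRIBUTES):
    """Parser step: attach the initial hash to every finding."""
    return [{**f, "hash": finding_hash(f, hashed_attributes)} for f in raw_findings]


def lowercase_category_hook(finding, hashed_attributes=DEFAULT_HASHED_ATTRIBUTES):
    """Example read-write hook: alters a hashed attribute, then rehashes last."""
    updated = {**finding, "category": str(finding.get("category", "")).lower()}
    updated["hash"] = finding_hash(updated, hashed_attributes)
    return updated
```

In this sketch, two findings that differ only in attributes outside the hashed set receive the same hash, and a hook that changes a hashed attribute produces a new hash. That matches the stated consequence that the hash is not stable until all _read-write-hooks_ have run.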