To discuss: Issues and prioritization for a '1.0' #62

design/1.0.md (97 additions)
I recently opened a Kubernetes Enhancement Proposal to introduce `go-flow-levee` during testing as a defense against accidental credential logging.
See [KEP-1933](https://github.com/kubernetes/enhancements/pull/1936) for details.
Such integration demands a stable API and well-defined scope.

I propose the following.
Portions may already be implemented, but are included here as a "big picture."

# 1.0 Targets

## Basic taint propagation detection

Given a spec defining sources and sinks (below), diagnostics are reported for
any direct call to a sink that includes a source. Additionally, we detect any
taint propagation from a source to a sink within the scope of a single function.

Analysis does not explicitly track across the use of reflection.

## Analysis Configuration

Configuration will consist of the following key components:

* Identification of sources, sinks, and sanitizers
* Identification of analysis scope, e.g. to skip third-party or testing packages.
* Specification of diagnostics to be ignored, e.g., known false positives

### Identification of sources, sinks, and sanitizers

Much of our work deals directly with `ssa.Value`s and `ssa.Instruction`s.
Whatever element a given value or instruction represents, the configuration
that identifies it should be as uniform as possible.
In that light, the specification should be structured as follows:

#### ValueSpecifier
A ValueSpecifier marks an `ssa.Value`, e.g., for identification as a source or as the safe result of a sanitizer.
Values should be specifiable via any or all of:

* Specifier name: For identification during reporting
* Type / Field: Package path and type name, optional field name
* Field tags
* Scope: local variable, free variable, or global
* Context: within a specific package or function call. See CallSpecifier below.
* Is a reference
* Is "reference-like," e.g., slices, maps, chans.
* Const value (e.g., `"PASSWORD"`)
* Argument position (for use below)
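
For concreteness, a pseudo-config sketch in yaml, mirroring the style used later in this thread; field names are illustrative, not a committed schema:

```yaml
Sources:
  ValueSpecifiers:
    - name: "credential fields"
      type:
        package: "path/to/my/pkg"
        typeName: "Credentials"
        field: "Password"
      scope: local
    - name: "password literals"
      constValue: "PASSWORD"
```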

#### CallSpecifier
A CallSpecifier marks an `*ssa.Call`, e.g., as a sink or sanitizer.
Calls should be specifiable via any or all of:

* Specifier name: For identification during reporting
* Symbol: package path, function name, optional receiver name.
**Contributor:**

Do we want to handle the following cases:

1. funcs stored in struct fields, e.g.

   ```go
   type MySinker struct {
       sinkFunc func(args ...interface{})
   }
   ```

2. funcs passed as function parameters:

   ```go
   func callSink(arg interface{}, sink func(...interface{})) {
       sink(arg)
   }
   ```

In theory, since funcs are first class in Go, the actual range of cases we could potentially handle is much wider, e.g. a user could put functions in a map[string]func(), store functions in local variables, etc.

There are some cases we already handle, e.g.:

```go
func Test(s core.Source) {
	f := core.Sink
	f(s) // a diagnostic is produced here
}
```

But it is quite easy to throw our analysis off in this case:

```go
func Test(s core.Source) {
	var f func(...interface{})
	if true {
		f = core.Sink
	}
	f(s) // no diagnostic
}
```

My opinion: I think we should handle funcs stored in struct fields, because that seems to be a fairly common pattern. The other cases seem more outlandish and seem like they would require some kind of cross-function or even whole-program analysis, so I think for a 1.0 they are out of scope.

**Collaborator Author:**

> My opinion: I think we should handle funcs stored in struct fields, because that seems to be a fairly common pattern. The other cases seem more outlandish and seem like they would require some kind of cross-function or even whole-program analysis, so I think for a 1.0 they are out of scope.

We currently rely on the static dispatch in our call matching. In effect, we cover only case (a) of the four kinds of Call described in the docs.

I think we should make a reasonable effort to cover all four cases. (b) seems achievable in our current framework, and (c) seems a bit out-of-scope. Dynamic dispatch in case (d) is probably a whole can of worms. We could rely on pointer.PointsTo to get a set of possible functions, but that predicates whole-program analysis.

I don't think the fact that a given func is stored in a struct field will make the problem any simpler. I'd be inclined to put this at a "post-1.0 but preferably 1.1" sort of target.

**Collaborator Author:**

Rereading this, I think well-defined specifiers catch both of these cases. It's just a difference of a sink being identified by a ValueSpecifier rather than a CallSpecifier. Pseudo-config in yaml for readability and ignoring implementation details:

```yaml
Sinks:
  ValueSpecifiers:
    - name: "MySinker matcher"
      symbol: "path/to/my/pkg.MySinker.sinkFunc"
    - name: "log wrapper"
      context:
        # This is a CallSpecifier
        symbol: "path/to/my/pkg.callSink"
      argPosition: 2
```

[edit:] We still have the aforementioned issue of currently only following statically-linked calls. I'm just saying I think the problem is a lot more tractable than I originally thought.

* Context: A CallSpecifier indicating scope, such as package or specific functions.
* Based on argument value and position (ValueSpecifier).
* Based on return values (ValueSpecifier).
**Contributor:**

I'm not sure it makes sense to identify a sink/sanitizer based on the return value. Could you give an example?

**Collaborator Author:**

I'm not sure either. I was just throwing stuff out there.

All the instances that come to my mind are specific runtime values, like a buffer writer returning non-zero bytes written. But we won't see that in SSA.

Probably best to scratch it from the list.


## Inference

Inference represents the scope of what a user does *not* need to configure.
For instance, we can currently detect getter methods and mark them as
propagating a marked field without requiring any explicit configuration.
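
A minimal sketch of the getter pattern (`Source` and `Sink` are stand-ins for configured definitions, not the analyzer's API):

```go
package main

import "fmt"

// Source is a stand-in for a configured source type.
type Source struct{ password string }

// Data is a getter; inference should mark its result as tainted
// without any explicit configuration.
func (s Source) Data() string { return s.password }

// Sink is a stand-in for a configured sink.
func Sink(args ...interface{}) { fmt.Println(args...) }

func main() {
	s := Source{password: "hunter2"}
	Sink(s.Data()) // a diagnostic would be reported here
}
```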

If a source type is aliased, the alias is properly identified as equivalent to the aliased source type.
**Contributor:**

For context, we already handle aliases, e.g. `type MySource = core.Source`, but not named types, e.g. `type MySource core.Source`. (Interestingly, in the short investigation I just did, the SSA was exactly the same for both cases.)

Do we want named types to also count as "aliases"? Playing devil's advocate: what if a user defines a named type `type SafeSource core.Source` and uses it to explicitly differentiate between sources that have and have not been properly sanitized?

**Collaborator Author:**

> Do we want named types to also count as "aliases"?

Opinion: I think we should include named types. The only difference between the two types would be the associated method set, which to me suggests that the kind of data stored is reasonably similar. Also, as mentioned below, I think it would be the easier way to handle the fact that field tags are naturally inherited in a way that a source specified by type path and name would not be.

Although...

> Playing devil's advocate: what if a user defines a named type `type SafeSource core.Source` and uses it to explicitly differentiate between sources that have and have not been properly sanitized?

I hadn't thought of that. I think I might prefer an edge-case false-positive to a systematic false-negative, though.

Also and alternatively: this could be enabled/disabled in configuration. And/or we could allow a user to include a safe-list for types that should not be considered sources, though that feels like it's getting a bit over-elaborate.

**Contributor:**

I think you make a good case for named type inference.

The scenario I proposed is sufficiently unlikely that we should probably wait until we actually hit it before we think too deeply about how to handle it.

Also, as a user I think I would expect a named type that aliases a Source to also be considered a Source. Of course, if it weren't, it wouldn't be a big deal: I would just need to specify it in the config. But if I forget to do that, then I might think everything is fine when in reality named type Sources are reaching sinks.


If `Source` is identified as a source type, collection types built on it (e.g., `[]Source`, `map[string]Source`, `chan Source`) should also be identified as source types.
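
For instance, with the same stand-in names as before, a collection of sources reaching a sink should be flagged:

```go
package main

import "fmt"

// Source is a stand-in for a configured source type.
type Source struct{ secret string }

// Sink is a stand-in for a configured sink.
func Sink(args ...interface{}) { fmt.Println(args...) }

func main() {
	// A slice of sources should itself be treated as a source.
	batch := []Source{{secret: "hunter2"}}
	Sink(batch) // a diagnostic would be reported here
}
```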

## Internal Process

Add benchmarks.
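
One possible shape for those benchmarks, sketched with a hypothetical stand-in (`analyzePackage`); the real benchmarks would run the analyzers against fixture packages. `testing.Benchmark` lets the sketch run outside `go test -bench`:

```go
package main

import (
	"fmt"
	"testing"
)

// analyzePackage is a hypothetical stand-in for one analyzer pass over
// a package; it just counts a subset of "functions" as findings.
func analyzePackage(funcs int) (findings int) {
	for i := 0; i < funcs; i++ {
		if i%7 == 0 {
			findings++
		}
	}
	return findings
}

func main() {
	// testing.Benchmark runs a benchmark function and reports timings.
	res := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			analyzePackage(1000)
		}
	})
	fmt.Println(res.N > 0)
}
```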

# Beyond a 1.0

The following seem like significant improvements, but are larger in scope than what constitutes a stable 1.0.
These items are open to interpretation and warrant discussion.

## Improved Inference
**Contributor:**

I assume propagation/sanitization involving collections would fall under here as well?

**Collaborator Author:**

Hmmm...

I think propagation in general should be part of 1.0, and I believe you've made strides to cover most collections and are working on the remaining?

I think propagation and sanitization of specific elements within a collection might be a Beyond 1.0 target.

**Contributor:**

Ah. Important distinction here. Yes, propagation/sanitization of specific elements is what I had in mind.

I think for now all that's left is chans.


* Detect partial sanitization of collections.
* Detect sanitization via zero-value.
* Explicit sanitization of `type Source struct{ data string }` is easily done via `s.data = ""`.
* Alternatively, a user may specify redaction placeholders, e.g. `"REDACTED"`, rather than requiring a zero value.
* A type defined from a source type (e.g., `type Foo core.Source`) should also be considered a source type.
* Note: as tags take part in the type identity of the underlying struct literal, field tags are "inherited" by types defined from existing types.
This may result in conflicting behavior between sources identified by tag and those identified by package path and name.
* A type which contains a known source type as a field, e.g. `type Wrapper struct { data core.Source }`, should itself be considered a source type, with the containing field marked.
* Current effort under the umbrella of *cross-function analysis* greatly refines inference of safety in scenarios that would currently produce a finding.
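
The zero-value and placeholder styles from the list above, as a stand-in sketch (a real analyzer would also have to verify the sanitizing write dominates the sink call):

```go
package main

import "fmt"

// Source is a stand-in for a configured source type.
type Source struct{ data string }

// Sink is a stand-in for a configured sink.
func Sink(args ...interface{}) { fmt.Println(args...) }

func main() {
	s := Source{data: "hunter2"}
	// Zero-value sanitization: after this write, s carries no secret.
	s.data = ""
	Sink(s) // ideally, no diagnostic here

	t := Source{data: "hunter2"}
	// Placeholder redaction: equally safe if "REDACTED" is configured
	// as an accepted redaction value.
	t.data = "REDACTED"
	Sink(t) // ideally, no diagnostic here either
}
```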

## Reporting

At present, reporting of Diagnostics is a generic `a source has reached a sink` message with source and sink positions included.
As we iterate to include wider inference and definitions of sources, sinks, cross-function analysis, etc., the path between a source and a sink can become ambiguous.
If much inference is targeted for a 1.0 release, reporting will need to be improved to remain meaningful.
PurelyApplied marked this conversation as resolved.
Show resolved Hide resolved

## Extensibility

A user may have highly specific considerations for the definition of a source/sink/sanitizer and may wish to implement custom specification code that interacts directly with `ssa` / `types` packages.
We should expose an extensible interface for this purpose.
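
A sketch of what such an extension point might look like; all names here are hypothetical. In the real analyzer a matcher would receive `ssa.Value` / `types.Type` values from the `ssa` and `types` packages, but a plain symbol string stands in so the sketch stays self-contained:

```go
package main

import (
	"fmt"
	"strings"
)

// SourceMatcher sketches a hypothetical extension point: user code
// decides whether a symbol should be treated as a source.
type SourceMatcher interface {
	IsSource(symbol string) bool
}

// prefixMatcher treats every symbol under one package path as a source.
type prefixMatcher struct{ prefix string }

func (m prefixMatcher) IsSource(symbol string) bool {
	return strings.HasPrefix(symbol, m.prefix)
}

func main() {
	var m SourceMatcher = prefixMatcher{prefix: "path/to/secrets."}
	fmt.Println(m.IsSource("path/to/secrets.Token"))
	fmt.Println(m.IsSource("fmt.Sprintf"))
}
```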