Skip to content

Commit

Permalink
Merge pull request #436 from pcattori/adr
Browse files Browse the repository at this point in the history
Start tracking Architectural Decisions (ADs)
  • Loading branch information
pcattori committed Aug 17, 2020
2 parents 28b8a75 + 87f82ec commit 4ec7db6
Show file tree
Hide file tree
Showing 13 changed files with 351 additions and 21 deletions.
1 change: 1 addition & 0 deletions .adr-dir
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
docs/contributor-guide/adr/
2 changes: 1 addition & 1 deletion docs/contributor-guide.md
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,7 @@ Submit feature requests as [Github issues](https://github.com/Datatamer/tamr-cli
* [Install the codebase](contributor-guide/install)
* [Run dev tasks](contributor-guide/dev-tasks)
* [Configure your text editor](contributor-guide/text-editor)
* [Read the style guide](contributor-guide/style-guide)
* [Read the ADRs](contributor-guide/adrs)
* [How to write tests](contributor-guide/how-to-write-tests)
* [Submit a pull request](contributor-guide/pull-request)

Expand Down
19 changes: 19 additions & 0 deletions docs/contributor-guide/adr/0001-record-architecture-decisions.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
# 1. Record architecture decisions

Date: 2020-08-14

## Status

Accepted

## Context

We need to record the architectural decisions made on this project.

## Decision

We will use Architecture Decision Records, as [described by Michael Nygard](http://thinkrelevance.com/blog/2011/11/15/documenting-architecture-decisions).

## Consequences

See Michael Nygard's article, linked above. For a lightweight ADR toolset, see Nat Pryce's [adr-tools](https://github.com/npryce/adr-tools).
31 changes: 31 additions & 0 deletions docs/contributor-guide/adr/0002-linting-and-formatting.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,31 @@
# 2. Linting and formatting

Date: 2019-01-14

## Status

Accepted

## Context

Inconsistent code formatting slows down development and the review process.

Code should be linted for things like:
- unused imports and variables
- consistent import order

Code formatting should be done automatically or programmatically, taking the burden off of reviewers.

## Decision

For linting, use [flake8](https://flake8.pycqa.org/en/latest/) and [flake8-import-order](https://github.com/PyCQA/flake8-import-order).

For formatting, use [black](https://github.com/psf/black).

## Consequences

All linting and formatting are enforced programmatically.

Most linting and formatting errors can be autofixed.

Text editors and IDEs are able to integrate with our linting and formattings tools to automatically fix (most) errors on save.
31 changes: 31 additions & 0 deletions docs/contributor-guide/adr/0003-reproducibility.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,31 @@
# 3. Reproducibility

Date: 2019-06-05

## Status

Accepted

## Context

Reproducing results from a program is challenging when operating systems, language versions, and dependency versions can vary.

For this codebase, we will focus on consistent Python versions and dependency versions.

## Decision

Manage multiple Python versions via [pyenv](https://github.com/pyenv/pyenv).

Manage dependencies via [poetry](https://python-poetry.org/).

Define tests via [nox](https://nox.thea.codes/en/stable/).

Run tests in automation/CI via [Github Actions](https://github.com/features/actions).

## Consequences

This solution lets us:
- keep track of [abstract *and* concrete versions](https://caremad.io/posts/2013/07/setup-vs-requirement/) for dependencies (think `.lock` file)
- locally test against multiple Python versions
- run the same tests locally as we do in [Continuous Integration](https://en.wikipedia.org/wiki/Continuous_integration) (CI)
- easily view CI test results within the review context
39 changes: 39 additions & 0 deletions docs/contributor-guide/adr/0004-documentation-and-docstrings.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,39 @@
# 4. Documentation and docstrings

Date: 2019-10-03

## Status

Accepted

## Context

Documentation can take four forms:
1. Explanation
2. Tutorial
3. How-to
4. Reference

We need a way to author and host prosey documentation and generate reference docs based on source code.

## Decision

Doc compilation will be done via [sphinx](https://www.sphinx-doc.org/en/master/).

Prosey documentation (1-3) via [recommonmark](https://github.com/readthedocs/recommonmark).

Reference documentation (4) will be generated based on type annotations and docstrings via:
- Automatic docs based on docstrings via [sphinx-autodoc](https://www.sphinx-doc.org/en/master/usage/extensions/autodoc.html), [sphinx-autodoc-typehints](https://github.com/agronholm/sphinx-autodoc-typehints)
- Google-style docstrings via [napoleon](https://www.sphinx-doc.org/en/master/usage/extensions/napoleon.html)
- Hosting on [ReadTheDocs](https://readthedocs.org/) (RTD)
- Build docs in CI and fail on errors or warnings.

## Consequences

Prosey documentation can be written in Markdown (.md), which is more familiar to our contributors than .rst format.

Reference doc generation makes docs more maintainable and consistent with actual code.

Google-style docstrings are easier to read than sphinx-style docstrings.

RTD natively compiles documentation using sphinx and simultaneously hosts docs at each version.
69 changes: 69 additions & 0 deletions docs/contributor-guide/adr/0005-composable-functions.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,69 @@
# 5. Composable functions

Date: 2019-11-01

## Status

Accepted

## Context

We need a reasonable tradeoff between ease-of-use and maintainability.

Specifically, we need composable, combinable units that can be improved independently.

### Approach 1: Classes + Methods

One approach is to embrace Object-Oriented Programming (OOP) with fluent interfaces (i.e. method chaining):

```python
project
.create(...)
.update(...)
.delete(...)
```

Characteristics:
- Ease-of-use is maximized, but this requires each method to `return self`.
- Also, this approach implies that if a function can be called with X different object types,
each of those object types should have a corresponding method that applies that functionality and then `return self`.

How to enforce these characteristics?

Any solution will be a tax on maintainability, as code that adheres to these characteristics will include many non-semantic lines simply going through the motions of `return self` and copying function usage into dedicated methods for each class.

### Approach 2: Types + Functions

Another approach is to embrace a functional programming style: simple types and functions (no methods).

Usage is not as terse as for OOP:

```python
p = tc.project.create(...)
u = tc.project.update(p, ...)
d = tc.project.delete(p, ...)
```

Characteristics:
- Ease-of-use is not optimized, but still reasonable.
- With tab-completion, ease-of-use is comparable to OOP.
- Each type can be made immutable
- Each function can be made pure
- Functionality can be shared by calling the same function in user-land, not copying function calls in contributor-land.

## Decision

Use `@dataclass(frozen=True)` to model types and plain Python modules and functions to capture business logic.

## Consequences

Immutable types and pure functions make the code much easier to reason about,
drastically cutting down the time to ramp up and debug.

Functions are easily composable without accumulating undesired side-effects, unlike methods.

Note that not all types and functions *have* to be immutable and pure,
but immutable types and pure functions should be the default.

If there are good reasons to make exceptions, we can do so, but we should include comments to explain why that exception was made.

22 changes: 22 additions & 0 deletions docs/contributor-guide/adr/0006-type-checking.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
# 6. Type-checking

Date: 2020-01-29

## Status

Accepted

## Context

Static type-checking is available for Python, making us of the type annotations already in the codebase.

## Decision

Type-check via [mypy](http://mypy-lang.org/).

## Consequences

Testing is still important, but type checking helps to eliminate bugs via static checking,
even for parts of the code not exercised during tests.

Additionally, type-checking relies on our type annotations, ensuring that the annotations are correct and complete.
33 changes: 33 additions & 0 deletions docs/contributor-guide/adr/0007-tamr-client-package.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,33 @@
# 7. tamr_client package

Date: 2020-04-03

## Status

Accepted

## Context

We have an existing userbase that relies on `tamr_unify_client` and cannot painlessly make backwards-incompatible changes.

But, we want to rearchitect this codebase as a [library of composable functions](/contributor-guide/adr/0005-composable-functions).

## Decision

Implement rearchitected design as a new package named `tamr_client`.

Require the `TAMR_CLIENT_BETA=1` feature flag for `tamr_client` package usage.

Warn users who attempt to use `tamr_client` package to opt-in if they want to beta test the new design.

## Consequences

Continue to support `tamr_unify_client`, but any new functionality:
- must be included in `tamr_client`
- may be included in `tamr_unify_client`

Users are required to explicitly opt-in to new features,
preserving backward compatiblitiy for current users.

Once we reach feature parity with `tamr_unify_client`,
we can undergo a deprecation cycle and subsequently remove `tamr_unify_client.
47 changes: 47 additions & 0 deletions docs/contributor-guide/adr/0008-standardized-imports.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,47 @@
# 8. Standardized imports

Date: 2020-06-01

## Status

Accepted

## Context

Python has many ways of importing:

```python
# option 1: import module

# option 1.a
import foo.bar.bazaar as baz
baz.do_the_thing()

# option 1.b
from foo.bar import bazaar as baz
baz.do_the_thing()

# option 2: import value
from foo.bar.bazaar import do_the_thing
do_the_thing()
```

Not to mention that each of these styles may be done with relative imports (replacing `foo.bar` with `.bar` if the `bar` package is a sibling).

Confusingly, Option 1.a and Option 1.b are _conceptually_ the same, but mechanically there are [subtle differences](https://stackoverflow.com/questions/24807434/imports-in-init-py-and-import-as-statement/24968941#24968941).


## Decision

Imports within `tamr_client`:
- Must import statements for modules, classes, and exceptions
- Must `from foo import bar` instead of `import foo.bar as bar`
- Must not import functions directly. Instead import the containing module and use `module.function(...)`
- Must not use relative imports. Use absolute imports instead.

## Consequences

Standardized import style helps linter correctly order imports.

Choosing import styles is a syntactic choice without semantic meaning.
Removing this choice should speed up development and review.
36 changes: 36 additions & 0 deletions docs/contributor-guide/adr/0009-separate-types-and-functions.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,36 @@
# 9. Separate types and functions

Date: 2020-06-29

## Status

Accepted

## Context

Code must be organized to be compatible with:
- Static type-checking via [mypy](https://github.com/python/mypy)
- Runtime execution during normal usage and running tests via [pytest](https://docs.pytest.org/en/stable/)
- Static doc generation via [sphinx-autodoc-typehints](https://github.com/agronholm/sphinx-autodoc-typehints)

Additionally:
- Functions should be able to refer to any type
- Most types depend on other types non-recursively, but some types (e.g. `SubAttribute` and `AttributeType`) do depend on each other recursively / cyclically.

## Decision

Put types (`@dataclass(frozen=True)`) into the `_types` module
and have all function modules depend on the `_types` module to define their inputs and outputs.

## Consequences

Separating types into a `_types` module (e.g. `tc.Project` is an alias for `tc._types.project.Project`)
and functions into namespaced modules (e.g. `tc.project` is a module containing project-specific utilities)
allows all of our tooling to run successfully.

Also, splitting up types and functions means that we can author a function like `tc.dataset.attributes` in the `tc.dataset` module
while still having the `tc.attribute` module depend on `tc.Dataset` type.

Finally, for the rare cases where cyclical dependencies for types are unavoidable,
we can use [typing.TYPE_CHECKING](https://docs.python.org/3/library/typing.html#typing.TYPE_CHECKING) since `mypy` and Python are smart enough to resolve these cyclical correctly via [forward references](https://www.python.org/dev/peps/pep-0484/#forward-references).

22 changes: 22 additions & 0 deletions docs/contributor-guide/adrs.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
# Architectural Decision Records

Important architectural decisions are logged as Architectural Decision Records (ADRs)
and are housed here.

For more on ADRs, see:
- [Why write ADRs](https://github.blog/2020-08-13-why-write-adrs/)
- [Earn future maintainers esteem by writing simple ADRs](https://understandlegacycode.com/blog/earn-maintainers-esteem-with-adrs/)

To author new ADRs, we recommend [adr-tools](https://github.com/npryce/adr-tools).

## ADRs

* [Record architecture decisions](/contributor-guide/adr/0001-record-architecture-decisions)
* [Linting and formatting](/contributor-guide/adr/0002-linting-and-formatting)
* [Reproducibility](/contributor-guide/adr/0003-reproducibility)
* [Documentation and docstrings](/contributor-guide/adr/0004-documentation-and-docstrings)
* [Composable functions](/contributor-guide/adr/0005-composable-functions)
* [Type checking](/contributor-guide/adr/0006-type-checking)
* [tamr_client package](/contributor-guide/adr/0007-tamr-client-package)
* [Standardized imports](/contributor-guide/adr/0008-standardized-imports)
* [Separate types and functions](/contributor-guide/adr/0009-separate-types-and-functions)
20 changes: 0 additions & 20 deletions docs/contributor-guide/style-guide.md

This file was deleted.

0 comments on commit 4ec7db6

Please sign in to comment.