Add unmodified Marconi-related ADRs from plutus-apps (#49)
koslambrou committed May 30, 2023
1 parent 2e565c0 commit 7e1b044
Showing 7 changed files with 550 additions and 0 deletions.
169 changes: 169 additions & 0 deletions doc/read-the-docs-site/adr/0001-record-architecture-decisions.rst
@@ -0,0 +1,169 @@
.. _adr1:

ADR 1: Record architectural decisions
=====================================

Date: 2022-06-08

Authors
---------

koslambrou <konstantinos.lambrou@iohk.io>

Status
------

Accepted

Context
-------

We are looking for a way to document our architectural and design decisions
for all of our components.
To do that, we can integrate a practice called architectural decision records (ADRs)
into our workflow.

This does not replace actual architecture documentation, but provides people who are contributing:

* the means to understand architectural and design decisions that were made
* a framework for proposing changes to the current architecture

For each decision, it is important to consider the following factors:

* what we have decided to do
* why we have made this decision
* what we expect the impact of this decision to be
* what we have learned in the process

As we're already using `rST <https://docutils.sourceforge.io/rst.html>`_,
`Sphinxdoc <https://www.sphinx-doc.org/en/master/>`_ and
`readthedocs <https://readthedocs.org/>`_, it would be practical to
integrate these ADRs as part of our current documentation infrastructure.

Decision
--------

* We will use ADRs to document, propose and discuss
  any important or significant architectural and design decisions.

* The ADR format will follow the format described in the `Implications`_ section.

* We will follow the convention of storing those ADRs as rST or Markdown formatted
  documents under the `docs/adr` directory, as exemplified in Nat Pryce's
  `adr-tools <https://github.com/npryce/adr-tools>`_. This does not imply that we will
  be using `adr-tools` itself, as we might diverge from the proposed structure.

* We will keep rejected ADRs.

* We will strive to create an ADR as early as possible in relation to the actual
  implementation.

Implications
------------

ADRs should be written using the template described in the `ADR template`_ section, which comes from
Chapter 6.5.2 (*A Template for Documenting Architectural Decisions*) of
*Documenting Software Architectures: Views and Beyond (2nd Edition)*.

Only the *Title*, *Status*, *Issue/Context*, *Decision* and *Implications/Consequences* sections are mandatory.
The rest are optional.

Another good reference is the article
`Architecture Decision Records <https://cognitect.com/blog/2011/11/15/documenting-architecture-decisions>`_
by Michael Nygard (Nov. 15, 2011).

ADR template
^^^^^^^^^^^^

What follows is the ADR format (adapted from the book).

+----------------------+---------------------------------------------------------------------------+
| Section              | Description                                                               |
+======================+===========================================================================+
| Title                | These documents have names that are short noun phrases.                   |
|                      |                                                                           |
|                      | For example, "ADR 1: Deployment on Ruby on Rails 3.0.10"                  |
|                      | or "ADR 9: LDAP for Multitenant Integration"                              |
+----------------------+---------------------------------------------------------------------------+
| Authors              | List each author's name and email.                                        |
+----------------------+---------------------------------------------------------------------------+
| Status               | State the status of the decision, such as "draft" if the decision is      |
|                      | still being written, "proposed" if the project stakeholders haven't       |
|                      | agreed with it yet, "accepted" once it is agreed. If a later ADR changes  |
|                      | or reverses a decision, it may be marked as "deprecated" or "superseded"  |
|                      | with a reference to its replacement. (This is not the status of           |
|                      | implementing the decision.)                                               |
+----------------------+---------------------------------------------------------------------------+
| Issue (or context)   | This section describes the architectural design issue being addressed.    |
|                      | This description should leave no questions as to why this issue needs to  |
|                      | be addressed now. The language in this section is value-neutral. It is    |
|                      | simply describing facts.                                                  |
+----------------------+---------------------------------------------------------------------------+
| Decision             | Clearly state the solution chosen. It is the selection of one of the      |
|                      | positions that the architect could have taken. It is stated in full       |
|                      | sentences, with active voice. "We will …"                                 |
+----------------------+---------------------------------------------------------------------------+
| Tags                 | Add one or more tags to the decision. Useful for organizing the set of    |
|                      | decisions.                                                                |
+----------------------+---------------------------------------------------------------------------+
| Assumptions          | Clearly describe the underlying assumptions in the environment in which a |
|                      | decision is being made. These could be cost, schedule, technology, and so |
|                      | on. Note that constraints in the environment (such as a list of accepted  |
|                      | technology standards, an enterprise architecture, or commonly employed    |
|                      | patterns) may limit the set of alternatives considered.                   |
+----------------------+---------------------------------------------------------------------------+
| Argument             | Outline why a position was selected. This is probably as important as the |
|                      | decision itself. The argument for a decision can include items such as    |
|                      | implementation cost, total cost of ownership, time to market, and         |
|                      | availability of required development resources.                           |
+----------------------+---------------------------------------------------------------------------+
| Alternatives         | List alternatives (that is, options or positions) considered.             |
|                      |                                                                           |
|                      | Explain alternatives with sufficient detail to judge their suitability;   |
|                      | refer to external documentation to do so if necessary. Only viable        |
|                      | positions should be described here. While you don’t need an exhaustive    |
|                      | list, you also don’t want to hear the question “Did you think about... ?” |
|                      | during a final review, which might lead to a loss of credibility and a    |
|                      | questioning of other architectural decisions. Listing alternatives        |
|                      | espoused by others also helps them know that their opinions were heard.   |
|                      | Finally, listing alternatives helps the architect make the right          |
|                      | decision, because listing alternatives cannot be done unless those        |
|                      | alternatives were given due consideration.                                |
+----------------------+---------------------------------------------------------------------------+
| Implications         | Describe the decision’s implications. For example, it may                 |
| (or consequences)    |                                                                           |
|                      | * Introduce a need to make other decisions                                |
|                      | * Create new requirements                                                 |
|                      | * Modify existing requirements                                            |
|                      | * Pose additional constraints to the environment                          |
|                      | * Require renegotiation of scope                                          |
|                      | * Require renegotiation of the schedule with the customers                |
|                      | * Require additional training for the staff                               |
|                      |                                                                           |
|                      | Clearly understanding and stating the implications of the decisions has   |
|                      | been a very effective tool in gaining buy-in. All consequences should be  |
|                      | listed here, not just the "positive" ones. A particular decision may have |
|                      | positive, negative, and neutral consequences, but all of them affect the  |
|                      | team and project in the future.                                           |
+----------------------+---------------------------------------------------------------------------+
| Related Decisions    | List decisions related to this one. Useful relations among decisions      |
|                      | include causality (which decisions caused other ones), structure (showing |
|                      | decisions’ parents or children, corresponding to architecture elements at |
|                      | higher or lower levels), or temporality (which decisions came before or   |
|                      | after others).                                                            |
+----------------------+---------------------------------------------------------------------------+
| Related Requirements | Map decisions to objectives or requirements, to show accountability. Each |
|                      | architecture decision is assessed as to its contribution to each major    |
|                      | objective. We can then assess how well the objective is met across all    |
|                      | decisions, as part of an overall architecture evaluation.                 |
+----------------------+---------------------------------------------------------------------------+
| Affected Artifacts   | List the architecture elements and/or relations affected by this          |
|                      | decision. You might also list the effects on other design or scope        |
|                      | decisions, pointing to the documents where those decisions are described. |
|                      | You might also include external artifacts upstream and downstream of the  |
|                      | architecture, as well as management artifacts such as budgets and         |
|                      | schedules.                                                                |
+----------------------+---------------------------------------------------------------------------+
| Notes                | Capture notes and issues that are discussed during the decision process.  |
|                      | They can be links to an external document, a PR, a GitHub issue, etc.     |
+----------------------+---------------------------------------------------------------------------+
107 changes: 107 additions & 0 deletions doc/read-the-docs-site/adr/0002-marconi-initiative.rst
@@ -0,0 +1,107 @@
.. _adr2:

ADR 2: Making a case for Marconi
================================

Date: 2022-07-26

Author(s)
---------

Radu Ometita <radu.ometita@iohk.io>

Status
------

Accepted

Context
-------

Plutus off-chain code oftentimes needs access to indexed portions of the blockchain. The plutus-chain-index project is the initial solution meant to deliver access to this kind of data. However, after release, a couple of shortcomings were identified which prompted the development of an indexing solution that is based on a different set of architectural and functional constraints.

A lot of the shortcomings are connected to the exploratory type of development that we used to deliver the plutus-chain-index which was prompted by the lack of a clear specification and a lack of concern for non-functional and quality assurance requirements. The top-down design resulted in a monolithic and fairly complex architecture which made the code difficult to reuse, compose and understand.

Some of the problems we identified due to the above-mentioned approaches are:

A. The use of an effect system (the `freer-simple` package) makes the code fairly complex and difficult to understand (quite a few type-level computations are happening). The separation between syntax and semantics imposed by the library also complicates matters for no clear reason (for example, if we write two semantics, one for pure code used for testing and one for production code, then there would be a lot of production code that would not be tested).

B. We cannot customise the indexed set of data; the plutus-chain-index provides only all-or-nothing indexing. While this can be addressed, the architecture makes it an uphill battle.

C. The implicit assumption that there is only one index running caused issues when we made the Plutus Application Backend collect and index information requested by smart contracts. Now we have two components that index information from the blockchain, but they are not synchronised. Querying the plutus-chain-index about transactions received from the Plutus Application Backend may result in no data returned, since the plutus-chain-index indexes data slower than the PAB.

D. The lack of non-functional requirements resulted in software that uses an unreasonable amount of resources and synchronises slowly. And since everything is monolithic, it is difficult to turn off indexing of data that our customers do not require, so there is no way to limit the required resources.

E. The same lack of a specification and non-functional requirements makes the testing feel ad-hoc and like an afterthought.

The Chain Index was meant to be a software application that supports the execution of smart contracts. And, in that, it succeeded. However, we found that our customers would rather have a library of functionality that they can customize to do the following:

* to build their own indexers,
* to work only with the data that they care about for their application,
* to use whatever storage engine they prefer, and
* to support only the queries that they need to support.

So when we took all the feedback into account, we decided that redesigning the indexing solution around a much simpler and more modular design was a worthwhile enterprise.

We continue by introducing some of the design principles that guided us in the specification of Marconi.

Design principles of Marconi
----------------------------

We follow the Algebra Driven Design approach for Marconi components, so from the get-go, we will have a checked specification for the software that we develop.

The specification is based on a simplified model which should help with documenting how everything works without getting into the more complex details.

Having a set of property-based tests to validate that the implementation conforms to the specification also means that the correctness of the implementation does not rely on type-level checks or complicated term-level machinery (we could even verify the correctness of a Rust implementation by leveraging the Rust to Haskell FFI).

Because we have no reliance on type-level checks or complicated architectural patterns to validate the software (we use the specification and property tests for that), the code is much easier to understand, document and extend.
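To make the idea of checking an implementation against a simplified model concrete, here is a minimal sketch in Python. It is not Marconi code: the names ``run_model``, ``run_implementation`` and ``random_ops`` are hypothetical, the constant ``K`` is an arbitrary small value, and plain seeded random generation stands in for a real property-testing library. The model is a plain list, and the implementation splits the same data between a "disk" list and an in-memory buffer of at most ``K`` events.

```python
import random
from typing import List, Tuple

K = 4  # assumed small security parameter, chosen only for this test

Op = Tuple[str, int]  # ("insert", event) or ("rollback", depth)


def run_model(ops: List[Op]) -> List[int]:
    """Specification: the chain is a plain list and a rollback truncates it."""
    chain: List[int] = []
    for name, arg in ops:
        if name == "insert":
            chain.append(arg)
        else:
            del chain[len(chain) - arg:]
    return chain


def run_implementation(ops: List[Op]) -> List[int]:
    """Implementation: at most K events stay in memory, older ones are persisted."""
    disk: List[int] = []
    buffer: List[int] = []
    for name, arg in ops:
        if name == "insert":
            buffer.append(arg)
            while len(buffer) > K:
                disk.append(buffer.pop(0))
        else:
            del buffer[len(buffer) - arg:]
    return disk + buffer


def random_ops(rng: random.Random, length: int) -> List[Op]:
    """Generate operations whose rollbacks never exceed the in-memory events."""
    ops: List[Op] = []
    in_memory = 0
    for step in range(length):
        if in_memory > 0 and rng.random() < 0.3:
            depth = rng.randint(1, in_memory)
            ops.append(("rollback", depth))
            in_memory -= depth
        else:
            ops.append(("insert", step))
            in_memory = min(in_memory + 1, K)
    return ops


rng = random.Random(0)
checked = 0
for _ in range(200):
    ops = random_ops(rng, rng.randint(0, 50))
    # Property: the implementation always agrees with the model.
    assert run_implementation(ops) == run_model(ops)
    checked += 1
```

The property only inspects observable results, so the same check could in principle be pointed at any implementation that exposes the same operations.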

Indexing solution
-----------------

The indexing solution has the following basic requirements: it needs to deal with rollbacks as elegantly as possible and provide a way to compromise between memory, disk and CPU usage.

On the Cardano blockchain, there are frequent rollbacks, but they can only span a maximum of 2160 blocks (and most of them span fewer than 10 blocks). We call this bound the security parameter and denote it by 'K' henceforth.

An indexer is a store that is updated by events created from each block. The problem introduced by rollbacks is that we need to undo all state changes when a rollback occurs.

We opted for a design where we keep the events of the most recent K blocks in memory as a list. Once events go beyond the K limit, they are fed into the function that stores them.

This architectural decision has some desirable effects:

1. Managing rollbacks is very simple and fast. We drop the events that were rolled back. (No need to undo the application of blocks on the state stored on disk, which would be necessary if we were to store everything on disk as fast as possible).

2. Making 'K' configurable already makes the design quite scalable. Developers do not usually need to guard themselves against rollbacks of K blocks, so they can choose to store, for example, only 10 events in memory, accepting chain desynchronisation in the unlikely event that a rollback goes beyond the 10-block limit.

3. In case of a restart, recovery is very simple. If the K parameter is properly set, we store only fully confirmed transactions, so there is nothing to do other than resume operation.

And some less desirable effects:

1. We must keep K events in memory, which (depending on how large events are) can waste some memory. Our educated guess is that this is a reasonable compromise, but depending on how large events can get that may not be the case for your use case.

2. Queries are more involved as we need to scan events in memory and the state persisted on disk.
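The buffering scheme described in this section can be sketched as follows. This is an illustrative Python model under stated assumptions, not the Marconi implementation: ``BufferedIndexer``, ``persist`` and the field names are hypothetical, and events are plain strings.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Generic, List, TypeVar

E = TypeVar("E")


@dataclass
class BufferedIndexer(Generic[E]):
    """Keeps the newest `k` events in memory and persists older ones."""

    k: int                        # hypothetical name for the configurable K
    persist: Callable[[E], None]  # user-supplied storage function
    buffer: Deque[E] = field(default_factory=deque)

    def insert(self, event: E) -> None:
        self.buffer.append(event)
        # Events beyond the k limit are treated as final, so flush them.
        while len(self.buffer) > self.k:
            self.persist(self.buffer.popleft())

    def rollback(self, depth: int) -> bool:
        """Drop the newest `depth` events; return False if that is too deep."""
        if depth > len(self.buffer):
            # Undoing persisted state is deliberately not supported here.
            return False
        for _ in range(depth):
            self.buffer.pop()
        return True


disk: List[str] = []
indexer = BufferedIndexer(k=3, persist=disk.append)
for block in ["b1", "b2", "b3", "b4", "b5"]:
    indexer.insert(block)

rolled_back = indexer.rollback(2)  # within the buffer: events are dropped
too_deep = indexer.rollback(5)     # deeper than the buffer: refused
```

A rollback never touches ``disk``; it only shortens the in-memory buffer. That makes rollbacks cheap, at the cost of refusing any rollback deeper than the buffer.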

Query and storage
-----------------

The indexed data is accessible through queries. There are no constraints on the format of queries or results: each is identified by a type variable that the indexer exposes. The user can provide the implementation of the query and result datatypes, as well as the store and query functions. One complication of this query implementation is that a query has to run on the merged data from memory and disk.

Being able to define the query and store functions allows us to associate any kind of storage with the indexers, although right now we only use SQLite.
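As an illustration of user-supplied store and query functions, the following Python sketch persists events to SQLite and answers a query by merging persisted rows with events still held in memory. The table layout, the ``(slot, address)`` event shape and the function names are assumptions made for this example only.

```python
import sqlite3
from typing import Iterable, List, Tuple

Event = Tuple[int, str]  # (slot, address): a hypothetical event shape


def store(conn: sqlite3.Connection, events: Iterable[Event]) -> None:
    """User-supplied store function: persist events that left the buffer."""
    conn.executemany("INSERT INTO events (slot, address) VALUES (?, ?)", events)
    conn.commit()


def query_disk(conn: sqlite3.Connection, address: str) -> List[Event]:
    """Query only the persisted state."""
    rows = conn.execute(
        "SELECT slot, address FROM events WHERE address = ? ORDER BY slot",
        (address,),
    )
    return [(slot, addr) for slot, addr in rows]


def query(conn: sqlite3.Connection, buffer: List[Event], address: str) -> List[Event]:
    """Merge persisted results with matching events still held in memory."""
    in_memory = [event for event in buffer if event[1] == address]
    # Persisted events are older than buffered ones, so they come first.
    return query_disk(conn, address) + in_memory


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (slot INTEGER, address TEXT)")
store(conn, [(1, "addr_a"), (2, "addr_b")])
buffer = [(3, "addr_a"), (4, "addr_c")]
result = query(conn, buffer, "addr_a")
```

Because ``store`` and ``query_disk`` are ordinary functions passed a connection, swapping SQLite for another storage engine would only require replacing those two functions.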

Identification of events
------------------------

We need a way to answer the question: how much of the stream has been consumed by the indexer? We do that by associating a sequence number with each incoming block and carrying it along the stream of events. Having a way to answer this question is connected to the following features, which we plan to implement:

1. Synchronisation of multiple indexers (queries have a validity interval)
2. Resume functionality (we need to know from which slot to resume)
3. Handling of rollbacks (now there is explicit handling of rollbacks)

More information will become available in the next few sprints.
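A rough Python sketch of how sequence numbers could support the three features above. All names are hypothetical, a plain integer sequence number stands in for the slot, and the real design may differ once it is specified.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Optional


@dataclass
class Numbered:
    seq: int      # position of the block in the stream
    payload: str  # stands in for the block contents


def number_blocks(blocks: Iterable[str], start: int = 0) -> Iterator[Numbered]:
    """Attach a sequence number to every incoming block."""
    for offset, block in enumerate(blocks):
        yield Numbered(start + offset, block)


class TrackingIndexer:
    def __init__(self) -> None:
        self.events: List[Numbered] = []

    def insert(self, event: Numbered) -> None:
        self.events.append(event)

    def last_seq(self) -> Optional[int]:
        """How much of the stream has been consumed so far?"""
        return self.events[-1].seq if self.events else None

    def resume_from(self) -> int:
        """First sequence number still to be processed after a restart."""
        last = self.last_seq()
        return 0 if last is None else last + 1

    def rollback_to(self, seq: int) -> None:
        """Explicit rollback: keep only events up to and including `seq`."""
        self.events = [event for event in self.events if event.seq <= seq]


def synced_up_to(indexers: List[TrackingIndexer]) -> Optional[int]:
    """Highest sequence number that every indexer has consumed."""
    seen = [indexer.last_seq() for indexer in indexers]
    if not seen or None in seen:
        return None
    return min(seen)


fast, slow = TrackingIndexer(), TrackingIndexer()
for event in number_blocks(["b0", "b1", "b2", "b3"]):
    fast.insert(event)
    if event.seq < 2:  # the slow indexer lags behind
        slow.insert(event)

common = synced_up_to([fast, slow])
```

A query spanning both indexers would only be valid up to ``common``, which is one way to read the "validity interval" mentioned in the first item.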

Event streams
-------------

To support PAB functionality which subscribes to a source for a set of event types, we need a way to produce events from indexers.

They are also very useful for contracts that want to track rollbacks. Rollbacks are invisible from the point of view of the indexed data, but it may be the case that the internal state of a contract needs to know that the state has been reverted.
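The following Python sketch shows one way an indexer could produce events for subscribers, including an explicit rollback notification. The ``Inserted`` and ``RolledBack`` types and the callback-based subscription are assumptions for illustration and do not describe the PAB or Marconi interfaces.

```python
from dataclasses import dataclass
from typing import Callable, List, Union


@dataclass(frozen=True)
class Inserted:
    seq: int
    payload: str


@dataclass(frozen=True)
class RolledBack:
    to_seq: int  # last sequence number that is still valid


Notification = Union[Inserted, RolledBack]
Subscriber = Callable[[Notification], None]


class NotifyingIndexer:
    def __init__(self) -> None:
        self.events: List[Inserted] = []
        self.subscribers: List[Subscriber] = []

    def subscribe(self, callback: Subscriber) -> None:
        self.subscribers.append(callback)

    def _emit(self, notification: Notification) -> None:
        for callback in self.subscribers:
            callback(notification)

    def insert(self, seq: int, payload: str) -> None:
        event = Inserted(seq, payload)
        self.events.append(event)
        self._emit(event)

    def rollback(self, to_seq: int) -> None:
        # The indexed data simply loses the dropped events ...
        self.events = [event for event in self.events if event.seq <= to_seq]
        # ... so subscribers are told explicitly that a rollback happened.
        self._emit(RolledBack(to_seq))


received: List[Notification] = []
indexer = NotifyingIndexer()
indexer.subscribe(received.append)
indexer.insert(0, "tx_a")
indexer.insert(1, "tx_b")
indexer.rollback(0)
```

After the rollback, ``indexer.events`` looks as if ``tx_b`` had never been indexed, while ``received`` still records that it was seen and then reverted. A contract tracking its own state would need that record.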
