docs: add developer oriented docs
This contains docs meant to aid those wishing to
contribute to the project and for on-boarding new developers on the project.
kennedykori committed Aug 24, 2022
1 parent 8273884 commit 0e98c4f
Showing 3 changed files with 200 additions and 17 deletions.
27 changes: 10 additions & 17 deletions README.md
@@ -81,7 +81,7 @@ pip install -r requirements/dev.txt

Then create the binary using the following command:
```bash
pyinstaller app/__main__.py --hidden-import apps/imp --collect-all app --name idr_client_temp -F
pyinstaller app/__main__.py --collect-all app --name idr_client_temp -F
```
This will create an executable but the executable will still depend on the
target system/computer having the correct system libraries. More details on this
@@ -96,22 +96,15 @@ The executable binary can be found in the `dist` directory of the project. To
learn more about the `staticx` command, check the docs [here](https://staticx.readthedocs.io).


## Concepts
This section is for the curious and those wishing to contribute. It provides a
summary description of how the app works and the concepts and terms used in the
project. These are:
* __Data Source Type__ - A data source type is just that, it describes a kind
of data source together with the operations that can be performed around those
data sources. Each data source type can have multiple *data sources*.
* __Data Source__ - A data source represents an entity that contains data of
interest such as a database or a file. Each data source has multiple
*extract metadata*.
* __Extract Metadata__ - This is a description of the data to be extracted from a
data source. An extract metadata also defines how data is extracted from a
data source.
* __Upload Metadata__ - This describes data that has been extracted and how
it's packaged for uploading to the remote server. Each upload metadata is
always associated with a given *extract metadata*.
## Contributing
This section is for the curious and those wishing to contribute. For those who
are curious about how the app works and the architecture of the project,
check out the [architecture docs](https://github.com/savannahghi/idr-client/blob/develop/docs/ARCHITECTURE.rst).
For those wishing to contribute, it is highly recommended that they start by
reading the [contribution guidelines](https://github.com/savannahghi/idr-client/blob/develop/docs/CONTRIBUTING.rst).

All contributions are welcome.


## License

132 changes: 132 additions & 0 deletions docs/ARCHITECTURE.rst
@@ -0,0 +1,132 @@
=======================
IDR Client Architecture
=======================

This document describes, among other things, the core concepts used throughout
this project, the layout of the project and the class hierarchy of the main
domain models. Together, these explain how the application is structured and
how its components fit together and interact with each other.


Concepts and Terminology
------------------------

This section defines the terms and concepts central to the project.

Core Domain
~~~~~~~~~~~

The core domain of the project is mainly composed of the following components:

* **Data Source Type** - Describes a kind of data source together with the
  operations that can be performed on data sources of that kind. Each data
  source type can have multiple *data sources*. As well as being a container
  for *data sources*, a data source type also exposes concrete implementations
  of the other core domain models, which define properties and behaviours that
  are useful when working with data of the given *type*. This allows the
  application to work with data of different types and from different sources.
* **Data Source** - Represents an entity that contains data of interest, such
  as a database or a file. Each data source has multiple *extract metadata*
  associated with it.
* **Extract Metadata** - A description of the data to be extracted from a
  *data source*. This description can include *(but is not limited to)*
  properties such as the scope, depth and amount of data to be extracted from
  a data source. An extract metadata also defines how data is extracted from
  its parent *data source*.
* **Upload Metadata** - Describes the attributes of the extracted data and how
  it is packaged for uploading to the remote server. Each upload metadata is
  always associated with a given *extract metadata*. Note that an upload
  metadata doesn't contain the actual data to be uploaded, just information
  about the data. The actual data is contained in the *upload chunks*
  associated with the given upload metadata.
* **Upload Chunk** - Before data is uploaded to the server, it is partitioned
  into smaller units *(for transmission efficiency)* which are referred to as
  upload chunks. These chunks are then uploaded to the server.
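
The relationship between an upload metadata and its upload chunks can be
illustrated with a minimal sketch. The class fields, the ``partition`` helper
and the chunk size below are assumptions made for illustration only; they are
not taken from the project's source code.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class UploadMetadata:
    """Information *about* extracted data; holds no data itself (hypothetical fields)."""

    extract_metadata_name: str
    content_type: str


@dataclass(frozen=True)
class UploadChunk:
    """One partition of the extracted data (hypothetical fields)."""

    upload_metadata: UploadMetadata
    position: int
    content: bytes


def partition(
    upload_metadata: UploadMetadata, data: bytes, chunk_size: int
) -> List[UploadChunk]:
    """Split ``data`` into chunks of at most ``chunk_size`` bytes each."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be a positive integer")
    return [
        UploadChunk(upload_metadata, position, data[start : start + chunk_size])
        for position, start in enumerate(range(0, len(data), chunk_size))
    ]


metadata = UploadMetadata("first-three", "application/octet-stream")
chunks = partition(metadata, b"abcdefghij", chunk_size=4)
print([chunk.content for chunk in chunks])  # [b'abcd', b'efgh', b'ij']
```

Each chunk keeps a reference to its upload metadata and its position, so a
receiver could in principle reassemble the data in order.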

These domain components are defined in the ``app.core.domain`` module as
interfaces meant to be implemented for each *data source type* that the
application needs to support. The default implementations that ship with the
application can be found in the ``app.imp`` package. This design loosely
emulates the `Service Provider Interface <spi_>`_ pattern in Java.
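
As a rough illustration of this interface/implementation split, the sketch
below defines simplified stand-ins for three of the domain interfaces and a
toy in-memory implementation of them. All class names, attributes and method
signatures here are hypothetical; the real interfaces in ``app.core.domain``
may look quite different.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Sequence


class DataSource(ABC):
    """An entity holding data of interest, e.g. a database or a file."""

    def __init__(self, name: str) -> None:
        self.name = name
        # Extract metadata associated with this data source, keyed by name.
        self.extract_metadata: Dict[str, "ExtractMetadata"] = {}


class ExtractMetadata(ABC):
    """Describes what to extract from a data source and how to extract it."""

    def __init__(self, name: str, data_source: DataSource) -> None:
        self.name = name
        self.data_source = data_source
        data_source.extract_metadata[name] = self

    @abstractmethod
    def extract(self) -> Sequence[Any]:
        """Pull the described data out of the parent data source."""


class DataSourceType(ABC):
    """A kind of data source and a container for data sources of that kind."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.data_sources: Dict[str, DataSource] = {}


# A toy implementation for an "in-memory list" data source type (hypothetical).
class ListDataSource(DataSource):
    def __init__(self, name: str, rows: List[int]) -> None:
        super().__init__(name)
        self.rows = rows


class FirstNRows(ExtractMetadata):
    """Extracts at most ``limit`` rows from a ``ListDataSource``."""

    def __init__(self, name: str, data_source: ListDataSource, limit: int) -> None:
        super().__init__(name, data_source)
        self.limit = limit

    def extract(self) -> Sequence[Any]:
        return self.data_source.rows[: self.limit]


class ListDataSourceType(DataSourceType):
    """Groups all in-memory list data sources."""


source_type = ListDataSourceType("in-memory-list")
source = ListDataSource("numbers", rows=[1, 2, 3, 4, 5])
source_type.data_sources[source.name] = source
first_three = FirstNRows("first-three", source, limit=3)
print(first_three.extract())  # [1, 2, 3]
```

Supporting a new kind of data source would then mean adding another set of
subclasses, without changing the code that works against the base interfaces.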

Transport
~~~~~~~~~

A transport represents the flow of data to and from the IDR Client.
Specifically, a transport connects the IDR Client to a metadata source and to
the final destination of the extracted data: the application receives metadata
through a transport and uploads the final data through a transport.
Conceptually, a transport can be thought of as an interface composed of two
other interfaces, ``MetadataProvider`` and ``DataSink``. The transport
interface may be split into those two interfaces in the future if the need
arises, but for now it remains a single interface. A transport can be anything
from an HTTP API to a filesystem API. The transport interface is defined in the
``app.core.transport`` module whereas the ``app.lib.transports`` package
contains common transport implementations.
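
The "two interfaces in one" view of a transport can be sketched as follows.
Only the names ``MetadataProvider`` and ``DataSink`` come from the text above;
the method names, their signatures and the ``InMemoryTransport`` class are
assumptions made for illustration and do not reflect the actual interface in
``app.core.transport``.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Iterable, List


class MetadataProvider(ABC):
    """The inbound half: where the client gets its metadata from."""

    @abstractmethod
    def fetch_metadata(self) -> List[Dict[str, Any]]:
        """Return the metadata describing what should be extracted."""


class DataSink(ABC):
    """The outbound half: where the client sends extracted data."""

    @abstractmethod
    def upload(self, payload: bytes) -> None:
        """Send one unit of extracted data to its final destination."""


class Transport(MetadataProvider, DataSink, ABC):
    """A single interface that combines both halves."""


class InMemoryTransport(Transport):
    """A toy transport that serves canned metadata and records uploads."""

    def __init__(self, metadata: Iterable[Dict[str, Any]]) -> None:
        self._metadata = list(metadata)
        self.uploaded: List[bytes] = []

    def fetch_metadata(self) -> List[Dict[str, Any]]:
        # Return a copy so callers cannot mutate the canned metadata.
        return list(self._metadata)

    def upload(self, payload: bytes) -> None:
        self.uploaded.append(payload)


transport = InMemoryTransport([{"name": "first-three", "limit": 3}])
for item in transport.fetch_metadata():
    transport.upload(item["name"].encode("utf-8"))
print(transport.uploaded)  # [b'first-three']
```

Code that only needs to read metadata could depend on ``MetadataProvider``
alone, which is what would make a later split into two interfaces cheap.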

Task
~~~~

A task is a job or an action that takes an input and returns an output. Most
actions and processes in the project are modelled by composing different tasks
to achieve the desired objective. The task interface is defined in the
``app.core.task`` module whereas the ``app.lib.tasks`` package provides the
most common task implementations as well as tasks that can be used to compose
multiple tasks.
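
The idea of composing tasks can be shown with a small sketch. The ``Task``
base class, its ``execute`` method and the ``Pipeline``, ``Strip`` and
``Split`` classes below are hypothetical and are not the classes found in
``app.core.task`` or ``app.lib.tasks``.

```python
from abc import ABC, abstractmethod
from typing import Any, Generic, List, TypeVar

_IN = TypeVar("_IN")
_OUT = TypeVar("_OUT")


class Task(Generic[_IN, _OUT], ABC):
    """A job that takes an input and returns an output."""

    @abstractmethod
    def execute(self, an_input: _IN) -> _OUT:
        """Run the task on the given input."""


class Pipeline(Task[Any, Any]):
    """A task that runs other tasks in order, feeding each output forward."""

    def __init__(self, *tasks: Task) -> None:
        self._tasks = tasks

    def execute(self, an_input: Any) -> Any:
        result = an_input
        for task in self._tasks:
            result = task.execute(result)
        return result


class Strip(Task[str, str]):
    """Removes leading and trailing whitespace."""

    def execute(self, an_input: str) -> str:
        return an_input.strip()


class Split(Task[str, List[str]]):
    """Splits a comma-separated string into its parts."""

    def execute(self, an_input: str) -> List[str]:
        return an_input.split(",")


pipeline = Pipeline(Strip(), Split())
print(pipeline.execute("  a,b,c \n"))  # ['a', 'b', 'c']
```

Because ``Pipeline`` is itself a task, pipelines can be nested inside other
pipelines, which is one way larger processes can be built from small steps.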

Project Layout
--------------

The project is laid out as follows:

::

  .
  idr-client
  ├── ...other project configuration files.
  ├── app - The root src directory.
  |   ├── core - The core application components.
  |   |   ├── domain - Interfaces describing the services and essential processes provided by the application.
  |   |   ├── exceptions - Defines key application errors and exceptions used throughout the project.
  |   |   ├── mixins - Defines components and interfaces used to model common behaviours and reusable functionality.
  |   |   ├── serializers - Defines interfaces that convert Python objects into simple native types for easy storage and/or transmission.
  |   |   ├── task - Defines the interface that models a job or piece of work in the application.
  |   |   └── transport - Defines an interface that models the flow of data to and from the application.
  |   |
  |   ├── imp - Implementations of the core services.
  |   |
  |   ├── lib - Utilities and helpers.
  |   |   ├── config - Classes and functions needed to configure the application.
  |   |   ├── tasks - Implementations of common utility tasks.
  |   |   ├── transports - Different implementations of the transport interface.
  |   |   ├── app_registry - Contains the implementation of the main application registry.
  |   |   ├── checkers - Defines validators used throughout the application.
  |   |   └── module_loading - Defines utilities used for dynamic module loading.
  |   |
  |   ├── use_cases - Application-specific operations.
  |   |   ├── fetch_metadata - Defines fetch metadata operations.
  |   |   ├── main_pipeline - The main application pipeline operations.
  |   |   ├── run_extraction - Defines data extraction operations.
  |   |   ├── types - Defines common typings used within the use cases package.
  |   |   └── upload_extracts - Defines data upload operations.
  |   |
  |   ├── __init__ - Defines the application setup operations.
  |   ├── __main__ - The main application entry point.
  |   └── __version__ - Metadata about the application.
  |
  ├── docs - Documentation for the project.
  |
  ├── logs - A directory that can be used to store logs during development. This is not needed to run the application but is there for convenience.
  |
  ├── requirements - Defines dependencies needed by the application.
  |   ├── base - The key dependencies needed for the application to run.
  |   ├── dev - Dependencies needed to set up a development environment for the project.
  |   └── test - Dependencies needed to test the application.
  |
  └── tests - Tests for the application.


.. _spi: https://docs.oracle.com/javase/tutorial/sound/SPI-intro.html
58 changes: 58 additions & 0 deletions docs/CONTRIBUTING.rst
@@ -0,0 +1,58 @@
===================
Contributor's Guide
===================

If you are reading this, you're probably interested in contributing to this
project. All contributions are welcome and your efforts are greatly
appreciated. This document lays out guidelines and advice for contributing to
the project.

Note that the project maintainers have the final say on whether or not a
contribution is accepted. All contributions will be considered carefully, but
occasionally, some contributions will be rejected because they do not suit the
current goals or needs of the project.

If your contribution is rejected, don't despair! As long as you followed these
guidelines, you will have a much better chance of getting your next
contribution accepted.

Steps for Submitting Code
~~~~~~~~~~~~~~~~~~~~~~~~~

Use the checklist below when contributing code:

1. Fork the repository on `GitHub`_.
2. Run the tests to confirm they all pass on your system. If they don't, you'll
   need to investigate why they fail. If you're unable to diagnose this
   yourself, raise it as a bug report by creating a new issue on GitHub.
3. Write tests that demonstrate your bug or feature. Ensure that they fail.
4. Make your change.
5. Run the entire test suite again, confirming that all tests pass including
   the ones you just added.
6. Send a GitHub Pull Request to the main repository's ``main`` branch. GitHub
   Pull Requests are the expected method of code collaboration on this
   project.
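
Steps 3 and 4 of the checklist can be illustrated with a small, made-up
example. The ``chunk_count`` helper below is hypothetical and is not part of
this project; it only serves to show a test that fails against a buggy
implementation and passes once the change is made.

```python
def chunk_count(total_size: int, chunk_size: int) -> int:
    """Return how many chunks are needed to hold ``total_size`` bytes.

    Hypothetical helper. A buggy first version might have used
    ``total_size // chunk_size``, which drops a trailing partial chunk.
    """
    # Ceiling division: round up so that a partial last chunk is counted.
    return -(-total_size // chunk_size)


def test_partial_last_chunk_is_counted() -> None:
    # Written first (step 3): fails against the buggy floor-division version.
    assert chunk_count(10, 4) == 3


def test_exact_multiple_needs_no_extra_chunk() -> None:
    # Guards against over-correcting the fix (step 4).
    assert chunk_count(8, 4) == 2
```

Seeing the first test fail before the fix is what shows that the test really
exercises the bug rather than passing by accident.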

Code Review
~~~~~~~~~~~

Contributions will not be merged until they've been code reviewed. You should
implement any code review feedback unless you strongly object to it. In the
event that you object to the code review feedback, you should make your case
clearly and calmly. If, after doing so, the feedback is judged to still apply,
you must either apply the feedback or withdraw your contribution.

Code Style
~~~~~~~~~~

This project uses a collection of tools to ensure the code base has a
consistent style as it grows. We have these orchestrated using a tool called
`pre-commit`_. This can be installed locally and run over your changes prior
to opening a PR, and will also be run as part of the CI approval process
before a change is merged.

You can find the full list of formatting requirements in the
`.pre-commit-config.yaml`_ file in the top-level directory of this project.

.. _GitHub: https://github.com/savannahghi/idr-client
.. _pre-commit: https://pre-commit.com/
.. _.pre-commit-config.yaml: https://github.com/savannahghi/idr-client/blob/develop/.pre-commit-config.yaml
