architecture.md

🏠 Architecture Overview

When we started this project it seemed simple, but a few things turned out to be non-obvious and to hide quite a bit of complexity.

Caching

Hyper-dns uses two levels of cache: a context-cache and a system-cache.

The context-cache is used for the duration of a resolve operation. It keeps things like DNS results in memory so that two protocols that need the same resource can both access it. This is implemented in the resolve-context and is not shared between requests by default (see footnote 1).

The system-cache is used to cache the results of resolve operations to lower the load for repeat requests. It caches only the results and supports expiration. It is implemented either as an in-memory-cache or as a sqlite-cache.
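The idea of the context-cache can be sketched as a small memoizing wrapper. This is only an illustration under assumptions: `createContext`, `txt` and `lookupTxt` are hypothetical names and not the hyper-dns API.

```javascript
// Hypothetical sketch of a per-resolve context that memoizes DNS lookups.
// `lookupTxt` is an assumed function that performs the actual lookup.
function createContext(lookupTxt) {
  const pending = new Map(); // domain -> Promise of TXT records

  return {
    txt(domain) {
      // Reuse the in-flight or finished lookup for this domain, so two
      // protocols asking for the same record trigger only one request.
      if (!pending.has(domain)) {
        pending.set(domain, (async () => lookupTxt(domain))());
      }
      return pending.get(domain);
    },
  };
}
```

Because the map lives inside the context object, it disappears together with the context once the resolve operation is done.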

The in-memory-cache is an LRU cache used to cache the result of resolve operations only. Being an LRU cache, it evicts the least recently used entries, which gives it a stable runtime performance.

The sqlite-cache is an overlay over the in-memory-cache. As the name suggests, it uses SQLite to store data on the hard disk. SQLite is used because the SQLite team has put significant effort into making it work when multiple processes access it at the same time. (See more about this in the API Documentation)
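To show what "LRU" means here, below is a minimal sketch of an LRU cache built on a `Map`. It is not the implementation hyper-dns ships with; the class name and `maxSize` parameter are assumptions for illustration.

```javascript
// Minimal LRU sketch; the real in-memory-cache may be implemented differently.
class LRUCache {
  constructor(maxSize) {
    this.maxSize = maxSize;
    this.entries = new Map(); // a Map iterates in insertion order
  }

  get(key) {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key);
    // Re-insert the key to mark it as most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key, value) {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxSize) {
      // The first key in iteration order is the least recently used one.
      const oldest = this.entries.keys().next().value;
      this.entries.delete(oldest);
    }
  }
}
```

The number of entries never exceeds `maxSize`, so memory use stays bounded no matter how many names are resolved.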

Cache entries are not flushed or destroyed by default. This ensures that even expired entries can be used if your computer is offline for a longer period of time!
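One way expired entries can still be useful is as a fallback when a refresh fails. The sketch below assumes a hypothetical entry shape `{ value, expires }` and a hypothetical async `lookup` function; neither is taken from the hyper-dns source.

```javascript
// Sketch: prefer fresh data, but fall back to an expired entry when offline.
// `cache` is a Map of name -> { value, expires } (expires in ms since epoch).
// `lookup` is an assumed async function returning { value, ttl } (ttl in seconds).
async function resolveWithStaleFallback(cache, name, lookup, now = Date.now()) {
  const entry = cache.get(name);
  if (entry && entry.expires > now) {
    return entry.value; // still fresh, no network needed
  }
  try {
    const { value, ttl } = await lookup(name);
    cache.set(name, { value, expires: now + ttl * 1000 });
    return value;
  } catch (error) {
    // The expired entry was never deleted, so it can still be served.
    if (entry) return entry.value;
    throw error;
  }
}
```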

Networking

For DNS requests to work in the browser context, and for anonymity, we use dns-over-https to look up DNS entries. However, since dns-over-https providers may break and/or be unreachable, hyper-dns falls back to the system's DNS resolver where possible.

All requests are made exclusively over HTTPS and respect proxy settings, so that the system also works in environments with limited network access.
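The fallback described above can be sketched as follows. Both `dohLookup` and `systemLookup` are hypothetical injected functions, not hyper-dns options; the sketch only shows the order in which they would be tried.

```javascript
// Sketch: try dns-over-https first, fall back to the system resolver.
// `dohLookup` and `systemLookup` are assumed async functions that take a
// domain name and return its TXT records.
async function lookupTxt(name, { dohLookup, systemLookup }) {
  try {
    return await dohLookup(name);
  } catch (dohError) {
    // In a browser there is no system resolver to fall back to.
    if (!systemLookup) throw dohError;
    return systemLookup(name);
  }
}
```

In Node.js, `systemLookup` could for example wrap `dns.promises.resolveTxt`, while in a browser it would simply be left out.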

Isolation of Concerns

This library embraces functional programming in several places. For example, you may wonder why the API is written resolve(protocol, name) instead of protocol.resolve(name). While this may look like a matter of taste, this structure evolved gradually.

Below you find a diagram illustrating two common data flows:

API

The API methods (like resolve or resolveURL) isolate the Cache, Context and Protocol from one another, making sure that each focuses only on its own task.

Cache

To accommodate both browser and Node.js environments we need different cache implementations. To avoid having to test various assumptions in each cache, its operations are kept as simple as possible.

Context

Different runtimes need to provide different contexts (e.g. a different fetch implementation). The contexts also simplify operations for Protocol definitions as much as possible, so that Protocol definitions stay minimal and contain only what distinguishes them from one another.

Protocol (like dat or hyper)

To make sure that protocols can be tested easily, the protocol definitions are reduced and isolated: they operate only on the context and the name as input.

As you may have noticed: all concepts are simplified for testability. Earlier versions needed complex setups to test even the simplest assumptions thoroughly.
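To make the separation more concrete, here is a sketch of what a protocol definition and the API function around it could look like. Everything in it is an assumption for illustration: `exampleProtocol`, the `examplekey=` record format, `context.txt` and the shape of the `resolve` options are made up and do not mirror the real hyper-dns code.

```javascript
// Hypothetical protocol definition: a plain async function of (context, name).
// `context.txt` is an assumed helper that returns the TXT records of a domain.
async function exampleProtocol(context, name) {
  const records = await context.txt(name);
  for (const record of records) {
    const match = /^examplekey=([0-9a-f]{64})$/i.exec(record);
    if (match) return { key: match[1].toLowerCase(), ttl: 3600 };
  }
  return null; // this protocol has no entry for the name
}

// The API function owns the wiring of cache and context, so the protocol
// itself never needs to know that either of them exists.
async function resolve(protocol, name, { cache, context }) {
  const cacheKey = `${protocol.name}:${name}`;
  if (cache.has(cacheKey)) return cache.get(cacheKey);
  const result = await protocol(context, name);
  cache.set(cacheKey, result);
  return result;
}
```

Testing the protocol then only requires a fake context object, for example `{ txt: async () => ["examplekey=" + "ab".repeat(32)] }`, without any cache or network setup.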

Performance considerations

All resolve APIs of hyper-dns can be cancelled and support timeouts. They use the AbortSignal API, which is also used by the browser's fetch operation. This saves resources when domains are looked up in quick succession, as may happen in a URL input bar.

The resolveURL and resolve APIs have two different goals: resolveURL looks for the first matching protocol, while resolve looks for all protocols. The cache system has been chosen to allow both operations to run at the best performance.
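Cancellation and timeouts can both be expressed with the standard `AbortController`. The helper below is a sketch with a hypothetical name (`withTimeout`); it is not part of hyper-dns, but it shows how a caller's signal and a timeout can be combined into one signal.

```javascript
// Sketch: derive a signal that aborts when the caller aborts OR after `ms`.
function withTimeout(parentSignal, ms) {
  const controller = new AbortController();
  const abort = () => controller.abort();
  const timer = setTimeout(abort, ms);
  if (parentSignal) {
    if (parentSignal.aborted) abort();
    else parentSignal.addEventListener("abort", abort, { once: true });
  }
  return {
    signal: controller.signal,
    // Call once the operation has finished to release timer and listener.
    done() {
      clearTimeout(timer);
      if (parentSignal) parentSignal.removeEventListener("abort", abort);
    },
  };
}
```

The resulting `signal` can be passed on to anything that accepts an AbortSignal, such as `fetch(url, { signal })`. In a URL input bar, each keystroke would abort the previous lookup before starting a new one.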
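The difference between the two goals can be sketched with two small functions. The names `resolveFirst` and `resolveAll` are hypothetical, and protocols are assumed to be async functions of (context, name) that return null when they have no entry.

```javascript
// Sketch: stop at the first protocol that has an entry for the name.
async function resolveFirst(protocols, name, context) {
  for (const protocol of protocols) {
    const result = await protocol(context, name);
    if (result !== null) return { protocol: protocol.name, ...result };
  }
  return null;
}

// Sketch: ask every protocol and collect the results by protocol name.
async function resolveAll(protocols, name, context) {
  const results = await Promise.all(
    protocols.map((protocol) => protocol(context, name))
  );
  return Object.fromEntries(
    protocols.map((protocol, index) => [protocol.name, results[index]])
  );
}
```

In the first case later protocols are never asked once a match is found; in the second case all protocols run concurrently, which is where sharing lookups through the context pays off.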

Footnotes

  1. An API user can use the context option to share contexts between requests.