Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(archive): Moving waku archive logic from app.nim to the archive module #1817

Merged
merged 8 commits into from
Jun 27, 2023

Conversation

Ivansete-status
Copy link
Collaborator

@Ivansete-status Ivansete-status commented Jun 22, 2023

Description

This PR performs a refactoring to simplify the app.nim, and move all the waku archive logic into the archive.nim module.

The App object shouldn't directly know about the "archive" protocol or the "retention policy". IMHO, these concepts should be managed by the archive module.

This movement is interesting from the perspective of new incoming archive drivers.

This is how I see a natural chain of responsibility:

app -> node -> wakuArchive -> driver (sqlite now but postgres in the next PR to be merged)

@Ivansete-status Ivansete-status force-pushed the refactor-archive branch 4 times, most recently from cbab1f4 to 7a8d08c Compare June 22, 2023 12:02
@Ivansete-status Ivansete-status changed the title refactor(app/archive) Moving waku archive logic from app.nim to the archive module refactor(app/archive): Moving waku archive logic from app.nim to the archive module Jun 22, 2023
@Ivansete-status Ivansete-status changed the title refactor(app/archive): Moving waku archive logic from app.nim to the archive module refactor(archive) Moving waku archive logic from app.nim to the archive module Jun 22, 2023
@Ivansete-status Ivansete-status changed the title refactor(archive) Moving waku archive logic from app.nim to the archive module refactor(archive): Moving waku archive logic from app.nim to the archive module Jun 22, 2023
@Ivansete-status Ivansete-status force-pushed the refactor-archive branch 2 times, most recently from 5ecf5c0 to 8e04200 Compare June 22, 2023 13:04
@Ivansete-status Ivansete-status self-assigned this Jun 22, 2023
@Ivansete-status Ivansete-status marked this pull request as ready for review June 22, 2023 13:40
Copy link
Contributor

@SionoiS SionoiS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -467,7 +440,6 @@ procSuite "Waku Archive - find messages":
response.cursor.isNone()

test "handle temporal history query with a valid time window":
## Given
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why remove this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah yes, this is totally beyond the scope of this PR. I will set it back


discard setTimer(Moment.fromNow(interval), executeRetentionPolicy)
node.wakuArchive.startMessageRetentionPolicyPeriodicTask(30.minutes)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hardcoded 30m? Do we have an issue for making this configurable?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have an issue for that but I think is a good idea indeed :)

Copy link
Contributor

@jm-clius jm-clius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah! Apologies for not being able to approve this yet. :) Many of the changes seem reasonable, I mainly think my issue here is that the archive driver interface is designed to be generic and to leave the choice and set up of specific driver backends to the application. I agree that we can move the driver-specific setup code to not clutter app.nim, but I'd put them in a place close to the application (or at least not in the generic archive.nim - which should be seen as similar to an interface). I hope my comments below make sense, but happy to discuss in more detail if perspectives diverge a lot here. :)

Comment on lines 86 to 89
storeMessageDbUrl: string,
storeMessageDbVacuum: bool = false,
storeMessageDbMigration: bool = false,
storeMessageRetentionPolicy: string = "none"):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree that the Archive Driver should have any knowledge about underlying implementations. An Archive does not even have to be a DB at all - it could simply be an internal memory queue. The point of having a separate archive driver is exactly so that applications can set up and provide their own archive backend to the generic archive driver, with no information about application specific choices (do I have harddrives? do I have my own proprietary DB?) being passed down.

Comment on lines 510 to 513
storeMessageDbUrl: string,
storeMessageDbVacuum: bool = false,
storeMessageDbMigration: bool = false,
storeMessageRetentionPolicy: string = "none"):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The node does not receive a "DB" it receives an Archive as provided by the application - it should not have to have any knowledge about this external component. This may in fact not even be a DB, but simply a chunk of memory.


discard setTimer(Moment.fromNow(interval), executeRetentionPolicy)

proc setupWakuArchiveDriver(url: string, vacuum: bool, migrate: bool):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is application logic: decisions about how the configured url should be interpreted and which specific backend storage systems should be set up for the application lies with the application. When adding new drivers (e.g. someone now supports Codex distributed storage) the point of a driver interface is exactly that the backend operation would not have to be changed or even be aware of this fact. In this case, the application would have to request changes in the setup instructions which couples the driver interface with application-defined underlying implementations.

@jm-clius
Copy link
Contributor

The App object shouldn't directly know about the "archive" protocol or the "retention policy".

I disagree. I think only the App should know what e.g. "postgresql" means. Database technologies (and other technologies for storage) are not Waku protocols so a Waku p2p node should not know about the internals here. It simply provides a plugin interface and an application can then configure a service of its choice that works with the plugin.

or the "retention policy".

Mmm. There may indeed be some generic concepts here that can be further abstracted from the application. Thanks for pointing this out! I do think, however, that the "simplest" change here may be to keep the setup code here similarly close to the application and ensure that it works with postgresql as well. But happy to consider other ideas here. :)

app -> node -> wakuArchive -> driver

<-- I think this is true. What shouldn't be added here is that the archive is aware of different driver implementations, which moving the implementation-specific setup code into the archive interface would achieve. :)

@Ivansete-status
Copy link
Collaborator Author

I've re-submitted the PR seeking to follow your suggestions, @jm-clius :) Thanks for them!

Copy link
Contributor

@jm-clius jm-clius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I think this is a much better pattern than we had before. Thanks for your patience and taking the time to clean this up. Note my comments/nitpicks below. One comment is that this signifies a significant number of changes (also for SQLite DBs) which we'll only be able to fully test in a production environment. When rolling this out in a next release, I'd suggest (1) testing basic functions in collaboration with go-waku and (2) deploying only to a few production nodes at first.


discard setTimer(Moment.fromNow(interval), executeRetentionPolicy)
node.wakuArchive.startMessageRetentionPolicyPeriodicTask(30.minutes)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Better to extract this to a const just before the function than to make this a "magic value" inline with the code: easier to debug; clear where to add config if needed.

proc setupPeerStorage(): AppResult[Option[WakuPeerStorage]] =
let db = ?setupDatabaseConnection(PeerPersistenceDbUrl)
let db = ? SqliteDatabase.new("peers.db")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Better to extract this to a const just before the function than to make this a "magic value" inline with the code: easier to debug; clear where to add config if needed.

@@ -11,7 +11,7 @@ import
import
../../../waku_core,
../../common,
../../driver,
../../driver_base,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I wouldn't call this _base. We didn't previously follow this pattern (see e.g. "node" and "builder"). It's clear enough that the builder for the driver is distinct from the driver. I'd either go for driver.nim and driver_builder or, preferably, driver.nim and builder.nim as it's fully qualified in import path: driver/driver.nim and driver/builder.nim. The latter would be the established pattern.

../driver,
../retention_policy
../driver_base,
../retention_policy_base
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I wouldn't call this _base. See my comment elsewhere on the established pattern here.

@Ivansete-status Ivansete-status merged commit 52894a8 into master Jun 27, 2023
15 checks passed
@Ivansete-status Ivansete-status deleted the refactor-archive branch June 27, 2023 11:24
@jakubgs
Copy link
Contributor

jakubgs commented Jun 27, 2023

Looks like this resulted in all nodes on wakuv2.test restarting due to:

admin@node-01.do-ams3.wakuv2.test:~ % d run --rm -it statusteam/nim-waku:deploy-wakuv2-test --version
could not load: libpq.so(.5|)
(compile with -d:nimDebugDlOpen for more information)

So I've made a PR to sanity-check the Docker image startup:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants