Backend architecture: Datomic, datahike, OpenCrux, datalevin, Fluree #9
From Jeroen in the Slack:
|
I can only speak for Crux on these points... Pros:
Cons:
Hope that helps! Edit: this might be of interest: https://findka.com/blog/migrating-to-biff/ (Firebase-like stack on top of Crux) |
From Christopher Small, author of datsync
|
We've not looked at datsync in any detail but we have spent some time thinking about crux->datascript replication already: https://github.com/crux-labs/crux-datascript/blob/master/src/crux_datascript/core.clj |
Just to also chime in and add a few things that have not been said yet: Yes, Datahike is still very much compatible with DataScript and moreover we are aiming to port our query engine with durability back over to ClojureScript in our next release as well (after We also provide a Datomic-compatible core API that is used by our commercial clients, so if you decide to stick to the common subset, you will be able to swap Datahike in at any point. If you hit missing features or incompatibilities, please open an issue. We are currently working on our write throughput and I am confident that we can scale to Datomic-size deployments in principle; it was just a matter of priorities. We, the members of LambdaForge, are also big fans of the Zettelkasten method (even before we were aware of Roam) and use https://org-roam.readthedocs.io/en/latest/ at the moment. We would be super happy to see a reliable open source implementation like Athens succeed, so keep going 💯 ! I think ideally the backends should be exchangeable, so even if you decide on one, keep that in mind when you buy into its specific semantics. |
Although I don't consider myself an expert in databases, I guess one of the (future) advantages of Datahike would be that it could potentially enable "local first" as described here: https://www.inkandswitch.com/local-first.html. For me this would be great to have in a tool like Athens because you could easily edit offline on multiple machines, while having confidence that your edits could later on combine seamlessly. |
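To make "edits that later combine seamlessly" a bit more concrete, here is a minimal sketch of one simple merge strategy, last-writer-wins per block. It is not how Datahike or any backend named in this thread works; `BlockEdit`, its fields, and `merge` are hypothetical names chosen for illustration, and it assumes every edit carries a logical timestamp and a machine identifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockEdit:
    """One offline edit to a block; all field names are hypothetical."""
    block_id: str
    text: str
    timestamp: int  # logical clock value assigned by the editing machine
    replica: str    # machine identifier, used only to break timestamp ties


def merge(*replicas: dict[str, BlockEdit]) -> dict[str, BlockEdit]:
    """Merge per-block state from several machines; the latest edit wins."""
    merged: dict[str, BlockEdit] = {}
    for state in replicas:
        for block_id, edit in state.items():
            current = merged.get(block_id)
            # Compare (timestamp, replica) so every machine picks the same winner
            # regardless of the order in which it sees the other machines' states.
            if current is None or (edit.timestamp, edit.replica) > (
                current.timestamp,
                current.replica,
            ):
                merged[block_id] = edit
    return merged


# Two machines edited offline; "b1" was changed on both.
laptop = {
    "b1": BlockEdit("b1", "draft", 1, "laptop"),
    "b2": BlockEdit("b2", "notes", 3, "laptop"),
}
desktop = {"b1": BlockEdit("b1", "final", 2, "desktop")}

result = merge(laptop, desktop)
```

Last-writer-wins silently drops the losing edit, so a real local-first design would likely need something finer-grained for text inside a block; the sketch only shows that the merge result can be made independent of sync order.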
Thanks so much for sharing that link. Several engineers (including myself) are quite interested in local first applications. We've discussed databases like OrbitDB, Gun, and Scuttlebutt. Datahike is very interesting for this reason. |
@tangjeff0 no problem. I guess Datahike isn't quite there yet, but maybe @whilo can share something about whether Datahike would allow a local-first workflow in the future? |
Yes. Ever since our early work on http://replikativ.io/, which predated most of these other local-first approaches but did not attract a large community back then and did not have a nice programming model such as Datalog, we have wanted to be local-first. We aim to port Datahike back to ClojureScript in our next iteration. Do you think open-collective would work to fund this work? Any help would be appreciated, as we are currently still hammering out Datomic compatibility and some scalability issues in the JVM version. |
Will re-open after v1 is complete |
TL;DR; Do you plan to support block-level access control and notifications/subscriptions? If so, how do you plan to do this? Maybe the DB is a deal-breaker. Hi there. I've been discussing with @tangjeff0 a little bit on Twitter about your plans and how they could be linked with ours. I also had some experience working with heavily nested and linked content with my previous project www.collectiveone.org and I have a couple of comments regarding the DB and how to handle the multi-player case:
I did this in Postgres the last time I tried and relied a lot on algorithmic recursion, so I navigated the DB in many directions before determining what to do, or who to send a message to. This was too slow. I am not an expert in big data systems, so I really wonder how these problems should be actually handled. |
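As a rough illustration of the kind of traversal described above, the sketch below resolves who should be notified about a change to a block by walking up its ancestors and caching each result. It is only one possible approach under stated assumptions, not an answer to how large systems handle this: the `PARENT` and `SUBSCRIBERS` tables and the `subscribers_of` function are hypothetical, and the data lives in memory rather than in Postgres.

```python
from functools import lru_cache

# Hypothetical in-memory data; a real system would read these from the database.
PARENT = {"page": None, "block-a": "page", "block-b": "block-a"}
SUBSCRIBERS = {"page": {"alice"}, "block-a": {"bob"}}


@lru_cache(maxsize=None)
def subscribers_of(block_id: str) -> frozenset[str]:
    """Users to notify: direct subscribers plus those of every ancestor."""
    direct = frozenset(SUBSCRIBERS.get(block_id, ()))
    parent = PARENT.get(block_id)
    if parent is None:
        return direct
    # Each ancestor is resolved once and then served from the cache.
    return direct | subscribers_of(parent)


to_notify = subscribers_of("block-b")
```

The cache is only valid while the tree and the subscriptions stay unchanged; after a block is moved or a subscription is edited, `subscribers_of.cache_clear()` (or a finer-grained invalidation) would be needed.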
Another factor I'd like to point out is the conflict resolution story, whether it be distributed or centralized. |
Another option could be to use a Git repository as a backend. This would require creating a REST API on top of the Git repo to parse the documents, but it would result in greater compatibility with existing tools. One would easily be able to have the files locally, and even use other markdown editors or more advanced editors like Obsidian. And there is also a mobile app already available (GitJournal - I'm the author). This would result in a very different architecture though. I'm willing to help, if you want to go down this route. I would love more tools to be compatible with each other. |
You could also consider https://github.com/terminusdb/terminusdb-server |
I'd just like to add that for me, Athens being open-source is a significant advantage over Roam, and if Athens ends up requiring a closed-source backend to be most useful, that advantage would be diminished. Also it would be nice to abstract the backend-talking code to allow people to potentially run Athens on other backends, as long as they support some defined protocol. |
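A minimal sketch of what "abstracting the backend-talking code" behind a defined protocol could look like. Everything here is an assumption made for illustration: the `Backend` interface, its two methods, the `InMemoryBackend` stand-in, and the `"block/string"` attribute name are hypothetical and do not describe the API of Athens or of any database mentioned in this thread.

```python
from typing import Iterable, Protocol

Fact = tuple[str, str, object]  # (entity, attribute, value)


class Backend(Protocol):
    """Hypothetical minimal interface that a storage backend would implement."""

    def transact(self, facts: Iterable[Fact]) -> None: ...

    def lookup(self, entity: str, attribute: str) -> list[object]: ...


class InMemoryBackend:
    """Stand-in implementation; a real one would wrap an actual database."""

    def __init__(self) -> None:
        self._facts: list[Fact] = []

    def transact(self, facts: Iterable[Fact]) -> None:
        self._facts.extend(facts)

    def lookup(self, entity: str, attribute: str) -> list[object]:
        return [v for e, a, v in self._facts if e == entity and a == attribute]


def save_block(db: Backend, block_id: str, text: str) -> None:
    """Application code depends only on the Backend interface."""
    db.transact([(block_id, "block/string", text)])


db = InMemoryBackend()
save_block(db, "b1", "hello")
stored = db.lookup("b1", "block/string")
```

The smaller the shared interface, the easier it is to support several backends, at the cost of not being able to use each database's specific features.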
Protocol is always most ideal but hardest to pull off. Crux, Datomic, Datascript, Datahike will inevitably have some differences with each other. Agree that closed-source backend diminishes value. Inevitably parts of our infrastructure will be closed, but if there is a fully open-source full-stack solution for users to self-host, super great. Also just learned about https://github.com/fluree/ from Matei. Clojure, Web3, open-source. https://www.youtube.com/watch?v=uSum3uynHy4&feature=youtu.be |
Hi there! I'm glad to see some movement here 🙂 We have been working on an interface specification for our Athens-like app so that the backend is abstracted. We have also been working hard on a NodeJS + DGraph backend API that is AGPL-like open-sourced. I'd bet the interface supports (or will support) all the needs of Athens. Who knows! 💪. It includes backlinks and search features, granular access control (and thus multi-player), and fast data creation and fetching. Reusing our backend, or just the interface, will also provide interoperability among our apps. Users will be able to embed and edit blocks from Athens in Intercreativity, for example. They can also "fork" them as we want to support GIT-like flows with content. Oh, and eventually Athens could connect to other data storage solutions. We have prototypes for OrbitDB, Ethereum, Kusama, and IndexedDB (local). This is a recent demo of our latest milestone (a simple case where users mix private with public content). We are about to release a new version where users can explore a feed of blog posts. If you want to run it, this repo should run ok on Ubuntu or Mac. It is our latest development version. Oh, and this is our discord in case you want to reach us. 👋 |
The video @tangjeff0 mentioned above covers both the broad vision and technical details of Fluree better than I could, but here's my quick take: Fluree is an in-memory, semantic graph database backed by a permissioned blockchain, built with Clojure and open-source. It can be containerized (with Kubernetes support) and optionally decentralized (e.g. using StorJ via Tardigrade), run as a standalone JVM service, or embedded inside the browser as a web worker. Read more here about the query server (fluree-db) and the ledger server (fluree-ledger). Since Fluree extends RDF (the official W3C standard for data interchange), it immediately becomes interoperable with the linked open datasets on the semantic web. One interesting use case would be to directly query DBpedia or Wikidata from within Athens and combine it with your own data at runtime, without an API. Additionally, an RDF foundation means you can build ontologies with any of the modeling languages that build on top of it (RDFS, OWL, etc., which are the official recommendations of W3C), which opens up capabilities for inferencing and automated reasoning. From my view, Fluree could be a powerhouse tool to strongly differentiate Athens from Roam and every other "tool for networked thought." Between RDF standards and a permissioned blockchain (which allows for block/cell-level access control), you could seamlessly and securely deploy Athens at an individual, team, or enterprise level using the same scalable infrastructure. Would love to get the Fluree team's thoughts here... |
@quoll I would like to advocate for the adoption of https://github.com/threatgrid/asami but feel like it would better be left up to the expert :) Athens is currently Clojurescript/re-frame/datascript/posh (I'm working on sunsetting posh rn actually) What are your thoughts on whether Asami would be a good fit as a graph-DB for us? Selfishly, I will admit I would LOVE the excuse to combine our opensource powers to leverage the benefits of bi-directional knowledge linking, use Asami in the wild and possibly have the opportunity to work with you in a technical aspect to help implement it if we do end up going this way... and I don't think I'd be the only one! |
Love to help. I hope to have Asami 2.0-alpha out by the end of the week. This will have storage when on the JVM. JavaScript is coming, but in the meantime it will have save/load functions. |
I've looked a bit into Datahike. From what I learned it looks like:
As someone new to Clojure, this makes me less nervous about depending on a backend that has a Datomic-like API, and optimistic about Datahike, because it would still allow freedom in the choice of backing storage system. |
@mateicanavra laid out Fluree for us very well in his comment above. I will elaborate a little on some of the points made and bring up one additional one, which is one of the most powerful parts of Fluree. |
Noting this work-in-progress Datahike backend for the benefit of those following this issue: https://github.com/athensresearch/athens-backend Also, I recently pulled together a comparison matrix for various Clojure-Datalog stores: https://clojurelog.github.io/ |