Commit

Merge pull request #158 from replikativ/release-0.3.0
Release 0.3.0
kordano committed May 24, 2020
2 parents 499f151 + cc64244 commit 3c81b6f
Showing 35 changed files with 1,934 additions and 534 deletions.
2 changes: 2 additions & 0 deletions .gitignore
@@ -18,4 +18,6 @@ release.clj
.idea
*.iml
.vscode
.#*
/.gitold/COMMIT_EDITMSG
.cpcache
15 changes: 15 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,18 @@
# 0.3.0

- overhaul configuration while still supporting the old one
- support of environment variables for configuration
- added better default configuration
- adjust time points in history functions to match Datomic's API
- add load-entities capabilities
- add cas support for nil
- add support for non-date tx attributes
- add Java API
- add Java interop in queries
- add basic pagination
- add noHistory support
- multiple bugfixes including downstream dependencies

# 0.2.1

- add numbers type
98 changes: 53 additions & 45 deletions README.md
@@ -1,18 +1,29 @@
# datahike <a href="https://gitter.im/replikativ/replikativ?utm_source=badge&amp;utm_medium=badge&amp;utm_campaign=pr-badge&amp;utm_content=badge"><img src="https://camo.githubusercontent.com/da2edb525cde1455a622c58c0effc3a90b9a181c/68747470733a2f2f6261646765732e6769747465722e696d2f4a6f696e253230436861742e737667" alt="Gitter" data-canonical-src="https://badges.gitter.im/Join%20Chat.svg" style="max-width:100%;"></a> <a href="https://clojars.org/io.replikativ/datahike"> <img src="https://img.shields.io/clojars/v/io.replikativ/datahike.svg" /></a> [![CircleCI](https://circleci.com/gh/replikativ/datahike.svg?style=shield)](https://circleci.com/gh/replikativ/datahike)

datahike is a durable [datalog](https://en.wikipedia.org/wiki/Datalog) database
powered by an efficient datalog query engine. This project is a port of
[datascript](https://github.com/tonsky/datascript) to the
<h1 align="center">
Datahike
</h1>
<p align="center">
<a href="https://clojurians.slack.com/archives/CB7GJAN0L"><img src="https://img.shields.io/badge/clojurians%20slack-join%20channel-blueviolet"/></a>
<a href="https://gitter.im/replikativ/replikativ?utm_source=badge&amp;utm_medium=badge&amp;utm_campaign=pr-badge&amp;utm_content=badge"><img src="https://camo.githubusercontent.com/da2edb525cde1455a622c58c0effc3a90b9a181c/68747470733a2f2f6261646765732e6769747465722e696d2f4a6f696e253230436861742e737667" alt="Gitter" data-canonical-src="https://badges.gitter.im/Join%20Chat.svg" style="max-width:100%;"></a>
<a href="https://clojars.org/io.replikativ/datahike"> <img src="https://img.shields.io/clojars/v/io.replikativ/datahike.svg" /></a>
<a href="https://circleci.com/gh/replikativ/datahike"><img src="https://circleci.com/gh/replikativ/datahike.svg?style=shield"/></a>
<a href="https://github.com/replikativ/datahike/tree/development"><img src="https://img.shields.io/github/last-commit/replikativ/datahike/development"/></a>
</p>

Datahike is a durable [Datalog](https://en.wikipedia.org/wiki/Datalog) database
powered by an efficient Datalog query engine. This project started as a port of
[DataScript](https://github.com/tonsky/DataScript) to the
[hitchhiker-tree](https://github.com/datacrypt-project/hitchhiker-tree). All
datascript tests are passing, but we are still working on the internals. Having
said this we consider datahike usable for small projects, since datascript is
DataScript tests are passing, but we are still working on the internals. Having
said this, we consider Datahike usable for medium-sized projects, since DataScript is
very mature and deployed in many applications and the hitchhiker-tree
implementation is at least heavily tested through generative testing. We are
implementation is heavily tested through generative testing. We are
building on the two projects and the storage backends for the hitchhiker-tree
through [konserve](https://github.com/replikativ/konserve). We would like to
hear experience reports and are happy if you join us.

Some presentations are available:
You may find articles on Datahike on our company's [blog page](https://lambdaforge.io/articles).

We have also presented Datahike at meetups, for example at:

- [2019 scicloj online meetup](https://www.youtube.com/watch?v=Hjo4TEV81sQ).
- [2019 Vancouver Meetup](https://www.youtube.com/watch?v=A2CZwOHOb6U).
@@ -33,15 +44,17 @@ stable on-disk schema. _Take a look at the ChangeLog before upgrading_.


;; use the filesystem as storage medium
(def uri "datahike:file:///tmp/example")
(def cfg {:store {:backend :file :path "/tmp/example"}})

;; create a database at this place, by default configuration we have a strict
;; schema and temporal index
(d/create-database uri)
;; create a database at this place; with the default configuration we enforce a strict
;; schema and keep all historical data
(d/create-database cfg)

(def conn (d/connect uri))
(def conn (d/connect cfg))

;; the first transaction will be the schema we are using
;; you may also add this within database creation by adding :initial-tx
;; to the configuration
(d/transact conn [{:db/ident :name
:db/valueType :db.type/string
:db/cardinality :db.cardinality/one }
@@ -83,15 +96,15 @@ stable on-disk schema. _Take a look at the ChangeLog before upgrading_.
(d/history @conn))
;; => #{[20] [25]}

;; you might need to release the connection, e.g. for leveldb
;; you might need to release the connection for specific stores such as LevelDB
(d/release conn)

;; clean up the database if it is not needed anymore
(d/delete-database uri)
(d/delete-database cfg)
```
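As the quickstart's comment notes, the schema transaction can also be supplied at database creation time via `:initial-tx`. A minimal sketch (the path `/tmp/example-initial` is a hypothetical example, not from the original):

```clojure
(def schema [{:db/ident       :name
              :db/valueType   :db.type/string
              :db/cardinality :db.cardinality/one}])

;; pass the schema transaction as part of the configuration,
;; so it is transacted automatically on creation
(def cfg {:store      {:backend :file :path "/tmp/example-initial"}
          :initial-tx schema})

(d/create-database cfg)
(def conn (d/connect cfg))
```

This saves the separate `(d/transact conn schema)` step shown above.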

The API namespace provides compatibility to a subset of Datomic functionality
and should work as a drop-in replacement on the JVM. The rest of datahike will
and should work as a drop-in replacement on the JVM. The rest of Datahike will
be ported to core.async to coordinate IO in a platform-neutral manner.

Refer to the docs for more information:
@@ -110,25 +123,25 @@ For simple examples have a look at the projects in the `examples` folder.
demonstrated at the [Dutch Clojure
Meetup](https://www.meetup.com/de-DE/The-Dutch-Clojure-Meetup/events/trmqnpyxjbrb/).

## Relationship to Datomic and datascript
## Relationship to Datomic and DataScript

datahike provides similar functionality to [Datomic](http://Datomic.com) and can
be used as a drop-in replacement for a subset of it. The goal of datahike is not
Datahike provides similar functionality to [Datomic](http://Datomic.com) and can
be used as a drop-in replacement for a subset of it. The goal of Datahike is not
to provide an open-source reimplementation of Datomic, but it is part of the
[replikativ](https://github.com/replikativ) toolbox aimed to build distributed
data management solutions. We have spoken to many backend engineers and Clojure
developers, who tried to stay away from Datomic just because of its proprietary
nature and we think in this regard datahike should make an approach to Datomic
easier and vice-versa people who only want to use the goodness of datalog in
nature, and we think that in this regard Datahike should make an approach to
Datomic easier; vice versa, people who only want to use the goodness of Datalog in
small-scale applications should not have to worry about setting up and depending on
Datomic.

Some differences are:

- datahike runs locally on one peer. A transactor might be provided in the
- Datahike runs locally on one peer. A transactor might be provided in the
future and can also be realized through any linearizing write mechanism, e.g.
Apache Kafka. If you are interested, please contact us.
- datahike provides the database as a transparent value, i.e. you can directly
- Datahike provides the database as a transparent value, i.e. you can directly
access the index datastructures (hitchhiker-tree) and leverage their
persistent nature for replication. These internals are not guaranteed to stay
stable, but provide useful insight into what is going on and can be optimized.
@@ -139,45 +152,40 @@ Datomic is a full-fledged scalable database (as a service) built from the
authors of Clojure and people with a lot of experience. If you need this kind
of professional support, you should definitely stick to Datomic.

datahike's query engine and most of its codebase come from
[datascript](https://github.com/tonsky/datascript). Without the work on
datascript, datahike would not have been possible. Differences to Datomic with
Datahike's query engine and most of its codebase come from
[DataScript](https://github.com/tonsky/DataScript). Without the work on
DataScript, Datahike would not have been possible. Differences to Datomic with
respect to the query engine are documented there.

## When should I pick what?

### datahike
### Datahike

Pick datahike if your app has modest requirements towards a typical durable
Pick Datahike if your app has modest requirements towards a typical durable
database, e.g. a single machine and a few millions of entities at maximum.
Similarly if you want to have an open-source solution and be able to study and
tinker with the codebase of your database, datahike provides a comparatively
tinker with the codebase of your database, Datahike provides a comparatively
small and well composed codebase to tweak it to your needs. You should also
always be able to migrate to Datomic later easily.

### Datomic

Pick Datomic if you already know that you will need scalability later or if you
need a network API for your database. There is also plenty of material about
Datomic online already. Most of it applies in some form or another to datahike,
Datomic online already. Most of it applies in some form or another to Datahike,
but it might be easier to use Datomic directly when you first learn Datalog.

### datascript
### DataScript

Pick datascript if you want the fastest possible query performance and do not
Pick DataScript if you want the fastest possible query performance and do not
have a huge amount of data. You can easily persist the write operations
separately and use the fast in-memory index datastructure of datascript then.
datahike also at the moment does not support ClojureScript anymore, although we
separately and then use DataScript's fast in-memory index data structure.
Datahike currently does not support ClojureScript anymore, although we
plan to recover this functionality.

In general all [DataScript
documentation](https://github.com/tonsky/datascript/wiki/Getting-started)
applies to namespaces beyond `datahike.api`. We are working towards a portable
version of Datahike on top of `core.async`. Feel free to provide some help :).

## ClojureScript support

ClojureScript support is planned. Please see [Roadmap](https://github.com/replikativ/datahike#roadmap).
ClojureScript support is planned and work in progress. Please see [Roadmap](https://github.com/replikativ/datahike#roadmap).

## Migration & Backup

@@ -188,10 +196,10 @@ The database can be exported to a flat file with:
(export-db @conn "/tmp/eavt-dump")
```

You must do so before upgrading to a datahike version that has changed the
You must do so before upgrading to a Datahike version that has changed the
on-disk format. This can happen until we arrive at version `1.0.0`
and will always be communicated through the Changelog. After you have bumped the
datahike version you can use
Datahike version you can use

```clojure
;; ... setup new-conn (recreate with correct schema)
```

@@ -239,7 +247,7 @@ Have a look at the [change log](./CHANGELOG.md) for recent updates.
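The truncated snippet presumably mirrors the export step. A hedged sketch, assuming `datahike.migrate/import-db` as the counterpart of `export-db` and a hypothetical `new-conn` created against the upgraded version:

```clojure
(require '[datahike.migrate :refer [export-db import-db]])

;; dump the old database to a flat file before upgrading
(export-db @conn "/tmp/eavt-dump")

;; ... bump the Datahike version, recreate the database as new-conn, then:
(import-db new-conn "/tmp/eavt-dump")
```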

- support GC or eager deletion of fragments
- use hitchhiker-tree synchronization for replication
- run comprehensive query suite and compare to datascript and Datomic
- run comprehensive query suite and compare to DataScript and Datomic
- support anomaly errors (?)

### 1.0.0
Expand All @@ -256,6 +264,6 @@ feature, please let us know.

## License

Copyright © 2014–2019 Konrad Kühne, Christian Weilbach, Nikita Prokopov
Copyright © 2014–2020 Konrad Kühne, Christian Weilbach, Nikita Prokopov

Licensed under Eclipse Public License (see [LICENSE](LICENSE)).
47 changes: 29 additions & 18 deletions dev/sandbox.clj
Expand Up @@ -3,30 +3,41 @@

(comment

(def uri "datahike:mem://sandbox")

(d/delete-database uri)

(def schema [{:db/ident :name
(def schema [{:db/ident :name
:db/cardinality :db.cardinality/one
:db/index true
:db/unique :db.unique/identity
:db/valueType :db.type/string}
{:db/ident :sibling
:db/index true
:db/unique :db.unique/identity
:db/valueType :db.type/string}
{:db/ident :sibling
:db/cardinality :db.cardinality/many
:db/valueType :db.type/ref}
{:db/ident :age
:db/valueType :db.type/ref}
{:db/ident :age
:db/cardinality :db.cardinality/one
:db/valueType :db.type/long}])
:db/valueType :db.type/long}])

(def cfg {:store {:backend :mem :id "sandbox"}
:keep-history? true
:schema-flexibility :write
:initial-tx schema})

(d/delete-database cfg)

(d/create-database uri :initial-tx schema)
(d/create-database cfg)

(def conn (d/connect uri))
(def conn (d/connect cfg))

(def result (d/transact conn [{:name "Alice", :age 25}
{:name "Bob", :age 35}
{:name "Charlie", :age 45 :sibling [[:name "Alice"] [:name "Bob"]]}]))
(d/transact conn [{:name "Alice"
:age 25}
{:name "Bob"
:age 35}
{:name "Charlie"
:age 45
:sibling [[:name "Alice"] [:name "Bob"]]}])

(d/q '[:find ?e ?v ?t :where [?e :name ?v ?t]] @conn)
(d/q '[:find ?e ?a ?v ?t
:in $ ?a
:where [?e :name ?v ?t] [?e :age ?a]]
@conn
35)

)
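Since the sandbox configuration sets `:keep-history? true`, temporal queries are available there as well. A sketch, assuming the `d` alias and the connection set up above (not part of the original file):

```clojure
(comment

  ;; retract a value, then ask the historical database for every
  ;; value :age ever had for Bob, current or retracted
  (d/transact conn [[:db/retract [:name "Bob"] :age 35]])

  (d/q '[:find ?v
         :where
         [?e :name "Bob"]
         [?e :age ?v]]
       (d/history @conn))

  )
```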
17 changes: 13 additions & 4 deletions doc/backend-development.md
@@ -18,6 +18,9 @@ Here, we provide a basic template for a backend implementation. The bracketed te
- **indexID** should be a `keyword` identifying an index to be used as default for your backend. So far, you can choose between the following:
- `:datahike.index/hitchhiker-tree`
- `:datahike.index/persistent-set`
- **configSpec** is an optional `clojure.spec` definition for configuration validation

You may add any configuration attributes to the store configuration. Only `:backend`, which refers to **backendID**, is mandatory.

In your *core.clj*:
```clojure
@@ -27,23 +30,29 @@
))


(defmethod s/empty-store {backendID} [{:keys [path]}]
(defmethod s/empty-store {backendID} [config]
;; your implementation
)

(defmethod s/delete-store {backendID} [{:keys [path]}]
(defmethod s/delete-store {backendID} [config]
;; your implementation
)

(defmethod connect-store {backendID} [{:keys [path]}]
(defmethod connect-store {backendID} [config]
;; your implementation
)

(defmethod release-store {backendID} [_ store]
(defmethod release-store {backendID} [config store]
;; your implementation
)

(defmethod scheme->index {backendID} [_]
{indexID}
)

(defmethod default-config {backendID} [config]
;; your implementation for default values, e.g. from environment variables or best practices
)

(defmethod config-spec {backendID} [_] {configSpec})
```
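As a purely hypothetical illustration of the template, a trivial in-memory backend (backend id `:dummy`, default index `:datahike.index/hitchhiker-tree`) could be sketched roughly as follows, assuming konserve's in-memory store; the namespace, multimethod names, and ids here are illustrative, not part of Datahike:

```clojure
(ns datahike-dummy.core
  (:require [datahike.store :as s]
            [konserve.memory :as mem]
            [clojure.core.async :refer [<!!]]))

(defmethod s/empty-store :dummy [config]
  ;; create a fresh konserve store for this configuration
  (<!! (mem/new-mem-store)))

(defmethod s/delete-store :dummy [config]
  ;; nothing to clean up for a purely in-memory store
  nil)

(defmethod s/connect-store :dummy [config]
  (<!! (mem/new-mem-store)))

(defmethod s/release-store :dummy [config store]
  ;; no file handles or sockets to release
  nil)

(defmethod s/scheme->index :dummy [_]
  :datahike.index/hitchhiker-tree)

(defmethod s/default-config :dummy [config]
  ;; fill in defaults; here we only ensure the backend key is present
  (merge {:backend :dummy} config))

(defmethod s/config-spec :dummy [_]
  ;; no extra validation for this sketch
  nil)
```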
