Feature/sorted sets #523

Closed
wants to merge 62 commits into from
Commits
cf2f43c
add persistent-sorted-set dependency
zonotope Jun 22, 2023
dbff487
remove unused fns
zonotope Jun 22, 2023
116569a
use subrange fn defined in flake ns
zonotope Jun 22, 2023
3382a18
remove unused fns and references to network attribute
zonotope Jun 22, 2023
871a4c3
ledger-id -> ledger-alias
zonotope Jun 22, 2023
0421797
remove references to network/ledger-id in storage
zonotope Jun 22, 2023
d04bef9
fluree.db.storage.core -> fluree.db.storage
zonotope Jun 22, 2023
7ea67f5
add accessors and a comparator for child entries
zonotope Jun 24, 2023
395383d
add a minimum flake value
zonotope Jun 25, 2023
231322e
use slice instead of subrange
zonotope Jun 26, 2023
e2868a8
fix recursive invocation arity
zonotope Jun 26, 2023
921657e
use persistent-sorted-set instead of clojure.data.avl for flake sets
zonotope Jun 26, 2023
77ca187
don't use apply when creating sorted sets and maps
zonotope Jun 26, 2023
1e77e31
ensure node children remains a sorted map; use map entries in slice fns
zonotope Jun 26, 2023
2fab4e6
remove unused functions
zonotope Jun 27, 2023
996de33
use nil for indeterminate flake boundaries
zonotope Jun 27, 2023
d0e37d4
remove unnecessary start/end test opts
zonotope Jun 28, 2023
14e7cf6
use sorted sets for node children; remove now unnecessary sorted map
zonotope Jun 28, 2023
0398837
remove unused clojure.data.avl dependency
zonotope Jun 28, 2023
3ff5aab
remove unused namespace
zonotope Jun 28, 2023
410707c
fix typo
zonotope Jun 28, 2023
e7bae2c
Merge remote-tracking branch 'origin/main' into feature/sorted-sets
zonotope Jul 3, 2023
2bd331c
change node comparator to consider entire interval of flakes
zonotope Jul 3, 2023
d51ac0f
add an rslice fn; fix max/min flakes' metadata components as integer
zonotope Jul 3, 2023
0bef8ab
add endpoints to tree-chan api to filter out irrelevant children
zonotope Jul 3, 2023
a5c5908
remove unused flake ranking fns
zonotope Jul 3, 2023
551067b
remove unused resolved-leaf? fn
zonotope Jul 3, 2023
aebfb93
add docstring to node comparator
zonotope Jul 3, 2023
296193d
remove include? fn in favor of passed in transducer in tree-chan
zonotope Jul 3, 2023
0410d48
iterate through flakes only once by combining transducers
zonotope Jul 3, 2023
fc1cf27
use default arity for tree-chan when refreshing index
zonotope Jul 3, 2023
4e83877
assume collection argument to sorted-set-by is already sorted
zonotope Jul 4, 2023
0c4d518
also filter out leaves that haven't been resolved
zonotope Jul 4, 2023
e3c5ad6
move test logging configuration to test-resources
zonotope Jul 5, 2023
7c0f7c4
trim flakes and branches between start/end flakes in tree-chan
zonotope Jul 5, 2023
c8e5710
remove redundant `:leftmost?` index node attribute
zonotope Jul 5, 2023
ac06a03
add subrange utility fn that returns a flakeset instead of slice
zonotope Jul 5, 2023
d0e948f
clean up trimming code for efficiency
zonotope Jul 5, 2023
1a003d9
add utility fns for if a node is largest/smallest of its siblings
zonotope Jul 5, 2023
f8b7bda
trim-right? should trim if rhs comes _after_ the end flake
zonotope Jul 5, 2023
ff502b9
remove now unused subrange fn
zonotope Jul 5, 2023
a3a4ad5
use info instead of warn because this is a normal operation
zonotope Jul 5, 2023
fa2a531
drop last flake when splitting leaves so siblings don't overlap
zonotope Jul 5, 2023
84cefc0
use nil instead of min/max flakes for indeterminate endpoints
zonotope Jul 5, 2023
0cb168d
Merge remote-tracking branch 'origin/main' into feature/sorted-sets
zonotope Jul 10, 2023
1639a06
formatting
zonotope Jul 10, 2023
d8d30e6
prioritize including n and xf over start/end flakes in tree-chan
zonotope Jul 10, 2023
a247bbe
pretty-print flakes for (some) repl formatters to work
zonotope Jul 11, 2023
4120855
resolve empty branches too
zonotope Jul 12, 2023
2d3f1ae
don't clobber :first attr when add/removing flakes to/from leaves
zonotope Jul 12, 2023
b728b63
Merge remote-tracking branch 'origin/main' into feature/sorted-sets
zonotope Jul 12, 2023
0d1f520
use pre-existing util/sequential fn
zonotope Jul 12, 2023
48044b7
Merge remote-tracking branch 'origin/main' into feature/sorted-sets
zonotope Jul 13, 2023
42e1709
unresolve child nodes before attaching them to branches
zonotope Jul 13, 2023
b41c839
unresolve the root node as well after indexing is complete
zonotope Jul 13, 2023
f99f883
use the original :first attribute when beginning to rebalance leaves
zonotope Jul 13, 2023
cbab0e9
add widely used ns to dev repl env
zonotope Jul 13, 2023
9390d6d
Merge remote-tracking branch 'origin/fix/id-map-resolution' into feat…
zonotope Jul 14, 2023
04584a2
Merge remote-tracking branch 'origin/main' into feature/sorted-sets
zonotope Jul 14, 2023
5ffb8c7
don't resolve empty nodes unless necessary
zonotope Jul 14, 2023
56f930b
Merge remote-tracking branch 'origin/main' into feature/sorted-sets
zonotope Jul 16, 2023
69255a9
add some docstrings
zonotope Jul 17, 2023
113 changes: 56 additions & 57 deletions deps.edn
@@ -1,49 +1,48 @@
{:deps {org.clojure/clojure {:mvn/version "1.11.1"}
org.clojure/clojurescript {:mvn/version "1.11.60"}
org.clojure/core.async {:mvn/version "1.6.673"}
org.clojure/core.cache {:mvn/version "1.0.225"}
org.clojars.mmb90/cljs-cache {:mvn/version "0.1.4"}
org.clojure/data.avl {:mvn/version "0.1.0"}
org.clojure/data.xml {:mvn/version "0.2.0-alpha8"}
environ/environ {:mvn/version "1.2.0"}
byte-streams/byte-streams {:mvn/version "0.2.4"}
cheshire/cheshire {:mvn/version "5.11.0"}
instaparse/instaparse {:mvn/version "1.4.12"}
metosin/malli {:mvn/version "0.11.0"}
com.fluree/json-ld {:git/url "https://github.com/fluree/json-ld.git"
:git/sha "2be635f2a30b08a5c100a35dd8afb20078d261ac"}


;; logging
org.clojure/tools.logging {:mvn/version "1.2.4"}
ch.qos.logback/logback-classic {:mvn/version "1.4.7"}
org.slf4j/slf4j-api {:mvn/version "2.0.7"}

;; Lucene
clucie/clucie {:mvn/version "0.4.2"}

;; http
http-kit/http-kit {:mvn/version "2.6.0"}
com.fluree/http.async.client {:mvn/version "1.3.1-25-0xae4f"}

;; benchmarking
criterium/criterium {:mvn/version "0.4.6"}

;; serialization / compression
com.fluree/alphabase {:mvn/version "3.3.0"}

;; cryptography
com.fluree/crypto {:mvn/version "0.4.0"}

org.bouncycastle/bcprov-jdk15on {:mvn/version "1.70"}

;; smartfunctions
org.babashka/sci {:mvn/version "0.3.31"}

;; storage
com.cognitect.aws/api {:mvn/version "0.8.666"}
com.cognitect.aws/endpoints {:mvn/version "1.1.12.456"}
com.cognitect.aws/s3 {:mvn/version "847.2.1365.0"}}
{:deps {org.clojure/clojure {:mvn/version "1.11.1"}
org.clojure/clojurescript {:mvn/version "1.11.60"}
org.clojure/core.async {:mvn/version "1.6.673"}
org.clojure/core.cache {:mvn/version "1.0.225"}
org.clojars.mmb90/cljs-cache {:mvn/version "0.1.4"}
org.clojure/data.xml {:mvn/version "0.2.0-alpha8"}
environ/environ {:mvn/version "1.2.0"}
byte-streams/byte-streams {:mvn/version "0.2.4"}
cheshire/cheshire {:mvn/version "5.11.0"}
instaparse/instaparse {:mvn/version "1.4.12"}
metosin/malli {:mvn/version "0.11.0"}
com.fluree/json-ld {:git/url "https://github.com/fluree/json-ld.git"
:git/sha "2be635f2a30b08a5c100a35dd8afb20078d261ac"}
persistent-sorted-set/persistent-sorted-set {:mvn/version "0.2.3"}


;; logging
org.clojure/tools.logging {:mvn/version "1.2.4"}
ch.qos.logback/logback-classic {:mvn/version "1.4.7"}
org.slf4j/slf4j-api {:mvn/version "2.0.7"}

;; Lucene
clucie/clucie {:mvn/version "0.4.2"}

;; http
http-kit/http-kit {:mvn/version "2.6.0"}
com.fluree/http.async.client {:mvn/version "1.3.1-25-0xae4f"}

;; benchmarking
criterium/criterium {:mvn/version "0.4.6"}

;; serialization / compression
com.fluree/alphabase {:mvn/version "3.3.0"}

;; cryptography
com.fluree/crypto {:mvn/version "0.4.0"}
org.bouncycastle/bcprov-jdk15on {:mvn/version "1.70"}

;; smartfunctions
org.babashka/sci {:mvn/version "0.3.31"}

;; storage
com.cognitect.aws/api {:mvn/version "0.8.666"}
com.cognitect.aws/endpoints {:mvn/version "1.1.12.456"}
com.cognitect.aws/s3 {:mvn/version "847.2.1365.0"}}

:paths ["src" "resources"]

@@ -63,8 +62,8 @@

:cljtest
{:extra-paths ["test" "dev-resources" "test-resources"]
:extra-deps {lambdaisland/kaocha {:mvn/version "1.83.1314"}
org.clojure/test.check {:mvn/version "1.1.1"}
:extra-deps {lambdaisland/kaocha {:mvn/version "1.83.1314"}
org.clojure/test.check {:mvn/version "1.1.1"}
io.github.cap10morgan/test-with-files {:git/tag "v1.0.0"
:git/sha "9181a2e"}}
:exec-fn kaocha.runner/exec-fn
@@ -96,15 +95,15 @@
:main-opts ["-m" "cloverage.coverage" "-p" "src" "-s" "test" "--output" "scanning_results/coverage"]}

:eastwood
{:extra-deps {jonase/eastwood {:mvn/version "1.4.0"}}
:main-opts ["-m" "eastwood.lint"
{:source-paths ["src" "src-docs"]
:test-paths ["test"]
;; TODO: Un-exclude this when it stops triggering false
;; positives on "UnsupportedOperationException empty is
;; not supported on Flake" when using the #Flake data
;; reader - WSM 2023-02-01
:exclude-linters [:implicit-dependencies]}]}
{:extra-deps {jonase/eastwood {:mvn/version "1.4.0"}}
:main-opts ["-m" "eastwood.lint"
{:source-paths ["src" "src-docs"]
:test-paths ["test"]
;; TODO: Un-exclude this when it stops triggering false
;; positives on "UnsupportedOperationException empty is
;; not supported on Flake" when using the #Flake data
;; reader - WSM 2023-02-01
:exclude-linters [:implicit-dependencies]}]}

:ancient
{:extra-deps {com.github.liquidz/antq {:mvn/version "RELEASE"}}
1 change: 0 additions & 1 deletion dev/json_ld/shacl.clj
@@ -18,7 +18,6 @@
[fluree.db.util.log :as log]
[fluree.db.index :as index]
[criterium.core :as criterium]
[clojure.data.avl :as avl]
[clojure.tools.reader.edn :as edn]))


2 changes: 1 addition & 1 deletion dev/user.clj
@@ -10,7 +10,7 @@
[fluree.db.flake :as flake]
[fluree.db.util.json :as json]
[fluree.db.serde.json :as serdejson]
[fluree.db.storage.core :as storage]
[fluree.db.storage :as storage]
[fluree.db.query.fql :as fql]
[fluree.db.query.range :as query-range]
[fluree.db.dbproto :as dbproto]
4 changes: 2 additions & 2 deletions src/fluree/db/conn/file.cljc
@@ -10,7 +10,7 @@
[fluree.db.conn.cache :as conn-cache]
[fluree.db.conn.state-machine :as state-machine]
[fluree.db.util.log :as log :include-macros true]
[fluree.db.storage.core :as storage]
[fluree.db.storage :as storage]
[fluree.db.indexer.default :as idx-default]
[fluree.db.serde.json :refer [json-serde]]
#?@(:cljs [["fs" :as fs]
@@ -255,7 +255,7 @@
[conn {:keys [id leaf tempid] :as node}]
(let [cache-key [::resolve id tempid]]
(if (= :empty id)
(storage/resolve-empty-leaf node)
(storage/resolve-empty-node node)
(conn-cache/lru-lookup
lru-cache-atom
cache-key
4 changes: 2 additions & 2 deletions src/fluree/db/conn/ipfs.cljc
@@ -1,5 +1,5 @@
(ns fluree.db.conn.ipfs
(:require [fluree.db.storage.core :as storage]
(:require [fluree.db.storage :as storage]
[fluree.db.index :as index]
[fluree.db.util.context :as ctx-util]
[fluree.db.util.core :as util :refer [exception?]]
@@ -131,7 +131,7 @@
[conn {:keys [id leaf tempid] :as node}]
(let [cache-key [::resolve id tempid]]
(if (= :empty id)
(storage/resolve-empty-leaf node)
(storage/resolve-empty-node node)
(conn-cache/lru-lookup
lru-cache-atom
cache-key
4 changes: 2 additions & 2 deletions src/fluree/db/conn/memory.cljc
@@ -1,6 +1,6 @@
(ns fluree.db.conn.memory
(:require [clojure.core.async :as async :refer [go]]
[fluree.db.storage.core :as storage]
[fluree.db.storage :as storage]
[fluree.db.index :as index]
[fluree.db.util.context :as ctx-util]
[fluree.db.util.core :as util]
@@ -163,7 +163,7 @@
[_ node]
;; all root index nodes will be empty

(storage/resolve-empty-leaf node))
(storage/resolve-empty-node node))

#?@(:clj
[full-text/IndexConnection
4 changes: 2 additions & 2 deletions src/fluree/db/conn/s3.clj
@@ -12,7 +12,7 @@
[fluree.db.indexer.default :as idx-default]
[fluree.db.ledger.proto :as ledger-proto]
[fluree.db.serde.json :refer [json-serde]]
[fluree.db.storage.core :as storage]
[fluree.db.storage :as storage]
[fluree.db.util.context :as ctx-util]
[fluree.db.util.json :as json]
[fluree.db.util.log :as log]
@@ -224,7 +224,7 @@
(resolve [conn {:keys [id leaf tempid] :as node}]
(let [cache-key [::resolve id tempid]]
(if (= :empty id)
(storage/resolve-empty-leaf node)
(storage/resolve-empty-node node)
(conn-cache/lru-lookup lru-cache-atom cache-key
(fn [_]
(storage/resolve-index-node
12 changes: 6 additions & 6 deletions src/fluree/db/db/json_ld.cljc
@@ -313,11 +313,11 @@
opst-cmp :opst
tspo-cmp :tspo} index/default-comparators

spot (index/empty-branch method alias spot-cmp)
psot (index/empty-branch method alias psot-cmp)
post (index/empty-branch method alias post-cmp)
opst (index/empty-branch method alias opst-cmp)
tspo (index/empty-branch method alias tspo-cmp)
spot (index/empty-branch alias spot-cmp)
psot (index/empty-branch alias psot-cmp)
post (index/empty-branch alias post-cmp)
opst (index/empty-branch alias opst-cmp)
tspo (index/empty-branch alias tspo-cmp)
stats {:flakes 0, :size 0, :indexed 0}
schema (vocab/base-schema)
branch (branch/branch-meta ledger)
@@ -327,7 +327,7 @@
(map->JsonLdDb {:ledger ledger
:conn conn
:method method
:alias alias
:ledger-alias alias
:branch (:name branch)
:commit (:commit branch)
:t 0
117 changes: 22 additions & 95 deletions src/fluree/db/flake.cljc
@@ -1,6 +1,7 @@
(ns fluree.db.flake
(:refer-clojure :exclude [split-at sorted-set-by sorted-map-by take last])
(:require [clojure.data.avl :as avl]
(:require [me.tonsky.persistent-sorted-set :as pss]
[me.tonsky.persistent-sorted-set.arrays :as arrays]
[fluree.db.constants :as const]
[fluree.db.util.core :as util]
#?(:clj [clojure.pprint :as pprint]))
@@ -227,10 +228,6 @@
[flake]
[(s flake) (p flake) (o flake) (dt flake) (t flake) (op flake) (m flake)])

(def maximum
"The largest flake possible"
(->Flake util/max-long 0 util/max-long const/$xsd:decimal 0 true nil))

(defn- assoc-flake
"Assoc for Flakes"
[flake k v]
@@ -451,81 +448,30 @@
(defn slice
"From and to are Flakes"
[ss from to]
(cond
(and from to) (avl/subrange ss >= from <= to)
(nil? from) (avl/subrange ss <= to)
(nil? to) (avl/subrange ss >= from)
:else (throw (ex-info "Unexpected error performing slice, both from and to conditions are nil. Please report."
{:status 500
:error :db/unexpected-error}))))

(defn match-spot
"Returns all matching flakes to a specific subject, and optionaly also a predicate if provided
Must be provided with subject/predicate integer ids, no lookups are performed."
[ss sid pid]
(if pid
(avl/subrange ss >= (->Flake sid pid nil -1 nil nil nil)
<= (->Flake sid (inc pid) nil util/max-long nil nil nil))
(avl/subrange ss > (->Flake (inc sid) MAX-COLL-SUBJECTS nil nil nil nil nil)
< (->Flake (dec sid) -1 nil nil nil nil nil))))


(defn match-post
"Returns all matching flakes to a predicate + object match."
[ss pid o dt]
(avl/subrange ss
>= (->Flake util/max-long pid o dt nil nil nil)
<= (->Flake 0 pid o dt nil nil nil)))

(defn match-tspo
"Returns all matching flakes to a specific 't' value."
[ss t]
(avl/subrange ss
>= (->Flake util/max-long nil nil nil t nil nil)
<= (->Flake util/min-long nil nil nil t nil nil)))

(defn lookup
[ss start-flake end-flake]
(avl/subrange ss >= start-flake <= end-flake))

(defn subrange
([ss test flake]
(avl/subrange ss test flake))
([ss start-test start-flake end-test end-flake]
(avl/subrange ss start-test start-flake end-test end-flake)))


(defn split-at
[n ss]
(avl/split-at n ss))

(defn lower-than-all?
[f ss]
(let [[lower e _] (avl/split-key f ss)]
(and (nil? e)
(empty? lower))))

(defn higher-than-all?
[f ss]
(let [[_ e upper] (avl/split-key f ss)]
(and (nil? e)
(empty? upper))))

(defn split-by-flake
"Splits a sorted set at a given flake. If there is an exact match for flake,
puts it in the left-side. Primarily for use with last-flake."
[f ss]
(let [[l e r] (avl/split-key f ss)]
[(if e (conj l e) l) r]))
(pss/slice ss from to))

(defn rslice
[ss to from]
(pss/rslice ss to from))

(defn sorted-set-by
[comparator & flakes]
(apply avl/sorted-set-by comparator flakes))
"Create a new sorted set according to `comparator`. If a collection of flakes or
index nodes is supplied as the `elts` argument, the returned set will contain
all of the elements in the collection. If the `elts` collection is supplied,
it *must* already be sorted."
([comparator]
(pss/sorted-set-by comparator))
([comparator elts]
(->> elts
arrays/into-array
(pss/from-sorted-array comparator))))

(defn sorted-map-by
[comparator & entries]
(apply avl/sorted-map-by comparator entries))
(defn match-tspo
"Returns all matching flakes to a specific 't' value."
[ss t]
(pss/slice ss
(->Flake util/max-long nil nil nil t nil nil)
(->Flake util/min-long nil nil nil t nil nil)))

(defn transient-reduce
[reducer ss coll]
@@ -549,16 +495,6 @@
[ss to-remove]
(transient-reduce disj! ss to-remove))

(defn assoc-all
[sm entries]
(transient-reduce (fn [m [k v]]
(assoc! m k v))
sm entries))

(defn dissoc-all
[sm ks]
(transient-reduce dissoc! sm ks))

(defn last
"Returns the last item in `ss` in constant time as long as `ss` is a sorted
set."
@@ -617,12 +553,3 @@
(/ 1000)
(double)
(Math/round)))


(defn take
"Takes n flakes from a sorted flake set, retaining the set itself."
[n flake-set]
(if (>= n (count flake-set))
flake-set
(let [k (nth flake-set n)]
(first (avl/split-key k flake-set)))))
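
Note on the new `sorted-set-by` in `src/fluree/db/flake.cljc`: its docstring requires `elts` to be sorted already, since the collection is handed to `pss/from-sorted-array` as-is. The sketch below shows one way a caller holding unsorted input could satisfy that precondition. It assumes the `persistent-sorted-set` 0.2.3 dependency from `deps.edn` is on the classpath; `sorted-set-from` is a hypothetical helper for illustration and is not part of this PR.

```clojure
(require '[me.tonsky.persistent-sorted-set :as pss]
         '[me.tonsky.persistent-sorted-set.arrays :as arrays])

(defn sorted-set-from
  "Hypothetical helper: sorts `elts` with `cmp` before building the set,
  so the 'must already be sorted' precondition holds. Assumes `elts`
  contains no two elements that compare as equal under `cmp`."
  [cmp elts]
  (->> elts
       (sort cmp)            ; establish the order the constructor expects
       arrays/into-array     ; same conversion the PR's sorted-set-by uses
       (pss/from-sorted-array cmp)))

;; Plain integers stand in for flakes here.
(def s (sorted-set-from compare [5 1 4 2 3]))

;; Elements between the two boundaries, in comparator order.
(def middle (vec (pss/slice s 2 4)))
```

The three-argument `pss/slice` call is the same one the rewritten `slice` and `match-tspo` make; it replaces the explicit `>=` and `<=` tests that `avl/subrange` took.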