Org Element Cache
The Org Element API allows parsing & manipulating org-mode files in the form of lisp objects.
A cache for these objects is already included in org-mode and disabled by default.
However, when working with large numbers (> 1000) of files, populating and processing this cache for each file takes a long time.
This package tries to alleviate this problem by:
- Allowing users to register hooks that compute some data from an org element
- Persisting the results on disk
- Taking care of keeping the cached data in sync with the actual buffer contents
In the current version, only file-level hooks are implemented. Future versions might include a way to register hooks for element types.
Structure of a Cache
A cache has four components:
- A hash table with a plist for each file managed by the cache
- A list of folders used to find files managed by the cache
- A list of hooks to generate cached data
- A file to persist the cache in
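As a rough illustration of these four components, a cache could be modelled as a struct like the one below. The struct name `my-cache-sketch` and its slot names are hypothetical and chosen only for this sketch; the package's actual representation may differ.

```elisp
(require 'cl-lib)

;; Hypothetical struct for illustration; not the package's real definition.
(cl-defstruct my-cache-sketch
  (table (make-hash-table :test 'equal)) ; filename -> plist of cached data
  (dirs nil)                             ; folders searched for files
  (hooks nil)                            ; alist of (property . function)
  (file nil))                            ; path used to persist the cache

(defvar my-sketch
  (make-my-cache-sketch :dirs '("~/org") :file "~/org/.cache.el"))

;; Store a plist for one file and read a property back.
(puthash "~/org/notes.org" '(:title "Notes")
         (my-cache-sketch-table my-sketch))
(plist-get (gethash "~/org/notes.org" (my-cache-sketch-table my-sketch))
           :title) ; => "Notes"
```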
Populating the Cache
A cache is populated with entries by opening each file in a temporary buffer, parsing it to an org-element and passing this element to each of the hooks.
Doing this for the first time can take a few seconds, depending on the number of files in the cache.
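The steps above can be sketched roughly as follows. The helper names `my/parse-org-file` and `my/run-hooks` are made up for this example, and representing hooks as an alist of (PROPERTY . FUNCTION) is an assumption; only `org-element-parse-buffer` and the other built-in calls come from Emacs and Org themselves.

```elisp
(require 'org)
(require 'org-element)

(defun my/parse-org-file (filename)
  "Parse FILENAME in a temporary buffer and return its org-element tree."
  (with-temp-buffer
    (insert-file-contents filename)
    ;; Enable org-mode without running user hooks, so the parser
    ;; sees an Org buffer.
    (delay-mode-hooks (org-mode))
    (org-element-parse-buffer)))

(defun my/run-hooks (filename hooks)
  "Return a plist built by calling each function in HOOKS.
HOOKS is an alist of (PROPERTY . FUNCTION); each FUNCTION is called
with FILENAME and the parsed element."
  (let ((element (my/parse-org-file filename))
        (entry nil))
    (dolist (hook hooks entry)
      (setq entry (plist-put entry (car hook)
                             (funcall (cdr hook) filename element))))))
```

The file is parsed once and the same element is handed to every hook, which is where most of the time goes when a cache is first populated.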
Once the cache has been saved to disk for the first time, a combination of shell commands (including sha1sum) is used to determine which entries need to be updated.
Assuming you’re only using one Emacs instance at a time, updating the cache on startup should take only a few milliseconds.
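To illustrate the idea of hash-based change detection, here is a minimal sketch that stays inside Emacs and uses the built-in `secure-hash` in place of the shell commands the package relies on. The function names and the shape of the stored hashes (a hash table from filename to digest) are assumptions for this example.

```elisp
(defun my/file-sha1 (filename)
  "Return the SHA-1 hex digest of FILENAME's contents."
  (with-temp-buffer
    ;; Work on raw bytes so no character decoding is involved.
    (set-buffer-multibyte nil)
    (insert-file-contents-literally filename)
    (secure-hash 'sha1 (current-buffer))))

(defun my/outdated-files (files known-hashes)
  "Return the members of FILES whose hash differs from KNOWN-HASHES.
KNOWN-HASHES is a hash table mapping filenames to stored digests;
files without a stored digest count as outdated."
  (let (outdated)
    (dolist (file files (nreverse outdated))
      (unless (equal (my/file-sha1 file) (gethash file known-hashes))
        (push file outdated)))))
```

Only the files returned by `my/outdated-files` would need to be parsed again, which is why an update is so much cheaper than the initial population.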
The def-org-el-cache macro can be used to define a new cache.
    (def-org-el-cache my-cache    ;; name of the cache / variable to store it in
      (list "~/org")              ;; directories managed by this cache
      "~/org/.cache.el")          ;; file to persist the cache in
Hooks can be added to a cache with org-el-cache-add-hook:
    (org-el-cache-add-hook my-cache    ;; cache to add the hook to
      :my-property                     ;; property name to use for this hook
      (lambda (filename element)
        ;; Do something with the element and return some value
        ))
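As an example of what a hook body might look like, the sketch below extracts a file's #+TITLE keyword from the parsed element. The function name `my/title-hook` and the fallback to the file's base name are choices made for this example. The registration line in the trailing comment assumes that a function object can be passed where the lambda appears above, which has not been checked against the package.

```elisp
(require 'org)
(require 'org-element)

(defun my/title-hook (filename element)
  "Return the #+TITLE of ELEMENT, falling back to FILENAME's base name."
  (or (org-element-map element 'keyword
        (lambda (kw)
          (when (string= (org-element-property :key kw) "TITLE")
            (org-element-property :value kw)))
        nil t) ; FIRST-MATCH: stop at the first non-nil result
      (file-name-base filename)))

;; Assumed registration, mirroring the call shape shown above:
;; (org-el-cache-add-hook my-cache :title #'my/title-hook)
```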
- (org-el-cache-update cache) updates all outdated entries in the cache
- (org-el-cache-force-update cache) re-initializes the cache
If you changed the definition of a hook or added a new one, call org-el-cache-force-update to re-initialize the cache.
Accessing Cached Data
- (org-el-cache-get cache filename) returns the cache entry for a file
- (org-el-cache-file-property cache filename prop) returns the value of a file's property
Some functions, such as org-el-cache-map, work on all entries of a cache.
For more information on them, refer to their documentation (e.g. C-h f org-el-cache-map).
All caches are saved to disk at regular intervals (configurable via org-el-cache-persist-interval, defaults to 10 minutes).
Usually, there is no need to manually persist or load a cache. If you want to do so anyway, you can use the following functions:
- (org-el-cache-persist-all) persists all caches
- (org-el-cache-persist cache) persists a single cache
- (org-el-cache-load cache) loads a cache from disk
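Persisting and loading can be pictured as a print / read round trip of the cache's hash table. The sketch below shows that idea with hypothetical helpers `my/persist-table` and `my/load-table`; it is not the package's own code, and the on-disk format the package uses may differ.

```elisp
(defun my/persist-table (table file)
  "Write hash table TABLE to FILE in a form that `read' can parse."
  (with-temp-file file
    ;; Make sure long or deeply nested entries are not abbreviated.
    (let ((print-length nil)
          (print-level nil))
      (prin1 table (current-buffer)))))

(defun my/load-table (file)
  "Read a hash table previously written by `my/persist-table' from FILE."
  (with-temp-buffer
    (insert-file-contents file)
    (read (current-buffer))))
```

Emacs prints hash tables in a readable `#s(hash-table ...)` syntax, so entries made of strings, symbols, numbers and lists survive the round trip unchanged.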
examples/file-selector.el contains the code for a file-selector (similar to Emacs’ find-file) that uses the titles of files instead of their paths.
For more information, you can check out the integration tests in org-el-cache-test.el.
Prior / Similar Art
Cached data is limited to objects that can be printed / read (serialized / deserialized).
This means that there is no (elegant) way to cache markers in files, e.g. when using cached headline data for org-agenda views.
A possible workaround would be attaching IDs to every headline, then using this ID (instead of a marker) to jump to a headline e.g. to change its TODO state.
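The limitation can be checked with a small helper that tests whether a value survives a print / read round trip. The name `my/readable-p` is made up for this sketch.

```elisp
(defun my/readable-p (object)
  "Return non-nil if OBJECT survives a `prin1' / `read' round trip."
  (condition-case nil
      (equal object (read (prin1-to-string object)))
    ;; Objects printed as #<...> have no read syntax, so `read'
    ;; signals an error for them.
    (error nil)))

(with-temp-buffer
  (insert "* TODO Example\n")
  (list (my/readable-p (point-marker))        ; marker: not readable
        (my/readable-p '(:id "some-id"))))    ; plist holding an ID: readable
```

A string ID passes this check while a marker does not, which is why storing IDs is the more practical option for cached headline data.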
org-el-cache uses a number of shell commands to find files that need to be updated.
These should be installed by default on most Linux / Unix distros.
The hash of a file or buffer is used to determine if it has changed since it was last processed by the cache.
When updating a single file, (buffer-hash) is fast enough.
To speed up recursively searching directories for .org files and calculating their hashes, shell commands such as sha1sum are used instead.
Working on a collection of ~1500 files with ~200k lines in total, initializing the cache takes around 36s on my machine (ThinkPad L470, SSD).
Updating the cache once it has been initialized / loaded from disk takes around 200ms.
Org-mode already includes a cache for parsed elements. This is disabled by default since there seem to be problems with the implementation that cause Emacs to hang after making changes to org files.
In the long term, it would be nice to reuse as much of the existing code as possible and figure out where the bugs are.
[2020-02-08 Sat 12:50]
defmethod, the function help looks nicer this way and we don’t need overloaded methods