Bijector is a Clojure library for defining and converting between datatypes in a bijective manner (meaning each element in one type has a corresponding element in the other type and vice versa). The desired constraint is that the bijections should be efficient, meaning:
- It should not be possible to input an instance of a type of a reasonable size and have the computer run out of memory because the corresponding instance of the other type is too large.
- It should not be possible to input an instance of a type of a reasonable size and have the computer never return because the conversion takes too long.
You might be able to rigorously define these things mathematically, but I haven't bothered to, mostly because I would then have to check that all the techniques used in this library qualify.
At the moment the library is not too user-friendly, and might never be if nobody is interested in it.
Just to be arbitrary, let's try converting between hex strings and triples of integers:
user=> (use 'bijector.core)
nil
user=> (def HEX-STRINGS (strings-with-chars "0123456789abcdef"))
#'user/HEX-STRINGS
user=> (def INT-TRIPLES (tuples-of 3 INTEGERS))
#'user/INT-TRIPLES
user=> (def-converters to-triples to-hex INT-TRIPLES HEX-STRINGS)
#'user/to-hex
user=> (to-hex [77 77 77])
"c2268cf0"
user=> (to-triples *1)
(77 77 77)
user=> (to-triples "10ff9303b5478713931b6fb4221fa3007c7da728")
(-566312798866180140005020149758 421504070 -109516)
user=> (to-hex *1)
"10ff9303b5478713931b6fb4221fa3007c7da728"
BOOLEANS
: finite type containingtrue
andfalse
NATURALS
: the numbers 1,2,3,...INTEGERS
: the numbers ...,-3,-2,-1,0,1,2,3,...NATURAL-LISTS
: lists ofNATURALS
, e.g.[]
,[1,3929]
, and[2,2,2,2,2,2]
NATURAL-SETS
: sets ofNATURALS
, e.g.#{}
,#{55 3824}
, and#{9 99 999 9999}
NATURAL-BAGS
: bags (i.e., multisets) ofNATURALS
, e.g.[]
,[55 3824]
, and[9 9 9 9999]
SIMPLE-ASCII
: strings containing any of the "normal" ASCII characters, e.g."thomas"
,"\n\r~!@#"
, and""
NESTED-NATURAL-LISTS
: nested lists ofNATURALS
, e.g.[]
,[1,2,[],3]
, and[[4 [2] [] 5]]
Built in families of types.
integer-range-type
: creates a finite type consisting of a range of integersnatural-tuples-type
: likeNATURAL-LISTS
, but of a fixed lengthstrings-with-chars
: given a collection of characters, returns a type of all strings composed of those characters
Functions that take one or more types as arguments and return a new type.
lists-of
: given a typet
, creates a new type consisting of lists of elements oft
sets-of
: given a typet
, creates a new type consisting of sets of elements oft
cartesian-product-type
: givenN
types, creates a new type consisting of lists of lengthN
where the first item in the list is an element of the first type, etc.union-type
: if you've read this far you probably know what this doestuples-of
: likelists-of
but with a given fixed lengthpairs-of
: convenience method that callstuples-of
with an argument of2
without
: given a type and some of its elements, returns a new type without those elementswith
: given a type and some elements not in the type, returns a new type like the old type but including the new elements
- What do I mean by all this?
- If an e.e. set A has a subset B that is also e.e., does that imply that A/B is e.e.? E.g., is the set of all ASCII strings that are not valid JSON e.e.?
- What about type intersections?
- e.e. := efficiently enumerable
- e.e.e. := elegantly efficiently enumerable -- at this level we would try to distinguish from the trivial injection-based bijections.
- A test for e.e. that I just thought of -- we could look at the limiting density of a given property as the length of representation increases. E.g., for the nested-lists, examine the average order of the nested lists.
- Enumerating elements of a regular language: could we use this algorithm to do it?
- Oh hey what about huffman codes? Given a set of character frequencies, could we build the huffman tree and then enumerate all strings in order of their encoded length?