Binary search trees
This little repo is part of an ongoing project to compare how binary search trees are implemented in various languages. The goal is twofold:
- Acquire an intuitive sense for how binary trees work.
- Investigate a range of engineering techniques for implementing binary trees.
In the unlikely event that you have stumbled into this repo, please note that all the code, exercises, problems and the like are quite likely to be wrong. None of this material is vetted. I find (and correct) mistakes and falsehoods regularly. If you use it for homework or interview questions, you are quite likely to be wrong.
- Implement diameter algorithm.
- Handle nodes with duplicate values. Currently, inserting a duplicate node causes a stack overflow. I hope this can be done elegantly, but I suspect it will look ugly.
- Develop some sort of plan for reducing tests to a minimal size. There should be some analysis or proof technique for determining the necessary and sufficient conditions for testing recursive code. I don't mind overtesting while learning; it helps build intuitive understanding. But writing too many tests takes time I'd rather spend learning more theory or working on actual implementation.
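For the duplicate-value item above, one possible approach is to keep a counter on each node instead of adding a second node for the same key. The following is a minimal Python sketch under that assumption, not the code in this repo; the `count` field and the function names are hypothetical.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.count = 1      # hypothetical field: how many times key was inserted
        self.left = None
        self.right = None


def insert(node, key):
    """Insert key into the subtree rooted at node; return the subtree root."""
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    elif key > node.key:
        node.right = insert(node.right, key)
    else:
        # Duplicate key: bump the counter and stop recursing.
        node.count += 1
    return node


def collect(node, out):
    """Append keys in sorted order to out, repeating duplicates."""
    if node is not None:
        collect(node.left, out)
        out.extend([node.key] * node.count)
        collect(node.right, out)
    return out
```

The equal-key branch terminates the recursion explicitly, so a duplicate can never descend further. Whether a counter looks elegant or ugly in the other languages is still an open question.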
Tree data structure API:
- insert: provisions the tree.
- search: returns a reference to a particular node in the tree.
- collect: writes the keys, in order, to some container or stream.
- is_present: determines whether a key exists in the tree.
- destroy (C/C++): cleans up memory.
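To make the API above concrete, here is a rough Python sketch of the four language-neutral operations. The class layout, the helper names with a leading underscore, and the choice to ignore duplicate keys are all assumptions made for illustration; the implementations in this repo may be organized differently.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


class BinarySearchTree:
    def __init__(self):
        self.root = None

    def insert(self, key):
        self.root = self._insert(self.root, key)

    def _insert(self, node, key):
        if node is None:
            return Node(key)
        if key < node.key:
            node.left = self._insert(node.left, key)
        elif key > node.key:
            node.right = self._insert(node.right, key)
        # Equal keys are silently ignored in this sketch.
        return node

    def search(self, key):
        """Return the node holding key, or None if it is absent."""
        node = self.root
        while node is not None and node.key != key:
            node = node.left if key < node.key else node.right
        return node

    def is_present(self, key):
        return self.search(key) is not None

    def collect(self, out):
        """Append keys to out (any object with append) in sorted order."""
        self._collect(self.root, out)
        return out

    def _collect(self, node, out):
        if node is not None:
            self._collect(node.left, out)
            out.append(node.key)
            self._collect(node.right, out)
```

A short usage example: after inserting 4, 2, 6, 1 and 3, `tree.collect([])` returns the keys in sorted order and `tree.is_present(9)` is false.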
Fun things to do
Here is a list of various exercises and questions pulled from books and web pages.
- Laakmann asks (Ex. 4.7, p. 86) how to find the first common ancestor of two nodes in a binary tree that is not necessarily a binary search tree. (The binary search tree case is probably much easier than the arbitrary binary tree case.)
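One well-known recursive approach to the first common ancestor exercise is sketched below in Python. It is not taken from the book, and it assumes both nodes really are in the tree and that nodes carry no parent pointers. The function and class names are hypothetical.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def first_common_ancestor(root, a, b):
    """Return the deepest node that has both a and b in its subtree.

    Assumes a and b are both present under root; nodes are compared
    by identity, so key ordering is irrelevant.
    """
    if root is None or root is a or root is b:
        return root
    left = first_common_ancestor(root.left, a, b)
    right = first_common_ancestor(root.right, a, b)
    if left is not None and right is not None:
        # One target on each side: root is where the paths split.
        return root
    return left if left is not None else right
```

Because the search never looks at key order, it has to visit both subtrees. In a binary search tree the keys tell you which side to descend, which is why that case is easier.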
The following tables show what has been finished, and what is planned for future implementation.
Binary trees lend themselves particularly well to recursive implementations for most algorithms.
This table was started after the beginning of the project, hence some entries simply show "Done" instead of the date completed. Each feature is regarded as complete when its associated test passes.
(Table generated by markdown table generator).
Note that all the implementations in the table are recursive. Each method could be written iteratively as well, which is a good exercise for the future.
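As a taste of that exercise, here is one way an in-order collect might look without recursion, sketched in Python. It replaces the call stack with an explicit list; the function name `collect_iterative` is hypothetical and not part of the repo's API.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def collect_iterative(root):
    """Return the keys in sorted (in-order) order without recursion."""
    out = []
    stack = []          # nodes whose left subtree is being visited
    node = root
    while stack or node is not None:
        # Walk as far left as possible, remembering the path.
        while node is not None:
            stack.append(node)
            node = node.left
        node = stack.pop()
        out.append(node.key)
        # Then visit the right subtree of the node just emitted.
        node = node.right
    return out
```

The explicit stack holds exactly the nodes that the recursive version would have suspended on its call stack.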
Persistence, serialization, etc.
| json | relational | yaml | == | === | destroy | common parent | degrees of separation |
destroy for C and C++ means the tree and all the nodes are
shredded and freed. (TODO) For the scripting languages and Java,
destroy performs a post-order traversal, setting all the child
pointers to null or the language's equivalent.
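For a scripting language, the post-order destroy described above might look like the following Python sketch. Returning the number of nodes visited is an assumption added here so the behavior is easy to test; it is not part of the described API.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def destroy(node):
    """Unlink every node under node, children first (post-order).

    Returns the number of nodes visited (a hypothetical convenience).
    """
    if node is None:
        return 0
    visited = destroy(node.left) + destroy(node.right)
    # Children are already unlinked; now drop this node's references.
    node.left = None
    node.right = None
    return visited + 1
```

Post-order matters because a node's child references must still be intact when the traversal descends through them.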
Implementing these various trees is an interesting engineering problem. The initial approach is to inherit from the binary search tree implementation, adding and overriding as necessary.
Anything that can be done with recursion can also be done with iteration, using an explicit stack where necessary.
Algorithms such as breadth-first search are easier to implement by iteration.
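A breadth-first traversal shows why iteration is the natural fit: the work list is a queue, not a stack. The Python sketch below uses `collections.deque` from the standard library; the function name `breadth_first` is hypothetical.

```python
from collections import deque


class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def breadth_first(root):
    """Return the keys level by level, left to right."""
    out = []
    queue = deque()
    if root is not None:
        queue.append(root)
    while queue:
        node = queue.popleft()      # oldest node first: FIFO order
        out.append(node.key)
        if node.left is not None:
            queue.append(node.left)
        if node.right is not None:
            queue.append(node.right)
    return out
```

Swapping `popleft()` for `pop()` would turn the queue into a stack and the traversal into a depth-first one, which is a quick way to see how the two are related.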
Trees implemented with arrays instead of pointers
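One common array layout stores the root at index 0 and the children of the node at index i at indices 2i + 1 and 2i + 2. The Python sketch below assumes that layout, uses None for empty slots, and ignores duplicate keys; the class name `ArrayTree` and the `slots` attribute are hypothetical, and nothing like this exists in the repo yet.

```python
class ArrayTree:
    def __init__(self):
        self.slots = []     # slots[i] is a key, or None if the slot is empty

    def _find_slot(self, key):
        """Return the index holding key, or the empty index where it belongs."""
        i = 0
        while i < len(self.slots) and self.slots[i] is not None:
            if key == self.slots[i]:
                return i
            # Left child at 2i + 1, right child at 2i + 2.
            i = 2 * i + 1 if key < self.slots[i] else 2 * i + 2
        return i

    def insert(self, key):
        i = self._find_slot(key)
        if i >= len(self.slots):
            # Grow the list with empty slots up to index i.
            self.slots.extend([None] * (i + 1 - len(self.slots)))
        self.slots[i] = key     # rewriting an equal key is harmless

    def is_present(self, key):
        i = self._find_slot(key)
        return i < len(self.slots) and self.slots[i] == key
```

This layout is compact for balanced trees but wastes space badly on skewed ones: inserting keys in sorted order roughly doubles the list length on every insert.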
A gold mine of interesting code awaits implementation.
The discerning programmer may find much of the code here to be somewhat over-tested. This is mostly because I use the testing to examine the behavior of the implementation, and deepen my understanding of the data structure, rather than proving the implementation with necessary and sufficient testing. Writing necessary and sufficient tests is an excellent exercise, and a good way to get even deeper understanding.