# K-d trees part 2
> "Implement a k-d tree in C++"

- toc:true
- branch: master
- badges: false
- comments: false
- author: Alexandros Giavaras
- categories: [multidimensional-search, kd-trees, k-nearest-search, algorithms, c++]

## Note

**Under development** :) :) :) 

## Overview 

In a previous post, we looked into what k-d trees are. In this post we go a step further and implement a k-d tree in C++. The final code can be found <a href="https://github.com/pockerman/cubeai/blob/master/include/cubeai/data_structs/kd_tree.h">here</a>. We follow the implementation in Marcello La Rocca's excellent book <a href="https://www.manning.com/books/advanced-algorithms-and-data-structures">Advanced algorithms and data structures</a>, published by Manning Publications.

## K-d trees part 2

Before we start implementing a k-d tree, recall that it is a binary search tree, i.e. a hierarchical data structure. Specifically, a k-d tree is a space-partitioning data structure for organizing points in a k-dimensional space [1]. Every node in the tree represents a k-dimensional point [2]. Furthermore, we will assume that the coordinates of a k-dimensional point can be compared with each other.
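To make the role of coordinate comparison concrete, here is a minimal sketch of the usual k-d tree descent rule: a node at depth ```level``` splits the space along coordinate ```level % k```. The helpers ```split_axis``` and ```goes_left``` are hypothetical names used only for illustration, and sending ties to the right subtree is an assumption rather than something taken from [2].

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of the API discussed in this post.
// Index of the coordinate used to split the space at a given tree level.
inline std::size_t split_axis(std::size_t level, std::size_t k)
{
    assert(k > 0 && "the space must have at least one dimension");
    return level % k;
}

// True if `point` belongs in the left subtree of a node that stores
// `node_point` at depth `level`. Ties go right (an assumption).
inline bool goes_left(const std::vector<double>& point,
                      const std::vector<double>& node_point,
                      std::size_t level)
{
    assert(point.size() == node_point.size());
    const std::size_t axis = split_axis(level, node_point.size());
    return point[axis] < node_point[axis];
}
```

For a 2-dimensional tree, the root compares the first coordinate, its children compare the second, its grandchildren the first again, and so on.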

Following [2], here is the exposed API:

```
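#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Assumption: uint_t is an alias for an unsigned integer type.
typedef std::size_t uint_t;

template<typename NodeType>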
class KDTree
{
public:

 typedef NodeType node_type;
 typedef typename node_type::data_type data_type;
 
 KDTree(uint_t k);

 template<typename Iterator, typename SimilarityPolicy, typename ComparisonPolicy>
 KDTree(uint_t k, Iterator begin, Iterator end, 
        const SimilarityPolicy& sim_policy, 
        const ComparisonPolicy& comp_policy);

 bool empty()const noexcept;
 uint_t size()const noexcept;
 uint_t dim()const noexcept;
 
 template<typename ComparisonPolicy>
 std::shared_ptr<node_type>
 search(const data_type& data, const ComparisonPolicy& comp_policy)const;
 
 template<typename Iterator, typename SimilarityPolicy, typename ComparisonPolicy>
 void build(Iterator begin, Iterator end,
             const SimilarityPolicy& sim_policy, 
             const ComparisonPolicy& comp_policy);
             
 template<typename ComparisonPolicy>
 std::shared_ptr<node_type>
 insert(const data_type& data, const ComparisonPolicy& comp_policy);
   
 template<typename ComparisonPolicy>
 std::vector<std::pair<typename ComparisonPolicy::value_type, typename NodeType::data_type>>
 nearest_search(const data_type& data, uint_t n, const ComparisonPolicy& calculator)const;

};
```

The class above takes the tree node type as a template parameter; the node type in turn exposes the type of the data to be stored as ```data_type```. In this sense, ```KDTree``` is a homogeneous container.
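As an illustration of what such a node type could look like, here is a minimal sketch. Only the ```data_type``` alias is required by the API above; the name ```KDTreeNode```, its members ```data```, ```level```, ```left``` and ```right```, and the helper ```add_left_child``` are assumptions made for this example and may differ from the node used in the final code.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <type_traits>
#include <vector>

// Hypothetical node type; everything except data_type is an assumption.
template<typename DataType>
struct KDTreeNode
{
    typedef DataType data_type;   // read by KDTree as node_type::data_type

    data_type data;               // the k-dimensional point stored in the node
    std::size_t level;            // depth of the node in the tree
    std::shared_ptr<KDTreeNode> left;
    std::shared_ptr<KDTreeNode> right;

    KDTreeNode(const data_type& d, std::size_t lvl)
        : data(d), level(lvl), left(), right()
    {}
};

// Attach a new left child one level below `parent` and return it.
template<typename NodeType>
std::shared_ptr<NodeType>
add_left_child(const std::shared_ptr<NodeType>& parent,
               const typename NodeType::data_type& data)
{
    assert(parent && "parent must not be null");
    parent->left = std::make_shared<NodeType>(data, parent->level + 1);
    return parent->left;
}

typedef std::vector<double> point_type;
typedef KDTreeNode<point_type> node_type;

// Every node of a given tree stores the same data type.
static_assert(std::is_same<node_type::data_type, point_type>::value,
              "the node must expose the stored type as data_type");
```

With these aliases, the tree type would be spelled ```KDTree<node_type>```.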

According to the exposed API, we can construct a k-d tree in two ways: by specifying only the dimension ```k``` of the space, or by also passing a range of data to be stored in the tree. The first constructor creates an empty tree. We can populate this tree by calling either ```insert``` or, preferably, ```build```. We will explain below why this is the preferred method.
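One common reason to prefer a bulk build over repeated insertion is tree balance. The sketch below is not the implementation from [2] or from the linked code: it uses a simplified 2-dimensional ```Node```, omits the similarity and comparison policies, and picks the median with ```std::nth_element```. All names in it (```Point```, ```Node```, ```insert_point```, ```build_balanced```, ```height```) are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

constexpr std::size_t K = 2;              // dimension of the space (assumed)
typedef std::array<double, K> Point;

struct Node
{
    Point point;
    std::unique_ptr<Node> left;
    std::unique_ptr<Node> right;
};

// Insert one point by walking down from the root; the split axis cycles per level.
void insert_point(std::unique_ptr<Node>& root, const Point& p, std::size_t level = 0)
{
    if (!root) {
        root = std::make_unique<Node>();
        root->point = p;
        return;
    }
    const std::size_t axis = level % K;
    insert_point(p[axis] < root->point[axis] ? root->left : root->right, p, level + 1);
}

// Bulk build: the median along the current axis becomes the subtree root,
// then each half is built recursively. Reorders the range [begin, end).
template<typename Iterator>
std::unique_ptr<Node> build_balanced(Iterator begin, Iterator end, std::size_t level = 0)
{
    if (begin == end) {
        return nullptr;
    }
    const std::size_t axis = level % K;
    Iterator mid = begin + (end - begin) / 2;
    std::nth_element(begin, mid, end,
                     [axis](const Point& a, const Point& b) { return a[axis] < b[axis]; });
    std::unique_ptr<Node> node = std::make_unique<Node>();
    node->point = *mid;
    node->left = build_balanced(begin, mid, level + 1);
    node->right = build_balanced(mid + 1, end, level + 1);
    return node;
}

// Number of nodes on the longest root-to-leaf path.
std::size_t height(const std::unique_ptr<Node>& node)
{
    return node ? 1 + std::max(height(node->left), height(node->right)) : 0;
}
```

If points that are already sorted along every coordinate are inserted one at a time, each new point ends up in the right subtree of the previous one and the tree degenerates into a list. Building from the whole range at once lets the algorithm choose medians, so the height stays logarithmic in the number of points.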

The exposed API has no ```remove``` or ```delete``` method. Typically, a k-d tree is constructed once and then remains as is. Furthermore, removing a node may leave the tree unbalanced, in which case the fast lookup of a balanced tree is no longer guaranteed.

## References

1. <a href="https://en.wikipedia.org/wiki/K-d_tree">k-d tree</a>.
2. Marcello La Rocca, _Advanced algorithms and data structures_, Manning Publications.