## KDTrees - K Dimensional Trees
#### Written by David Terpay
This notebook demonstrates some of the functionality of the KDTree class I built. The class lives in the kdtree.py file and is built on top of the class found in the Binary_Tree folder, which I used as my backend implementation.
The documentation, with descriptions of all of the functionality, can be found at the bottom of this notebook.


NOTE:
properties() takes care of calling nearly all the important functions in the BinarySearchTree class (apart from a few fun and challenging ones). You can also call each function individually by name, as long as you know how to deal with its return type.

***Some functions do not return numerical data but the TreeNode (or in this case KDTreeNode) objects themselves.***

### Time to mess around!
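First, let's create a KDTree with a few nodes! Let's make a 2D tree with 8 nodes.
The tree is printed sideways: the root is at the far left and each node's right child appears above it. As the printout shows, the dimension we split on changes at each level as we traverse down the tree.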

In [1]:
import kdtree
import kdtreenode
# Note: this rebinds the name kdtree from the module to a 2-dimensional tree instance.
kdtree = kdtree.KDTree(2)
kdtree.insertList([(3, 2), (5, 8), (6, 1), (4, 4), (9, 0), (1, 1), (2, 2), (8, 7)])
print(kdtree)



     (5, 8)

                    (8, 7)

               (9, 0)

          (6, 1)

               (4, 4)

(3, 2)

          (2, 2)

     (1, 1)
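
To make the changing split dimension concrete, here is a minimal sketch of k-d tree insertion. The `Node` class and `insert` function are hypothetical names written for this illustration, not the kdtree.py API, and sending ties to the right is an assumption. Fed the same eight points in the same order, it produces the same parent and child relationships as the printout above.

```python
# Minimal k-d tree insertion sketch (hypothetical names, not the kdtree.py API).
class Node:
    def __init__(self, point, dim):
        self.point = point   # the stored k-dimensional point
        self.dim = dim       # the dimension this node splits on
        self.left = None
        self.right = None


def insert(root, point, k):
    """Insert point into the tree rooted at root and return the root."""
    if root is None:
        return Node(point, 0)
    node = root
    while True:
        # Assumption: ties on the splitting coordinate go to the right.
        side = "left" if point[node.dim] < node.point[node.dim] else "right"
        child = getattr(node, side)
        if child is None:
            # The child splits on the next dimension, wrapping around after k - 1.
            setattr(node, side, Node(point, (node.dim + 1) % k))
            return root
        node = child


root = None
for p in [(3, 2), (5, 8), (6, 1), (4, 4), (9, 0), (1, 1), (2, 2), (8, 7)]:
    root = insert(root, p, 2)

print(root.right.point, root.right.dim)            # (5, 8) 1
print(root.right.left.point, root.right.left.dim)  # (6, 1) 0
```

The root compares x coordinates, its children compare y coordinates, and the grandchildren compare x again.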


### Let's sort our list and make it a more balanced tree to enhance our runtimes
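Behind the scenes, we use quickselect to pick the median point along each splitting dimension, so the list is partitioned around the median rather than fully sorted.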

In [2]:
kdtree.createBalancedTree()
print(kdtree)



          (8, 7)

     (6, 1)

          (9, 0)

(5, 8)

          (4, 4)

     (3, 2)

          (2, 2)

               (1, 1)
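
Here is a rough sketch of how a balanced build like this can work. It is not the code in kdtree.py: `build_balanced` and `preorder` are hypothetical helpers, the tree is a nested `(point, left, right)` tuple, and `sorted()` stands in for quickselect to keep the example short. Both approaches leave the median at the middle index, although they may break ties differently.

```python
def build_balanced(points, k, depth=0):
    """Return a nested (point, left, right) tuple, or None for an empty list."""
    if not points:
        return None
    axis = depth % k
    # sorted() is a stand-in for quickselect; either one leaves the median at index mid.
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return (
        ordered[mid],
        build_balanced(ordered[:mid], k, depth + 1),
        build_balanced(ordered[mid + 1:], k, depth + 1),
    )


def preorder(tree):
    """Visit the node, then its left subtree, then its right subtree."""
    if tree is None:
        return []
    point, left, right = tree
    return [point] + preorder(left) + preorder(right)


points = [(3, 2), (5, 8), (6, 1), (4, 4), (9, 0), (1, 1), (2, 2), (8, 7)]
tree = build_balanced(points, 2)
print(preorder(tree))
# [(5, 8), (3, 2), (2, 2), (1, 1), (4, 4), (6, 1), (9, 0), (8, 7)]
```

For these eight points the sketch picks the same root, (5, 8), and the same subtrees as the balanced tree printed above.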


### Let's look at the properties of our KDTree. 

A KDTree is a kind of binary search tree whose comparison key changes with depth, so we can inherit the functionality of the BinarySearchTree class to look at its properties.


properties() returns a dictionary of all of the attributes of the BST, while printProperties() only prints them in dictionary form. I implemented all of these properties as separate functions in the binarytree.py file. You can call the individual functions if need be (read the documentation to see what each one returns).

Properties within the dictionary:

#### Nodes: 
Number of nodes in the BST

#### Height: 
The height of our BST

#### Longest Path: 
Returns a list of the longest path (only the data, not the nodes themselves) from root to leaf node

#### Sum Distances: 
Returns the sum of the distances from the root to every single node in the BST

#### Perfect: 
Checks if our tree is perfect

#### Complete: 
Checks if our tree is complete

#### Full: 
Checks if our tree is full

#### Balanced: 
Checks if our tree is balanced

#### Balance Factor: 
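Returns the balance factor of our BST (judging by the output below, the height of the root's right subtree minus the height of its left subtree)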

#### Traversals: 
Inorder, preorder, postorder, and levelorder

In [3]:
kdtree.printProperties()

{
Nodes : 8
Height : 3
Longest Path : [(5, 8), (3, 2), (2, 2), (1, 1)]
Sum Distances : 13
Perfect : False
Complete : True
Full : False
Balanced : True
Balance Factor : -1
Inorder Traversal : [(1, 1), (2, 2), (3, 2), (4, 4), (5, 8), (9, 0), (6, 1), (8, 7)]
Preorder Traversal : [(5, 8), (3, 2), (2, 2), (1, 1), (4, 4), (6, 1), (9, 0), (8, 7)]
Postorder Traversal : [(1, 1), (2, 2), (4, 4), (3, 2), (9, 0), (8, 7), (6, 1), (5, 8)]
Level order Traversal : [(5, 8), (3, 2), (6, 1), (2, 2), (4, 4), (9, 0), (8, 7), (1, 1)]
}
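
A few of these properties can be checked by hand. The sketch below rebuilds the balanced tree as nested `(data, left, right)` tuples and recomputes the height, the sum of distances, fullness, and the balance factor. All of the function names are hypothetical, and the sign convention for the balance factor (right height minus left height) is an assumption inferred from the -1 printed above.

```python
def leaf(p):
    """A node with no children."""
    return (p, None, None)


# The balanced tree from the output above, as nested (data, left, right) tuples.
tree = ((5, 8),
        ((3, 2), ((2, 2), leaf((1, 1)), None), leaf((4, 4))),
        ((6, 1), leaf((9, 0)), leaf((8, 7))))


def height(t):
    """Edges on the longest root-to-leaf path; an empty tree has height -1."""
    if t is None:
        return -1
    return 1 + max(height(t[1]), height(t[2]))


def sum_distances(t, depth=0):
    """Sum of the depths of all nodes."""
    if t is None:
        return 0
    return depth + sum_distances(t[1], depth + 1) + sum_distances(t[2], depth + 1)


def is_full(t):
    """True when every node has either zero or two children."""
    if t is None:
        return True
    if (t[1] is None) != (t[2] is None):
        return False
    return is_full(t[1]) and is_full(t[2])


def balance_factor(t):
    # Assumption: right height minus left height, inferred from the printed -1.
    return height(t[2]) - height(t[1])


print(height(tree), sum_distances(tree), is_full(tree), balance_factor(tree))
# 3 13 False -1
```

The tree is not full because (2, 2) has only one child, (1, 1).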


### Let's try some of the basic functionality of this data structure before moving on to nearest neighbor searches.

In [4]:
found = kdtree.find((6,1))
print(found)

-----------------
   Data: (6, 1)
  Parent: (5, 8)

  LC 	RC 
  (9, 0)	(8, 7)
-----------------

Dimension Discriminator: 1


### Let's try a no-child removal (removing a leaf)

In [5]:
kdtree.remove((8,7))
print(kdtree)



     (6, 1)

          (9, 0)

(5, 8)

          (4, 4)

     (3, 2)

          (2, 2)

               (1, 1)


### Let's try a one-child removal

In [6]:
kdtree.remove((2,2))
print(kdtree)



     (6, 1)

          (9, 0)

(5, 8)

          (4, 4)

     (3, 2)

          (1, 1)


### Let's try a two-child removal (removing the root)

In [7]:
kdtree.remove((5,8))
print(kdtree)



     (9, 0)

(6, 1)

          (4, 4)

     (3, 2)

          (1, 1)
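
A common way to handle a two-child removal in a k-d tree is to replace the removed node with the point in its right subtree that has the smallest coordinate along the removed node's splitting dimension. I am not claiming that kdtree.py does exactly this, but it is consistent with (6, 1) becoming the new root above. Here is a sketch of that "find the minimum in one dimension" step, with a hypothetical `find_min` helper over nested `(point, left, right)` tuples and the assumption that smaller coordinates go left.

```python
def find_min(tree, dim, k, depth=0):
    """Return the point with the smallest coordinate in dimension dim.

    tree is a nested (point, left, right) tuple or None, and depth is the
    depth of its root in the full tree (it determines the split dimension).
    """
    if tree is None:
        return None
    point, left, right = tree
    candidates = [point, find_min(left, dim, k, depth + 1)]
    if depth % k != dim:
        # This node splits on another dimension, so the minimum
        # could also be hiding in the right subtree.
        candidates.append(find_min(right, dim, k, depth + 1))
    return min((c for c in candidates if c is not None), key=lambda p: p[dim])


# Right subtree of (5, 8) just before the removal; its root sits at depth 1.
right_subtree = ((6, 1), ((9, 0), None, None), None)
print(find_min(right_subtree, dim=0, k=2, depth=1))  # (6, 1)
```

When a node splits on the dimension we care about, only the node itself and its left subtree can hold the minimum, which is what lets the search skip the right side.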


### Let's create a new KDTree object with lots of data, balance it, and then run nearest neighbor searches.
Each data point will live in 10-dimensional space.

In [16]:
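from random import randint
import kdtree  # rebinds the name kdtree to the module (cell 1 reused it for a tree instance)
newTree = kdtree.KDTree(10)
def createRandomPoint(dimension):
    # One point with integer coordinates in [-1000, 1000]
    return [randint(-1000, 1000) for _ in range(dimension)]
# Build a balanced tree from one million random 10-dimensional points
newTree.createBalancedTree([createRandomPoint(10) for _ in range(1000000)])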

[[-564, -918, -598, -996, -760, -900, -458, -806, -700, -940],
 [-589, -933, -983, -504, -864, -884, -420, -824, -695, -245],
 [-636, -658, -633, -755, -748, -858, -488, -719, -180, -942],
 [-548, -929, -812, -912, -962, -787, -381, -622, -942, -731],
 [-635, -883, -920, -754, -903, -572, -486, -565, -701, -475],
 [-578, -972, -887, -647, -518, -938, -731, -280, -601, -677],
 [-814, -934, -967, -814, -954, -650, -565, -269, -197, -247],
 [-554, -598, -576, -934, -905, -897, -288, -370, -786, -978],
 [-740, -911, -516, -924, -773, -958, -71, -436, -951, -855],
 [-690, -646, -867, -990, -605, -874, -222, -361, -511, -376],
 [-871, -958, -985, -867, -895, -662, -86, -650, -118, -681],
 [-760, -677, -855, -440, -884, -725, -188, -144, -482, -768],
 [-613, -539, -503, -593, -921, -871, -129, -37, -571, -551],
 [-637, -714, -940, -726, -990, -765, -169, -123, -406, -279],
 [-792, -770, -777, -969, -790, -591, -262, -108, -79, -655],
 [-589, -604, -862, -910, -939, -558, -591, -581, -497, -44

### Let's run a nearest neighbor search as a linear scan, where we simply go through every point stored in the KDTree and keep the closest one

In [17]:
from datetime import datetime
start = datetime.now()
randomPoint = (createRandomPoint(10))
node = newTree.linearTestNN(randomPoint)
print('The nearest neighbor is\n\n', node)
print(f'\nThe time it took to run is {datetime.now() - start}')


The nearest neighbor is

 -----------------
   Data: [834, -340, 685, 142, 700, 262, -990, 664, -285, -59]
  Parent: [530, -3, 883, 281, 855, 308, -903, 697, -302, -708]

  LC 	RC 
  None	None
-----------------

Dimension Discriminator: 9

The time it took to run is 0:00:19.742398
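
For reference, a linear scan can be sketched in a few lines. This is not the code behind linearTestNN: `squared_distance` and `linear_nearest` are hypothetical helpers, and Euclidean distance is an assumption about the metric.

```python
from random import randint, seed


def squared_distance(a, b):
    """Squared Euclidean distance; the square root is not needed for comparisons."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def linear_nearest(points, query):
    """Check every point and keep the closest one: O(n) work per query."""
    return min(points, key=lambda p: squared_distance(p, query))


seed(0)  # fixed seed so the run is repeatable
points = [[randint(-1000, 1000) for _ in range(10)] for _ in range(1000)]
query = [randint(-1000, 1000) for _ in range(10)]
nearest = linear_nearest(points, query)
print(nearest, squared_distance(nearest, query))
```

Every query touches every point, which is why this approach slows down as the data set grows.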


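### Now let's run findNearestNeighbor on the same point, searching the tree instead of scanning every point

In [18]: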
from datetime import datetime
start = datetime.now()
node = newTree.findNearestNeighbor(randomPoint)
print('The nearest neighbor is\n\n', node)
print(f'\nThe time it took to run is {datetime.now() - start}')

The nearest neighbor is

 -----------------
   Data: [834, -340, 685, 142, 700, 262, -990, 664, -285, -59]
  Parent: [530, -3, 883, 281, 855, 308, -903, 697, -302, -708]

  LC 	RC 
  None	None
-----------------

Dimension Discriminator: 9

The time it took to run is 0:00:02.298172
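
The speedup comes from skipping subtrees that cannot contain a closer point. Below is a minimal sketch of that idea, not the code behind findNearestNeighbor: `build`, `sq_dist`, and `nearest` are hypothetical helpers, the tree is a nested `(point, left, right)` tuple, and Euclidean distance is assumed. The last lines check the result against a brute-force scan.

```python
from random import randint, seed


def build(points, k, depth=0):
    """Median-split k-d tree as nested (point, left, right) tuples."""
    if not points:
        return None
    axis = depth % k
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return (ordered[mid],
            build(ordered[:mid], k, depth + 1),
            build(ordered[mid + 1:], k, depth + 1))


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(tree, query, k, depth=0, best=None):
    """Return a stored point with the smallest distance to query."""
    if tree is None:
        return best
    point, left, right = tree
    if best is None or sq_dist(point, query) < sq_dist(best, query):
        best = point
    axis = depth % k
    diff = query[axis] - point[axis]
    # Search the side of the splitting plane that contains the query first.
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, query, k, depth + 1, best)
    # Visit the far side only if the plane is closer than the best match so far.
    if diff ** 2 < sq_dist(best, query):
        best = nearest(far, query, k, depth + 1, best)
    return best


seed(1)  # fixed seed so the run is repeatable
k = 3
points = [tuple(randint(-100, 100) for _ in range(k)) for _ in range(2000)]
tree = build(points, k)
query = tuple(randint(-100, 100) for _ in range(k))
found = nearest(tree, query, k)
brute = min(points, key=lambda p: sq_dist(p, query))
print(sq_dist(found, query) == sq_dist(brute, query))  # True
```

The comparison is on distances rather than on the points themselves, since two different points can be equally close to the query.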


### Functionality of the KDTree class, which inherits from the BinarySearchTree class.
Note: the code for the functions is not shown here. You will need to go to the .py files for that.

In [19]:
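# help() prints the documentation and returns None, so the list shown after
# the help text is dir(None) rather than the attributes of the module.
dir(help(kdtree))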

Help on module kdtree:

NAME
    kdtree

CLASSES
    Binary_Tree.binarytree.BinarySearchTree(builtins.object)
        KDTree
    
    class KDTree(Binary_Tree.binarytree.BinarySearchTree)
     |  Method resolution order:
     |      KDTree
     |      Binary_Tree.binarytree.BinarySearchTree
     |      builtins.object
     |  
     |  Methods defined here:
     |  
     |  __init__(self, dimensions)
     |      This is a constructor that will give us a default, basic 
     |      KDTree. It will not create a nice sorted and relatively
     |      balanced KDtree. In order to get that, construct a KDTree, 
     |      and then simply insert a list into createBalancedTree() function.
     |      This will sort the list in multidimensional space using quickselect, 
     |      and will recursively construct your tree from your list which will
     |      be an attribute of the object --> points. There are three variables
     |      we will keep track of
     |          root = Will allow 

['__bool__',
 '__class__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__']
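
The list above is what `dir(None)` returns, because `help()` returns None after printing. If the goal is a quick list of a class's methods, calling `dir()` on the class itself and filtering out the underscore names works better. The sketch below uses a hypothetical `Demo` class as a stand-in for KDTree so that it runs on its own.

```python
class Demo:
    """Hypothetical stand-in for KDTree."""

    def insert(self, point):
        """Pretend to insert a point."""

    def find(self, point):
        """Pretend to find a point."""


# dir(None) lists NoneType's attributes; '__bool__' is one of them.
print("__bool__" in dir(None))  # True

# Listing the public names of the class itself is usually more informative.
public = [name for name in dir(Demo) if not name.startswith("_")]
print(public)  # ['find', 'insert']
```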

### Functionality of the KDTreeNode class, which inherits from the TreeNode class.
Note: the code for the functions is not shown here. You will need to go to the .py files for that.

In [20]:
# As above, help() returns None, so the trailing list is dir(None).
dir(help(kdtreenode))

Help on module kdtreenode:

NAME
    kdtreenode

CLASSES
    Binary_Tree.treenode.TreeNode(builtins.object)
        KDTreeNode
    
    class KDTreeNode(Binary_Tree.treenode.TreeNode)
     |  Method resolution order:
     |      KDTreeNode
     |      Binary_Tree.treenode.TreeNode
     |      builtins.object
     |  
     |  Methods defined here:
     |  
     |  __init__(self, data, dim)
     |      This is the constructor for our tree node. As mentioned earlier we will be
     |      keeping track of the parent, left child, right child, and data. This will create
     |      a instance of a treenode.
     |      INPUT:
     |          data = Data that we will store in our treenode.
     |          parent = The parent of this treenode.
     |          left = The left child of our treenode.
     |          right = The right child of our treenode.
     |      OUTPUT:
     |          A treenode instance.
     |  
     |  __str__(self)
     |      This function will give us a string repre

['__bool__',
 '__class__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__']