### A data structure is something we can represent using a class

> **QUESTION 1**: As a senior backend engineer at Jovian, you are tasked with developing a fast in-memory data structure to manage profile information (username, name and email) for 100 million users. It should allow the following operations to be performed efficiently:
> 
> 1.  **Insert** the profile information for a new user.
> 2.  **Find** the profile information of a user, given their username
> 3.  **Update** the profile information of a user, given their username
> 4.  **List** all the users of the platform, sorted by username
> 
> You can assume that usernames are unique.

### The Method

Here's a systematic strategy we'll apply for solving problems:

1.  State the problem clearly. Identify the input & output formats.
2.  Come up with some example inputs & outputs. Try to cover all edge cases.
3.  Come up with a correct solution for the problem. State it in plain English.
4.  Implement the solution and test it using example inputs. Fix bugs, if any.
5.  Analyze the algorithm's complexity and identify inefficiencies, if any.
6.  Apply the right technique to overcome the inefficiency. Repeat steps 3 to 6.

### 1. State the problem clearly. Identify the input & output formats.

##### Problem

> We need to create a data structure which can store 100 million records and perform insertion, search, update and list operations efficiently.

##### Input

> The key inputs to our data structure are user profiles, which contain the username, name and email of a user.

In [22]:
class User():
    def __init__(self, username, name, email):
        self.username = username
        self.name = name
        self.email = email
        print("User Created!")
        
    def __repr__(self):
        return "User(username='{}', name='{}', email='{}')".format(self.username, self.name, self.email)
    
    def __str__(self):
        return self.__repr__()
        

**Note**
> `__repr__()` provides the official string representation of an object, aimed at the programmer.

> `__str__()` provides the informal string representation of an object, aimed at the user.
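
As a minimal sketch of the difference, the example below uses a hypothetical `Profile` class (it is not part of this notebook) whose two methods return different strings. `repr()` and a notebook cell's echo use `__repr__`, while `str()` and `print()` use `__str__`.

```python
class Profile:
    """Hypothetical class used only to contrast __repr__ and __str__."""
    def __init__(self, username, name):
        self.username = username
        self.name = name

    def __repr__(self):
        # Unambiguous, developer-facing form
        return "Profile(username={!r}, name={!r})".format(self.username, self.name)

    def __str__(self):
        # Short, reader-facing form
        return "{} (@{})".format(self.name, self.username)

p = Profile('aakash', 'Aakash Rai')
developer_view = repr(p)  # what a notebook cell echoes
reader_view = str(p)      # what print(p) displays
print(developer_view)     # Profile(username='aakash', name='Aakash Rai')
print(reader_view)        # Aakash Rai (@aakash)
```

In the `User` class above, `__str__` simply delegates to `__repr__`, so both forms are identical there.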

##### Output

>We can also express our desired data structure as a Python class `UserDatabase` with four methods: `insert`, `find`, `update` and `list_all`.

In [4]:
class UserDatabase:
    def insert(self, user):
        pass
    
    def find(self, username):
        pass
    
    def update(self, user):
        pass
        
    def list_all(self):
        pass

Note: It's good programming practice to list out the signatures of a class's methods before we actually implement the class.

### 2. Come up with some example inputs & outputs.

In [10]:
aakash = User('aakash', 'Aakash Rai', 'aakash@example.com')
biraj = User('biraj', 'Biraj Das', 'biraj@example.com')
hemanth = User('hemanth', 'Hemanth Jain', 'hemanth@example.com')
jadhesh = User('jadhesh', 'Jadhesh Verma', 'jadhesh@example.com')
siddhant = User('siddhant', 'Siddhant Sinha', 'siddhant@example.com')
sonaksh = User('sonaksh', 'Sonaksh Kumar', 'sonaksh@example.com')
vishal = User('vishal', 'Vishal Goel', 'vishal@example.com')

User Created!
User Created!
User Created!
User Created!
User Created!
User Created!
User Created!


### 3. Come up with a correct solution. State it in plain English.

Here's a simple and easy solution to the problem: we store the `User` objects in a list sorted by usernames.

The various functions can be implemented as follows:

1.  **Insert**: Loop through the list and add the new user at a position that keeps the list sorted.
2.  **Find**: Loop through the list and find the user object with the username matching the query.
3.  **Update**: Loop through the list, find the user object matching the query and update the details.
4.  **List**: Return the list of user objects.

### 4. Implement the solution and test it using example inputs.

The code for implementing the above solution is also fairly straightforward **(brute force solution)**.


In [27]:
class UserDatabase:
    def __init__(self):
        self.users = []
    
    def insert(self, user):
        i = 0
        while i < len(self.users):
            # Find the first username greater than the new user's username
            if self.users[i].username > user.username:
                break
            i += 1
        self.users.insert(i, user)
    
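    def find(self, username):
        # Linear scan; returns None if no user has this username
        for user in self.users:
            if user.username == username:
                return user
        return None
    
    def update(self, user):
        target = self.find(user.username)
        if target is None:
            # Without this check, a missing user would fail with an AttributeError
            raise KeyError("No user with username '{}'".format(user.username))
        target.name, target.email = user.name, user.email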
        
    def list_all(self):
        return self.users
    
database = UserDatabase()
database.insert(hemanth)
database.insert(aakash)
database.insert(siddhant)
user = database.find('siddhant')
user
database.update(User(username='siddhant', name='Siddhant U', email='siddhantu@example.com'))
user = database.find('siddhant')
user
database.list_all()

User Created!


[User(username='aakash', name='Aakash Rai', email='aakash@example.com'),
 User(username='hemanth', name='Hemanth Jain', email='hemanth@example.com'),
 User(username='siddhant', name='Siddhant U', email='siddhantu@example.com')]

### 5. Analyze the algorithm's complexity and identify inefficiencies

The time complexities of the various operations are:

1.  Insert: **O(N)**
2.  Find: **O(N)**
3.  Update: **O(N)**
4.  List: **O(1)**

We must come up with a more efficient data structure! Choosing the right data structure for the requirements at hand is an important skill. It's apparent that a sorted list of users might not be the best data structure to organize profile information for millions of users.
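
To make the inefficiency concrete, here is a rough back-of-the-envelope sketch for N = 100 million. It compares the worst-case number of comparisons in a linear scan with the number of steps needed if every comparison could halve the remaining candidates. The variable names are illustrative, and the figures count steps only, not wall-clock time.

```python
import math

n = 100_000_000  # number of users from the problem statement

# Worst case for a linear scan: the username is last in the list or absent
linear_steps = n

# Steps needed if each comparison halves the remaining candidates
log_steps = math.ceil(math.log2(n))

print(linear_steps)  # 100000000
print(log_steps)     # 27
```

Getting from up to 100 million steps per operation down to a few dozen is the motivation for the next step.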

### 6. Apply the right technique to overcome the inefficiency

We can limit the number of iterations required for common operations like find, insert and update by organizing our data in a **binary tree** structure.

### Balanced Binary Search Trees

<img src="https://i.imgur.com/Mqef5b3.png" width="520">

For our use case, we require the binary tree to have some additional properties:

1.  **Keys and Values**: Each node of the tree stores a key (a username) and a value (a `User` object). Only keys are shown in the picture above for brevity. A binary tree where nodes have both a key and a value is often referred to as a **map** or **treemap** (because it maps keys to values).
2.  **Binary Search Tree**: The *left subtree* of any node only contains nodes with keys that are lexicographically smaller than the node's key, and the *right subtree* of any node only contains nodes with keys that are lexicographically larger than the node's key. A tree that satisfies this property is called a **binary search tree**, and it's easy to locate a specific key by traversing a single path down from the root node.
3.  **Balanced Tree**: The tree is **balanced**, i.e. it does not skew too heavily to one side or the other. The left and right subtrees of any node shouldn't differ in height/depth by more than 1 level.
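
The single-path lookup described in property 2 can be sketched as follows. `BSTNode` and `find` are illustrative names introduced here, not this notebook's final implementation. The small tree is built by hand purely for demonstration, and plain strings stand in for `User` objects.

```python
class BSTNode:
    """Hypothetical key-value node, for illustration only."""
    def __init__(self, key, value=None):
        self.key = key
        self.value = value
        self.left = None
        self.right = None

def find(node, key):
    """Walk a single path down from the root; return the matching node or None."""
    while node is not None:
        if key == node.key:
            return node
        # Smaller keys live in the left subtree, larger keys in the right
        node = node.left if key < node.key else node.right
    return None

# Hand-built binary search tree keyed by username
root = BSTNode('jadhesh', 'Jadhesh Verma')
root.left = BSTNode('biraj', 'Biraj Das')
root.right = BSTNode('sonaksh', 'Sonaksh Kumar')
root.left.left = BSTNode('aakash', 'Aakash Rai')
root.left.right = BSTNode('hemanth', 'Hemanth Jain')

found = find(root, 'hemanth')
missing = find(root, 'vishal')
print(found.value)  # Hemanth Jain
print(missing)      # None
```

Each comparison discards one whole subtree, so the number of steps is bounded by the height of the tree rather than by the number of users.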
 
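**Notes:**
> In a balanced BST, find, insert and update each take **O(log N)** time, because the height of the tree grows logarithmically with the number of nodes.

> Recursion is when a function calls itself from within its own body.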
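
As a small illustration of recursion on trees, the sketch below computes a tree's height and checks the balance condition from property 3. The function names `tree_height` and `is_balanced` are assumptions made for this example, and the `TreeNode` class mirrors the one used in the exercise that follows.

```python
class TreeNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def tree_height(node):
    """Number of nodes on the longest root-to-leaf path (0 for an empty tree)."""
    if node is None:
        return 0  # base case: stops the recursion
    return 1 + max(tree_height(node.left), tree_height(node.right))

def is_balanced(node):
    """True if, at every node, the subtree heights differ by at most 1."""
    if node is None:
        return True
    height_gap = abs(tree_height(node.left) - tree_height(node.right))
    return height_gap <= 1 and is_balanced(node.left) and is_balanced(node.right)

# A balanced three-node tree
balanced = TreeNode(2)
balanced.left, balanced.right = TreeNode(1), TreeNode(3)

# A skewed chain 1 -> 2 -> 3 where every node only has a right child
skewed = TreeNode(1)
skewed.right = TreeNode(2)
skewed.right.right = TreeNode(3)

print(tree_height(balanced), is_balanced(balanced))  # 2 True
print(tree_height(skewed), is_balanced(skewed))      # 3 False
```

This version recomputes heights at every level to keep the code short; it favours readability over speed.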
 



**Exercise:** Create the following binary tree from a tuple, using the `TreeNode` class to represent each node.
<img src="https://i.imgur.com/d7djJAf.png" width="540">

In [1]:
# Implementation of a binary tree node
class TreeNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

# The tree as nested tuples of the form (left_subtree, key, right_subtree)
tree_tuple = ((1, 3, None), 2, ((None, 3, 4), 5, (6, 7, 8)))

# Use recursion to convert the tuple (tree_tuple) into a binary tree
def parse_tuple(data):
    if isinstance(data, tuple) and len(data) == 3:
        node = TreeNode(data[1])
        node.left = parse_tuple(data[0])   # recurse into the left subtree
        node.right = parse_tuple(data[2])  # recurse into the right subtree
    elif data is None:
        node = None
    else:
        node = TreeNode(data)  # a bare key becomes a leaf node
    return node

tree2 = parse_tuple(tree_tuple)

# Test the code
print(tree2.right.left.right.key, tree2.right.right.left.key, tree2.right.right.right.key)



4 6 8


### Traversing a Binary Tree
A _traversal_ refers to the process of visiting each node of a tree exactly once. _Visiting a node_ generally refers to adding the node's key to a list.


#### Inorder traversal

1.  Traverse the left subtree recursively inorder.
2.  Traverse the current node.
3.  Traverse the right subtree recursively inorder.

#### Preorder traversal

1.  Traverse the current node.
2.  Traverse the left subtree recursively preorder.
3.  Traverse the right subtree recursively preorder.

#### Postorder traversal

1.  Traverse the left subtree recursively postorder.
2.  Traverse the right subtree recursively postorder.
3.  Traverse the current node.

The example below shows an inorder traversal of a binary tree. Preorder and postorder traversal follow the same concept; only the position of the current node in the visiting order changes.

> First, we define `TreeNode` as the basis of our binary tree data structure.

> Second, we create the tree from the provided `tuple` (using recursion).

> Third, we use recursion to traverse the whole binary tree.

In [3]:
def traverse_in_order(node):
    if node is None:
        return []
    return (traverse_in_order(node.left) + [node.key] + traverse_in_order(node.right))

traverse_in_order(tree2)

[1, 3, 2, 3, 4, 5, 6, 7, 8]
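
Preorder and postorder traversal follow the same recursive pattern as `traverse_in_order` above. The sketch below is one possible way to write them; the function names `traverse_pre_order` and `traverse_post_order` are assumptions, and `TreeNode` and `parse_tuple` are repeated so that the block runs on its own.

```python
class TreeNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def parse_tuple(data):
    """Build a tree from nested (left_subtree, key, right_subtree) tuples."""
    if data is None:
        return None
    if isinstance(data, tuple) and len(data) == 3:
        node = TreeNode(data[1])
        node.left = parse_tuple(data[0])
        node.right = parse_tuple(data[2])
        return node
    return TreeNode(data)  # a bare key becomes a leaf node

def traverse_pre_order(node):
    # Current node first, then the left subtree, then the right subtree
    if node is None:
        return []
    return [node.key] + traverse_pre_order(node.left) + traverse_pre_order(node.right)

def traverse_post_order(node):
    # Left subtree first, then the right subtree, then the current node
    if node is None:
        return []
    return traverse_post_order(node.left) + traverse_post_order(node.right) + [node.key]

tree = parse_tuple(((1, 3, None), 2, ((None, 3, 4), 5, (6, 7, 8))))
pre_order = traverse_pre_order(tree)
post_order = traverse_post_order(tree)
print(pre_order)   # [2, 3, 1, 5, 3, 4, 7, 6, 8]
print(post_order)  # [1, 3, 4, 3, 6, 8, 7, 5, 2]
```

The root key `2` appears first in the preorder result and last in the postorder result, which is a quick way to tell the two orders apart.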