**Question 1:** As a senior backend engineer, you are tasked with developing a fast in-memory data structure to manage profile information (username, name, email) for 100 million users. It should allow the following operations to be performed efficiently:

1. **Insert** the profile information for a new user.
2. **Find** the profile information of a user, given the username.
3. **Update** the profile information of a user, given their username.
4. **List** all the users of the platform, sorted by username.

> You can assume the usernames are unique.

### 1. State the problem clearly. Identify the input and output formats.

**Problem**

> We need to create a data structure that can store 100 million records and perform insert, find, update and list operations efficiently.



In [1]:
class User:
    pass

In [2]:
user1 = User()

In [3]:
user1

<__main__.User at 0x1e13fb5fe90>

In [4]:
type(user1)

__main__.User

In [5]:
class User:
    def __init__(self,
                 username,
                 name,
                 email):
        
        self.username = username
        self.name = name
        self.email = email
        print("User Created")

In [6]:
user2 = User("Rahul", "Rahul Rawat", "rahul@rawat.com")

User Created


In [7]:
user2

<__main__.User at 0x1e13fb80b50>

In [8]:
user2.name

'Rahul Rawat'

In [9]:
class User:
    def __init__(self,
                 username,
                 name,
                 email):
        
        self.username = username
        self.name = name
        self.email = email
    
    def introduce_yourself(self, guest_name):
        print(f"Hi {guest_name}, I'm {self.name} contact me at {self.email}")

In [10]:
user3 = User("Rahul", "Rahul Rawat", "rahul@rawat.com")

In [11]:
user3.introduce_yourself("Jaggi")

Hi Jaggi, I'm Rahul Rawat contact me at rahul@rawat.com


In [12]:
class User:
    def __init__(self,
                 username,
                 name,
                 email):
        self.username = username
        self.name = name
        self.email = email

    def __repr__(self) -> str:
        return f"User(username = \"{self.username}\", name = \"{self.name}\", email = \"{self.email}\" )"
    def __str__(self):
        return self.__repr__()

In [13]:
user4 = User("Jaggi", "Sehaj Jaggi", "sehaj@jaggi.com")

In [14]:
user4

User(username = "Jaggi", name = "Sehaj Jaggi", email = "sehaj@jaggi.com" )

In [15]:
class UserDatabase:
    def insert(self, user):
        pass
    def find(self, username):
        pass
    def update(self, user):
        pass
    def list(self):
        pass

In [16]:
rahul = User("rahul", "Rahul Rawat", "rahul@rawat.com")
nidhi = User("nidhi", "Nidhi Rajput", "nidhi@rajput.com")

In [17]:
users = [rahul, nidhi]

In [18]:
print(nidhi)

User(username = "nidhi", name = "Nidhi Rajput", email = "nidhi@rajput.com" )


### 3. Come up with a correct solution for the problem. State it in plain English.

1. **Insert:** Loop through the list and add the user at a position that keeps the list sorted.
2. **Find:** Loop through the list and find the user object with the username matching the query.
3. **Update:** Loop through the list, find the user object matching the query and update the details.
4. **List:** Return the list of all the users.

### 4. Implement the solution and test it using example inputs. Fix bugs, if any.

In [19]:
class UserDatabase:
    def __init__(self):
        self.users = []

    def insert(self, user):
        i = 0
        while i < len(self.users):
            if self.users[i].username > user.username:
                break
            i += 1
        self.users.insert(i, user)

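    def find(self, username):
        # Linear scan; returns None when no user has this username
        for user in self.users:
            if user.username == username:
                return user
        return None

    def update(self, user):
        target = self.find(user.username)
        if target is None:
            # Fail with a clear error instead of an AttributeError on None
            raise KeyError(f"No user with username {user.username!r}")
        target.name, target.email = user.name, user.email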

    def list(self):
        return self.users

In [20]:
database = UserDatabase()

In [21]:
database.insert(rahul)
database.insert(nidhi)

In [23]:
user = database.find("rahul")
user

User(username = "rahul", name = "Rahul Rawat", email = "rahul@rawat.com" )

In [24]:
database.list()

[User(username = "nidhi", name = "Nidhi Rajput", email = "nidhi@rajput.com" ),
 User(username = "rahul", name = "Rahul Rawat", email = "rahul@rawat.com" )]

In [25]:
database.update(User(username = "rahul", name = "Rahul rawat", email = "rahulrawat@rawat.com"))

In [26]:
database.list()

[User(username = "nidhi", name = "Nidhi Rajput", email = "nidhi@rajput.com" ),
 User(username = "rahul", name = "Rahul rawat", email = "rahulrawat@rawat.com" )]

### 5. Analyze the algorithm's complexity and identify inefficiencies, if any.

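The time complexities of the operations are:
1. Insert **O(N)**
2. Find **O(N)**
3. Update **O(N)**
4. List **O(1)**

To get a feel for what **O(N)** means at this scale, the next cell times a bare loop of 100 million iterations, roughly the number of steps a single worst-case find would take: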

In [28]:
%%time
for i in range(100_000_000):
    j = i * i

CPU times: total: 5.41 s
Wall time: 13.3 s
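
Because `insert` keeps the list sorted by username, `find` does not have to scan linearly. The following is a minimal sketch of a binary search over such a list. The helper name `find_sorted` and the stand-in `User` class are assumptions for illustration and are not part of the implementation above.

```python
class User:
    # Minimal stand-in for the User class defined earlier
    def __init__(self, username, name, email):
        self.username = username
        self.name = name
        self.email = email


def find_sorted(users, username):
    """Binary search over a list of users sorted by username."""
    lo, hi = 0, len(users) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        mid_name = users[mid].username
        if mid_name == username:
            return users[mid]
        if mid_name < username:
            lo = mid + 1   # target can only be in the right half
        else:
            hi = mid - 1   # target can only be in the left half
    return None  # no user with this username


users = [
    User("nidhi", "Nidhi Rajput", "nidhi@rajput.com"),
    User("rahul", "Rahul Rawat", "rahul@rawat.com"),
]
print(find_sorted(users, "rahul").name)  # Rahul Rawat
print(find_sorted(users, "sehaj"))       # None
```

This only speeds up lookups to **O(log N)** comparisons. Inserting into a Python list still shifts every later element, so insert stays **O(N)** with this approach.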


### 6. Apply the right technique to overcome the inefficiency. Repeat steps 3 to 6.

The insert, find and update operations in a balanced BST have time complexity **O(log N)** since they all involve traversing a single path down from the root of the tree.
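
To make the difference concrete, here is a rough back-of-the-envelope comparison for the 100 million users in the problem statement. It assumes the height of a balanced binary tree is about log2(N); the exact height depends on the balancing scheme, so treat the number as an estimate.

```python
import math

n = 100_000_000  # number of users from the problem statement

linear_steps = n                      # worst case for the sorted-list scan
tree_steps = math.ceil(math.log2(n))  # approximate height of a balanced binary tree

print(linear_steps)  # 100000000
print(tree_steps)    # 27
```

A single path of about 27 nodes replaces a scan over up to 100 million list entries.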

In [29]:
class TreeNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

In [30]:
node0 = TreeNode(3)
node1 = TreeNode(4)
node2 = TreeNode(5)

In [31]:
node0.left = node1
node0.right = node2

In [32]:
tree = node0

In [33]:
tree.key

3

In [35]:
tree.left.key

4

In [36]:
tree.right.key

5

In [37]:
tree_tuple = ((1, 3, None), 2, ((None, 3, 4), 5, (6, 7, 8)))

In [38]:
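def parse_tuple(data):
    print(data)  # trace each recursive call
    if isinstance(data, tuple) and len(data) == 3:
        # (left, key, right): build the node, then parse both subtrees
        node = TreeNode(data[1])
        node.left = parse_tuple(data[0])
        node.right = parse_tuple(data[2])

    elif data is None:
        # A missing child is stored as a placeholder node whose key is None
        node = TreeNode(data)

    else:
        # Any other value is a leaf key
        node = TreeNode(data)

    return node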

In [39]:
tree2 = parse_tuple(((1, 3, None), 2, ((None, 3, 4), 5, (6, 7, 8))))

((1, 3, None), 2, ((None, 3, 4), 5, (6, 7, 8)))
(1, 3, None)
1
None
((None, 3, 4), 5, (6, 7, 8))
(None, 3, 4)
None
4
(6, 7, 8)
6
8


In [40]:
tree2

<__main__.TreeNode at 0x1e13fbb7790>

In [41]:
tree2.key

2

In [42]:
tree2.left.key, tree2.right.key

(3, 5)

In [44]:
tree2.left.left.key, tree2.left.right.key, tree2.right.left.key, tree2.right.right.key

(1, None, 3, 7)

In [45]:
def display_keys(node, space = "\t", level = 0):
    if node is None:
        print(space * level + "None")
        return  # nothing below an empty node; avoids accessing node.left on None

    if node.left is None and node.right is None:
        print(space * level + str(node.key))
        return
    
    display_keys(node.right, space, level + 1)
    print(space * level + str(node.key))
    display_keys(node.left, space, level + 1)

In [46]:
display_keys(tree2)

			8
		7
			6
	5
			4
		3
			None
2
		None
	3
		1


## Binary Search Tree (BST)

A binary search tree or BST is a binary tree that satisfies the following conditions:

1. The left subtree of any node only contains nodes with keys less than the node's key.
2. The right subtree of any node only contains nodes with keys greater than the node's key.
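
The two conditions can be checked recursively by passing down the range of keys each subtree is allowed to hold. The sketch below is one way to do this. The function name `is_bst` and its `low`/`high` parameters are assumptions for illustration, and it treats a `None` child as an absent node rather than a placeholder node with key `None`.

```python
class TreeNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def is_bst(node, low=None, high=None):
    """Return True if every key in the subtree lies strictly between
    low and high and both BST conditions hold at every node."""
    if node is None:
        return True  # an empty subtree satisfies the conditions trivially
    if low is not None and node.key <= low:
        return False
    if high is not None and node.key >= high:
        return False
    # Left keys must stay below node.key, right keys above it
    return (is_bst(node.left, low, node.key)
            and is_bst(node.right, node.key, high))


# A tree that satisfies both conditions
valid = TreeNode(4)
valid.left, valid.right = TreeNode(2), TreeNode(6)

# Same shape as the node0 example: 3 with left child 4 and right child 5
invalid = TreeNode(3)
invalid.left, invalid.right = TreeNode(4), TreeNode(5)

print(is_bst(valid))    # True
print(is_bst(invalid))  # False
```

The second tree fails because its left child, 4, is greater than the root key, 3, which violates the first condition.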