## In Chapter 8, you worked on an in-memory data store.

So far it wraps a dictionary whose keys point to each of the items. Now we'll start building out some of the functionality that more sophisticated data stores have. Let's make a new key-value data store. It has functions to `get` and `set` values. It also has a `count` function, which currently does nothing.

In [None]:
import random
from datetime import datetime

class Database():
    def __init__(self):
        self._data = {}
        
        self.foods = ["apple", "banana", "carrot", "celery", "mirepoix", "clementine"]

        for _ in range(1000000):
            key = hash(datetime.now()) # This is how we're getting a pseudo-random and probably unique ID
            self.set(key, random.choice(self.foods))
    
    def get(self, key):
        return self._data[key]
    
    def set(self, key, value):
        self._data[key] = value
    
    def count(self, value):
        pass
    

Take a look at the initializer for this data store; we fill it up with lots of instances of a few different fruit and vegetable values. The code below will take a few seconds to run because we are adding so many values to our data store!

In [None]:
db = Database()

Now, when we have a table full of data, it's _pretty common_ for clients to want to be able to find out _how many_ of each value are in a given column. For example, if we have a table of different foods, maybe we want to know how many times each food appears in the database. 

Today we are going to implement the `count` function. See the automated test below, which illustrates exactly how this function should work. Two things to note:

1. Right now, the test _fails_ because you'll have to implement `count` to get it to pass.
2. The test looks for a _range_ rather than an exact answer because we assign values to the database at random. With six foods and 1,000,000 entries, we expect about 166,667 clementines, and the range in the test is wide enough to cover at least a 99.8% **confidence interval** around the number of clementine values your database instance will randomly have (see how confidence intervals are useful? :)
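
To see where a range like this comes from, here is a rough sketch of the arithmetic. It assumes each of the 1,000,000 values is drawn independently and uniformly from the six foods, and that no two keys collide; the variable names are illustrative only.

```python
import math

n = 1_000_000          # values inserted by the initializer
p = 1 / 6              # chance that any one value is "clementine" (six foods, uniform)

expected = n * p                          # mean of a binomial(n, p) count
std_dev = math.sqrt(n * p * (1 - p))      # standard deviation of that count

# Normal approximation: roughly 99.8% of outcomes fall within about
# 3.09 standard deviations of the mean.
z = 3.09
low, high = expected - z * std_dev, expected + z * std_dev

print(f"expected count: {expected:.0f}")
print(f"standard deviation: {std_dev:.0f}")
print(f"approximate 99.8% band: {low:.0f} to {high:.0f}")
```

Under these assumptions the test's bounds of 160,000 and 170,000 sit well outside that band, so a correct `count` should pass essentially every time.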

In [None]:
# You need these imports to run the tests
import sys
!{sys.executable} -m pip install colorama 

sys.path.insert(0, '..')
from test_framework_exercise.phoenix_test.matchers import FailedAssertion, Assertion, assert_that
from test_framework_exercise.phoenix_test.test import Test
sys.path.remove('..')

In [None]:
class DatabaseCountTest(Test):
    def setup(self):
        self.db = Database()
                
    def test_count(self):
        result = self.db.count("clementine")
        assert_that(160000 < result < 170000).is_true()

DatabaseCountTest().run()

### Challenge: Implement the `count` function. 

The `count` function should return the number of times a given value is stored in your key-value store. Once it does, the test above will pass.

## Done? OK. 

Now it's time to tell you about my secret motive for this problem: this exercise is a helpful introduction to the concepts of time and space efficiency in software engineering and how to evaluate those tradeoffs.

To make it easier to do that, let's introduce everyone's favorite thing: a decorator! This decorator, called `stopwatch`, prints out how long a method took to run.

In [None]:
def stopwatch(func):
    def wrapper(*args, **kwargs):
        start = datetime.now()
        result = func(*args, **kwargs)
        end = datetime.now()
        print(f"Operation took {(end - start).total_seconds()} seconds")
        return result
    
    return wrapper
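
If you'd like finer-grained timings, here is one possible variation on the same idea. It's a sketch, not part of the exercise: the names `precise_stopwatch` and `slow_sum` are made up for illustration. It uses `time.perf_counter`, a clock intended for measuring short durations, and `functools.wraps`, which keeps the decorated function's name and docstring intact.

```python
import time
from functools import wraps

def precise_stopwatch(func):
    """Hypothetical variant of `stopwatch` built on time.perf_counter."""
    @wraps(func)  # preserve func.__name__ and func.__doc__ on the wrapper
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print(f"{func.__name__} took {elapsed:.6f} seconds")
        return result
    return wrapper

@precise_stopwatch
def slow_sum(n):
    """Toy function to time: add up the integers below n."""
    total = 0
    for i in range(n):
        total += i
    return total

answer = slow_sum(100_000)
```

The decorated function still returns its normal result, so you can add or remove the decorator without changing any calling code.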

### Challenge: Annotate the `count` function with the `@stopwatch` decorator. 

Then, duplicate the line `result = self.db.count("clementine")` a few times in your test so it runs several times.

Run the test again. You should get several printouts, one for each call to `count`, showing how long each call took.

Why do you think the test is taking so long?

.
..
...
....
...
..
.

## Introducing performance testing!

In software engineering, **performance** usually refers to a program's _speed_. And especially when it comes to fetching data, speed can be really, _really_ important. So, in addition to tests that make sure our code does the right thing, we might have tests to make sure our code does the right thing _fast enough_.
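
One way to get a feel for why speed matters here is to time a simple scan at a few sizes. The sketch below assumes `count` is implemented by looking at every stored value, which is only one possible implementation; `build_store` and `count_by_scan` are hypothetical helpers, not part of the `Database` class.

```python
import time

FOODS = ["apple", "banana", "carrot", "celery", "mirepoix", "clementine"]

def build_store(n):
    """Build a plain dict with n entries, cycling through FOODS deterministically."""
    return {i: FOODS[i % len(FOODS)] for i in range(n)}

def count_by_scan(data, target):
    """Count occurrences of target by visiting every value: work grows with len(data)."""
    return sum(1 for value in data.values() if value == target)

timings = {}
for n in (10_000, 100_000, 1_000_000):
    store = build_store(n)
    start = time.perf_counter()
    found = count_by_scan(store, "clementine")
    timings[n] = time.perf_counter() - start
    print(f"{n:>9} entries: {found:>7} clementines in {timings[n]:.4f} seconds")
```

On most machines the time should grow roughly in proportion to the number of entries, though the exact numbers depend on your hardware.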

Run the test below on your implementation of the database. Does it pass?

In [None]:
class DatabasePerformanceTest(Test):
    def setup(self):
        self.db = Database()
                
    def test_count_performance(self):
        start = datetime.now()
        self.db.count("clementine")
        end = datetime.now()
        assert_that((end - start).total_seconds() < 0.01).is_true()        
        
DatabasePerformanceTest().run()

### Challenge: Implement your `count` function so that _both_ test suites pass.

What do you have to do to make that work? 

What are the implications of that strategy for your data store?

In [None]:
import sys

# Rough proxy for memory use: the size in bytes of the dictionary's string representation
sys.getsizeof(str(db._data))

In [None]:
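# Same measurement for `_count`; this only works if your `Database` stores its tallies under that name
sys.getsizeof(str(db._count))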
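
The `_count` attribute in the cell above hints at one common strategy: keep a running tally as values are written, so that `count` becomes a dictionary lookup. The sketch below is one possible shape for that idea, not the only answer. The class name `CountingDatabase` is made up, and the `_count` attribute name is assumed from the cell above.

```python
class CountingDatabase:
    """Hypothetical key-value store that trades extra memory for a fast count."""

    def __init__(self):
        self._data = {}    # key -> value
        self._count = {}   # value -> number of keys currently holding it

    def get(self, key):
        return self._data[key]

    def set(self, key, value):
        # If the key already exists, its old value is about to be overwritten,
        # so take it out of the tally first.
        if key in self._data:
            old = self._data[key]
            self._count[old] -= 1
            if self._count[old] == 0:
                del self._count[old]
        self._data[key] = value
        self._count[value] = self._count.get(value, 0) + 1

    def count(self, value):
        # A single dictionary lookup, regardless of how many keys are stored.
        return self._count.get(value, 0)

store = CountingDatabase()
store.set(1, "apple")
store.set(2, "clementine")
store.set(3, "clementine")
store.set(1, "clementine")   # overwrite: "apple" leaves the tally
print(store.count("clementine"), store.count("apple"))
```

The cost is a second dictionary to keep in memory and a little more work on every `set`, which is the kind of time-versus-space tradeoff this exercise is about.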