# Weekly Assignment 4, Task 1 (Command-Line Interface)

In [None]:
import cli

The file `cli.py` provides a dictionary `command_freqs`, which maps commands to their frequencies, and a class `cli` for creating objects that simulate learning those commands. Here are the commands and their frequencies. A command with a larger frequency value is used more often and is therefore learned faster; a command with a small value is used less and takes longer to learn. Note that these numbers are abstract but relative: a command with twice the frequency value will be used twice as often. They are not probabilities, because they do not need to sum to one.

In [None]:
cli.command_freqs
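
To make the difference between relative frequencies and probabilities concrete, here is a minimal sketch. The dictionary `example_freqs` and the helper `to_probabilities` are hypothetical stand-ins, not part of `cli.py`, and how `cli.py` actually samples commands is not shown here; the sketch only assumes that selection is proportional to the frequency values.

```python
import random

# Hypothetical stand-in for cli.command_freqs; the real values live in cli.py.
example_freqs = {"ls": 4.0, "cd": 2.0, "grep": 1.0, "tar": 1.0}

def to_probabilities(freqs):
    """Normalize relative frequencies so that they sum to one."""
    total = sum(freqs.values())
    return {cmd: f / total for cmd, f in freqs.items()}

probs = to_probabilities(example_freqs)
print(probs)

# random.choices accepts unnormalized relative weights directly,
# so normalizing is only needed if you want to read off probabilities.
rng = random.Random(0)
sample = rng.choices(list(example_freqs), weights=list(example_freqs.values()), k=10_000)
ratio = sample.count("ls") / sample.count("cd")
print("ls was drawn about", round(ratio, 2), "times as often as cd")
```

Because `ls` has twice the frequency value of `cd` in this made-up dictionary, it should be drawn roughly twice as often, whatever the values sum to.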

The cell below shows how to create a `cli` object and train it for one hour with `command_freqs`. The first parameter is the training time in seconds, and the second is the dictionary assigning commands to their frequencies. Call `rt()` with a command as its argument to get the LTM retrieval time of that command, in seconds, at the present stage of learning. Note that because this is a simulation that selects commands randomly, running the cell multiple times will yield different results.

In [None]:
my_cli = cli.cli()
my_cli.train(60*60, cli.command_freqs) # 60 seconds in a minute, 60 minutes in an hour
# The output of the recall time is in seconds
print("After training, the time to retrieve the command ls is", my_cli.rt("ls"))

In [None]:
# Here is how you can show the retrieval time of all commands.
# Note that these results don't change with reruns, unless you rerun the cell above.
for c in cli.command_freqs.keys():
    print(c, my_cli.rt(c))

In [None]:
# You can add new commands to the cli by copying cli.command_freqs, by accessing the dictionary directly,
# or by going to cli.py and copy-pasting the dictionary definition here.
import copy
my_commands = copy.copy(cli.command_freqs)
my_commands['my_new_command'] = 0.5 # VERY frequent command!
my_cli = cli.cli()
my_cli.train(60*60, my_commands) # 60 seconds in a minute, 60 minutes in an hour
print("After training, the time to retrieve the command my_new_command is", my_cli.rt("my_new_command"))
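
The reason for copying the dictionary first can be illustrated with a small sketch. The dictionary `original` is a hypothetical stand-in for `cli.command_freqs`; the point is only that a plain assignment creates an alias, while `copy.copy` creates a new, independent dictionary.

```python
import copy

# Hypothetical stand-in for cli.command_freqs.
original = {"ls": 4.0, "cd": 2.0}

alias = original               # same dictionary object, not a copy
shallow = copy.copy(original)  # new dictionary with the same entries

shallow["my_new_command"] = 0.5
print("my_new_command" in original)   # False: the copy is independent

alias["another_command"] = 0.1
print("another_command" in original)  # True: the alias modified the original
```

If you modify `cli.command_freqs` directly instead of a copy, the new command stays in the dictionary for every later cell that uses it, until the module is reloaded.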

In [None]:
# How about adding the new command only after some training?

# Create cli and first train the user for 1 hour
my_cli = cli.cli()
my_commands = copy.copy(cli.command_freqs)
my_cli.train(60*60, my_commands)

# Then add the new command. And train again for 1 hour.
my_commands['my_new_command'] = 0.5 # VERY frequent command!
my_cli.train(60*60, my_commands) 
print("After training, the time to retrieve the command my_new_command is", my_cli.rt("my_new_command"))

# Important note about noise
The model types in commands and learns to use them. The next command to use is chosen at random, with more frequent commands being more likely. This introduces stochasticity (randomness) into the model. So if you, for instance, run the model for 1 hour and then ask how long it takes to retrieve `ls`, you will get a different result each time you repeat the run. That's a problem! While this is an understandable feature of the model, it means that a single run does not provide us with robust information.

The solution is to run the model multiple times, collect the results, and then aggregate them using some summary statistic. The instruction is to use the mean result from at least 10 models. So, let's look at how to do that for `ls`.

In [None]:
import numpy as np
results = [] # create an empty list
for i in range(10): # iterate 10 times
    # Always create a new cli instance, train it from scratch, and save the retrieval time into results.
    my_cli = cli.cli()
    my_cli.train(60*60, cli.command_freqs)
    results.append(my_cli.rt("ls"))
print("10 retrieval times (our raw data):", results)
print("The average retrieval time is", np.mean(results))

For these short runs (1 hour of training), there is a lot of variation between repetitions. For longer runs, the models are more stable. Please be sure to put the whole training section within the `for` loop and always create new objects for that training, so that you are not carrying over any information between training runs.
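
One way to keep the "fresh object per run" rule hard to break is to wrap a single run in a function and let a helper repeat it. The sketch below is only an illustration: `summarize_runs` and `fake_run` are hypothetical names, and `fake_run` returns made-up noisy numbers so that the example runs without `cli.py`. It uses the standard-library `statistics` module instead of `numpy`, and also reports the standard deviation as a rough indication of how much the runs vary.

```python
import random
import statistics

def summarize_runs(run_once, n_runs=10):
    """Call run_once() n_runs times; return (mean, sample standard deviation)."""
    results = [run_once() for _ in range(n_runs)]
    return statistics.mean(results), statistics.stdev(results)

# Stand-in for one fresh training run. With the real module it could look like:
#   def run_once():
#       c = cli.cli()                      # new object every time
#       c.train(60*60, cli.command_freqs)  # train from scratch
#       return c.rt("ls")
rng = random.Random(1)

def fake_run():
    # Hypothetical noisy retrieval time in seconds, NOT output of the real model.
    return 1.0 + rng.gauss(0, 0.2)

mean_rt, sd_rt = summarize_runs(fake_run, n_runs=10)
print(f"mean = {mean_rt:.3f} s, sd = {sd_rt:.3f} s")
```

Because everything that belongs to one run lives inside `run_once`, no state can leak from one repetition to the next.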