# Collaborative Filtering
---
This notebook uses Cirrus to run collaborative filtering on the Netflix dataset.
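
For readers new to the technique, here is a minimal sketch of collaborative filtering by matrix factorization, trained with stochastic gradient descent. It is an illustration only and makes no claim about how Cirrus implements the task; the function names `train_mf` and `rmse`, the toy ratings, and all default values are assumptions made for this example.

```python
import random


def train_mf(ratings, n_users, n_items, rank=2, lr=0.02, reg=0.02,
             epochs=300, seed=0):
    """Fit user and item factor vectors to (user, item, rating) triples."""
    rng = random.Random(seed)
    # Start from small positive random factors (an arbitrary choice).
    U = [[rng.uniform(0.0, 0.1) for _ in range(rank)] for _ in range(n_users)]
    V = [[rng.uniform(0.0, 0.1) for _ in range(rank)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            # The predicted rating is the dot product of the two factor vectors.
            pred = sum(a * b for a, b in zip(U[u], V[i]))
            err = r - pred
            for k in range(rank):
                uk, vk = U[u][k], V[i][k]
                # Gradient step on squared error with L2 regularization.
                U[u][k] += lr * (err * vk - reg * uk)
                V[i][k] += lr * (err * uk - reg * vk)
    return U, V


def rmse(ratings, U, V):
    """Root-mean-square error of the factor model on the given ratings."""
    total = 0.0
    for u, i, r in ratings:
        pred = sum(a * b for a, b in zip(U[u], V[i]))
        total += (r - pred) ** 2
    return (total / len(ratings)) ** 0.5


# Toy data: (user index, item index, rating).
toy_ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
               (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
U, V = train_mf(toy_ratings, n_users=3, n_items=3)
print("training RMSE:", rmse(toy_ratings, U, V))
```

The `learning_rate` and `minibatch_size` settings that appear later in this notebook play a role comparable to `lr` here, although this sketch updates on one rating at a time rather than on minibatches.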

## Setup
---

In [None]:
# To ease development, each time a cell is run, all modules will be reloaded.
%load_ext autoreload
%autoreload 2

In [None]:
import logging
import sys
import atexit

In [None]:
# Cirrus produces logs, but they will not show unless we add a handler that prints.
from cirrus import utilities
utilities.set_logging_handler()

In [None]:
from cirrus import instance, parameter_server, automate, GridSearch, cf, graph

## Instance, server, and task
---

First, we start an EC2 instance.

In [None]:
inst = instance.Instance(
    name="lr_example_instance",
    disk_size=32,
    typ="m4.2xlarge",
    username="ubuntu",
    ami_owner_name=("self", "cirrus_server_image")
)
inst.start()
instances = [inst]

Second, we create a parameter server to run on our instance.

In [None]:
server = parameter_server.ParameterServer(
    instance=inst,
    ps_port=1337,
    error_port=1338,
    num_workers=64
)

In [None]:
base_task_config = {
    "n_workers": 16,
    "n_ps": 1,
    "dataset": "netflix-ryan",
    "learning_rate": 0.01,
    "epsilon": 0.0001,
    "progress_callback": None,
    "train_set": (0, 799),
    "test_set": (800, 850),
    "minibatch_size": 20,
    "model_bits": 19,
    "ps": server
}
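
The `train_set` and `test_set` entries are pairs of indices, and it can be worth confirming that the two ranges do not overlap before launching a run. The sketch below assumes both bounds are inclusive, which this notebook does not state; `ranges_overlap` is a hypothetical helper and not part of Cirrus.

```python
def ranges_overlap(a, b):
    """Return True if two (start, end) ranges share an index.

    Assumption: both bounds are inclusive.
    """
    return a[0] <= b[1] and b[0] <= a[1]


# Values copied from the configuration in this notebook.
train_set = (0, 799)
test_set = (800, 850)

if ranges_overlap(train_set, test_set):
    raise ValueError("train_set and test_set overlap")
```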

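Third, we define our machine learning task. It takes its parameters from the base configuration above; `hyper_vars` and `hyper_params` are left empty, so no hyperparameters are varied.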

In [None]:
search = GridSearch(
    task=cf.CollaborativeFiltering,
    param_base=base_task_config,
    hyper_vars=[],
    hyper_params=[],
    instances=instances
)
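
To illustrate the idea behind a grid search, the sketch below expands a base configuration into one configuration per combination of hyperparameter values. It assumes that `hyper_vars` is a list of parameter names and `hyper_params` is a parallel list of candidate values; that reading is an assumption, and `expand_grid` is a hypothetical function, not Cirrus's implementation.

```python
import itertools


def expand_grid(param_base, hyper_vars, hyper_params):
    """Return one config dict per combination of hyperparameter values."""
    configs = []
    # product() with no arguments yields a single empty combination,
    # so empty lists produce exactly one config: a copy of the base.
    for combo in itertools.product(*hyper_params):
        config = dict(param_base)  # shallow copy; the base is not modified
        config.update(zip(hyper_vars, combo))
        configs.append(config)
    return configs


base = {"learning_rate": 0.01, "minibatch_size": 20}

# No hyperparameters to vary: a single configuration.
print(len(expand_grid(base, [], [])))

# Two values for each of two hyperparameters: four configurations.
grid = expand_grid(
    base,
    ["learning_rate", "minibatch_size"],
    [[0.01, 0.1], [20, 40]],
)
print(len(grid))
```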

## Run
---

Next, we run our machine learning task.

In [None]:
search.run(UI=True)

Run this cell to see the model's current accuracy.

In [None]:
graph.display_dash()

## Cleanup
---

When we're satisfied with the results, we kill our task.

In [None]:
search.kill_all()

We also need to terminate our instance to avoid further charges.

In [None]:
inst.cleanup()
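
Because a forgotten instance keeps costing money, it can help to make cleanup automatic. The sketch below shows one generic pattern, `try`/`finally`, using a stand-in class so that it runs without Cirrus or AWS. `FakeInstance` and `run_with_cleanup` are hypothetical names introduced for this example.

```python
class FakeInstance:
    """Stand-in for a cloud instance; only records whether cleanup ran."""

    def __init__(self):
        self.cleaned_up = False

    def cleanup(self):
        self.cleaned_up = True


def run_with_cleanup(inst, work):
    """Call work(), then always call inst.cleanup(), even if work() raises."""
    try:
        return work()
    finally:
        inst.cleanup()


def failing_work():
    raise RuntimeError("simulated failure during training")


demo = FakeInstance()
try:
    run_with_cleanup(demo, failing_work)
except RuntimeError:
    pass
print(demo.cleaned_up)
```

In a notebook, where the work is spread across cells, a single `try`/`finally` is awkward. Registering the cleanup method with the standard library's `atexit.register` is an alternative that runs it on normal interpreter exit.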