# Streaming Videos Cache Optimization Problem
Cache servers placed between the data center and the endpoints let us serve video requests with lower latency than the data center alone. Based on the predicted requests from each endpoint, can we find a way to optimize the distribution and storage of the videos across the caches?

## Input and Parsing
Data is provided as text, which we parse into tokens. We read from a file in the `data/` directory (`data/kittens.in.txt` in this case).
The `problem_description` array holds, in order, from 0 to 4, the number of videos, number of endpoints, number of request descriptions, number of cache servers, and the capacity of each cache server in megabytes.
The `video_size` array holds the size of each video in MB.
We then parse one block per endpoint to connect each endpoint to its caches. The `endpoint_data_description` list stores the latency between each endpoint and the data center (the endpoint is the index, the latency is the stored value), and the `endpoint_cache_description` dictionary has the specification key:(endpoint, cache) -> value:latency.
Finally, `request_description` is a dictionary that holds the number of requests for a given video at a given endpoint, with the specification key:(endpoint, video) -> value:nº of requests.

In [23]:
def parse_results(file: str):
    endpoint_data_description = []
    endpoint_cache_description = {}
    request_description = {}
    # Use a separate name for the handle so it does not shadow the `file` argument.
    with open('data/' + file, 'r') as input_file:
        # Line 1: videos, endpoints, request descriptions, caches, cache capacity (MB).
        problem_description = [int(token) for token in input_file.readline().split()]
        # Line 2: size of each video in MB.
        video_size = [int(token) for token in input_file.readline().split()]
        # One block per endpoint: a header line, then one line per connected cache.
        for endpoint in range(problem_description[1]):
            tokens = input_file.readline().split()
            endpoint_data_description.append(int(tokens[0]))
            connections = int(tokens[1])
            for _ in range(connections):
                tokens = input_file.readline().split()
                # Store cache ids and latencies as integers, not strings.
                endpoint_cache_description[(endpoint, int(tokens[0]))] = int(tokens[1])
        # One line per request description: video, endpoint, number of requests.
        for _ in range(problem_description[2]):
            tokens = input_file.readline().split()
            key = (int(tokens[1]), int(tokens[0]))
            # Accumulate repeated (endpoint, video) pairs.
            request_description[key] = request_description.get(key, 0) + int(tokens[2])
    return problem_description, video_size, endpoint_data_description, endpoint_cache_description, request_description

problem_description, video_size, endpoint_data_description, endpoint_cache_description, request_description = parse_results('kittens.in.txt')
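
To make the shape of these five structures concrete, here is a minimal sketch that parses a small sample from a string instead of a file. The sample data and the helper name `parse_text` are assumptions made for illustration; they are not one of the notebook's data files or functions.

```python
SAMPLE = """\
5 2 4 3 100
50 50 80 30 110
1000 3
0 100
2 200
1 300
500 0
3 0 1500
0 1 1000
4 0 500
1 0 1000
"""


def parse_text(text: str):
    # Hypothetical helper: same layout as parse_results, but reads from a string.
    lines = iter(text.strip().splitlines())
    problem_description = [int(t) for t in next(lines).split()]
    video_size = [int(t) for t in next(lines).split()]
    endpoint_data_description = []
    endpoint_cache_description = {}
    for endpoint in range(problem_description[1]):
        latency, connections = (int(t) for t in next(lines).split())
        endpoint_data_description.append(latency)
        for _ in range(connections):
            cache, cache_latency = (int(t) for t in next(lines).split())
            endpoint_cache_description[(endpoint, cache)] = cache_latency
    request_description = {}
    for _ in range(problem_description[2]):
        video, endpoint, requests = (int(t) for t in next(lines).split())
        key = (endpoint, video)
        request_description[key] = request_description.get(key, 0) + requests
    return (problem_description, video_size, endpoint_data_description,
            endpoint_cache_description, request_description)


sample = parse_text(SAMPLE)
print(sample[2])  # data center latency per endpoint
print(sample[3])  # (endpoint, cache) -> latency
print(sample[4])  # (endpoint, video) -> number of requests
```

Endpoint 1 has no cache connections in this sample, so it contributes no keys to `endpoint_cache_description`.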


## Problem State
The problem state is defined by a dictionary that maps a cache to a list of videos. We must be careful to respect the underlying constraint: the total size of the videos stored in a cache must not exceed the cache capacity.
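
The capacity constraint could be checked with something like the sketch below. The function name `is_valid` is hypothetical, and counting a video listed twice in the same cache only once is an assumption.

```python
def is_valid(problem_state: dict, video_size: list, cache_capacity: int) -> bool:
    # A state is valid when no cache stores more MB than its capacity.
    for cache, videos in problem_state.items():
        # Assumption: a video listed twice in one cache only takes space once.
        used = sum(video_size[video] for video in set(videos))
        if used > cache_capacity:
            return False
    return True


sizes = [50, 50, 80, 30, 110]  # illustrative video sizes in MB
print(is_valid({0: [2], 1: [3, 1], 2: [0, 1]}, sizes, 100))
print(is_valid({0: [2, 3]}, sizes, 100))
```

In the notebook's terms, `cache_capacity` would be `problem_description[4]`.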

## Goal and Scoring
Our goal is to maximize the time saved by the caches. To do this, we go through the current cache configuration and compute how much time we save in total across all requests. We then multiply this value, in milliseconds, by 1000 to get the score.
Time Saved (Request Description) = Nº of Requests * (Latency of Data Center - min(Latency of Cache with Video))
The minimum is taken over the caches that hold the video and are connected to the endpoint; if there is no such cache, the time saved for that request description is 0.
Also, when re-scoring our problem, it makes more sense to update the current score with the alteration instead of re-calculating the score from scratch.
The score presumes a valid problem state.
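
As a worked example for a single request description, the sketch below computes the time saved as the number of requests times the difference between the data center latency and the lowest latency among the caches that hold the video. The helper name `time_saved` and the numbers are illustrative assumptions.

```python
def time_saved(request_number: int, data_center_latency: int, cache_latencies: list) -> int:
    # Fall back to the data center when no connected cache holds the video.
    best_latency = min(cache_latencies + [data_center_latency])
    return request_number * (data_center_latency - best_latency)


print(time_saved(1500, 1000, [300, 100]))  # 1500 * (1000 - 100)
print(time_saved(1000, 500, []))           # no cache available, nothing saved
```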

In [25]:
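import math


def score(problem_state: dict, endpoint_data_description: list, endpoint_cache_description: dict, request_description: dict) -> int:
    total_time_saved = 0
    # Iterate over .items() so each entry unpacks into ((endpoint, video), requests).
    for (endpoint, video), request_number in request_description.items():
        data_center_latency = endpoint_data_description[endpoint]
        # Without a usable cache, the request is served by the data center.
        cache_latency = data_center_latency
        for cache, videos in problem_state.items():
            # Only caches that hold the video and are connected to the endpoint count.
            if video in videos and (endpoint, cache) in endpoint_cache_description:
                cache_latency = min(cache_latency, endpoint_cache_description[(endpoint, cache)])
        total_time_saved += (data_center_latency - cache_latency) * request_number
    return math.floor(total_time_saved * 1000)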

### Updating the score
Now, for computational efficiency, we create a re-score function that, given a cache change, the current score, and the descriptions, updates the score.

In [14]:
#def re_score(problem_state)
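
One possible shape for such a function is sketched below. It only covers the case where a single video is added to a single cache. The name `re_score_after_adding` and its signature are assumptions, not the notebook's final design.

```python
def re_score_after_adding(current_score, problem_state, cache, video,
                          endpoint_data_description, endpoint_cache_description,
                          request_description):
    # problem_state is the state BEFORE the video is added to the cache.
    extra_time_saved = 0
    for (endpoint, requested_video), request_number in request_description.items():
        # Only requests for this video from endpoints connected to this cache can change.
        if requested_video != video or (endpoint, cache) not in endpoint_cache_description:
            continue
        # Best latency currently available for this request.
        current_latency = endpoint_data_description[endpoint]
        for other_cache, videos in problem_state.items():
            if video in videos and (endpoint, other_cache) in endpoint_cache_description:
                current_latency = min(current_latency,
                                      endpoint_cache_description[(endpoint, other_cache)])
        # Best latency once the video is also stored in `cache`.
        new_latency = min(current_latency, endpoint_cache_description[(endpoint, cache)])
        extra_time_saved += (current_latency - new_latency) * request_number
    return current_score + extra_time_saved * 1000
```

This still scans every request description, so indexing the requests by video would be a natural next step. Removing a video is harder, because the next-best cache for each affected request has to be found again.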

In [5]:
## Meta Heuristics

### Tabu 



In [27]:
def tabu(initial_solution: dict, endpoint_data_description: list, endpoint_cache_description: dict, request_description: dict):
    taboos = []  # List with the index of the dict
    best = initial_solution
    best_score = score(initial_solution, endpoint_data_description, endpoint_cache_description, request_description)


tabu({}, endpoint_data_description, endpoint_cache_description, request_description)

