# AIROYoung Exercise Solved

Let's use the classes we **exposed in C++**! Data input in a complex application can be a real pain, yet our algorithms still need to be efficient. We keep as many components of our software as possible outside the nasty C++ world, so let's see **how easily Python can deal** with these tasks!

In [2]:
import networkx as nx
from scipy.spatial.distance import cdist
import numpy as np
from itertools import product
import matplotlib.pyplot as plt
import codecs
import datetime as dt
from ast import literal_eval as make_tuple

the following packages are the ones we are going to use to read input data from different sources

In [3]:
import ConfigParser as cp
import json
import csv
import xml.etree.ElementTree as xml
import pandas as pd

here we import our own files and packages

In [4]:
import Exercise_Solved.AYT_exercise as AYT
import utils.misc as ut

## CONFIG

The file *config.ini* contains the date we are considering in our exercise

In [5]:
conf_parser = cp.RawConfigParser()

read the value from the file and assign it to an object of type *AYT.date*

In [6]:
conf_parser.read('Exercise_Solved/data/config.ini');
date = ut.stringToDate(conf_parser.get('initialize', 'date'))
date = AYT.date(date.year, date.month, date.day)

In [7]:
print date, type(date)

2019-Mar-29 <class 'Exercise_Solved.AYT_exercise.date'>
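
The helper `ut.stringToDate` is not shown in this notebook, so here is a minimal sketch of what reading a date from an INI file can look like with the standard library only. The file content and the `%Y-%m-%d` format are assumptions made for illustration, and `read_date` is a hypothetical name. The sketch uses Python 3, where the module is called `configparser` (in the Python 2 cells above it is `ConfigParser`).

```python
import configparser
import datetime as dt

# Hypothetical file content: the real config.ini is not shown in this notebook.
SAMPLE_INI = """
[initialize]
date = 2019-03-29
"""

def read_date(ini_text, fmt='%Y-%m-%d'):
    """Parse the 'date' option of the [initialize] section into a datetime.date."""
    parser = configparser.RawConfigParser()
    parser.read_string(ini_text)          # read() would take a file path instead
    raw = parser.get('initialize', 'date')
    return dt.datetime.strptime(raw, fmt).date()

parsed = read_date(SAMPLE_INI)
print(parsed.isoformat())
```

The resulting `year`, `month` and `day` attributes are what the cell above passes on to `AYT.date`.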


## XML

We want to create a list of recipes, the ingredients are contained in an xml file

In [8]:
recipes = []

In [9]:
xml_tree = xml.parse('Exercise_Solved/data/ingredients.xml').getroot()
input_recipes = xml_tree.findall('recipe')

make use of the following dictionaries to get the AYT enum value from the string read in the xml file

In [10]:
cook_type_dict = {'oven' : AYT.CookingType.Oven, 'pot' : AYT.CookingType.Pot, 'pan' : AYT.CookingType.Pan}
conserv_dict = {'anywhere' : AYT.Conservation.Anywhere, 'fridge' : AYT.Conservation.Fridge, 'refrigerator' : AYT.Conservation.Refrigerator}

use a utility function available to read the ingredients of the recipe

In [11]:
for recipe in input_recipes:
    r = AYT.Recipe(recipe.get('name'))
    r.cooking = cook_type_dict[ut.getXMLelem(recipe, 'cooking_type')]
    r.butter_g = float(ut.getXMLelem(recipe, 'butter'))
    r.chocolate_g = float(ut.getXMLelem(recipe, 'chocolate'))
    r.conservation = conserv_dict[ut.getXMLelem(recipe, 'conservation')]
    r.cooking_duration = AYT.time_duration(0, int(ut.getXMLelem(recipe, 'cooking_duration')),0)
    r.milk_ml = float(ut.getXMLelem(recipe, 'milk'))
    r.eggs = int(ut.getXMLelem(recipe, 'eggs'))
    r.flour_g = float(ut.getXMLelem(recipe, 'flour'))
    recipes.append(r)

In [12]:
for r in recipes:
    print r

recipe_1
recipe_2
recipe_3
recipe_4
recipe_5
recipe_6
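
The utility `ut.getXMLelem` is not shown here. A plausible stand-in, under the assumption that each `<recipe>` holds one child element per field, could look like the sketch below. The XML snippet and the function name `get_xml_elem` are hypothetical, only the tag names come from the loop above.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout: the real ingredients.xml is not shown in this notebook.
SAMPLE_XML = """
<recipes>
  <recipe name="recipe_1">
    <cooking_type>oven</cooking_type>
    <butter>100</butter>
    <eggs>2</eggs>
  </recipe>
</recipes>
"""

def get_xml_elem(node, tag):
    """Return the stripped text of the first child called `tag`.

    Hypothetical stand-in for ut.getXMLelem; raises KeyError when the
    element is missing or empty instead of failing later on None.
    """
    child = node.find(tag)
    if child is None or child.text is None:
        raise KeyError('missing element: {}'.format(tag))
    return child.text.strip()

root = ET.fromstring(SAMPLE_XML)
parsed = []
for recipe in root.findall('recipe'):
    parsed.append({
        'name': recipe.get('name'),                        # XML attribute
        'cooking': get_xml_elem(recipe, 'cooking_type'),   # child element text
        'butter_g': float(get_xml_elem(recipe, 'butter')),
        'eggs': int(get_xml_elem(recipe, 'eggs')),
    })
```

Failing early on a missing tag makes a malformed input file easier to spot than a `None` passed to `float()`.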


check how many common ingredients there are between the first two recipes

In [13]:
AYT.Recipe.numCommonIngredients(recipes[0], recipes[1])

3

does the first recipe take longer to cook than the second one?

In [14]:
recipes[0].cooksSlowerThan(recipes[1])

False

## XLS

In an Excel file we can find all data corresponding to the employees (cooks and waiters)

In [15]:
cooks, waiters, employees = [], [], []

read the employee Excel file with pandas functions

In [16]:
df_empl = pd.read_excel('Exercise_Solved/data/employees.xlsx')

In [17]:
df_empl

|   | Name | Role | Time per Day |
|---|---|---|---|
| 0 | tommy | cook | 8 |
| 1 | anna | waiter | 6 |
| 2 | marty | waiter | 4 |
| 3 | lavi | cook | 6 |
| 4 | gasta | NaN | 1 |
| 5 | rosario | cook | 7 |
| 6 | veronica | waiter | 5 |


drop anyone who does not have a role, so they will not work **;p**

In [18]:
df_empl.dropna(inplace=True)

create employees by iterating on dataframe rows

In [19]:
for idx, data in df_empl.iterrows():
    if data.Role == 'cook':
        c = AYT.Cook.create(str(data.Name))
        c.time_per_day = AYT.time_duration(data['Time per Day'], 0, 0)
        cooks.append(c)
    else:
        w = AYT.Waiter.create(str(data.Name))
        w.time_per_day = AYT.time_duration(data['Time per Day'], 0, 0)
        waiters.append(w)

In [20]:
employees = cooks + waiters

In [21]:
for e in employees:
    print 'My name is {}, and {} Today I work {} hours'.format(e.name, e.whatDoIDo(), e.time_per_day)

My name is tommy, and I am a cook! Today I work 08:00:00 hours
My name is lavi, and I am a cook! Today I work 06:00:00 hours
My name is rosario, and I am a cook! Today I work 07:00:00 hours
My name is anna, and I am a Waiter! Today I work 06:00:00 hours
My name is marty, and I am a Waiter! Today I work 04:00:00 hours
My name is veronica, and I am a Waiter! Today I work 05:00:00 hours


## CSV

now we want to assign to each cook the recipes he knows. They are stored in a csv file

In [22]:
recipes_cooks = []

read data with csv package

In [23]:
with codecs.open('Exercise_Solved/data/recipes_of_cooks.csv', 'rb', 'utf-16') as f:
    reader = csv.reader(f)
    for row in reader:
        recipes_cooks.append(row)

*Hint*: you might have encoding issues...try `codecs.open(filename, 'rb', 'utf-16')` and you're good to go!

In [24]:
recipes_cooks

[['Cook',
  'Recipe_1',
  'Recipe_2',
  'Recipe_3',
  'Recipe_4',
  'Recipe_5',
  'Recipe_6'],
 ['tommy', '1', '0', '1', '1', '0', '0'],
 ['rosario', '0', '1', '1', '0', '1', '1'],
 ['lavi', '0', '1', '0', '1', '0', '1']]
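
In Python 3 the `codecs.open` workaround is usually not needed, because the built-in `open` accepts an `encoding` argument. A minimal, self-contained sketch follows. It writes a small UTF-16 file first so that it does not depend on the exercise data; the rows are just a cut-down copy of the output above.

```python
import csv
import io
import os
import tempfile

rows_in = [['Cook', 'Recipe_1', 'Recipe_2'],
           ['tommy', '1', '0']]

# Create a throwaway UTF-16 file so the example does not need the real data.
fd, path = tempfile.mkstemp(suffix='.csv')
os.close(fd)
try:
    with io.open(path, 'w', encoding='utf-16', newline='') as f:
        csv.writer(f).writerows(rows_in)

    # Text mode with an explicit encoding; newline='' leaves line-ending
    # handling to the csv module, as its documentation recommends.
    with io.open(path, 'r', encoding='utf-16', newline='') as f:
        rows_out = list(csv.reader(f))
finally:
    os.remove(path)

print(rows_out)
```

As in the cell above, every value comes back as a string, so the `'1'`/`'0'` flags still need a conversion before they can be summed.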

read the same csv file with pandas and make *Cook* as the index column

In [25]:
recipes_cooks = pd.read_csv(codecs.open('Exercise_Solved/data/recipes_of_cooks.csv', 'rb', 'utf-16'), index_col=['Cook'])

In [26]:
recipes_cooks

| Cook | Recipe_1 | Recipe_2 | Recipe_3 | Recipe_4 | Recipe_5 | Recipe_6 |
|---|---|---|---|---|---|---|
| tommy | 1 | 0 | 1 | 1 | 0 | 0 |
| rosario | 0 | 1 | 1 | 0 | 1 | 1 |
| lavi | 0 | 1 | 0 | 1 | 0 | 1 |


compute for each cook how many recipes he/she knows (use apply function of dataframe)

In [27]:
recipes_cooks.apply(sum, axis=1)

Cook
tommy      3
rosario    4
lavi       3
dtype: int64

compute for each recipe how many cooks know it (use apply function of dataframe)

In [28]:
recipes_cooks.apply(sum, axis=0)

Recipe_1    1
Recipe_2    2
Recipe_3    2
Recipe_4    2
Recipe_5    1
Recipe_6    2
dtype: int64
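
To see what the two `axis` values mean, here are the same totals computed on plain Python lists, using the table printed above. This is only an illustration of the row-wise versus column-wise direction, not a replacement for the pandas calls.

```python
header = ['Recipe_1', 'Recipe_2', 'Recipe_3', 'Recipe_4', 'Recipe_5', 'Recipe_6']
table = {
    'tommy':   [1, 0, 1, 1, 0, 0],
    'rosario': [0, 1, 1, 0, 1, 1],
    'lavi':    [0, 1, 0, 1, 0, 1],
}

# axis=1: the function sees one row at a time, so we get one total per cook.
recipes_per_cook = {cook: sum(flags) for cook, flags in table.items()}

# axis=0: the function sees one column at a time, so we get one total per recipe.
# zip(*rows) transposes the rows into columns.
columns = zip(*table.values())
cooks_per_recipe = {name: sum(col) for name, col in zip(header, columns)}

print(recipes_per_cook)
print(cooks_per_recipe)
```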

assign to each cook in employees the recipes he knows (use *next* to search by condition). `'recipe_{}'.format()` can be useful to find the recipe by its name

In [29]:
for name, recipes_data in recipes_cooks.iterrows():
    cook = next(c for c in employees if c.name == name)
    known_recipes = recipes_data.tolist()
    for i in range(len(known_recipes)):
        if known_recipes[i] == 1:
            r = next(rec for rec in recipes if rec.name == 'recipe_{}'.format(i+1))
            cook.known_recipes.append(r)
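
A note on `next`: when no element satisfies the condition, `next(generator)` raises `StopIteration`. Passing a default as second argument avoids that. The sketch below uses a tiny stand-in class (`Item` and `find_by_name` are hypothetical names) rather than the AYT classes.

```python
class Item(object):
    """Tiny stand-in for any object that has a `name` attribute."""
    def __init__(self, name):
        self.name = name

items = [Item('tommy'), Item('lavi'), Item('rosario')]

def find_by_name(objects, name):
    """Return the first object whose name matches, or None if there is none."""
    # The extra parentheses are required when next() gets a default value.
    return next((o for o in objects if o.name == name), None)

found = find_by_name(items, 'lavi')      # an Item
missing = find_by_name(items, 'gasta')   # None, no exception raised
print(found.name, missing)
```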

create the list of *cooks* by removing the waiters from the employees

In [30]:
cooks2 = list(set(employees) - set(waiters))

for the next task you may find the *join()* and *map* Python functions useful

In [31]:
for c in cooks2:
    print '{} can cook: {}'.format(c.name, ','.join(map(str, c.known_recipes)))

tommy can cook: recipe_1,recipe_3,recipe_4
rosario can cook: recipe_2,recipe_3,recipe_5,recipe_6
lavi can cook: recipe_2,recipe_4,recipe_6


## JSON

We want to collect data about some restaurants in the Padova area (the best I know!). They are stored in a json file. Besides the restaurants, we also want to save the coordinates of each restaurant; together they will form the nodes of a graph

In [32]:
restaurants = []
graph = []

In [33]:
input_json = json.load(open('Exercise_Solved/data/restaurants.json'))

create restaurants and coordinates; for the latter you may take advantage of `make_tuple()`, the alias we gave to `ast.literal_eval` in the imports

In [34]:
for k, v in input_json['restaurants'].iteritems():
    R = AYT.Restaurant(str(k))
    o_time = dt.datetime.strptime(v['open'], '%H:%M')
    c_time = dt.datetime.strptime(v['close'], '%H:%M')
    opening = AYT.time(date, AYT.time_duration(o_time.hour, o_time.minute, 0))
    closing = AYT.time(date, AYT.time_duration(c_time.hour, c_time.minute, 0))
    R.opening_period = AYT.time_period(opening, closing)
    restaurants.append(R)
    coord = make_tuple(v['position'])
    graph.append(coord)
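
Here is a stand-alone sketch of the two parsing steps used in the loop, `strptime` for the opening hours and `literal_eval` for the position. The JSON entry is hypothetical: the key names follow the loop above, but the restaurant name and the values are made up.

```python
import datetime as dt
import json
from ast import literal_eval

# Hypothetical entry; only the key names are taken from the loop above.
SAMPLE_JSON = '''
{"restaurants": {"Example": {"open": "12:00", "close": "23:30", "position": "(0, 1)"}}}
'''

data = json.loads(SAMPLE_JSON)
entry = data['restaurants']['Example']

# '%H:%M' parses a 24-hour clock string; .time() drops the dummy date part.
opening = dt.datetime.strptime(entry['open'], '%H:%M').time()
closing = dt.datetime.strptime(entry['close'], '%H:%M').time()

# literal_eval safely turns the string "(0, 1)" into the tuple (0, 1);
# unlike eval it only accepts Python literals.
position = literal_eval(entry['position'])

print(opening, closing, position)
```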

In [35]:
for R in restaurants:
    print R.name

Da Bepi ae scoe
Dalla Ofelia
Il Falco d'Oro
Rosso Pomodoro
Kopfler
La Rosa dei Venti


Assign employees to restaurants and vice versa

In [36]:
for i in range(len(cooks)):
    cooks[i].restaurant = restaurants[i]
    waiters[i].restaurant = restaurants[i]
    restaurants[i].employees.append(cooks[i])
    restaurants[i].employees.append(waiters[i])

In [37]:
for i in range(len(cooks)):
    print cooks[i].name, 'works', cooks[i].restaurant.name
    print waiters[i].name, 'works', waiters[i].restaurant.name

tommy works Da Bepi ae scoe
anna works Da Bepi ae scoe
lavi works Dalla Ofelia
marty works Dalla Ofelia
rosario works Il Falco d'Oro
veronica works Il Falco d'Oro


let's see our graph (it is a grid)

In [38]:
graph.sort()

In [39]:
graph

[(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]

## Customers

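Customers arrive at random, on average one every 30 minutes. The cell below draws each gap between arrivals (in minutes) from a Poisson distribution with mean 30 and accumulates the gaps; note that a strict Poisson process would have exponentially distributed gaps instead. We consider the first 10 clients arriving from 12:00 at the first restaurant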

In [40]:
arrival_times = map( lambda x: AYT.time(date, AYT.time_duration(12,x,0)), np.cumsum(np.random.poisson(30, 10)))

In [41]:
for a in arrival_times:
    print a

2019-Mar-29 12:38:00
2019-Mar-29 13:00:00
2019-Mar-29 13:32:00
2019-Mar-29 14:04:00
2019-Mar-29 14:32:00
2019-Mar-29 15:02:00
2019-Mar-29 15:34:00
2019-Mar-29 15:57:00
2019-Mar-29 16:34:00
2019-Mar-29 17:00:00
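
For comparison, in a Poisson process the gaps between consecutive arrivals are exponentially distributed. The sketch below samples such arrivals with the standard library, assuming a mean gap of 30 minutes; `poisson_arrivals` is a hypothetical helper and plain `datetime` objects stand in for `AYT.time`. The seed is fixed only to make runs repeatable.

```python
import datetime as dt
import random

def poisson_arrivals(start, mean_gap_minutes, n, rng):
    """Return n arrival times of a Poisson process that starts at `start`."""
    arrivals = []
    current = start
    for _ in range(n):
        # expovariate takes the rate, i.e. 1 / mean gap.
        gap = rng.expovariate(1.0 / mean_gap_minutes)
        current = current + dt.timedelta(minutes=gap)
        arrivals.append(current)
    return arrivals

rng = random.Random(0)                    # fixed seed, for reproducibility only
start = dt.datetime(2019, 3, 29, 12, 0)
arrivals = poisson_arrivals(start, 30.0, 10, rng)

for a in arrivals:
    print(a.strftime('%Y-%m-%d %H:%M:%S'))
```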


create the last customer arriving at the restaurant, let's say he wants to eat the first recipe 

In [42]:
c = AYT.Customer()
c.arrival_time = arrival_times[-1]
c.desired_meal = recipes[0]

check whether at least one cook who knows that recipe is still at the restaurant when the customer gets in

In [43]:
cooks_first_restaurants = [ck for ck in cooks if ck.restaurant.name == restaurants[0].name]
cooks_available = [ck for ck in cooks_first_restaurants if restaurants[0].opening_period.begin() + ck.time_per_day > c.arrival_time]

In [44]:
cooks_for_recipe = [ck for ck in cooks_available if c.desired_meal in ck.known_recipes]

In [45]:
for ck in cooks_for_recipe:
    shift_end = (ck.restaurant.opening_period.begin() + ck.time_per_day).timeOfDay()
    print 'customer arrives at {}. He wants {}. Cook {} works till {} and can cook {}'.format(
        c.arrival_time.timeOfDay(), c.desired_meal, ck.name, shift_end,
        ','.join(map(str, ck.known_recipes)))

customer arrives at 17:00:00. He wants recipe_1. Cook tommy works till 20:00:00 and can cook recipe_1,recipe_3,recipe_4
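
The same two filters (still on shift, knows the recipe) can be sketched with plain `datetime` objects. The 12:00 opening time is inferred from the output above (tommy works 8 hours and leaves at 20:00); the second and third staff entries are made up to show a cook being filtered out, and `can_serve` is a hypothetical name.

```python
import datetime as dt

day = dt.date(2019, 3, 29)
opening = dt.datetime.combine(day, dt.time(12, 0))   # inferred opening time
arrival = dt.datetime.combine(day, dt.time(17, 0))   # last customer above

# (name, hours worked per day, known recipes); the last two rows are made up.
staff = [
    ('tommy', 8, ['recipe_1', 'recipe_3', 'recipe_4']),
    ('short_shift', 4, ['recipe_1']),
    ('other_menu', 8, ['recipe_2']),
]

def can_serve(staff, opening, arrival, meal):
    """Names of the cooks still on shift at `arrival` who know `meal`."""
    result = []
    for name, hours, known in staff:
        shift_end = opening + dt.timedelta(hours=hours)
        if shift_end > arrival and meal in known:
            result.append(name)
    return result

able = can_serve(staff, opening, arrival, 'recipe_1')
print(able)
```

Like the cells above, this assumes every cook starts when the restaurant opens.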


## TSP

Anna is a real glutton, and wants to eat at every restaurant we have! Let's say they all stay open for Anna (or else she would freak out). Anna is also very lazy, and she does not want to walk much through the city, so let's suggest a short tour to her by solving a **Traveling Salesman Problem** on our grid graph!

convert the graph to a NumPy array, a type optimized for matrix computations

In [46]:
grid = np.asarray(graph)

some python magic for creating a distance matrix computed with *Manhattan distance*

In [47]:
M = np.asmatrix(cdist(grid,grid, 'cityblock'))

In [48]:
print M

[[0. 1. 2. 1. 2. 3.]
 [1. 0. 1. 2. 1. 2.]
 [2. 1. 0. 3. 2. 1.]
 [1. 2. 3. 0. 1. 2.]
 [2. 1. 2. 1. 0. 1.]
 [3. 2. 1. 2. 1. 0.]]
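
No magic is really involved: the Manhattan (cityblock) distance between two points is the sum of the absolute differences of their coordinates. A plain-Python sketch on the same six grid points gives the same matrix, with integers instead of floats:

```python
points = [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]

def manhattan(p, q):
    """Sum of absolute coordinate differences between p and q."""
    return sum(abs(a - b) for a, b in zip(p, q))

# dist[i][j] is the distance between points[i] and points[j].
dist = [[manhattan(p, q) for q in points] for p in points]

for row in dist:
    print(row)
```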


create an instance of our network class

In [49]:
network = AYT.Network(len(grid))

In [50]:
for i in range(len(network)):
    network[i] = restaurants[i]

In [51]:
for i,j in product(range(len(network)),range(len(network))):
    if i != j:
        t = AYT.Trip(M[i,j], AYT.time_duration(0, int(M[i,j]), 0))
        network.pushArc(i,j,t)

create an instance of the *NetworkX* graph class

In [52]:
G = nx.from_numpy_matrix(M)

let's solve the problem with our C++ function!

In [None]:
route = AYT.TSP(network)

In [None]:
route

compute the length of the path

In [None]:
sum(M[route[i],route[i+1]] for i in range(len(route)-1))

build a list of the arcs of the route, that is each pair of consecutive nodes

In [None]:
route_edges = zip(route[:-1], route[1:])

In [None]:
route_edges
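
How `AYT.TSP` works internally is not shown in this notebook. With only six nodes, a brute-force search over all permutations is a handy way to cross-check its result. The sketch below is such a check in pure Python, not the C++ algorithm; it assumes a closed tour that starts and ends at node 0, which may differ from the convention used by `AYT.TSP`.

```python
from itertools import permutations

points = [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]

def manhattan(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))

def brute_force_tsp(points):
    """Try every closed tour through all points, starting and ending at node 0."""
    n = len(points)
    best_len, best_tour = None, None
    for perm in permutations(range(1, n)):
        tour = (0,) + perm + (0,)
        length = sum(manhattan(points[a], points[b])
                     for a, b in zip(tour[:-1], tour[1:]))
        if best_len is None or length < best_len:
            best_len, best_tour = length, tour
    return best_len, best_tour

best_len, best_tour = brute_force_tsp(points)

# Same pairing of consecutive nodes as route_edges above; list() is needed
# in Python 3, where zip returns an iterator.
tour_edges = list(zip(best_tour[:-1], best_tour[1:]))
print(best_len, best_tour)
```

On this 2x3 grid every step costs at least 1 and a closed tour has 6 steps, so a length of 6 cannot be beaten. Brute force examines (n-1)! tours, so it is only viable for very small instances.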

finally, let's plot the solution found

In [None]:
pos = nx.circular_layout(G)  # positions for all nodes

nx.draw(G, pos, node_color='orange', with_labels=True)

nx.draw_networkx_edges(G, pos, edgelist=route_edges, width=6, alpha=0.5, edge_color='b')

plt.show()