# Data Processing
We use a data set that contains images, each representing a planning task, 
and a table with the total run time of a collection of planners on these tasks. 
Firstly, we simplify the table to indicate which planners are preferable for each planning task.
Secondly, we load the images and convert them into tensors.

## Data-Frame Processing   
Import the necessary packages.

In [17]:
import os
import numpy as np  # used below for np.percentile
import pandas as pd
from utils import *

Load data frame.

In [18]:
CURRENT_DIR = os.getcwd()
DATA_DIR = os.path.join(CURRENT_DIR, 'IPC-image-data/lifted')
df = pd.read_csv(os.path.join(CURRENT_DIR, 'IPC-image-data/runtimes.csv'))

Let's take a look at the data:

In [19]:
df.head(5)

Unnamed: 0,filename,h2-simpless-dks-celmcut,h2-simpless-dks-cpdbshc900,h2-simpless-dks-900masb50ksccdfp,h2-simpless-oss-900masb50ksbmiasm,h2-simpless-dks-blind,h2-simpless-oss-zopdbsgenetic,h2-simpless-oss-blind,h2-simpless-dks-900masb50ksbmiasm,seq-opt-symba-1,...,DecStar,FDMS1,FDMS2,Metis1,Metis2,Planning_PDBs,Scorpion,SymbolicBidirectional,Symple_1,Symple_2
0,agricola-opt18-p01.pddl,10000.0,480.029,84.7561,978.635,105.055,94.7066,94.4991,87.5693,6.58,...,10000.0,90.123,109.198,10000.0,10000.0,56.453,10000.0,21.796,1743.946,1730.012
1,agricola-opt18-p02.pddl,10000.0,478.332,72.9656,969.578,44.7857,42.1715,42.8049,532.052,17.39,...,10000.0,85.876,293.176,10000.0,10000.0,245.053,1088.101,78.054,10000.0,10000.0
2,agricola-opt18-p03.pddl,10000.0,269.491,100.855,928.244,39.3269,31.5789,31.5499,329.127,56.72,...,10000.0,112.842,405.213,10000.0,10000.0,10000.0,941.246,87.209,10000.0,10000.0
3,agricola-opt18-p04.pddl,10000.0,434.582,339.602,971.922,10000.0,494.11,537.843,284.479,14.7,...,10000.0,355.097,157.46,10000.0,10000.0,139.942,10000.0,52.346,10000.0,10000.0
4,agricola-opt18-p05.pddl,10000.0,594.021,188.336,993.565,100.283,97.581,94.1114,361.916,69.65,...,10000.0,204.673,395.223,10000.0,10000.0,991.816,10000.0,52.934,10000.0,10000.0


Each row contains statistics about one planning problem: 
its name and the time each planner took to solve it. A run time of 10000 marks a planner that reached the timeout.
 
The list of planners is:   

In [20]:
list(df.columns[1:])

['h2-simpless-dks-celmcut',
 'h2-simpless-dks-cpdbshc900',
 'h2-simpless-dks-900masb50ksccdfp',
 'h2-simpless-oss-900masb50ksbmiasm',
 'h2-simpless-dks-blind',
 'h2-simpless-oss-zopdbsgenetic',
 'h2-simpless-oss-blind',
 'h2-simpless-dks-900masb50ksbmiasm',
 'seq-opt-symba-1',
 'h2-simpless-oss-masginfsccdfp',
 'h2-simpless-dks-900masginfsccdfp',
 'h2-simpless-oss-cpdbshc900',
 'h2-simpless-dks-zopdbsgenetic',
 'simpless-oss-masb50kmiasmdfp',
 'h2-simpless-oss-900masb50ksccdfp',
 'simpless-dks-masb50kmiasmdfp',
 'h2-simpless-oss-celmcut',
 'Complementary1',
 'Complementary2',
 'DecStar',
 'FDMS1',
 'FDMS2',
 'Metis1',
 'Metis2',
 'Planning_PDBs',
 'Scorpion',
 'SymbolicBidirectional',
 'Symple_1',
 'Symple_2']
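
To get a feel for the table before labelling it, the sketch below counts how many planners solved each task and names the fastest one. It runs on a small hypothetical frame (the planners `A`, `B`, `C` and the file names are made up), and it assumes, as the threshold code further down suggests, that a run time of 10000 marks a timeout.

```python
import pandas as pd

TIMEOUT = 10000.0  # assumed timeout marker, as in the table above

# Hypothetical miniature of the run-time table.
runtimes = pd.DataFrame({
    "filename": ["task-1.pddl", "task-2.pddl"],
    "A": [10.0, TIMEOUT],
    "B": [15.0, 300.0],
    "C": [TIMEOUT, TIMEOUT],
})
planner_cols = [c for c in runtimes.columns if c != "filename"]

# Number of planners that finished before the timeout, per task.
solved_count = (runtimes[planner_cols] < TIMEOUT).sum(axis=1)
# Name of the planner with the smallest run time, per task.
fastest = runtimes[planner_cols].idxmin(axis=1)

print(pd.DataFrame({"solved": solved_count, "fastest": fastest}))
```

On the real data the same two expressions could be applied to `df` with `planner_cols = list(df.columns[1:])`.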

We aim to train a network that classifies each planning problem by its preferable planners.
For training purposes we create a new, abstracted data-frame.
In the new data-frame we mark 1 where the planner is preferable (among the fastest 25%) and 0 otherwise.

Creation of a new data-frame:

In [21]:
temp_df = df.drop('filename', axis=1)


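For each planning problem we establish a threshold that decides which planners are preferable.
A planner is preferable if it did not reach the timeout and it is faster than 75% of the planners, i.e. its run time is below the 25th percentile of that row.
If the 25th percentile is itself the timeout value (10000), we fall back to the 20th, 15th, 10th, 5th and finally the 1st percentile.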

In [22]:
TIMEOUT = 10000  # run time recorded when a planner reached the timeout
threshold = temp_df.apply(lambda x: np.percentile(x, 25), axis=1)
# Where the percentile is itself a timeout, fall back to lower percentiles.
for q in (20, 15, 10, 5, 1):
    timed_out = threshold == TIMEOUT
    threshold[timed_out] = temp_df.apply(lambda x: np.percentile(x, q), axis=1)[timed_out]
# list(threshold)
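
As a worked illustration of the threshold rule, here is a minimal sketch on two hypothetical tasks with five made-up planners. The helper `row_threshold` is not part of the notebook; it is an assumed per-row formulation of the same fallback order (25, 20, 15, 10, 5, 1). Note that `np.percentile` interpolates linearly by default, so a fallback threshold can lie between a real run time and the timeout value.

```python
import numpy as np
import pandas as pd

TIMEOUT = 10000.0  # timeout marker used in the run-time table

# Hypothetical run times: "easy" is solved by four planners, "hard" by one.
toy = pd.DataFrame(
    {
        "A": [10.0, 50.0],
        "B": [20.0, TIMEOUT],
        "C": [30.0, TIMEOUT],
        "D": [40.0, TIMEOUT],
        "E": [TIMEOUT, TIMEOUT],
    },
    index=["easy", "hard"],
)

def row_threshold(row, percentiles=(25, 20, 15, 10, 5, 1)):
    """Return the first percentile of the row that is not the timeout value."""
    for q in percentiles:
        value = np.percentile(row, q)
        if value != TIMEOUT:
            return value
    return TIMEOUT  # every percentile hit the timeout

toy_threshold = toy.apply(row_threshold, axis=1)
print(toy_threshold)
```

For the "easy" task the 25th percentile is 20, so only planner `A` (run time 10) lies strictly below it. For the "hard" task the 25th percentile is the timeout, so the 20th percentile is used instead; it is interpolated between 50 and 10000, and planner `A` is the only one below it.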

Add the threshold as a new column in the data-frame. 

In [23]:
df['threshold'] = threshold

In [24]:
columns = list(df.columns)
columns.remove('filename')

# Mark every run time strictly below its row's threshold with 1 (preferable).
for col in columns:
    cond = (df[col] < df['threshold']) & (df[col] != -1)
    df.loc[cond, col] = 1

df.head()

Unnamed: 0,filename,h2-simpless-dks-celmcut,h2-simpless-dks-cpdbshc900,h2-simpless-dks-900masb50ksccdfp,h2-simpless-oss-900masb50ksbmiasm,h2-simpless-dks-blind,h2-simpless-oss-zopdbsgenetic,h2-simpless-oss-blind,h2-simpless-dks-900masb50ksbmiasm,seq-opt-symba-1,...,FDMS1,FDMS2,Metis1,Metis2,Planning_PDBs,Scorpion,SymbolicBidirectional,Symple_1,Symple_2,threshold
0,agricola-opt18-p01.pddl,10000.0,480.029,1.0,978.635,105.055,94.7066,94.4991,1.0,1.0,...,1.0,109.198,10000.0,10000.0,1.0,10000.0,1.0,1743.946,1730.012,94.4991
1,agricola-opt18-p02.pddl,10000.0,478.332,72.9656,969.578,1.0,1.0,1.0,532.052,1.0,...,85.876,293.176,10000.0,10000.0,245.053,1088.101,78.054,10000.0,10000.0,72.9656
2,agricola-opt18-p03.pddl,10000.0,269.491,100.855,928.244,1.0,1.0,1.0,329.127,1.0,...,112.842,405.213,10000.0,10000.0,10000.0,941.246,87.209,10000.0,10000.0,87.209
3,agricola-opt18-p04.pddl,10000.0,434.582,339.602,971.922,10000.0,494.11,537.843,1.0,1.0,...,355.097,1.0,10000.0,10000.0,1.0,10000.0,1.0,10000.0,10000.0,339.602
4,agricola-opt18-p05.pddl,10000.0,594.021,188.336,993.565,1.0,1.0,1.0,361.916,1.0,...,204.673,395.223,10000.0,10000.0,991.816,10000.0,1.0,10000.0,10000.0,160.463


In [25]:
# A run that reached the timeout is never preferable.
df.replace(10000, 0, inplace=True)
df.head()

Unnamed: 0,filename,h2-simpless-dks-celmcut,h2-simpless-dks-cpdbshc900,h2-simpless-dks-900masb50ksccdfp,h2-simpless-oss-900masb50ksbmiasm,h2-simpless-dks-blind,h2-simpless-oss-zopdbsgenetic,h2-simpless-oss-blind,h2-simpless-dks-900masb50ksbmiasm,seq-opt-symba-1,...,FDMS1,FDMS2,Metis1,Metis2,Planning_PDBs,Scorpion,SymbolicBidirectional,Symple_1,Symple_2,threshold
0,agricola-opt18-p01.pddl,0.0,480.029,1.0,978.635,105.055,94.7066,94.4991,1.0,1.0,...,1.0,109.198,0.0,0.0,1.0,0.0,1.0,1743.946,1730.012,94.4991
1,agricola-opt18-p02.pddl,0.0,478.332,72.9656,969.578,1.0,1.0,1.0,532.052,1.0,...,85.876,293.176,0.0,0.0,245.053,1088.101,78.054,0.0,0.0,72.9656
2,agricola-opt18-p03.pddl,0.0,269.491,100.855,928.244,1.0,1.0,1.0,329.127,1.0,...,112.842,405.213,0.0,0.0,0.0,941.246,87.209,0.0,0.0,87.209
3,agricola-opt18-p04.pddl,0.0,434.582,339.602,971.922,0.0,494.11,537.843,1.0,1.0,...,355.097,1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,339.602
4,agricola-opt18-p05.pddl,0.0,594.021,188.336,993.565,1.0,1.0,1.0,361.916,1.0,...,204.673,395.223,0.0,0.0,991.816,0.0,1.0,0.0,0.0,160.463


In [26]:
# Every remaining run time (anything not marked 1) is set to 0.
for col in columns:
    cond = (df[col] != 1) & (df[col] != -1)
    df.loc[cond, col] = 0
df.head()

Unnamed: 0,filename,h2-simpless-dks-celmcut,h2-simpless-dks-cpdbshc900,h2-simpless-dks-900masb50ksccdfp,h2-simpless-oss-900masb50ksbmiasm,h2-simpless-dks-blind,h2-simpless-oss-zopdbsgenetic,h2-simpless-oss-blind,h2-simpless-dks-900masb50ksbmiasm,seq-opt-symba-1,...,FDMS1,FDMS2,Metis1,Metis2,Planning_PDBs,Scorpion,SymbolicBidirectional,Symple_1,Symple_2,threshold
0,agricola-opt18-p01.pddl,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,1.0,...,1.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0
1,agricola-opt18-p02.pddl,0.0,0.0,0.0,0.0,1.0,1.0,1.0,0.0,1.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,agricola-opt18-p03.pddl,0.0,0.0,0.0,0.0,1.0,1.0,1.0,0.0,1.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,agricola-opt18-p04.pddl,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,...,0.0,1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0
4,agricola-opt18-p05.pddl,0.0,0.0,0.0,0.0,1.0,1.0,1.0,0.0,1.0,...,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0


In [27]:
# The helper column is no longer needed once the labels are set.
df.drop('threshold', axis=1, inplace=True)

df.head()

Unnamed: 0,filename,h2-simpless-dks-celmcut,h2-simpless-dks-cpdbshc900,h2-simpless-dks-900masb50ksccdfp,h2-simpless-oss-900masb50ksbmiasm,h2-simpless-dks-blind,h2-simpless-oss-zopdbsgenetic,h2-simpless-oss-blind,h2-simpless-dks-900masb50ksbmiasm,seq-opt-symba-1,...,DecStar,FDMS1,FDMS2,Metis1,Metis2,Planning_PDBs,Scorpion,SymbolicBidirectional,Symple_1,Symple_2
0,agricola-opt18-p01.pddl,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,1.0,...,0.0,1.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0
1,agricola-opt18-p02.pddl,0.0,0.0,0.0,0.0,1.0,1.0,1.0,0.0,1.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,agricola-opt18-p03.pddl,0.0,0.0,0.0,0.0,1.0,1.0,1.0,0.0,1.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,agricola-opt18-p04.pddl,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,...,0.0,0.0,1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0
4,agricola-opt18-p05.pddl,0.0,0.0,0.0,0.0,1.0,1.0,1.0,0.0,1.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0
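
The labelling cells above can also be written as a single vectorised expression. The following is a sketch under stated assumptions, not the notebook's own code: the frame, the planner names `A`, `B`, `C` and the threshold values are hypothetical, and the labels come out as integers rather than floats.

```python
import pandas as pd

TIMEOUT = 10000.0  # timeout marker used in the run-time table

# Hypothetical run times and per-task thresholds.
runtimes = pd.DataFrame({
    "filename": ["task-1.pddl", "task-2.pddl"],
    "A": [10.0, 50.0],
    "B": [15.0, TIMEOUT],
    "C": [30.0, TIMEOUT],
})
threshold = pd.Series([20.0, 8010.0], index=runtimes.index)
planner_cols = [c for c in runtimes.columns if c != "filename"]

# 1 where the planner finished and was strictly faster than the row threshold.
is_fast = runtimes[planner_cols].lt(threshold, axis=0)
solved = runtimes[planner_cols].ne(TIMEOUT)
labels = runtimes[["filename"]].join((is_fast & solved).astype(int))

print(labels)
```

Unlike the in-place version, this never confuses a raw run time of exactly 1.0 with a label. If the result is meant to be read back later as `df.csv`, a call such as `labels.to_csv('df.csv', index=False)` would write it, although the notebook does not show that step.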


## Image Processing
We use standard Python packages to load the image data into a NumPy array. 
Then we convert this array into a torch.Tensor.
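
The array-to-tensor step can be sketched as follows. The image here is a randomly generated stand-in rather than a file from the data set, and the 128x128 size, the grayscale format and the scaling to [0, 1] are all assumptions made for illustration.

```python
import numpy as np
import torch

# Hypothetical 128x128 grayscale image standing in for one loaded from disk.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(128, 128), dtype=np.uint8)

# Scale to floats in [0, 1] and add a leading channel dimension: (1, H, W).
tensor = torch.from_numpy(image.astype(np.float32) / 255.0).unsqueeze(0)

print(tuple(tensor.shape), tensor.dtype)
```

`torch.from_numpy` shares memory with the source array, which is why the scaled copy is created first.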

In [28]:
import os
import numpy as np
# import pandas as pd
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms
# import matplotlib.pyplot as plt
# from skimage import io, transform, util
# from scipy.fftpack import fft2
# from sklearn.preprocessing import RobustScaler, MinMaxScaler
from pp_dataset import PlannerPortfolioDataset
from architectures import PlaNet
import argparse
import logging
import time
torch.manual_seed(42)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)

cuda


Get path information:

In [29]:
CURRENT_DIR = os.getcwd()
processed_df_path = os.path.join(CURRENT_DIR, 'df.csv')
image_folder_path = os.path.join(CURRENT_DIR, 'IPC-image-data/lifted/')

In [None]:
plan_dataset = PlannerPortfolioDataset(processed_df_path, image_folder_path, ftransform=exp_dict['fourier'])