# Difficulty Decomposition

Can the difficulty of a task be decomposed into the difficulties of the concepts it contains (linearly, with crosses, logistically, max, min)?

In [96]:
# Settings and imports.
%matplotlib inline
from collections import OrderedDict
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from pandas.plotting import parallel_coordinates
import seaborn as sns
import sklearn
from sklearn.dummy import DummyRegressor
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.preprocessing import PolynomialFeatures
from sklearn.model_selection import cross_validate
from sklearn.pipeline import Pipeline
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor
import data
from features import get_feature_df

sns.set()
pd.options.display.float_format = '{:.2f}'.format

# Preparing Data

In [7]:
tasks = data.load('robomission-2018-03-10/tasks.csv')
ts = data.load('robomission-2018-03-10/task_sessions.csv')
ts = ts[ts.time_spent > 0].copy()
# Cap sessions at 30 minutes (`Series.clip_upper` was removed in pandas 1.0).
ts['time_spent'] = ts.time_spent.clip(upper=30 * 60)
ts['time_log'] = ts.time_spent.apply(np.log)
features = get_feature_df(transform='bin')

**TODO:**
- try smooth features instead of binarized

In [13]:
# Use the median log-time as a proxy for task difficulty.
tasks['difficulty'] = ts.groupby('task').time_log.median()
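
A minimal sketch of why the median of log-times is a reasonable proxy, using plain Python and hypothetical solving times (the task names and numbers below are made up, not taken from the dataset). A single very long session dominates the mean time, but barely moves the median log-time.

```python
import math
from statistics import mean, median

# Hypothetical solving times in seconds, already filtered to > 0
# and clipped at 30 minutes (1800 s).
times = {
    'task-a': [20, 25, 30, 1800],   # easy task with one stuck session
    'task-b': [200, 240, 300],      # consistently harder task
}

# Median of log-times per task, mirroring the groupby above.
difficulty = {
    task: median(math.log(t) for t in task_times)
    for task, task_times in times.items()
}

# Mean raw time per task, for comparison only.
mean_time = {task: mean(task_times) for task, task_times in times.items()}

print(difficulty)
print(mean_time)
```

In this toy data the mean time ranks `task-a` as harder because of the single outlier, while the median log-time ranks `task-b` as harder.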

# Models of Difficulty Composition

In [55]:
def evaluate(model, include_train_scores=False):
    """Return (mean, std) of the 5-fold test scores (negated MSE).

    With include_train_scores=True, the mean train score is appended.
    """
    scores = cross_validate(
        estimator=model,
        X=features, y=tasks.difficulty,
        cv=5,
        scoring=['neg_mean_squared_error'],
        return_train_score=include_train_scores)
    test_scores = scores['test_neg_mean_squared_error']
    result = np.mean(test_scores), np.std(test_scores)
    if include_train_scores:
        train_scores = scores['train_neg_mean_squared_error']
        result += (np.mean(train_scores),)
    return result

## Baseline Model (const)

In [40]:
evaluate(DummyRegressor(strategy='mean'))

(-0.89836097445164853, 0.38335648613983458)

## Additive Model

In [44]:
evaluate(LinearRegression(normalize=True))

(-0.41581329394602651, 0.11380356929978278)
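
The scores above are negated mean squared errors in log-time units, which are hard to read directly. The sketch below is a reading aid, not part of the original analysis: it converts such a score to an RMSE and, assuming `time_spent` is measured in seconds and the natural logarithm is used (as in `np.log`), to a multiplicative error factor on the time scale. The helper names are hypothetical.

```python
import math

def rmse_from_neg_mse(neg_mse):
    """Convert a `neg_mean_squared_error` score into an RMSE."""
    return math.sqrt(-neg_mse)

def time_factor(rmse_log_time):
    """Multiplicative error on the time scale for an RMSE in log-time units."""
    return math.exp(rmse_log_time)

# Mean test scores of the two models above, rounded.
baseline = rmse_from_neg_mse(-0.898)
additive = rmse_from_neg_mse(-0.416)

print(f'baseline: RMSE {baseline:.2f}, factor {time_factor(baseline):.2f}')
print(f'additive: RMSE {additive:.2f}, factor {time_factor(additive):.2f}')
```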

## Additive Model with Crosses

This model overfits strongly (more features than tasks!): the train error is small, while the test error explodes.

In [89]:
feature_crosses = PolynomialFeatures(
    degree=2, interaction_only=True,
    include_bias=False)
model = Pipeline(steps=[
    ('feature-crosses', feature_crosses),
    ('lin-reg', LinearRegression(normalize=True))
])
evaluate(model, include_train_scores=True)

(-9.6670721786403808e+23, 1.281831987588877e+24, -0.072091788461891834)
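
The "more features than tasks" remark follows from how quickly pairwise crosses grow. With `degree=2`, `interaction_only=True` and `include_bias=False`, `PolynomialFeatures` keeps the n original columns and adds one column per unordered pair, so the output has n + n(n-1)/2 columns. The sketch below only does this count; the feature counts in the loop are illustrative, since the actual number of columns in `features` is not shown here.

```python
from math import comb

def n_crossed_features(n_features):
    """Column count for degree=2, interaction_only=True, include_bias=False."""
    return n_features + comb(n_features, 2)

# Illustrative feature counts, not the notebook's actual value.
for n in (10, 20, 40):
    print(n, n_crossed_features(n))
```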

Overfitting can be reduced by a weight penalty (Ridge or Lasso below), but the penalized models only match the simple additive model (test MSE of about 0.42 in each case):

In [75]:
model = Pipeline(steps=[
    ('feature-crosses', feature_crosses),
    ('lin-reg', Ridge(alpha=10.0))
])
evaluate(model, include_train_scores=True)

(-0.41532864572485978, 0.13815873339697538, -0.20306807016861539)

In [88]:
model = Pipeline(steps=[
    ('feature-crosses', feature_crosses),
    ('lin-reg', Lasso(alpha=0.015))
])
evaluate(model, include_train_scores=True)

(-0.41637244725771794, 0.091098034704529735, -0.21904470341687823)

## Logistic Model

?? This only makes sense if we predict difficulty in a limited range such as [0, 1]?
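
One possible reading of this question, as a hedged sketch rather than a settled design: rescale difficulty into the open interval (0, 1), fit a linear model on the logit of the rescaled value, and map predictions back through the sigmoid. The data, the margin `eps` and the helper names are all assumptions made for illustration.

```python
import numpy as np

def to_unit_interval(y, eps=0.05):
    """Min-max scale y into [eps, 1 - eps] so that the logit stays finite."""
    y = np.asarray(y, dtype=float)
    scaled = (y - y.min()) / (y.max() - y.min())
    return eps + (1 - 2 * eps) * scaled

def logit(p):
    return np.log(p / (1 - p))

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Hypothetical toy data: 4 tasks, 2 binary concept features.
X = np.array([[0, 0], [1, 0], [0, 1], [1, 1]], dtype=float)
y = np.array([3.0, 4.0, 4.5, 6.0])

p = to_unit_interval(y)
A = np.hstack([X, np.ones((len(X), 1))])  # append an intercept column
weights, *_ = np.linalg.lstsq(A, logit(p), rcond=None)

# Predictions are guaranteed to stay inside (0, 1).
p_hat = sigmoid(A @ weights)
print(p_hat)
```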

## Decision Tree and Random Forest

In [99]:
evaluate(DecisionTreeRegressor())

(-0.60598388865191788, 0.11805410490233775)

In [100]:
evaluate(RandomForestRegressor())

(-0.42534040235537596, 0.095971388395491283)

## Nearest Neighbors

In [103]:
from sklearn.neighbors import KNeighborsRegressor
evaluate(KNeighborsRegressor())

(-0.45514983559998151, 0.14812178341162904)

## Max Model / Softmax Model

?? Greedy:
1. Take a task with a minimum difficulty.
2. Take any of its concepts.
3. Set the difficulty of this concept to the average difficulty of the tasks that contain this concept but no other unused concept.
4. Remove this concept and covered tasks. Repeat.
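
The greedy steps above can be sketched in plain Python as follows. This is one interpretation, with two assumptions that the steps leave open: "any" concept is taken to be the alphabetically first unused one, and if no task has this concept as its only unused concept, the easiest task's own difficulty is used as a fallback. The function names and the toy data are hypothetical.

```python
def greedy_max_model(task_concepts, task_difficulty):
    """Greedily assign a difficulty to each concept.

    task_concepts: dict mapping task -> set of concept names
    task_difficulty: dict mapping task -> observed difficulty
    """
    # Tasks without any concept can never be covered, so they are ignored.
    remaining = {t: set(c) for t, c in task_concepts.items() if c}
    used = set()
    concept_difficulty = {}
    while remaining:
        # 1. Take a task with minimum difficulty.
        easiest = min(remaining, key=task_difficulty.get)
        # 2. Take any of its unused concepts (assumption: alphabetically first).
        concept = min(remaining[easiest] - used)
        # 3. Average over tasks whose only unused concept is this one.
        covered = [t for t, cs in remaining.items() if cs - used == {concept}]
        # Assumption: fall back to the easiest task if no such task exists.
        basis = covered or [easiest]
        concept_difficulty[concept] = (
            sum(task_difficulty[t] for t in basis) / len(basis))
        # 4. Remove this concept and the covered tasks.
        used.add(concept)
        for t in covered:
            del remaining[t]
    return concept_difficulty


def predict_max(concepts, concept_difficulty):
    """Max model: a task is as difficult as its hardest concept."""
    return max(concept_difficulty[c] for c in concepts)


# Hypothetical toy data.
task_concepts = {
    'a': {'loops'},
    'b': {'loops'},
    'c': {'loops', 'conditions'},
    'd': {'conditions'},
}
task_difficulty = {'a': 3.0, 'b': 4.0, 'c': 5.5, 'd': 5.0}

concept_difficulty = greedy_max_model(task_concepts, task_difficulty)
print(concept_difficulty)
print(predict_max({'loops', 'conditions'}, concept_difficulty))
```

Each iteration marks exactly one concept as used, and every remaining task always keeps at least one unused concept, so the loop terminates after at most as many iterations as there are concepts.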

## Min Model / Softmin Model