
The general recipe is a short list of four main steps:

1.   Compose a function to **read** input data and prepare a TensorFlow Dataset;
2.   Define a **scoring** function that, given a (set of) query-document feature vector(s), produces a score indicating how relevant the document is to the query;
3.   Create a **loss** function that measures how far off the produced scores from step (2) are from the ground truth; and,
4.   Define evaluation **metrics**.

## Imports

In [2]:
import tensorflow as tf
import tensorflow_ranking as tfr

tf.enable_eager_execution()
tf.executing_eagerly()

True

## Constants

In [3]:
# Store the paths to files containing training and test instances.
# As noted above, we will assume the data is in the LibSVM format
# and that the content of each file is sorted by query ID.

# Note: in this demo both paths point to the same 10-row sample file.
_TRAIN_DATA_PATH="../data/interim/test_10.libsvm"
_TEST_DATA_PATH="../data/interim/test_10.libsvm"

# Define a loss function. To find a complete list of available
# loss functions or to learn how to add your own custom function
# please refer to the tensorflow_ranking.losses module.
_LOSS="pairwise_logistic_loss"

# In the TF-Ranking framework, a training instance is represented
# by a Tensor that contains features from a list of documents
# associated with a single query. For simplicity, we fix the shape
# of these Tensors to a maximum list size and call it "list_size,"
# the maximum number of documents per query in the dataset.
# In this demo, we take the following approach:
#   * If a query has fewer documents, its Tensor will be padded
#     appropriately.
#   * If a query has more documents, we shuffle its list of
#     documents and trim the list down to the prescribed list_size.
_LIST_SIZE=5

# The total number of features per query-document pair.
# Our LibSVM files contain four features per plan:
# transport_mode, distance_plan, eta and price.
_NUM_FEATURES=4

# Parameters to the scoring function.
_BATCH_SIZE=32
_HIDDEN_LAYER_DIMS=["20", "10"]

#_OUT_DIR = "../models/tfranking/"
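
To make the padding and trimming rule described for `_LIST_SIZE` concrete, here is a minimal pure-Python sketch. The helper `pad_or_trim` is hypothetical and not part of TF-Ranking; using `-1.0` as the label of a padded slot and zeros as padded features are assumptions made for this illustration.

```python
import random

PADDING_LABEL = -1.0  # assumption: a negative label marks a padded slot


def pad_or_trim(docs, list_size, rng=None):
    """Return exactly `list_size` (label, features) pairs for one query.

    docs: list of (label, feature_list) tuples belonging to a single query.
    """
    docs = list(docs)
    if len(docs) > list_size:
        # Too many documents: shuffle, then keep the first `list_size`.
        (rng or random).shuffle(docs)
        return docs[:list_size]
    # Too few documents: append zero-feature rows with the padding label.
    num_features = len(docs[0][1]) if docs else 0
    padding = [(PADDING_LABEL, [0.0] * num_features)
               for _ in range(list_size - len(docs))]
    return docs + padding


# The four plans of the second query in the sample file, padded to 5.
query = [(1.0, [1, 53156, 6456, 700]),
         (0.0, [3, 48112, 3535, 700]),
         (0.0, [4, 48112, 3655, 16500]),
         (1.0, [1, 51641, 8871, 1200])]
padded = pad_or_trim(query, list_size=5)
print(len(padded), padded[-1])
```

A query with more than `list_size` plans loses some of them at random, so a relevant plan can be dropped from a training list.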

## Function to read in data and form tensorflow dataset

Our train and test datasets are in the LibSVM format, which is normally used for Support Vector Machines.

In [4]:
# Print the first 10 lines of the training file.
with open(_TRAIN_DATA_PATH) as fo:
    for i, line in enumerate(fo):
        if i == 10:
            break
        print(line)

0 qid:2629085 1:9 2:13207 3:2790 4:400

1 qid:2629085 1:3 2:8175 3:1656 4:700

0 qid:2629085 1:4 2:8175 3:1776 4:2700

0 qid:2629085 1:10 2:9620 3:2424 4:1600

0 qid:2629085 1:6 2:8079 3:2443 4:700

0 qid:2629085 1:7 2:13428 3:3777 4:800

1 qid:2848914 1:1 2:53156 3:6456 4:700

0 qid:2848914 1:3 2:48112 3:3535 4:700

0 qid:2848914 1:4 2:48112 3:3655 4:16500

1 qid:2848914 1:1 2:51641 3:8871 4:1200



In this example, the first number on each line is the target:

1 = Important

0 = Not important

`qid` identifies which lines belong to the same query.

E.g. the first query (qid 2629085) has 6 suggested plans and just one of them is important. Here we would take the suggested transport mode from this plan.
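
The line layout described above can be sketched with a few lines of plain Python. The helpers `parse_libsvm_line` and `group_by_query` are hypothetical names used only for this illustration; the notebook itself relies on the parser shipped with `tensorflow_ranking.data`.

```python
from collections import OrderedDict


def parse_libsvm_line(line):
    """Split one 'label qid:ID idx:value ...' line into its parts."""
    tokens = line.split()
    label = float(tokens[0])
    qid = tokens[1].split(":", 1)[1]
    features = {}
    for token in tokens[2:]:
        index, value = token.split(":", 1)
        features[int(index)] = float(value)
    return label, qid, features


def group_by_query(lines):
    """Collect (label, features) pairs per qid, keeping file order."""
    queries = OrderedDict()
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        label, qid, features = parse_libsvm_line(line)
        queries.setdefault(qid, []).append((label, features))
    return queries


sample = [
    "0 qid:2629085 1:9 2:13207 3:2790 4:400",
    "1 qid:2629085 1:3 2:8175 3:1656 4:700",
    "1 qid:2848914 1:1 2:53156 3:6456 4:700",
]
for qid, docs in group_by_query(sample).items():
    print(qid, len(docs), [label for label, _ in docs])
```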

### Input Pipeline

The first step is to construct an input pipeline that reads your dataset and produces a `tf.data.Dataset` object. In this example, we will invoke a LibSVM parser that is included in the `tensorflow_ranking.data` module to generate a `Dataset` from a given file.

We parameterize this function by a `path` argument so that the function can be used to read both training and test data files.

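The function reads the file into a `tf.data.Dataset` and returns a dict of feature tensors, keyed by feature ID, together with the label tensor.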

In [5]:
def input_fn(path):
    """Reads a LibSVM file and returns one (features, labels) batch."""
    dataset = tf.data.Dataset.from_generator(
      tfr.data.libsvm_generator(path, _NUM_FEATURES, _LIST_SIZE),
      output_types=(
          {str(k): tf.float32 for k in range(1,_NUM_FEATURES+1)},
          tf.float32
      ),
      output_shapes=(
          {str(k): tf.TensorShape([_LIST_SIZE, 1])
            for k in range(1,_NUM_FEATURES+1)},
          tf.TensorShape([_LIST_SIZE])
      )
    )

    # Note: shuffle() and repeat() are applied to whatever file is passed in,
    # so predictions made with this input_fn arrive in shuffled order and the
    # prediction generator never runs out.
    dataset = dataset.shuffle(1000).repeat().batch(_BATCH_SIZE)
    return dataset.make_one_shot_iterator().get_next()

In [6]:
train_dataset = tf.data.Dataset.from_generator(
      tfr.data.libsvm_generator(_TRAIN_DATA_PATH, _NUM_FEATURES, _LIST_SIZE),
      output_types=(
          {str(k): tf.float32 for k in range(1,_NUM_FEATURES+1)},
          tf.float32
      ),
      output_shapes=(
          {str(k): tf.TensorShape([_LIST_SIZE, 1])
            for k in range(1,_NUM_FEATURES+1)},
          tf.TensorShape([_LIST_SIZE])
      )
    )

train_dataset

Instructions for updating:
tf.py_func is deprecated in TF V2. Instead, use
    tf.py_function, which takes a python function which manipulates tf eager
    tensors instead of numpy arrays. It's easy to convert a tf eager tensor to
    an ndarray (just call tensor.numpy()) but having access to eager tensors
    means `tf.py_function`s can use accelerators such as GPUs as well as
    being differentiable using a gradient tape.
    


<DatasetV1Adapter shapes: ({1: (5, 1), 2: (5, 1), 3: (5, 1), 4: (5, 1)}, (5,)), types: ({1: tf.float32, 2: tf.float32, 3: tf.float32, 4: tf.float32}, tf.float32)>

In [7]:
a=train_dataset.make_one_shot_iterator().get_next()

Instructions for updating:
Colocations handled automatically by placer.


In [8]:
a[0]

{'1': <tf.Tensor: id=28, shape=(5, 1), dtype=float32, numpy=
 array([[10.],
        [ 9.],
        [ 6.],
        [ 4.],
        [ 7.]], dtype=float32)>,
 '2': <tf.Tensor: id=29, shape=(5, 1), dtype=float32, numpy=
 array([[ 9620.],
        [13207.],
        [ 8079.],
        [ 8175.],
        [13428.]], dtype=float32)>,
 '3': <tf.Tensor: id=30, shape=(5, 1), dtype=float32, numpy=
 array([[2424.],
        [2790.],
        [2443.],
        [1776.],
        [3777.]], dtype=float32)>,
 '4': <tf.Tensor: id=31, shape=(5, 1), dtype=float32, numpy=
 array([[1600.],
        [ 400.],
        [ 700.],
        [2700.],
        [ 800.]], dtype=float32)>}

## Estimator Creation


Next, we turn to the scoring function, which is arguably at the heart of a TF-Ranking model. The idea is to compute a relevance score for a (set of) query-document pair(s). The TF-Ranking model will use training data to learn this function.

Here we formulate a scoring function using a feed-forward network. The function takes the features of a single example (i.e., query-document pair) and produces a relevance score.

In [9]:
def example_feature_columns():
    """Returns the example feature columns."""
    feature_names = [
      "%d" % (i + 1) for i in range(0, _NUM_FEATURES)
    ]
    return {
      name: tf.feature_column.numeric_column(
          name, shape=(1,), default_value=0.0) for name in feature_names
    }

def make_score_fn(mode):
    """Returns a scoring function to build `EstimatorSpec`."""

    def _score_fn(context_features, group_features, mode, params, config):
        """Defines the network to score a document."""
        del params
        del config
        # Define input layer: flatten each feature and concatenate them
        # in sorted feature-name order.
        example_input = [
            tf.layers.flatten(group_features[name])
            for name in sorted(example_feature_columns())
        ]
        # Debug output: show the input tensors each time the graph is built.
        print("MAKE SCORE FUNCTION:")
        print(example_input)
        input_layer = tf.concat(example_input, 1)

        # Hidden layers with tanh activation.
        cur_layer = input_layer
        for layer_width in (int(d) for d in _HIDDEN_LAYER_DIMS):
            cur_layer = tf.layers.dense(
                cur_layer,
                units=layer_width,
                activation="tanh")

        # A single output unit yields the relevance score.
        logits = tf.layers.dense(cur_layer, units=1)
        return logits
    return _score_fn
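
As a rough illustration of what `_score_fn` computes for a single query-document pair, the following pure-Python sketch runs a feature vector through two tanh layers and a final one-unit layer. It is not the TensorFlow implementation: the helpers `dense`, `init_layer` and `make_scorer` are hypothetical, and the weights are small random numbers rather than trained values.

```python
import math
import random


def dense(inputs, weights, biases, activation=None):
    """One fully connected layer: activation(x . W + b)."""
    outputs = []
    for j in range(len(biases)):
        z = biases[j] + sum(x * weights[i][j] for i, x in enumerate(inputs))
        outputs.append(activation(z) if activation else z)
    return outputs


def init_layer(n_in, n_out, rng):
    """Small random weights (an arbitrary choice) and zero biases."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_out)]
               for _ in range(n_in)]
    return weights, [0.0] * n_out


def make_scorer(num_features, hidden_dims, seed=0):
    rng = random.Random(seed)
    dims = [num_features] + list(hidden_dims) + [1]
    layers = [init_layer(a, b, rng) for a, b in zip(dims[:-1], dims[1:])]

    def score(features):
        cur = list(features)
        for weights, biases in layers[:-1]:
            cur = dense(cur, weights, biases, activation=math.tanh)
        weights, biases = layers[-1]
        return dense(cur, weights, biases)[0]  # single logit = score

    return score


# Same layer widths as _HIDDEN_LAYER_DIMS, applied to one plan's features.
score = make_scorer(num_features=4, hidden_dims=[20, 10])
print(score([3.0, 8175.0, 1656.0, 700.0]))
```

Each document is scored independently of the others in its list, which corresponds to `group_size=1` in the estimator built later in this notebook.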

In [10]:
def eval_metric_fns():
    """Returns a dict from name to metric functions.

    This can be customized as follows. Care must be taken when handling padded
    lists.

    def _auc(labels, predictions, features):
        is_label_valid = tf.reshape(tf.greater_equal(labels, 0.), [-1, 1])
        clean_labels = tf.boolean_mask(tf.reshape(labels, [-1, 1]), is_label_valid)
        clean_pred = tf.boolean_mask(tf.reshape(predictions, [-1, 1]), is_label_valid)
        return tf.metrics.auc(clean_labels, tf.sigmoid(clean_pred), ...)
    metric_fns["auc"] = _auc

    Returns:
        A dict mapping from metric name to a metric function with above signature.
    """
    metric_fns = {}
    metric_fns.update({
      "metric/ndcg@%d" % topn: tfr.metrics.make_ranking_metric_fn(
          tfr.metrics.RankingMetricKey.NDCG, topn=topn)
      for topn in [1, 3, 5, 10]
    })

    return metric_fns
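
For intuition about the NDCG@k metrics registered above, here is a small pure-Python sketch for a single query. It uses the common textbook definition (gain `2^label - 1`, discount `1 / log2(rank + 1)`) and treats negative labels as padding; both are assumptions of this illustration and may differ in detail from what `tfr.metrics` computes.

```python
import math


def dcg_at_k(labels_in_ranked_order, k):
    """Discounted cumulative gain of the first k labels."""
    return sum((2.0 ** label - 1.0) / math.log2(rank + 2)
               for rank, label in enumerate(labels_in_ranked_order[:k]))


def ndcg_at_k(labels, scores, k):
    """NDCG@k for one query; padded entries (label < 0) are ignored."""
    pairs = [(s, l) for s, l in zip(scores, labels) if l >= 0]
    ranked = [l for _, l in sorted(pairs, key=lambda p: p[0], reverse=True)]
    ideal = sorted((l for _, l in pairs), reverse=True)
    ideal_dcg = dcg_at_k(ideal, k)
    return dcg_at_k(ranked, k) / ideal_dcg if ideal_dcg > 0 else 0.0


labels = [0, 1, 0]
scores = [0.9, 0.5, 0.1]  # the relevant plan is ranked second
print(ndcg_at_k(labels, scores, 1), ndcg_at_k(labels, scores, 3))
```

With a single relevant plan per query, NDCG@1 is 1 only when that plan receives the highest score.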

In [11]:
def get_estimator(hparams, mode):
    """Create a ranking estimator.

    Args:
        hparams: (tf.contrib.training.HParams) a hyperparameters object.
        mode: (tf.estimator.ModeKeys) passed on to `make_score_fn`.

    Returns:
        A `tf.estimator.Estimator`.
    """
    def _train_op_fn(loss):
        """Defines train op used in ranking head."""
        return tf.contrib.layers.optimize_loss(
            loss=loss,
            global_step=tf.train.get_global_step(),
            learning_rate=hparams.learning_rate,
            optimizer="Adagrad")

    ranking_head = tfr.head.create_ranking_head(
      loss_fn=tfr.losses.make_loss_fn(_LOSS),
      eval_metric_fns=eval_metric_fns(),
      train_op_fn=_train_op_fn)

    return tf.estimator.Estimator(
      model_fn=tfr.model.make_groupwise_ranking_fn(
          group_score_fn=make_score_fn(mode),
          group_size=1,
          transform_fn=None,
          ranking_head=ranking_head),
        params=hparams)
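
The ranking head above is built from `tfr.losses.make_loss_fn(_LOSS)` with `_LOSS = "pairwise_logistic_loss"`. As a rough illustration of what a pairwise logistic loss measures, the sketch below uses the textbook form `log(1 + exp(-(s_i - s_j)))` for every pair where document `i` has a higher label than document `j`. It is not TF-Ranking's implementation; averaging over pairs and skipping negative labels as padding are assumptions of this sketch.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def pairwise_logistic_loss(labels, scores):
    """Mean of log(1 + exp(-(s_i - s_j))) over pairs with label_i > label_j."""
    losses = []
    for label_i, score_i in zip(labels, scores):
        for label_j, score_j in zip(labels, scores):
            if label_i < 0 or label_j < 0:
                continue  # skip padded entries
            if label_i > label_j:
                losses.append(softplus(-(score_i - score_j)))
    return sum(losses) / len(losses) if losses else 0.0


labels = [0, 1, 0, 0, 0]
print(pairwise_logistic_loss(labels, [0.0, 0.0, 0.0, 0.0, 0.0]))   # equal scores
print(pairwise_logistic_loss(labels, [-2.0, 1.2, -2.0, -2.0, -2.0]))  # relevant plan on top
```

When all scores are equal, every pair contributes `log(2)`; the loss shrinks as the relevant plan's score moves above the others.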

In [12]:
hparams = tf.contrib.training.HParams(learning_rate=0.05)
ranker = get_estimator(hparams, tf.estimator.ModeKeys.TRAIN)

INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_model_dir': '/tmp/tmplaniofvd', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f6044364a20>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}


In [13]:
ranker.train(input_fn=lambda: input_fn(_TRAIN_DATA_PATH), steps=100)

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Use groupwise dnn v2.
Instructions for updating:
Use tf.cast instead.
Instructions for updating:
Use keras.layers.flatten instead.
MAKE SCORE FUNCTION:
[<tf.Tensor 'groupwise_dnn_v2/group_score/flatten/Reshape:0' shape=(?, 1) dtype=float32>, <tf.Tensor 'groupwise_dnn_v2/group_score/flatten_1/Reshape:0' shape=(?, 1) dtype=float32>, <tf.Tensor 'groupwise_dnn_v2/group_score/flatten_2/Reshape:0' shape=(?, 1) dtype=float32>, <tf.Tensor 'groupwise_dnn_v2/group_score/flatten_3/Reshape:0' shape=(?, 1) dtype=float32>]
Instructions for updating:
Use keras.layers.dense instead.
Instructions for updating:
Use tf.cast instead.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.


  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "


INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 0 into /tmp/tmplaniofvd/model.ckpt.
INFO:tensorflow:loss = 0.6945124, step = 1
INFO:tensorflow:Saving checkpoints for 100 into /tmp/tmplaniofvd/model.ckpt.
INFO:tensorflow:Loss for final step: 0.25966698.


<tensorflow_estimator.python.estimator.estimator.Estimator at 0x7f6044364208>

In [14]:
fo = open(_TEST_DATA_PATH)
i=0
for f in fo:
    if i != 10:
        print(f)
    else:
        break
    i+=1

0 qid:2629085 1:9 2:13207 3:2790 4:400

1 qid:2629085 1:3 2:8175 3:1656 4:700

0 qid:2629085 1:4 2:8175 3:1776 4:2700

0 qid:2629085 1:10 2:9620 3:2424 4:1600

0 qid:2629085 1:6 2:8079 3:2443 4:700

0 qid:2629085 1:7 2:13428 3:3777 4:800

1 qid:2848914 1:1 2:53156 3:6456 4:700

0 qid:2848914 1:3 2:48112 3:3535 4:700

0 qid:2848914 1:4 2:48112 3:3655 4:16500

1 qid:2848914 1:1 2:51641 3:8871 4:1200



In [15]:
preds = ranker.predict(input_fn=lambda: input_fn(_TEST_DATA_PATH))

In [16]:
type(preds)

generator

In [17]:
next(preds)

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Use groupwise dnn v2.
MAKE SCORE FUNCTION:
[<tf.Tensor 'groupwise_dnn_v2/group_score/flatten/Reshape:0' shape=(?, 1) dtype=float32>, <tf.Tensor 'groupwise_dnn_v2/group_score/flatten_1/Reshape:0' shape=(?, 1) dtype=float32>, <tf.Tensor 'groupwise_dnn_v2/group_score/flatten_2/Reshape:0' shape=(?, 1) dtype=float32>, <tf.Tensor 'groupwise_dnn_v2/group_score/flatten_3/Reshape:0' shape=(?, 1) dtype=float32>]
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Graph was finalized.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
INFO:tensorflow:Restoring parameters from /tmp/tmplaniofvd/model.ckpt-100
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.


array([-2.0263803, -2.0263803,  1.2206907, -2.0263803, -2.0263803],
      dtype=float32)

For each query, take its predictions; the plan with the highest prediction gives the transport mode.

In [18]:
count=0
for i in preds:
    print(count)
    print(i)

    count+=1
    if count == 20:
        break

0
[4.226252   0.44031858 4.226252   4.226252   0.01470059]
1
[4.226252   0.44031858 4.226252   4.226252   0.01470059]
2
[ 1.2206907  1.2206907 -2.0263803 -2.0263803 -2.0263803]
3
[-2.0263803 -2.0263803  1.2206907 -2.0263803  1.2206907]
4
[4.226252   0.44031858 4.226252   4.226252   0.01470059]
5
[-2.0263803  1.2206907 -2.0263803  1.2206907 -2.0263803]
6
[0.44031858 4.226252   4.226252   4.226252   0.01470059]
7
[4.226252   4.226252   0.44031858 4.226252   0.01470059]
8
[-2.0263803  1.2206907 -2.0263803 -2.0263803 -2.0263803]
9
[-2.0263803  1.2206907  1.2206907 -2.0263803 -2.0263803]
10
[4.226252   4.226252   4.226252   0.44031858 0.01470059]
11
[4.226252   4.226252   0.44031858 4.226252   0.01470059]
12
[ 1.2206907 -2.0263803 -2.0263803 -2.0263803  1.2206907]
13
[4.226252   4.226252   4.226252   0.44031858 0.01470059]
14
[ 1.2206907 -2.0263803  1.2206907 -2.0263803 -2.0263803]
15
[-2.0263803 -2.0263803 -2.0263803  1.2206907 -2.0263803]
16
[4.226252   0.44031858 4.226252   4.226252   0.

In [19]:
import pandas as pd

In [20]:
test_X = pd.read_pickle("../data/interim/test_10.pickle")

In [21]:
test_X

Unnamed: 0,sid,click_time,click_mode,distance_plan,eta,price,transport_mode,plan_time,pid,req_time,o_long,o_lat,d_long,d_lat,distance_query,target
4,2629085,2018-10-12 16:28:13,3,13207,2790,400.0,9,2018-10-12 16:27:55,203797.0,2018-10-12 16:27:55,116.35,40.08,116.33,40.03,5.808139,0
5,2629085,2018-10-12 16:28:13,3,8175,1656,700.0,3,2018-10-12 16:27:55,203797.0,2018-10-12 16:27:55,116.35,40.08,116.33,40.03,5.808139,1
6,2629085,2018-10-12 16:28:13,3,8175,1776,2700.0,4,2018-10-12 16:27:55,203797.0,2018-10-12 16:27:55,116.35,40.08,116.33,40.03,5.808139,0
7,2629085,2018-10-12 16:28:13,3,9620,2424,1600.0,10,2018-10-12 16:27:55,203797.0,2018-10-12 16:27:55,116.35,40.08,116.33,40.03,5.808139,0
8,2629085,2018-10-12 16:28:13,3,8079,2443,700.0,6,2018-10-12 16:27:55,203797.0,2018-10-12 16:27:55,116.35,40.08,116.33,40.03,5.808139,0
9,2629085,2018-10-12 16:28:13,3,13428,3777,800.0,7,2018-10-12 16:27:55,203797.0,2018-10-12 16:27:55,116.35,40.08,116.33,40.03,5.808139,0
0,2848914,2018-11-17 18:42:17,1,53156,6456,700.0,1,2018-11-17 12:56:15,101804.0,2018-11-17 12:56:15,116.36,40.07,116.0,40.35,43.65657,1
1,2848914,2018-11-17 18:42:17,1,48112,3535,700.0,3,2018-11-17 12:56:15,101804.0,2018-11-17 12:56:15,116.36,40.07,116.0,40.35,43.65657,0
2,2848914,2018-11-17 18:42:17,1,48112,3655,16500.0,4,2018-11-17 12:56:15,101804.0,2018-11-17 12:56:15,116.36,40.07,116.0,40.35,43.65657,0
3,2848914,2018-11-17 18:42:17,1,51641,8871,1200.0,1,2018-11-17 12:56:15,101804.0,2018-11-17 12:56:15,116.36,40.07,116.0,40.35,43.65657,1


In [22]:
query_10 = pd.read_pickle("../data/interim/query.pickle")

In [23]:
query_10

969006    3260337
969003    3260337
969008    3260337
969007    3260337
969005    3260337
3033      3260360
3035      3260360
3034      3260360
141189    3260361
141185    3260361
Name: sid, dtype: int64

In [24]:
test_X

Unnamed: 0,sid,click_time,click_mode,distance_plan,eta,price,transport_mode,plan_time,pid,req_time,o_long,o_lat,d_long,d_lat,distance_query,target
4,2629085,2018-10-12 16:28:13,3,13207,2790,400.0,9,2018-10-12 16:27:55,203797.0,2018-10-12 16:27:55,116.35,40.08,116.33,40.03,5.808139,0
5,2629085,2018-10-12 16:28:13,3,8175,1656,700.0,3,2018-10-12 16:27:55,203797.0,2018-10-12 16:27:55,116.35,40.08,116.33,40.03,5.808139,1
6,2629085,2018-10-12 16:28:13,3,8175,1776,2700.0,4,2018-10-12 16:27:55,203797.0,2018-10-12 16:27:55,116.35,40.08,116.33,40.03,5.808139,0
7,2629085,2018-10-12 16:28:13,3,9620,2424,1600.0,10,2018-10-12 16:27:55,203797.0,2018-10-12 16:27:55,116.35,40.08,116.33,40.03,5.808139,0
8,2629085,2018-10-12 16:28:13,3,8079,2443,700.0,6,2018-10-12 16:27:55,203797.0,2018-10-12 16:27:55,116.35,40.08,116.33,40.03,5.808139,0
9,2629085,2018-10-12 16:28:13,3,13428,3777,800.0,7,2018-10-12 16:27:55,203797.0,2018-10-12 16:27:55,116.35,40.08,116.33,40.03,5.808139,0
0,2848914,2018-11-17 18:42:17,1,53156,6456,700.0,1,2018-11-17 12:56:15,101804.0,2018-11-17 12:56:15,116.36,40.07,116.0,40.35,43.65657,1
1,2848914,2018-11-17 18:42:17,1,48112,3535,700.0,3,2018-11-17 12:56:15,101804.0,2018-11-17 12:56:15,116.36,40.07,116.0,40.35,43.65657,0
2,2848914,2018-11-17 18:42:17,1,48112,3655,16500.0,4,2018-11-17 12:56:15,101804.0,2018-11-17 12:56:15,116.36,40.07,116.0,40.35,43.65657,0
3,2848914,2018-11-17 18:42:17,1,51641,8871,1200.0,1,2018-11-17 12:56:15,101804.0,2018-11-17 12:56:15,116.36,40.07,116.0,40.35,43.65657,1


In [25]:
fo = open('../data/interim/test_10.libsvm')

for i in fo:
    print(i)

0 qid:2629085 1:9 2:13207 3:2790 4:400

1 qid:2629085 1:3 2:8175 3:1656 4:700

0 qid:2629085 1:4 2:8175 3:1776 4:2700

0 qid:2629085 1:10 2:9620 3:2424 4:1600

0 qid:2629085 1:6 2:8079 3:2443 4:700

0 qid:2629085 1:7 2:13428 3:3777 4:800

1 qid:2848914 1:1 2:53156 3:6456 4:700

0 qid:2848914 1:3 2:48112 3:3535 4:700

0 qid:2848914 1:4 2:48112 3:3655 4:16500

1 qid:2848914 1:1 2:51641 3:8871 4:1200



In [26]:
X = pd.read_pickle("../data/interim/X_10.pickle")

In [27]:
X

Unnamed: 0,transport_mode,distance_plan,eta,price
4,9,13207,2790,400.0
5,3,8175,1656,700.0
6,4,8175,1776,2700.0
7,10,9620,2424,1600.0
8,6,8079,2443,700.0
9,7,13428,3777,800.0
0,1,53156,6456,700.0
1,3,48112,3535,700.0
2,4,48112,3655,16500.0
3,1,51641,8871,1200.0


In [28]:
test_X[[
    'sid',
    'transport_mode',
    'distance_plan',
    'eta', 
    'price'
]]

Unnamed: 0,sid,transport_mode,distance_plan,eta,price
4,2629085,9,13207,2790,400.0
5,2629085,3,8175,1656,700.0
6,2629085,4,8175,1776,2700.0
7,2629085,10,9620,2424,1600.0
8,2629085,6,8079,2443,700.0
9,2629085,7,13428,3777,800.0
0,2848914,1,53156,6456,700.0
1,2848914,3,48112,3535,700.0
2,2848914,4,48112,3655,16500.0
3,2848914,1,51641,8871,1200.0


In [29]:
next(preds)

array([4.226252  , 4.226252  , 4.226252  , 0.44031858, 0.01470059],
      dtype=float32)

In [77]:
preds = ranker.predict(input_fn=lambda: input_fn(_TEST_DATA_PATH), yield_single_examples=True)

In [78]:
type(preds)

generator

In [79]:
import itertools
preds_10 = itertools.islice(preds, 10)  # grab the first 10 predictions

In [82]:
import numpy as np

a = np.zeros((10,5))
a

array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])

In [84]:
import itertools

preds = ranker.predict(input_fn=lambda: input_fn(_TEST_DATA_PATH), yield_single_examples=True)
preds_10 = itertools.islice(preds, 10)  # grab the first 10 predictions
a = np.zeros((10, _LIST_SIZE))  # one row of scores per predicted list

for count, scores in enumerate(preds_10):
    print(count)
    print(scores)
    a[count] = scores

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Use groupwise dnn v2.
MAKE SCORE FUNCTION:
[<tf.Tensor 'groupwise_dnn_v2/group_score/flatten/Reshape:0' shape=(?, 1) dtype=float32>, <tf.Tensor 'groupwise_dnn_v2/group_score/flatten_1/Reshape:0' shape=(?, 1) dtype=float32>, <tf.Tensor 'groupwise_dnn_v2/group_score/flatten_2/Reshape:0' shape=(?, 1) dtype=float32>, <tf.Tensor 'groupwise_dnn_v2/group_score/flatten_3/Reshape:0' shape=(?, 1) dtype=float32>]
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /tmp/tmplaniofvd/model.ckpt-100
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
0
[4.226252   0.44031858 4.226252   4.226252   0.01470059]
1
[ 1.2206907 -2.0263803 -2.0263803 -2.0263803 -2.0263803]
2
[ 1.2206907 -2.0263803 -2.0263803  1.2206907 -2.0263803]
3
[4.226252   4.226252   0.44031858 4.226252   0.01470059]
4
[4.226252   4.226252   0.44031858 4.226252   0.01470059]
5
[ 1.2206907  1

In [93]:
a[:,0]

array([ 4.22625208,  1.22069073,  1.22069073,  4.22625208,  4.22625208,
        1.22069073,  0.44031858, -2.0263803 ,  1.22069073,  4.22625208])

In [96]:
test_X = test_X[[
    'sid',
    'transport_mode',
    'distance_plan',
    'eta', 
    'price'
]]

In [100]:
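# Caution: a[:, 0] is the first score of each of the 10 predicted lists
# (one list per query, in shuffled order), not one score per row of test_X.
test_X = test_X.assign(yhat = a[:,0])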

In [103]:
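# Note: max() is computed per column, so each row below mixes values
# from different plans of the same query.
test_X.groupby("sid").max()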

Unnamed: 0_level_0,transport_mode,distance_plan,eta,price,yhat
sid,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1
2629085,10,13428,3777,2700.0,4.226252
2848914,4,53156,8871,16500.0,4.226252
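
Because `groupby("sid").max()` takes the maximum of every column separately, the `transport_mode` shown for a query is not necessarily the one belonging to the highest `yhat`. A minimal pure-Python sketch of the intended selection, one plan per query with the highest score, could look as follows. The helper `best_plan_per_query` is hypothetical and the scores in `rows` are made up for illustration; they are not model output.

```python
def best_plan_per_query(rows):
    """rows: iterable of dicts with 'sid', 'transport_mode' and 'yhat'.

    Returns a dict mapping each sid to the transport_mode of its
    highest-scoring plan.
    """
    best = {}
    for row in rows:
        current = best.get(row["sid"])
        if current is None or row["yhat"] > current["yhat"]:
            best[row["sid"]] = row
    return {sid: row["transport_mode"] for sid, row in best.items()}


# Hypothetical per-plan scores, for illustration only.
rows = [
    {"sid": 2629085, "transport_mode": 9, "yhat": -2.0},
    {"sid": 2629085, "transport_mode": 3, "yhat": 1.2},
    {"sid": 2848914, "transport_mode": 1, "yhat": 4.2},
    {"sid": 2848914, "transport_mode": 3, "yhat": 0.4},
]
print(best_plan_per_query(rows))
```

With pandas, the usual idiom for the same selection is `test_X.loc[test_X.groupby("sid")["yhat"].idxmax()]`, provided `yhat` really holds one score per row.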
