# Polynomial Regression Problem

## Solution to the polynomial regression example on HackerRank, using the scikit-learn library and a simple neural network architecture

### The Problem

Charlie wants to purchase office-space. He does a detailed survey of the offices and corporate complexes in the area, and tries to quantify a lot of factors, such as the distance of the offices from residential and other commercial areas, schools and workplaces; the reputation of the construction companies and builders involved in constructing the apartments; the distance of the offices from highways, freeways and important roads; the facilities around the office space and so on.

Each of these factors is quantified, normalized and mapped to a value on a scale of 0 to 1. Charlie then makes a table. Each row in the table corresponds to Charlie's observations for a particular house. If Charlie has observed and noted F features, the row contains F values separated by a single space, followed by the office-space price in dollars/square-foot. If Charlie makes observations for H houses, his observation table has (F+1) columns and H rows, and a total of (F+1) * H entries.

Charlie does several such surveys and provides you with the tabulated data. At the end of these tables are some rows which have just F columns (the price per square foot is missing). Your task is to predict these prices. F can be any integer between 1 and 5, both inclusive. There is one important observation which Charlie has made: the prices per square foot are (approximately) a polynomial function of the features in the observation table. This polynomial always has an order less than 4.
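To make the size of this hypothesis space concrete, the sketch below enumerates the monomials of a polynomial of total degree at most 3 in F features. The helper name `monomial_exponents` is hypothetical and is not part of the original solution; degree 3 is used because it is the largest order the statement allows.

```python
from itertools import combinations_with_replacement


def monomial_exponents(n_features, max_degree):
    """Return one exponent tuple per monomial of total degree <= max_degree."""
    terms = []
    for degree in range(max_degree + 1):
        # Each multiset of feature indices of size `degree` is one monomial.
        for combo in combinations_with_replacement(range(n_features), degree):
            exponents = [0] * n_features
            for idx in combo:
                exponents[idx] += 1
            terms.append(tuple(exponents))
    return terms


# Number of coefficients to fit for each allowed value of F.
for f in range(1, 6):
    print(f, "features ->", len(monomial_exponents(f, 3)), "terms")
```

With F = 2 this gives 10 terms and with F = 5 it gives 56, so the fit remains an ordinary linear least-squares problem with a small number of coefficients.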

### Input Format

The first line contains two space separated integers, F and N. Over here, F is the number of observed features. N is the number of rows for which features as well as price per square-foot have been noted.

This is followed by a table having F+1 columns and N rows with each row in a new line and each column separated by a single space. The last column is the price per square foot.

The table is immediately followed by integer T followed by T rows containing F columns.
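The format above can be parsed with plain Python. The following is a minimal sketch that assumes the input is exactly as described (test rows carry F values and no price); the function name `parse_input` and the sample values are made up for illustration.

```python
def parse_input(text):
    """Split the raw input into (F, training features, prices, test features)."""
    tokens = text.split()
    n_features, n_train = int(tokens[0]), int(tokens[1])
    pos = 2

    train_x, train_y = [], []
    for _ in range(n_train):
        # F feature values followed by the price per square foot
        row = [float(t) for t in tokens[pos:pos + n_features + 1]]
        pos += n_features + 1
        train_x.append(row[:-1])
        train_y.append(row[-1])

    n_test = int(tokens[pos])
    pos += 1

    test_x = []
    for _ in range(n_test):
        # Test rows have only the F feature values
        test_x.append([float(t) for t in tokens[pos:pos + n_features]])
        pos += n_features

    return n_features, train_x, train_y, test_x


# Made-up sample: F = 2, N = 3, T = 2
sample = """2 3
0.1 0.2 10.0
0.3 0.4 20.0
0.5 0.6 30.0
2
0.7 0.8
0.9 1.0
"""

n_features, train_x, train_y, test_x = parse_input(sample)
print(n_features, len(train_x), len(test_x))
```

Because the parser works on whitespace-separated tokens, it can be fed `sys.stdin.read()` directly when submitting.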

In [10]:
import numpy as np
import pandas as pd
import tensorflow as tf
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

In [14]:
# reading file line by line
'''
file = open('input03.txt', 'r') 

input_str = file.readline()
feat_count, data_count = list(map(int, input_str.split(" ")))
print('Number of features :: ', feat_count)
print('Number of data points :: ', data_count)
data_arr = np.empty((0,feat_count+1), float)
for i in range(data_count):
    input_str = file.readline()
    data_arr = np.append(data_arr, np.array([list(map(float, input_str.split(" ")))]), axis=0)


input_str = file.readline()
test_count = int(input_str)
print('Number of test data points :: ', test_count)
test_arr = np.empty((0,feat_count+1), float)

for i in range(test_count):
    input_str = file.readline()
    test_arr = np.append(test_arr, np.array([list(map(float, input_str.split(" ")))]), axis=0)
'''

# reading file using pandas
data_df = pd.read_csv("./input03.txt", delimiter= ' ', names=['p1','p2','predictions'])
feat_count = int(data_df['p1'][0])
data_count = int(data_df['p2'][0])
print('Number of features :: ', feat_count)
print('Number of data points :: ', data_count)
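# rows 1..data_count hold the training table; the row after that holds T
train_df = data_df[1:data_count + 1]

# everything after the T row is test data
test_df = data_df[data_count + 2:]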


Number of features ::  2
Number of data points ::  100


In [15]:
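import matplotlib.pyplot as plt  # needed for the scatter plots below

# price against the first feature
plt.scatter(train_df['p1'], train_df['predictions'])
plt.xlabel('p1')
plt.ylabel('price per square foot')
plt.show()

# price against the second feature
plt.scatter(train_df['p2'], train_df['predictions'])
plt.xlabel('p2')
plt.ylabel('price per square foot')
plt.show()


data_arr = train_df.values
test_arr = test_df.values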

In [13]:
# Sklearn polynomial regression
polynomial_features= PolynomialFeatures(degree=3)
x_poly = polynomial_features.fit_transform(data_arr[:,0:feat_count])

model = LinearRegression()
model.fit(x_poly, data_arr[:,feat_count])

# reuse the transformer that was fitted on the training data
x_poly_test = polynomial_features.transform(test_arr[:,0:feat_count])
y_poly_pred = model.predict(x_poly_test)

for pred, ground_truth in zip(y_poly_pred,test_arr[:,feat_count:feat_count+1]):
    print(np.round(pred,2), np.round(ground_truth,2))

180.38 [180.38]
1312.07 [1312.07]
440.13 [440.13]
343.72 [343.72]
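The close agreement above is consistent with the prices being a cubic polynomial of the features, in which case the fit reduces to linear least squares over the monomial columns. As a rough sketch of that computation without scikit-learn, the following builds the columns by hand and solves with `np.linalg.lstsq`. The function name `design_matrix` and the synthetic data are assumptions for illustration and are not part of the original notebook.

```python
from itertools import combinations_with_replacement

import numpy as np


def design_matrix(x, max_degree=3):
    """Columns are all monomials of the features up to max_degree (bias first)."""
    x = np.asarray(x, dtype=float)
    n_samples, n_features = x.shape
    columns = []
    for degree in range(max_degree + 1):
        for combo in combinations_with_replacement(range(n_features), degree):
            col = np.ones(n_samples)
            for idx in combo:
                col = col * x[:, idx]
            columns.append(col)
    return np.column_stack(columns)


# Synthetic data (made up): price = 3 + 2*a*b + 5*b^3, features in [0, 1]
rng = np.random.default_rng(0)
x_train = rng.uniform(0.0, 1.0, size=(50, 2))
y_train = 3 + 2 * x_train[:, 0] * x_train[:, 1] + 5 * x_train[:, 1] ** 3

# Least-squares fit of one coefficient per monomial column
coef, *_ = np.linalg.lstsq(design_matrix(x_train), y_train, rcond=None)

x_test = np.array([[0.5, 0.5], [0.2, 0.9]])
y_pred = design_matrix(x_test) @ coef
print(np.round(y_pred, 2))
```

The same design matrix function is applied to the training and the test features, which mirrors fitting the feature transformer once and reusing it for prediction.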


In [9]:
# building a simple neural network architecture using TensorFlow 1.x (graph mode: placeholders and sessions)
# Initializers
sigma = 1
weight_initializer = tf.variance_scaling_initializer(mode="fan_avg", distribution="uniform", scale=sigma)
bias_initializer = tf.zeros_initializer()


# Model architecture parameters
n_feats = 2
n_neurons_1 = 256
n_neurons_2 = 128
n_target = 1


# Placeholder
X = tf.placeholder(dtype=tf.float32, shape=[None, n_feats])
Y = tf.placeholder(dtype=tf.float32, shape=[None])

# Layer 1: Variables for hidden weights and biases
W_hidden_1 = tf.Variable(weight_initializer([n_feats, n_neurons_1]))
bias_hidden_1 = tf.Variable(bias_initializer([n_neurons_1]))
# Layer 2: Variables for hidden weights and biases
W_hidden_2 = tf.Variable(weight_initializer([n_neurons_1, n_neurons_2]))
bias_hidden_2 = tf.Variable(bias_initializer([n_neurons_2]))

# Output layer: Variables for output weights and biases
W_out = tf.Variable(weight_initializer([n_neurons_2, n_target]))
bias_out = tf.Variable(bias_initializer([n_target]))

# Hidden layer
hidden_1 = tf.nn.relu(tf.add(tf.matmul(X, W_hidden_1), bias_hidden_1))
hidden_2 = tf.nn.relu(tf.add(tf.matmul(hidden_1, W_hidden_2), bias_hidden_2))


# Output layer (must be transposed)
out = tf.transpose(tf.add(tf.matmul(hidden_2, W_out), bias_out))

# Cost function
mse = tf.reduce_mean(tf.squared_difference(out, Y))
# Optimizer
opt = tf.train.AdamOptimizer().minimize(mse)


with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    X_train = data_arr[:, 0:n_feats]
    y_train = data_arr[:, n_feats]

    # Number of epochs (each epoch is one full-batch update)
    epochs = 1000
    # Keep the logging lists outside the loop so they are not reset every epoch
    epoch_points = []
    cost_points = []
    for e in range(epochs):
        sess.run(opt, feed_dict={X: X_train, Y: y_train})
        if e % 100 == 0:
            # mse depends on the placeholders, so it needs the same feed_dict
            cost = sess.run(mse, feed_dict={X: X_train, Y: y_train})
            epoch_points.append(e)
            cost_points.append(cost)
            print("The cost for epoch no ", e, ' is ', cost)

    plt.plot(epoch_points, cost_points, 'r--')
    plt.axis([0, epochs, min(cost_points), max(cost_points)])
    plt.show()

    # out has shape [1, T]; take its single row to get one prediction per test point
    pred_list = sess.run(out, feed_dict={X: test_arr[:, 0:n_feats]})[0]

for pred, ground_truth in zip(pred_list, test_arr[:, feat_count:feat_count+1]):
    print(np.round(pred, 2), np.round(ground_truth, 2))
