<a href="https://colab.research.google.com/github/avi26-git/TensforFlow/blob/master/First_steps.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Starting with TensorFlow



Learning Objectives:

- Learn fundamental TensorFlow concepts
- Use the LinearRegressor class in TensorFlow to predict median housing price, at the granularity of city blocks, based on one input feature
- Evaluate the accuracy of a model's predictions using Root Mean Squared Error (RMSE)
- Improve the accuracy of a model by tuning its hyperparameters
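
Since RMSE is the metric named in the objectives, here is a minimal plain-Python sketch of how it is computed. The `rmse` helper and the sample numbers are made up for illustration and are not part of the notebook's own code.

```python
import math


def rmse(predictions, targets):
    """Root mean squared error between two equal-length sequences."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    # Square each error, average the squares, then take the square root.
    squared_errors = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical predictions and labels, in thousands of dollars.
predictions = [200.0, 150.0, 300.0]
targets = [210.0, 140.0, 310.0]
example_rmse = rmse(predictions, targets)
```

Because the errors are squared before averaging, RMSE is expressed in the same units as the label and penalizes large errors more heavily than small ones.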

Loading the necessary libraries

In [0]:
from __future__ import print_function

import math

from IPython import display
from matplotlib import cm
from matplotlib import gridspec
from matplotlib import pyplot as plt
import numpy as np
import pandas as pd
from sklearn import metrics
import tensorflow as tf
from tensorflow.python.data import Dataset

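# Log only errors from TensorFlow (tf.logging is a TensorFlow 1.x API).
tf.logging.set_verbosity(tf.logging.ERROR)
# Show at most 10 rows and one decimal place when displaying DataFrames.
pd.options.display.max_rows = 10
pd.options.display.float_format = '{:.1f}'.format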

This housing data is based on 1990 census data from California.

# Getting and setting data

In [0]:
california_housing_dataframe = pd.read_csv("https://download.mlcc.google.com/mledu-datasets/california_housing_train.csv", sep=",")

We randomize the data to avoid any pathological ordering effects that might harm the performance of Stochastic Gradient Descent.
We also scale median_house_value to units of thousands, so it can be learned a little more easily with learning rates in the range we usually use.
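
To make the shuffle-and-scale step concrete, here is a small sketch on a toy DataFrame. The `toy` data and the seeded generator are assumptions made only so the example is self-contained and repeatable; the real notebook applies the same idea to the full housing DataFrame.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the housing data (values are made up for illustration).
toy = pd.DataFrame({
    "total_rooms": [1000.0, 2000.0, 3000.0, 4000.0],
    "median_house_value": [100000.0, 200000.0, 300000.0, 400000.0],
})

# Shuffle rows by reindexing with a random permutation of the index labels.
rng = np.random.default_rng(seed=0)  # seeded only to make the sketch repeatable
shuffled = toy.reindex(rng.permutation(toy.index.to_numpy()))

# Express the label in thousands of dollars.
shuffled["median_house_value"] /= 1000.0
```

Reindexing moves whole rows, so each feature value stays paired with its own label after the shuffle.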

In [0]:
california_housing_dataframe = california_housing_dataframe.reindex(
    np.random.permutation(california_housing_dataframe.index))
california_housing_dataframe["median_house_value"] /= 1000.0
california_housing_dataframe

|  | longitude | latitude | housing_median_age | total_rooms | total_bedrooms | population | households | median_income | median_house_value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 14 | -114.6 | 32.8 | 15.0 | 1448.0 | 378.0 | 949.0 | 300.0 | 0.9 | 45.0 |
| 4113 | -118.0 | 33.8 | 32.0 | 2161.0 | 432.0 | 1503.0 | 402.0 | 4.3 | 191.4 |
| 4795 | -118.1 | 34.1 | 26.0 | 794.0 | 182.0 | 709.0 | 170.0 | 3.2 | 170.8 |
| 13626 | -122.0 | 37.3 | 32.0 | 2248.0 | 460.0 | 1191.0 | 419.0 | 5.6 | 288.9 |
| 14915 | -122.2 | 37.8 | 52.0 | 1611.0 | 203.0 | 556.0 | 179.0 | 8.7 | 500.0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 10499 | -120.4 | 34.9 | 8.0 | 3119.0 | 620.0 | 1159.0 | 544.0 | 3.5 | 165.5 |
| 824 | -117.1 | 32.8 | 38.0 | 3779.0 | 614.0 | 1495.0 | 614.0 | 4.4 | 184.0 |
| 5887 | -118.2 | 33.9 | 36.0 | 1554.0 | 273.0 | 974.0 | 264.0 | 4.2 | 161.4 |
| 9280 | -119.1 | 36.0 | 27.0 | 1575.0 | 321.0 | 1063.0 | 317.0 | 2.1 | 53.9 |


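Next, we examine the data. The describe() method prints summary statistics for each column: count, mean, standard deviation, minimum, maximum, and quartiles.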

In [0]:
california_housing_dataframe.describe()

|  | longitude | latitude | housing_median_age | total_rooms | total_bedrooms | population | households | median_income | median_house_value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| count | 17000.0 | 17000.0 | 17000.0 | 17000.0 | 17000.0 | 17000.0 | 17000.0 | 17000.0 | 17000.0 |
| mean | -119.6 | 35.6 | 28.6 | 2643.7 | 539.4 | 1429.6 | 501.2 | 3.9 | 207.3 |
| std | 2.0 | 2.1 | 12.6 | 2179.9 | 421.5 | 1147.9 | 384.5 | 1.9 | 116.0 |
| min | -124.3 | 32.5 | 1.0 | 2.0 | 1.0 | 3.0 | 1.0 | 0.5 | 15.0 |
| 25% | -121.8 | 33.9 | 18.0 | 1462.0 | 297.0 | 790.0 | 282.0 | 2.6 | 119.4 |
| 50% | -118.5 | 34.2 | 29.0 | 2127.0 | 434.0 | 1167.0 | 409.0 | 3.5 | 180.4 |
| 75% | -118.0 | 37.7 | 37.0 | 3151.2 | 648.2 | 1721.0 | 605.2 | 4.8 | 265.0 |
| max | -114.3 | 42.0 | 52.0 | 37937.0 | 6445.0 | 35682.0 | 6082.0 | 15.0 | 500.0 |


# Build a model

Let's try to predict median_house_value, which will be our label. We'll use total_rooms as our input feature, and we'll build the model with the TensorFlow Estimator API.
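
For intuition, a linear model with a single feature predicts the label as a weight times the feature plus a bias. The sketch below is plain Python rather than the Estimator API, and the `weight` and `bias` values are made up for illustration; training is what would actually choose them.

```python
def predict(total_rooms, weight, bias):
    """One-feature linear model: label = weight * feature + bias."""
    return weight * total_rooms + bias


# Hypothetical parameters, not values learned from the housing data.
weight = 0.05
bias = 50.0

# Predicted median house value (in thousands) for a block with 2000 rooms.
prediction = predict(2000.0, weight, bias)  # about 150
```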

## Step 1: Define Features and Configure Feature Columns

In order to import the training data into TensorFlow, we need to specify what type of data each feature contains. There are two main types of data:

**Categorical Data:** Data that is textual, e.g., the home style or the words in a real-estate ad.

**Numerical Data:** Data that is a number (integer or float) and that you want to treat as a number. Sometimes you might instead want to treat numerical data (e.g., a postal code) as if it were categorical.
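
As a rough illustration of treating a number as a category, the sketch below one-hot encodes a few postal codes in plain Python. The `one_hot` helper and the sample codes are hypothetical and only show the idea; they are not how TensorFlow's feature columns are implemented.

```python
def one_hot(value, vocabulary):
    """Return a list with 1.0 at the position of value in vocabulary, else 0.0."""
    return [1.0 if value == item else 0.0 for item in vocabulary]


# Hypothetical postal codes: they are numbers, but their ordering and
# magnitude carry no meaning for the model.
postal_codes = [94103, 90210, 94103, 95014]

# The vocabulary is the set of distinct codes, in a fixed order.
vocabulary = sorted(set(postal_codes))

encoded = [one_hot(code, vocabulary) for code in postal_codes]
```

Each code becomes its own indicator, so the model cannot mistake 95014 for "more" than 90210.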

In TensorFlow, we indicate a feature's data type using a construct called a **feature column**. Feature columns store only a description of the feature data; they do not contain the feature data itself.

To start, we use just one numeric input feature, total_rooms. The following code pulls the total_rooms data from our california_housing_dataframe and defines the feature column using numeric_column, which specifies that its data is numeric:

In [0]:
# Define the input feature: total_rooms.
my_feature = california_housing_dataframe[["total_rooms"]]

# Configure a numeric feature column for total_rooms.
feature_columns = [tf.feature_column.numeric_column("total_rooms")]

## Step 2: Define the Target

Next, we define the target, which is median_house_value.

In [0]:
# Define the label.
targets = california_housing_dataframe["median_house_value"]

## Step 3: Configure the LinearRegressor

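Next, we configure a linear regression model using LinearRegressor and train it with GradientDescentOptimizer, whose learning_rate argument controls the size of each gradient step. To be safe, we also apply gradient clipping to our optimizer via clip_gradients_by_norm. Gradient clipping ensures that the magnitude of the gradients does not become too large during training, which can cause gradient descent to fail.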

In [0]:
# Use gradient descent as the optimizer for training the model,
# with a learning rate of 0.0000001.
my_optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.0000001)
# Clip gradients to a maximum norm of 5.0 before they are applied.
my_optimizer = tf.contrib.estimator.clip_gradients_by_norm(my_optimizer, 5.0)

# Configure the linear regression model with our feature columns and optimizer.
linear_regressor = tf.estimator.LinearRegressor(
    feature_columns=feature_columns,
    optimizer=my_optimizer
)
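
To show what clipping by norm means, here is a plain-Python sketch of the general idea. The `clip_by_norm` helper is hypothetical and is not TensorFlow's implementation; it treats the list as the full set of gradients and rescales them together whenever their combined L2 norm exceeds the limit.

```python
import math


def clip_by_norm(gradients, clip_norm):
    """Rescale gradients so that their L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in gradients))
    if norm <= clip_norm:
        # Already within the limit: return an unchanged copy.
        return list(gradients)
    # Shrink every component by the same factor, preserving the direction.
    scale = clip_norm / norm
    return [g * scale for g in gradients]


# A gradient with norm 50 is scaled down to norm 5.
large = clip_by_norm([30.0, 40.0], clip_norm=5.0)

# A gradient with norm 0.5 is left as it is.
small = clip_by_norm([0.3, 0.4], clip_norm=5.0)
```

Clipping changes only the length of the gradient vector, not its direction, so the update still points the same way but cannot take an excessively large step.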