# Estimators - Introduction

Estimators are a high-level TensorFlow API that greatly simplifies machine learning programming.

Estimators encapsulate the following actions:

1. training
2. evaluation
3. prediction
4. export for serving

You can develop a state-of-the-art model with high-level, intuitive code. In short, it is generally much easier to create models with Estimators than with the low-level TensorFlow APIs.

Estimators come in two types:

1. Premade Estimators
2. Custom Estimators



## Premade Estimators

Pre-made Estimators enable you to work at a much higher conceptual level than the base TensorFlow APIs.

You don't have to worry about creating graphs or sessions, since Estimators handle all of that for you.

# Structure of Premade Estimators

**A TensorFlow program using premade Estimators has four parts:**

1. Write one or more dataset-importing functions. A dataset-importing function should return two values:
   - a dictionary in which the keys are feature names and the values are tensors containing the corresponding feature data
   - a tensor containing one or more labels
2. Define the feature columns. Each tf.feature_column identifies a feature name, its type, and any input pre-processing.
3. Instantiate the relevant premade Estimator, passing it the feature columns.
4. Call a training, evaluation, or inference method.

In [None]:
import tensorflow as tf  # assumes the TensorFlow 1.x Estimator API

# Step 1: skeleton of a dataset-importing (input) function.
def my_training_set():
    # Manipulate the dataset, extracting the feature dict and the label.
    # (Skeleton only: feature_dict and label must be built here.)
    return feature_dict, label

# Step 2: define three numeric feature columns.
# global_education_mean is assumed to be computed from the data beforehand.
population = tf.feature_column.numeric_column('population')
crime_rate = tf.feature_column.numeric_column('crime_rate')
median_education = tf.feature_column.numeric_column(
    'median_education',
    normalizer_fn=lambda x: x - global_education_mean)

# Step 3: instantiate an Estimator, passing the feature columns.
estimator = tf.estimator.LinearClassifier(
    feature_columns=[population, crime_rate, median_education])

# Step 4: train, using the input function from Step 1.
estimator.train(input_fn=my_training_set, steps=2000)

## Custom Estimators

The heart of every Estimator is its model function, which builds the graphs for training, evaluation, and prediction.

- With premade Estimators, someone has already written the model function for you.

- With custom Estimators, you must write the model function yourself.

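## Recommended Workflow

- Find a suitable premade Estimator and use it to build your first model.
- Use its results as a baseline, and look for alternative premade Estimators.
- Run experiments to determine which Estimator gives the best results.
- If you need further improvement, build your own custom Estimator.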

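You can also create an Estimator from a Keras model with **tf.keras.estimator.model_to_estimator**, for example **tf.keras.estimator.model_to_estimator(keras_model=keras_inception_v3)**, where keras_inception_v3 is a Keras model.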


# Estimators - Brief

### Using the Estimators API (tf.estimator) together with the Datasets API (tf.data) is strongly recommended

![image.png](attachment:image.png)

An Estimator is any class derived from **tf.estimator.Estimator**.

TensorFlow provides a collection of premade Estimators in **tf.estimator** (for example, **LinearRegressor**) that implement common ML algorithms.

## Input Function

An input function is a function that returns a tf.data.Dataset object that yields the following two-element tuple:

- features - A Python dictionary in which:
  - Each key is the name of a feature.
  - Each value is an array containing all of that feature's values.
- label - An array containing the values of the label for every example.

In [None]:
import numpy as np

def input_evaluation_set():
    """Return a (features, labels) tuple holding two examples."""
    features = {'SepalLength': np.array([6.4, 5.0]),
                'SepalWidth':  np.array([2.8, 2.3]),
                'PetalLength': np.array([5.6, 3.3]),
                'PetalWidth':  np.array([2.2, 1.0])}
    labels = np.array([2, 1])
    return features, labels


An input function can generate the features and labels any way you like, but using the [**Dataset API**](https://www.tensorflow.org/guide/datasets_for_estimators) is recommended.

![image.png](attachment:image.png)

In [1]:
import tensorflow as tf  # assumes the TensorFlow 1.x API

def train_input_fn(features, labels, batch_size):
    """An input function for training."""
    # Convert the inputs to a Dataset of (feature dict, label) pairs.
    dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels))

    # Shuffle, repeat, and batch the examples.
    dataset = dataset.shuffle(1000).repeat().batch(batch_size)

    # Return the read end of the pipeline (TF 1.x iterator API).
    return dataset.make_one_shot_iterator().get_next()
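
To make the shuffle, repeat, and batch steps above concrete, here is a minimal plain-Python sketch of the same idea. It is not how tf.data is implemented: the helper name `batches` is hypothetical, and it reshuffles the whole data set each epoch, whereas `Dataset.shuffle(1000)` shuffles through a fixed-size buffer.

```python
import itertools
import random


def batches(features, labels, batch_size, seed=0):
    """Yield (feature_dict, label_list) batches forever.

    Illustrative stand-in for shuffle().repeat().batch(); `features` maps
    each feature name to a list of values, one per example.
    """
    rng = random.Random(seed)
    indices = list(range(len(labels)))

    def examples():
        # repeat(): loop over the data indefinitely, reshuffling each epoch.
        while True:
            rng.shuffle(indices)
            for i in indices:
                yield i

    stream = examples()
    while True:
        # batch(): group the next `batch_size` example indices together.
        chosen = list(itertools.islice(stream, batch_size))
        batch_features = {name: [values[i] for i in chosen]
                          for name, values in features.items()}
        batch_labels = [labels[i] for i in chosen]
        yield batch_features, batch_labels


# Two of the example values used earlier in this section.
features = {'SepalLength': [6.4, 5.0], 'SepalWidth': [2.8, 2.3]}
labels = [2, 1]
first_two = list(itertools.islice(batches(features, labels, batch_size=2), 2))
```

Each batch keeps every feature value aligned with its label, which is the property the Estimator relies on.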

## Define the feature columns

A feature column is an object describing how the model should use raw input data from the features dictionary.

Think of feature columns as the intermediaries between raw data and Estimators.

**Feature columns transform a wide range of raw data into a format that Estimators can use.**

**tf.feature_column.numeric_column** is used to create feature columns for numerical values.

What about non-numerical values?

Every non-numerical value is converted into a numerical one, typically with one-hot encoding.
For example, with three classes A, B, and C, the value A is represented as [1,0,0].
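
The one-hot idea can be sketched in a few lines of plain Python. The helper `one_hot` is a hypothetical name used only for illustration; it is not part of the tf.feature_column API.

```python
def one_hot(value, classes):
    """Return a list with 1 at the position of `value` in `classes`, else 0."""
    return [1 if value == c else 0 for c in classes]


classes = ['A', 'B', 'C']
encoded = {c: one_hot(c, classes) for c in classes}
# encoded['A'] is [1, 0, 0]; 'B' and 'C' set the second and third slot.
```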

[**You can find various feature column types in this link**](https://www.tensorflow.org/api_docs/python/tf/feature_column)

![image.png](attachment:image.png)

## Bucketized Column

When you don't want to feed a number directly into the model, but instead want to split its value into categories based on numerical ranges, create a **tf.feature_column.bucketized_column**.
 

In [None]:
import tensorflow as tf

# First, convert the raw input to a numeric column.
numeric_feature_column = tf.feature_column.numeric_column("Year")

# Then, bucketize the numeric column on the years 1960, 1980, and 2000.
# Three boundaries create four buckets: <1960, 1960-1979, 1980-1999, and >=2000.
bucketized_feature_column = tf.feature_column.bucketized_column(
    source_column=numeric_feature_column,
    boundaries=[1960, 1980, 2000])
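
As a rough illustration of what bucketizing does to a single value, the following plain-Python sketch maps a year to its bucket index and then to a one-hot vector. The function names are hypothetical and the code is not TensorFlow's implementation; it only assumes that each bucket includes its lower boundary.

```python
import bisect

BOUNDARIES = [1960, 1980, 2000]


def bucket_index(year, boundaries=BOUNDARIES):
    """Index of the bucket containing `year`; lower boundaries are inclusive."""
    return bisect.bisect_right(boundaries, year)


def bucket_one_hot(year, boundaries=BOUNDARIES):
    """One-hot vector with len(boundaries) + 1 entries."""
    vector = [0] * (len(boundaries) + 1)
    vector[bucket_index(year, boundaries)] = 1
    return vector


example = bucket_one_hot(1985)  # 1985 falls in the third bucket
```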

## Categorical identity column
Categorical identity columns can be seen as a special case of bucketized columns.

In traditional bucketized columns, each bucket represents a range of values (for example, from 1960 to 1979). In a categorical identity column, each bucket represents a single, unique integer.

For example, let the categories be 0, 1, 2, 3. Their one-hot encodings are [1,0,0,0], [0,1,0,0], [0,0,1,0], and [0,0,0,1].

## Categorical vocabulary column

Strings cannot be fed to a model directly. A categorical vocabulary column first maps each string to an integer, and that integer is then represented with one-hot encoding.

TensorFlow provides two different functions to create categorical vocabulary columns:

- tf.feature_column.categorical_column_with_vocabulary_list
- tf.feature_column.categorical_column_with_vocabulary_file
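
The string-to-integer step can be sketched in plain Python as a dictionary lookup. The vocabulary below and the helper names `build_lookup` and `to_id` are made up for illustration, and returning -1 for unknown strings is an assumption of this sketch rather than a statement about every configuration of the TensorFlow columns. The file-based variant works from the same kind of list, read with one entry per line.

```python
def build_lookup(vocabulary):
    """Map each vocabulary string to its position in the list."""
    return {word: index for index, word in enumerate(vocabulary)}


def to_id(word, lookup, default_value=-1):
    """Return the integer id for `word`, or `default_value` if it is unknown."""
    return lookup.get(word, default_value)


vocabulary = ['kitchenware', 'electronics', 'sports']  # hypothetical categories
lookup = build_lookup(vocabulary)
ids = [to_id(w, lookup) for w in ['sports', 'kitchenware', 'garden']]
```

Here `'garden'` is not in the vocabulary, so it receives the default id.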


## Hashed Column

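This feature column can be used when the number of categories is large. It computes a hash value for each category and uses that hash to assign the category to one of a fixed number of buckets (**tf.feature_column.categorical_column_with_hash_bucket**).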
![image.png](attachment:image.png)
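
A minimal sketch of hashing into buckets, in plain Python. It uses MD5 from the standard library purely as a stand-in hash (TensorFlow uses its own hash function), and the category names and the helper `hash_bucket` are hypothetical.

```python
import hashlib


def hash_bucket(category, hash_bucket_size):
    """Map a string category to a bucket id in [0, hash_bucket_size)."""
    digest = hashlib.md5(category.encode('utf-8')).hexdigest()
    return int(digest, 16) % hash_bucket_size


categories = ['kitchenware', 'electronics', 'sports', 'garden', 'toys']
bucket_ids = {c: hash_bucket(c, hash_bucket_size=3) for c in categories}
```

Because there are more categories than buckets here, at least two categories must share a bucket. That collision is the trade-off a hashed column accepts in exchange for not storing a vocabulary.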

## Passing feature columns to Estimators
- **tf.estimator.LinearClassifier** and **tf.estimator.LinearRegressor**: accept all types of feature column.
- **tf.estimator.DNNClassifier** and **tf.estimator.DNNRegressor**: only accept dense columns. Other column types must be wrapped in either an indicator_column or an embedding_column.
- **tf.estimator.DNNLinearCombinedClassifier** and **tf.estimator.DNNLinearCombinedRegressor**: the linear_feature_columns argument accepts any feature column type. The dnn_feature_columns argument only accepts dense columns.
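
To show the difference between the two wrappers in spirit, here is a plain-Python sketch: an indicator is a one-hot vector as long as the number of categories, while an embedding is a short dense vector looked up from a table. All names are hypothetical, the sizes are arbitrary, and the random table only stands in for values that a real model would learn during training.

```python
import random


def indicator(category_id, num_categories):
    """One-hot (indicator) representation of a categorical id."""
    return [1.0 if i == category_id else 0.0 for i in range(num_categories)]


def make_embedding_table(num_categories, dimension, seed=0):
    """Randomly initialized table; in a real model these values are learned."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dimension)]
            for _ in range(num_categories)]


def embed(category_id, table):
    """Embedding lookup: select the dense row for this category."""
    return table[category_id]


table = make_embedding_table(num_categories=1000, dimension=8)
dense_indicator = indicator(42, num_categories=1000)  # 1000 values, one of them 1.0
dense_embedding = embed(42, table)                     # 8 values
```

With many categories the indicator grows with the vocabulary, while the embedding stays at the chosen dimension.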

# Custom Estimators

1. Define the model.
        
2. Specify additional calculations for each of the three different modes:
   - Predict
   - Evaluate
   - Train

## Define the model
The basic deep neural network model must define the following three sections:

- An input layer
- One or more hidden layers
- An output layer

## Model function
- The model function returns a tf.estimator.EstimatorSpec appropriate to the mode in which it is called: predict, train, or eval.

- It is passed to tf.estimator.Estimator() as the model_fn argument.

- It defines the input, hidden, and output layers, as well as the loss and the optimizer needed for training.

In [None]:
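import tensorflow as tf  # assumes the TensorFlow 1.x API

def my_model(features, labels, mode, params):
    """DNN whose hidden layers are set by params['hidden_units']; learning_rate=0.1."""
    # Build the network: input layer, fully connected hidden layers, output layer.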
    
    # INPUT LAYER
    net = tf.feature_column.input_layer(features, params['feature_columns'])
    
    # HIDDEN LAYERS
    for units in params['hidden_units']:
        net = tf.layers.dense(net, units=units, activation=tf.nn.relu)

    # Compute logits (1 per class).
    # OUTPUT LAYER
    logits = tf.layers.dense(net, params['n_classes'], activation=None)

    # Compute predictions.
    predicted_classes = tf.argmax(logits, 1)
    # PREDICT MODE
    if mode == tf.estimator.ModeKeys.PREDICT:
        predictions = {
            'class_ids': predicted_classes[:, tf.newaxis],
            'probabilities': tf.nn.softmax(logits),
            'logits': logits,
        }
        return tf.estimator.EstimatorSpec(mode, predictions=predictions)

    # Compute loss.
    loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)

    # Compute evaluation metrics.
    accuracy = tf.metrics.accuracy(labels=labels,
                                   predictions=predicted_classes,
                                   name='acc_op')
    metrics = {'accuracy': accuracy}
    tf.summary.scalar('accuracy', accuracy[1])

    if mode == tf.estimator.ModeKeys.EVAL:
        return tf.estimator.EstimatorSpec(
            mode, loss=loss, eval_metric_ops=metrics)

    # Create training op.
    # TRAIN MODE
    assert mode == tf.estimator.ModeKeys.TRAIN

    optimizer = tf.train.AdagradOptimizer(learning_rate=0.1)
    train_op = optimizer.minimize(loss, global_step=tf.train.get_global_step())
    return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)
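
The prediction and loss steps in the model function can be worked through by hand for a single example. The sketch below is plain Python with made-up logits. It mirrors the softmax, argmax, and sparse softmax cross-entropy calculations for one example only and is not the TensorFlow implementation, which operates on batches.

```python
import math


def softmax(logits):
    """Convert logits to probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_softmax_cross_entropy(label, logits):
    """Negative log-probability assigned to the true class index `label`."""
    return -math.log(softmax(logits)[label])


logits = [1.0, 3.0, 0.5]  # made-up logits for a 3-class problem
probabilities = softmax(logits)
predicted_class = max(range(len(logits)), key=lambda i: logits[i])  # argmax
loss_if_correct = sparse_softmax_cross_entropy(1, logits)
loss_if_wrong = sparse_softmax_cross_entropy(2, logits)
```

The loss is small when the true label matches the class with the largest logit and larger otherwise, which is what the optimizer pushes against in train mode.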

In [None]:
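# my_feature_columns is assumed to be a list of feature columns defined earlier.
classifier = tf.estimator.Estimator(
        model_fn=my_model,  # the model function defined above
        params={
            'feature_columns': my_feature_columns,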
            # Two hidden layers of 10 nodes each.
            'hidden_units': [10, 10],
            # The model must choose between 3 classes.
            'n_classes': 3,
        })

# Summary
    
Although pre-made Estimators can be an effective way to quickly create new models, you will often need the additional flexibility that custom Estimators provide. Fortunately, pre-made and custom Estimators follow the same programming model. The only practical difference is that you must write a model function for custom Estimators; everything else is the same.