# TensorFlow Extended (TFX) Machine Learning (ML) Pipeline for Network Anomaly Detection Using a Subset of the InSDN Dataset - Version 1.0

## Summary of Content:

#### In this Jupyter notebook, TensorFlow Extended (TFX) components orchestrated by an interactive context are used to ingest, analyze, validate, preprocess, train, evaluate and push a network anomaly detection ML model over a subset of the InSDN 2020 dataset, published in https://ieeexplore.ieee.org/ielx7/6287639/6514899/09187858.pdf and provided at http://aseados.ucd.ie/datasets/SDN/

#### The main focus of this project is to construct a machine learning pipeline from TFX components that is as close to end-to-end (E2E) as possible, although it is still executed manually. The major concern is therefore including and integrating TFX components rather than sophisticated feature engineering, training high-accuracy models, or ingesting large amounts of data. For this reason a toy dataset is used: a CSV file with around 20 feature columns and 7,856 sample rows (the original InSDN dataset consists of 83 features and 343,889 samples). Target labels are divided into five main groups: Web, Malware, DoS/DDoS, Other and Normal. The task of the ML model in this pipeline is to predict which of these labels a given data sample belongs to, using basic deep neural network structures.

## Parts to be included and improved in further versions:

#### Even though this project includes an almost E2E ML pipeline, it is far from being either automated or accurate. The following aspects should therefore be included or improved in future versions:
#### 1) Integrating Kafka to provide real-time data rather than using static data provided in CSV files
#### 2) Replacing the InteractiveContext with Apache Beam/Airflow or Kubeflow to provide an almost fully automated pipeline orchestration
#### 3) Using more features and samples from the InSDN dataset
#### 4) Using better designed ML models
#### 5) Improving the feature engineering and preprocessing aspects

## Reference and Further Reading Materials:

#### Most of the ideas and programming approaches presented here are based on the TFX tutorial provided by TensorFlow at https://github.com/tensorflow/tfx/blob/master/docs/tutorials/tfx/components.ipynb

#### Every component used and almost every line of code written in this notebook is explained in as much detail as possible. For further reading, you may refer to the book "Building Machine Learning Pipelines" by Hannes Hapke and Catherine Nelson (https://www.oreilly.com/library/view/building-machine-learning/9781492053187/) or to the online sources listed below:

https://www.tensorflow.org/tfx/guide
https://stackoverflow.blog/2020/10/12/how-to-put-machine-learning-models-into-production/
https://medium.com/everything-full-stack/machine-learning-model-serving-overview-c01a6aa3e823
https://github.com/kaiwaehner/kafka-streams-machine-learning-examples
https://github.com/ksalama/tfx-workshop
https://cloud.google.com/architecture/architecture-for-mlops-using-tfx-kubeflow-pipelines-and-cloud-build
https://blog.doit-intl.com/tensorflow-extended-101-literally-everthing-you-need-to-know-aeecc51e6832
https://theaisummer.com/tfx/
https://blog.tensorflow.org/2020/09/brief-history-of-tensorflow-extended-tfx.html
https://blog.doit-intl.com/using-tensorflow-extended-tfx-to-build-machine-learning-pipelines-d04800bda1ec
https://www.youtube.com/watch?v=VrBoQCchJQU
https://www.youtube.com/watch?v=wPri78CFSEw
https://ieeexplore.ieee.org/ielx7/6287639/6514899/09187858.pdf
https://medium.com/acing-ai/understanding-tensorflow-serving-faca576b558c
https://www.youtube.com/watch?v=7oW49Ulr4cY
https://www.youtube.com/watch?v=RpWeVvAFzJE
https://www.youtube.com/watch?v=YeuvR6m6ACQ&list=PLQY2H8rRoyvxR15n04JiW0ezF5HQRs_8F
https://www.youtube.com/watch?v=TA5kbFgeUlk&list=PLQY2H8rRoyvxR15n04JiW0ezF5HQRs_8F&index=7
https://dzlab.github.io/ml/2020/09/13/tfx-data-ingestion/
https://colab.research.google.com/github/tensorflow/workshops/blob/master/tfx_colabs/TFX_Workshop_Colab.ipynb
https://github.com/tensorflow/tfx/blob/master/docs/tutorials/tfx/components.ipynb
http://aseados.ucd.ie/datasets/SDN/

## Pipeline Implementation:

### 0) Importing necessary modules and creating a subset of the InSDN 2020 dataset using the pandas library:

In [66]:
import os #to handle operating system operations such as file access
import pandas #to filter the original InSDN 2020 dataset CSV file and create a subset
import shutil #to remove certain directories to clean the workspace

import pprint
import tempfile
import urllib

import absl
import tensorflow as tf #to be used in general
import tensorflow_model_analysis as tfma
tf.get_logger().propagate = False
pp = pprint.PrettyPrinter()

import tfx
from tfx.components import CsvExampleGen #to ingest the data in CSV format and turn it into TFRecord files
from tfx.components import Evaluator #to evaluate the trained model
from tfx.components import ExampleValidator #to check the example set for anomalies
from tfx.components import Pusher #to push the trained model
from tfx.components import SchemaGen #to be used for schema generation
from tfx.components import StatisticsGen #to do statistical analysis
from tfx.components import Trainer #to train a model based on the specifications
from tfx.components import Transform #to apply feature engineering and preprocessing
from tfx.dsl.components.common import resolver
from tfx.dsl.experimental import latest_blessed_model_resolver
from tfx.orchestration import metadata
from tfx.orchestration import pipeline
from tfx.orchestration.experimental.interactive.interactive_context import InteractiveContext #to be used as the orchestrator of the pipeline
from tfx.proto import pusher_pb2
from tfx.proto import trainer_pb2
from tfx.proto.evaluator_pb2 import SingleSlicingSpec
from tfx.types import Channel
from tfx.types.standard_artifacts import Model
from tfx.types.standard_artifacts import ModelBlessing

In [67]:
#If the folders tfx and data already exist, it is better to clean the workspace before executing the notebook
#ignore_errors=True prevents a failure on the very first run, when the folders do not exist yet
shutil.rmtree("data", ignore_errors=True)
shutil.rmtree("tfx", ignore_errors=True)

#Creating necessary folders
os.mkdir("data") #To contain the filtered subset of InSDN data with 20 features
os.mkdir("tfx") #To contain TFX artifacts and metadata
#os.mkdir("raw_data") #To contain raw InSDN data with 64 features
                      #Uncomment the line above if you are working on this project for the very first time and,
                      #after running this cell, put your raw InSDN data in CSV format into this folder.

In [68]:
csv_data = pandas.read_csv("raw_data/raw_data.csv") #read the data from raw_data.csv with pandas; to do this, you first need to put your raw data into the raw_data folder

#List of feature columns to be used in this project 
#Note: This list also includes the target variable "Label"
FEATURES_TO_BE_USED = ['Src Port', 'Dst Port', 'Flow Duration',
'TotLen Fwd Pkts', 'Fwd Pkt Len Max',
'Bwd Pkt Len Min', 'Pkt Len Max', 'Pkt Len Mean',
'Fwd Pkts/s', 'Bwd Pkts/s', 'Flow IAT Mean', 'Flow IAT Min',
'Bwd IAT Tot', 'Bwd IAT Min', 'Fwd Header Len', 'Bwd Header Len',
'FIN Flag Cnt', 'SYN Flag Cnt', 'ACK Flag Cnt', 'Init Bwd Win Byts', 'Label']

#All columns except the ones listed above are dropped from the dataset
#Note: 'Label' is not a feature but the target label
csv_data = csv_data.filter(FEATURES_TO_BE_USED)

csv_data.to_csv("data/data.csv", index=False) #write the updated data to data.csv with pandas 
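
`DataFrame.filter` with a list of labels keeps only the labels that exist and silently skips the rest, so a misspelled or missing column name in `FEATURES_TO_BE_USED` would go unnoticed until a later pipeline step. A minimal sketch of a pre-check is given below; the helper name `find_missing_columns` and the sample header are hypothetical and only serve as an illustration.

```python
def find_missing_columns(available, wanted):
    """Return the entries of `wanted` that do not appear in `available`, in order."""
    available_set = set(available)
    return [name for name in wanted if name not in available_set]


# Hypothetical header of a raw CSV file, used only for illustration
raw_header = ['Src Port', 'Dst Port', 'Flow Duration', 'Protocol', 'Label']
wanted = ['Src Port', 'Dst Port', 'Flow Duration', 'Pkt Len Mean', 'Label']

missing = find_missing_columns(raw_header, wanted)
print(missing)  # ['Pkt Len Mean']
```

In the notebook, the same check could be run as `find_missing_columns(csv_data.columns, FEATURES_TO_BE_USED)` before calling `csv_data.filter(...)`.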

In [69]:
#Alternative to the cell above for reading data from raw_data.csv
#Run this cell instead of the one above to also preprocess the Label column using pandas
#This cell exists because of some problems that occur in the Transform component
#If those problems are solved, then this cell may simply be ignored.

csv_data = pandas.read_csv("raw_data/raw_data.csv") #read the data from raw_data.csv with pandas; to do this, you first need to put your raw data into the raw_data folder

FEATURES_TO_BE_USED = ['Src Port', 'Dst Port', 'Flow Duration',
'TotLen Fwd Pkts', 'Fwd Pkt Len Max',
'Bwd Pkt Len Min', 'Pkt Len Max', 'Pkt Len Mean',
'Fwd Pkts/s', 'Bwd Pkts/s', 'Flow IAT Mean', 'Flow IAT Min',
'Bwd IAT Tot', 'Bwd IAT Min', 'Fwd Header Len', 'Bwd Header Len',
'FIN Flag Cnt', 'SYN Flag Cnt', 'ACK Flag Cnt', 'Init Bwd Win Byts', 'Label']

new_column_names = {}
#lst = []
for name in FEATURES_TO_BE_USED:
    new_column_names[name] = "_".join(name.split()) 
    #lst.append(new_column_names[name]) #used once to automatically create the new FEATURES_TO_BE_USED list
#print(lst)
csv_data.rename(columns=new_column_names, inplace=True)

#List of feature columns to be used in this project 
#Note: This list also includes the target variable "Label"
FEATURES_TO_BE_USED = ['Src_Port', 'Dst_Port', 'Flow_Duration', 'TotLen_Fwd_Pkts', 
                       'Fwd_Pkt_Len_Max', 'Bwd_Pkt_Len_Min', 'Pkt_Len_Max', 'Pkt_Len_Mean', 
                       'Fwd_Pkts/s', 'Bwd_Pkts/s', 'Flow_IAT_Mean', 'Flow_IAT_Min', 'Bwd_IAT_Tot', 
                       'Bwd_IAT_Min', 'Fwd_Header_Len', 'Bwd_Header_Len', 'FIN_Flag_Cnt', 
                       'SYN_Flag_Cnt', 'ACK_Flag_Cnt', 'Init_Bwd_Win_Byts', 'Label']

#All columns except the ones listed above are dropped from the dataset
#Note: 'Label' is not a feature but the target label
csv_data = csv_data.filter(FEATURES_TO_BE_USED)

#This part is additional to the cell above
#Convert the categorical values in the "Label" column to integer indexes according to the following mapping:
#'Normal' -> 0 , 'DoS/DDoS Attack' -> 1 , 'Malware Attack' -> 2 , 'Other Attack Types' -> 3 , 'Web Attack' -> 4
csv_data.loc[csv_data['Label'] == 'Normal', 'Label'] = 0
csv_data.loc[csv_data['Label'] == 'DoS/DDoS Attack', 'Label'] = 1
csv_data.loc[csv_data['Label'] == 'Malware Attack', 'Label'] = 2
csv_data.loc[csv_data['Label'] == 'Other Attack Types', 'Label'] = 3
csv_data.loc[csv_data['Label'] == 'Web Attack', 'Label'] = 4


csv_data.to_csv("data/data.csv", index=False) #write the updated data to data.csv with pandas 
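
The label conversion above can also be expressed as a single lookup table, which makes it easier to detect label values that are not covered by the mapping. The sketch below is written in plain Python; the helper name `encode_labels` is hypothetical, and the mapping is copied from the comment in the cell above.

```python
# Mapping copied from the notebook cell above
LABEL_TO_INDEX = {
    'Normal': 0,
    'DoS/DDoS Attack': 1,
    'Malware Attack': 2,
    'Other Attack Types': 3,
    'Web Attack': 4,
}


def encode_labels(labels, mapping=LABEL_TO_INDEX):
    """Convert label strings to integer indexes, failing loudly on unknown labels."""
    unknown = sorted({label for label in labels if label not in mapping})
    if unknown:
        raise ValueError('Labels without an index: {}'.format(unknown))
    return [mapping[label] for label in labels]


encoded = encode_labels(['Normal', 'Web Attack', 'DoS/DDoS Attack'])
print(encoded)  # [0, 4, 1]
```

With pandas, a similar result could be obtained with `csv_data['Label'].map(LABEL_TO_INDEX)`; values that are missing from the mapping become `NaN` there, so an explicit check like the one above is still useful.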

### 1) Creating the InteractiveContext and ingesting data using the CSV Example Generator (CsvExampleGen):

In [71]:
print('TensorFlow version: {}'.format(tf.__version__))
print('TFX version: {}'.format(tfx.__version__))

TensorFlow version: 2.4.2
TFX version: 0.29.0


In [72]:
tfx_folder_path = os.path.join(os.getcwd(), "tfx") #Defining the path to the folder that will contain the artifacts and ML Metadata of the TFX components
print(tfx_folder_path)

C:\Users\uturk\Desktop\InSDN\Pipeline\tfx


In [73]:
# Here, an InteractiveContext is being created using default parameters. This will
# use a temporary directory with an ephemeral ML Metadata database instance.
# To use your own pipeline root or database, the optional properties
# `pipeline_root` and `metadata_connection_config` may be passed to
# InteractiveContext. Calls to InteractiveContext are no-ops outside of the notebook.

context = InteractiveContext(pipeline_root = tfx_folder_path) #Creating with pipeline root set to tfx folder 

#The warning about metadata_connection_config not being provided to InteractiveContext can simply be ignored



In [74]:
data_path = os.path.join(os.getcwd(), "data") #Defining the path to the folder that contains the input CSV data
print(data_path)

C:\Users\uturk\Desktop\InSDN\Pipeline\data


The `ExampleGen` component is usually at the start of a TFX pipeline. It will:

1.   Split data into training and evaluation sets (by default, 2/3 training + 1/3 eval)
2.   Convert data into the `tf.Example` format (learn more [here](https://www.tensorflow.org/tutorials/load_data/tfrecord))
3.   Copy data into the `tfx` directory for other components to access

`ExampleGen` takes as input the path to your data source. In this case, this is the `data_path` that contains the CSV file.
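
The default 2/3 + 1/3 split comes from assigning each record to one of three hash buckets, two of which belong to `train` and one to `eval`. The sketch below illustrates this idea in plain Python. It is only an illustration: the helper `assign_split` is hypothetical and uses MD5 for bucketing, which is an assumption and not the hash function that TFX uses internally.

```python
import hashlib

# Default bucket counts: 2 buckets for train, 1 bucket for eval
SPLITS = [('train', 2), ('eval', 1)]


def assign_split(record, splits=SPLITS):
    """Deterministically map a serialized record (bytes) to a split name."""
    total_buckets = sum(count for _, count in splits)
    # MD5 is an assumption made for this illustration only
    bucket = int(hashlib.md5(record).hexdigest(), 16) % total_buckets
    upper = 0
    for name, count in splits:
        upper += count
        if bucket < upper:
            return name
    raise AssertionError('unreachable: bucket is always below total_buckets')


records = ['row-{}'.format(i).encode('utf-8') for i in range(3000)]
counts = {'train': 0, 'eval': 0}
for record in records:
    counts[assign_split(record)] += 1
print(counts)  # close to a 2:1 ratio between train and eval
```

Because the assignment in this sketch depends only on the record contents, running it again on the same data yields the same partition.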

In [75]:
example_gen = CsvExampleGen(input_base = data_path) #An example generator is created to ingest input from the data path
context.run(example_gen) #The interactive context runs the example generator and the CSV data is ingested as TFRecord files

.execution_id: 1
.component: CsvExampleGen
.component.inputs: {}
.component.outputs['examples']: Channel of type 'Examples' (1 artifact)
    .type: <class 'tfx.types.standard_artifacts.Examples'>
    .uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\CsvExampleGen\examples\1
    .span: 0
    .split_names: ["train", "eval"]
    .version: 0
.exec_properties:
    ['input_base']: C:\Users\uturk\Desktop\InSDN\Pipeline\data
    ['input_config']: { "splits": [ { "name": "single_split", "pattern": "*" } ] }
    ['output_config']: { "split_config": { "splits": [ { "hash_buckets": 2, "name": "train" }, { "hash_buckets": 1, "name": "eval" } ] } }
    ['output_data_format']: 6
    ['custom_config']: None
    ['range_config']: None
    ['span']: 0
    ['version']: None
    ['input_fingerprint']: split:single_split,num_files:1,total_bytes:765274,xor_checksum:1627858663,sum_checksum:1627858663

Let's examine the output artifact of `ExampleGen`. This component produces a single `Examples` artifact that contains two splits, training examples and evaluation examples:

In [76]:
artifact = example_gen.outputs['examples'].get()[0]
print(artifact.split_names, artifact.uri)

["train", "eval"] C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\CsvExampleGen\examples\1
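
The `split_names` property is printed as a JSON-encoded list, and the next cell reads the training split from a `Split-train` subdirectory of the artifact URI. A small sketch for deriving the directory of every split is given below; the helper name `split_directories` is hypothetical, and it assumes that all splits follow the same `Split-<name>` naming seen for the training split.

```python
import json
import os


def split_directories(artifact_uri, split_names_json):
    """Map each split name to its assumed `Split-<name>` directory under the artifact URI."""
    split_names = json.loads(split_names_json)
    return {
        name: os.path.join(artifact_uri, 'Split-{}'.format(name))
        for name in split_names
    }


# Hypothetical relative URI used only for illustration
directories = split_directories('tfx/CsvExampleGen/examples/1', '["train", "eval"]')
print(sorted(directories))  # ['eval', 'train']
```

In the notebook this could be called as `split_directories(artifact.uri, artifact.split_names)`.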


We can also take a look at the first three training examples:

In [77]:
# Get the URI of the output artifact representing the training examples, which is a directory
train_uri = os.path.join(example_gen.outputs['examples'].get()[0].uri, 'Split-train')

# Get the list of files in this directory (all compressed TFRecord files)
tfrecord_filenames = [os.path.join(train_uri, name) for name in os.listdir(train_uri)]

# Create a `TFRecordDataset` to read these files
dataset = tf.data.TFRecordDataset(tfrecord_filenames, compression_type="GZIP")

# Iterate over the first 3 records and decode them.
for tfrecord in dataset.take(3):
  serialized_example = tfrecord.numpy()
  example = tf.train.Example()
  example.ParseFromString(serialized_example)
  pp.pprint(example)

features {
  feature {
    key: "ACK_Flag_Cnt"
    value {
      int64_list {
        value: 0
      }
    }
  }
  feature {
    key: "Bwd_Header_Len"
    value {
      int64_list {
        value: 144
      }
    }
  }
  feature {
    key: "Bwd_IAT_Min"
    value {
      float_list {
        value: 3069.0
      }
    }
  }
  feature {
    key: "Bwd_IAT_Tot"
    value {
      float_list {
        value: 11756.0
      }
    }
  }
  feature {
    key: "Bwd_Pkt_Len_Min"
    value {
      int64_list {
        value: 0
      }
    }
  }
  feature {
    key: "Bwd_Pkts/s"
    value {
      float_list {
        value: 296.20853
      }
    }
  }
  feature {
    key: "Dst_Port"
    value {
      int64_list {
        value: 8081
      }
    }
  }
  feature {
    key: "FIN_Flag_Cnt"
    value {
      int64_list {
        value: 0
      }
    }
  }
  feature {
    key: "Flow_Duration"
    value {
      int64_list {
        value: 13504
      }
    }
  }
  feature {
    key: "Flow_IAT_Mean"
    valu

Now that `ExampleGen` has finished ingesting the data, the next step is data analysis.

### 2) Generating statistics using StatisticsGen

The `StatisticsGen` component computes statistics over your dataset for data analysis, as well as for use in downstream components. It uses the [TensorFlow Data Validation](https://www.tensorflow.org/tfx/data_validation/get_started) library.

`StatisticsGen` takes as input the dataset we just ingested using `ExampleGen`.
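
To give an intuition for what dataset statistics mean for a numeric feature, the sketch below computes a few summary values (count, missing, mean, min, max) for one column in plain Python. It is only an illustration of the kind of summary involved; the helper `summarize_numeric` and the sample values are hypothetical, and this is not how TensorFlow Data Validation is implemented.

```python
def summarize_numeric(values):
    """Summarize a numeric column; `None` marks a missing value."""
    present = [v for v in values if v is not None]
    summary = {'count': len(present), 'missing': len(values) - len(present)}
    if present:
        summary['mean'] = sum(present) / len(present)
        summary['min'] = min(present)
        summary['max'] = max(present)
    return summary


# Hypothetical values for a single feature column
flow_duration = [13504, 250, None, 1000]
summary = summarize_numeric(flow_duration)
print(summary)
```

Summaries of this kind are what make it possible to compare the training and evaluation splits and to spot missing or out-of-range values before training.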

In [78]:
statistics_gen = StatisticsGen(examples = example_gen.outputs['examples']) #Creates a statistics generator to work with examples generated by example generator
context.run(statistics_gen) #Interactive Context runs the statistics generator

.execution_id: 2
.component: StatisticsGen
.component.inputs['examples']: Channel of type 'Examples' (1 artifact)
    .type: <class 'tfx.types.standard_artifacts.Examples'>
    .uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\CsvExampleGen\examples\1
    .span: 0
    .split_names: ["train", "eval"]
    .version: 0
.component.outputs['statistics']: Channel of type 'ExampleStatistics' (1 artifact)
    .type: <class 'tfx.types.standard_artifacts.ExampleStatistics'>
    .uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\StatisticsGen\statistics\2
    .span: 0
    .split_names: ["train", "eval"]
.exec_properties:
    ['stats_options_json']: None
    ['exclude_splits']: []

0,1
.type_name,ExampleStatistics
._artifacts,"[0] function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'ExampleStatistics' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\StatisticsGen\statistics\2) at 0x2619c6e9040.type<class 'tfx.types.standard_artifacts.ExampleStatistics'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\StatisticsGen\statistics\2.span0.split_names[""train"", ""eval""]"

0,1
[0],"function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'ExampleStatistics' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\StatisticsGen\statistics\2) at 0x2619c6e9040.type<class 'tfx.types.standard_artifacts.ExampleStatistics'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\StatisticsGen\statistics\2.span0.split_names[""train"", ""eval""]"

0,1
.type,<class 'tfx.types.standard_artifacts.ExampleStatistics'>
.uri,C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\StatisticsGen\statistics\2
.span,0
.split_names,"[""train"", ""eval""]"

0,1
['stats_options_json'],
['exclude_splits'],[]

0,1
['examples'],"function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Channel of type 'Examples' (1 artifact) at 0x261a517aeb0.type_nameExamples._artifacts[0] function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'Examples' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\CsvExampleGen\examples\1) at 0x2619c2f0610.type<class 'tfx.types.standard_artifacts.Examples'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\CsvExampleGen\examples\1.span0.split_names[""train"", ""eval""].version0"

0,1
.type_name,Examples
._artifacts,"[0] function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'Examples' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\CsvExampleGen\examples\1) at 0x2619c2f0610.type<class 'tfx.types.standard_artifacts.Examples'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\CsvExampleGen\examples\1.span0.split_names[""train"", ""eval""].version0"

0,1
[0],"function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'Examples' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\CsvExampleGen\examples\1) at 0x2619c2f0610.type<class 'tfx.types.standard_artifacts.Examples'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\CsvExampleGen\examples\1.span0.split_names[""train"", ""eval""].version0"

0,1
.type,<class 'tfx.types.standard_artifacts.Examples'>
.uri,C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\CsvExampleGen\examples\1
.span,0
.split_names,"[""train"", ""eval""]"
.version,0

0,1
['statistics'],"function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Channel of type 'ExampleStatistics' (1 artifact) at 0x261a55682b0.type_nameExampleStatistics._artifacts[0] function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'ExampleStatistics' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\StatisticsGen\statistics\2) at 0x2619c6e9040.type<class 'tfx.types.standard_artifacts.ExampleStatistics'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\StatisticsGen\statistics\2.span0.split_names[""train"", ""eval""]"

0,1
.type_name,ExampleStatistics
._artifacts,"[0] function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'ExampleStatistics' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\StatisticsGen\statistics\2) at 0x2619c6e9040.type<class 'tfx.types.standard_artifacts.ExampleStatistics'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\StatisticsGen\statistics\2.span0.split_names[""train"", ""eval""]"

0,1
[0],"function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'ExampleStatistics' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\StatisticsGen\statistics\2) at 0x2619c6e9040.type<class 'tfx.types.standard_artifacts.ExampleStatistics'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\StatisticsGen\statistics\2.span0.split_names[""train"", ""eval""]"

0,1
.type,<class 'tfx.types.standard_artifacts.ExampleStatistics'>
.uri,C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\StatisticsGen\statistics\2
.span,0
.split_names,"[""train"", ""eval""]"


In [79]:
# The interactive context can also be used to display the computed statistics
context.show(statistics_gen.outputs['statistics'])

### 3) Generating a data schema using SchemaGen

The `SchemaGen` component generates a schema based on your data statistics. (A schema defines the expected bounds, types, and properties of the features in your dataset.) It also uses the [TensorFlow Data Validation](https://www.tensorflow.org/tfx/data_validation/get_started) library.

`SchemaGen` will take as input the statistics that we generated with `StatisticsGen`, looking at the training split by default.

In [80]:
# Create a schema generator that infers a schema from the statistics computed by StatisticsGen
schema_gen = SchemaGen(statistics=statistics_gen.outputs['statistics'], infer_feature_shape=False)
context.run(schema_gen)

0,1
.execution_id,3
.component,SchemaGen
.component.inputs,"['statistics']: Channel of type 'ExampleStatistics' (1 artifact); Artifact of type 'ExampleStatistics' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\StatisticsGen\statistics\2), span: 0, split_names: [""train"", ""eval""]"
.component.outputs,"['schema']: Channel of type 'Schema' (1 artifact); Artifact of type 'Schema' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\SchemaGen\schema\3)"
.exec_properties,"['infer_feature_shape']: 0, ['exclude_splits']: []"

After `SchemaGen` finishes running, we can visualize the generated schema as a table.

In [81]:
context.show(schema_gen.outputs['schema'])

Feature name,Type,Presence,Valency,Domain
'ACK_Flag_Cnt',INT,required,single,-
'Bwd_Header_Len',INT,required,single,-
'Bwd_IAT_Min',FLOAT,required,single,-
'Bwd_IAT_Tot',FLOAT,required,single,-
'Bwd_Pkt_Len_Min',INT,required,single,-
'Bwd_Pkts/s',FLOAT,required,single,-
'Dst_Port',INT,required,single,-
'FIN_Flag_Cnt',INT,required,single,-
'Flow_Duration',INT,required,single,-
'Flow_IAT_Mean',FLOAT,required,single,-


Each feature in your dataset shows up as a row in the schema table, alongside its properties. The schema also captures all the values that a categorical feature takes on, denoted as its domain.

To learn more about schemas, see [the SchemaGen documentation](https://www.tensorflow.org/tfx/guide/schemagen).
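To make the Type, Presence, and Domain columns more concrete, here is a minimal pure-Python sketch of how such properties could be derived from raw records. It is not how `SchemaGen` or TensorFlow Data Validation is implemented (they infer the schema from aggregate statistics). The function `infer_toy_schema`, the `max_domain_size` parameter, the `Label` column name, and all sample values are assumptions made up for illustration.

```python
from typing import Any, Dict, List


def infer_toy_schema(rows: List[Dict[str, Any]],
                     max_domain_size: int = 10) -> Dict[str, Dict[str, Any]]:
    """Infer a toy schema (type, presence, domain) from a list of records.

    Hypothetical helper for illustration only; not part of TFX or TFDV.
    """
    feature_names = sorted({name for row in rows for name in row})
    schema: Dict[str, Dict[str, Any]] = {}
    for name in feature_names:
        # Collect the values that are actually present for this feature.
        values = [row[name] for row in rows if row.get(name) is not None]

        # Pick the narrowest type that fits every observed value.
        if all(isinstance(v, int) for v in values):
            ftype = "INT"
        elif all(isinstance(v, (int, float)) for v in values):
            ftype = "FLOAT"
        else:
            ftype = "STRING"

        # A feature is "required" only if every record carries a value for it.
        presence = "required" if len(values) == len(rows) else "optional"

        # Only string features with few distinct values get a domain.
        domain = None
        if ftype == "STRING":
            distinct = sorted(set(map(str, values)))
            if len(distinct) <= max_domain_size:
                domain = distinct

        schema[name] = {"type": ftype, "presence": presence, "domain": domain}
    return schema


# Made-up sample records; feature names follow the schema table above.
rows = [
    {"Dst_Port": 80, "Flow_IAT_Mean": 12.5, "Label": "Normal"},
    {"Dst_Port": 443, "Flow_IAT_Mean": 3.0, "Label": "Web"},
    {"Dst_Port": 22, "Flow_IAT_Mean": None, "Label": "Normal"},
]
toy_schema = infer_toy_schema(rows)
for name, props in toy_schema.items():
    print(name, props)
```

In this toy data, `Flow_IAT_Mean` comes out as optional because one record lacks a value, and only the string-valued `Label` column receives a domain.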

### 4) Validating examples using ExampleValidator

The `ExampleValidator` component detects anomalies in your data, based on the expectations defined by the schema. It also uses the [TensorFlow Data Validation](https://www.tensorflow.org/tfx/data_validation/get_started) library.

`ExampleValidator` will take as input the statistics from `StatisticsGen`, and the schema from `SchemaGen`.
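As a rough illustration of what "detecting anomalies based on the expectations defined by the schema" can mean, the sketch below checks a few records against a hand-written schema. It is deliberately simplified: the real `ExampleValidator` compares aggregate statistics with the schema rather than inspecting individual rows. The names `TOY_SCHEMA` and `find_toy_anomalies`, the `Label` and `Extra` columns, and all sample values are hypothetical.

```python
from typing import Any, Dict, List

# Hand-written toy schema (an assumption for this sketch, not TFDV's format).
TOY_SCHEMA: Dict[str, Dict[str, Any]] = {
    "Dst_Port": {"type": int, "required": True, "domain": None},
    "Flow_IAT_Mean": {"type": float, "required": True, "domain": None},
    "Label": {"type": str, "required": True, "domain": {"Normal", "Web"}},
}


def find_toy_anomalies(rows: List[Dict[str, Any]],
                       schema: Dict[str, Dict[str, Any]]) -> Dict[str, List[str]]:
    """Return a mapping from feature name to the distinct problems found."""
    anomalies: Dict[str, List[str]] = {}

    def report(name: str, message: str) -> None:
        messages = anomalies.setdefault(name, [])
        if message not in messages:
            messages.append(message)

    for row in rows:
        # Features that appear in the data but not in the schema.
        for name in row:
            if name not in schema:
                report(name, "feature not in schema")

        # Features the schema describes: check presence, type, and domain.
        for name, spec in schema.items():
            value = row.get(name)
            if value is None:
                if spec["required"]:
                    report(name, "missing value for required feature")
                continue
            if not isinstance(value, spec["type"]):
                report(name, "unexpected type " + type(value).__name__)
            elif spec["domain"] is not None and value not in spec["domain"]:
                report(name, "value outside domain: " + repr(value))
    return anomalies


# Made-up evaluation records containing deliberate problems.
eval_rows = [
    {"Dst_Port": 80, "Flow_IAT_Mean": 12.5, "Label": "Normal"},
    {"Dst_Port": "443", "Flow_IAT_Mean": 3.0, "Label": "Botnet"},
    {"Dst_Port": 22, "Label": "Web", "Extra": 1},
]
toy_anomalies = find_toy_anomalies(eval_rows, TOY_SCHEMA)
for name, messages in toy_anomalies.items():
    print(name, messages)
```

An empty result would correspond to the "no anomalies" case; any entry points at a feature whose data no longer matches what the schema expects.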

In [82]:
example_validator = ExampleValidator(
    statistics=statistics_gen.outputs['statistics'],
    schema=schema_gen.outputs['schema'])
context.run(example_validator)

0,1
.execution_id,4
.component,"ExampleValidator (exec_properties: ['exclude_splits']: [])"
.component.inputs,"['statistics']: Channel of type 'ExampleStatistics' (1 artifact); Artifact of type 'ExampleStatistics' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\StatisticsGen\statistics\2), span: 0, split_names: [""train"", ""eval""]; ['schema']: Channel of type 'Schema' (1 artifact); Artifact of type 'Schema' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\SchemaGen\schema\3)"
.component.outputs,"['anomalies']: Channel of type 'ExampleAnomalies' (1 artifact); Artifact of type 'ExampleAnomalies' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\ExampleValidator\anomalies\4), span: 0, split_names: [""train"", ""eval""]"

0,1
.inputs,"['statistics'] function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Channel of type 'ExampleStatistics' (1 artifact) at 0x261a55682b0.type_nameExampleStatistics._artifacts[0] function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'ExampleStatistics' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\StatisticsGen\statistics\2) at 0x2619c6e9040.type<class 'tfx.types.standard_artifacts.ExampleStatistics'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\StatisticsGen\statistics\2.span0.split_names[""train"", ""eval""]['schema'] function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Channel of type 'Schema' (1 artifact) at 0x2619c5e7220.type_nameSchema._artifacts[0] function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'Schema' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\SchemaGen\schema\3) at 0x2619bb62400.type<class 'tfx.types.standard_artifacts.Schema'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\SchemaGen\schema\3"
.outputs,"['anomalies'] function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Channel of type 'ExampleAnomalies' (1 artifact) at 0x2619b370fa0.type_nameExampleAnomalies._artifacts[0] function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'ExampleAnomalies' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\ExampleValidator\anomalies\4) at 0x261a538e130.type<class 'tfx.types.standard_artifacts.ExampleAnomalies'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\ExampleValidator\anomalies\4.span0.split_names[""train"", ""eval""]"
.exec_properties,['exclude_splits'][]

0,1
['statistics'],"function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Channel of type 'ExampleStatistics' (1 artifact) at 0x261a55682b0.type_nameExampleStatistics._artifacts[0] function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'ExampleStatistics' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\StatisticsGen\statistics\2) at 0x2619c6e9040.type<class 'tfx.types.standard_artifacts.ExampleStatistics'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\StatisticsGen\statistics\2.span0.split_names[""train"", ""eval""]"
['schema'],function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Channel of type 'Schema' (1 artifact) at 0x2619c5e7220.type_nameSchema._artifacts[0] function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'Schema' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\SchemaGen\schema\3) at 0x2619bb62400.type<class 'tfx.types.standard_artifacts.Schema'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\SchemaGen\schema\3

0,1
.type_name,ExampleStatistics
._artifacts,"[0] function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'ExampleStatistics' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\StatisticsGen\statistics\2) at 0x2619c6e9040.type<class 'tfx.types.standard_artifacts.ExampleStatistics'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\StatisticsGen\statistics\2.span0.split_names[""train"", ""eval""]"

0,1
[0],"function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'ExampleStatistics' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\StatisticsGen\statistics\2) at 0x2619c6e9040.type<class 'tfx.types.standard_artifacts.ExampleStatistics'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\StatisticsGen\statistics\2.span0.split_names[""train"", ""eval""]"

0,1
.type,<class 'tfx.types.standard_artifacts.ExampleStatistics'>
.uri,C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\StatisticsGen\statistics\2
.span,0
.split_names,"[""train"", ""eval""]"

0,1
.type_name,Schema
._artifacts,[0] function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'Schema' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\SchemaGen\schema\3) at 0x2619bb62400.type<class 'tfx.types.standard_artifacts.Schema'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\SchemaGen\schema\3

0,1
[0],function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'Schema' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\SchemaGen\schema\3) at 0x2619bb62400.type<class 'tfx.types.standard_artifacts.Schema'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\SchemaGen\schema\3

0,1
.type,<class 'tfx.types.standard_artifacts.Schema'>
.uri,C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\SchemaGen\schema\3

0,1
['anomalies'],"function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Channel of type 'ExampleAnomalies' (1 artifact) at 0x2619b370fa0.type_nameExampleAnomalies._artifacts[0] function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'ExampleAnomalies' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\ExampleValidator\anomalies\4) at 0x261a538e130.type<class 'tfx.types.standard_artifacts.ExampleAnomalies'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\ExampleValidator\anomalies\4.span0.split_names[""train"", ""eval""]"

0,1
.type_name,ExampleAnomalies
._artifacts,"[0] function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'ExampleAnomalies' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\ExampleValidator\anomalies\4) at 0x261a538e130.type<class 'tfx.types.standard_artifacts.ExampleAnomalies'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\ExampleValidator\anomalies\4.span0.split_names[""train"", ""eval""]"

0,1
[0],"function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'ExampleAnomalies' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\ExampleValidator\anomalies\4) at 0x261a538e130.type<class 'tfx.types.standard_artifacts.ExampleAnomalies'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\ExampleValidator\anomalies\4.span0.split_names[""train"", ""eval""]"

0,1
.type,<class 'tfx.types.standard_artifacts.ExampleAnomalies'>
.uri,C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\ExampleValidator\anomalies\4
.span,0
.split_names,"[""train"", ""eval""]"

0,1
['exclude_splits'],[]

0,1
['statistics'],"function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Channel of type 'ExampleStatistics' (1 artifact) at 0x261a55682b0.type_nameExampleStatistics._artifacts[0] function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'ExampleStatistics' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\StatisticsGen\statistics\2) at 0x2619c6e9040.type<class 'tfx.types.standard_artifacts.ExampleStatistics'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\StatisticsGen\statistics\2.span0.split_names[""train"", ""eval""]"
['schema'],function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Channel of type 'Schema' (1 artifact) at 0x2619c5e7220.type_nameSchema._artifacts[0] function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'Schema' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\SchemaGen\schema\3) at 0x2619bb62400.type<class 'tfx.types.standard_artifacts.Schema'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\SchemaGen\schema\3

0,1
.type_name,ExampleStatistics
._artifacts,"[0] function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'ExampleStatistics' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\StatisticsGen\statistics\2) at 0x2619c6e9040.type<class 'tfx.types.standard_artifacts.ExampleStatistics'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\StatisticsGen\statistics\2.span0.split_names[""train"", ""eval""]"

0,1
[0],"function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'ExampleStatistics' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\StatisticsGen\statistics\2) at 0x2619c6e9040.type<class 'tfx.types.standard_artifacts.ExampleStatistics'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\StatisticsGen\statistics\2.span0.split_names[""train"", ""eval""]"

0,1
.type,<class 'tfx.types.standard_artifacts.ExampleStatistics'>
.uri,C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\StatisticsGen\statistics\2
.span,0
.split_names,"[""train"", ""eval""]"

0,1
.type_name,Schema
._artifacts,[0] function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'Schema' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\SchemaGen\schema\3) at 0x2619bb62400.type<class 'tfx.types.standard_artifacts.Schema'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\SchemaGen\schema\3

0,1
[0],function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'Schema' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\SchemaGen\schema\3) at 0x2619bb62400.type<class 'tfx.types.standard_artifacts.Schema'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\SchemaGen\schema\3

0,1
.type,<class 'tfx.types.standard_artifacts.Schema'>
.uri,C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\SchemaGen\schema\3

0,1
['anomalies'],"function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Channel of type 'ExampleAnomalies' (1 artifact) at 0x2619b370fa0.type_nameExampleAnomalies._artifacts[0] function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'ExampleAnomalies' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\ExampleValidator\anomalies\4) at 0x261a538e130.type<class 'tfx.types.standard_artifacts.ExampleAnomalies'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\ExampleValidator\anomalies\4.span0.split_names[""train"", ""eval""]"

0,1
.type_name,ExampleAnomalies
._artifacts,"[0] function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'ExampleAnomalies' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\ExampleValidator\anomalies\4) at 0x261a538e130.type<class 'tfx.types.standard_artifacts.ExampleAnomalies'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\ExampleValidator\anomalies\4.span0.split_names[""train"", ""eval""]"

0,1
[0],"function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'ExampleAnomalies' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\ExampleValidator\anomalies\4) at 0x261a538e130.type<class 'tfx.types.standard_artifacts.ExampleAnomalies'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\ExampleValidator\anomalies\4.span0.split_names[""train"", ""eval""]"

0,1
.type,<class 'tfx.types.standard_artifacts.ExampleAnomalies'>
.uri,C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\ExampleValidator\anomalies\4
.span,0
.split_names,"[""train"", ""eval""]"


After `ExampleValidator` finishes running, we can visualize the anomalies as a table.

In [83]:
context.show(example_validator.outputs['anomalies'])



In the anomalies table, we can see that there are no anomalies. This is what we'd expect, since this is the first dataset that we've analyzed and the schema is tailored to it. You should still review the schema: anything unexpected in it points to an anomaly in the data. Once reviewed, the schema can be used to guard future data, and anomalies produced here can be used to debug model performance, understand how your data evolves over time, and identify data errors.
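To make the idea of a schema guarding future data more concrete, here is a minimal plain-Python sketch. It does not use TFDV or the `ExampleValidator` API. The `SCHEMA` dictionary, the expected types, and the `find_anomalies` helper are assumptions made up for illustration; only the feature names and label categories come from this notebook.

```python
# Hypothetical, hand-written schema: feature name -> expected Python type.
# The real pipeline uses the schema inferred by SchemaGen instead.
SCHEMA = {
    "Src_Port": int,
    "Flow_Duration": int,
    "Label": str,
}

# Allowed values for the categorical target, taken from the label groups used here.
LABEL_DOMAIN = {"DoS/DDoS Attack", "Malware Attack", "Normal",
                "Other Attack Types", "Web Attack"}


def find_anomalies(record):
    """Return a list of human-readable anomaly descriptions for one record."""
    anomalies = []
    # Check every feature the schema expects.
    for name, expected_type in SCHEMA.items():
        if name not in record:
            anomalies.append(f"{name}: missing feature")
        elif not isinstance(record[name], expected_type):
            anomalies.append(f"{name}: expected {expected_type.__name__}")
    # Flag features that the schema does not know about.
    for name in record:
        if name not in SCHEMA:
            anomalies.append(f"{name}: feature not in schema")
    # Flag label values outside the known domain.
    label = record.get("Label")
    if isinstance(label, str) and label not in LABEL_DOMAIN:
        anomalies.append(f"Label: unexpected value {label!r}")
    return anomalies


clean = {"Src_Port": 443, "Flow_Duration": 1200, "Label": "Normal"}
drifted = {"Src_Port": "443", "Flow_Duration": 1200, "Label": "Botnet"}

print(find_anomalies(clean))
print(find_anomalies(drifted))
```

The first record matches the schema and yields an empty list, while the second one is flagged twice: once for the string-typed port and once for the unknown label value.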

## 5) Writing preprocessing scripts and applying these using Transform component

The `Transform` component performs feature engineering for both training and serving. It uses the [TensorFlow Transform](https://www.tensorflow.org/tfx/transform/get_started) library.

`Transform` will take as input the data from `ExampleGen`, the schema from `SchemaGen`, as well as a module that contains user-defined Transform code.

Note: The `%%writefile` cell magic will save the contents of the cell as a `.py` file on disk. This allows the `Transform` component to load your code as a module.
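As a rough illustration of why the code has to live in a file, the sketch below writes a small module to disk and then imports it by path using only the standard library. This is an assumption about the general mechanism, not a description of how TFX loads the module internally; the module name `demo_transform` and its contents are hypothetical.

```python
import importlib.util
import os
import tempfile

# Hypothetical stand-in for a transform module saved by %%writefile.
MODULE_SOURCE = '''
def transformed_name(key):
    return key + "_xf"

def preprocessing_fn(inputs):
    return {transformed_name(k): v for k, v in inputs.items()}
'''


def load_module_from_file(path):
    """Import a .py file by its path and return the resulting module object."""
    spec = importlib.util.spec_from_file_location("demo_transform", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as tmp_dir:
    module_path = os.path.join(tmp_dir, "demo_transform.py")
    # Equivalent in spirit to what the %%writefile cell magic does.
    with open(module_path, "w") as handle:
        handle.write(MODULE_SOURCE)
    demo = load_module_from_file(os.path.abspath(module_path))
    result = demo.preprocessing_fn({"Src_Port": 443})

print(result)
```

Because the function is looked up by name inside the loaded file, a component that receives only a file path can still find and call `preprocessing_fn`.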

In [84]:
insdn_transform_module_file = 'insdn_transform.py' #to hold the name of the python file defining transforms to be applied

In [85]:
%%writefile {insdn_transform_module_file}

#This part is mainly adapted from the textbook and reference TFX tutorial

import tensorflow as tf
import tensorflow_transform as tft

#List of feature columns to be used in this project 
#Note: This list also includes the target variable "Label"
FEATURES_TO_BE_USED = ['Src_Port', 'Dst_Port', 'Flow_Duration', 'TotLen_Fwd_Pkts', 
                       'Fwd_Pkt_Len_Max', 'Bwd_Pkt_Len_Min', 'Pkt_Len_Max', 'Pkt_Len_Mean', 
                       'Fwd_Pkts/s', 'Bwd_Pkts/s', 'Flow_IAT_Mean', 'Flow_IAT_Min', 'Bwd_IAT_Tot', 
                       'Bwd_IAT_Min', 'Fwd_Header_Len', 'Bwd_Header_Len', 'FIN_Flag_Cnt', 
                       'SYN_Flag_Cnt', 'ACK_Flag_Cnt', 'Init_Bwd_Win_Byts', 'Label']

#First define the numeric feature keys to be normalized in preprocessing function
NUMERIC_FEATURE_KEYS = FEATURES_TO_BE_USED[:20]
LABEL_KEY = FEATURES_TO_BE_USED[20]
NUMBER_OF_CATEGORIES_IN_LABEL_KEY = 5

def preprocessing_fn(inputs):
    
    #  tf.transform's callback function for preprocessing inputs.
    # Args:
    #  inputs: map from feature keys to raw not-yet-transformed features.
    # Returns:
    #  Map from string feature key to transformed feature operations.
    
    #outputs is the dictionary containing the processed data and will be returned at the end of preprocessing function 
    outputs = {}
    
    #Normalize all the numerical features using the built-in TFT min-max normalizer
#    for key in NUMERIC_FEATURE_KEYS:
#        outputs[transformed_name(key)] = tft.scale_to_0_1(fill_in_missing(inputs[key]))
    
    #Target value transformation
    #This transformation converts the string category names of the target key "Label"
        #First, to the numerical indexes 0,1,2,3,4 using the built-in TFT compute_and_apply_vocabulary function. 
            #(For more info about this function, refer to page 71 of "Building Machine Learning Pipelines")
            #Note: the top_k parameter is used to make sure that the number of categories is limited to 5, including:
            #'DoS/DDoS Attack', 'Malware Attack', 'Normal', 'Other Attack Types', 'Web Attack'
        #And then to one-hot encoded vector representations (of length number of categories) using the user-defined
        #convert_num_to_one_hot function
            #See the function below
    #This transformation is essential in order to apply Keras models for multiclass classification 
#The following line executes without an error
#    index = tft.compute_and_apply_vocabulary(fill_in_missing(inputs[LABEL_KEY]), top_k=NUMBER_OF_CATEGORIES_IN_LABEL_KEY) #name->index
#The following line raised an error when this notebook was run: convert_num_to_one_hot rejected the type of index and expected a tensor object
#    outputs[transformed_name(LABEL_KEY)] = convert_num_to_one_hot(index, num_labels=NUMBER_OF_CATEGORIES_IN_LABEL_KEY) #index->one-hot
 
#---------------------------------------------------------------------------    
    #Temporary alternative: pass every feature through unchanged (only the keys are renamed)
    #Either the problem with the original version above should be solved, or the transformations should be applied to the CSV data with pandas
    for key in NUMERIC_FEATURE_KEYS:
        outputs[transformed_name(key)] = inputs[key]
    outputs[transformed_name(LABEL_KEY)] = inputs[LABEL_KEY]
#---------------------------------------------------------------------------
    return outputs

def transformed_name(key):
    #This is a helper function to produce names for processed keys
    #Obtained from "Building Machine Learning Pipelines" page 76
    #Note: key + "_xf" format is conventional and is desired by TFX platform
    return key + '_xf'

def convert_num_to_one_hot(label_tensor, num_labels = 2):
    #This helper function converts a given index tensor to a one-hot encoded representation tensor
    #Obtained from "Building Machine Learning Pipelines" page 76
    #Note: this function is very similar to the to_categorical function from keras.utils
    one_hot_tensor = tf.one_hot(label_tensor, num_labels)
    #tf.reshape takes the tensor and the target shape as two separate arguments
    return tf.reshape(one_hot_tensor, [-1, num_labels])

def fill_in_missing(x):
    """Replace missing values in a SparseTensor.
    Fills in missing values of `x` with '' or 0, and converts to a dense tensor.
    Args:
        x: A `SparseTensor` of rank 2.  Its dense shape should have size at most 1
        in the second dimension.
    Returns:
    A rank 1 tensor where missing values of `x` have been filled in.
    """
    if not isinstance(x, tf.sparse.SparseTensor):
        return x

    default_value = '' if x.dtype == tf.string else 0
    return tf.squeeze(tf.sparse.to_dense(tf.SparseTensor(x.indices, x.values, [x.dense_shape[0], 1]), default_value),axis=1)

Overwriting insdn_transform.py
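The transformations that are commented out in `preprocessing_fn` (min-max scaling, vocabulary lookup, one-hot encoding) can be sketched in plain Python to show what they are meant to compute. This is only an illustration on a tiny in-memory sample, not the TensorFlow Transform implementation: the helper functions, the tie-breaking rule in the vocabulary, and the handling of constant columns are assumptions, and the sample values are made up.

```python
from collections import Counter


def scale_to_0_1(values):
    """Min-max scale a list of numbers into [0, 1] using the list's own min and max."""
    low, high = min(values), max(values)
    if high == low:
        # Constant column: this sketch maps it to 0.0 (a library may choose differently).
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


def build_vocabulary(labels, top_k):
    """Map the top_k most frequent labels to integer indices, most frequent first."""
    counts = Counter(labels)
    # Sort by descending frequency, then by name so that ties are deterministic.
    ordered = sorted(counts, key=lambda name: (-counts[name], name))
    return {name: index for index, name in enumerate(ordered[:top_k])}


def one_hot(index, num_labels):
    """Return a list of length num_labels with 1.0 at position index."""
    vector = [0.0] * num_labels
    vector[index] = 1.0
    return vector


# Made-up sample: four flows with a duration and a label each.
durations = [10, 20, 30, 50]
labels = ["Normal", "Web Attack", "Normal", "Malware Attack"]

scaled = scale_to_0_1(durations)
vocab = build_vocabulary(labels, top_k=5)
encoded = [one_hot(vocab[name], num_labels=5) for name in labels]

print(scaled)
print(vocab)
print(encoded[1])
```

In the pipeline itself the minimum, maximum, and vocabulary would be computed over the whole training split rather than a single list, which is why these steps belong in `Transform` and not in ad hoc preprocessing.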


Now, we pass this feature engineering code to the `Transform` component and run it to transform the data.

In [None]:
examples

In [86]:
transform = Transform(
    examples=example_gen.outputs['examples'],
    schema=schema_gen.outputs['schema'],
    module_file = os.path.abspath(insdn_transform_module_file))
    #Caution: This line does not work: module_file=os.path.join(os.getcwd(), insdn_transform_module_file))
context.run(transform)



INFO:tensorflow:Assets added to graph.
INFO:tensorflow:No assets to write.
INFO:tensorflow:SavedModel written to: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\Transform\transform_graph\5\.temp_path\tftransform_tmp\a2ece906bd0641d597fc23f933d84b93\saved_model.pb




INFO:tensorflow:Saver not created because there are no variables in the graph to restore
INFO:tensorflow:Saver not created because there are no variables in the graph to restore
INFO:tensorflow:Saver not created because there are no variables in the graph to restore



0,1
.inputs,"['examples'] function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Channel of type 'Examples' (1 artifact) at 0x261a517aeb0.type_nameExamples._artifacts[0] function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'Examples' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\CsvExampleGen\examples\1) at 0x2619c2f0610.type<class 'tfx.types.standard_artifacts.Examples'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\CsvExampleGen\examples\1.span0.split_names[""train"", ""eval""].version0['schema'] function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Channel of type 'Schema' (1 artifact) at 0x2619c5e7220.type_nameSchema._artifacts[0] function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'Schema' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\SchemaGen\schema\3) at 0x2619bb62400.type<class 'tfx.types.standard_artifacts.Schema'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\SchemaGen\schema\3"
.outputs,"['transform_graph'] function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Channel of type 'TransformGraph' (1 artifact) at 0x2619b370f10.type_nameTransformGraph._artifacts[0] function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'TransformGraph' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\Transform\transform_graph\5) at 0x2619c3b8760.type<class 'tfx.types.standard_artifacts.TransformGraph'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\Transform\transform_graph\5['transformed_examples'] function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Channel of type 'Examples' (1 artifact) at 0x2619b370df0.type_nameExamples._artifacts[0] function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'Examples' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\Transform\transformed_examples\5) at 0x2619c3b8c70.type<class 
'tfx.types.standard_artifacts.Examples'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\Transform\transformed_examples\5.span0.split_names[""train"", ""eval""].version0['updated_analyzer_cache'] function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Channel of type 'TransformCache' (1 artifact) at 0x2619b370280.type_nameTransformCache._artifacts[0] function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'TransformCache' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\Transform\updated_analyzer_cache\5) at 0x2619c3b8520.type<class 'tfx.types.standard_artifacts.TransformCache'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\Transform\updated_analyzer_cache\5"
.exec_properties,['module_file']C:\Users\uturk\Desktop\InSDN\Pipeline\insdn_transform.py['preprocessing_fn']None['force_tf_compat_v1']1['custom_config']null['splits_config']None

0,1
['examples'],"function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Channel of type 'Examples' (1 artifact) at 0x261a517aeb0.type_nameExamples._artifacts[0] function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'Examples' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\CsvExampleGen\examples\1) at 0x2619c2f0610.type<class 'tfx.types.standard_artifacts.Examples'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\CsvExampleGen\examples\1.span0.split_names[""train"", ""eval""].version0"
['schema'],function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Channel of type 'Schema' (1 artifact) at 0x2619c5e7220.type_nameSchema._artifacts[0] function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'Schema' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\SchemaGen\schema\3) at 0x2619bb62400.type<class 'tfx.types.standard_artifacts.Schema'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\SchemaGen\schema\3

0,1
.type_name,Examples
._artifacts,"[0] function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'Examples' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\CsvExampleGen\examples\1) at 0x2619c2f0610.type<class 'tfx.types.standard_artifacts.Examples'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\CsvExampleGen\examples\1.span0.split_names[""train"", ""eval""].version0"

0,1
[0],"function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'Examples' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\CsvExampleGen\examples\1) at 0x2619c2f0610.type<class 'tfx.types.standard_artifacts.Examples'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\CsvExampleGen\examples\1.span0.split_names[""train"", ""eval""].version0"

0,1
.type,<class 'tfx.types.standard_artifacts.Examples'>
.uri,C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\CsvExampleGen\examples\1
.span,0
.split_names,"[""train"", ""eval""]"
.version,0

0,1
.type_name,Schema
._artifacts,[0] function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'Schema' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\SchemaGen\schema\3) at 0x2619bb62400.type<class 'tfx.types.standard_artifacts.Schema'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\SchemaGen\schema\3

0,1
[0],function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'Schema' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\SchemaGen\schema\3) at 0x2619bb62400.type<class 'tfx.types.standard_artifacts.Schema'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\SchemaGen\schema\3

0,1
.type,<class 'tfx.types.standard_artifacts.Schema'>
.uri,C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\SchemaGen\schema\3

0,1
['transform_graph'],function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Channel of type 'TransformGraph' (1 artifact) at 0x2619b370f10.type_nameTransformGraph._artifacts[0] function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'TransformGraph' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\Transform\transform_graph\5) at 0x2619c3b8760.type<class 'tfx.types.standard_artifacts.TransformGraph'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\Transform\transform_graph\5
['transformed_examples'],"function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Channel of type 'Examples' (1 artifact) at 0x2619b370df0.type_nameExamples._artifacts[0] function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'Examples' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\Transform\transformed_examples\5) at 0x2619c3b8c70.type<class 'tfx.types.standard_artifacts.Examples'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\Transform\transformed_examples\5.span0.split_names[""train"", ""eval""].version0"
['updated_analyzer_cache'],function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Channel of type 'TransformCache' (1 artifact) at 0x2619b370280.type_nameTransformCache._artifacts[0] function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'TransformCache' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\Transform\updated_analyzer_cache\5) at 0x2619c3b8520.type<class 'tfx.types.standard_artifacts.TransformCache'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\Transform\updated_analyzer_cache\5

0,1
.type_name,TransformGraph
._artifacts,[0] function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'TransformGraph' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\Transform\transform_graph\5) at 0x2619c3b8760.type<class 'tfx.types.standard_artifacts.TransformGraph'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\Transform\transform_graph\5

0,1
[0],function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'TransformGraph' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\Transform\transform_graph\5) at 0x2619c3b8760.type<class 'tfx.types.standard_artifacts.TransformGraph'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\Transform\transform_graph\5

0,1
.type,<class 'tfx.types.standard_artifacts.TransformGraph'>
.uri,C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\Transform\transform_graph\5

0,1
.type_name,Examples
._artifacts,"[0] function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'Examples' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\Transform\transformed_examples\5) at 0x2619c3b8c70.type<class 'tfx.types.standard_artifacts.Examples'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\Transform\transformed_examples\5.span0.split_names[""train"", ""eval""].version0"

0,1
[0],"function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'Examples' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\Transform\transformed_examples\5) at 0x2619c3b8c70.type<class 'tfx.types.standard_artifacts.Examples'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\Transform\transformed_examples\5.span0.split_names[""train"", ""eval""].version0"

0,1
.type,<class 'tfx.types.standard_artifacts.Examples'>
.uri,C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\Transform\transformed_examples\5
.span,0
.split_names,"[""train"", ""eval""]"
.version,0

0,1
.type_name,TransformCache
._artifacts,[0] function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'TransformCache' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\Transform\updated_analyzer_cache\5) at 0x2619c3b8520.type<class 'tfx.types.standard_artifacts.TransformCache'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\Transform\updated_analyzer_cache\5

0,1
[0],function toggleTfxObject(element) {  var objElement = element.parentElement;  if (objElement.classList.contains('collapsed')) {  objElement.classList.remove('collapsed');  objElement.classList.add('expanded');  } else {  objElement.classList.add('collapsed');  objElement.classList.remove('expanded');  } } Artifact of type 'TransformCache' (uri: C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\Transform\updated_analyzer_cache\5) at 0x2619c3b8520.type<class 'tfx.types.standard_artifacts.TransformCache'>.uriC:\Users\uturk\Desktop\InSDN\Pipeline\tfx\Transform\updated_analyzer_cache\5

0,1
.type,<class 'tfx.types.standard_artifacts.TransformCache'>
.uri,C:\Users\uturk\Desktop\InSDN\Pipeline\tfx\Transform\updated_analyzer_cache\5

0,1
['module_file'],C:\Users\uturk\Desktop\InSDN\Pipeline\insdn_transform.py
['preprocessing_fn'],
['force_tf_compat_v1'],1
['custom_config'],
['splits_config'],



Let's examine the output artifacts of `Transform`. This component produces two types of outputs:

* `transform_graph` is the graph that can perform the preprocessing operations (this graph will be included in the serving and evaluation models).
* `transformed_examples` represents the preprocessed training and evaluation data.

In [87]:
transform.outputs

{'transform_graph': Channel(
    type_name: TransformGraph
    artifacts: [Artifact(artifact: id: 5
type_id: 13
uri: "C:\\Users\\uturk\\Desktop\\InSDN\\Pipeline\\tfx\\Transform\\transform_graph\\5"
custom_properties {
  key: "name"
  value {
    string_value: "transform_graph"
  }
}
custom_properties {
  key: "producer_component"
  value {
    string_value: "Transform"
  }
}
custom_properties {
  key: "state"
  value {
    string_value: "published"
  }
}
custom_properties {
  key: "tfx_version"
  value {
    string_value: "0.29.0"
  }
}
state: LIVE
, artifact_type: id: 13
name: "TransformGraph"
)]
    additional_properties: {}
    additional_custom_properties: {}
), 'transformed_examples': Channel(
    type_name: Examples
    artifacts: [Artifact(artifact: id: 6
type_id: 5
uri: "C:\\Users\\uturk\\Desktop\\InSDN\\Pipeline\\tfx\\Transform\\transformed_examples\\5"
properties {
  key: "split_names"
  value {
    string_value: "[\"train\", \"eval\"]"
  }
}
custom_properties {
  key: "name"

Take a peek at the `transform_graph` artifact. It points to a directory containing three subdirectories.

In [88]:
transform_graph_uri = transform.outputs['transform_graph'].get()[0].uri
os.listdir(transform_graph_uri)

['metadata', 'transformed_metadata', 'transform_fn']

The `transformed_metadata` subdirectory contains the schema of the preprocessed data. The `transform_fn` subdirectory contains the actual preprocessing graph. The `metadata` subdirectory contains the schema of the original data.

We can also take a look at the first three transformed examples:

In [89]:
# Get the URI of the output artifact representing the transformed examples, which is a directory
train_uri = os.path.join(transform.outputs['transformed_examples'].get()[0].uri, 'Split-train')

# Get the list of files in this directory (all compressed TFRecord files)
tfrecord_filenames = [os.path.join(train_uri, name)
                      for name in os.listdir(train_uri)]

# Create a `TFRecordDataset` to read these files
dataset = tf.data.TFRecordDataset(tfrecord_filenames, compression_type="GZIP")

# Iterate over the first 3 records and decode them.
for tfrecord in dataset.take(3):
    serialized_example = tfrecord.numpy()
    example = tf.train.Example()
    example.ParseFromString(serialized_example)
    pp.pprint(example)

features {
  feature {
    key: "ACK_Flag_Cnt_xf"
    value {
      int64_list {
        value: 0
      }
    }
  }
  feature {
    key: "Bwd_Header_Len_xf"
    value {
      int64_list {
        value: 144
      }
    }
  }
  feature {
    key: "Bwd_IAT_Min_xf"
    value {
      float_list {
        value: 3069.0
      }
    }
  }
  feature {
    key: "Bwd_IAT_Tot_xf"
    value {
      float_list {
        value: 11756.0
      }
    }
  }
  feature {
    key: "Bwd_Pkt_Len_Min_xf"
    value {
      int64_list {
        value: 0
      }
    }
  }
  feature {
    key: "Bwd_Pkts/s_xf"
    value {
      float_list {
        value: 296.20853
      }
    }
  }
  feature {
    key: "Dst_Port_xf"
    value {
      int64_list {
        value: 8081
      }
    }
  }
  feature {
    key: "FIN_Flag_Cnt_xf"
    value {
      int64_list {
        value: 0
      }
    }
  }
  feature {
    key: "Flow_Duration_xf"
    value {
      int64_list {
        value: 13504
      }
    }
  }
  feature {
    ke

After the `Transform` component has transformed your data into features, the next step is to train a model.

## 6) Writing ML code for model training and training the model with the Trainer component

The `Trainer` component will train a model that you define in TensorFlow (either using the Estimator API or the Keras API with [`model_to_estimator`](https://www.tensorflow.org/api_docs/python/tf/keras/estimator/model_to_estimator)).

`Trainer` takes as input the schema from `SchemaGen`, the transformed data and graph from `Transform`, training parameters, as well as a module that contains user-defined model code.
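
The training parameters are expressed as step counts rather than epochs. As a rough sketch of what a step count means in terms of passes over the data, the arithmetic below uses the 7,856-row subset together with the batch size (40) and training steps (1000) that this notebook sets further down. The 2:1 train/eval split is an assumption (it is the ExampleGen default when no split is configured, and the hash-based split is only approximately 2:1), and `passes_over_data` is a hypothetical helper.

```python
def passes_over_data(num_steps, batch_size, num_examples):
    """Return how many times `num_steps` batches cover `num_examples` rows."""
    return num_steps * batch_size / num_examples


total_rows = 7856                 # size of the InSDN subset used in this notebook
train_rows = total_rows * 2 // 3  # assumption: default 2:1 train/eval split

epochs = passes_over_data(num_steps=1000, batch_size=40, num_examples=train_rows)
print(f"~{train_rows} training rows, about {epochs:.1f} passes over them")
```

Since each step consumes one batch, lowering the step count or the batch size proportionally reduces how often the model sees each training row.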

Let's see an example of user-defined model code below (for an introduction to the TensorFlow Estimator APIs, [see the tutorial](https://www.tensorflow.org/tutorials/estimator/premade)):

In [90]:
insdn_trainer_module_file = 'insdn_trainer.py'  # name of the Python file that will hold the model training code

In [91]:
%%writefile {insdn_trainer_module_file}

#This part is mainly adapted from the reference TFX tutorial

import tensorflow as tf
import tensorflow.keras as keras

import tensorflow_model_analysis as tfma
import tensorflow_transform as tft
from tensorflow_transform.tf_metadata import schema_utils
from tfx_bsl.tfxio import dataset_options

#List of feature columns to be used in this project 
#Note: This list also includes the target variable "Label"
FEATURES_TO_BE_USED = ['Src_Port', 'Dst_Port', 'Flow_Duration', 'TotLen_Fwd_Pkts', 
                       'Fwd_Pkt_Len_Max', 'Bwd_Pkt_Len_Min', 'Pkt_Len_Max', 'Pkt_Len_Mean', 
                       'Fwd_Pkts/s', 'Bwd_Pkts/s', 'Flow_IAT_Mean', 'Flow_IAT_Min', 'Bwd_IAT_Tot', 
                       'Bwd_IAT_Min', 'Fwd_Header_Len', 'Bwd_Header_Len', 'FIN_Flag_Cnt', 
                       'SYN_Flag_Cnt', 'ACK_Flag_Cnt', 'Init_Bwd_Win_Byts', 'Label']

#Numeric feature keys to be used as inputs to model
NUMERIC_FEATURE_KEYS = FEATURES_TO_BE_USED[:20]
#Label key "Label" to be used in model
LABEL_KEY = FEATURES_TO_BE_USED[20]
NUMBER_OF_CATEGORIES_IN_LABEL_KEY = 5 #'DoS/DDoS Attack', 'Malware Attack', 'Normal', 'Other Attack Types', 'Web Attack'


def transformed_name(key):
    return key + '_xf'


def transformed_names(keys):
    return [transformed_name(key) for key in keys]


def get_raw_feature_spec(schema):
    return schema_utils.schema_as_feature_spec(schema).feature_spec


def build_estimator(config, hidden_units=None, warm_start_from=None):
    """Build an estimator for predicting the network anomaly label.
    Args:
      config: tf.estimator.RunConfig defining the runtime environment for the
        estimator (including model_dir).
      hidden_units: [int], the layer sizes of the DNN (input layer first).
      warm_start_from: Optional directory to warm start from.
    Returns:
      A tf.estimator.DNNClassifier with one output class per label category.
    """
    # Note: some features in our dataset are integers rather than floats;
    # numeric_column casts them to float32 by default.
    # https://www.tensorflow.org/api_docs/python/tf/feature_column/numeric_column
    my_feature_columns = [tf.feature_column.numeric_column(key, shape=()) for key in transformed_names(NUMERIC_FEATURE_KEYS)]

    # Note: the reference TFX tutorial uses DNNLinearCombinedClassifier.
    # Here DNNClassifier is used instead, with the parameters adjusted accordingly.
    # https://www.tensorflow.org/api_docs/python/tf/estimator/DNNClassifier
    # https://www.tensorflow.org/api_docs/python/tf/estimator/DNNLinearCombinedClassifier
    return tf.estimator.DNNClassifier(
        config=config,
        feature_columns=my_feature_columns,
        hidden_units=hidden_units or [100, 70, 50, 25],
        n_classes = NUMBER_OF_CATEGORIES_IN_LABEL_KEY,
        warm_start_from=warm_start_from)


def example_serving_receiver_fn(tf_transform_graph, schema):
    """Build the serving inputs.
    Args:
      tf_transform_graph: A TFTransformOutput.
      schema: the schema of the input data.
    Returns:
      Tensorflow graph which parses examples, applying tf-transform to them.
    """
    raw_feature_spec = get_raw_feature_spec(schema)
    raw_feature_spec.pop(LABEL_KEY)

    raw_input_fn = tf.estimator.export.build_parsing_serving_input_receiver_fn(raw_feature_spec, default_batch_size=None)
    serving_input_receiver = raw_input_fn()

    transformed_features = tf_transform_graph.transform_raw_features(serving_input_receiver.features)

    return tf.estimator.export.ServingInputReceiver(transformed_features, serving_input_receiver.receiver_tensors)



def eval_input_receiver_fn(tf_transform_graph, schema):
    """Build everything needed for the tf-model-analysis to run the model.
    Args:
      tf_transform_graph: A TFTransformOutput.
      schema: the schema of the input data.
    Returns:
      EvalInputReceiver function, which contains:
        - Tensorflow graph which parses raw untransformed features, applies the
          tf-transform preprocessing operators.
        - Set of raw, untransformed features.
        - Label against which predictions will be compared.
    """
    # Notice that the inputs are raw features, not transformed features here.
    raw_feature_spec = get_raw_feature_spec(schema)

    serialized_tf_example = tf.compat.v1.placeholder(dtype=tf.string, shape=[None], name='input_example_tensor')

    # Add a parse_example operator to the tensorflow graph, which will parse
    # raw, untransformed, tf examples.
    features = tf.io.parse_example(serialized_tf_example, raw_feature_spec)

    # Now that we have our raw examples, process them through the tf-transform
    # function computed during the preprocessing step.
    transformed_features = tf_transform_graph.transform_raw_features(features)

    # The key name MUST be 'examples'.
    receiver_tensors = {'examples': serialized_tf_example}

    # NOTE: The model is driven by transformed features (since training works on the
    # materialized output of TFT), but slicing will happen on raw features.
    features.update(transformed_features)

    return tfma.export.EvalInputReceiver(
        features=features,
        receiver_tensors=receiver_tensors,
        labels=transformed_features[transformed_name(LABEL_KEY)])


def input_fn(file_pattern, data_accessor, tf_transform_output, batch_size=200):
    """Generates features and label for tuning/training.
    
    Args:
      file_pattern: List of paths or patterns of input tfrecord files.
      data_accessor: DataAccessor for converting input to RecordBatch.
      tf_transform_output: A TFTransformOutput.
      batch_size: representing the number of consecutive elements of returned
        dataset to combine in a single batch

    Returns:
      A dataset that contains (features, indices) tuple where features is a
        dictionary of Tensors, and indices is a single Tensor of label indices.
    """
    return data_accessor.tf_dataset_factory(
        file_pattern,
        dataset_options.TensorFlowDatasetOptions(batch_size=batch_size, label_key=transformed_name(LABEL_KEY)),
        tf_transform_output.transformed_metadata.schema)


# TFX will call this function
def trainer_fn(trainer_fn_args, schema):
    """Build the estimator using the high level API.
    Args:
      trainer_fn_args: Holds args used to train the model as name/value pairs.
      schema: Holds the schema of the training examples.
    Returns:
      A dict of the following:
        - estimator: The estimator that will be used for training and eval.
        - train_spec: Spec for training.
        - eval_spec: Spec for eval.
        - eval_input_receiver_fn: Input function for eval.
    """
    # The values below are taken from the reference TFX tutorial and can be adjusted.
    # Number of nodes in the first layer of the DNN
    first_dnn_layer_size = 100
    num_dnn_layers = 4
    dnn_decay_factor = 0.7

    train_batch_size = 40
    eval_batch_size = 40

    tf_transform_graph = tft.TFTransformOutput(trainer_fn_args.transform_output)

    train_input_fn = lambda: input_fn(  # pylint: disable=g-long-lambda
        trainer_fn_args.train_files,
        trainer_fn_args.data_accessor,
        tf_transform_graph,
        batch_size=train_batch_size)

    eval_input_fn = lambda: input_fn(  # pylint: disable=g-long-lambda
        trainer_fn_args.eval_files,
        trainer_fn_args.data_accessor,
        tf_transform_graph,
        batch_size=eval_batch_size)

    train_spec = tf.estimator.TrainSpec(
        train_input_fn,
        max_steps=trainer_fn_args.train_steps)

    serving_receiver_fn = lambda: example_serving_receiver_fn(  # pylint: disable=g-long-lambda
        tf_transform_graph, schema)

    exporter = tf.estimator.FinalExporter('subset-InSDN', serving_receiver_fn)
    eval_spec = tf.estimator.EvalSpec(
        eval_input_fn,
        steps=trainer_fn_args.eval_steps,
        exporters=[exporter],
        name='subset-InSDN-eval')

    run_config = tf.estimator.RunConfig(
        save_checkpoints_steps=999, keep_checkpoint_max=1)

    run_config = run_config.replace(model_dir=trainer_fn_args.serving_model_dir)

    estimator = build_estimator(
        # Construct layer sizes with exponential decay
        hidden_units=[max(2, int(first_dnn_layer_size * dnn_decay_factor**i)) for i in range(num_dnn_layers)],
        config=run_config,
        warm_start_from=trainer_fn_args.base_model)  # may be None when no base model is supplied

    # Create an input receiver for TFMA processing
    receiver_fn = lambda: eval_input_receiver_fn(  # pylint: disable=g-long-lambda
        tf_transform_graph, schema)

    return {
        'estimator': estimator,
        'train_spec': train_spec,
        'eval_spec': eval_spec,
        'eval_input_receiver_fn': receiver_fn}


Overwriting insdn_trainer.py
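
Note that `trainer_fn` does not rely on the default `hidden_units` in `build_estimator`; it derives the layer sizes from `first_dnn_layer_size`, `num_dnn_layers` and `dnn_decay_factor`. The standalone sketch below reproduces that expression so the resulting architecture can be inspected without running the pipeline. `decayed_layer_sizes` is a hypothetical helper name, not part of the module above.

```python
def decayed_layer_sizes(first_layer_size, num_layers, decay_factor, min_size=2):
    """Layer sizes shrinking geometrically, mirroring the expression in trainer_fn."""
    return [max(min_size, int(first_layer_size * decay_factor**i))
            for i in range(num_layers)]


# Same values as in trainer_fn: 100 units in the first layer, 4 layers, decay 0.7.
layer_sizes = decayed_layer_sizes(first_layer_size=100, num_layers=4, decay_factor=0.7)
print(layer_sizes)
```

Because `int()` truncates and the powers are computed in floating point, the printed sizes differ from the `[100, 70, 50, 25]` fallback in `build_estimator`, which is only used when `hidden_units` is not passed.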


Now, we pass in this model code to the `Trainer` component and run it to train the model.

In [92]:
trainer = Trainer(
    module_file=os.path.abspath(insdn_trainer_module_file),
    transformed_examples=transform.outputs['transformed_examples'],
    schema=schema_gen.outputs['schema'],
    transform_graph=transform.outputs['transform_graph'],
    train_args=trainer_pb2.TrainArgs(num_steps=1000),  # reduced from 10000 to 1000
    eval_args=trainer_pb2.EvalArgs(num_steps=500))  # reduced from 5000 to 500
context.run(trainer)



INFO:tensorflow:Using config: {'_model_dir': 'C:\\Users\\uturk\\Desktop\\InSDN\\Pipeline\\tfx\\Trainer\\model_run\\6\\Format-Serving', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 999, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 1, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_checkpoint_save_graph_def': True, '_service': None, '_cluster_spec': ClusterSpec({}), '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
INFO:tensorflow:Not using Distribute Coordinator.

ValueError: in user code:

    C:\Users\uturk\anaconda3\lib\site-packages\tensorflow_estimator\python\estimator\canned\dnn.py:350 call  *
        net = self._input_layer(features, training=is_training)
    C:\Users\uturk\anaconda3\lib\site-packages\tensorflow\python\keras\engine\base_layer_v1.py:786 __call__  **
        outputs = call_fn(cast_inputs, *args, **kwargs)
    C:\Users\uturk\anaconda3\lib\site-packages\tensorflow\python\keras\feature_column\dense_features.py:168 call  **
        tensor = column.get_dense_tensor(transformation_cache,
    C:\Users\uturk\anaconda3\lib\site-packages\tensorflow\python\feature_column\feature_column_v2.py:2592 get_dense_tensor
        return transformation_cache.get(self, state_manager)
    C:\Users\uturk\anaconda3\lib\site-packages\tensorflow\python\feature_column\feature_column_v2.py:2355 get
        transformed = column.transform_feature(self, state_manager)
    C:\Users\uturk\anaconda3\lib\site-packages\tensorflow\python\feature_column\feature_column_v2.py:2565 transform_feature
        return self._transform_input_tensor(input_tensor)
    C:\Users\uturk\anaconda3\lib\site-packages\tensorflow\python\feature_column\feature_column_v2.py:2535 _transform_input_tensor
        raise ValueError(

    ValueError: The corresponding Tensor of numerical column must be a Tensor. SparseTensor is not supported. key: ACK_Flag_Cnt_xf
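
The error indicates that `ACK_Flag_Cnt_xf` reaches the estimator as a `SparseTensor`. A likely cause, assuming the preprocessing function follows the reference TFX tutorial, is that the schema treats the feature as optional, so it is parsed as a variable-length (sparse) feature that `numeric_column` cannot consume. One common remedy is to densify each feature inside `preprocessing_fn` before transforming it. The helper below is a minimal sketch modelled on the tutorial's approach; `fill_in_missing` is a hypothetical name and the toy tensor is made up for illustration.

```python
import tensorflow as tf


def fill_in_missing(x):
    """Convert a rank-2 SparseTensor with at most one value per row to a dense 1-D tensor.

    Missing entries become '' for strings and 0 for numeric types.
    Dense inputs are returned unchanged.
    """
    if not isinstance(x, tf.sparse.SparseTensor):
        return x
    default_value = '' if x.dtype == tf.string else 0
    dense = tf.sparse.to_dense(
        tf.sparse.SparseTensor(x.indices, x.values, [x.dense_shape[0], 1]),
        default_value)
    return tf.squeeze(dense, axis=1)


# Toy batch of three rows in which the second row has no value.
sparse_batch = tf.sparse.SparseTensor(
    indices=[[0, 0], [2, 0]], values=[5, 7], dense_shape=[3, 1])
dense_batch = fill_in_missing(sparse_batch)
print(dense_batch.numpy())
```

In the transform module, each raw input would be wrapped, for example `fill_in_missing(inputs[key])`, before being assigned to its `_xf` output. The transform module lies outside this section, so the exact change needed there is an assumption.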


#### Analyze Training with TensorBoard
Optionally, we can connect TensorBoard to the Trainer to analyze our model's training curves.

In [192]:
# Get the URI of the output artifact representing the training logs, which is a directory
model_run_dir = trainer.outputs['model_run'].get()[0].uri

%load_ext tensorboard
%tensorboard --logdir {model_run_dir}

NameError: name 'trainer' is not defined

## 7) Evaluator
The `Evaluator` component computes model performance metrics over the evaluation set. It uses the [TensorFlow Model Analysis](https://www.tensorflow.org/tfx/model_analysis/get_started) library. The `Evaluator` can also optionally validate that a newly trained model is better than the previous model. This is useful in a production pipeline setting where you may automatically train and validate a model every day. In this notebook, we only train one model, so the `Evaluator` will automatically label the model as "good".

`Evaluator` will take as input the data from `ExampleGen`, the trained model from `Trainer`, and slicing configuration. The slicing configuration allows you to slice your metrics on feature values. See an example of this configuration below:

In [None]:
eval_config = tfma.EvalConfig(
    model_specs=[
        # Using signature 'eval' implies the use of an EvalSavedModel. To use
        # a serving model, remove the signature (it then defaults to
        # 'serving_default') and add a label_key.
        tfma.ModelSpec(signature_name='eval')
    ],
    metrics_specs=[
        tfma.MetricsSpec(
            # The metrics added here are in addition to those saved with the
            # model (assuming either a keras model or EvalSavedModel is used).
            # Any metrics added into the saved model (for example using
            # model.compile(..., metrics=[...]), etc) will be computed
            # automatically.
            metrics=[
                tfma.MetricConfig(class_name='ExampleCount')
            ],
            # To add validation thresholds for metrics saved with the model,
            # add them keyed by metric name to the thresholds map.
            thresholds = {
                'accuracy': tfma.MetricThreshold(
                    value_threshold=tfma.GenericValueThreshold(
                        lower_bound={'value': 0.5}),
                    # Change threshold will be ignored if there is no
                    # baseline model resolved from MLMD (first run).
                    change_threshold=tfma.GenericChangeThreshold(
                       direction=tfma.MetricDirection.HIGHER_IS_BETTER,
                       absolute={'value': -1e-10}))
            }
        )
    ],
    slicing_specs=[
        # An empty slice spec means the overall slice, i.e. the whole dataset.
        tfma.SlicingSpec(),
        # Data can be sliced along a feature column. In this case, data is
        # sliced along the feature column Dst_Port.
        tfma.SlicingSpec(feature_keys=['Dst_Port'])
    ])
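
The two thresholds on `accuracy` combine as follows: the value threshold is an absolute floor, while the change threshold compares the candidate with a baseline model and is skipped when no baseline exists. The plain-Python sketch below is a simplified paraphrase of that decision, not TFMA's implementation; `passes_thresholds` is a hypothetical function, and treating the bounds as inclusive is an assumption.

```python
def passes_thresholds(candidate, baseline=None, lower_bound=0.5, min_change=-1e-10):
    """Return True if a higher-is-better metric satisfies both thresholds.

    candidate: metric value of the newly trained model.
    baseline: metric value of the previously blessed model, or None on a first run.
    lower_bound: absolute floor (the value threshold).
    min_change: smallest allowed (candidate - baseline), i.e. the change threshold.
    """
    if candidate < lower_bound:
        return False
    if baseline is None:
        # No baseline resolved: the change threshold is ignored.
        return True
    return (candidate - baseline) >= min_change


first_run = passes_thresholds(0.80)                 # no baseline yet
regression = passes_thresholds(0.80, baseline=0.85)  # worse than the baseline
too_low = passes_thresholds(0.40)                   # below the absolute floor
print(first_run, regression, too_low)
```

Under these assumptions, a slightly negative `absolute` value such as `-1e-10` lets a candidate that merely ties the baseline pass, while any measurable regression fails.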

Next, we give this configuration to `Evaluator` and run it.

In [None]:
# Use TFMA to compute evaluation statistics over features of a model and
# validate them against a baseline.

# The model resolver is only required if performing model validation in addition
# to evaluation. In this case we validate against the latest blessed model. If
# no model has been blessed before (as in this case) the evaluator will make our
# candidate the first blessed model.
model_resolver = resolver.Resolver(
      instance_name='latest_blessed_model_resolver',
      strategy_class=latest_blessed_model_resolver.LatestBlessedModelResolver,
      model=Channel(type=Model),
      model_blessing=Channel(type=ModelBlessing))
context.run(model_resolver)

evaluator = Evaluator(
    examples=example_gen.outputs['examples'],
    model=trainer.outputs['model'],
    #baseline_model=model_resolver.outputs['model'],
    eval_config=eval_config)
context.run(evaluator)

Now let's examine the output artifacts of `Evaluator`. 

In [None]:
evaluator.outputs

Using the `evaluation` output we can show the default visualization of global metrics on the entire evaluation set.

In [None]:
context.show(evaluator.outputs['evaluation'])

To see the visualization for sliced evaluation metrics, we can directly call the TensorFlow Model Analysis library.

In [None]:
# Get the TFMA output result path and load the result.
PATH_TO_RESULT = evaluator.outputs['evaluation'].get()[0].uri
tfma_result = tfma.load_eval_result(PATH_TO_RESULT)

# Show data sliced along feature column 'Dst_Port'.
tfma.view.render_slicing_metrics(
    tfma_result, slicing_column='Dst_Port')

This visualization shows the same metrics, but computed at every feature value of `Dst_Port` instead of on the entire evaluation set.
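
Conceptually, slicing groups the evaluation examples by the value of the chosen feature and recomputes each metric within every group. The toy sketch below shows the idea for accuracy in plain Python. It does not use TFMA, and the records are made-up illustrative values rather than rows from the InSDN dataset.

```python
from collections import defaultdict

# Hypothetical evaluation records: (Dst_Port, true label, predicted label).
records = [
    (80, "Normal", "Normal"),
    (80, "Web Attack", "Normal"),
    (443, "Normal", "Normal"),
    (8081, "DoS/DDoS Attack", "DoS/DDoS Attack"),
    (8081, "DoS/DDoS Attack", "DoS/DDoS Attack"),
]


def accuracy(rows):
    """Fraction of rows whose prediction matches the true label."""
    return sum(true == pred for _, true, pred in rows) / len(rows)


# Overall slice: the whole evaluation set.
overall = accuracy(records)

# One slice per distinct Dst_Port value.
by_port = defaultdict(list)
for row in records:
    by_port[row[0]].append(row)
sliced = {port: accuracy(rows) for port, rows in by_port.items()}

print(overall)
print(sliced)
```

A slice with few examples yields a noisy metric, which is why the example count per slice is worth checking alongside the sliced values.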

TensorFlow Model Analysis supports many other visualizations, such as Fairness Indicators and plotting a time series of model performance. To learn more, see [the tutorial](https://www.tensorflow.org/tfx/tutorials/model_analysis/tfma_basic).

Since we added thresholds to our config, validation output is also available. The presence of a `blessing` artifact indicates that our model passed validation. Since this is the first validation being performed, the candidate is automatically blessed.

In [None]:
blessing_uri = evaluator.outputs['blessing'].get()[0].uri
os.listdir(blessing_uri)  # portable listing; `!ls -l` may not be available on Windows

We can also verify the success by loading the validation result record:

In [None]:
PATH_TO_RESULT = evaluator.outputs['evaluation'].get()[0].uri
print(tfma.load_validation_result(PATH_TO_RESULT))

## 8) Pusher
The `Pusher` component is usually at the end of a TFX pipeline. It checks whether a model has passed validation, and if so, exports the model to `serving_model_dir`.

In [None]:
pusher = Pusher(
    model=trainer.outputs['model'],
    model_blessing=evaluator.outputs['blessing'],
    push_destination=pusher_pb2.PushDestination(
        filesystem=pusher_pb2.PushDestination.Filesystem(
            base_directory=serving_model_dir)))
context.run(pusher)

Let's examine the output artifacts of `Pusher`. 

In [None]:
pusher.outputs

In particular, the Pusher will export your model in the SavedModel format, which looks like this:

In [None]:
push_uri = pusher.outputs['pushed_model'].get()[0].uri
model = tf.saved_model.load(push_uri)

for item in model.signatures.items():
    pp.pprint(item)
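
When pushing to a filesystem destination, each push is typically written to its own version subdirectory under `base_directory`, which is the layout that model servers such as TensorFlow Serving expect. Assuming those subdirectories have purely numeric names (an assumption about the layout, not something shown in this notebook), the most recent export could be located with a helper like the hypothetical `latest_version_dir` below.

```python
import os
import tempfile


def latest_version_dir(base_directory):
    """Return the path of the numerically largest version subdirectory, or None."""
    versions = [name for name in os.listdir(base_directory)
                if name.isdigit() and os.path.isdir(os.path.join(base_directory, name))]
    if not versions:
        return None
    return os.path.join(base_directory, max(versions, key=int))


# Demonstration on a temporary directory with made-up version names.
with tempfile.TemporaryDirectory() as demo_dir:
    for version in ("1620000000", "1620003600", "notes"):
        os.makedirs(os.path.join(demo_dir, version))
    latest_name = os.path.basename(latest_version_dir(demo_dir))
    print(latest_name)
```

In this notebook the same helper would be called as `latest_version_dir(serving_model_dir)` after `Pusher` has run successfully.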

## We're finished :) 