# Submission 2: Machine Learning Pipeline - Apple Quality Predictor

**Name: Tilman Eldon**

**Dicoding Username: eldonn**

This notebook implements a machine learning pipeline with TensorFlow Extended (TFX) that orchestrates the workflow of an apple quality prediction project.

## IMPORT

This cell imports the required modules and functions. `os` and `sys` provide basic system operations, `logging` (from `absl`) records run information, and the `tfx` modules are used to build and run the pipeline.

In [1]:
import os
import sys
from typing import Text

from absl import logging
from tfx.orchestration import metadata, pipeline
from tfx.orchestration.beam.beam_dag_runner import BeamDagRunner

## SET VARIABLES

These variables set the pipeline name, the input data path, the paths of the transform and trainer modules, and the output paths where the model and the pipeline metadata are stored.

In [2]:
PIPELINE_NAME = "apple-quality-pipeline"

DATA_ROOT = "data"
TRANSFORM_MODULE_FILE = "modules/apple_quality_transform.py"
TRAINER_MODULE_FILE = "modules/apple_quality_trainer.py"

OUTPUT_BASE = "output"
serving_model_dir = os.path.join(OUTPUT_BASE, 'serving_model')
pipeline_root = os.path.join(OUTPUT_BASE, PIPELINE_NAME)
metadata_path = os.path.join(pipeline_root, "metadata.db")
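
As a quick check of how these settings resolve, the sketch below rebuilds the same paths with `os.path.join` and prints them. The helper name `resolve_pipeline_paths` is hypothetical and not part of the original notebook. The path separator in the result depends on the operating system.

```python
import os

PIPELINE_NAME = "apple-quality-pipeline"
OUTPUT_BASE = "output"


def resolve_pipeline_paths(output_base: str, pipeline_name: str) -> dict:
    """Hypothetical helper: derive the output paths used by the pipeline."""
    pipeline_root = os.path.join(output_base, pipeline_name)
    return {
        "serving_model_dir": os.path.join(output_base, "serving_model"),
        "pipeline_root": pipeline_root,
        # The ML Metadata SQLite file lives inside the pipeline root.
        "metadata_path": os.path.join(pipeline_root, "metadata.db"),
    }


paths = resolve_pipeline_paths(OUTPUT_BASE, PIPELINE_NAME)
for name, path in paths.items():
    print(f"{name}: {path}")
```

Keeping `metadata.db` under the pipeline root means that deleting that one directory resets both the artifacts and the metadata of the pipeline.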

## CREATE APPLE QUALITY PIPELINE INIT

This function initializes a TFX pipeline from the given components, the pipeline root, and the Beam arguments. The Beam arguments run the pipeline in multi-processing mode and let Beam choose the number of workers automatically.

The `__main__` block then sets the log level to INFO and imports `init_components` from the `apple_quality_components` module. It initializes the pipeline components from the data and modules defined earlier, then builds the pipeline and runs it with `BeamDagRunner`.
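
One detail worth illustrating about the Beam arguments: Python joins adjacent string literals, so a missing comma between two flags silently merges them into a single invalid flag. The sketch below is plain Python with no TFX dependency. `build_beam_args` and `check_flags` are hypothetical helper names, not part of the notebook or of Apache Beam.

```python
from typing import List


def build_beam_args(num_workers: int = 0) -> List[str]:
    """Hypothetical helper: DirectRunner flags for a local multi-process run."""
    return [
        "--direct_running_mode=multi_processing",
        # 0 lets Beam pick the worker count from the available CPUs.
        f"--direct_num_workers={num_workers}",
    ]


def check_flags(args: List[str]) -> None:
    """Reject flags that look merged or malformed."""
    for arg in args:
        if not arg.startswith("--") or arg.startswith("---"):
            raise ValueError(f"Malformed flag: {arg!r}")
        if arg.count("--") > 1:
            raise ValueError(f"Flags merged by a missing comma: {arg!r}")


# Adjacent literals without a comma become ONE string.
merged = [
    "--direct_running_mode=multi_processing"
    "--direct_num_workers=0"
]

beam_args = build_beam_args()
check_flags(beam_args)
print(beam_args)
print(len(merged))
```

A small check like `check_flags` is optional, but it can surface this kind of typo before the runner receives the arguments.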

In [3]:
def init_apple_quality_pipeline(
    components, pipeline_root: Text
) -> pipeline.Pipeline:
    """Build the TFX pipeline from the given components and pipeline root."""
    logging.info(f"Pipeline root set to: {pipeline_root}")
    beam_args = [
        "--direct_running_mode=multi_processing",
        # 0 auto-detects based on the number of CPUs available
        # during execution time.
        "--direct_num_workers=0",
    ]

    return pipeline.Pipeline(
        pipeline_name=PIPELINE_NAME,
        pipeline_root=pipeline_root,
        components=components,
        enable_cache=True,
        # Store ML Metadata in a local SQLite file.
        metadata_connection_config=metadata.sqlite_metadata_connection_config(
            metadata_path
        ),
        beam_pipeline_args=beam_args,
    )

if __name__ == "__main__":
    logging.set_verbosity(logging.INFO)
    
    from modules.apple_quality_components import init_components
    
    components = init_components(
        DATA_ROOT,
        training_module=TRAINER_MODULE_FILE,
        transform_module=TRANSFORM_MODULE_FILE,
        serving_model_dir=serving_model_dir,
    )
    
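    # Use a distinct name so the imported `pipeline` module is not shadowed.
    apple_quality_pipeline = init_apple_quality_pipeline(components, pipeline_root)
    BeamDagRunner().run(pipeline=apple_quality_pipeline)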

INFO:absl:Excluding no splits because exclude_splits is not set.
INFO:absl:Excluding no splits because exclude_splits is not set.
INFO:absl:Excluding no splits because exclude_splits is not set.
INFO:absl:Pipeline root set to: output\apple-quality-pipeline
INFO:absl:Generating ephemeral wheel package for 'd:\\Eldon\\Belajar\\Bangkit\\MLOps - Dicoding\\MLOps-FinalProject\\modules\\apple_quality_transform.py' (including modules: ['apple_quality_components', 'apple_quality_trainer', 'apple_quality_transform']).
INFO:absl:User module package has hash fingerprint version c31e895e1999f4ab835080d61d4799c1cfd9bdadd2c104f514fff0cce0907939.
INFO:absl:Executing: ['d:\\Eldon\\Belajar\\Bangkit\\MLOps - Dicoding\\mlops-tfx\\Scripts\\python.exe', 'C:\\Users\\nu085\\AppData\\Local\\Temp\\tmpft00frgt\\_tfx_generated_setup.py', 'bdist_wheel', '--bdist-dir', 'C:\\Users\\nu085\\AppData\\Local\\Temp\\tmpan_fo1pw', '--dist-dir', 'C:\\Users\\nu085\\AppData\\Local\\Temp\\tmpadvhuwem']
INFO:absl:Successfully b

INFO:absl:Node CsvExampleGen depends on [].
INFO:absl:Node CsvExampleGen is scheduled.
INFO:absl:Node Latest_blessed_model_resolver depends on [].
INFO:absl:Node Latest_blessed_model_resolver is scheduled.
INFO:absl:Node StatisticsGen depends on ['Run[CsvExampleGen]'].
INFO:absl:Node StatisticsGen is scheduled.
INFO:absl:Node SchemaGen depends on ['Run[StatisticsGen]'].
INFO:absl:Node SchemaGen is scheduled.
INFO:absl:Node ExampleValidator depends on ['Run[SchemaGen]', 'Run[StatisticsGen]'].
INFO:absl:Node ExampleValidator is scheduled.
INFO:absl:Node Transform depends on ['Run[CsvExampleGen]', 'Run[SchemaGen]'].
INFO:absl:Node Transform is scheduled.
INFO:absl:Node Trainer depends on ['Run[SchemaGen]', 'Run[Transform]'].
INFO:absl:Node Trainer is scheduled.
INFO:absl:Node Evaluator depends on ['Run[CsvExampleGen]', 'Run[Latest_blessed_model_resolver]', 'Run[Trainer]'].
INFO:absl:Node Evaluator is scheduled.
INFO:absl:Node Pusher depends on ['Run[Evaluator]', 'Run[Trainer]'].
INFO:absl
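
The dependency lines in this part of the log describe a directed acyclic graph. As a rough illustration (not how TFX schedules nodes internally), the sketch below copies those dependencies into a dictionary and uses Python's standard `graphlib` module to derive one valid execution order.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Node -> upstream nodes, copied from the "depends on" log lines.
dependencies = {
    "CsvExampleGen": [],
    "Latest_blessed_model_resolver": [],
    "StatisticsGen": ["CsvExampleGen"],
    "SchemaGen": ["StatisticsGen"],
    "ExampleValidator": ["SchemaGen", "StatisticsGen"],
    "Transform": ["CsvExampleGen", "SchemaGen"],
    "Trainer": ["SchemaGen", "Transform"],
    "Evaluator": ["CsvExampleGen", "Latest_blessed_model_resolver", "Trainer"],
    "Pusher": ["Evaluator", "Trainer"],
}

# static_order() yields each node only after all of its upstream nodes.
execution_order = list(TopologicalSorter(dependencies).static_order())
print(execution_order)
```

Several orders can satisfy the same graph, so the printed list is one valid ordering rather than the exact sequence the runner used.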

Instructions for updating:
Use ref() instead.
INFO:absl:Feature Acidity has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature Crunchiness has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature Juiciness has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature Quality has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature Ripeness has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature Size has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature Sweetness has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature Weight has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature Acidity has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature Crunchiness has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature Juiciness has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature Quality has a shape dim

INFO:tensorflow:Assets written to: output\apple-quality-pipeline\Transform\transform_graph\15\.temp_path\tftransform_tmp\9c25fa2b1c454e24aa63891d9eb4ab52\assets


INFO:tensorflow:struct2tensor is not available.
INFO:tensorflow:tensorflow_decision_forests is not available.
INFO:tensorflow:tensorflow_text is not available.
INFO:absl:Cleaning up stateless execution info.
INFO:absl:Execution 15 succeeded.
INFO:absl:Cleaning up stateful execution info.
INFO:absl:Publishing output artifacts defaultdict(<class 'list'>, {'post_transform_stats': [Artifact(artifact: uri: "output\\apple-quality-pipeline\\Transform\\post_transform_stats\\15"
, artifact_type: name: "ExampleStatistics"
properties {
  key: "span"
  value: INT
}
properties {
  key: "split_names"
  value: STRING
}
base_type: STATISTICS
)], 'pre_transform_schema': [Artifact(artifact: uri: "output\\apple-quality-pipeline\\Transform\\pre_transform_schema\\15"
, artifact_type: name: "Schema"
)], 'updated_analyzer_cache': [Artifact(artifact: uri: "output\\apple-quality-pipeline\\Transform\\updated_analyzer_cache\\15"
, artifact_type: name: "TransformCache"
)], 'post_transform_anomalies': [Artifact(artifact: uri: "output\\apple-quality-pipeline\\Transform\\post_transform_anomalies\\15"
, artifact_type: name: 

Model: "model"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
 Acidity_xf (InputLayer)        [(None, 1)]          0           []                               
                                                                                                  
 Crunchiness_xf (InputLayer)    [(None, 1)]          0           []                               
                                                                                                  
 Juiciness_xf (InputLayer)      [(None, 1)]          0           []                               
                                                                                                  
 Ripeness_xf (InputLayer)       [(None, 1)]          0           []                               
                                                                                              



INFO:tensorflow:Assets written to: output\apple-quality-pipeline\Trainer\model\16\Format-Serving\assets


Epoch 2/10
Epoch 2: val_loss did not improve from 0.24542
Epoch 3/10
Epoch 3: val_loss did not improve from 0.24542
Epoch 4/10
Epoch 4: val_loss did not improve from 0.24542
Epoch 5/10
Epoch 5: val_loss did not improve from 0.24542
Epoch 6/10
Epoch 6: val_loss did not improve from 0.24542
Epoch 7/10
Epoch 7: val_loss did not improve from 0.24542
Epoch 8/10
Epoch 8: val_loss improved from 0.24542 to 0.26352, saving model to output\apple-quality-pipeline\Trainer\model\16\Format-Serving




INFO:tensorflow:Assets written to: output\apple-quality-pipeline\Trainer\model\16\Format-Serving\assets


Epoch 9/10
Epoch 9: val_loss improved from 0.26352 to 0.28265, saving model to output\apple-quality-pipeline\Trainer\model\16\Format-Serving




INFO:tensorflow:Assets written to: output\apple-quality-pipeline\Trainer\model\16\Format-Serving\assets


Epoch 10/10
Epoch 10: val_loss improved from 0.28265 to 0.29908, saving model to output\apple-quality-pipeline\Trainer\model\16\Format-Serving




INFO:tensorflow:Assets written to: output\apple-quality-pipeline\Trainer\model\16\Format-Serving\assets


INFO:tensorflow:struct2tensor is not available.
INFO:tensorflow:tensorflow_decision_forests is not available.
INFO:tensorflow:tensorflow_text is not available.


INFO:tensorflow:Assets written to: output\apple-quality-pipeline\Trainer\model\16\Format-Serving\assets


You must install pydot (`pip install pydot`) and install graphviz (see instructions at https://graphviz.gitlab.io/download/) for plot_model to work.


INFO:absl:Training complete. Model written to output\apple-quality-pipeline\Trainer\model\16\Format-Serving. ModelRun written to output\apple-quality-pipeline\Trainer\model_run\16
INFO:absl:Cleaning up stateless execution info.
INFO:absl:Execution 16 succeeded.
INFO:absl:Cleaning up stateful execution info.
INFO:absl:Publishing output artifacts defaultdict(<class 'list'>, {'model_run': [Artifact(artifact: uri: "output\\apple-quality-pipeline\\Trainer\\model_run\\16"
, artifact_type: name: "ModelRun"
)], 'model': [Artifact(artifact: uri: "output\\apple-quality-pipeline\\Trainer\\model\\16"
, artifact_type: name: "Model"
base_type: MODEL
)]}) for execution 16
INFO:absl:MetadataStore with DB connection initialized
INFO:absl:node Trainer is finished.
INFO:absl:node Evaluator is running.
INFO:absl:Running launcher for node_info {
  type {
    name: "tfx.components.evaluator.component.Evaluator"
    base_type: EVALUATE
  }
  id: "Evaluator"
}
contexts {
  contexts {
    type {
      name: "p



INFO:absl:Using output\apple-quality-pipeline\Trainer\model\7\Format-Serving as baseline model.




INFO:absl:The 'example_splits' parameter is not set, using 'eval' split.
INFO:absl:Evaluating model.
INFO:absl:udf_utils.get_fn {'fairness_indicator_thresholds': 'null', 'eval_config': '{\n  "metrics_specs": [\n    {\n      "metrics": [\n        {\n          "class_name": "ExampleCount"\n        },\n        {\n          "class_name": "AUC"\n        },\n        {\n          "class_name": "FalsePositives"\n        },\n        {\n          "class_name": "TruePositives"\n        },\n        {\n          "class_name": "FalseNegatives"\n        },\n        {\n          "class_name": "TrueNegatives"\n        },\n        {\n          "class_name": "BinaryAccuracy",\n          "threshold": {\n            "change_threshold": {\n              "absolute": 0.0001,\n              "direction": "HIGHER_IS_BETTER"\n            },\n            "value_threshold": {\n              "lower_bound": 0.5\n            }\n          }\n        }\n      ]\n    }\n  ],\n  "model_specs": [\n    {\n      "label_key":
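
The `eval_config` in this log line attaches two thresholds to `BinaryAccuracy`: a `value_threshold` with `lower_bound` 0.5 and a `change_threshold` with `absolute` 0.0001 and direction `HIGHER_IS_BETTER`. The sketch below is a simplified, pure-Python reading of what such thresholds express. It is not the TFMA implementation: the function name `passes_thresholds` is hypothetical, the accuracy values are made up for illustration, and the inclusive comparisons are an assumption.

```python
def passes_thresholds(
    candidate: float,
    baseline: float,
    lower_bound: float = 0.5,
    min_improvement: float = 0.0001,
) -> bool:
    """Simplified check: the candidate must clear the absolute floor and
    beat the baseline by at least `min_improvement` (higher is better)."""
    meets_value_threshold = candidate >= lower_bound
    meets_change_threshold = (candidate - baseline) >= min_improvement
    return meets_value_threshold and meets_change_threshold


# Illustrative numbers only; they are not taken from the pipeline run.
print(passes_thresholds(candidate=0.90, baseline=0.88))
print(passes_thresholds(candidate=0.88, baseline=0.90))
print(passes_thresholds(candidate=0.45, baseline=0.40))
```

Combining both thresholds means a new model has to be good in absolute terms and also better than the model it would replace before it is marked as blessed.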

INFO:absl:Evaluation complete. Results written to output\apple-quality-pipeline\Evaluator\evaluation\17.
INFO:absl:Checking validation results.


Instructions for updating:
Use eager execution and:
`tf.data.TFRecordDataset(path)`
INFO:absl:Blessing result True written to output\apple-quality-pipeline\Evaluator\blessing\17.
INFO:absl:Cleaning up stateless execution info.
INFO:absl:Execution 17 succeeded.
INFO:absl:Cleaning up stateful execution info.
INFO:absl:Publishing output artifacts defaultdict(<class 'list'>, {'blessing': [Artifact(artifact: uri: "output\\apple-quality-pipeline\\Evaluator\\blessing\\17"
, artifact_type: name: "ModelBlessing"
)], 'evaluation': [Artifact(artifact: uri: "output\\apple-quality-pipeline\\Evaluator\\evaluation\\17"
, artifact_type: name: "ModelEvaluation"
)]}) for execution 17
INFO:absl:MetadataStore with DB connection initialized
INFO:absl:node Evaluator is finished.
INFO:absl:node Pusher is running.
INFO:absl:Running launcher for node_info {
  type {
    name: "tfx.components.pusher.component.Pusher"
    base_type: DEPLOY
  }
  id: "Pusher"
}
contexts {
  contexts {
    type {
      name: "pip