# Describing the Dataset

## Diabetes Dataset Attribute Information

The attributes of the diabetes detection dataset used here are described below:

- **Pregnancies**  
  Number of pregnancies.

- **Glucose**  
  Glucose level in the blood.

- **BloodPressure**  
  Blood pressure measurement.

- **SkinThickness**  
  Skinfold thickness.

- **Insulin**  
  Insulin level in the blood.

- **BMI**  
  Body mass index.

- **DiabetesPedigreeFunction**  
  A score that expresses diabetes risk based on family history (a unitless score rather than a percentage).

- **Age**  
  Age of the patient.

- **Outcome**  
  The target label:
  - `1` means the patient is **positive for diabetes (Yes)**.
  - `0` means the patient is **negative for diabetes (No)**.
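
As a quick illustration of how these attributes line up in a CSV file, the sketch below parses a few rows with the standard library and counts the `Outcome` classes. The rows are hypothetical values made up for this example, not records from the real dataset, and the helper names `load_rows` and `outcome_counts` are assumptions.

```python
import csv
import io
from collections import Counter

# Column order assumed to match the attribute list above.
EXPECTED_COLUMNS = [
    "Pregnancies", "Glucose", "BloodPressure", "SkinThickness",
    "Insulin", "BMI", "DiabetesPedigreeFunction", "Age", "Outcome",
]

# Hypothetical rows for illustration only (not taken from the real file).
SAMPLE_CSV = """Pregnancies,Glucose,BloodPressure,SkinThickness,Insulin,BMI,DiabetesPedigreeFunction,Age,Outcome
2,120,70,30,100,28.5,0.45,33,0
5,160,80,35,150,34.1,0.80,52,1
1,95,65,20,80,22.3,0.20,25,0
"""


def load_rows(text):
    """Parse CSV text and verify that the header matches the expected columns."""
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames != EXPECTED_COLUMNS:
        raise ValueError(f"Unexpected columns: {reader.fieldnames}")
    return list(reader)


def outcome_counts(rows):
    """Count how many rows carry each Outcome label (0 or 1)."""
    return Counter(int(row["Outcome"]) for row in rows)


rows = load_rows(SAMPLE_CSV)
counts = outcome_counts(rows)
print(counts)
```

Checking the header and the class balance before running a pipeline is a cheap way to catch a wrong file or a heavily skewed label.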


# Preparing the Libraries

## Library Imports for the Diabetes Prediction TFX Pipeline

- `import os`, `import sys`  
  Path handling and system utilities used while the pipeline runs.

- `import requests`, `import json`, `import base64`  
  Used for:
  - `requests`: sending HTTP requests, for example to the REST endpoint of the served model.
  - `json`: parsing and building JSON data.
  - `base64`: encoding and decoding data exchanged with the REST API.

- `from typing import Text`  
  Provides the `Text` type hint (an alias for `str`) so the pipeline code is tidier and easier to read.

- `import tensorflow as tf`  
  Builds, trains, and saves deep learning models with TensorFlow.

- `import tensorflow_model_analysis as tfma`  
  Evaluates the model with **TensorFlow Model Analysis (TFMA)**, including:
  - Computing metrics such as accuracy and AUC.
  - Running *slicing analysis* on selected features.
  - Auditing model fairness and performance.

- `from tfx.components import CsvExampleGen, StatisticsGen, SchemaGen, ExampleValidator, Transform, Tuner, Trainer, Evaluator, Pusher`  
  Imports the **main TFX pipeline components**:
  - `CsvExampleGen`: reads CSV data as the pipeline input.
  - `StatisticsGen`: computes statistics over the dataset.
  - `SchemaGen`: infers a data schema automatically.
  - `ExampleValidator`: detects missing values and anomalies.
  - `Transform`: preprocesses data with TensorFlow Transform.
  - `Tuner`: tunes the model hyperparameters.
  - `Trainer`: trains the TensorFlow model.
  - `Evaluator`: evaluates the model with TFMA.
  - `Pusher`: pushes the trained model to the serving directory.

- `from tfx.proto import example_gen_pb2, trainer_pb2, pusher_pb2`  
  Used to configure:
  - The data split (in `CsvExampleGen`).
  - Training (in `Trainer`).
  - Pushing the model to serving (in `Pusher`).

- `from tfx.types import Channel`  
  Defines *artifact channels* in the TFX pipeline.

- `from tfx.dsl.components.common.resolver import Resolver`  
  Retrieves a *previous best model* to act as the baseline when a new model is evaluated.

- `from tfx.dsl.input_resolution.strategies.latest_blessed_model_strategy import LatestBlessedModelStrategy`  
  The `Resolver` strategy that selects the *latest model that passed evaluation (blessed)*.

- `from tfx.types.standard_artifacts import Model, ModelBlessing`  
  Standard TFX artifacts:
  - `Model`: represents a trained model.
  - `ModelBlessing`: records the evaluation status of a model (blessed or not).

- `from absl import logging`  
  Structured, consistent logging while the pipeline runs.

- `from tfx.orchestration import metadata, pipeline`  
  Manages pipeline metadata and constructs the TFX pipeline.

- `from tfx.orchestration.beam.beam_dag_runner import BeamDagRunner`  
  Runs the TFX pipeline with the **Apache Beam** runner, either locally or in the cloud.


In [10]:
import os
import sys
import requests
import json
import base64
from typing import Text
import tensorflow as tf
import tensorflow_model_analysis as tfma
from tfx.components import (
    CsvExampleGen, 
    StatisticsGen, 
    SchemaGen, 
    ExampleValidator, 
    Transform, 
    Tuner,
    Trainer,
    Evaluator,
    Pusher
)
from tfx.proto import example_gen_pb2, trainer_pb2, pusher_pb2
from tfx.types import Channel
from tfx.dsl.components.common.resolver import Resolver
from tfx.types.standard_artifacts import Model, ModelBlessing
from tfx.dsl.input_resolution.strategies.latest_blessed_model_strategy import (
    LatestBlessedModelStrategy)
from absl import logging
from tfx.orchestration import metadata, pipeline
from tfx.orchestration.beam.beam_dag_runner import BeamDagRunner
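
The log output further down reports `tfx_version` `1.11.0`, and import paths such as the one for `BeamDagRunner` can differ between TFX releases. A minimal sketch for checking which versions are installed, using only the standard library, might look like this. The helper name `installed_version` is an assumption, and the package list is only a guess at what this notebook depends on.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Distribution names as published on PyPI; adjust to the actual environment.
for name in ("tfx", "tensorflow", "tensorflow-model-analysis", "apache-beam"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```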

# Preparing the Directories

In [11]:
# Name used to identify the TFX pipeline
PIPELINE_NAME = "diabetes-pipeline"

# Directory holding the input data for the pipeline
DATA_ROOT = "data"

# Path to the transform module with the preprocessing function for the TFX Transform component
TRANSFORM_MODULE_FILE = "./diabetes_transform.py"

# Path to the tuner module with the hyperparameter tuning function for the TFX Tuner component
TUNER_MODULE_FILE = "./diabetes_tuner.py"

# Path to the trainer module with the model training function for the TFX Trainer component
TRAINER_MODULE_FILE = "./diabetes_trainer.py"

# Base output directory for all pipeline artifacts
OUTPUT_BASE = "elchilz-pipelines"

# Directory where the pushed model is stored for serving
SERVING_MODEL_DIR = os.path.join('serving_model', PIPELINE_NAME)

# Pipeline root, where all artifacts are stored under the pipeline name
pipeline_root = os.path.join(OUTPUT_BASE, PIPELINE_NAME)

# Path of the SQLite metadata store used to track TFX artifacts and pipeline runs
metadata_path = os.path.join(pipeline_root, "metadata.sqlite")
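
Since the pipeline depends on the data directory and three module files being present, it can help to check for them before launching a run. The sketch below is one hedged way to do that. `find_missing` is a hypothetical helper, and the demo uses a temporary directory instead of the real project paths.

```python
import os
import tempfile


def find_missing(paths):
    """Return the subset of paths that do not exist on disk."""
    return [path for path in paths if not os.path.exists(path)]


# Demo in a temporary directory: one file is created, the other is not.
with tempfile.TemporaryDirectory() as workdir:
    present = os.path.join(workdir, "diabetes_transform.py")
    absent = os.path.join(workdir, "diabetes_tuner.py")
    open(present, "w").close()
    missing = find_missing([present, absent])
    print(missing)
```

In the notebook itself the call would presumably be `find_missing([DATA_ROOT, TRANSFORM_MODULE_FILE, TUNER_MODULE_FILE, TRAINER_MODULE_FILE])`, raising an error if the result is not empty.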

# Preparing the TFX Pipeline Components

In [12]:
def init_components(
    data_dir,
    transform_module,
    training_module,
    tuner_module,
    training_steps,
    eval_steps,
    serving_model_dir,
):
    """Initialize the TFX pipeline components.

    Args:
        data_dir (str): Path to the data directory.
        transform_module (str): Path to the transform module.
        training_module (str): Path to the training module.
        tuner_module (str): Path to the tuner module.
        training_steps (int): Number of training steps.
        eval_steps (int): Number of evaluation steps.
        serving_model_dir (str): Path to the serving model directory.

    Returns:
        Tuple of TFX components assembled for the pipeline.
    """

    # Configure the ExampleGen output so the data is split into train (80%) and eval (20%)
    output = example_gen_pb2.Output(
        split_config=example_gen_pb2.SplitConfig(splits=[
            example_gen_pb2.SplitConfig.Split(name="train", hash_buckets=8),
            example_gen_pb2.SplitConfig.Split(name="eval", hash_buckets=2)
        ])
    )

    # Read the CSV dataset from data_dir with CsvExampleGen
    example_gen = CsvExampleGen(
        input_base=data_dir,
        output_config=output
    )

    # Compute dataset statistics for data exploration and validation
    statistics_gen = StatisticsGen(
        examples=example_gen.outputs["examples"]
    )

    # Infer a schema automatically from the data statistics
    schema_gen = SchemaGen(
        statistics=statistics_gen.outputs["statistics"]
    )

    # Validate the data to detect missing values and anomalies
    example_validator = ExampleValidator(
        statistics=statistics_gen.outputs["statistics"],
        schema=schema_gen.outputs["schema"]
    )

    # Preprocess the dataset using the transform module
    transform = Transform(
        examples=example_gen.outputs["examples"],
        schema=schema_gen.outputs["schema"],
        module_file=os.path.abspath(transform_module)
    )

    # Run hyperparameter tuning with the tuner module
    tuner = Tuner(
        module_file=os.path.abspath(tuner_module),
        examples=transform.outputs["transformed_examples"],
        transform_graph=transform.outputs["transform_graph"],
        train_args=trainer_pb2.TrainArgs(num_steps=20), 
        eval_args=trainer_pb2.EvalArgs(num_steps=5)    
    )

    # Train the TensorFlow model with the trainer module and the best hyperparameters
    trainer = Trainer(
        module_file=os.path.abspath(training_module),
        examples=transform.outputs["transformed_examples"],
        transform_graph=transform.outputs["transform_graph"],
        schema=schema_gen.outputs["schema"],
        hyperparameters=tuner.outputs["best_hyperparameters"],
        train_args=trainer_pb2.TrainArgs(num_steps=training_steps),
        eval_args=trainer_pb2.EvalArgs(num_steps=eval_steps)
    )

    # Resolve the latest blessed model to use as the baseline during evaluation
    model_resolver = Resolver(
        strategy_class=LatestBlessedModelStrategy,
        model=Channel(type=Model),
        model_blessing=Channel(type=ModelBlessing)
    ).with_id('Latest_blessed_model_resolver')

    # Define the evaluation metrics for the model
    metrics_specs = [
        tfma.MetricsSpec(metrics=[
            tfma.MetricConfig(class_name="AUC"),
            tfma.MetricConfig(class_name="Precision"),
            tfma.MetricConfig(class_name="Recall"),
            tfma.MetricConfig(class_name="ExampleCount"),
            tfma.MetricConfig(
                class_name="BinaryAccuracy",
                threshold=tfma.MetricThreshold(
                    value_threshold=tfma.GenericValueThreshold(
                        lower_bound={"value": 0.5}
                    ),
                    change_threshold=tfma.GenericChangeThreshold(
                        direction=tfma.MetricDirection.HIGHER_IS_BETTER,
                        absolute={"value": 0.0001}
                    )
                )
            )
        ])
    ]

    # Build the evaluation config for TFMA
    eval_config = tfma.EvalConfig(
        model_specs=[tfma.ModelSpec(label_key="Outcome")],
        metrics_specs=metrics_specs
    )

    # Evaluate the model with the TFX Evaluator, using the baseline and the TFMA config
    evaluator = Evaluator(
        examples=example_gen.outputs["examples"],
        model=trainer.outputs["model"],
        baseline_model=model_resolver.outputs["model"],
        eval_config=eval_config
    )

    # Push the trained model to serving_model_dir if it passed evaluation
    pusher = Pusher(
        model=trainer.outputs["model"],
        model_blessing=evaluator.outputs["blessing"],
        push_destination=pusher_pb2.PushDestination(
            filesystem=pusher_pb2.PushDestination.Filesystem(
                base_directory=serving_model_dir
            )
        )
    )

    # Return the components as a tuple for use by the pipeline runner
    components = (
        example_gen,
        statistics_gen,
        schema_gen,
        example_validator,
        transform,
        tuner,
        trainer,
        model_resolver,
        evaluator,
        pusher
    )

    return components
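
The `hash_buckets=8` and `hash_buckets=2` settings in `init_components` assign each record to one of ten buckets, eight for `train` and two for `eval`, which gives roughly an 80/20 split. The sketch below imitates that idea with the standard library. It is an illustration of bucket-based splitting only: the SHA-256 hash, the record keys, and the `assign_split` helper are assumptions and do not reproduce the hashing that TFX uses internally.

```python
import hashlib


def assign_split(record_key, buckets=(("train", 8), ("eval", 2))):
    """Map a record key to a split name by hashing it into a bucket."""
    total = sum(size for _, size in buckets)
    digest = hashlib.sha256(record_key.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % total
    upper = 0
    for name, size in buckets:
        upper += size
        if bucket < upper:
            return name
    raise AssertionError("bucket index out of range")


# Hypothetical record keys, used only to show the resulting proportions.
counts = {"train": 0, "eval": 0}
for i in range(10_000):
    counts[assign_split(f"record-{i}")] += 1
print(counts)
```

Because the assignment depends only on the hashed key, the same record always lands in the same split, which keeps the split stable across runs.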

In [13]:
def init_local_pipeline(
    components, pipeline_root: Text
) -> pipeline.Pipeline:
    """Initialize a local TFX pipeline to be run with the Beam runner.

    Args:
        components: Sequence of TFX components to run in the pipeline.
        pipeline_root (Text): Directory path for storing pipeline artifacts.

    Returns:
        pipeline.Pipeline: A TFX pipeline object ready to run with BeamDagRunner.
    """
    logging.info(f"Pipeline root set to: {pipeline_root}")
    # Each Beam flag must be its own list element, separated by a comma
    beam_args = [
        "--direct_running_mode=multi_processing",
        # 0 asks Beam to use one worker per available core
        "--direct_num_workers=0",
    ]

    return pipeline.Pipeline(
        pipeline_name=PIPELINE_NAME,
        pipeline_root=pipeline_root,
        components=components,
        enable_cache=True,
        metadata_connection_config=metadata.sqlite_metadata_connection_config(
            metadata_path
        ),
        beam_pipeline_args=beam_args
    )
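
Beam pipeline arguments are passed as a list with one flag per element. A detail worth illustrating is that Python silently joins adjacent string literals when the comma between them is missing, so two flags can collapse into a single malformed one. The short sketch below shows the difference; the variable names `merged` and `separate` are only for this example.

```python
# Adjacent string literals without a comma are concatenated into ONE string.
merged = [
    "--direct_running_mode=multi_processing"
    "--direct_num_workers=0"
]

# With a comma, each flag is its own list element.
separate = [
    "--direct_running_mode=multi_processing",
    "--direct_num_workers=0",
]

print(len(merged), merged)
print(len(separate), separate)
```

A trailing comma after the last element, as in `separate`, makes this mistake harder to introduce when more flags are added later.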

# Running the Pipeline Components

In [14]:
# Set the logging level so that information is shown while the pipeline runs
logging.set_verbosity(logging.INFO)

# Initialize the TFX pipeline components
# init_components returns a tuple of TFX components:
# CsvExampleGen, StatisticsGen, SchemaGen, ExampleValidator, Transform, Tuner, Trainer, Resolver, Evaluator, Pusher
components = init_components(
    DATA_ROOT,
    training_module=TRAINER_MODULE_FILE,
    tuner_module=TUNER_MODULE_FILE,
    transform_module=TRANSFORM_MODULE_FILE,
    training_steps=5000,
    eval_steps=1000,
    serving_model_dir=SERVING_MODEL_DIR,
)

# Build the local TFX pipeline for Beam
# (named local_pipeline so it does not shadow the imported `pipeline` module)
local_pipeline = init_local_pipeline(components, pipeline_root)

# Run the TFX pipeline with BeamDagRunner
BeamDagRunner().run(pipeline=local_pipeline)

Trial 17 Complete [00h 00m 10s]
val_binary_accuracy: 0.7906249761581421

Best val_binary_accuracy So Far: 0.809374988079071
Total elapsed time: 00h 01m 32s
INFO:tensorflow:Oracle triggered exit
INFO:absl:Finished tuning... Tuner ID: tuner0
INFO:absl:Best HyperParameters: {'space': [{'class_name': 'Choice', 'config': {'name': 'learning_rate', 'default': 0.01, 'conditions': [], 'values': [0.01, 0.001, 0.0001], 'ordered': True}}, {'class_name': 'Int', 'config': {'name': 'units_layer1', 'default': None, 'conditions': [], 'min_value': 64, 'max_value': 256, 'step': 32, 'sampling': None}}, {'class_name': 'Int', 'config': {'name': 'units_layer2', 'default': None, 'conditions': [], 'min_value': 32, 'max_value': 128, 'step': 16, 'sampling': None}}, {'class_name': 'Int', 'config': {'name': 'units_layer3', 'default': None, 'conditions': [], 'min_value': 16, 'max_value': 64, 'step': 16, 'sampling': None}}, {'class_name': 'Choice', 'config': {'name': 'dropout_rate', 'default': 0.2, 'conditions': [], 'values': [0.2, 0.3, 0.4], 'ordered': True}}, {'class_name': 'Choice', 'config': {'name': 'l2_reg', 'default': 0.001, 'conditions': [], 'values': [0.001, 0

Results summary
Results in elchilz-pipelines\diabetes-pipeline\Tuner\.system\executor_execution\7\.temp\7\kt_hyperband
Showing 10 best trials
<keras_tuner.engine.objective.Objective object at 0x00000189BCD62CD0>
Trial summary
Hyperparameters:
learning_rate: 0.01
units_layer1: 96
units_layer2: 80
units_layer3: 64
dropout_rate: 0.4
l2_reg: 0.0001
tuner/epochs: 50
tuner/initial_epoch: 7
tuner/bracket: 1
tuner/round: 1
tuner/trial_id: 0002
Score: 0.809374988079071
Trial summary
Hyperparameters:
learning_rate: 0.01
units_layer1: 160
units_layer2: 96
units_layer3: 64
dropout_rate: 0.2
l2_reg: 0.001
tuner/epochs: 50
tuner/initial_epoch: 7
tuner/bracket: 1
tuner/round: 1
tuner/trial_id: 0005
Score: 0.8062499761581421
Trial summary
Hyperparameters:
learning_rate: 0.01
units_layer1: 128
units_layer2: 64
units_layer3: 64
dropout_rate: 0.4
l2_reg: 0.001
tuner/epochs: 50
tuner/initial_epoch: 0
tuner/bracket: 0
tuner/round: 0
Score: 0.8062499761581421
Trial summary
Hyperparameters:
learning_rate: 0.

INFO:absl:Execution 7 succeeded.
INFO:absl:Cleaning up stateful execution info.
INFO:absl:Publishing output artifacts defaultdict(<class 'list'>, {'best_hyperparameters': [Artifact(artifact: uri: "elchilz-pipelines\\diabetes-pipeline\\Tuner\\best_hyperparameters\\7"
, artifact_type: name: "HyperParameters"
)], 'tuner_results': [Artifact(artifact: uri: "elchilz-pipelines\\diabetes-pipeline\\Tuner\\tuner_results\\7"
, artifact_type: name: "TunerResults"
)]}) for execution 7
INFO:absl:MetadataStore with DB connection initialized
INFO:absl:node Tuner is finished.
INFO:absl:node Trainer is running.
INFO:absl:Running launcher for node_info {
  type {
    name: "tfx.components.trainer.component.Trainer"
    base_type: TRAIN
  }
  id: "Trainer"
}
contexts {
  contexts {
    type {
      name: "pipeline"
    }
    name {
      field_value {
        string_value: "diabetes-pipeline"
      }
    }
  }
  contexts {
    type {
      name: "pipeline_run"
    }
    name {
      field_value {
        

INFO:absl:MetadataStore with DB connection initialized
INFO:absl:Going to run a new execution 8
INFO:absl:Going to run a new execution: ExecutionInfo(execution_id=8, input_dict={'schema': [Artifact(artifact: id: 3
type_id: 20
uri: "elchilz-pipelines\\diabetes-pipeline\\SchemaGen\\schema\\4"
custom_properties {
  key: "is_external"
  value {
    int_value: 0
  }
}
custom_properties {
  key: "state"
  value {
    string_value: "published"
  }
}
custom_properties {
  key: "tfx_version"
  value {
    string_value: "1.11.0"
  }
}
state: LIVE
create_time_since_epoch: 1751460129542
last_update_time_since_epoch: 1751460129542
, artifact_type: id: 20
name: "Schema"
)], 'examples': [Artifact(artifact: id: 8
type_id: 16
uri: "elchilz-pipelines\\diabetes-pipeline\\Transform\\transformed_examples\\5"
properties {
  key: "split_names"
  value {
    string_value: "[\"eval\", \"train\"]"
  }
}
custom_properties {
  key: "is_external"
  value {
    int_value: 0
  }
}
custom_properties {
  key: "state"


INFO:absl:Train on the 'train' split when train_args.splits is not set.
INFO:absl:Evaluate on the 'eval' split when eval_args.splits is not set.
INFO:absl:udf_utils.get_fn {'eval_args': '{\n  "num_steps": 1000\n}', 'custom_config': 'null', 'train_args': '{\n  "num_steps": 5000\n}', 'module_path': 'diabetes_trainer@elchilz-pipelines\\diabetes-pipeline\\_wheels\\tfx_user_code_trainer-0.0+95356d5fdd2380bbb48961feeef68363a2b8b0614e4b3f63e069600b413c969c-py3-none-any.whl'} 'run_fn'
INFO:absl:Installing 'elchilz-pipelines\\diabetes-pipeline\\_wheels\\tfx_user_code_trainer-0.0+95356d5fdd2380bbb48961feeef68363a2b8b0614e4b3f63e069600b413c969c-py3-none-any.whl' to a temporary directory.
INFO:absl:Executing: ['C:\\Users\\USER\\anaconda3\\envs\\mlops\\python.exe', '-m', 'pip', 'install', '--target', 'C:\\Users\\USER\\AppData\\Local\\Temp\\tmp36c5eqg0', 'elchilz-pipelines\\diabetes-pipeline\\_wheels\\tfx_user_code_trainer-0.0+95356d5fdd2380bbb48961feeef68363a2b8b0614e4b3f63e069600b413c969c-py3-none

Model: "model_1"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
 Pregnancies_xf (InputLayer)    [(None, 1)]          0           []                               
                                                                                                  
 Glucose_xf (InputLayer)        [(None, 1)]          0           []                               
                                                                                                  
 BloodPressure_xf (InputLayer)  [(None, 1)]          0           []                               
                                                                                                  
 SkinThickness_xf (InputLayer)  [(None, 1)]          0           []                               
                                                                                            

Epoch 23/50
Epoch 24/50
Epoch 25/50
Epoch 26/50
Epoch 27/50
Epoch 28/50
Epoch 29/50
Epoch 30/50
Epoch 31/50
Epoch 32/50
Epoch 33/50
Epoch 34/50
Epoch 35/50
Epoch 36/50
Epoch 37/50
Epoch 38/50
Epoch 39/50
Epoch 40/50
Epoch 41/50
Epoch 42/50
Epoch 43/50
Epoch 44/50
Epoch 45/50
Epoch 46/50
Epoch 47/50
Epoch 48/50
Epoch 49/50
Epoch 50/50
INFO:tensorflow:struct2tensor is not available.
INFO:tensorflow:tensorflow_decision_forests is not available.
INFO:tensorflow:tensorflow_text is not available.
INFO:tensorflow:Assets written to: elchilz-pipelines\diabetes-pipeline\Trainer\model\8\Format-Serving\assets
INFO:absl:Training complete. Model written to elchilz-pipelines\diabetes-pipeline\Trainer\model\8\Format-Serving. ModelRun written to elchilz-pipelines\diabetes-pipeline\Trainer\model_run\8
INFO:absl:Cleaning up stateless execution info.
INFO:absl:Execution 8 succeeded.
INFO:absl:Cleaning up stateful execution info.
INFO:absl:Publishing output artifacts defaultdict(<class 'list'>, {'model': [Artifact(artifact: uri: "elchilz-pipelines\\diabetes-pipeline\\Trainer\\model\\8"
, artifact_type: name: "Model"
base_type: MODEL
)], 'model_run': [Artifact(artifact: uri: "elchilz-pipelines\\diabetes-pipeline\\Trainer\\model_run\\8"
, artifact_type: name: "ModelRun"
)]}) for execution 8
INFO:absl:MetadataStore with DB connection initialized
INFO:absl:node Trainer is finished.
INFO:absl:node Evaluator is running.
INFO:absl:Running launcher for node_info {
  type {
    name: "tfx.components.e

INFO:absl:udf_utils.get_fn {'fairness_indicator_thresholds': 'null', 'example_splits': 'null', 'eval_config': '{\n  "metrics_specs": [\n    {\n      "metrics": [\n        {\n          "class_name": "AUC"\n        },\n        {\n          "class_name": "Precision"\n        },\n        {\n          "class_name": "Recall"\n        },\n        {\n          "class_name": "ExampleCount"\n        },\n        {\n          "class_name": "BinaryAccuracy",\n          "threshold": {\n            "change_threshold": {\n              "absolute": 0.0001,\n              "direction": "HIGHER_IS_BETTER"\n            },\n            "value_threshold": {\n              "lower_bound": 0.5\n            }\n          }\n        }\n      ]\n    }\n  ],\n  "model_specs": [\n    {\n      "label_key": "Outcome"\n    }\n  ]\n}'} 'custom_eval_shared_model'
INFO:absl:Request was made to ignore the baseline ModelSpec and any change thresholds. This is likely because a baseline model was not provided: updated_config=
INFO:absl:The 'example_splits' parameter is not set, using 'eval' split.
INFO:absl:Evaluating model.
INFO:absl:udf_utils.get_fn {'fairness_indicator_thresholds': 'null', 'example_splits': 'null', 'eval_config': '{\n  "metrics_specs": [\n    {\n      "metrics": [\n        {\n          "class_name": "AUC"\n        },\n        {\n          "class_name": "Precision"\n        },\n        {\n          "class_name": "Recall"\n        },\n        {\n          "class_name": "ExampleCount"\n        },\n        {\n          "class_name": "BinaryAccuracy",\n          "threshold": {\n            "change_threshold": {\n              "absolute": 0.0001,\n              "direction": "HIGHER_IS_BETTER"\n            },\n            "value_threshold": {\n              "lower_bound": 0.5\n            }\n          }\n        }\n      ]\n    }\n  ],\n  "model_specs": [\n    {\n      "label_key": "Outcome"\n    }\n  ]\n}'} 'custom_extractors'
INFO:absl:Request was made to ignore the baseline ModelSpec and any
INFO:absl:Evaluation complete. Results written to elchilz-pipelines\diabetes-pipeline\Evaluator\evaluation\9.
INFO:absl:Checking validation results.
INFO:absl:Blessing result True written to elchilz-pipelines\diabetes-pipeline\Evaluator\blessing\9.
INFO:absl:Cleaning up stateless execution info.
INFO:absl:Execution 9 succeeded.
INFO:absl:Cleaning up stateful execution info.
INFO:absl:Publishing output artifacts defaultdict(<class 'list'>, {'evaluation': [Artifact(artifact: uri: "elchilz-pipelines\\diabetes-pipeline\\Evaluator\\evaluation\\9"
, artifact_type: name: "ModelEvaluation"
)], 'blessing': [Artifact(artifact: uri: "elchilz-pipelines\\diabetes-pipeline\\Evaluator\\blessing\\9"
, artifact_type: name: "ModelBlessing"
)]}) for execution 9
INFO:absl:MetadataStore with DB connection initialized
INFO:absl:node Evaluator is finished.
INFO:absl:node Pusher is running.
INFO:absl:Running launcher for node_info {
  type {
    name: "tfx.components.pusher.component.Pusher"
    base_type: DE

INFO:absl:Model version: 1751461032
INFO:absl:Model written to serving path serving_model\diabetes-pipeline\1751461032.
INFO:absl:Model pushed to elchilz-pipelines\diabetes-pipeline\Pusher\pushed_model\10.
INFO:absl:Cleaning up stateless execution info.
INFO:absl:Execution 10 succeeded.
INFO:absl:Cleaning up stateful execution info.
INFO:absl:Publishing output artifacts defaultdict(<class 'list'>, {'pushed_model': [Artifact(artifact: uri: "elchilz-pipelines\\diabetes-pipeline\\Pusher\\pushed_model\\10"
, artifact_type: name: "PushedModel"
base_type: MODEL
)]}) for execution 10
INFO:absl:MetadataStore with DB connection initialized
INFO:absl:node Pusher is finished.
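
The log above shows the Evaluator ignoring the change threshold because no baseline model was available, then writing a blessing result of `True`. The configured thresholds can be pictured with the sketch below: a model needs a `BinaryAccuracy` of at least 0.5, and when a baseline exists it must also improve on it by at least 0.0001. This is a simplified reading of the configuration, not TFMA's implementation. The function `is_blessed`, the inclusive comparisons, and the accuracy values are assumptions.

```python
def is_blessed(candidate, baseline=None, lower_bound=0.5, min_improvement=0.0001):
    """Decide whether a candidate accuracy would pass the configured thresholds."""
    # Value threshold: the metric itself must reach the lower bound.
    if candidate < lower_bound:
        return False
    # Without a baseline (first run), the change threshold is skipped.
    if baseline is None:
        return True
    # Change threshold (higher is better): require a minimum absolute gain.
    return (candidate - baseline) >= min_improvement


# Hypothetical accuracy values, for illustration only.
print(is_blessed(0.80))                 # first run, no baseline
print(is_blessed(0.81, baseline=0.80))  # improves on the baseline
print(is_blessed(0.80, baseline=0.80))  # no improvement
print(is_blessed(0.45))                 # below the lower bound
```

Under this reading, a later run whose model merely matches the previous blessed model would not be pushed, since it does not clear the change threshold.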


# Displaying the Model Evaluation Metrics

In [17]:
# Load the evaluation result written by the TFX Evaluator run above
# from its output URI using TFMA
tfma_result = tfma.load_eval_result(r'elchilz-pipelines\diabetes-pipeline\Evaluator\evaluation\9')

# Display the evaluation metrics interactively
# with the TFMA slicing metrics widget to analyze model performance
tfma.view.render_slicing_metrics(tfma_result)

# Display Fairness Indicators interactively to inspect model fairness,
# which is useful when the dataset has sensitive attributes to audit
tfma.addons.fairness.view.widget_view.render_fairness_indicator(
    tfma_result
)

FairnessIndicatorViewer(slicingMetrics=[{'sliceValue': 'Overall', 'slice': 'Overall', 'metrics': {'binary_accu…

# Making a Prediction

In [None]:
# Build the raw_data dictionary used to request a prediction
# from the trained diabetes detection model
raw_data = {
    "Pregnancies": 6,                   # Number of pregnancies
    "Glucose": 148,                     # Glucose level in the blood
    "BloodPressure": 72,                # Blood pressure
    "SkinThickness": 35,                # Skinfold thickness
    "Insulin": 10,                      # Insulin level in the blood
    "BMI": 33.6,                        # Body mass index
    "DiabetesPedigreeFunction": 0.627,  # Family-history risk score
    "Age": 50                           # Age of the patient
}

In [None]:
# Declare which features are integers and which are floats,
# so each input value is encoded into the right tf.train.Feature type

int_features = ["Pregnancies", "Glucose", "BloodPressure", "SkinThickness", "Insulin", "Age"]
float_features = ["BMI", "DiabetesPedigreeFunction"]

feature_dict = {}

# Iterate over raw_data and fill feature_dict according to each feature's type
for key, value in raw_data.items():
    if key in int_features:
        feature_dict[key] = tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))
    elif key in float_features:
        feature_dict[key] = tf.train.Feature(float_list=tf.train.FloatList(value=[value]))
    else:
        raise ValueError(f"Unknown feature: {key}")

# Build a tf.train.Example holding all features; it is serialized in the next cell
# so it can be sent to the TensorFlow model for prediction
example = tf.train.Example(features=tf.train.Features(feature=feature_dict))
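
The encoding loop above rejects unknown feature names, but it does not notice a feature that is missing from `raw_data`. A small pre-check, sketched here with plain Python, could report both cases along with obvious type mismatches. The helper `check_record` and its messages are hypothetical and are not part of the original notebook.

```python
INT_FEATURES = ["Pregnancies", "Glucose", "BloodPressure", "SkinThickness", "Insulin", "Age"]
FLOAT_FEATURES = ["BMI", "DiabetesPedigreeFunction"]


def check_record(record):
    """Return a list of problems found in a raw input record (empty if none)."""
    problems = []
    expected = set(INT_FEATURES) | set(FLOAT_FEATURES)
    for name in sorted(expected - set(record)):
        problems.append(f"missing feature: {name}")
    for name in sorted(set(record) - expected):
        problems.append(f"unknown feature: {name}")
    for name, value in record.items():
        # bool is a subclass of int, so exclude it explicitly.
        is_int = isinstance(value, int) and not isinstance(value, bool)
        is_number = is_int or isinstance(value, float)
        if name in INT_FEATURES and not is_int:
            problems.append(f"{name} should be an int, got {type(value).__name__}")
        elif name in FLOAT_FEATURES and not is_number:
            problems.append(f"{name} should be a number, got {type(value).__name__}")
    return problems


good = {
    "Pregnancies": 6, "Glucose": 148, "BloodPressure": 72, "SkinThickness": 35,
    "Insulin": 10, "BMI": 33.6, "DiabetesPedigreeFunction": 0.627, "Age": 50,
}
print(check_record(good))

incomplete = {key: value for key, value in good.items() if key != "Age"}
print(check_record(incomplete))
```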

In [None]:
# Serialize the tf.train.Example to a byte string
serialized_example_bytes = example.SerializeToString()

# Encode the byte string as base64 (an encoding, not encryption) so it can be sent in a JSON POST request
encoded_example_string = base64.b64encode(serialized_example_bytes).decode('utf-8')

# Build the JSON payload in the format expected by the TensorFlow Serving REST API
json_data = json.dumps({
    "instances": [{"examples": {"b64": encoded_example_string}}]
})

# Send a POST request to the endpoint of the deployed model
response = requests.post(
    'https://mlops2-production-fce5.up.railway.app/v1/models/cc-model:predict',
    data=json_data
)

# Parse the JSON body of the server response
prediction_output = response.json()

if 'predictions' in prediction_output and len(prediction_output['predictions']) > 0:
    probability = prediction_output['predictions'][0][0]

    # Classification threshold for diabetes
    threshold = 0.5

    # Classify based on the threshold
    if probability >= threshold:
        result = "Diabetes"
    else:
        result = "Not diabetes"

    # Show the predicted probability and the classification
    print(f"Predicted probability: {probability}")
    print(f"Classification result: {result}")

else:
    # If no prediction was returned, print the response for debugging
    print("Failed to get a prediction, or the response format was unexpected.")
    print(prediction_output)
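
The request and response handling above can be separated into two small functions that are easy to check without a running server. The sketch below does this with the standard library only. `build_payload` and `classify` are hypothetical helpers, the placeholder bytes stand in for a real serialized `tf.train.Example`, and the sample responses are made up for illustration.

```python
import base64
import json


def build_payload(serialized_example: bytes) -> str:
    """Wrap serialized example bytes in the JSON body used for the predict call."""
    encoded = base64.b64encode(serialized_example).decode("utf-8")
    return json.dumps({"instances": [{"examples": {"b64": encoded}}]})


def classify(response_body: dict, threshold: float = 0.5):
    """Return (probability, label) from a response, or None if it has no predictions."""
    predictions = response_body.get("predictions")
    if not predictions:
        return None
    probability = float(predictions[0][0])
    label = "Diabetes" if probability >= threshold else "Not diabetes"
    return probability, label


# Placeholder bytes, standing in for example.SerializeToString().
placeholder = b"\x0a\x00"
payload = build_payload(placeholder)

# The base64 step is reversible, so the original bytes can be recovered.
decoded = base64.b64decode(json.loads(payload)["instances"][0]["examples"]["b64"])
print(decoded == placeholder)

# Made-up responses showing the success and failure paths.
print(classify({"predictions": [[0.73]]}))
print(classify({"error": "example error message"}))
```

Keeping the parsing in its own function also makes it simple to try a different threshold than 0.5 if the application values recall over precision.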