
# WORK FOR: 05 Prediction - TRITON SERVER

Working on Ensemble

In [621]:
BUCKET

'statmike-mlops-349915'

---
### Ensemble: Pipeline Instances To All Models And Versions

Triton Inference Server provides an ensemble model abstraction, specified with `platform: "ensemble"` in a `config.pbtxt` file.  An ensemble is added to a model repository like any other model:

```
    <model-repository-path>/
        <model-name>/
            [config.pbtxt]
            [<output-labels-file> ...]
            <version>/
                <model-definition-file>
            <version>/
                <model-definition-file>
            ...
        <model-name>/
            [config.pbtxt]
            [<output-labels-file> ...]
            <version>/
                <model-definition-file>
            <version>/
                <model-definition-file>
        <ensemble-name>/
            [config.pbtxt]
            <version>/
                empty
            ...
        ...
```
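
As a rough illustration of this layout, the sketch below creates the same structure on a local filesystem with `pathlib`. The repository path and the names `model_a`, `model_b`, and `ensemble` are placeholders chosen for the example, and the model definition files themselves are left out.

```python
import tempfile
from pathlib import Path

# Hypothetical names for illustration only; real model definition files are omitted.
repo = Path(tempfile.mkdtemp()) / "model_repo"
layout = {
    "model_a": ["1", "2"],   # two versions of a regular model
    "model_b": ["1"],
    "ensemble": ["1"],       # the ensemble's version folder stays empty
}

for model_name, versions in layout.items():
    for version in versions:
        (repo / model_name / version).mkdir(parents=True)
    # placeholder config; the real content depends on the model
    (repo / model_name / "config.pbtxt").write_text(f'name: "{model_name}"\n')

created = sorted(p.relative_to(repo).as_posix() for p in repo.rglob("*"))
print("\n".join(created))
```

On a local filesystem the empty `ensemble/1` directory exists on its own, which is not necessarily true for object storage.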

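The ensemble model specification consists mainly of an `ensemble_scheduling` block: a series of steps, each of which maps ensemble tensors to a model's inputs (`input_map`) and the model's outputs back to ensemble tensors (`output_map`), so that the outputs of one step can feed the inputs of a later step.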

Reference [Ensemble Models](https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/user_guide/architecture.html?highlight=ensemble#ensemble-models)

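List of input feature names used to construct the ensemble.  For this model, every input feature has the same data type, `FP32`, and the same shape: `[1]` per instance in the configuration, which becomes `[1, 1]` for a single-instance request once the batch dimension is included.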

In [622]:
feature_names = ['Time', 'V1', 'V2', 'V3', 'V4', 'V5', 'V6', 'V7', 'V8', 'V9', 'V10', 'V11', 'V12', 'V13', 'V14', 'V15', 'V16', 'V17', 'V18', 'V19', 'V20', 'V21', 'V22', 'V23', 'V24', 'V25', 'V26', 'V27', 'V28', 'Amount']
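
To make the shape concrete, here is a minimal sketch of a request body in the KServe v2 inference protocol JSON format, which Triton's HTTP endpoint accepts. The feature values are invented (all zeros), and only a short subset of the feature names is used to keep the example small. With the batch dimension included, each input has shape `[1, 1]`.

```python
import json

# Shortened, illustrative subset of the feature names; values are invented.
example_features = ['Time', 'V1', 'V2', 'Amount']
instance = {name: 0.0 for name in example_features}

request_body = {
    "inputs": [
        {
            "name": name,
            "shape": [1, 1],        # [batch size, 1 value per feature]
            "datatype": "FP32",
            "data": [value],
        }
        for name, value in instance.items()
    ]
}

print(json.dumps(request_body, indent=2)[:200])
```

A real request would carry one entry for each of the 30 names in `feature_names`.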

Start the `config.pbtxt` construction with a string representing the header:

In [623]:
ensemble_all = f"""name: "ensemble"
platform: "ensemble"
max_batch_size: 4"""

print(ensemble_all)

name: "ensemble"
platform: "ensemble"
max_batch_size: 4


Add the `input` specification for the ensemble using the feature names:

In [624]:
for n, name in enumerate(feature_names):
    if n == 0:
        ensemble_all += "\ninput ["
    else:
        ensemble_all += ","
    ensemble_all += f"""
    {{
        name: "{name}"
        data_type: TYPE_FP32
        dims: [ 1 ]
    }}"""
    if n == len(feature_names) - 1:
        ensemble_all += "\n]"

print(ensemble_all[0:150], f'\n\n\n<{len(ensemble_all)-300} characters hidden>\n\n\n', ensemble_all[-150:])

name: "ensemble"
platform: "ensemble"
max_batch_size: 4
input [
    {
        name: "Time"
        data_type: TYPE_FP32
        dims: [ 1 ]
    },
    


<2219 characters hidden>


 e: "V28"
        data_type: TYPE_FP32
        dims: [ 1 ]
    },
    {
        name: "Amount"
        data_type: TYPE_FP32
        dims: [ 1 ]
    }
]
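
The same block could also be assembled with `str.join`, which avoids tracking the first and last loop positions. This is only an alternative sketch; `input_block` and the shortened `example_features` list are names introduced here for illustration.

```python
# Illustrative subset; the notebook's full list is `feature_names`.
example_features = ['Time', 'V1', 'Amount']

entries = [
    f"""
    {{
        name: "{name}"
        data_type: TYPE_FP32
        dims: [ 1 ]
    }}"""
    for name in example_features
]

# commas go between entries only, never after the last one
input_block = "\ninput [" + ",".join(entries) + "\n]"
print(input_block)
```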


Add the `output` specification for the ensemble using the model names and versions:

In [625]:
for m, model in enumerate(models_artifacts[0:1]):
    if m == 0:
        ensemble_all += "\noutput ["
    
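    for v, version in enumerate(model[1][0:1]):
        # list entries need a comma separator once more than one output is added
        if m > 0 or v > 0:
            ensemble_all += ","
        ensemble_all += f"""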
    {{
        name: "logistic_for_{model[0].display_name}_{version[0]}"
        data_type: TYPE_FP32
        dims: [ 2 ]
    }}"""
    
    if m == len(models_artifacts[0:1]) - 1:
        ensemble_all += "\n]"

print(ensemble_all[0:150], f'\n\n\n<{len(ensemble_all)-300} characters hidden>\n\n\n', ensemble_all[-150:])

name: "ensemble"
platform: "ensemble"
max_batch_size: 4
input [
    {
        name: "Time"
        data_type: TYPE_FP32
        dims: [ 1 ]
    },
    


<2328 characters hidden>


 pe: TYPE_FP32
        dims: [ 1 ]
    }
]
output [
    {
        name: "logistic_for_05_05_2"
        data_type: TYPE_FP32
        dims: [ 2 ]
    }
]


Build the `ensemble_scheduling` specification.  Each step needs one `input_map` entry per input feature, so this block grows quickly with the number of models, versions, and features.

In [626]:
for m, model in enumerate(models_artifacts[0:1]):
    if m == 0:
        ensemble_scheduling = """
ensemble_scheduling {
    step ["""
    
    for v, version in enumerate(model[1][0:1]):
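        if m > 0 or v > 0:
            # steps are elements of the `step [...]` list, so separate them with commas
            ensemble_scheduling += ","
        
        for n, name in enumerate(feature_names):
            if n == 0:
                input_map = ""
            # each `input_map {...}` entry is its own field, so no comma separator is needed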
            input_map += f"""
            input_map {{
                key: "{name}"
                value: "{name}"
            }}"""
        
        ensemble_scheduling += f"""
        {{
            model_name: "{model[0].display_name}"
            model_version: {version[0]}{input_map}
            output_map {{
                key: "logistic"
                value: "logistic_for_{model[0].display_name}_{version[0]}"
            }}
        }}"""
        
        
    if m == len(models_artifacts[0:1]) - 1:
        ensemble_scheduling += """
    ]
}"""
        
print(ensemble_scheduling[0:300], f'\n\n\n<{len(ensemble_scheduling)-600} characters hidden>\n\n\n', ensemble_scheduling[-300:])


ensemble_scheduling {
    step [
        {
            model_name: "05_05"
            model_version: 2
            input_map {
                key: "Time"
                value: "Time"
            }
            input_map {
                key: "V1"
                value: "V1"
            }
        


<2449 characters hidden>


             key: "V28"
                value: "V28"
            }
            input_map {
                key: "Amount"
                value: "Amount"
            }
            output_map {
                key: "logistic"
                value: "logistic_for_05_05_2"
            }
        }
    ]
}
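
For more than one model or version, each entry in the `step [...]` list needs to be separated by a comma, as in the protobuf text format used by `config.pbtxt`. The sketch below shows one way to do that with `str.join`; the `(model_name, version)` pairs and the helper `build_step` are hypothetical stand-ins for the notebook's `models_artifacts` structure.

```python
def build_step(model_name, version, features):
    """Return one ensemble step that passes every feature straight through."""
    input_maps = "".join(
        f"""
            input_map {{
                key: "{name}"
                value: "{name}"
            }}"""
        for name in features
    )
    return f"""
        {{
            model_name: "{model_name}"
            model_version: {version}{input_maps}
            output_map {{
                key: "logistic"
                value: "logistic_for_{model_name}_{version}"
            }}
        }}"""

# Hypothetical model/version pairs and a shortened feature list.
pairs = [("05_05", 1), ("05_05", 2)]
example_features = ['Time', 'Amount']

steps = ",".join(build_step(m, v, example_features) for m, v in pairs)
scheduling = "\nensemble_scheduling {\n    step [" + steps + "\n    ]\n}"
print(scheduling)
```

Each step maps its model's `logistic` output to a distinct ensemble tensor name, so the outputs of different models and versions do not collide.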


Add the `ensemble_scheduling` specification to the overall ensemble specification in `ensemble_all`:

In [627]:
ensemble_all = ensemble_all + ensemble_scheduling
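
Because the specification is assembled from string fragments, a quick structural check before uploading can catch an unclosed block. The helper below is a hypothetical sanity check, not a protobuf parser: it only verifies that braces and brackets are balanced and properly nested, and it assumes none of them appear inside quoted strings.

```python
def delimiters_balanced(text):
    """Check that {} and [] are balanced and properly nested."""
    pairs = {"}": "{", "]": "["}
    stack = []
    for ch in text:
        if ch in "{[":
            stack.append(ch)
        elif ch in pairs:
            if not stack or stack.pop() != pairs[ch]:
                return False
    return not stack

# Small stand-in for the notebook's `ensemble_all` string.
sample = 'name: "ensemble"\ninput [\n    {\n        dims: [ 1 ]\n    }\n]'
print(delimiters_balanced(sample))          # True
print(delimiters_balanced(sample + "\n{"))  # False
```

In the notebook this could be called as `delimiters_balanced(ensemble_all)` before the upload.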

Add the ensemble model to the model repository in GCS:

In [628]:
bucket = gcs.lookup_bucket(BUCKET)
blob = bucket.blob(f'{SERIES}/{EXPERIMENT}/model_repo/ensemble/config.pbtxt')
blob.upload_from_string(ensemble_all)

Review the `config.pbtxt` in the browser with the following link:

In [629]:
print(f'https://storage.cloud.google.com/{BUCKET}/{SERIES}/{EXPERIMENT}/model_repo/ensemble/config.pbtxt')

https://storage.cloud.google.com/statmike-mlops-349915/05/triton/model_repo/ensemble/config.pbtxt


**NOTES ON TRITON MODEL REPOSITORY FOR ENSEMBLE**

Every model in a Triton model repository needs at least one version folder, and that appears to include ensemble models: although nothing needs to be inside an ensemble's version folder, the folder itself still seems to be required. Since the source of the model repository is a GCS URI registered in Vertex AI Model Registry, and object storage has no concept of "folders", a version folder with no objects in it does not exist, which leads to this error:

>E0822 00:28:44.857235 1 model_repository_manager.cc:546] failed to load model 'ensemble_all': at least one version must be available under the version policy of model 'ensemble_all'

To solve this, the following cell uploads a small placeholder text file named `empty.txt` to the `/1/empty.txt` location of the ensemble model in the model repository folder in GCS.

Check out [this related GitHub issue](https://github.com/triton-inference-server/server/issues/3623) for confirmation.

In [630]:
blob = bucket.blob(f'{SERIES}/{EXPERIMENT}/model_repo/ensemble/1/empty.txt')
blob.upload_from_string('# just an empty file to help force the creation of a version folder: /1/empty.txt')