# MetaSpore Getting Started

MetaSpore is a machine learning platform, which provides a one-stop solution for data preprocessing, model training and online prediction.

In this article, we briefly introduce the basic API of MetaSpore.

## Prepare Data

We use the publicly available dataset [Terabyte Click Logs](https://labs.criteo.com/2013/12/download-terabyte-click-logs-2/) published by CriteoLabs as our demo dataset.

We sample the dataset at a rate of 0.001 so that the demo finishes quickly. More information about the demo dataset can be found in [MetaSpore Demo Dataset](https://ks3-cn-beijing.ksyuncs.com/dmetasoul-bucket/demo/criteo/index.html).
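The source does not say how the sample was drawn. As a minimal sketch, assuming simple independent (Bernoulli) sampling of rows, a rate of 0.001 keeps roughly one row in a thousand. The helper name `sample_lines` and the synthetic rows are hypothetical and only illustrate the idea.

```python
import random


def sample_lines(lines, rate, seed=0):
    """Keep each line independently with probability `rate`."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return [line for line in lines if rng.random() < rate]


# Synthetic stand-in for the rows of a click log.
lines = [f"row-{i}" for i in range(100_000)]
sampled = sample_lines(lines, rate=0.001)
print(len(sampled))  # close to 100; the exact count depends on the seed
```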

Execute the following cell to download the demo dataset into the working directory. The data files take up about 2.1 GiB of disk space, and the download may take several minutes. If the download fails, please refer to [MetaSpore Demo Dataset](https://ks3-cn-beijing.ksyuncs.com/dmetasoul-bucket/demo/criteo/index.html) and download the dataset manually.

In [1]:
# import metaspore
# metaspore.demo.download_dataset()

You can check the downloaded dataset by executing the following cell.

In [2]:
# !ls -l ${PWD}/data/

(Optional) To upload the dataset to your own S3 bucket:

1. Fill ``{YOUR_S3_BUCKET}`` and ``{YOUR_S3_PATH}`` with your preferred values in the following cell.
2. Uncomment the cell by removing the leading ``#`` character.
3. Execute the cell.

In [3]:
# YOUR_S3_BUCKET='s3://sagemaker-us-west-2-452145973879/datasets/CriteoLabs/'
# YOUR_S3_PATH='datasets/CriteoLabs'

In [4]:
# !aws s3 cp --recursive ${PWD}/data/ s3://sagemaker-us-west-2-452145973879/datasets/CriteoLabs/demo/data/

Alternatively, you can open a terminal by selecting the ``File`` -> ``New`` -> ``Terminal`` menu item and executing Bash commands in it.
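If you prefer to assemble the upload command from variables rather than editing the path by hand, something along these lines may help. It is only a sketch: `build_s3_copy_command`, the bucket name `my-example-bucket`, and the path `demo/data` are hypothetical, and the function only builds the command string; it does not run it.

```python
import shlex


def build_s3_copy_command(local_dir, bucket, path):
    """Return an `aws s3 cp --recursive` command that uploads local_dir."""
    # Normalise slashes so the destination always has the form s3://bucket/path/
    destination = f"s3://{bucket.strip('/')}/{path.strip('/')}/"
    return shlex.join(["aws", "s3", "cp", "--recursive", local_dir, destination])


# Hypothetical bucket and path; replace them with your own values.
cmd = build_s3_copy_command("./data/", "my-example-bucket", "demo/data")
print(cmd)  # aws s3 cp --recursive ./data/ s3://my-example-bucket/demo/data/
```

The resulting string can be pasted into a terminal or prefixed with `!` in a notebook cell.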

You can check the uploaded dataset in your S3 bucket by uncommenting and executing the following cell.

In [5]:
# !aws s3 ls s3://sagemaker-us-west-2-452145973879/datasets/CriteoLabs/demo/data/
# !aws s3 ls s3://mv-mtg-di-for-poc-datalab/2024/06/14/00/

The ``schema`` directory contains configuration files and must also be uploaded to S3 so that the model can be trained in a cluster environment.

In [6]:
#!aws s3 cp --recursive ${PWD}/schema/ s3://sagemaker-us-west-2-452145973879/datasets/CriteoLabs/demo/schema/

In the rest of the article, we assume the demo dataset and schemas have been uploaded to `ROOT_DIR`.

In [7]:
# ROOT_DIR = 's3://sagemaker-us-west-2-452145973879/datasets/CriteoLabs/demo'
# ROOT_DIR = '.'
ROOT_DIR = 's3://mv-mtg-di-for-poc-datalab'

## Define the Model

We can define our neural network model by subclassing ``torch.nn.Module``, as with any PyTorch model. The following ``DemoModule`` class provides an example.

Compared to ordinary PyTorch models, the notable difference is the ``_sparse`` layer, created by instantiating ``ms.EmbeddingSumConcat``, which takes an embedding size and the paths of two text files. ``ms.EmbeddingSumConcat`` makes it possible to define large-scale sparse models in PyTorch, which is a distinguishing feature of MetaSpore.

The ``_schema_dir`` field is an S3 directory, which makes it possible to use the ``DemoModule`` class in a cluster environment.

In [8]:
# import torch
# import metaspore as ms

# class DemoModule(torch.nn.Module):
#     def __init__(self):
#         super().__init__()
#         self._embedding_size = 16
#         self._schema_dir = ROOT_DIR + '/schema/'
#         self._column_name_path = self._schema_dir + 'column_name_demo.txt'
#         self._combine_schema_path = self._schema_dir + 'combine_schema_demo.txt'
#         self._sparse = ms.EmbeddingSumConcat(self._embedding_size, self._column_name_path, self._combine_schema_path)
#         self._sparse.updater = ms.FTRLTensorUpdater()
#         self._sparse.initializer = ms.NormalTensorInitializer(var=0.01)
#         self._dense = torch.nn.Sequential(
#             ms.nn.Normalization(self._sparse.feature_count * self._embedding_size),
#             torch.nn.Linear(self._sparse.feature_count * self._embedding_size, 1024),
#             torch.nn.ReLU(),
#             torch.nn.Linear(1024, 512),
#             torch.nn.ReLU(),
#             torch.nn.Linear(512, 1),
#         )

#     def forward(self, x):
#         x = self._sparse(x)
#         x = self._dense(x)
#         return torch.sigmoid(x)
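To give a feel for what the ``_sparse`` layer produces, here is a dependency-free sketch of the "sum then concatenate" idea. It is an assumption inferred from the layer's name and from the input width ``feature_count * embedding_size`` used by the dense layers; it is not MetaSpore's implementation, and the function `embedding_sum_concat` and the toy tables are hypothetical.

```python
def embedding_sum_concat(sample, tables, embedding_size):
    """Sum the embedding vectors inside each feature, then concatenate the sums.

    sample: one list of IDs per feature (a feature may hold several IDs).
    tables: one dict per feature, mapping an ID to its embedding vector.
    """
    output = []
    for ids, table in zip(sample, tables):
        summed = [0.0] * embedding_size
        for key in ids:
            # Element-wise addition of this ID's vector into the running sum.
            summed = [a + b for a, b in zip(summed, table[key])]
        output.extend(summed)  # concatenate across features
    return output


# Two features with embedding size 2 (toy values).
tables = [
    {"a": [1.0, 2.0], "b": [10.0, 20.0]},
    {"x": [0.5, 0.5]},
]
sample = [["a", "b"], ["x"]]
result = embedding_sum_concat(sample, tables, embedding_size=2)
print(result)  # [11.0, 22.0, 0.5, 0.5]
```

The output length is the number of features times the embedding size, which is why the first dense layer takes ``feature_count * embedding_size`` inputs.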

In [9]:
import metaspore as ms
import torch
import torch.nn as nn


def nansum(x):
    return torch.where(torch.isnan(x), torch.zeros_like(x), x).sum()


def log_loss(yhat, y):
    return nansum(-(y * (yhat + 1e-12).log() + (1 - y) *
                    (1 - yhat + 1e-12).log()))

# Custom main model entry point
class DNNModelMain(nn.Module):
    def __init__(self, ): # feature_config_file
        super().__init__()
        self._embedding_size = 16
        self._schema_dir = ROOT_DIR + '/schema/'
        self._column_name_path = self._schema_dir + 'column_name_mobivista.txt'
        self._combine_schema_path = self._schema_dir + 'combine_schema_mobivista.txt'
        # self.feature_config_file = feature_config_file  # TODO not used
        self._sparse = ms.EmbeddingSumConcat(
            self._embedding_size,
            combine_schema_source=self._column_name_path,
            combine_schema_file_path=self._combine_schema_path,
            # enable_feature_gen=True,
            # feature_config_file=feature_config_file,
            # enable_fgs=False
        )
        self._sparse.updater = ms.FTRLTensorUpdater(alpha=0.01)
        self._sparse.initializer = ms.NormalTensorInitializer(var=0.001)
        extra_attributes = {
            "enable_fresh_random_keep": True,
            "fresh_dist_range_from": 0, 
            "fresh_dist_range_to": 1000,
            "fresh_dist_range_mean": 950,
            "enable_feature_gen": True,
            "use_hash_code": False
        }
        self._sparse.extra_attributes = extra_attributes
        feature_count = self._sparse.feature_count
        feature_dim = self._sparse.feature_count * self._embedding_size

        self._gateEmbedding = GateEmbedding(feature_dim, feature_count, self._embedding_size)
        self._h1 = nn.Linear(feature_dim, 1024)
        self._h2 = FourChannelHidden(1024, 512)
        self._h3 = FourChannelHidden(512, 256)
        self._h4 = nn.Linear(256, 1)

        self._bn = ms.nn.Normalization(feature_dim, momentum=0.01, eps=1e-5, affine=True)
        self._zero = torch.zeros(1, 1)
        self.act0 = nn.Sigmoid()

    def forward(self, x):
        emb = self._sparse(x)
        bno = self._bn(emb)
        
        # print(f"self._sparse._data.type: {type(self._sparse._data)}, self._sparse._data.shape: {self._sparse._data.shape}") 
        # print(f"x.type: {type(x)}, x.shape: {x.shape}, x: {x}")
        # print(f"emb.type: {type(emb)}, emb.shape: {emb.shape}, ") # emb: {emb}
        # print(f"bno.type: {type(bno)}, bno.shape: {bno.shape}, ")  # bno: {bno}
        
        d = self._gateEmbedding(bno)
        o = self._h1(d)
        r, s1, s2, s3 = self._h2(o, self._zero, self._zero, self._zero)
        r, s1, s2, s3 = self._h3(r, s1, s2, s3)
        return self.act0(self._h4(r))


class FourChannelHidden(nn.Module):
    def __init__(self, in_size, out_size):
        super().__init__()
        self.wc2 = nn.Linear(int(in_size / 4), int(in_size / 4))
        self.wc3 = nn.Linear(int(in_size), int(in_size - in_size / 4 * 3))
        self.w = nn.Linear(int(in_size + int(in_size / 4) * 2) + int(in_size - int(in_size / 4) * 3) + 3, out_size)
        self.act1 = nn.Tanh()
        self.act = nn.ReLU()
        self.fl = int(in_size / 4)

    def forward(self, input, i1, i2, i3):
        f0 = input[:, :self.fl]
        f1 = input[:, self.fl:self.fl * 2]
        f2 = input[:, self.fl * 2:self.fl * 3]
        f3 = input[:, self.fl * 3:]

        c1 = self.act1(f0 * f1) * f1
        c2 = self.act1(self.wc2(f2) * f2)
        c3 = self.act1(f3 * self.wc3(input))

        s1 = torch.sum(c1, 1, True) + i1
        s2 = torch.sum(c2, 1, True) + i2
        s3 = torch.sum(c3, 1, True) + i3

        return self.act(self.w(torch.cat((input, c1, c2, c3, s1, s2, s3), 1))), s1, s2, s3


class GateEmbedding(nn.Module):
    def __init__(self, in_size, out_size, emb_size):
        super().__init__()
        self.layer1 = torch.nn.Linear(in_size, out_size)
        self.out_size = out_size
        self.emb_size = emb_size
        self.act2 = nn.Sigmoid()

    def forward(self, input):
        gate = self.act2(self.layer1(input))
        gate_reshape = torch.reshape(gate, (-1, self.out_size, 1))
        input_reshape = torch.reshape(input, (-1, self.out_size, self.emb_size))
        return (gate_reshape * input_reshape).reshape(-1, self.out_size * self.emb_size)
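The input width of ``self.w`` in ``FourChannelHidden`` is easy to misread, so the following sketch recomputes it with plain arithmetic and checks it against the width of the tensor concatenated in ``forward``. The helper `four_channel_sizes` is hypothetical; the expressions are copied from the class above, and the check assumes ``in_size`` is divisible by 4, as it is for 1024 and 512.

```python
def four_channel_sizes(in_size):
    """Recompute the layer widths used by FourChannelHidden for a given in_size."""
    quarter = int(in_size / 4)            # self.fl, the width of f0, f1 and f2
    c1 = quarter                          # f0 * f1 keeps the quarter width
    c2 = quarter                          # wc2 maps quarter -> quarter
    c3 = int(in_size - in_size / 4 * 3)   # wc3 output, must match the width of f3
    # forward() concatenates input, c1, c2, c3 and the three scalar sums s1, s2, s3.
    concat = in_size + c1 + c2 + c3 + 3
    # Input width declared for self.w in __init__.
    declared = (int(in_size + int(in_size / 4) * 2)
                + int(in_size - int(in_size / 4) * 3) + 3)
    return {"quarter": quarter, "c3": c3, "concat": concat, "declared": declared}


for size in (1024, 512):
    sizes = four_channel_sizes(size)
    print(size, sizes)
```

For both layer sizes used in ``DNNModelMain`` the two totals agree (1795 for 1024 inputs and 899 for 512 inputs), so the concatenated tensor fits ``self.w``.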



Instantiate the model class to create our PyTorch model. The cell below uses ``DNNModelMain``; the ``DemoModule`` line is left commented out.

In [10]:
# module = DemoModule()
# module = DNNModelMain('schema/combine_schema_mobivista.txt')
module = DNNModelMain()

[WARN] 2024-07-16 02:51:43.621 STSAssumeRoleWithWebIdentityCredentialsProvider [140624245376832] Token file must be specified to use STS AssumeRole web identity creds provider.
loaded combine schema from combine schema file 's3://mv-mtg-di-for-poc-datalab/schema/combine_schema_mobivista.txt'
[2024-07-16 02:51:43.621] [info] [s3_sdk_filesys.cpp:357] Try to open S3 stream: s3://mv-mtg-di-for-poc-datalab/schema/combine_schema_mobivista.txt, read_only true
[2024-07-16 02:51:43.776] [info] [s3_sdk_filesys.cpp:380] Opened read-only stream for object: s3://mv-mtg-di-for-poc-datalab/schema/combine_schema_mobivista.txt with total length 2260
[2024-07-16 02:51:43.779] [info] [s3_sdk_filesys.cpp:419] Read S3 object s3://mv-mtg-di-for-poc-datalab/schema/combine_schema_mobivista.txt with size 2260 at position 0 larger than total size: 2260, change size to 2260
[2024-07-16 02:51:43.879] [info] [s3_sdk_filesys.cpp:413] Read S3 object s3://mv-mtg-di-for-poc-datalab/schema/combine_schem

## Train the Model

To train our model, first we need to create a ``ms.PyTorchEstimator`` passing in several arguments including our PyTorch model ``module`` and the number of workers and servers.

``model_out_path`` specifies where to store the trained model.

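``input_label_column_name`` specifies the name of the label column, which is ``'label'`` in the cell below. (The label can alternatively be identified by position with ``input_label_column_index``, which is ``0`` for the demo dataset.)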

In [11]:
model_out_path = ROOT_DIR + '/output/dev/model_out/'
estimator = ms.PyTorchEstimator(module=module,
                                worker_count=8,
                                server_count=8,
                                model_out_path=model_out_path,
                                experiment_name='0.1',
                                input_label_column_name='label',
                                training_epoches=1,
                                shuffle_training_dataset=True)

Next, we create a Spark session by calling ``ms.spark.get_session()`` and load the training dataset by calling ``ms.input.read_s3_csv()``.

``delimiter`` specifies the column delimiter of the dataset, which is the TAB character ``'\t'`` for the demo dataset.

We also need to pass column names because the CSV files do not contain headers.

In [12]:
column_names = []
# Each line of the schema file holds space-separated fields; the second one is the column name.
with open('./schema/column_name_mobivista.txt', 'r') as f:
    for line in f:
        column_names.append(line.split(' ')[1].strip())
print(column_names)

['_11001', '_11002', '_11003', '_11004', '_11007', '_11008', '_11021', '_11022', '_11023', '_11024', '_11041', '_11042', '_11043', '_11044', '_11045', '_11046', '_11061', '_11062', '_11063', '_11064', '_11065', '_11066', '_11081', '_11082', '_11083', '_11084', '_11085', '_11086', '_11601', '_11602', '_11603', '_12001', '_12002', '_12003', '_12004', '_12005', '_12006', '_20001', '_20002', '_20003', '_20101', '_20102', '_20201', '_20202', '_20203', '_20204', '_20205', '_20206', '_20207', '_20208', '_20209', '_20210', '_30001', '_30002', '_30003', '_30004', '_30005', '_30006', '_30201', '_30202', '_30203', '_30204', '_30205', '_30206', '_30207', '_40001', '_40002', '_40003', '_40004', '_40005', '_40201', '_40202', '_40203', '_40204', '_40205', '_40206', '_40207', '_40208', '_40209', '_40210', '_40211', '_40212', '_40213', '_40214', '_40215', '_40231', '_40301', '_40302', '_40303', '_40304', '_40305', '_40306', '_40307', '_40321', '_40322', '_40323', '_40324', '_50801', '_50802', '_50805',
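The parsing in the cell above takes the second space-separated field of every line. The sketch below reproduces that logic on an in-memory file. The file layout (an index followed by a name) and the sample contents are assumptions made for illustration, and `read_column_names` is a hypothetical helper.

```python
import io


def read_column_names(stream):
    """Return the second space-separated field of every non-empty line."""
    names = []
    for line in stream:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines at the end of the file
        names.append(line.split(' ')[1])
    return names


# Hypothetical schema file: "<index> <column name>" on each line.
sample_file = io.StringIO("0 _11001\n1 _11002\n2 label\n")
print(read_column_names(sample_file))  # ['_11001', '_11002', 'label']
```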

In [13]:
# train_dataset_path = ROOT_DIR + '/data/train/day_0_0.001_train.csv'

file_base_path = 's3://mv-mtg-di-for-poc-datalab/2024/06/14/00/'
num_files = 100
file_names = [f'part-{str(i).zfill(5)}-1e73cc51-9b17-4439-9d71-7d505df2cae3-c000.snappy.orc' for i in range(num_files)]
train_dataset_path = [file_base_path + file_name for file_name in file_names]

# train_dataset_path = 's3://mv-mtg-di-for-poc-datalab/2024/06/14/00/part-00000-1e73cc51-9b17-4439-9d71-7d505df2cae3-c000.snappy.orc'
# train_dataset_path = 's3://mv-mtg-di-for-poc-datalab/2024/06/14/00/'

spark_confs = {
    'spark.eventLog.enabled':'true',
    'spark.executor.memory': '20g',
    'spark.driver.memory': '10g',
}

spark_session = ms.spark.get_session(local=True,
                                     batch_size=200,
                                     worker_count=estimator.worker_count,
                                     server_count=estimator.server_count,
                                     log_level='WARN',
                                     spark_confs=spark_confs)

train_dataset = ms.input.read_s3_csv(spark_session, 
                                     train_dataset_path, 
                                     format='orc',
                                     shuffle=False,
                                     delimiter='\t', 
                                     multivalue_delimiter="\001", 
                                     column_names=column_names,
                                     multivalue_column_names=column_names[:-1])

# train_dataset = spark_session.read.orc(train_dataset_path)

# train_dataset.printSchema()

24/07/16 02:51:45 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
                                                                                

ignore shuffle
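The loader call above names two delimiters: ``delimiter`` between columns and ``multivalue_delimiter`` between the values inside one column. As a rough illustration, assuming a multi-valued column is stored as a single string joined by ``"\001"``, a text row could be split as follows. The row contents and the helper `split_multivalue` are hypothetical, and how MetaSpore applies these options to ORC input is not shown in this article.

```python
def split_multivalue(field, multivalue_delimiter="\001"):
    """Split one column into its individual values; an empty field yields no values."""
    return field.split(multivalue_delimiter) if field else []


# Hypothetical row: two feature columns followed by the label column.
row = "sports\001news\tUS\t1"
columns = row.split("\t")  # column-level split
# Every column except the last (the label) is treated as multi-valued.
parsed = [split_multivalue(col) for col in columns[:-1]] + [columns[-1]]
print(parsed)  # [['sports', 'news'], ['US'], '1']
```

This mirrors ``multivalue_column_names=column_names[:-1]`` in the cell above, where every column except the last is declared multi-valued.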


In [14]:
train_dataset.count()

                                                                                

1318351

In [15]:
# train_dataset = train_dataset.limit(10000)

Finally, we call the ``fit()`` method of ``ms.PyTorchEstimator`` to train our model. This will take several minutes, and you can follow the progress in the output of the cell. The trained model is written to ``model_out_path`` and also returned as the ``model`` variable.
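Each line of the training output reports a metric twice, for example ``loss`` and ``Δloss``. Judging from the numbers in the log, the ``Δ`` value appears to cover only the most recent batch of instances, while the plain value is the running figure since training started. The sketch below checks this reading for ``loss`` using the first three ``Δloss`` values from the log, assuming each covers 2000 instances; `cumulative_means` is a hypothetical helper, and the same simple averaging is not expected to hold for ``auc``.

```python
def cumulative_means(window_values, window_sizes):
    """Instance-weighted running mean of per-window values."""
    total, count, result = 0.0, 0, []
    for value, size in zip(window_values, window_sizes):
        total += value * size
        count += size
        result.append(total / count)
    return result


# First three Δloss values from the training log, 2000 instances each.
delta_loss = [0.0027615015506744383, 0.0027316715717315672, 0.0028350257873535154]
cumulative = cumulative_means(delta_loss, [2000, 2000, 2000])
print(cumulative)
```

The results match the ``loss`` column of the first three log lines up to floating-point rounding (about 0.0027615, 0.0027466 and 0.0027761).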

In [16]:
model = estimator.fit(train_dataset)

[2024-07-16 02:52:24.365] [info] PS job with coordinator address 172.31.0.41:51327 started.
[2024-07-16 02:52:24.365] [info] PSRunner::RunPS: pid: 22031, tid: 22361, thread: 0x7fe52d780700
[2024-07-16 02:52:24.365] [info] PSRunner::RunPSCoordinator: pid: 22031, tid: 22361, thread: 0x7fe52d780700
[2024-07-16 02:52:24.366] [info] ActorProcess::Receiving: Coordinator pid: 22031, tid: 22365, thread: 0x7fe569798700


[2024-07-16 02:52:25.735] [info] PS job with coordinator address 172.31.0.41:51327 started.
[2024-07-16 02:52:25.735] [info] PS job with coordinator address 172.31.0.41:51327 started.
[2024-07-16 02:52:25.735] [info] PSRunner::RunPS: pid: 22434, tid: 22434, thread: 0x7f8b7fa51740
[2024-07-16 02:52:25.735] [info] PSRunner::RunPSServer: pid: 22434, tid: 22434, thread: 0x7f8b7fa51740
[2024-07-16 02:52:25.735] [info] PSRunner::RunPS: pid: 22386, tid: 22386, thread: 0x7f8b7fa51740
[2024-07-16 02:52:25.735] [info] PSRunner::RunPSServer: pid: 22386, tid: 22386, thread: 0x7f8b7fa51740
[2024-07-16 02:52:25.736] [info] ActorProcess::Receiving: Server pid: 22386, tid: 22458, thread: 0x7f8b51f49700
[2024-07-16 02:52:25.736] [info] ActorProcess::Receiving: Server pid: 22434, tid: 22459, thread: 0x7f8b51f49700
[2024-07-16 02:52:25.740] [info] PS job with coordinator address 172.31.0.41:51327 started.
[2024-07-16 02:52:25.740] [info] PS job with coordinator address 172.31.0.41:51327 started.
[2024-07

[2024-07-16 02:52:25.803] [info] C[0]:9: The coordinator has connected to 8 servers and 8 workers.
PS Coordinator node C[0]:9 is ready.


[2024-07-16 02:52:25.804] [info] S[0]:10 has connected to others.
[2024-07-16 02:52:25.804] [info] S[2]:42 has connected to others.
[2024-07-16 02:52:25.804] [info] S[1]:26 has connected to others.
[2024-07-16 02:52:25.804] [info] W[1]:28 has connected to others.
[2024-07-16 02:52:25.804] [info] W[4]:76 has connected to others.
[2024-07-16 02:52:25.804] [info] W[5]:92 has connected to others.
[2024-07-16 02:52:25.804] [info] S[5]:90 has connected to others.
[2024-07-16 02:52:25.804] [info] S[4]:74 has connected to others.
[2024-07-16 02:52:25.804] [info] W[0]:12 has connected to others.
PS Server node S[2]:42 is ready.
PS Worker node W[1]:28 is ready.
[2024-07-16 02:52:25.804] [info] S[6]:106 has connected to others.
[2024-07-16 02:52:25.804] [info] W[6]:108 has connected to others.
PS Server node S[1]:26 is ready.
PS Server node S[0]:10 is ready.
[2024-07-16 02:52:25.804] [info] S[7]:122 has connected to others.

[2024-07-16 02:52

shuffle df to partitions 16


24/07/16 02:52:27 WARN package: Truncated the string representation of a plan since it was too large. This behavior can be adjusted by setting 'spark.sql.debug.maxToStringFields'.
[Stage 4:>                  (0 + 8) / 8][Stage 18:>                (0 + 8) / 16]

2024-07-16 02:56:37.855 -- auc: 0.5579498364231189, Δauc: 0.5579498364231189, pcoc: 22.686565289327078, Δpcoc: 22.686565289327078, loss: 0.0027615015506744383, Δloss: 0.0027615015506744383, #instance: 2000
2024-07-16 02:56:37.870 -- auc: 0.5350083008073991, Δauc: 0.517478082862356, pcoc: 19.61339674430129, Δpcoc: 17.275116329607755, loss: 0.002746586561203003, Δloss: 0.0027316715717315672, #instance: 4000
2024-07-16 02:56:37.882 -- auc: 0.5300434835708943, Δauc: 0.5225261892994544, pcoc: 17.77900511626877, Δpcoc: 14.97550093003039, loss: 0.0027760663032531737, Δloss: 0.0028350257873535154, #instance: 6000
2024-07-16 02:56:37.892 -- auc: 0.5194890080548262, Δauc: 0.4911282051282051, pcoc: 17.26498581128924, Δpcoc: 15.887414073944091, loss: 0.0027975799441337587, Δloss: 0.0028621208667755126, #instance: 8000
2024-07-16 02:56:37.902 -- auc: 0.5305169352115131, Δauc: 0.5700409460859768, pcoc: 16.892999404288354, Δpcoc: 15.5509307986381, loss: 0.0027812838077545168, Δloss: 0.002716099262237

2024-07-16 02:56:56.483 -- auc: 0.5178874798020292, Δauc: 0.5599664699440843, pcoc: 10.33806850410975, Δpcoc: 1.9671017808734246, loss: 0.002115902851609623, Δloss: 0.0007522104978561401, #instance: 34000
2024-07-16 02:56:57.477 -- auc: 0.5191827462052684, Δauc: 0.5674026452448575, pcoc: 9.937482050055094, Δpcoc: 2.475394847781159, loss: 0.0020357657935884265, Δloss: 0.0006734358072280884, #instance: 36000
2024-07-16 02:56:57.490 -- auc: 0.5179919971825055, Δauc: 0.5967117746928992, pcoc: 9.374853887128591, Δpcoc: 1.715851153096845, loss: 0.001971311744890715, Δloss: 0.0008111388683319092, #instance: 38000
2024-07-16 02:56:57.501 -- auc: 0.5198977119913417, Δauc: 0.6046780185421229, pcoc: 9.011631731803943, Δpcoc: 2.009945078099028, loss: 0.0019057548642158507, Δloss: 0.0006601741313934326, #instance: 40000
2024-07-16 02:56:57.511 -- auc: 0.5212999401091869, Δauc: 0.5999595245616807, pcoc: 8.647090945867351, Δpcoc: 1.9661800036063561, loss: 0.0018488123246601649, Δloss: 0.0007099615335

2024-07-16 02:57:55.978 -- auc: 0.641380062696522, Δauc: 0.7761821012228018, pcoc: 3.6187553264656636, Δpcoc: 1.2525057459986486, loss: 0.0009594026271140937, Δloss: 0.000471937358379364, #instance: 132000
2024-07-16 02:57:55.996 -- auc: 0.6434866806650552, Δauc: 0.7999751411626833, pcoc: 3.5702897039887995, Δpcoc: 0.920557137193351, loss: 0.0009536729401616908, Δloss: 0.0005755136013031006, #instance: 134000
2024-07-16 02:57:56.006 -- auc: 0.6455057462641556, Δauc: 0.7943351250292124, pcoc: 3.526807970042953, Δpcoc: 0.9740348989313299, loss: 0.0009477490388295229, Δloss: 0.0005508476495742798, #instance: 136000
2024-07-16 02:57:56.016 -- auc: 0.6471304028465981, Δauc: 0.7739753460015131, pcoc: 3.494346027426411, Δpcoc: 1.1768456023672353, loss: 0.0009412538452424864, Δloss: 0.0004995806813240051, #instance: 138000
2024-07-16 02:57:56.026 -- auc: 0.6491304160934495, Δauc: 0.808561731557377, pcoc: 3.460864796049027, Δpcoc: 1.138104369242986, loss: 0.0009348555735179356, Δloss: 0.0004933

2024-07-16 02:58:57.271 -- auc: 0.6986909876562051, Δauc: 0.7151654582636178, pcoc: 2.460678879269161, Δpcoc: 1.0785626010461287, loss: 0.0007539774689300002, Δloss: 0.000510942816734314, #instance: 242000
2024-07-16 02:59:00.038 -- auc: 0.6994391774675102, Δauc: 0.7948117696319135, pcoc: 2.446740411508744, Δpcoc: 0.9483551272639522, loss: 0.0007522372395777311, Δloss: 0.000541669487953186, #instance: 244000
2024-07-16 02:59:00.050 -- auc: 0.6998179806304875, Δauc: 0.7415704038220594, pcoc: 2.4394821639594864, Δpcoc: 1.290128856091886, loss: 0.0007496237599752783, Δloss: 0.00043077924847602844, #instance: 246000
2024-07-16 02:59:00.414 -- auc: 0.7006842148220543, Δauc: 0.8274007395510402, pcoc: 2.431296185302168, Δpcoc: 1.254112522776534, loss: 0.0007470382014589925, Δloss: 0.00042901450395584106, #instance: 248000
2024-07-16 02:59:00.426 -- auc: 0.7012855965635184, Δauc: 0.7816555669398907, pcoc: 2.4201805160289935, Δpcoc: 1.045311172803243, loss: 0.0007452732391357422, Δloss: 0.00052

2024-07-16 03:00:00.847 -- auc: 0.7234838894211453, Δauc: 0.7608213484225325, pcoc: 2.014416766147779, Δpcoc: 0.795892792232966, loss: 0.0006743266806427369, Δloss: 0.0006066951155662537, #instance: 354000
2024-07-16 03:00:02.143 -- auc: 0.7237846134016022, Δauc: 0.7828461538461539, pcoc: 2.007804642717835, Δpcoc: 0.8877109336853027, loss: 0.0006734547258427973, Δloss: 0.0005191187262535095, #instance: 356000
2024-07-16 03:00:02.373 -- auc: 0.7241956823022546, Δauc: 0.8046247939486374, pcoc: 2.005178338066349, Δpcoc: 1.3271142280463017, loss: 0.0006717720917483282, Δloss: 0.00037226322293281556, #instance: 358000
2024-07-16 03:00:03.034 -- auc: 0.7246688004784082, Δauc: 0.8218488345779588, pcoc: 2.0019025998123947, Δpcoc: 1.2236417863104079, loss: 0.0006702107204331292, Δloss: 0.0003907252550125122, #instance: 360000
2024-07-16 03:00:03.046 -- auc: 0.7251127544001537, Δauc: 0.8109546384086261, pcoc: 1.996881825237311, Δpcoc: 1.016803806478327, loss: 0.0006690271363403257, Δloss: 0.0004

2024-07-16 03:00:56.328 -- auc: 0.7388873011870434, Δauc: 0.8159157321118307, pcoc: 1.803396127902478, Δpcoc: 0.9337282180786133, loss: 0.0006329111421423336, Δloss: 0.00051805979013443, #instance: 454000
2024-07-16 03:00:56.761 -- auc: 0.7389910172376818, Δauc: 0.7637199569412167, pcoc: 1.7998422783749044, Δpcoc: 1.04719269509409, loss: 0.000632488079630492, Δloss: 0.0005364528894424438, #instance: 456000
2024-07-16 03:00:56.771 -- auc: 0.7390479109059234, Δauc: 0.7538599747275312, pcoc: 1.7958615835573661, Δpcoc: 0.9651211958665115, loss: 0.0006321297549524682, Δloss: 0.0005504317283630371, #instance: 458000
2024-07-16 03:00:57.182 -- auc: 0.7393329821852248, Δauc: 0.8028931993456415, pcoc: 1.7914163602341198, Δpcoc: 0.9101306308399547, loss: 0.0006317540315182313, Δloss: 0.0005457133650779724, #instance: 460000
2024-07-16 03:00:57.192 -- auc: 0.7394717709365675, Δauc: 0.7729638201336314, pcoc: 1.7896594745920817, Δpcoc: 1.2959746091793745, loss: 0.0006309045074564038, Δloss: 0.00043

2024-07-16 03:01:57.203 -- auc: 0.7491554978366304, Δauc: 0.7860995559710212, pcoc: 1.6477998679157726, Δpcoc: 0.946975005756725, loss: 0.0006090894679794108, Δloss: 0.0005547821521759034, #instance: 562000
2024-07-16 03:01:58.123 -- auc: 0.7494409708839089, Δauc: 0.8309344240002511, pcoc: 1.6457429705058941, Δpcoc: 1.0800542052911253, loss: 0.0006086381744619802, Δloss: 0.00048182469606399534, #instance: 564000
2024-07-16 03:01:58.664 -- auc: 0.7496098806085567, Δauc: 0.7963819420653953, pcoc: 1.6429833360186468, Δpcoc: 0.9517971056478994, loss: 0.0006083905735833064, Δloss: 0.0005385671257972717, #instance: 566000
2024-07-16 03:01:58.986 -- auc: 0.7496917756187422, Δauc: 0.7739242552079001, pcoc: 1.641552357580677, Δpcoc: 1.189662678297176, loss: 0.0006079057216854162, Δloss: 0.00047069263458251955, #instance: 568000
2024-07-16 03:01:59.277 -- auc: 0.7498445818360597, Δauc: 0.7919616687602299, pcoc: 1.6389943787410806, Δpcoc: 0.993720531463623, loss: 0.000607675975561142, Δloss: 0.00

[Stage 4:>                  (0 + 8) / 8][Stage 18:=>               (1 + 8) / 16]

2024-07-16 03:02:43.377 -- auc: 0.757053805689273, Δauc: 0.7788181847893585, pcoc: 1.55785882791647, Δpcoc: 1.1456185198844748, loss: 0.0005959442080354985, Δloss: 0.0004981119632720948, #instance: 648000


[Stage 4:>                  (0 + 8) / 8][Stage 18:===>             (3 + 8) / 16]

2024-07-16 03:02:43.730 -- auc: 0.7570823470881458, Δauc: 0.7666411079695212, pcoc: 1.5567637812407915, Δpcoc: 1.1669769395481457, loss: 0.0005956542628355674, Δloss: 0.0004988564835695309, #instance: 649941
2024-07-16 03:02:44.270 -- auc: 0.757209341968427, Δauc: 0.8042793367346939, pcoc: 1.5561675369811954, Δpcoc: 1.3220522284507752, loss: 0.0005951580670010832, Δloss: 0.0004339090585708618, #instance: 651941
2024-07-16 03:02:44.281 -- auc: 0.7573563706909519, Δauc: 0.810843989769821, pcoc: 1.5551957478106886, Δpcoc: 1.2151559193929036, loss: 0.0005947458740056842, Δloss: 0.0004603831171989441, #instance: 653941
2024-07-16 03:02:45.089 -- auc: 0.7573915194529343, Δauc: 0.7708448507344812, pcoc: 1.5536243439612842, Δpcoc: 1.0764313019238985, loss: 0.0005945876552387935, Δloss: 0.0005428547859191895, #instance: 655941




2024-07-16 03:02:59.632 -- auc: 0.7576000047860854, Δauc: 0.8225834292289989, pcoc: 1.5514864285006358, Δpcoc: 0.9356501622633501, loss: 0.0005944743247387692, Δloss: 0.0005563717509220345, #instance: 657892
2024-07-16 03:03:04.737 -- auc: 0.7576868971385291, Δauc: 0.7876598664919833, pcoc: 1.5501527519853306, Δpcoc: 1.0789796577559576, loss: 0.0005944139315576561, Δloss: 0.0005726547909723054, #instance: 659718
2024-07-16 03:03:07.000 -- auc: 0.7577556898343811, Δauc: 0.7837581113387978, pcoc: 1.5486869974910202, Δpcoc: 1.0618427495161693, loss: 0.0005941236485437879, Δloss: 0.000498371183872223, #instance: 661718
2024-07-16 03:03:10.075 -- auc: 0.7579238769904101, Δauc: 0.8103622341668614, pcoc: 1.5465117269716266, Δpcoc: 0.9140617110512473, loss: 0.0005939649965829914, Δloss: 0.0005414735674858093, #instance: 663718
2024-07-16 03:03:12.428 -- auc: 0.7579943842550344, Δauc: 0.7842390464759776, pcoc: 1.5456127803683568, Δpcoc: 1.2101593572039937, loss: 0.0005935772200650913, Δloss: 0.



2024-07-16 03:03:47.945 -- auc: 0.7614566813676472, Δauc: 0.769403386990605, pcoc: 1.5030231880657476, Δpcoc: 1.420611023902893, loss: 0.0005862451310536223, Δloss: 0.0003970984518527985, #instance: 723306
2024-07-16 03:03:48.840 -- auc: 0.7616260908682676, Δauc: 0.8295918367346939, pcoc: 1.5020773839183212, Δpcoc: 1.0885481655597686, loss: 0.0005857810276049892, Δloss: 0.00041793662309646607, #instance: 725306
2024-07-16 03:03:52.372 -- auc: 0.7616528154185869, Δauc: 0.7730315116095403, pcoc: 1.5004037633429357, Δpcoc: 0.9362327043826764, loss: 0.0005856693096340633, Δloss: 0.0005451544523239136, #instance: 727306
2024-07-16 03:03:53.413 -- auc: 0.7618724547687813, Δauc: 0.8451559788171421, pcoc: 1.4994390030794826, Δpcoc: 1.1307118249976116, loss: 0.0005853000777915707, Δloss: 0.0004510278105735779, #instance: 729306
2024-07-16 03:03:54.337 -- auc: 0.7620366467847939, Δauc: 0.8178553104155978, pcoc: 1.4980419647557188, Δpcoc: 1.0151871699912876, loss: 0.0005850841322808824, Δloss: 0.



2024-07-16 03:04:47.387 -- auc: 0.7673386282898296, Δauc: 0.8008646880112176, pcoc: 1.4406786470063937, Δpcoc: 0.9315660736777566, loss: 0.0005756196496465015, Δloss: 0.0005458357334136963, #instance: 825306
2024-07-16 03:04:48.276 -- auc: 0.7674047888008428, Δauc: 0.7951024592307451, pcoc: 1.4396611784294657, Δpcoc: 1.0238280004384566, loss: 0.0005754467530212774, Δloss: 0.0005041004419326782, #instance: 827306
2024-07-16 03:04:48.594 -- auc: 0.7675163645975813, Δauc: 0.8176967593967986, pcoc: 1.4390730978560917, Δpcoc: 1.164521527844806, loss: 0.0005751220996622961, Δloss: 0.00044082826375961304, #instance: 829306
2024-07-16 03:04:49.189 -- auc: 0.7675439100235211, Δauc: 0.7809410298778341, pcoc: 1.4374591916210535, Δpcoc: 0.8678352330860338, loss: 0.0005751253874384572, Δloss: 0.0005764886736869812, #instance: 831306
2024-07-16 03:04:49.201 -- auc: 0.767688953359614, Δauc: 0.8290764421710713, pcoc: 1.436686372256754, Δpcoc: 1.0902056905958388, loss: 0.0005748278121435668, Δloss: 0.0



2024-07-16 03:05:49.127 -- auc: 0.7719572218791393, Δauc: 0.7916290746479425, pcoc: 1.3916930334188182, Δpcoc: 1.2504042234176245, loss: 0.0005654641428284637, Δloss: 0.0004294881224632263, #instance: 935306
2024-07-16 03:05:50.402 -- auc: 0.7720230098778572, Δauc: 0.8043685283545254, pcoc: 1.3914730512381723, Δpcoc: 1.2602363134685315, loss: 0.0005651454355879841, Δloss: 0.0004161010384559631, #instance: 937306
2024-07-16 03:05:50.896 -- auc: 0.7720416837776235, Δauc: 0.7782783695705044, pcoc: 1.3909483174080721, Δpcoc: 1.107242226600647, loss: 0.0005649119090836765, Δloss: 0.00045546901226043704, #instance: 939306
2024-07-16 03:05:51.336 -- auc: 0.7720401382923361, Δauc: 0.7798305348640919, pcoc: 1.3892861578466733, Δpcoc: 0.7890618717859662, loss: 0.0005650477056050646, Δloss: 0.0006288249492645264, #instance: 941306
2024-07-16 03:05:51.727 -- auc: 0.7721162935431214, Δauc: 0.8095408163265306, pcoc: 1.3890250612240873, Δpcoc: 1.2401151299476623, loss: 0.0005647447341583636, Δloss: 0



2024-07-16 03:06:47.841 -- auc: 0.7761543710565375, Δauc: 0.7776807288779172, pcoc: 1.3523480136484092, Δpcoc: 0.9988038588543328, loss: 0.0005564309394079979, Δloss: 0.0005143601894378662, #instance: 1045306
2024-07-16 03:06:50.923 -- auc: 0.7762872269844635, Δauc: 0.8490721374215655, pcoc: 1.35194190043886, Δpcoc: 1.1289062914641008, loss: 0.0005562271182944885, Δloss: 0.00044969940185546873, #instance: 1047306
2024-07-16 03:06:52.430 -- auc: 0.7763427447387803, Δauc: 0.8053253765274225, pcoc: 1.3515190779112272, Δpcoc: 1.1137142923143175, loss: 0.0005560587269042077, Δloss: 0.0004678800702095032, #instance: 1049306
2024-07-16 03:06:52.746 -- auc: 0.7763828739496643, Δauc: 0.7998730063870317, pcoc: 1.351289845353937, Δpcoc: 1.2095346683409156, loss: 0.0005558476360035436, Δloss: 0.0004450981616973877, #instance: 1051306
2024-07-16 03:06:53.369 -- auc: 0.7764313931450333, Δauc: 0.8020526297098618, pcoc: 1.3501353351865402, Δpcoc: 0.8446390299961485, loss: 0.0005558691612495812, Δloss:



2024-07-16 03:07:47.266 -- auc: 0.7793016996538739, Δauc: 0.7638359934490411, pcoc: 1.3200062653250493, Δpcoc: 0.949004663611358, loss: 0.0005510085029214629, Δloss: 0.0005580779314041138, #instance: 1153306
2024-07-16 03:07:48.513 -- auc: 0.7794004588836299, Δauc: 0.8342925683146423, pcoc: 1.3192807648021145, Δpcoc: 0.9294358491897583, loss: 0.0005509195122847412, Δloss: 0.0004996027946472168, #instance: 1155306
2024-07-16 03:07:48.524 -- auc: 0.7793889067907023, Δauc: 0.7710424800148726, pcoc: 1.3189838144710864, Δpcoc: 1.130055915225636, loss: 0.0005507977572924943, Δloss: 0.0004804656207561493, #instance: 1157306
2024-07-16 03:07:53.360 -- auc: 0.7794979562882491, Δauc: 0.8376229624631557, pcoc: 1.317907436762101, Δpcoc: 0.7975716056494877, loss: 0.0005507764529508813, Δloss: 0.0005384486317634582, #instance: 1159306
2024-07-16 03:07:54.645 -- auc: 0.779547065662117, Δauc: 0.8096814060438973, pcoc: 1.317606593416897, Δpcoc: 1.1210369509319926, loss: 0.0005506042055001466, Δloss: 0.



2024-07-16 03:08:47.401 -- auc: 0.7823610449308036, Δauc: 0.8342474745227182, pcoc: 1.2938279710250842, Δpcoc: 1.045012090517127, loss: 0.0005463508598751982, Δloss: 0.000455095499753952, #instance: 1257306
2024-07-16 03:08:47.796 -- auc: 0.7824114421293108, Δauc: 0.813537569369348, pcoc: 1.2936132246244625, Δpcoc: 1.1412032094112663, loss: 0.000546188941489829, Δloss: 0.0004443984627723694, #instance: 1259306
2024-07-16 03:08:48.303 -- auc: 0.7825228722114921, Δauc: 0.8597155484572703, pcoc: 1.2935423519510088, Δpcoc: 1.235003439155785, loss: 0.0005459192515935915, Δloss: 0.00037610819935798644, #instance: 1261306
2024-07-16 03:08:48.728 -- auc: 0.7826356202422331, Δauc: 0.8501134949110346, pcoc: 1.2929991041585012, Δpcoc: 0.953768574461645, loss: 0.0005457962446949117, Δloss: 0.00046822157502174376, #instance: 1263306
2024-07-16 03:08:49.117 -- auc: 0.7826965734254623, Δauc: 0.818308768920177, pcoc: 1.292427609329848, Δpcoc: 0.9349867519067259, loss: 0.0005456999002012039, Δloss: 0.0



2024-07-16 03:08:55.294 -- auc: 0.7828649816533401, Δauc: 0.7960969387755102, pcoc: 1.2907882231924204, Δpcoc: 1.3082618832588195, loss: 0.0005454735411990152, Δloss: 0.00043759599328041076, #instance: 1271306
2024-07-16 03:08:56.226 -- auc: 0.7828759095789395, Δauc: 0.7895018613279547, pcoc: 1.2906275512962995, Δpcoc: 1.1697435030123082, loss: 0.0005453130765499928, Δloss: 0.00044331324100494384, #instance: 1273306
2024-07-16 03:08:56.555 -- auc: 0.782922324631508, Δauc: 0.8146323645553625, pcoc: 1.2899902340943403, Δpcoc: 0.9114238161307114, loss: 0.0005452581928370064, Δloss: 0.0005103163123130798, #instance: 1275306
2024-07-16 03:08:57.060 -- auc: 0.7829774227635575, Δauc: 0.8169227711586942, pcoc: 1.2894478395225188, Δpcoc: 0.9469644098865743, loss: 0.0005451632174159891, Δloss: 0.00048460185527801516, #instance: 1277306
2024-07-16 03:08:57.458 -- auc: 0.7830369124169546, Δauc: 0.8196757697361412, pcoc: 1.2885074987430414, Δpcoc: 0.786089905377092, loss: 0.0005451723349732637, Δlo



2024-07-16 03:09:03.617 -- auc: 0.7832790578902815, Δauc: 0.8666163556531284, pcoc: 1.2873095920014739, Δpcoc: 1.0150482684373856, loss: 0.0005448028073192508, Δloss: 0.0004684283577153813, #instance: 1285168




2024-07-16 03:09:04.665 -- auc: 0.7833355956644583, Δauc: 0.820133518167226, pcoc: 1.2871128906074416, Δpcoc: 1.1410902057375227, loss: 0.0005446269181808909, Δloss: 0.0004316033720970154, #instance: 1287168
2024-07-16 03:09:05.039 -- auc: 0.783442273328161, Δauc: 0.8422165547520661, pcoc: 1.2859774858332245, Δpcoc: 0.7320951037108898, loss: 0.0005446823070798159, Δloss: 0.0005803297162055969, #instance: 1289168
2024-07-16 03:09:05.462 -- auc: 0.7834073718749082, Δauc: 0.762300418374317, pcoc: 1.2855850554293147, Δpcoc: 1.0298103640476863, loss: 0.0005446402281936375, Δloss: 0.0005175168514251709, #instance: 1291168
2024-07-16 03:09:06.283 -- auc: 0.7834790650542306, Δauc: 0.828970160473249, pcoc: 1.2851798020180987, Δpcoc: 1.0150137353450694, loss: 0.0005445207670345197, Δloss: 0.00046739855408668517, #instance: 1293168
2024-07-16 03:09:07.467 -- auc: 0.783549093600307, Δauc: 0.8304476476809545, pcoc: 1.2848991627072854, Δpcoc: 1.0800977307696675, loss: 0.000544355866146864, Δloss: 0.



2024-07-16 03:09:12.357 -- auc: 0.7835596172474558, Δauc: 0.7906722241254616, pcoc: 1.2842985712435375, Δpcoc: 0.9348099496629503, loss: 0.0005443605662675867, Δloss: 0.0005474042892456055, #instance: 1297168
2024-07-16 03:09:12.886 -- auc: 0.7836621918346321, Δauc: 0.8507416879795396, pcoc: 1.2839870054892186, Δpcoc: 1.066050222184923, loss: 0.0005442053123649028, Δloss: 0.00044351011514663695, #instance: 1299168
2024-07-16 03:09:14.178 -- auc: 0.7837302259700358, Δauc: 0.8277834699453552, pcoc: 1.28355673331476, Δpcoc: 1.0009934107462566, loss: 0.0005440984788491473, Δloss: 0.0004747011363506317, #instance: 1301168
2024-07-16 03:09:15.210 -- auc: 0.7838205573640724, Δauc: 0.837382474972403, pcoc: 1.2828854959046536, Δpcoc: 0.8904602174405698, loss: 0.0005440567586252274, Δloss: 0.0005169142484664917, #instance: 1303168
2024-07-16 03:09:19.983 -- auc: 0.7838610164170688, Δauc: 0.8095555642725454, pcoc: 1.2828085602259993, Δpcoc: 1.2204235883859487, loss: 0.0005438538621931378, Δloss: 

                                                                                

2024-07-16 03:09:27.925 -- auc: 0.7839065242436158, Δauc: 0.7606320717908701, pcoc: 1.2816720398513064, Δpcoc: 1.0844598900188098, loss: 0.0005435816696920206, Δloss: 0.0006334474876491779, #instance: 1311645
2024-07-16 03:09:27.935 -- auc: 0.7839545457010328, Δauc: 0.8335021097046413, pcoc: 1.2813592873247428, Δpcoc: 0.9497269332408905, loss: 0.0005437663865995976, Δloss: 0.0007431762699237086, #instance: 1312860
2024-07-16 03:09:27.945 -- auc: 0.7840202067993857, Δauc: 0.8535724452035297, pcoc: 1.2809751588197948, Δpcoc: 0.8732739679515362, loss: 0.0005437249852831285, Δloss: 0.0004984675894966729, #instance: 1314061
2024-07-16 03:09:27.954 -- auc: 0.7840727525444542, Δauc: 0.87704802259887, pcoc: 1.2811067188550616, Δpcoc: 1.5606367111206054, loss: 0.0005435780537432858, Δloss: 0.0003422464518900087, #instance: 1315020
2024-07-16 03:09:27.964 -- auc: 0.7840840785750797, Δauc: 0.7954792658055745, pcoc: 1.2806182923373604, Δpcoc: 0.8912690937519073, loss: 0.000543572396681654, Δloss: 

saving model to s3://mv-mtg-di-for-poc-datalab/output/dev/model_out/

Get aws endpoint from env: 
[WARN] 2024-07-16 03:09:28.435 STSAssumeRoleWithWebIdentityCredentialsProvider [140237118707520] Token file must be specified to use STS AssumeRole web identity creds provider.
[2024-07-16 03:09:28.435] [info] [s3_sdk_filesys.cpp:357] Try to open S3 stream: s3://mv-mtg-di-for-poc-datalab/output/dev/model_ou

[2024-07-16 03:13:40.911] [info] C[0]:9 has stopped.
[2024-07-16 03:13:40.919] [info] PS job with coordinator address 172.31.0.41:51327 stopped.


[2024-07-16 03:13:41.190] [info] PS job with coordinator address 172.31.0.41:51327 stopped.
[2024-07-16 03:13:41.191] [info] PS job with coordinator address 172.31.0.41:51327 stopped.
[2024-07-16 03:13:41.204] [info] PS job with coordinator address 172.31.0.41:51327 stopped.
[2024-07-16 03:13:41.228] [info] PS job with coordinator address 172.31.0.41:51327 stopped.
[2024-07-16 03:13:41.231] [info] PS job with coordinator address 172.31.0.41:51327 stopped.
[2024-07-16 03:13:41.235] [info] PS job with coordinator address 172.31.0.41:51327 stopped.
[2024-07-16 03:13:41.243] [info] PS job with coordinator address 172.31.0.41:51327 stopped.
[2024-07-16 03:13:41.244] [info] PS job with coordinator address 172.31.0.41:51327 stopped.


## Evaluate the Model

To evaluate our model, we use the ``ms.input.read_s3_csv()`` function again to load the test dataset. The cell below reads an ORC file (``format='orc'``) and passes the same column delimiter ``'\t'`` and multi-value delimiter ``"\001"``.

In [17]:
# test_dataset_path = ROOT_DIR + '/data/test/day_1_0.001_test.csv'
# test_dataset = ms.input.read_s3_csv(spark_session, test_dataset_path, delimiter='\t', column_names=column_names)

test_dataset_path = 's3://mv-mtg-di-for-poc-datalab/2024/06/15/00/part-00000-f79b9ee6-aaf5-4117-88d5-44eea69dcea3-c000.snappy.orc'
# test_dataset = spark_session.read.orc(test_dataset_path)
test_dataset = ms.input.read_s3_csv(spark_session, 
                                     test_dataset_path, 
                                     format='orc',
                                     delimiter='\t', 
                                     multivalue_delimiter="\001", 
                                     column_names=column_names,
                                     multivalue_column_names=column_names[:-1])
# test_dataset.printSchema()

ignore shuffle


Next, we call the ``model.transform()`` method on the test dataset. It adds a column named ``rawPrediction`` that holds the model's predicted score for each row. For ease of integration with Spark MLlib, ``model.transform()`` also adds a column named ``label`` that holds the actual labels.

Like the training process, this takes several minutes, and you can follow the progress in the output of the cell. The transformed test dataset is stored in the ``result`` variable.

In [18]:
result = model.transform(test_dataset)

[2024-07-16 03:13:44.509] [info] PS job with coordinator address 172.31.0.41:59757 started.
[2024-07-16 03:13:44.509] [info] PSRunner::RunPS: pid: 22031, tid: 23805, thread: 0x7fe557f91700
[2024-07-16 03:13:44.509] [info] PSRunner::RunPSCoordinator: pid: 22031, tid: 23805, thread: 0x7fe557f91700
[2024-07-16 03:13:44.510] [info] ActorProcess::Receiving: Coordinator pid: 22031, tid: 23808, thread: 0x7fe569798700
[2024-07-16 03:13:44.534] [info] C[0]:9: The coordinator has connected to 8 servers and 8 workers.
PS Coordinator node C[0]:9 is ready.


[2024-07-16 03:13:44.529] [info] PS job with coordinator address 172.31.0.41:59757 started.
[2024-07-16 03:13:44.529] [info] PSRunner::RunPS: pid: 22375, tid: 23820, thread: 0x7f8b52f4b700
[2024-07-16 03:13:44.529] [info] PSRunner::RunPSWorker: pid: 22375, tid: 23820, thread: 0x7f8b52f4b700
ps agent registered for process 22375 thread 0x7f8b7fa51740
[2024-07-16 03:13:44.530] [info] PS job with coordinator address 172.31.0.41:59757 started.
[2024-07-16 03:13:44.530] [info] PSRunner::RunPS: pid: 22406, tid: 23824, thread: 0x7f8b52f4b700
[2024-07-16 03:13:44.530] [info] PSRunner::RunPSWorker: pid: 22406, tid: 23824, thread: 0x7f8b52f4b700
[2024-07-16 03:13:44.530] [info] PS job with coordinator address 172.31.0.41:59757 started.
[2024-07-16 03:13:44.530] [info] PSRunner::RunPS: pid: 22380, tid: 23825, thread: 0x7f8b52f4b700
[2024-07-16 03:13:44.530] [info] PSRunner::RunPSWorker: pid: 22380, tid: 23825, thread: 0x7f8b52f4b700
ps agent registered for process 22406 t

_11001
[2024-07-16 03:13:44.771] [info] add expr bkdr_hash(_11001, StringBKDRHashFunctionOption::name=_11001)
_11002
[2024-07-16 03:13:44.771] [info] add expr bkdr_hash(_11002, StringBKDRHashFunctionOption::name=_11002)
_11003
[2024-07-16 03:13:44.771] [info] add expr bkdr_hash(_11003, StringBKDRHashFunctionOption::name=_11003)
_11004
[2024-07-16 03:13:44.771] [info] add expr bkdr_hash(_11004, StringBKDRHashFunctionOption::name=_11004)
_11007
[2024-07-16 03:13:44.771] [info] add expr bkdr_hash(_11007, StringBKDRHashFunctionOption::name=_11007)
_11008
[2024-07-16 03:13:44.771] [info] add expr bkdr_hash(_11008, StringBKDRHashFunctionOption::name=_11008)
_11021
[2024-07-16 03:13:44.771] [info] add expr bkdr_hash(_11021, StringBKDRHashFunctionOption::name=_11021)
_11022
[2024-07-16 03:13:44.771] [info] add expr bkdr_hash(_11022, StringBKDRHashFunctionOption::name=_11022)
_11023
[2024-07-16 03:13:44.771] [info] add expr bkdr_hash(_11023, StringBKDRHashFunctionOption::name=_11023)
_11024
[20

2024-07-16 03:17:34.419 -- auc: 0.7817412414829121, Δauc: 0.7817412414829121, pcoc: 1.4757429549568577, Δpcoc: 1.4757429549568577, loss: 0.0004228756427764893, Δloss: 0.0004228756427764893, #instance: 2000
2024-07-16 03:17:36.714 -- auc: 0.8215325670498084, Δauc: 0.854364807007223, pcoc: 1.3041277212255142, Δpcoc: 1.165374979059747, loss: 0.00043420791625976564, Δloss: 0.000445540189743042, #instance: 4000
2024-07-16 03:17:37.440 -- auc: 0.8234624735562502, Δauc: 0.8248358974358975, pcoc: 1.2576682479293257, Δpcoc: 1.1786871433258057, loss: 0.00045395630598068235, Δloss: 0.0004934530854225159, #instance: 6000
2024-07-16 03:17:40.085 -- auc: 0.8258774105906745, Δauc: 0.8341541417384114, pcoc: 1.2729164732378082, Δpcoc: 1.3219286260150729, loss: 0.0004485385343432426, Δloss: 0.00043228521943092344, #instance: 8000




2024-07-16 03:17:43.131 -- auc: 0.8263654279279279, Δauc: 0.8286422061229259, pcoc: 1.217808922816967, Δpcoc: 1.040462806008079, loss: 0.00046418623328208923, Δloss: 0.0005267770290374756, #instance: 10000


[2024-07-16 03:17:46.713] [info] W[0]:12 has stopped.                           
[2024-07-16 03:17:46.713] [info] W[2]:44 has stopped.
[2024-07-16 03:17:46.713] [info] W[1]:28 has stopped.
[2024-07-16 03:17:46.713] [info] W[4]:76 has stopped.
[2024-07-16 03:17:46.713] [info] W[3]:60 has stopped.
[2024-07-16 03:17:46.713] [info] W[5]:92 has stopped.
[2024-07-16 03:17:46.713] [info] W[7]:124 has stopped.
[2024-07-16 03:17:46.713] [info] W[6]:108 has stopped.
[2024-07-16 03:17:46.713] [info] S[2]:42 has stopped.
[2024-07-16 03:17:46.713] [info] S[0]:10 has stopped.
[2024-07-16 03:17:46.713] [info] S[1]:26 has stopped.
[2024-07-16 03:17:46.713] [info] S[5]:90 has stopped.
[2024-07-16 03:17:46.713] [info] S[4]:74 has stopped.
[2024-07-16 03:17:46.713] [info] S[3]:58 has stopped.
[2024-07-16 03:17:46.713] [info] S[7]:122 has stopped.
[2024-07-16 03:17:46.713] [info] S[6]:106 has stopped.
[2024-07-16 03:17:46.714] [info] PS job with coordinator address 172.31.0.41:59757 stopped.
[2024-07-16 0

2024-07-16 03:17:46.607 -- auc: 0.8211806566109294, Δauc: 0.7903020892151327, pcoc: 1.1986445928142018, Δpcoc: 1.0927844842274983, loss: 0.0004772909062956444, Δloss: 0.0005612952204851004, #instance: 11560
2024-07-16 03:17:46.618 -- auc: 0.8237797902561834, Δauc: 0.8493577981651377, pcoc: 1.1813690301619078, Δpcoc: 1.023585557937622, loss: 0.0004819464279274083, Δloss: 0.0005299980619124004, #instance: 12680
2024-07-16 03:17:46.628 -- auc: 0.8244418633400065, Δauc: 0.8313761467889909, pcoc: 1.1717252688493558, Δpcoc: 1.0740018208821616, loss: 0.00048586983611618263, Δloss: 0.0005302884216819491, #instance: 13800
2024-07-16 03:17:46.664 -- auc: 0.8244418633400065, Δauc: 1.0, pcoc: 1.1717252688493558, Δpcoc: nan, loss: 0.00048586983611618263, Δloss: nan, #instance: 13800
2024-07-16 03:17:46.674 -- auc: 0.8244418633400065, Δauc: 1.0, pcoc: 1.1717252688493558, Δpcoc: nan, loss: 0.00048586983611618263, Δloss: nan, #instance: 13800
2024-07-16 03:17:46.684 -- auc: 0.8244418633400065, Δauc: 1

[2024-07-16 03:17:46.876] [info] PS job with coordinator address 172.31.0.41:59757 stopped.
[2024-07-16 03:17:46.889] [info] PS job with coordinator address 172.31.0.41:59757 stopped.
[2024-07-16 03:17:46.889] [info] PS job with coordinator address 172.31.0.41:59757 stopped.
[2024-07-16 03:17:46.890] [info] PS job with coordinator address 172.31.0.41:59757 stopped.
[2024-07-16 03:17:46.891] [info] PS job with coordinator address 172.31.0.41:59757 stopped.
[2024-07-16 03:17:46.892] [info] PS job with coordinator address 172.31.0.41:59757 stopped.
[2024-07-16 03:17:46.900] [info] PS job with coordinator address 172.31.0.41:59757 stopped.
[2024-07-16 03:17:46.912] [info] PS job with coordinator address 172.31.0.41:59757 stopped.
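
The progress lines printed in the cell output share a regular layout: a timestamp, the separator `` -- ``, then comma-separated ``name: value`` pairs. A minimal sketch of turning such a line into a Python dict is shown below. The helper name ``parse_progress_line`` is hypothetical and not part of MetaSpore, and the layout is inferred only from the output shown in this notebook, so treat it as an assumption rather than a documented log format.

```python
import re
from datetime import datetime

# One "name: value" pair, e.g. "auc: 0.82" or "#instance: 13800".
_FIELD = re.compile(r"([^\s:,]+): ([^,\s]+)")


def parse_progress_line(line):
    """Parse one progress line into a dict; return None for any other log line."""
    head, sep, tail = line.partition(" -- ")
    if not sep:
        return None
    try:
        timestamp = datetime.strptime(head.strip(), "%Y-%m-%d %H:%M:%S.%f")
    except ValueError:
        return None
    record = {"timestamp": timestamp}
    for name, value in _FIELD.findall(tail):
        # "#instance" is a count; every other field is a float (possibly nan).
        record[name] = int(value) if name == "#instance" else float(value)
    return record


# Usage with a line copied from the evaluation output above.
line = (
    "2024-07-16 03:17:46.628 -- auc: 0.8244418633400065, "
    "Δauc: 0.8313761467889909, pcoc: 1.1717252688493558, "
    "Δpcoc: 1.0740018208821616, loss: 0.00048586983611618263, "
    "Δloss: 0.0005302884216819491, #instance: 13800"
)
record = parse_progress_line(line)
print(record["auc"], record["#instance"])
```

A field whose value is cut off entirely is skipped, so lines truncated in the cell output yield partial records; a value that is cut off mid-number is still parsed and should not be trusted.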


``result`` is a normal PySpark DataFrame and can be inspected with the usual DataFrame methods.

In [19]:
result.show(5)


In [20]:
result_pd = result.toPandas()
result_pd[result_pd['label']==0].head()


                                                                                

Unnamed: 0,_11001,_11002,_11003,_11004,_11007,_11008,_11021,_11022,_11023,_11024,...,_80144,_80145,_80146,_80147,_80148,_80149,_80150,_80151,label,rawPrediction
0,"[120248835527556541700.6931471824645996, 1493...","[141772368161457520630.6931471824645996, 1493...",[1187569319056953.4339871406555176],[1187569319062941.3862943649291992],"[120248835527556541700.6931471824645996, 1417...",[1187569319390603.4339871406555176],[93351669474747412440.6931471824645996],[57474464586887399740.6931471824645996],[1187569318903570.6931471824645996],[1187569318902660.6931471824645996],...,"[108444874906803988331.0, 1442976750699403878...",[120778770410835650001.0],[137161293658021960231.0],[177868718758286701041.0],[120778770410834821051.0],[120778770410834530021.0],[11264345888808339641.0],[120778770410778302591.0],0.0,0.155403
1,"[10787832899110671220.6931471824645996, 11267...",[],[1187569319056953.4339871406555176],[],"[118330750822017570580.6931471824645996, 1272...",[1187569319390602.995732307434082],"[10787832899110671220.6931471824645996, 11960...","[112678252360946792430.6931471824645996, 1250...",[1187569318903571.6094379425048828],[1187569318902661.945910096168518],...,"[10159436557395306301.0, 11545538683475017864...","[130216112555249546711.0, 2597973395502793715...",[62423217256752089271.0],[130216112555249132891.0],"[130216112555249009521.0, 1392697776517784263...","[130216112555249051311.0, 1751367607848644577...",[62423217256694666601.0],[130216112555210877641.0],0.0,0.192204
2,"[118074487017061505650.6931471824645996, 1183...","[103441021300152843870.6931471824645996, 1069...",[1187569319056952.5649492740631104],[1187569319062942.4849066734313965],"[103441021300152843870.6931471824645996, 1069...",[1187569319390603.1354942321777344],[],[56781507164361920530.6931471824645996],[],[1187569318902660.6931471824645996],...,"[127059589628045409081.0, 1770294181161403403...",[124069974375180026861.0],[20143758067925801991.0],[127059589628044381051.0],"[127059589628044259601.0, 1770294181161424728...",[124069974375179531441.0],[20143758068203763001.0],[127059589627985483241.0],0.0,0.009473
3,"[100526117827408219330.6931471824645996, 1155...","[101574285697699037070.6931471824645996, 1084...",[1187569319056954.234106540679932],[1187569319062945.164785861968994],"[100526117827408219330.6931471824645996, 1015...",[1187569319390605.283203601837158],"[132182422960092266980.6931471824645996, 1683...","[103057321266917028030.6931471824645996, 1031...",[1187569318903571.0986123085021973],[1187569318902663.97029185295105],...,"[106001042736559328931.0, 1103157403546153005...","[37683130965266694421.0, 6048009056449158698...",[63340032098683249861.0],[60480090564491828621.0],"[103576476166836788421.0, 1063744915184569113...","[121314994446557680821.0, 3768313096526881540...",[91696515358083713801.0],[83815087634930800461.0],0.0,0.08659
4,"[115633972886692014871.3862943649291992, 1231...","[115633972886692014870.6931471824645996, 1225...",[1187569319056953.178053855895996],[1187569319062942.7725887298583984],"[115633972886692014871.6094379425048828, 1225...",[1187569319390603.5835189819335938],[57474464586887399740.6931471824645996],[73737875405770893260.6931471824645996],[1187569318903570.6931471824645996],[1187569318902660.6931471824645996],...,"[17575797228106840461.0, 8425347615015788875...",[120362844306992554561.0],[85052345276137848811.0],[17575797228106177231.0],[63361388077108851291.0],[63361388077108892421.0],[45775881559041977191.0],[63361388079422640031.0],0.0,0.000248


In [21]:
result_pd[result_pd['label']==1].head()

Unnamed: 0,_11001,_11002,_11003,_11004,_11007,_11008,_11021,_11022,_11023,_11024,...,_80144,_80145,_80146,_80147,_80148,_80149,_80150,_80151,label,rawPrediction
63,"[104620418056146081330.6931471824645996, 1062...",[],[1187569319056954.753590106964111],[],"[104620418056146081330.6931471824645996, 1062...",[1187569319390604.430816650390625],"[127892341845370442530.6931471824645996, 1563...","[110950711939454100660.6931471824645996, 1183...",[1187569318903571.7917594909667969],[1187569318902663.367295742034912],...,"[105933614083154295231.0, 1062522390975181368...","[129596072363121941871.0, 1527917215876760072...",[156051290013057733801.0],[15279172158766531601.0],"[105933614083153836971.0, 1072649206291084363...","[128620735839288783061.0, 1295960723631221070...",[156051290012978026391.0],[15279172158731226111.0],1.0,0.096737
113,"[142318104563597162540.6931471824645996, 3526...","[142318104563597162541.3862943649291992, 3610...",[1187569319056953.9889841079711914],[1187569319062945.459585666656494],"[142318104563597162541.6094379425048828, 3526...",[1187569319390605.568344593048096],[36107841458994229780.6931471824645996],"[39630214814404936760.6931471824645996, 40457...",[1187569318903570.6931471824645996],[1187569318902662.5649492740631104],...,"[106061919320742175361.0, 1060631373873155932...",[136917921933945535191.0],[50928361379492542301.0],[136917921933944791131.0],"[106081817414152852661.0, 1369179219339449980...",[136917921933945075631.0],[50928361379415778851.0],[136917921933865744841.0],1.0,0.03093
132,"[100101933848291012220.6931471824645996, 1016...","[107734475477308076950.6931471824645996, 1114...",[1187569319056955.288267135620117],[1187569319062944.574710845947266],"[100101933848291012220.6931471824645996, 1016...",[1187569319390605.463831901550293],"[172840892160596556460.6931471824645996, 1730...","[107734475477308076951.0986123085021973, 1102...",[1187569318903572.397895336151123],[1187569318902663.6888794898986816],...,"[110204691836615960801.0, 1122047231687370522...",[52646567680003071521.0],[67103645266423977451.0],[135869932307155051571.0],"[110204691836614811401.0, 1358699323071532923...","[135869932307153329031.0, 6425259470847241092...",[183937062994583982991.0],[64252594708416056731.0],1.0,0.100108
315,[],[],[],[],[],[],[],[],[],[],...,"[109607787170272620181.0, 1107201198244415906...",[126855478711768204071.0],[26881475138864686101.0],[114504604610535394681.0],[126855478711768645441.0],[126855478711767703551.0],[11264281730343989061.0],[126855478711711187481.0],1.0,0.007977
329,"[103767651444137829250.6931471824645996, 1039...","[100978332819887972301.0986123085021973, 1019...",[1187569319056954.672828674316406],[1187569319062944.9972124099731445],"[101916632624280216970.6931471824645996, 1071...",[1187569319390604.9628448486328125],"[103918480814268574851.3862943649291992, 1064...","[100978332819887972301.0986123085021973, 1037...",[1187569318903573.4339871406555176],[1187569318902664.8978400230407715],...,"[116579843032178911731.0, 1191277981337401093...","[145517052396087058301.0, 1552937270282783357...",[50088960850490903331.0],[155293727028277601971.0],"[112179979732406231741.0, 1191277981337422315...","[119127798133742272031.0, 1455170523960872540...",[61040712848087343141.0],[119127798133661016761.0],1.0,0.060428


Finally, we use ``pyspark.ml.evaluation.BinaryClassificationEvaluator`` to compute the test AUC.

In [22]:
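# Import the evaluator explicitly; a bare ``import pyspark`` does not load the submodule by itself.
from pyspark.ml.evaluation import BinaryClassificationEvaluator

# The defaults (rawPredictionCol='rawPrediction', labelCol='label',
# metricName='areaUnderROC') match the columns added by model.transform().
evaluator = BinaryClassificationEvaluator()
test_auc = evaluator.evaluate(result)
print('test_auc: %g' % test_auc)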

test_auc: 0.82442
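
To make the metric concrete, here is a minimal pure-Python sketch of the area under the ROC curve, written in its rank-sum (Mann-Whitney) form: the AUC is the probability that a randomly chosen positive row receives a higher score than a randomly chosen negative row. The function ``roc_auc`` is a hypothetical helper for illustration only. It is not how Spark computes the value, so small numerical differences from ``BinaryClassificationEvaluator`` are to be expected.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via rank sums; tied scores share an average rank."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    rank_sum_pos = 0.0  # sum of the (1-based) ranks of the positive rows
    n_pos = 0
    i = 0
    while i < n:
        # Find the group of rows [i, j) that share the same score.
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the ranks i+1 .. j
        pos_in_group = sum(1 for k in range(i, j) if pairs[k][1] == 1)
        rank_sum_pos += avg_rank * pos_in_group
        n_pos += pos_in_group
        i = j
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC is undefined unless both classes are present")
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


# Toy data, not taken from the demo dataset.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_labels, toy_scores))  # 0.75
```

On small samples the same function could be applied to the ``label`` and ``rawPrediction`` columns of ``result_pd``; for the full test set the Spark evaluator above remains the practical choice.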


When all computations are done, we should call the ``stop()`` method of ``spark_session`` to make sure all the resources are released.

In [23]:
# spark_session.stop()
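
If an earlier step raises an exception, an explicit ``stop()`` call at the end of the notebook is never reached. One way to guard against that is a small context manager that always calls ``stop()`` on exit. The sketch below is an illustration under stated assumptions: ``stopping`` is a hypothetical helper, and ``FakeSession`` is a stand-in so the example runs without Spark. Any object with a ``stop()`` method, such as ``spark_session``, could be passed in its place.

```python
from contextlib import contextmanager


@contextmanager
def stopping(session):
    """Yield `session` and call its stop() method on exit, even after an error."""
    try:
        yield session
    finally:
        session.stop()


class FakeSession:
    """Stand-in for a Spark session, used only to keep this sketch self-contained."""

    def __init__(self):
        self.stopped = False

    def stop(self):
        self.stopped = True


fake = FakeSession()
try:
    with stopping(fake) as session:
        raise RuntimeError("simulated failure during evaluation")
except RuntimeError:
    pass

print(fake.stopped)  # True: stop() ran despite the exception
```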

## Summary

We illustrated how to train and evaluate a neural network model in MetaSpore. Users familiar with PyTorch and Spark MLlib should be able to get started easily, which is the design goal of MetaSpore.