# Bing Data Bot Detection Training Pipeline

## Prerequisites

- Create a Windows compute cluster; see https://eemo.visualstudio.com/TEE/_git/TEEGit?path=%2FOffline%2FSmartCompose%2Faml%2FSetup_AML_Windows_Compute_Cluster.md&version=GBhaoz%2Fsmartcompose&_a=preview for instructions.
- Manually install the dependencies on the newly created Windows compute:
```
pip install azureml-pipeline-wrapper[notebooks]==0.1.0.16323496 azureml-sdk==0.1.0.16323496 --extra-index-url https://azuremlsdktestpypi.azureedge.net/CLI-SDK-Runners-Validation/16323496
pip install pandas pyarrow
```

In [1]:
import os
from azureml.core import Workspace
from azureml.core.compute import AmlCompute, ComputeTarget
from azureml.pipeline.wrapper import Module, Pipeline, dsl

## Configure workspace and compute

In [2]:
# configure workspace information here.
workspace = Workspace.get(
    name='heta-EUS',
    subscription_id='e9b2ec51-5c94-4fa8-809a-dc1e695e4896',
    resource_group='thy-experiment'
)
print(workspace.name, workspace.resource_group, workspace.location, workspace.subscription_id, sep='\n')

If you run your code in unattended mode, i.e., where you can't give a user input, then we recommend to use ServicePrincipalAuthentication or MsiAuthentication.
Please refer to aka.ms/aml-notebook-auth for different authentication mechanisms in azureml-sdk.


heta-EUS
thy-experiment
eastus
e9b2ec51-5c94-4fa8-809a-dc1e695e4896
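
The warning above recommends `ServicePrincipalAuthentication` for unattended runs. The following is a minimal sketch of how that might look, not a tested configuration for this workspace. The environment variable names (`AML_TENANT_ID`, `AML_SP_ID`, `AML_SP_PASSWORD`) and both helper functions are hypothetical; the SDK is imported lazily so that the credential helper can be exercised without it.

```python
import os


def load_sp_credentials(env=os.environ):
    """Read service principal settings from environment variables.

    The variable names are hypothetical; use whatever your scheduler provides.
    """
    names = ("AML_TENANT_ID", "AML_SP_ID", "AML_SP_PASSWORD")
    missing = [n for n in names if not env.get(n)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {
        "tenant_id": env["AML_TENANT_ID"],
        "service_principal_id": env["AML_SP_ID"],
        "service_principal_password": env["AML_SP_PASSWORD"],
    }


def get_workspace_unattended(name, subscription_id, resource_group, env=os.environ):
    """Return the workspace without prompting for an interactive login."""
    # Imported lazily so the credential helper works without the SDK installed.
    from azureml.core import Workspace
    from azureml.core.authentication import ServicePrincipalAuthentication

    auth = ServicePrincipalAuthentication(**load_sp_credentials(env))
    return Workspace.get(
        name=name,
        auth=auth,
        subscription_id=subscription_id,
        resource_group=resource_group,
    )
```

In the notebook, `get_workspace_unattended('heta-EUS', ..., ...)` would then stand in for the `Workspace.get(...)` call in cell 2, assuming the service principal has access to the workspace.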


In [3]:
from azureml.core.compute_target import ComputeTargetException

# Name of the Windows compute cluster (created beforehand; see the prerequisites above).
windows_compute_target = "windows-cluster"
# Name of the AML compute cluster.
aml_compute_target = 'aml-compute'
try:
    windows_compute = AmlCompute(workspace, windows_compute_target)
    print("Found existing windows compute target: {}".format(windows_compute_target))
except ComputeTargetException:
    # The Windows cluster cannot be provisioned from here; it must be set up manually.
    print("Need to create a windows compute")

try:
    aml_compute = AmlCompute(workspace, aml_compute_target)
    print("Found existing aml compute target: {}".format(aml_compute_target))
except ComputeTargetException:
    print("Creating new aml compute target: {}".format(aml_compute_target))
    provisioning_config = AmlCompute.provisioning_configuration(vm_size="STANDARD_D2_V2",
                                                                min_nodes=0,
                                                                max_nodes=4)
    aml_compute = ComputeTarget.create(workspace, aml_compute_target, provisioning_config)
    aml_compute.wait_for_completion(show_output=True, min_node_count=None, timeout_in_minutes=20)

Found existing windows compute target: windows-cluster
Found existing aml compute target: aml-compute


## Prepare dataset

In [4]:
# prepare dataset.
data_folder = './data'
label_id_file_name = 'bing_data_label.tsv'
fea_id_file_name = 'bing_data_feature.tsv'
target_path = 'bot-detection'
label_id_dataset_name = 'bing-data-label-dataset'
fea_id_dataset_name = 'bing-data-feature-dataset'
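
Before uploading, it may help to confirm that both TSV files are actually present in the data folder. A small sketch using only the standard library; `missing_inputs` is a hypothetical helper, and the folder and file names are the ones defined in the cell above.

```python
from pathlib import Path


def missing_inputs(folder, file_names):
    """Return the names from file_names that are not regular files in folder."""
    return [name for name in file_names if not (Path(folder) / name).is_file()]


# Same folder and file names as in the cell above.
missing = missing_inputs('./data', ['bing_data_label.tsv', 'bing_data_feature.tsv'])
if missing:
    print("Missing input files:", missing)
```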

In [5]:
# upload dataset
datastore = workspace.get_default_datastore()
datastore.upload_files(files=[f'{data_folder}/{label_id_file_name}'], target_path=target_path, overwrite=True, show_progress=True)
datastore.upload_files(files=[f'{data_folder}/{fea_id_file_name}'], target_path=target_path, overwrite=True, show_progress=True)

Uploading an estimated of 1 files
Uploading ./data/bing_data_label.tsv
Uploaded ./data/bing_data_label.tsv, 1 files out of an estimated total of 1
Uploaded 1 files
Uploading an estimated of 1 files
Uploading ./data/bing_data_feature.tsv
Uploaded ./data/bing_data_feature.tsv, 1 files out of an estimated total of 1
Uploaded 1 files


$AZUREML_DATAREFERENCE_98b6c870873f452190175926ff63c1f0

In [6]:
# register dataset
from azureml.core.dataset import Dataset

label_id_dataset = Dataset.File.from_files((datastore, f'{target_path}/{label_id_file_name}'))
label_id_dataset = label_id_dataset.register(workspace, name=label_id_dataset_name, create_new_version=True)

fea_id_dataset = Dataset.File.from_files((datastore, f'{target_path}/{fea_id_file_name}'))
fea_id_dataset = fea_id_dataset.register(workspace, name=fea_id_dataset_name, create_new_version=True)

## Load modules

In [7]:
# load built-in modules
remove_dup_rows_func = Module.load(workspace, namespace='azureml', name='Remove Duplicate Rows')
join_data_func = Module.load(workspace, namespace='azureml', name='Join Data')
norm_data_func = Module.load(workspace, namespace='azureml', name='Normalize Data')
split_data_func = Module.load(workspace, namespace='azureml', name='Split Data')

In [8]:
# load local modules
module_folder = r'./modules'
yaml_file_name = 'entry.spec.yaml'
tlc_train_module = Module.from_yaml(workspace, yaml_file=f'{module_folder}/tlc_train/{yaml_file_name}')
tlc_test_module = Module.from_yaml(workspace, yaml_file=f'{module_folder}/tlc_test/{yaml_file_name}')

## Set up a pipeline

In [9]:
# define pipeline
@dsl.pipeline(name='bot detection', description='bot detection', default_compute_target=aml_compute_target)
def generated_pipeline():
    remove_dup_id_in_label = remove_dup_rows_func(
        dataset=label_id_dataset,
        key_column_selection_filter_expression='{\"isFilter\":true,\"rules\":[{\"exclude\":false,\"ruleType\":\"ColumnNames\",\"columns\":[\"Id\"]}]}',
        retain_first_duplicate_row=True
    )
    
    remove_dup_id_in_fea = remove_dup_rows_func(
        dataset=fea_id_dataset,
        key_column_selection_filter_expression='{\"isFilter\":true,\"rules\":[{\"exclude\":false,\"ruleType\":\"ColumnNames\",\"columns\":[\"Id\"]}]}',
        retain_first_duplicate_row=True
    )
    
    join = join_data_func(
        dataset1=remove_dup_id_in_label.outputs.results_dataset,
        dataset2=remove_dup_id_in_fea.outputs.results_dataset,
        comma_separated_case_sensitive_names_of_join_key_columns_for_l='{\"isFilter\":true,\"rules\":[{\"exclude\":false,\"ruleType\":\"ColumnNames\",\"columns\":[\"Id\"]}]}',
        comma_separated_case_sensitive_names_of_join_key_columns_for_r='{\"isFilter\":true,\"rules\":[{\"exclude\":false,\"ruleType\":\"ColumnNames\",\"columns\":[\"Id\"]}]}',
        match_case=True,
        join_type="Inner Join",
        keep_right_key_columns_in_joined_table=False
    )
    
    norm = norm_data_func(
        dataset=join.outputs.results_dataset,
        transformation_method="MinMax",
        use_0_for_constant_columns_when_checked=True,
        columns_to_transform='{\"isFilter\":true,\"rules\":[{\"exclude\":false,\"ruleType\":\"ColumnNames\",\"columns\":[\"BD_NormalizedImpressionWithFDAuthUserCount\"]}]}'
    )
    
    split = split_data_func(
        dataset=norm.outputs.transformed_dataset,
        splitting_mode="Split Rows",
        fraction_of_rows_in_the_first_output_dataset=0.9,
        randomized_split=True,
        random_seed=0,
        stratified_split="False",
        stratification_key_column='{\"isFilter\":true,\"rules\":[{\"exclude\":false,\"ruleType\":\"ColumnNames\",\"columns\":[\"Label\"]}]}'
    )
    
    train = tlc_train_module(
        training_data=split.outputs.results_dataset1,
        predictor="FastRankClassification{nl=60 mil=100 iter=100 lr=0.1}",
        instances_reader="StreamingInstances",
        instances_settings="{header+ label=0 name=1}",
        cache_instances_in_memory="+",
        random_seed=123,
        use_threads="+",
        proportion_of_train_data_to_use=1,
        print_model_summary="+"
    )
    train.runsettings.configure(target=windows_compute_target)
    
    test = tlc_test_module(
        trained_model=train.outputs.trained_model,
        test_data=split.outputs.results_dataset2,
        instances_reader="StreamingInstances",
        instances_settings="{header+ label=0 name=1}",
        cache_instances_in_memory="+",
        random_seed=123,
        use_threads="+",
        proportion_of_train_data_to_use=1,
        print_model_summary="+"
    )
    test.runsettings.configure(target=windows_compute_target)
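
The column-selection arguments in the pipeline above are hand-written JSON strings that repeat the same structure several times. One way to cut down on typos is to generate them; the sketch below does this with `json.dumps`. `column_selection` is a hypothetical helper, not part of the SDK, and it assumes the modules accept exactly the compact JSON form used above.

```python
import json


def column_selection(*columns):
    """Build a column-selection filter expression that includes the given columns."""
    rule = {"exclude": False, "ruleType": "ColumnNames", "columns": list(columns)}
    # Compact separators reproduce the literal strings used in the pipeline above.
    return json.dumps({"isFilter": True, "rules": [rule]}, separators=(",", ":"))


print(column_selection("Id"))
# {"isFilter":true,"rules":[{"exclude":false,"ruleType":"ColumnNames","columns":["Id"]}]}
```

With this helper, `key_column_selection_filter_expression=column_selection("Id")` would pass the same string as the literal used in the pipeline definition.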

In [10]:
# create a pipeline
pipeline = generated_pipeline()

In [11]:
# validate pipeline and visualize the graph
pipeline.validate()

windows-cluster not found in workspace, assume this is an AmlCompute
windows-cluster not found in workspace, assume this is an AmlCompute


{'result': 'validation passed', 'errors': []}
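
When the notebook runs unattended, the validation result can be checked in code before submitting. The sketch below assumes `pipeline.validate()` returns a dict of the shape printed above, with an `errors` list; `ensure_valid` is a hypothetical helper.

```python
def ensure_valid(validation_result):
    """Raise if a validation result dict reports any errors; otherwise return it."""
    errors = validation_result.get("errors") or []
    if errors:
        raise ValueError("pipeline validation failed: {}".format(errors))
    return validation_result


# Shape taken from the output above; in the notebook this would be
# ensure_valid(pipeline.validate()).
ensure_valid({'result': 'validation passed', 'errors': []})
```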

In [12]:
# submit a pipeline run
pipeline.submit(experiment_name='bing_data_bot_detection').wait_for_completion()

windows-cluster not found in workspace, assume this is an AmlCompute
windows-cluster not found in workspace, assume this is an AmlCompute
Submitted PipelineRun e64b98e7-87cd-419b-b9f8-b03bb73b7c8e
Link to Azure Machine Learning Portal: https://ml.azure.com/experiments/bing_data_bot_detection/runs/e64b98e7-87cd-419b-b9f8-b03bb73b7c8e?wsid=/subscriptions/e9b2ec51-5c94-4fa8-809a-dc1e695e4896/resourcegroups/thy-experiment/workspaces/heta-EUS
PipelineRunId: e64b98e7-87cd-419b-b9f8-b03bb73b7c8e
Link to Azure Machine Learning Portal: https://ml.azure.com/experiments/bing_data_bot_detection/runs/e64b98e7-87cd-419b-b9f8-b03bb73b7c8e?wsid=/subscriptions/e9b2ec51-5c94-4fa8-809a-dc1e695e4896/resourcegroups/thy-experiment/workspaces/heta-EUS
PipelineRun Status: Running


StepRunId: 4202dc95-69a7-4c3a-afcf-a9e5e6887933
Link to Azure Machine Learning Portal: https://ml.azure.com/experiments/bing_data_bot_detection/runs/4202dc95-69a7-4c3a-afcf-a9e5e6887933?wsid=/subscriptions/e9b2ec51-5c94-4fa8-809a-d

2020-06-25 11:28:49,751 studio.modulehost    INFO       |   Invoking ModuleEntry(azureml.studio.modules.datatransform.manipulation.remove_duplicate_rows.remove_duplicate_rows; RemoveDuplicateRowsModule; run)
2020-06-25 11:28:49,751 studio.core          DEBUG      |   Input Ports:
2020-06-25 11:28:49,752 studio.core          DEBUG      |   |   Dataset = <azureml.studio.modulehost.cli_parser.CliInputValue object at 0x7f870b40f240>
2020-06-25 11:28:49,752 studio.core          DEBUG      |   Output Ports:
2020-06-25 11:28:49,752 studio.core          DEBUG      |   |   Results dataset = /mnt/batch/tasks/shared/LS_root/jobs/heta-eus/azureml/4202dc95-69a7-4c3a-afcf-a9e5e6887933/mounts/workspaceblobstore/azureml/4202dc95-69a7-4c3a-afcf-a9e5e6887933/Results_dataset
2020-06-25 11:28:49,752 studio.core          DEBUG      |   Parameters:
2020-06-25 11:28:49,752 studio.core          DEBUG      |   |   Key column selection filter expression = {"isFilter":true,"rules":[{"exclude":false,"ruleType":"C


StepRun(azureml://Remove Duplicate Rows) Execution Summary
StepRun( azureml://Remove Duplicate Rows ) Status: Finalizing
{'runId': '4202dc95-69a7-4c3a-afcf-a9e5e6887933', 'target': 'aml-compute', 'status': 'Finalizing', 'startTimeUtc': '2020-06-25T11:28:22.228473Z', 'properties': {'azureml.runsource': 'azureml.StepRun', 'ContentSnapshotId': '15db5008-6c13-4636-9e88-9dc17f6633b0', 'StepType': 'PythonScriptStep', 'azureml.moduleid': '4ae6c832-f2eb-588f-a6a3-4af9dee9494b', 'azureml.pipelinerunid': 'e64b98e7-87cd-419b-b9f8-b03bb73b7c8e', '_azureml.ComputeTargetType': 'amlcompute', 'ProcessInfoFile': 'azureml-logs/process_info.json', 'ProcessStatusFile': 'azureml-logs/process_status.json'}, 'inputDatasets': [{'dataset': {'id': 'c23c7999-fab0-4e11-b35a-454bebda780b'}, 'consumptionDetails': {'type': 'RunInput', 'inputName': 'Dataset', 'mechanism': 'Mount'}}], 'outputDatasets': [], 'runDefinition': {'script': 'urldecode_invoker.py', 'useAbsolutePath': False, 'arguments': ['python', '-m', 'azu




StepRunId: 2317315d-4532-46b6-b989-de9707209a3a
Link to Azure Machine Learning Portal: https://ml.azure.com/experiments/bing_data_bot_detection/runs/2317315d-4532-46b6-b989-de9707209a3a?wsid=/subscriptions/e9b2ec51-5c94-4fa8-809a-dc1e695e4896/resourcegroups/thy-experiment/workspaces/heta-EUS
StepRun( azureml://Remove Duplicate Rows ) Status: Queued
StepRun( azureml://Remove Duplicate Rows ) Status: Running

Streaming azureml-logs/65_job_prep-tvmps_a56482c300b240509ada69f6cfb808feb28e51b61dba1e1dc21739eef54f5503_d.txt
Entering job preparation. Current time:2020-06-25T11:29:12.180024

Streaming azureml-logs/70_driver_log.txt
Entering context manager injector. Current time:2020-06-25T11:29:17.655419
Initialize DatasetContextManager.
Starting the daemon thread to refresh tokens in background for process with pid = 103
Set Dataset Dataset's target path to /tmp/tmpfhquwfi1
Enter __enter__ of DatasetContextManager
SDK version: azureml-core==1.6.0.post1 azureml-dataprep==1.6.3
Processing 'D


Streaming azureml-logs/75_job_post-tvmps_a56482c300b240509ada69f6cfb808feb28e51b61dba1e1dc21739eef54f5503_d.txt
Entering job release. Current time:2020-06-25T11:29:41.018491
Failure while loading azureml_run_type_providers. Failed to load entrypoint azureml.PipelineRun = azureml.pipeline.core.run:PipelineRun._from_dto with exception (ruamel.yaml 0.15.89 (/azureml-envs/azureml_a62e52e5220d24c0f3e291434f99287c/lib/python3.6/site-packages), Requirement.parse('ruamel.yaml>0.16.7'), {'azureml-core'}).
Failure while loading azureml_run_type_providers. Failed to load entrypoint azureml.ReusedStepRun = azureml.pipeline.core.run:StepRun._from_reused_dto with exception (ruamel.yaml 0.15.89 (/azureml-envs/azureml_a62e52e5220d24c0f3e291434f99287c/lib/python3.6/site-packages), Requirement.parse('ruamel.yaml>0.16.7'), {'azureml-core'}).
Failure while loading azureml_run_type_providers. Failed to load entrypoint azureml.StepRun = azureml.pipeline.core.run:StepRun._from_dto with exception (ruamel.yam




StepRunId: 6793c3fb-2712-44ac-b36c-6b5c7b7632e5
Link to Azure Machine Learning Portal: https://ml.azure.com/experiments/bing_data_bot_detection/runs/6793c3fb-2712-44ac-b36c-6b5c7b7632e5?wsid=/subscriptions/e9b2ec51-5c94-4fa8-809a-dc1e695e4896/resourcegroups/thy-experiment/workspaces/heta-EUS
StepRun( azureml://Join Data ) Status: NotStarted
StepRun( azureml://Join Data ) Status: Queued
StepRun( azureml://Join Data ) Status: Running

Streaming azureml-logs/55_azureml-execution-tvmps_a56482c300b240509ada69f6cfb808feb28e51b61dba1e1dc21739eef54f5503_d.txt
2020-06-25T11:30:25Z Executing 'Copy ACR Details file' on 10.0.0.4
2020-06-25T11:30:25Z Copy ACR Details file succeeded on 10.0.0.4. Output: 
>>>   
>>>   
2020-06-25T11:30:25Z Starting output-watcher...
2020-06-25T11:30:25Z IsDedicatedCompute == True, won't poll for Low Pri Preemption
Login Succeeded
Using default tag: latest
latest: Pulling from azureml/azureml_1b19c9f6d5dc5a4da4df9e1ba0ccb2d7
Digest: sha256:d5173bb5fc75f022a73a7a0fd

2020-06-25 11:30:41,721 studio.core          INFO       |   |   |   Create sidecar file 'data.visualization' - End with 0.6315s elapsed.
2020-06-25 11:30:41,722 studio.core          INFO       |   |   |   Create sidecar file 'data.metadata' - Start:
2020-06-25 11:30:41,794 studio.core          INFO       |   |   |   Create sidecar file 'data.metadata' - End with 0.0724s elapsed.
2020-06-25 11:30:41,947 studio.common        INFO       |   |   |   Writing meta successfully, datatype=DataTypes.DATASET
2020-06-25 11:30:41,948 studio.core          INFO       |   |   |   Create data type file 'data_type.json' - Start:
2020-06-25 11:30:42,016 studio.core          INFO       |   |   |   Create data type file 'data_type.json' - End with 0.0681s elapsed.
2020-06-25 11:30:42,016 studio.core          INFO       |   |   Handle output port "Results dataset" - End with 1.0797s elapsed.
2020-06-25 11:30:42,016 studio.core          INFO       |   ModuleReflector._handle_output_ports - End with 1.0800s 




StepRunId: fe1d9a64-538a-4d8f-9fbe-62d098ae6ad6
Link to Azure Machine Learning Portal: https://ml.azure.com/experiments/bing_data_bot_detection/runs/fe1d9a64-538a-4d8f-9fbe-62d098ae6ad6?wsid=/subscriptions/e9b2ec51-5c94-4fa8-809a-dc1e695e4896/resourcegroups/thy-experiment/workspaces/heta-EUS
StepRun( azureml://Normalize Data ) Status: NotStarted
StepRun( azureml://Normalize Data ) Status: Queued

Streaming azureml-logs/55_azureml-execution-tvmps_a56482c300b240509ada69f6cfb808feb28e51b61dba1e1dc21739eef54f5503_d.txt
2020-06-25T11:31:25Z Executing 'Copy ACR Details file' on 10.0.0.4
2020-06-25T11:31:25Z Copy ACR Details file succeeded on 10.0.0.4. Output: 
>>>   
>>>   
2020-06-25T11:31:25Z Starting output-watcher...
2020-06-25T11:31:25Z IsDedicatedCompute == True, won't poll for Low Pri Preemption
Login Succeeded
Using default tag: latest
latest: Pulling from azureml/azureml_1b19c9f6d5dc5a4da4df9e1ba0ccb2d7
Digest: sha256:d5173bb5fc75f022a73a7a0fda3561c5b05573048b7963c1e96343402f5104


Streaming azureml-logs/75_job_post-tvmps_a56482c300b240509ada69f6cfb808feb28e51b61dba1e1dc21739eef54f5503_d.txt
Entering job release. Current time:2020-06-25T11:31:43.786181
Failure while loading azureml_run_type_providers. Failed to load entrypoint azureml.PipelineRun = azureml.pipeline.core.run:PipelineRun._from_dto with exception (ruamel.yaml 0.15.89 (/azureml-envs/azureml_a62e52e5220d24c0f3e291434f99287c/lib/python3.6/site-packages), Requirement.parse('ruamel.yaml>0.16.7'), {'azureml-core'}).
Failure while loading azureml_run_type_providers. Failed to load entrypoint azureml.ReusedStepRun = azureml.pipeline.core.run:StepRun._from_reused_dto with exception (ruamel.yaml 0.15.89 (/azureml-envs/azureml_a62e52e5220d24c0f3e291434f99287c/lib/python3.6/site-packages), Requirement.parse('ruamel.yaml>0.16.7'), {'azureml-core'}).
Failure while loading azureml_run_type_providers. Failed to load entrypoint azureml.StepRun = azureml.pipeline.core.run:StepRun._from_dto with exception (ruamel.yam




StepRunId: 6b18e3e3-99e6-4e16-a7fc-bce5e29b2937
Link to Azure Machine Learning Portal: https://ml.azure.com/experiments/bing_data_bot_detection/runs/6b18e3e3-99e6-4e16-a7fc-bce5e29b2937?wsid=/subscriptions/e9b2ec51-5c94-4fa8-809a-dc1e695e4896/resourcegroups/thy-experiment/workspaces/heta-EUS
StepRun( azureml://Split Data ) Status: NotStarted
StepRun( azureml://Split Data ) Status: Queued
StepRun( azureml://Split Data ) Status: Running

Streaming azureml-logs/55_azureml-execution-tvmps_a56482c300b240509ada69f6cfb808feb28e51b61dba1e1dc21739eef54f5503_d.txt
2020-06-25T11:32:53Z Starting output-watcher...
2020-06-25T11:32:53Z IsDedicatedCompute == True, won't poll for Low Pri Preemption
0c092d32cb1a355fe98284ee3f660f560664c472e9d178821e7fe0e02a7474cd

Streaming azureml-logs/65_job_prep-tvmps_a56482c300b240509ada69f6cfb808feb28e51b61dba1e1dc21739eef54f5503_d.txt
Entering job preparation. Current time:2020-06-25T11:32:55.662685
Starting job preparation. Current time:2020-06-25T11:32:56.37


Streaming azureml-logs/75_job_post-tvmps_a56482c300b240509ada69f6cfb808feb28e51b61dba1e1dc21739eef54f5503_d.txt

StepRun(azureml://Split Data) Execution Summary
StepRun( azureml://Split Data ) Status: Finalizing
{'runId': '6b18e3e3-99e6-4e16-a7fc-bce5e29b2937', 'target': 'aml-compute', 'status': 'Finalizing', 'startTimeUtc': '2020-06-25T11:32:51.475902Z', 'properties': {'azureml.runsource': 'azureml.StepRun', 'ContentSnapshotId': '15db5008-6c13-4636-9e88-9dc17f6633b0', 'StepType': 'PythonScriptStep', 'azureml.moduleid': 'fa76b4f0-d6e3-517f-9dc3-a16d6e02312a', 'azureml.pipelinerunid': 'e64b98e7-87cd-419b-b9f8-b03bb73b7c8e', '_azureml.ComputeTargetType': 'amlcompute', 'ProcessInfoFile': 'azureml-logs/process_info.json', 'ProcessStatusFile': 'azureml-logs/process_status.json'}, 'inputDatasets': [], 'outputDatasets': [], 'runDefinition': {'script': 'urldecode_invoker.py', 'useAbsolutePath': False, 'arguments': ['python', '-m', 'azureml.studio.modulehost.module_invoker', '--module-name=azu




StepRunId: d4dc213b-f43a-4e9d-a789-631c3c65e6ff
Link to Azure Machine Learning Portal: https://ml.azure.com/experiments/bing_data_bot_detection/runs/d4dc213b-f43a-4e9d-a789-631c3c65e6ff?wsid=/subscriptions/e9b2ec51-5c94-4fa8-809a-dc1e695e4896/resourcegroups/thy-experiment/workspaces/heta-EUS
StepRun( TLC: Train ) Status: NotStarted
StepRun( TLC: Train ) Status: Queued
StepRun( TLC: Train ) Status: Running

Streaming azureml-logs/65_job_prep-tvmps_6c862dcdf55fc702d9fdab02eb6f81796d806c272539e9545afe20f225dac1a7_d.txt
Entering job preparation. Current time:2020-06-25T11:34:42.490033
Starting job preparation. Current time:2020-06-25T11:34:45.476484
Extracting the control code.
fetching and extracting the control code on master node.
Retrieving project from snapshot: f8291825-d979-47e9-bfdb-6e5fa612f029
Starting the daemon thread to refresh tokens in background for process with pid = 9160
Starting project file download.
Finished project file download.
Entering job preparation. Current t

Uploading output 'Trained_Model_Summary'.
Uploading output 'Trained_Model_Ini_File'.
Exit __exit__ of DatasetContextManager

Streaming azureml-logs/75_job_post-tvmps_6c862dcdf55fc702d9fdab02eb6f81796d806c272539e9545afe20f225dac1a7_d.txt
Entering job release. Current time:2020-06-25T11:36:08.295869
Failure while loading azureml_run_type_providers. Failed to load entrypoint hyperdrive = azureml.train.hyperdrive:HyperDriveRun._from_run_dto with exception (entrypoints 0.2.3 (c:\anaconda\lib\site-packages), Requirement.parse('entrypoints<0.4.0,>=0.3.0'), {'flake8'}).
Starting job release. Current time:2020-06-25T11:36:19.557121
Logging experiment finalizing status in history service.
Starting the daemon thread to refresh tokens in background for process with pid = 9284
Entering context manager injector. Current time:2020-06-25T11:36:19.728945

StepRun(TLC: Train) Execution Summary
StepRun( TLC: Train ) Status: Finalizing
{'runId': 'd4dc213b-f43a-4e9d-a789-631c3c65e6ff', 'target': 'windows-c




StepRunId: 564c05c3-7580-4f44-b151-9f3de65eeddf
Link to Azure Machine Learning Portal: https://ml.azure.com/experiments/bing_data_bot_detection/runs/564c05c3-7580-4f44-b151-9f3de65eeddf?wsid=/subscriptions/e9b2ec51-5c94-4fa8-809a-dc1e695e4896/resourcegroups/thy-experiment/workspaces/heta-EUS
StepRun( TLC: Test ) Status: NotStarted
StepRun( TLC: Test ) Status: Queued
StepRun( TLC: Test ) Status: Running

Streaming azureml-logs/65_job_prep-tvmps_6c862dcdf55fc702d9fdab02eb6f81796d806c272539e9545afe20f225dac1a7_d.txt
Entering job preparation. Current time:2020-06-25T11:38:08.321791
Starting job preparation. Current time:2020-06-25T11:38:13.029414
Extracting the control code.
fetching and extracting the control code on master node.
Retrieving project from snapshot: 089da66a-7f5c-441d-aab0-0a43f384bc1c
Starting the daemon thread to refresh tokens in background for process with pid = 10120
Starting project file download.
Finished project file download.
Entering job preparation. Current tim



PipelineRun Execution Summary
PipelineRun Status: Completed
{'runId': 'e64b98e7-87cd-419b-b9f8-b03bb73b7c8e', 'status': 'Completed', 'startTimeUtc': '2020-06-25T11:27:52.108367Z', 'endTimeUtc': '2020-06-25T11:41:06.039051Z', 'properties': {'azureml.runsource': 'azureml.PipelineRun', 'runSource': 'Designer', 'runType': 'HTTP', 'azureml.parameters': '{}'}, 'inputDatasets': [], 'logFiles': {'logs/azureml/executionlogs.txt': 'https://hetaeus3271596160.blob.core.windows.net/azureml/ExperimentRun/dcid.e64b98e7-87cd-419b-b9f8-b03bb73b7c8e/logs/azureml/executionlogs.txt?sv=2019-02-02&sr=b&sig=drBkjIQqezqk5%2FC77A%2FL0ItLdnW%2Bjh6wcbnI3gtY3rM%3D&st=2020-06-25T11%3A31%3A10Z&se=2020-06-25T19%3A41%3A10Z&sp=r', 'logs/azureml/stderrlogs.txt': 'https://hetaeus3271596160.blob.core.windows.net/azureml/ExperimentRun/dcid.e64b98e7-87cd-419b-b9f8-b03bb73b7c8e/logs/azureml/stderrlogs.txt?sv=2019-02-02&sr=b&sig=QO2HAweKNdT%2Fr0TS7aqfurcL1PKyKY%2F8s0DjUENG39I%3D&st=2020-06-25T11%3A31%3A10Z&se=2020-06-25T19

<RunStatus.completed: 'Completed'>
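
The streamed output above interleaves status lines with driver logs, which makes it hard to see where each step ended up. As a rough aid, the sketch below extracts the last reported status per step from captured log text. It assumes the `StepRun( <name> ) Status: <status>` line format seen above; `last_status_per_step` is a hypothetical helper and the sample lines are copied from this run.

```python
import re

# Matches lines such as "StepRun( azureml://Join Data ) Status: Running".
STATUS_LINE = re.compile(r"^StepRun\(\s*(?P<step>.+?)\s*\) Status: (?P<status>\w+)$")


def last_status_per_step(log_text):
    """Map each step name to the last status reported for it in the log."""
    statuses = {}
    for line in log_text.splitlines():
        match = STATUS_LINE.match(line.strip())
        if match:
            statuses[match.group("step")] = match.group("status")
    return statuses


sample = """\
StepRun( azureml://Join Data ) Status: NotStarted
StepRun( azureml://Join Data ) Status: Queued
StepRun( azureml://Join Data ) Status: Running
StepRun(TLC: Train) Execution Summary
StepRun( TLC: Train ) Status: Finalizing
"""
print(last_status_per_step(sample))
```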