Copyright (c) 2012-2022 Esri R&D Center Zurich

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

  https://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
A copy of the license is available in the repository's LICENSE file.

# PyPRT - Dataset Collection

This notebook shows one way to collect data from CGA reports. We repeatedly generate models from a set of initial shapes while varying an input attribute, and gather the reported values after each generation. Finally, some simple numerical processing is applied to the collected dataset.

In [None]:
import sys
import os

import pyprt
from pyprt.pyprt_utils import visualize_prt_results

import pandas as pd

In [None]:
CS_FOLDER = os.getcwd()

def asset_file(filename):
    return os.path.join(CS_FOLDER, 'data', filename)

#### PRT Initialization

In [None]:
print("\nInitializing PRT.")
pyprt.initialize_prt()

if not pyprt.is_prt_initialized():
    raise Exception("PRT is not initialized")

In [None]:
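# Initial shape given as a flat list of x, y, z vertex coordinates (four vertices).
initial_shape1 = pyprt.InitialShape(
    [0, 0, 0,  10, 0, 0,  10, 0, 10,  0, 0, 20])

rpk = asset_file("extrusion_rule.rpk")  # rule package applied to the shape
attrs = {}  # no attribute values are set here
encoder = 'com.esri.pyprt.PyEncoder'

# Generate once with an empty encoder options dictionary.
mod = pyprt.ModelGenerator([initial_shape1])
generated_model = mod.generate_model(
    [attrs], rpk, encoder, {})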

visualize_prt_results(generated_model)

#### Gather values from the generated models' reports

In [None]:
def get_sum_report(model):
    """Return only the report entries whose key contains "_sum"."""
    sum_rep = {}
    all_rep = model.get_report()
    for key in all_rep:
        if "_sum" in key:
            sum_rep[key] = all_rep[key]
    return sum_rep
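The filtering step can be tried without PRT. The following is a minimal sketch that assumes a report is a plain dictionary keyed by report name; `FakeModel` and all report keys and values are hypothetical stand-ins, not actual PyPRT output.

```python
class FakeModel:
    """Hypothetical stand-in for a generated model; only get_report() is mimicked."""

    def __init__(self, report):
        self._report = report

    def get_report(self):
        return self._report


def get_sum_report(model):
    # Same filter as in the cell above, written as a dict comprehension.
    return {key: value for key, value in model.get_report().items() if "_sum" in key}


# Made-up report keys and values, for illustration only.
fake = FakeModel({
    "Building Height_sum": 11.2,
    "Building Height_avg": 11.2,
    "Building Height_n": 1.0,
    "Ground Floor Area_sum": 100.0,
})

sum_report = get_sum_report(fake)
print(sum_report)
```

Only the two entries whose key contains `_sum` are kept; the `_avg` and `_n` entries are dropped.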

In [None]:
initial_shape2 = pyprt.InitialShape(
    [0, 0, 0,  10, 0, 0,  10, 0, 10,  0, 0, 10])
initial_shape3 = pyprt.InitialShape(
    [0, 0, 0,  10, 0, 0,  10, 0, 10,  0, 0, 30])

In [None]:
reports = []
model_to_generate = pyprt.ModelGenerator(
    [initial_shape1, initial_shape2, initial_shape3])

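# Generate all three shapes once per attribute value (0.0 to 9.0).
# Geometry output is disabled because only the reports are needed.
for val in range(0, 10):
    attrs['minBuildingHeight'] = float(val)
    models = model_to_generate.generate_model(
        [attrs], rpk, encoder, {'emitGeometry': False})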

    for model in models:
        if model:
            reports.append(get_sum_report(model))

#### Transform the reports into a pandas DataFrame for further dataset processing

In [None]:
reports_df = pd.DataFrame(reports)
reports_df

In [None]:
dataset_uniqueRows = reports_df.drop_duplicates()

In [None]:
dataset_uniqueRows
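Rows can repeat when different generations produce the same reported sums. The sketch below illustrates what `drop_duplicates` does on a small hand-written table; the column names and values are made up and are not taken from the rule package used above.

```python
import pandas as pd

# Hypothetical report rows; row 1 is an exact copy of row 0.
rows = [
    {"Building Height_sum": 10.0, "Ground Floor Area_sum": 100.0},
    {"Building Height_sum": 10.0, "Ground Floor Area_sum": 100.0},
    {"Building Height_sum": 12.0, "Ground Floor Area_sum": 100.0},
]
demo_df = pd.DataFrame(rows)

# Keep the first occurrence of each distinct row; original index labels are preserved.
unique_df = demo_df.drop_duplicates()
print(len(demo_df), len(unique_df))
```

Because the original index labels are kept, calling `reset_index(drop=True)` afterwards may be convenient if a contiguous index is needed.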

In an ML/DL application, the next step would be to split the dataset into a training set and a test set. A model would then be trained on the training set and evaluated on the test set.
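As a rough illustration of the split, here is a minimal sketch using only pandas. The 80/20 ratio, the fixed random seed, and the `feature`/`target` columns are assumptions made for this example; they are not part of the notebook's dataset.

```python
import pandas as pd

# Hypothetical dataset with one input column and one target column.
demo_df = pd.DataFrame({
    "feature": range(10),
    "target": [2.0 * x for x in range(10)],
})

# Randomly pick 80% of the rows for training; the remaining rows form the test set.
train_df = demo_df.sample(frac=0.8, random_state=0)
test_df = demo_df.drop(train_df.index)

print(len(train_df), len(test_df))
```

Fixing `random_state` makes the split reproducible between runs, which helps when comparing models trained on the same data.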

In [None]:
print("\nShutdown PRT.")
pyprt.shutdown_prt()