# Script Purpose
This script creates reference datasets for the performance test between Azure Cosmos DB and Azure SQL Database. The datasets represent test processes for three articles: e-bike, road bike and mountain bike. One reference dataset is created per article and per database, giving six files in total, which can later be used in experiments to insert data into the databases.

#### Structure of a data record

A data record consists of:
1. **Order data**: Serial number, article name, order number, etc.
2. **Test process steps**: Process step, process name, limit values, setpoints, etc.
3. **Machine settings**: Setting values (setpoints) for the machines for the respective test process steps.
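
As an illustration, the three parts of a record can be sketched as a nested dictionary. The field names follow the generator function further down in this notebook; the concrete numbers are made up for the example, and several header fields are omitted for brevity.

```python
import json

# Illustrative values only; field names follow the generator function in this notebook.
example_record = {
    # 1. Order data (placeholders are filled in later by the experiment scripts)
    "OrderNumber": "{order_number}",
    "ArticleName": "EBike",
    "SerialNumber": "{serial_number}",
    # 2. Test process steps
    "InspectionsAndResults": [
        {
            "InspectionStep": "1",
            "InspectionName": "InspectionStepForEBike1",
            "InspectionLowerBorderValue": "10",
            "InspectionTargetValue": "15.0",
            "InspectionUpperBorderValue": "20",
            # 3. Machine settings (setpoints) for this step
            "InspectionSetpoints": {
                "MachineSetPointGroupForEBike.AnyMachineSetpoint1": "3"
            },
        }
    ],
}

print(json.dumps(example_record, indent=4))
```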

#### Generation of the test process steps

100 test process steps are created for each article. Each of these steps contains:
- Randomly generated values for test parameters such as lower limit, upper limit and target value.
- Several randomly generated machine settings (setpoints).
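
The value generation for a single step can be sketched as follows. The ranges mirror the ones used in the generator function below; the helper name `make_limits` and the fixed seed are assumptions made only so that the example is self-contained and repeatable.

```python
import random

def make_limits(rng):
    """Hypothetical helper: draw lower limit, target value and upper limit for one step."""
    lower = rng.randint(3, 100)          # lower limit between 3 and 100
    upper = lower + rng.randint(5, 20)   # upper limit lies 5 to 20 above the lower limit
    target = (lower + upper) / 2         # target value is the midpoint of both limits
    return lower, target, upper

rng = random.Random(42)  # seed chosen only to make the example repeatable
lower, target, upper = make_limits(rng)
number_of_setpoints = rng.randint(1, 5)  # each step gets between one and five setpoints

print(lower, target, upper, number_of_setpoints)
```

Because the target is always the midpoint, it is a float even when both limits are integers.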

#### Storage of data records

The reference data records are converted into JSON format and saved in a separate folder for each database (Azure Cosmos DB and Azure SQL Database). The folder name is derived from the database, and the file name from the lower-cased article name.
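
A minimal sketch of how the storage path is composed, using the same naming pattern as the code below. The helper name `dataset_path` is hypothetical and only serves this illustration.

```python
import os

def dataset_path(database, article):
    """Hypothetical helper: build the file path for one reference dataset."""
    directory = f"Reference_Datasets_{database}"
    file_name = f"reference_dataset_{article.lower()}.json"
    return os.path.join(directory, file_name)

for database in ["Azure_Cosmos_DB", "Azure_SQL_Database"]:
    for article in ["EBike", "Roadbike", "Mountainbike"]:
        print(dataset_path(database, article))
```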

#### Use of the reference data records

The stored reference data records serve as templates for the subsequent performance tests and database experiments. The respective scripts read them in, fill the placeholders (such as id, SerialNumber, OrderNumber, MeasuredValue) with actual values, and then insert the resulting records into the databases. ArticleName is already set when the dataset is generated.
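
How the consuming scripts fill the placeholders is not shown in this notebook. One possible approach, sketched here under that assumption, is to walk the loaded JSON structure and replace every string of the form `{name}`. Plain `str.format` on the JSON text would not work, because the braces of the JSON objects themselves would be interpreted as format fields. The function name `fill_placeholders` and the sample values are hypothetical.

```python
def fill_placeholders(node, values):
    """Return a copy of node in which every "{name}" string is replaced by values[name]."""
    if isinstance(node, dict):
        return {key: fill_placeholders(child, values) for key, child in node.items()}
    if isinstance(node, list):
        return [fill_placeholders(child, values) for child in node]
    if isinstance(node, str) and node.startswith("{") and node.endswith("}"):
        # Unknown placeholders are left untouched
        return values.get(node[1:-1], node)
    return node

# Shortened template; the real files contain 100 inspection steps.
template = {
    "OrderNumber": "{order_number}",
    "ArticleName": "EBike",
    "SerialNumber": "{serial_number}",
    "InspectionsAndResults": [
        {"InspectionStep": "1", "InspectionResultMeasuredValue": "{measured_value}"}
    ],
}

record = fill_placeholders(
    template,
    {"order_number": "ORD-1", "serial_number": "SN-1", "measured_value": "12.5"},
)
print(record["OrderNumber"])  # ORD-1
```

This sketch assigns the same measured value to every step; a real experiment would presumably generate one value per inspection step.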

# Import Libraries

In [None]:
import json
import random
import os

# Define Functions

In [None]:
def generate_and_store_reference_dataset(database, directory_to_store_datasets, article):
    """
    Generates a reference dataset and stores it as a JSON file.

    Args:
        database (str): Database for which the dataset is generated.
        directory_to_store_datasets (str): Directory to store the dataset.
        article (str): Name of the article.

    Returns:
        None
    """

    # List to store all generated inspection steps
    inspection_steps = []

    # Generate 100 process steps
    for counter_dataset in range(1, 101):

        # Define lower limit, upper limit and target value (midpoint of the limits)
        random_lower_border_value = random.randint(3, 100)
        random_upper_border_value = random_lower_border_value + random.randint(5, 20)
        random_target_value = (random_upper_border_value + random_lower_border_value) / 2


        # Generate between one and five setpoints; each position has its own value range
        setpoint_ranges = [(1, 5), (20, 25), (30, 35), (15, 20), (5, 10)]
        set_points = {}

        for setpoint_counter in range(1, random.randint(1, 5) + 1):
            low, high = setpoint_ranges[setpoint_counter - 1]
            setpoint_value = random.randint(low, high)
            set_points[f"MachineSetPointGroupFor{article}.AnyMachineSetpoint{setpoint_counter}"] = str(setpoint_value)

        # Define the Inspection Step 

        inspection = {
            "InspectionStep": str(counter_dataset),
            "InspectionName": f"InspectionStepFor{article}{counter_dataset}",
            "InspectionLowerBorderValue": str(random_lower_border_value),
            "InspectionTargetValue": str(random_target_value),
            "InspectionUpperBorderValue": str(random_upper_border_value),
            "InspectionUnit": "AnyUnit",
            "InspectionResultMeasuredValue": "{measured_value}",
            "InspectionResultPassed": "True",
            "InspectionSetpoints": set_points
        }

        # Append all Generated Inspection-Steps to the list 
        inspection_steps.append(inspection)

    # Create the head of the result data file.
    # The name must match the entries in database_to_generate_data (underscores, not hyphens).
    if database == "Azure_Cosmos_DB":
        data = {
            "id": "{id}",
            "OrderNumber": "{order_number}",
            "ArticleName": article,
            "MachineName": "{machine_name}",
            "SerialNumber": "{serial_number}",
            "InspectionDateTime": "{inspection_datetime}",
            "UpdateDateTime": "{update_datetime}",
            "InspectionsAndResults": inspection_steps
        }
              
    else:
        data = {
            "OrderNumber": "{order_number}",
            "ArticleName": article,
            "MachineName": "{machine_name}",
            "SerialNumber": "{serial_number}",
            "InspectionsAndResults": inspection_steps
        }

    # Convert the data to a JSON string
    json_data = json.dumps(data, indent=4)

    # Create the directory if it does not exist
    os.makedirs(directory_to_store_datasets, exist_ok=True)

    # Save the JSON data in a file in the directory
    file_path = os.path.join(directory_to_store_datasets, f"reference_dataset_{article.lower()}.json")
    with open(file_path, "w") as output_file:
        output_file.write(json_data)

    # Report where the dataset was stored
    print(f"JSON-File stored at: {file_path}")


# Main-Script

In [None]:
database_to_generate_data = ["Azure_Cosmos_DB", "Azure_SQL_Database"]
article_names = ["EBike", "Roadbike", "Mountainbike"]

for database in database_to_generate_data:
    
    print(f'Generating Datasets for {database}')
    directory_to_store_datasets = f'Reference_Datasets_{database}'
    
    for article in article_names:
        generate_and_store_reference_dataset(database, directory_to_store_datasets, article)