# Test Data Generation: Canvas Tables Demonstration Notebook

**Affiliation**: *Kwantum Edu Analytics*. **Last Modified**: *5/26/2023*.

This OEA notebook demonstrates how to use the OEA_py, canvas_roster_test_data_gen_py, and canvas_activity_test_data_gen_py Python classes to generate Canvas test data tables in stage1.

Use the main function of the canvas_roster_test_data_gen_py class notebook, `genCanvasRoster(startdate, enddate, reportgendate, use_general_module_base_truth)`, to create test data for **7** roster tables.

Use the main function of the canvas_activity_test_data_gen_py class notebook, `genCanvasActivity(startdate, enddate, reportgendate, canvas_version, canvas_roster_tables_source_path, max_num_activities_per_class)`, to create test data for **6** activity tables.

The class notebooks describe the parameters, the methods, and the test data generation process in more detail.

*These methods currently generate only higher-education Canvas module test data; they can be adapted to generate K-12 test data.*

In [1]:
%run OEA_py

StatementMeta(, 8, -1, Finished, Available)

2023-05-26 14:26:38,811 - OEA - INFO - Now using workspace: dev
2023-05-26 14:26:38,812 - OEA - INFO - OEA initialized.


In [2]:
# set the workspace (this determines where in the data lake you'll be writing to and reading from).
# You can work in 'dev', 'prod', or a sandbox with any name you choose.
# For example, Sam the developer can create a 'sam' workspace and expect to find his datasets in the data lake under oea/sandboxes/sam
oea.set_workspace('dev')

StatementMeta(spark3p3sm, 8, 3, Finished, Available)

2023-05-26 14:26:39,245 - OEA - INFO - Now using workspace: dev


## Generate Canvas Roster Test Data

The functions below create the 7 tables described in the canvas_roster_test_data_gen_py class notebook.

In [None]:
%run /canvas_roster_test_data_gen_py

In [None]:
rosterdatagen = CanvasRosterDataGen()

In [None]:
# depending on sizes of base tables, Canvas generation can take up to 5 min
# refer to the canvas_roster_test_data_gen_py class notebook for additional details

start_date = '2022-01-01T00:00:00' # roster start date
end_date = '2022-06-01T00:00:00' # roster end date
report_gen_date = '2022-02-02T00:00:00' # date the tables/reports were (fictitiously) generated
use_general_module_base_truth_tables = True # <- True: import and use the general module base-truth tables (links this data with other OEA module test datasets); False: generate test data from user-generated base-truth tables.

rosterdatagen.genCanvasRoster(start_date, end_date, report_gen_date, use_general_module_base_truth_tables)
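
The three date arguments are plain ISO 8601 strings, so a typo in one of them may only surface once generation is under way. A minimal sketch of a pre-flight check follows. `check_date_window` is a hypothetical helper, not part of the OEA classes; it only verifies that the strings parse and that the start precedes the end. Whether the report date must fall inside the window is not stated in this notebook, so the helper reports that fact rather than enforcing it.

```python
from datetime import datetime


def check_date_window(start_date: str, end_date: str, report_gen_date: str) -> bool:
    """Hypothetical helper: parse the ISO 8601 strings and sanity-check their order.

    Returns True when report_gen_date lies within [start_date, end_date].
    Raises ValueError if a string does not parse or the window is empty.
    """
    start = datetime.fromisoformat(start_date)
    end = datetime.fromisoformat(end_date)
    report = datetime.fromisoformat(report_gen_date)
    if start >= end:
        raise ValueError(
            f"start_date {start_date} must be earlier than end_date {end_date}"
        )
    return start <= report <= end


# Same values as the roster generation cell in this notebook.
in_window = check_date_window(
    '2022-01-01T00:00:00', '2022-06-01T00:00:00', '2022-02-02T00:00:00'
)
```

Running the check before `genCanvasRoster` keeps a malformed date from costing a multi-minute generation run.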

## Generate Canvas Activity Test Data

The functions below create the 6 tables described in the canvas_activity_test_data_gen_py class notebook.

**Note**: The function that generates Canvas activity test data requires the Canvas roster test data tables to already exist.
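
One way to enforce this ordering is to confirm that the roster output location is non-empty before the activity generator runs. The sketch below is an illustration under stated assumptions: `roster_tables_present` is a hypothetical helper, and the directory-listing function is passed in as an argument because this notebook does not show how the OEA classes list data lake folders. `os.listdir` stands in for a local run; in Synapse, a data lake listing call would take its place.

```python
import os
from typing import Callable, Iterable


def roster_tables_present(
    list_dir: Callable[[str], Iterable[str]], source_path: str
) -> bool:
    """Hypothetical guard: True if source_path exists and holds at least one entry."""
    try:
        entries = list(list_dir(source_path))
    except (FileNotFoundError, NotADirectoryError):
        return False
    return len(entries) > 0


# Local stand-in for the data lake; swap os.listdir for a lake-aware lister.
if not roster_tables_present(os.listdir, 'stage1/Transactional/canvas/v2.0'):
    print('Roster tables not found; run genCanvasRoster first.')
```

The check only looks for a non-empty folder. It does not verify that all 7 roster tables are present, since their names are defined in the class notebook rather than here.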

In [18]:
%run /canvas_activity_test_data_gen_py

StatementMeta(, 8, -1, Finished, Available)

In [19]:
activitydatagen = CanvasActivityDataGen()

StatementMeta(spark3p3sm, 8, 20, Finished, Available)
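
The `genCanvasActivity` call in the next cell takes a version string and a per-class cap in addition to the dates. Per the inline comments there, `canvas_version` accepts `v1` or `v2`. A minimal sketch of validating those two arguments up front follows; `validate_activity_args` is a hypothetical helper, and treating the cap as a positive integer is an assumption rather than documented behavior.

```python
ACCEPTED_CANVAS_VERSIONS = ('v1', 'v2')  # accepted values per the notebook's comment


def validate_activity_args(canvas_version: str, max_num_activities_per_class: int) -> None:
    """Hypothetical pre-check; raises ValueError on a bad argument."""
    if canvas_version not in ACCEPTED_CANVAS_VERSIONS:
        raise ValueError(
            f"canvas_version must be one of {ACCEPTED_CANVAS_VERSIONS}, got {canvas_version!r}"
        )
    # Assumption: the cap must be a positive whole number (bool is rejected).
    cap = max_num_activities_per_class
    if isinstance(cap, bool) or not isinstance(cap, int) or cap < 1:
        raise ValueError(
            f"max_num_activities_per_class must be a positive integer, got {cap!r}"
        )


# Passes silently for the values used in this notebook.
validate_activity_args('v2', 3)
```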

In [20]:
start_date = '2022-01-01T00:00:00' # roster start date
end_date = '2022-06-01T00:00:00' # roster end date
report_gen_date = '2022-02-02T00:00:00' # date the tables/reports were (fictitiously) generated
canvas_version = 'v2' # version of Canvas data desired to be generated (accepted values: v1 or v2)
canvas_roster_tables_source_path = 'stage1/Transactional/canvas/v2.0' # <- directory path of the Canvas roster tables
max_num_activities_per_class = 3 # <- choose max number of assignments, modules, etc. you'd like to generate per class (NOTE: students with activities are chosen at random).

activitydatagen.genCanvasActivity(start_date, end_date, report_gen_date, canvas_version, canvas_roster_tables_source_path, max_num_activities_per_class)

StatementMeta(spark3p3sm, 8, 21, Finished, Available)

2023-05-26 14:40:49,186 - OEA - INFO - General module base-truth tables already exist - delete the "base_general_modules" folder/directory if you want to replace these.
2023-05-26 14:40:52,299 - OEA - INFO - Generating Canvas test data based on general module base-truth tables...
2023-05-26 14:40:52,305 - azure.core.pipeline.policies.http_logging_policy - INFO - Request URL: 'https://stoeacisd3v08oct.blob.core.windows.net/oea?restype=REDACTED&comp=REDACTED&prefix=REDACTED&delimiter=REDACTED&include=REDACTED'
Request method: 'GET'
Request headers:
    'x-ms-version': 'REDACTED'
    'Accept': 'application/xml'
    'x-ms-date': 'REDACTED'
    'x-ms-client-request-id': '50594a70-fbd3-11ed-bbd9-6045bdf070cc'
    'User-Agent': 'azsdk-python-storage-blob/12.14.1 Python/3.10.6 (Linux-4.15.0-1164-azure-x86_64-with-glibc2.27)'
    'Authorization': 'REDACTED'
No body was attached to the request
2023-05-26 14:40:52,350 - azure.core.pipeline.policies.http_logging_policy - INFO - Response status: 20

  [(c, t) for (_, c), t in zip(pdf_slice.iteritems(), arrow_types)]