Task Overview
Currently, timestamp_column is the only setting that must be configured globally, in the model config section (usually in properties.yml, under elementary in the config tag).
Passing timestamp_column as a test parameter would make it possible to run multiple tests with different timestamp columns. For example, one test could use an updated_at column, which represents when the row was last updated, while another could use event_time, which represents when the event was sent.
Design
The test macros are implemented in three main files: test_table_anomalies.sql, test_column_anomalies.sql and test_all_columns_anomalies.sql (note that these files currently contain some duplicated code, which we will probably fix in the future).
Each of these test macros should receive a new parameter called timestamp_column, defined at the end of the parameter list, with a default value of none.
Each test currently has two lines of code that extract the timestamp_column from the global model config:
{%- set table_config = elementary.get_table_config_from_graph(model) %}
{%- set timestamp_column = elementary.insensitive_get_dict_value(table_config, 'timestamp_column') %}
The macro get_table_config_from_graph returns the timestamp_column and its normalized data type (called timestamp_column_data_type).
The following code in get_table_config_from_graph, which finds the data type of the timestamp column, should be extracted into a macro called find_normalized_data_type_for_column:
{% set columns_from_relation = adapter.get_columns_in_relation(model_relation) %}
{% if columns_from_relation and columns_from_relation is iterable %}
    {% for column_obj in columns_from_relation %}
        {% if column_obj.column | lower == timestamp_column | lower %}
            {% set timestamp_column_data_type = elementary.normalize_data_type(column_obj.dtype) %}
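A minimal sketch of what the extracted macro might look like is shown here. Only the macro name comes from this issue; the parameter names (model_relation, column_name) and the choice to return none when no column matches are assumptions made for illustration, not the final implementation.

```jinja
{% macro find_normalized_data_type_for_column(model_relation, column_name) %}
    {# Sketch only: parameter names and the none fallback are assumptions. #}
    {% set columns_from_relation = adapter.get_columns_in_relation(model_relation) %}
    {% if columns_from_relation and columns_from_relation is iterable %}
        {% for column_obj in columns_from_relation %}
            {# Case-insensitive match on the column name, as in the existing code. #}
            {% if column_obj.column | lower == column_name | lower %}
                {{ return(elementary.normalize_data_type(column_obj.dtype)) }}
            {% endif %}
        {% endfor %}
    {% endif %}
    {# No matching column was found in the relation. #}
    {{ return(none) }}
{% endmacro %}
```

Returning the value directly, rather than setting a variable inside the loop, sidesteps Jinja's loop scoping, where a variable set inside a for block is not visible after the loop ends.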
Then, in the test itself, if the new timestamp_column parameter is not none, use the extracted macro to find the normalized data type of that column, and pass both timestamp_column and timestamp_column_data_type to the relevant functions (get_is_column_timestamp, column_monitoring_query, table_monitoring_query).
If timestamp_column is none, fall back to the global timestamp column, as implemented today.
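The selection logic described above could be sketched inside each test roughly as follows. This is an illustration under stated assumptions: model_relation is assumed to have been resolved earlier in the test, the extracted macro is assumed to live in the elementary namespace with the signature shown, and the global data type is assumed to be readable from table_config under the key 'timestamp_column_data_type'.

```jinja
{#- Sketch only. `timestamp_column` is the proposed test parameter (default none). -#}
{%- if timestamp_column is not none %}
    {#- Test-level override: look up the data type for the column that was passed in.
        `model_relation` is assumed to be resolved earlier in the test. -#}
    {%- set timestamp_column_data_type = elementary.find_normalized_data_type_for_column(model_relation, timestamp_column) %}
{%- else %}
    {#- No override: keep today's behavior and read the global model config. -#}
    {%- set table_config = elementary.get_table_config_from_graph(model) %}
    {%- set timestamp_column = elementary.insensitive_get_dict_value(table_config, 'timestamp_column') %}
    {%- set timestamp_column_data_type = elementary.insensitive_get_dict_value(table_config, 'timestamp_column_data_type') %}
{%- endif %}
```

After this block, timestamp_column and timestamp_column_data_type hold either the test-level or the global values, so the calls to get_is_column_timestamp, column_monitoring_query and table_monitoring_query can receive them the same way in both cases.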