# Choosing data for TimeBasedCesnetDataset

### Import

In [1]:
import logging
from datetime import datetime

from cesnet_tszoo.utils.enums import AgreggationType, SourceType, TimeFormat
from cesnet_tszoo.datasets import CESNET_TimeSeries24
from cesnet_tszoo.configs import TimeBasedConfig # Time based dataset MUST use TimeBasedConfig

### Setting logger

In [2]:
logging.basicConfig(
    level=logging.INFO,
    format="[%(asctime)s][%(name)s][%(levelname)s] - %(message)s")

### Preparing dataset

In [3]:
time_based_dataset = CESNET_TimeSeries24.get_dataset(
    data_root="/some_directory/",
    source_type=SourceType.INSTITUTION_SUBNETS,
    aggregation=AgreggationType.AGG_1_HOUR,
    is_series_based=False,
    display_details=True,
)

[2025-08-05 19:46:14,649][wrapper_dataset][INFO] - Dataset is time-based. Use cesnet_tszoo.configs.TimeBasedConfig



Dataset details:

    AgreggationType.AGG_1_HOUR
        Time indices: range(0, 6717)
        Datetime: (datetime.datetime(2023, 10, 9, 0, 0, tzinfo=datetime.timezone.utc), datetime.datetime(2024, 7, 14, 21, 0, tzinfo=datetime.timezone.utc))

    SourceType.INSTITUTION_SUBNETS
        Time series indices: [0 1 2 3 4 ... 543 544 545 546 547], Length=548; use 'get_available_ts_indices' for full list
        Features with default values: {'n_flows': 0, 'n_packets': 0, 'n_bytes': 0, 'tcp_udp_ratio_packets': 0.5, 'tcp_udp_ratio_bytes': 0.5, 'dir_ratio_packets': 0.5, 'dir_ratio_bytes': 0.5, 'avg_duration': 0, 'avg_ttl': 0, 'sum_n_dest_asn': 0, 'avg_n_dest_asn': 0, 'std_n_dest_asn': 0, 'sum_n_dest_ports': 0, 'avg_n_dest_ports': 0, 'std_n_dest_ports': 0, 'sum_n_dest_ip': 0, 'avg_n_dest_ip': 0, 'std_n_dest_ip': 0}
        
        Additional data: ['ids_relationship', 'weekends_and_holidays']
        


### Selecting which time series to load

- `ts_ids` sets which time series will be used for the train/val/test/all sets.

#### Setting ts_ids with count of time series

- Pass an integer to `ts_ids` to select that many time series.
- When `test_ts_ids` is used, `ts_ids` will contain only time series that are not in `test_ts_ids`.
- The count must be greater than zero.
- The count must be smaller than the number of time series in the dataset, not counting the time series used in `test_ts_ids`.
- The selection is affected by `random_state`.
    - When `random_state` is set, `ts_ids` (and `test_ts_ids`) will contain the same time series on every run.
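
The effect of `random_state` can be illustrated with a minimal sketch that uses only the standard library. This is not how `cesnet_tszoo` selects time series internally; `pick_ts_ids` is a hypothetical helper, and the 548 available indices are taken from the dataset details above.

```python
import random


def pick_ts_ids(available, count, random_state=None):
    """Hypothetical helper: draw `count` unique ids, reproducibly when seeded."""
    if not 0 < count < len(available):
        raise ValueError("count must be positive and smaller than the number of ids")
    rng = random.Random(random_state)  # a seeded generator repeats the same draw
    return rng.sample(available, count)


available = list(range(548))  # time series indices 0..547 from the dataset details
first = pick_ts_ids(available, 54, random_state=111)
second = pick_ts_ids(available, 54, random_state=111)

print(first == second)  # True
print(len(set(first)))  # 54
```

Without a seed, two calls would generally return different selections, which matches the differing `Time series IDS` in the later cells that omit `random_state`.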

In [4]:
config = TimeBasedConfig(ts_ids=54, random_state=111)
time_based_dataset.set_dataset_config_and_initialize(config, display_config_details=True, workers=0)

[2025-08-05 19:46:14,657][config][INFO] - Quick validation succeeded.
[2025-08-05 19:46:14,663][config][INFO] - Using all times for all_time_period because train_time_period, val_time_period, and test_time_period are all set to None.
[2025-08-05 19:46:14,664][config][INFO] - Finalization and validation completed successfully.
[2025-08-05 19:46:14,669][cesnet_dataset][INFO] - Updating config on train/val/test/all and selected time series.
100%|██████████| 54/54 [00:00<00:00, 359.45it/s]
[2025-08-05 19:46:14,829][cesnet_dataset][INFO] - Config initialized successfully.



Config Details
    Used for database: CESNET-TimeSeries24
    Aggregation: AgreggationType.AGG_1_HOUR
    Source: SourceType.INSTITUTION_SUBNETS

    Time series
        Time series IDS: [ 54 226 135 160 236 ...   7 118 322 275  86], Length=54
        Test time series IDS: None
    Time periods
        Train time periods: None
        Val time periods: None
        Test time periods: None
        All time periods: range(0, 6718)
    Features
        Taken features: ['n_flows', 'n_packets', 'n_bytes', 'sum_n_dest_asn', 'avg_n_dest_asn', 'std_n_dest_asn', 'sum_n_dest_ports', 'avg_n_dest_ports', 'std_n_dest_ports', 'sum_n_dest_ip', 'avg_n_dest_ip', 'std_n_dest_ip', 'tcp_udp_ratio_packets', 'tcp_udp_ratio_bytes', 'dir_ratio_packets', 'dir_ratio_bytes', 'avg_duration', 'avg_ttl']
        Default values: [0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.5 0.5 0.5 0.5 0.  0. ]
        Time series ID included: True
        Time included: True    
        Time format: TimeFormat.ID_TIME
    S

#### Setting ts_ids with percentage of time series in dataset

- Pass a float to `ts_ids` to select that percentage of the time series in the dataset.
- When `test_ts_ids` is used, `ts_ids` will contain only time series that are not in `test_ts_ids`.
- The percentage must be greater than 0.
- The percentage must be smaller than 1.0 (not counting the time series used in `test_ts_ids`).
- The selection is affected by `random_state`.
    - When `random_state` is set, `ts_ids` (and `test_ts_ids`) will contain the same time series on every run.
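
As a worked example of the arithmetic: the dataset has 548 time series, so `ts_ids=0.1` corresponds to 54.8 time series. The output below reports `Length=54`, which is consistent with truncating toward zero. That rounding rule is inferred from the output, not confirmed from the library source.

```python
n_series = 548  # number of time series reported in the dataset details
fraction = 0.1  # value passed as ts_ids

# Assumption: the fractional count is truncated toward zero.
count = int(fraction * n_series)
print(count)  # 54
```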

In [5]:
config = TimeBasedConfig(ts_ids=0.1, random_state=111)
time_based_dataset.set_dataset_config_and_initialize(config, display_config_details=True, workers=0)

[2025-08-05 19:46:14,843][config][INFO] - Quick validation succeeded.
[2025-08-05 19:46:14,854][config][INFO] - Using all times for all_time_period because train_time_period, val_time_period, and test_time_period are all set to None.
[2025-08-05 19:46:14,855][config][INFO] - Finalization and validation completed successfully.
[2025-08-05 19:46:14,860][cesnet_dataset][INFO] - Updating config on train/val/test/all and selected time series.
100%|██████████| 54/54 [00:00<00:00, 423.39it/s]
[2025-08-05 19:46:14,995][cesnet_dataset][INFO] - Config initialized successfully.



Config Details
    Used for database: CESNET-TimeSeries24
    Aggregation: AgreggationType.AGG_1_HOUR
    Source: SourceType.INSTITUTION_SUBNETS

    Time series
        Time series IDS: [ 54 226 135 160 236 ...   7 118 322 275  86], Length=54
        Test time series IDS: None
    Time periods
        Train time periods: None
        Val time periods: None
        Test time periods: None
        All time periods: range(0, 6718)
    Features
        Taken features: ['n_flows', 'n_packets', 'n_bytes', 'sum_n_dest_asn', 'avg_n_dest_asn', 'std_n_dest_asn', 'sum_n_dest_ports', 'avg_n_dest_ports', 'std_n_dest_ports', 'sum_n_dest_ip', 'avg_n_dest_ip', 'std_n_dest_ip', 'tcp_udp_ratio_packets', 'tcp_udp_ratio_bytes', 'dir_ratio_packets', 'dir_ratio_bytes', 'avg_duration', 'avg_ttl']
        Default values: [0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.5 0.5 0.5 0.5 0.  0. ]
        Time series ID included: True
        Time included: True    
        Time format: TimeFormat.ID_TIME
    S

#### Setting ts_ids with specific time series indices

- `ts_ids` and `test_ts_ids` must contain unique values.

In [6]:
config = TimeBasedConfig(ts_ids=[0,1,2,3,4,5])
time_based_dataset.set_dataset_config_and_initialize(config, display_config_details=True, workers=0)

[2025-08-05 19:46:15,015][config][INFO] - Quick validation succeeded.
[2025-08-05 19:46:15,021][config][INFO] - Using all times for all_time_period because train_time_period, val_time_period, and test_time_period are all set to None.
[2025-08-05 19:46:15,022][config][INFO] - Finalization and validation completed successfully.
[2025-08-05 19:46:15,026][cesnet_dataset][INFO] - Updating config on train/val/test/all and selected time series.
100%|██████████| 6/6 [00:00<00:00, 545.00it/s]
[2025-08-05 19:46:15,039][cesnet_dataset][INFO] - Config initialized successfully.



Config Details
    Used for database: CESNET-TimeSeries24
    Aggregation: AgreggationType.AGG_1_HOUR
    Source: SourceType.INSTITUTION_SUBNETS

    Time series
        Time series IDS: [0 1 2 3 4 5], Length=6
        Test time series IDS: None
    Time periods
        Train time periods: None
        Val time periods: None
        Test time periods: None
        All time periods: range(0, 6718)
    Features
        Taken features: ['n_flows', 'n_packets', 'n_bytes', 'sum_n_dest_asn', 'avg_n_dest_asn', 'std_n_dest_asn', 'sum_n_dest_ports', 'avg_n_dest_ports', 'std_n_dest_ports', 'sum_n_dest_ip', 'avg_n_dest_ip', 'std_n_dest_ip', 'tcp_udp_ratio_packets', 'tcp_udp_ratio_bytes', 'dir_ratio_packets', 'dir_ratio_bytes', 'avg_duration', 'avg_ttl']
        Default values: [0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.5 0.5 0.5 0.5 0.  0. ]
        Time series ID included: True
        Time included: True    
        Time format: TimeFormat.ID_TIME
    Sliding window
        Sliding win

### Creating train/val/test sets

- Sets the time period of each set for every time series in `ts_ids`.
- Any set can be left as None.
- `nan_threshold` sets the share of NaN values that will be tolerated.
    - `nan_threshold` = 1.0 means that a time series can be completely empty.
    - It is applied after the sets are created.
    - It is checked separately for every set.
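
The idea behind `nan_threshold` can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: `nan_fraction` and `passes_nan_threshold` are hypothetical helpers, and the `<=` comparison is assumed because a threshold of 1.0 must accept a completely empty series.

```python
import math


def nan_fraction(values):
    """Share of NaN entries in a sequence of floats."""
    if not values:
        return 0.0
    return sum(math.isnan(v) for v in values) / len(values)


def passes_nan_threshold(values, nan_threshold):
    # Assumption: a series is kept when its NaN share does not exceed the threshold.
    return nan_fraction(values) <= nan_threshold


nan = float("nan")
train_part = [1.0, nan, 3.0, 4.0]  # 25% NaN
test_part = [nan, nan, nan, nan]   # 100% NaN

# Each set is checked on its own, so one series can pass in train and fail in test.
print(passes_nan_threshold(train_part, 0.5))  # True
print(passes_nan_threshold(test_part, 0.5))   # False
print(passes_nan_threshold(test_part, 1.0))   # True
```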

#### Setting sets with time indices

- Sets each set as a range of time indices.
- Sets must follow these rules:
    - The used time periods must be connected.
    - Sets can share a subset of times.
    - Start of `train_time_period` < start of `val_time_period` < start of `test_time_period`.
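
The rules above can be checked by hand with a small sketch. `check_time_periods` is a hypothetical validator, not part of `cesnet_tszoo`, and it assumes that "connected" means there is no gap between consecutive periods.

```python
def check_time_periods(train, val, test):
    """Hypothetical validator for three `range` objects with step 1."""
    # Rule: the starts must be strictly increasing.
    if not (train.start < val.start < test.start):
        return False
    # Rule (assumed meaning of "connected"): each period starts no later than
    # the furthest point covered so far, so overlap is fine but gaps are not.
    if val.start > train.stop:
        return False
    return test.start <= max(train.stop, val.stop)


print(check_time_periods(range(0, 2000), range(2000, 4000), range(4000, 5000)))  # True
print(check_time_periods(range(0, 2000), range(1500, 4000), range(4000, 5000)))  # True
print(check_time_periods(range(0, 2000), range(2500, 4000), range(4000, 5000)))  # False
print(check_time_periods(range(2000, 4000), range(0, 2000), range(4000, 5000)))  # False
```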

In [7]:
config = TimeBasedConfig(ts_ids=54, train_time_period=range(0, 2000), val_time_period=range(2000, 4000), test_time_period=range(4000, 5000))
time_based_dataset.set_dataset_config_and_initialize(config, display_config_details=True, workers=0)

[2025-08-05 19:46:15,060][config][INFO] - Quick validation succeeded.
[2025-08-05 19:46:15,079][config][INFO] - Finalization and validation completed successfully.
[2025-08-05 19:46:15,083][cesnet_dataset][INFO] - Updating config on train/val/test/all and selected time series.
100%|██████████| 54/54 [00:00<00:00, 824.80it/s]
[2025-08-05 19:46:15,155][cesnet_dataset][INFO] - Config initialized successfully.



Config Details
    Used for database: CESNET-TimeSeries24
    Aggregation: AgreggationType.AGG_1_HOUR
    Source: SourceType.INSTITUTION_SUBNETS

    Time series
        Time series IDS: [356 293 374 176 255 ...  80 104  51 485 465], Length=54
        Test time series IDS: None
    Time periods
        Train time periods: range(0, 2000)
        Val time periods: range(2000, 4000)
        Test time periods: range(4000, 5000)
        All time periods: range(0, 5000)
    Features
        Taken features: ['n_flows', 'n_packets', 'n_bytes', 'sum_n_dest_asn', 'avg_n_dest_asn', 'std_n_dest_asn', 'sum_n_dest_ports', 'avg_n_dest_ports', 'std_n_dest_ports', 'sum_n_dest_ip', 'avg_n_dest_ip', 'std_n_dest_ip', 'tcp_udp_ratio_packets', 'tcp_udp_ratio_bytes', 'dir_ratio_packets', 'dir_ratio_bytes', 'avg_duration', 'avg_ttl']
        Default values: [0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.5 0.5 0.5 0.5 0.  0. ]
        Time series ID included: True
        Time included: True    
        T

#### Setting sets with datetime

- Sets each set as a tuple of datetime objects.
- Datetime objects are expected to be in UTC.
- Sets must follow these rules:
    - The used time periods must be connected.
    - Sets can share a subset of times.
    - Start of `train_time_period` < start of `val_time_period` < start of `test_time_period`.
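
To see how datetimes relate to time indices at 1-hour aggregation, here is a minimal sketch. It assumes the dataset starts at 2023-10-09 00:00 UTC (from the dataset details) and that the end datetime is exclusive, which is inferred from the ranges printed in the output below. `to_time_index` and `to_index_range` are hypothetical helpers, not library functions.

```python
from datetime import datetime, timedelta, timezone

DATASET_START = datetime(2023, 10, 9, 0, tzinfo=timezone.utc)  # from the dataset details
STEP = timedelta(hours=1)  # AGG_1_HOUR


def to_time_index(moment):
    """Number of whole aggregation steps between the dataset start and `moment`."""
    return (moment - DATASET_START) // STEP


def to_index_range(period):
    start, end = period
    # Assumption: the end datetime is treated as exclusive.
    return range(to_time_index(start), to_time_index(end))


def utc(*args):
    """Shorthand for a timezone-aware UTC datetime."""
    return datetime(*args, tzinfo=timezone.utc)


train = to_index_range((utc(2023, 10, 9, 0), utc(2023, 11, 9, 23)))
val = to_index_range((utc(2023, 11, 9, 23), utc(2023, 12, 9, 23)))
test = to_index_range((utc(2023, 12, 9, 23), utc(2023, 12, 25, 23)))

print(train)  # range(0, 767)
print(val)    # range(767, 1487)
print(test)   # range(1487, 1871)
```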

In [8]:
config = TimeBasedConfig(
    ts_ids=54,
    train_time_period=(datetime(2023, 10, 9, 0), datetime(2023, 11, 9, 23)),
    val_time_period=(datetime(2023, 11, 9, 23), datetime(2023, 12, 9, 23)),
    test_time_period=(datetime(2023, 12, 9, 23), datetime(2023, 12, 25, 23)),
)
time_based_dataset.set_dataset_config_and_initialize(config, display_config_details=True, workers=0)

[2025-08-05 19:46:15,169][config][INFO] - Quick validation succeeded.
[2025-08-05 19:46:15,176][config][INFO] - Finalization and validation completed successfully.
[2025-08-05 19:46:15,180][cesnet_dataset][INFO] - Updating config on train/val/test/all and selected time series.
100%|██████████| 54/54 [00:00<00:00, 1237.56it/s]
[2025-08-05 19:46:15,228][cesnet_dataset][INFO] - Config initialized successfully.



Config Details
    Used for database: CESNET-TimeSeries24
    Aggregation: AgreggationType.AGG_1_HOUR
    Source: SourceType.INSTITUTION_SUBNETS

    Time series
        Time series IDS: [224  88 350 143  97 ...  91 429 164 202   7], Length=54
        Test time series IDS: None
    Time periods
        Train time periods: range(0, 767)
        Val time periods: range(767, 1487)
        Test time periods: range(1487, 1871)
        All time periods: range(0, 1871)
    Features
        Taken features: ['n_flows', 'n_packets', 'n_bytes', 'sum_n_dest_asn', 'avg_n_dest_asn', 'std_n_dest_asn', 'sum_n_dest_ports', 'avg_n_dest_ports', 'std_n_dest_ports', 'sum_n_dest_ip', 'avg_n_dest_ip', 'std_n_dest_ip', 'tcp_udp_ratio_packets', 'tcp_udp_ratio_bytes', 'dir_ratio_packets', 'dir_ratio_bytes', 'avg_duration', 'avg_ttl']
        Default values: [0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.5 0.5 0.5 0.5 0.  0. ]
        Time series ID included: True
        Time included: True    
        Tim

#### Setting sets with percentage

- Sets each set as a percentage of the whole time period of the dataset.
- The first set always starts from the first time.
- Must satisfy: 0 < sum of percentages of the set time periods <= 1.
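
The split arithmetic can be sketched as below. Two assumptions are made, both chosen only because they reproduce the ranges printed in the output below: the total is 6718 time indices (the `All time periods: range(0, 6718)` reported when no periods are set, although the dataset details list `range(0, 6717)`), and each length is truncated toward zero. `split_by_percentage` is a hypothetical helper; the library's actual rounding may differ.

```python
def split_by_percentage(total, percentages):
    """Hypothetical helper: consecutive index ranges starting at index 0."""
    if not 0 < sum(percentages) <= 1:
        raise ValueError("sum of percentages must be in (0, 1]")
    ranges, start = [], 0
    for share in percentages:
        length = int(share * total)  # assumption: truncate toward zero
        ranges.append(range(start, start + length))
        start += length  # the next set begins where this one ends
    return ranges


train, val, test = split_by_percentage(6718, [0.5, 0.3, 0.2])
print(train)  # range(0, 3359)
print(val)    # range(3359, 5374)
print(test)   # range(5374, 6717)
```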

In [9]:
config = TimeBasedConfig(ts_ids=54, train_time_period=0.5, val_time_period=0.3, test_time_period=0.2)
time_based_dataset.set_dataset_config_and_initialize(config, display_config_details=True, workers=0)

[2025-08-05 19:46:15,246][config][INFO] - Quick validation succeeded.
[2025-08-05 19:46:15,264][config][INFO] - Finalization and validation completed successfully.
[2025-08-05 19:46:15,268][cesnet_dataset][INFO] - Updating config on train/val/test/all and selected time series.
100%|██████████| 54/54 [00:00<00:00, 554.62it/s]
[2025-08-05 19:46:15,372][cesnet_dataset][INFO] - Config initialized successfully.



Config Details
    Used for database: CESNET-TimeSeries24
    Aggregation: AgreggationType.AGG_1_HOUR
    Source: SourceType.INSTITUTION_SUBNETS

    Time series
        Time series IDS: [121 454   3 330 185 ... 119 312 442 369 485], Length=54
        Test time series IDS: None
    Time periods
        Train time periods: range(0, 3359)
        Val time periods: range(3359, 5374)
        Test time periods: range(5374, 6717)
        All time periods: range(0, 6717)
    Features
        Taken features: ['n_flows', 'n_packets', 'n_bytes', 'sum_n_dest_asn', 'avg_n_dest_asn', 'std_n_dest_asn', 'sum_n_dest_ports', 'avg_n_dest_ports', 'std_n_dest_ports', 'sum_n_dest_ip', 'avg_n_dest_ip', 'std_n_dest_ip', 'tcp_udp_ratio_packets', 'tcp_udp_ratio_bytes', 'dir_ratio_packets', 'dir_ratio_bytes', 'avg_duration', 'avg_ttl']
        Default values: [0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.5 0.5 0.5 0.5 0.  0. ]
        Time series ID included: True
        Time included: True    
        T

### Selecting features

- Affects which features are returned when loading data.
- Setting `include_time` to True adds time to the features returned when loading data.
- Setting `include_ts_id` to True adds the time series id to the features returned when loading data.
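
The relation between the selected features and the `Default values` shown in the config output can be sketched as follows. `resolve_features` is a hypothetical helper, not a library function, and `FEATURE_DEFAULTS` holds only a subset of the defaults listed in the dataset details.

```python
# Default values copied from the dataset details printed earlier (subset only).
FEATURE_DEFAULTS = {
    "n_flows": 0,
    "n_packets": 0,
    "n_bytes": 0,
    "tcp_udp_ratio_packets": 0.5,
    "dir_ratio_packets": 0.5,
    "avg_duration": 0,
}


def resolve_features(features_to_take):
    """Hypothetical helper mirroring "all" versus an explicit list of names."""
    if features_to_take == "all":
        names = list(FEATURE_DEFAULTS)
    else:
        unknown = [name for name in features_to_take if name not in FEATURE_DEFAULTS]
        if unknown:
            raise ValueError(f"unknown features: {unknown}")
        names = list(features_to_take)
    # One default per selected feature, in the same order as the names.
    defaults = [float(FEATURE_DEFAULTS[name]) for name in names]
    return names, defaults


names, defaults = resolve_features(["n_flows", "n_packets"])
print(names, defaults)  # ['n_flows', 'n_packets'] [0.0, 0.0]
```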

#### Setting features to take as "all"

In [10]:
config = TimeBasedConfig(ts_ids=54, train_time_period=0.5, val_time_period=0.3, test_time_period=0.2, features_to_take="all")
time_based_dataset.set_dataset_config_and_initialize(config, display_config_details=True, workers=0)

[2025-08-05 19:46:15,386][config][INFO] - Quick validation succeeded.
[2025-08-05 19:46:15,407][config][INFO] - Finalization and validation completed successfully.
[2025-08-05 19:46:15,410][cesnet_dataset][INFO] - Updating config on train/val/test/all and selected time series.
100%|██████████| 54/54 [00:00<00:00, 508.22it/s]
[2025-08-05 19:46:15,524][cesnet_dataset][INFO] - Config initialized successfully.



Config Details
    Used for database: CESNET-TimeSeries24
    Aggregation: AgreggationType.AGG_1_HOUR
    Source: SourceType.INSTITUTION_SUBNETS

    Time series
        Time series IDS: [473 293 234 361 136 ...  75 382 281 450 395], Length=54
        Test time series IDS: None
    Time periods
        Train time periods: range(0, 3359)
        Val time periods: range(3359, 5374)
        Test time periods: range(5374, 6717)
        All time periods: range(0, 6717)
    Features
        Taken features: ['n_flows', 'n_packets', 'n_bytes', 'sum_n_dest_asn', 'avg_n_dest_asn', 'std_n_dest_asn', 'sum_n_dest_ports', 'avg_n_dest_ports', 'std_n_dest_ports', 'sum_n_dest_ip', 'avg_n_dest_ip', 'std_n_dest_ip', 'tcp_udp_ratio_packets', 'tcp_udp_ratio_bytes', 'dir_ratio_packets', 'dir_ratio_bytes', 'avg_duration', 'avg_ttl']
        Default values: [0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.5 0.5 0.5 0.5 0.  0. ]
        Time series ID included: True
        Time included: True    
        T

#### Setting features via list

In [11]:
config = TimeBasedConfig(ts_ids=54, train_time_period=0.5, val_time_period=0.3, test_time_period=0.2, features_to_take=["n_flows", "n_packets"])
time_based_dataset.set_dataset_config_and_initialize(config, display_config_details=True, workers=0)

[2025-08-05 19:46:15,543][config][INFO] - Quick validation succeeded.
[2025-08-05 19:46:15,561][config][INFO] - Finalization and validation completed successfully.
[2025-08-05 19:46:15,566][cesnet_dataset][INFO] - Updating config on train/val/test/all and selected time series.
100%|██████████| 54/54 [00:00<00:00, 811.17it/s]
[2025-08-05 19:46:15,639][cesnet_dataset][INFO] - Config initialized successfully.



Config Details
    Used for database: CESNET-TimeSeries24
    Aggregation: AgreggationType.AGG_1_HOUR
    Source: SourceType.INSTITUTION_SUBNETS

    Time series
        Time series IDS: [ 68 276 138 275 223 ...  70 507 187 298 367], Length=54
        Test time series IDS: None
    Time periods
        Train time periods: range(0, 3359)
        Val time periods: range(3359, 5374)
        Test time periods: range(5374, 6717)
        All time periods: range(0, 6717)
    Features
        Taken features: ['n_flows', 'n_packets']
        Default values: [0. 0.]
        Time series ID included: True
        Time included: True    
        Time format: TimeFormat.ID_TIME
    Sliding window
        Sliding window size: None
        Sliding window prediction size: None
        Sliding window step size: 1
        Set shared size: 0
    Fillers
        Filler type: None
    Transformers
        Transformer type: None
    Batch sizes
        Train batch size: 32
        Val batch size: 64
       

#### Including time and time series id

In [12]:
config = TimeBasedConfig(ts_ids=54, train_time_period=0.5, val_time_period=0.3, test_time_period=0.2, features_to_take=["n_flows", "n_packets"], include_time=True, include_ts_id=True, time_format=TimeFormat.ID_TIME)
time_based_dataset.set_dataset_config_and_initialize(config, display_config_details=True, workers=0)

[2025-08-05 19:46:15,652][config][INFO] - Quick validation succeeded.
[2025-08-05 19:46:15,718][config][INFO] - Finalization and validation completed successfully.
[2025-08-05 19:46:15,723][cesnet_dataset][INFO] - Updating config on train/val/test/all and selected time series.
100%|██████████| 54/54 [00:00<00:00, 821.66it/s]
[2025-08-05 19:46:15,795][cesnet_dataset][INFO] - Config initialized successfully.



Config Details
    Used for database: CESNET-TimeSeries24
    Aggregation: AgreggationType.AGG_1_HOUR
    Source: SourceType.INSTITUTION_SUBNETS

    Time series
        Time series IDS: [ 56 479 302 519 195 ... 386 261 353 181 461], Length=54
        Test time series IDS: None
    Time periods
        Train time periods: range(0, 3359)
        Val time periods: range(3359, 5374)
        Test time periods: range(5374, 6717)
        All time periods: range(0, 6717)
    Features
        Taken features: ['n_flows', 'n_packets']
        Default values: [0. 0.]
        Time series ID included: True
        Time included: True    
        Time format: TimeFormat.ID_TIME
    Sliding window
        Sliding window size: None
        Sliding window prediction size: None
        Sliding window step size: 1
        Set shared size: 0
    Fillers
        Filler type: None
    Transformers
        Transformer type: None
    Batch sizes
        Train batch size: 32
        Val batch size: 64
       

### Selecting which additional time series to load for test set

- Only usable when `test_time_period` is not None.
- Follows the same rules as `ts_ids`.
- Is affected by `random_state`.
    - When `random_state` is set, `test_ts_ids` (and `ts_ids`) will contain the same time series on every run.
- Is affected by `nan_threshold`.
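
Because `ts_ids` never contains time series from `test_ts_ids`, the two selections are disjoint. A minimal sketch of one way to draw such disjoint groups is shown below; `pick_disjoint` is a hypothetical helper and does not reflect the order or method the library uses.

```python
import random


def pick_disjoint(available, n_ts, n_test, random_state=None):
    """Hypothetical helper: two non-overlapping random selections."""
    if n_ts + n_test > len(available):
        raise ValueError("not enough time series for both selections")
    rng = random.Random(random_state)
    shuffled = rng.sample(available, len(available))  # shuffled copy of the ids
    test_ts_ids = shuffled[:n_test]
    ts_ids = shuffled[n_test:n_test + n_ts]  # taken from what remains
    return ts_ids, test_ts_ids


ts_ids, test_ts_ids = pick_disjoint(list(range(548)), n_ts=54, n_test=22, random_state=111)
print(len(ts_ids), len(test_ts_ids))          # 54 22
print(set(ts_ids).isdisjoint(test_ts_ids))    # True
```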

In [13]:
config = TimeBasedConfig(ts_ids=54, train_time_period=0.5, val_time_period=0.3, test_time_period=0.2, test_ts_ids=22)
time_based_dataset.set_dataset_config_and_initialize(config, display_config_details=True, workers=0)

[2025-08-05 19:46:15,807][config][INFO] - Quick validation succeeded.
[2025-08-05 19:46:15,827][config][INFO] - Finalization and validation completed successfully.
[2025-08-05 19:46:15,830][cesnet_dataset][INFO] - Updating config on train/val/test/all and selected time series.
100%|██████████| 54/54 [00:00<00:00, 515.75it/s]
[2025-08-05 19:46:15,945][cesnet_dataset][INFO] - Updating config on test_other and selected time series.
100%|██████████| 22/22 [00:00<00:00, 756.13it/s]
[2025-08-05 19:46:15,978][cesnet_dataset][INFO] - Config initialized successfully.



Config Details
    Used for database: CESNET-TimeSeries24
    Aggregation: AgreggationType.AGG_1_HOUR
    Source: SourceType.INSTITUTION_SUBNETS

    Time series
        Time series IDS: [ 51 112 415 384   0 ... 268 481 201 465 507], Length=54
        Test time series IDS: [203  50 102  69  58 ... 199 162 150  89 135], Length=22
    Time periods
        Train time periods: range(0, 3359)
        Val time periods: range(3359, 5374)
        Test time periods: range(5374, 6717)
        All time periods: range(0, 6717)
    Features
        Taken features: ['n_flows', 'n_packets', 'n_bytes', 'sum_n_dest_asn', 'avg_n_dest_asn', 'std_n_dest_asn', 'sum_n_dest_ports', 'avg_n_dest_ports', 'std_n_dest_ports', 'sum_n_dest_ip', 'avg_n_dest_ip', 'std_n_dest_ip', 'tcp_udp_ratio_packets', 'tcp_udp_ratio_bytes', 'dir_ratio_packets', 'dir_ratio_bytes', 'avg_duration', 'avg_ttl']
        Default values: [0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.5 0.5 0.5 0.5 0.  0. ]
        Time series ID incl

### Selecting all set

- Contains time series from `ts_ids`.

#### All set when other sets are None

- The all set will contain the whole time period of the dataset.

In [14]:
config = TimeBasedConfig(ts_ids=54, train_time_period=None, val_time_period=None, test_time_period=None)
time_based_dataset.set_dataset_config_and_initialize(config, display_config_details=True, workers=0)

[2025-08-05 19:46:15,994][config][INFO] - Quick validation succeeded.
[2025-08-05 19:46:16,000][config][INFO] - Using all times for all_time_period because train_time_period, val_time_period, and test_time_period are all set to None.
[2025-08-05 19:46:16,001][config][INFO] - Finalization and validation completed successfully.
[2025-08-05 19:46:16,006][cesnet_dataset][INFO] - Updating config on train/val/test/all and selected time series.
100%|██████████| 54/54 [00:00<00:00, 544.47it/s]
[2025-08-05 19:46:16,112][cesnet_dataset][INFO] - Config initialized successfully.



Config Details
    Used for database: CESNET-TimeSeries24
    Aggregation: AgreggationType.AGG_1_HOUR
    Source: SourceType.INSTITUTION_SUBNETS

    Time series
        Time series IDS: [267  90 215 485  20 ... 356 202 446 494 136], Length=54
        Test time series IDS: None
    Time periods
        Train time periods: None
        Val time periods: None
        Test time periods: None
        All time periods: range(0, 6718)
    Features
        Taken features: ['n_flows', 'n_packets', 'n_bytes', 'sum_n_dest_asn', 'avg_n_dest_asn', 'std_n_dest_asn', 'sum_n_dest_ports', 'avg_n_dest_ports', 'std_n_dest_ports', 'sum_n_dest_ip', 'avg_n_dest_ip', 'std_n_dest_ip', 'tcp_udp_ratio_packets', 'tcp_udp_ratio_bytes', 'dir_ratio_packets', 'dir_ratio_bytes', 'avg_duration', 'avg_ttl']
        Default values: [0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.5 0.5 0.5 0.5 0.  0. ]
        Time series ID included: True
        Time included: True    
        Time format: TimeFormat.ID_TIME
    S

#### All set when at least one other set is not None

- The all set will contain the total time period of train + val + test.
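
Both cases of the all set can be summarised in a short sketch. `all_time_period` is a hypothetical helper; the fallback `range(0, 6718)` is copied from the output of the cell where every period is None, and the combination rule (earliest start to latest stop) is an assumption that matches the ranges printed in this notebook.

```python
def all_time_period(train=None, val=None, test=None, full=range(0, 6718)):
    """Hypothetical helper: span covered by the sets that are not None."""
    used = [period for period in (train, val, test) if period is not None]
    if not used:
        return full  # no set given: fall back to the whole dataset period
    return range(min(p.start for p in used), max(p.stop for p in used))


print(all_time_period())  # range(0, 6718)
print(all_time_period(range(0, 3359), range(3359, 5374), range(5374, 6717)))  # range(0, 6717)
print(all_time_period(train=range(0, 2000)))  # range(0, 2000)
```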

In [15]:
config = TimeBasedConfig(ts_ids=54, train_time_period=0.5, val_time_period=0.3, test_time_period=0.2)
time_based_dataset.set_dataset_config_and_initialize(config, display_config_details=True, workers=0)

[2025-08-05 19:46:16,117][config][INFO] - Quick validation succeeded.
[2025-08-05 19:46:16,135][config][INFO] - Finalization and validation completed successfully.
[2025-08-05 19:46:16,140][cesnet_dataset][INFO] - Updating config on train/val/test/all and selected time series.
100%|██████████| 54/54 [00:00<00:00, 538.27it/s]
[2025-08-05 19:46:16,246][cesnet_dataset][INFO] - Config initialized successfully.



Config Details
    Used for database: CESNET-TimeSeries24
    Aggregation: AgreggationType.AGG_1_HOUR
    Source: SourceType.INSTITUTION_SUBNETS

    Time series
        Time series IDS: [438 344 121 176 190 ... 447 254 168 153 342], Length=54
        Test time series IDS: None
    Time periods
        Train time periods: range(0, 3359)
        Val time periods: range(3359, 5374)
        Test time periods: range(5374, 6717)
        All time periods: range(0, 6717)
    Features
        Taken features: ['n_flows', 'n_packets', 'n_bytes', 'sum_n_dest_asn', 'avg_n_dest_asn', 'std_n_dest_asn', 'sum_n_dest_ports', 'avg_n_dest_ports', 'std_n_dest_ports', 'sum_n_dest_ip', 'avg_n_dest_ip', 'std_n_dest_ip', 'tcp_udp_ratio_packets', 'tcp_udp_ratio_bytes', 'dir_ratio_packets', 'dir_ratio_bytes', 'avg_duration', 'avg_ttl']
        Default values: [0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.5 0.5 0.5 0.5 0.  0. ]
        Time series ID included: True
        Time included: True    
        T