First, I would like to express my gratitude for your remarkable work in consolidating various models and datasets related to time series analysis in one comprehensive platform. It's an invaluable resource.
I am currently running experiments on the PEMS datasets (03, 04, 07, and 08) used in the iTransformer study. I have a few questions about the custom data_loader implementation and would greatly appreciate your insights:
Given that PEMS data are aggregated in 5-minute intervals, should the multiplication factor be 12 (timesteps) rather than the 4 used in the custom data class?
Considering our objective encompasses all variables (i.e., sensors), how should the data loaders be structured for both input and target?
In the iTransformer research, the time intervals {12, 24, 48, 96} are mentioned. In terms of data organized every 5 minutes, does this equate to durations of 1 hour, 2 hours, 4 hours, and 8 hours, respectively?
How does the 'label_len' in data partitioning in this code snippet affect the process:
From my observation, it appears that the results in various papers, such as iTransformer, are presented without data inversion (i.e., '--inverse' is not applied). Is this a correct understanding?
Lastly, if you could provide the script files and code for processing the PEMS datasets the way iTransformer does, it would be immensely helpful.
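For reference, the arithmetic behind the horizon and label_len questions above can be sketched as follows. This is only a minimal sketch, not the library's code: `horizon_duration` and `window_bounds` are hypothetical helper names, and the index layout assumes the Informer-style convention in which the decoder window starts `label_len` steps before the end of the input window. The snippet referred to in the label_len question is not shown here, so that convention is an assumption.

```python
from datetime import timedelta

STEP_MINUTES = 5             # PEMS sampling interval stated in the questions above
HORIZONS = (12, 24, 48, 96)  # prediction lengths mentioned for iTransformer


def horizon_duration(steps, step_minutes=STEP_MINUTES):
    """Wall-clock duration covered by `steps` samples (hypothetical helper)."""
    return timedelta(minutes=steps * step_minutes)


def window_bounds(index, seq_len, label_len, pred_len):
    """Slice bounds for one sample, ASSUMING the Informer-style layout:
    the decoder window overlaps the last `label_len` input steps and then
    extends `pred_len` steps into the future."""
    s_begin = index
    s_end = s_begin + seq_len
    r_begin = s_end - label_len
    r_end = r_begin + label_len + pred_len
    return (s_begin, s_end), (r_begin, r_end)


steps_per_hour = 60 // STEP_MINUTES
durations = {h: horizon_duration(h) for h in HORIZONS}
for steps, duration in durations.items():
    print(f"{steps} steps = {duration.total_seconds() / 3600:g} h")

# Example values chosen for illustration only.
encoder, decoder = window_bounds(index=0, seq_len=96, label_len=48, pred_len=12)
print("encoder slice:", encoder, "decoder slice:", decoder)
```

Under these assumptions, a larger label_len only lengthens the overlap between the input window and the decoder window; the forecast part stays the last pred_len steps.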
Just for info: I have already converted the NPZ files to CSV format and added a 'date' column using the following code:
import numpy as np
import pandas as pd


def convert_npz_to_csv_with_datetime_index(npz_file_path, data_key, start_date, timestep_minutes, csv_file_path):
    # Load the NPZ file and extract the data array
    npz_file = np.load(npz_file_path)
    data = npz_file[data_key]

    # Keep only the first feature if the array has more than 2 dimensions
    if data.ndim > 2 and data.shape[2] > 0:
        data = data[:, :, 0]

    # Number of rows (timesteps) in the data
    num_rows = data.shape[0]

    # Generate a date range with the specified start date and timestep
    # ('min' is the minute alias; 'T' is deprecated in recent pandas versions)
    timestamps = pd.date_range(start=start_date, periods=num_rows, freq=f'{timestep_minutes}min')

    # Create a DataFrame with the timestamps as index
    df = pd.DataFrame(data, index=timestamps)

    # Print DataFrame details
    print(f"DataFrame shape: {df.shape}")
    print(f"Index range: {df.index.min()} to {df.index.max()}")
    print(f"Index type: {type(df.index)}")

    # Reset the index to make the datetime a column, and rename it to 'date'
    df.reset_index(inplace=True)
    df.rename(columns={'index': 'date'}, inplace=True)

    # Convert 'date' column to string (object type)
    df['date'] = df['date'].astype(str)

    # Save to CSV
    df.to_csv(csv_file_path, index=False)
    print(f"File saved as '{csv_file_path}'.")


# Example usage
# convert_npz_to_csv_with_datetime_index('./dataset/PEMS/PEMS03.npz', 'data', '2012-01-05', 5, './dataset/PEMS/PEMS03.csv')
# convert_npz_to_csv_with_datetime_index('./dataset/PEMS/PEMS04.npz', 'data', '2017-01-07', 5, './dataset/PEMS/PEMS04.csv')
# convert_npz_to_csv_with_datetime_index('./dataset/PEMS/PEMS07.npz', 'data', '2017-01-05', 5, './dataset/PEMS/PEMS07.csv')
# convert_npz_to_csv_with_datetime_index('./dataset/PEMS/PEMS08.npz', 'data', '2012-01-03', 5, './dataset/PEMS/PEMS08.csv')
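A quick way to sanity-check the timestamp logic without the real files is to run the same steps on a small synthetic array. The sketch below is illustrative only: the array shape (two days, 4 sensors, 3 features) and the start date are made-up stand-ins, not properties of the actual PEMS files.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for a PEMS array: (timesteps, sensors, features).
# Two days at 5-minute resolution, 4 sensors, 3 features (all hypothetical).
steps_per_day = 24 * 60 // 5
raw = np.arange(2 * steps_per_day * 4 * 3, dtype=float).reshape(2 * steps_per_day, 4, 3)

# Keep only the first feature, as the conversion function above does.
flow = raw[:, :, 0]

# Build the 5-minute datetime index the same way as in the conversion.
timestamps = pd.date_range(start="2017-01-07", periods=flow.shape[0], freq="5min")
df = pd.DataFrame(flow, index=timestamps)
df.index.name = "date"

# Every full day should contain the same number of 5-minute rows.
rows_per_day = df.groupby(df.index.normalize()).size()
print(rows_per_day)
print(df.index.min(), "to", df.index.max())
```

If the per-day row count differs from `steps_per_day` on the real data, the assumed start date or sampling interval is probably off.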