Your implementation demonstrates a brilliant and ingenious approach that truly stands out. However, during my examination of the code, I noticed a potential issue that I believe requires your attention.
It appears that there is a case of data leakage in your CNN classifier. Specifically, the classifier seems to be utilizing information from the same day to predict the outcome for that day. Data leakage can lead to inflated performance metrics during testing but result in poor performance when applied to real-world scenarios.
The leakage is in the CNN training section of STOCK_Market_GAN:

    # start at num_historical_days and iterate the full length of the training
    # data at intervals of num_historical_days
    for i in range(num_historical_days, len(df), num_historical_days):
        # split the df into arrays of length num_historical_days and append
        # to data, i.e. array of df[curr - num_days : curr] -> a batch of values
        self.data.append(data[i-num_historical_days:i])
        # appending if price went up or down in curr day of "i" we are looking
        # at
        self.labels.append(labels[i-1])
    # do same for test data
    data = test_df[['open','high','low','close','volume']].values

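You should replace `self.labels.append(labels[i-1])` with `self.labels.append(labels[i])`. The slice `data[i-num_historical_days:i]` ends at row `i-1`, so `labels[i-1]` belongs to a day that is already inside the input window, whereas `labels[i]` belongs to the first day after it.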
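To make the off-by-one concrete, here is a minimal sketch on toy data. It is not the repository's code: `build_windows` and its `label_offset` parameter are hypothetical names introduced only for this illustration, and it assumes `labels[k]` describes the price movement of day `k`. Each toy feature value and label is simply the day index, so it is easy to see which day each target comes from.

```python
import numpy as np

def build_windows(values, labels, num_historical_days, label_offset):
    """Hypothetical helper mirroring the loop quoted in this issue.

    label_offset=-1 reproduces labels[i-1] (current behaviour),
    label_offset=0 reproduces labels[i] (proposed change).
    """
    windows, targets, target_rows = [], [], []
    for i in range(num_historical_days, len(values), num_historical_days):
        # rows i-num_historical_days .. i-1 form the input window
        windows.append(values[i - num_historical_days:i])
        # row whose label is used as the prediction target
        targets.append(labels[i + label_offset])
        target_rows.append(i + label_offset)
    return np.array(windows), np.array(targets), target_rows

# Toy data (assumption): 10 days, one feature equal to the day index,
# and a label equal to the day index so its origin is visible.
values = np.arange(10).reshape(-1, 1)
labels = np.arange(10)

leaky_x, leaky_y, leaky_rows = build_windows(values, labels, 3, -1)
fixed_x, fixed_y, fixed_rows = build_windows(values, labels, 3, 0)

for k in range(len(leaky_rows)):
    last_day_in_window = int(leaky_x[k, -1, 0])
    print(f"window ends at day {last_day_in_window}: "
          f"labels[i-1] -> day {leaky_rows[k]}, labels[i] -> day {fixed_rows[k]}")
```

With `labels[i-1]` the target day is always the last row of the window itself, while with `labels[i]` it is the day immediately after the window. Because the loop stops before `len(df)`, `labels[i]` stays in range for every window.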