Improve documentation for saving and loading of feature extraction calculators #22
Comments
Hola Jorge! Yes, it is indeed possible to save and load the extraction calculators that were used on a dataset. Actually, we spent a lot of time thinking about how to make that possible. The solution we came up with is based on the [...]. For you as a user, there are two options for saving which features were calculated:
from tsfresh import extract_features
from tsfresh.feature_extraction.settings import FeatureExtractionSettings
# The instantiated settings object will calculate all features
settings_new = FeatureExtractionSettings()
# Now we assume that X_old is the result of an earlier filtered extraction run
# We set our new settings object to only calculate the features from X_old
settings_new = settings_new.from_columns(X_old.columns)
# Now we only calculate the features that were contained in X_old
# (df_new is assumed to be the new time series data)
X_new = extract_features(df_new, settings_new)
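To reuse the same feature set in a later session, the column names of X_old have to be stored somewhere. A minimal sketch, assuming the column names are plain strings: write them to a JSON file and pass the reloaded list to from_columns as above. The helper names, the file name, and the feature names below are hypothetical placeholders, not tsfresh API or real tsfresh output.

```python
import json
import os
import tempfile


def save_feature_names(columns, path):
    """Write the feature column names to a JSON file (hypothetical helper)."""
    with open(path, "w") as fh:
        json.dump([str(c) for c in columns], fh)


def load_feature_names(path):
    """Read the feature column names back as a list of strings."""
    with open(path) as fh:
        return json.load(fh)


# Placeholder names standing in for list(X_old.columns)
old_columns = ["value__mean", "value__maximum"]

# Hypothetical file location; any writable path works
path = os.path.join(tempfile.mkdtemp(), "selected_features.json")
save_feature_names(old_columns, path)

# In a later session: reload the names and hand them to from_columns
restored_columns = load_feature_names(path)
```

JSON keeps the stored list readable and independent of the Python version, which may matter if the model is trained and applied in different environments.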
Hi Max, I tried to do the same following your example, but my X_new has a different number of features than my X_old. My code is here: https://github.com/Huandao0812/lstm_exp/blob/master/test_tsfresh.py#L46
Update: I checked the diff of the two sets of columns and this is the difference:
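For anyone who wants to run the same kind of check, a column comparison can be sketched with plain Python sets. The function name and the feature names are hypothetical placeholders; in practice the inputs would be X_old.columns and X_new.columns.

```python
def column_diff(old_columns, new_columns):
    """Return (missing, unexpected): names only in old, names only in new."""
    old, new = set(old_columns), set(new_columns)
    return sorted(old - new), sorted(new - old)


# Placeholder names standing in for X_old.columns and X_new.columns
missing, unexpected = column_diff(
    ["a__mean", "a__maximum"],
    ["a__mean", "a__minimum"],
)
print("only in X_old:", missing)
print("only in X_new:", unexpected)
```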
Hello Max and the other contributors,
Great work on this library; it works really well.
Is it possible to save and load the extraction calculators that were used on a dataset? For instance, let's say that I have a dataset and I run extract_features and then select_features. As a result, some features are removed, and each remaining feature was calculated by a specific calculator. Is there a way to get those calculators?
The reason for this request is that I need to process future datasets with the same calculators, since my model is trained on them.
Let me know if this is possible or if I am missing something.
Thank you.