Thank you for this wonderful library; it has worked really well so far. When I use pat2feat.get_features to extract features for many patterns, I get many warnings like this:
/sequential/pat2feat.py:79: PerformanceWarning: DataFrame is highly fragmented. This is usually the result of calling `frame.insert` many times, which has poor performance. Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
df['feature_' + str(i)] = df.apply(lambda row: is_satisfiable_in_rolling(row['sequence'], pattern,
Would it be possible to generate the column data first and then build the DataFrame in one step by concatenating, instead of adding the columns one by one? Would it also be possible for pat2feat to return a NumPy array directly, since pandas is often overkill?
Thank you @jcklie for your interest in and feedback on the library! The performance warning appears to be caused by inserting a large number of columns into the DataFrame one at a time. Building the columns first and concatenating them in bulk, instead of inserting on each iteration, is a good suggestion that could speed up the process. We will look into this in a future library update.
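For anyone who wants to try the bulk approach in the meantime, here is a minimal sketch of the idea. It is not the library's code: `contains_in_order` is a hypothetical stand-in for the real satisfiability check (whose full signature is not shown in the warning above), and `build_feature_frame` is an illustrative helper name. It computes each feature column separately, joins them once with `pd.concat(axis=1)`, and then converts the result to a NumPy array.

```python
import pandas as pd


def contains_in_order(sequence, pattern):
    """Hypothetical stand-in predicate (not the library's check):
    True if the items of `pattern` appear in `sequence` in the same order."""
    it = iter(sequence)
    # `item in it` advances the iterator, so matches must occur in order.
    return all(item in it for item in pattern)


def build_feature_frame(sequences, patterns):
    """Build all feature columns first, then join them in a single concat.

    Assumes `patterns` is non-empty; pd.concat raises on an empty list.
    """
    columns = [
        pd.Series(
            [contains_in_order(seq, pattern) for seq in sequences],
            name="feature_" + str(i),
        )
        for i, pattern in enumerate(patterns)
    ]
    # One concat instead of one insert per pattern avoids fragmentation.
    return pd.concat(columns, axis=1)


sequences = [["a", "b", "c"], ["c", "a"], ["b", "c"]]
patterns = [["a", "c"], ["b"], ["c", "a"]]

features = build_feature_frame(sequences, patterns)
# Plain NumPy array of the same values, for callers who do not need pandas.
feature_matrix = features.to_numpy()
```

Because every column is built before the frame exists, this path never inserts into an existing DataFrame, which is the operation the PerformanceWarning points at. The `to_numpy()` call covers the second request without changing how the features are computed.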