simplified how RUL is calculated using transform method with GroupBy #8

kylejones200 · 2020-05-27T17:37:51Z

Issue #, if available:

Description of changes: The original version uses a complicated approach to find the maximum number of cycles for each id. Using the transform method on a pandas GroupBy object, we can compute the max cycle for each id and assign it directly to the proper column. This avoids making extra copies of the DataFrame and then merging those slices back together.

Original:

for i, df in enumerate(train_df):
    rul = pd.DataFrame(df.groupby('id')['cycle'].max()).reset_index()
    rul.columns = ['id', 'max']
    df = df.merge(rul, on=['id'], how='left')
    df['RUL'] = df['max'] - df['cycle']
    df.drop('max', axis=1, inplace=True)
    train_df[i]=df

Revised:

df['max'] = df.groupby(['id'])['cycle'].transform(max)
df['RUL'] = df['max'] - df['cycle']
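
To check that the two approaches agree, here is a minimal sketch on hypothetical toy data (the values below are made up for illustration, and only the 'id' and 'cycle' columns come from the snippets above). It passes the string "max" to transform instead of the builtin max, which is an assumption about preferred style: both select the same groupby reduction, but recent pandas releases may warn when given the builtin. It also computes RUL in one step, so no helper 'max' column is left behind, matching the original code, which drops that column.

```python
import pandas as pd

# Hypothetical toy data: unit 1 runs for 3 cycles, unit 2 for 2 cycles.
df = pd.DataFrame({
    "id":    [1, 1, 1, 2, 2],
    "cycle": [1, 2, 3, 1, 2],
})

# Merge-based approach, following the original code.
rul = df.groupby("id")["cycle"].max().reset_index()
rul.columns = ["id", "max"]
merged = df.merge(rul, on="id", how="left")
merged["RUL"] = merged["max"] - merged["cycle"]
merged = merged.drop(columns="max")

# transform-based approach: the per-id max is broadcast back to every row,
# so it can be subtracted from 'cycle' without a merge or a helper column.
transformed = df.copy()
transformed["RUL"] = (
    transformed.groupby("id")["cycle"].transform("max") - transformed["cycle"]
)

print(transformed["RUL"].tolist())  # [2, 1, 0, 1, 0]
```

In the notebook, the transform line would presumably sit inside the same loop over train_df that the original code uses.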

This code could be further simplified by using the "names" argument to assign the column labels. I didn't make this change because the way the columns list is used for the test datasets causes issues. However, the process for reading in the test data is also needlessly complex.
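
As a rough illustration of the "names" idea, the sketch below assumes the argument meant is the names parameter of pandas.read_csv and that the data is whitespace-delimited text without a header row. The column names and sample rows are hypothetical and are not taken from the notebook.

```python
import io

import pandas as pd

# Hypothetical column labels and sample rows, for illustration only.
columns = ["id", "cycle", "setting1", "s1"]
raw = "1 1 0.5 641.8\n1 2 0.4 642.1\n2 1 0.6 642.3\n"

# names= assigns the labels at read time, so there is no separate
# `df.columns = ...` step afterwards. header=None tells pandas that
# the first row is data rather than a header.
df = pd.read_csv(io.StringIO(raw), sep=r"\s+", header=None, names=columns)

print(df.columns.tolist())
```

With a real file, the io.StringIO object would be replaced by the file path.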

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

simplified how RUL is calculated using transform method from pd.GroupBy()

ad85b9e

sojiadeshina force-pushed the master branch 3 times, most recently from f7c595d to 7b2dad4 Compare September 15, 2020 19:33
