@jeromedockes
Member

As discussed in skrub meetings.
Currently, the actual functionality of the GapEncoder is implemented by the GapEncoderColumn. The GapEncoder is only a wrapper that fits a separate GapEncoderColumn to each column and collects the results in a 2D array.
That part is now handled by the TableVectorizer, so we only need the GapEncoderColumn.

This PR essentially removes the GapEncoder, then renames GapEncoderColumn to GapEncoder.
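
For readers unfamiliar with the wrapper pattern described above, here is a minimal, dependency-free sketch of it. Both `ToyColumnEncoder` and `PerColumnWrapper` are hypothetical names invented for illustration; they are not skrub classes, and the toy one-hot logic merely stands in for the real encoding. The sketch only shows the "fit one independent encoder per column, then concatenate the outputs horizontally" step that this PR delegates to the TableVectorizer.

```python
import copy


class ToyColumnEncoder:
    """Hypothetical single-column encoder (a stand-in, not skrub's GapEncoder).

    It learns the distinct values of one column and one-hot encodes them.
    """

    def fit(self, column):
        self.categories_ = sorted(set(column))
        return self

    def transform(self, column):
        # One output row per input value, one feature per learned category.
        return [
            [1.0 if value == category else 0.0 for category in self.categories_]
            for value in column
        ]


class PerColumnWrapper:
    """Hypothetical wrapper: fits one independent encoder copy per column."""

    def __init__(self, encoder):
        self.encoder = encoder

    def fit(self, table):
        # `table` maps a column name to a list of values (assumed layout).
        self.encoders_ = {
            name: copy.deepcopy(self.encoder).fit(values)
            for name, values in table.items()
        }
        return self

    def transform(self, table):
        blocks = [
            self.encoders_[name].transform(table[name]) for name in self.encoders_
        ]
        # Concatenate the per-column blocks horizontally, row by row.
        return [sum(row_parts, []) for row_parts in zip(*blocks)]


table = {"city": ["Paris", "Rome", "Paris"], "job": ["cook", "cook", "pilot"]}
encoded = PerColumnWrapper(ToyColumnEncoder()).fit(table).transform(table)
```

Once a generic component already does the per-column dispatch and stacking, only the single-column class needs to remain, which is the simplification proposed here.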

"to_list",
"to_numpy",
"to_pandas",
"reset_index",
Member Author

Changes in this file are taken from #919 and should be reviewed there.

from datetime import datetime

import numpy as np
import pandas as pd
Member Author

Changes in this file are taken from #919 and should be reviewed there.

class GapEncoderColumn(BaseEstimator, TransformerMixin):
"""GapEncoder for encoding a single column.
class GapEncoder(SingleColumnTransformer, TransformerMixin):
"""Constructs latent topics with continuous encoding.
Member Author

the docstring has been moved essentially unchanged from the previous GapEncoder with 2 minor changes:

  • the docstring still mentioned 'empty_impute', but apparently the parameter has been renamed 'zero_impute'
  • the docstring mentioned 'inverse_transform' but it doesn't seem to be defined

Member

I assume that this was resolved after what we discussed in the meeting.

@jeromedockes changed the title from "[WIP] Make GapEncoder a single-column transformer" to "Make GapEncoder a single-column transformer" on May 31, 2024
@glemaitre
Member

@jeromedockes could you resolve the conflicts?

@glemaitre merged commit 1dae4b0 into skrub-data:main on Jun 6, 2024
@jeromedockes deleted the make_gap_encoder_columnwise branch on June 7, 2024 11:57