Remove unnecessary OneHotEncoder when there are nan values #725

Merged (5 commits) on Oct 24, 2023

Conversation

Member

@fealho fealho commented Oct 17, 2023

Resolve #616.

@fealho fealho requested a review from a team as a code owner October 17, 2023 15:39
@fealho fealho requested review from frances-h and pvk-developer and removed request for a team October 17, 2023 15:39

codecov-commenter commented Oct 17, 2023

Codecov Report

All modified lines are covered by tests ✅

Comparison is base (1502ebd) 100.00% compared to head (d81a557) 100.00%.
Report is 5 commits behind head on main.

❗ Current head d81a557 differs from pull request most recent head d672946. Consider uploading reports for the commit d672946 to get more accurate results

Additional details and impacted files
@@            Coverage Diff            @@
##              main      #725   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           17        17           
  Lines         1805      1805           
=========================================
  Hits          1805      1805           
Files Coverage Δ
rdt/transformers/categorical.py 100.00% <100.00%> (ø)


# Run
ohe.fit(data, 'column_name')
# Create a temporary file and dump the transformer
tmp = tempfile.NamedTemporaryFile()
Contributor

Can we use the pytest tmp_path fixture instead?
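
A minimal sketch of what this suggestion could look like, assuming the test only needs a scratch location for the pickle. pytest's built-in `tmp_path` fixture passes the test a `pathlib.Path` to a fresh per-test directory, so no `tempfile.NamedTemporaryFile()` handle has to be managed by hand. The helper `save_and_reload`, the test name, and the plain dict standing in for the fitted `OneHotEncoder` are hypothetical and not taken from the PR.

```python
import pickle
from pathlib import Path


def save_and_reload(obj, directory: Path):
    """Pickle obj into a file under directory, then load it back."""
    pickle_path = directory / 'ohe.pkl'
    pickle_path.write_bytes(pickle.dumps(obj))
    return pickle.loads(pickle_path.read_bytes())


def test_pickle_round_trip(tmp_path):
    # pytest injects tmp_path: a pathlib.Path to a directory created for this test.
    # Hypothetical stand-in for the fitted transformer used in the real test.
    state = {'column': 'column_name', 'categories': ['a', 'b', None]}
    assert save_and_reload(state, tmp_path) == state
```

Run under pytest, the fixture argument is supplied automatically; the function can also be called directly with any existing directory path.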

@fealho fealho requested a review from frances-h October 18, 2023 15:50
Contributor

@frances-h frances-h left a comment

Looks good, just one last small nitpick

Comment on lines 436 to 440
tmp = tmp_path / 'ohe.pkl'
with open(tmp, 'wb') as tmp:
    pickle.dump(ohe, tmp)
with open(tmp.name, 'rb') as tmp:
    ohe_loaded = pickle.load(tmp)
Contributor

nitpick: can we use different var names for the filepath and the file?
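
In the snippet under review, `tmp` starts out as the path, is rebound to the open file handle by each `with ... as tmp`, and the second `open` then relies on the handle's `.name` attribute to recover the path. One possible way to address the nitpick is sketched below. The names `pickle_path`, `pickle_file`, and `pickle_round_trip` are hypothetical, and the version actually merged may differ.

```python
import pickle
from pathlib import Path


def pickle_round_trip(obj, directory: Path):
    """Dump obj to a pickle file under directory, then load it back."""
    pickle_path = directory / 'ohe.pkl'  # the location on disk
    with open(pickle_path, 'wb') as pickle_file:  # the open file handle
        pickle.dump(obj, pickle_file)
    with open(pickle_path, 'rb') as pickle_file:
        return pickle.load(pickle_file)
```

Keeping the path and the handle under separate names means the second `open` reads from the same `pickle_path` variable instead of reaching into a closed file object.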

@fealho fealho merged commit 8d6ec14 into main Oct 24, 2023
45 checks passed
@fealho fealho deleted the issue-616-ohe-warning branch October 24, 2023 17:01
Development

Successfully merging this pull request may close these issues.

Unnecessary warning in OneHotEncoder when there are nan values
4 participants