Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(skorch): add an inherited class from skorch.NeuralNet that is compatible with PyTorch Frame #375

Open
wants to merge 55 commits into
base: master
Choose a base branch
from

Conversation

34j
Copy link

@34j 34j commented Mar 11, 2024

Related: #147

@MacOS Please continue from here if it helps.
Sorry for being so loud, but this took me a whole day, so I would appreciate it very much if you could make me as a co-author if you used this code.

@34j 34j marked this pull request as draft March 11, 2024 13:06
Copy link

codecov bot commented Mar 11, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 93.52%. Comparing base (ee98b87) to head (aa5484d).
Report is 6 commits behind head on master.

❗ Current head aa5484d differs from pull request most recent head cb76e8d. Consider uploading reports for the commit cb76e8d to get more accurate results

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #375   +/-   ##
=======================================
  Coverage   93.52%   93.52%           
=======================================
  Files         124      124           
  Lines        6456     6456           
=======================================
  Hits         6038     6038           
  Misses        418      418           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@MacOS
Copy link

MacOS commented Mar 11, 2024

Sure - on both!

@34j 34j marked this pull request as ready for review March 16, 2024 05:15
@34j
Copy link
Author

34j commented Mar 16, 2024

@weihua916 Would you mind reviewing if you think this is a good way to implement it?

Also, it is strange that mypy in pre-commit does not raise errors, but mypy in CI does. I don't think there is any way to deal with this.

@34j
Copy link
Author

34j commented Apr 3, 2024

@weihua916 @zechengz @yiweny Would appreciate your review, thank you very much in advance.

Copy link
Contributor

@weihua916 weihua916 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

Do you mind adding unit test similar to https://github.com/pyg-team/pytorch-frame/blob/master/test/gbdt/test_gbdt.py?

examples/revisiting.py Outdated Show resolved Hide resolved
examples/revisiting.py Outdated Show resolved Hide resolved
@weihua916
Copy link
Contributor

A kind check-in. Is there any progress here?

@34j
Copy link
Author

34j commented May 2, 2024

No progress, sorry

@34j 34j requested a review from weihua916 July 5, 2024 03:28
@34j
Copy link
Author

34j commented Jul 5, 2024

@weihua916 Thanks for your patience, I've added tests.

Copy link
Member

@zechengz zechengz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some comments. Can @weihua916 or @yiweny helps to take a look? Thanks for your contribution!

examples/revisiting.py Outdated Show resolved Hide resolved
examples/tutorial.py Outdated Show resolved Hide resolved
def _patch_skorch_support_tenforframe() -> None:
old_to_tensor = skorch.utils.to_tensor

def to_tensor(X, device, accept_sparse=False):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add typing information?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As well, this is not recommended as skorch does not support typing.

torch_frame/utils/skorch.py Show resolved Hide resolved
torch_frame/utils/skorch.py Outdated Show resolved Hide resolved
def __init__(
self,
# NeuralNet parameters
module,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also add the typing information here?

Copy link
Author

@34j 34j Jul 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since skorch does not support typing, doing this is tedious and not recommended. (On the other hand, it would be inconvenient for users if it was **args, **kwargs, so I compromised and made it this way.)

README.md Outdated Show resolved Hide resolved
examples/revisiting.py Outdated Show resolved Hide resolved
examples/tutorial.py Outdated Show resolved Hide resolved
test/utils/test_skorch.py Outdated Show resolved Hide resolved
@34j
Copy link
Author

34j commented Jul 11, 2024

@weihua916 Removed all tutorials and added examples/sklearn_api.py instead.
The failing tests are probably due to a pandas version update and are not related to this PR.

@34j 34j requested review from weihua916 and zechengz July 11, 2024 10:05
Copy link
Contributor

@weihua916 weihua916 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a quick question.

Comment on lines +47 to +48
# [stype.text_embedded],
# [stype.numerical, stype.numerical, stype.text_embedded],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't support these stypes?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently not supported at this time due to lack of time to understand how to use these dtypes.
However, since it probably only require changes in the arguments of the NeuralNet, it should have little trouble extending it in the future.

Comment on lines +144 to +149
if pass_dataset:
net.fit(dataset)
_ = net.predict(test_dataset)
else:
net.fit(X_train, y_train)
_ = net.predict(X_test)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why don't we take tensor frame? It's also weird to sometimes take dataset and sometimes take data frame.

Copy link
Author

@34j 34j Jul 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • The main purpose of this PR is to allow DataFrame to be DIRECTLY fitted, as shown in examples/sklearn_api.py.
  • Since it is unclear how to create a Dataset from a TensorFrame, and if there is a TensorFrame, there should be also a Dataset, which means there is little need to implement this, and even to use skorch as the user might be familiar with deep learning.
  • Instead of "sometimes take dataset and sometimes take data frame", both are tested.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm if dataframe is directly fed, it is unclear why we need this feature within pytorch frame.
the whole point of pytorch frame is to materialize data frame into tensor frame, to be processed by pytorch.

Copy link
Author

@34j 34j Jul 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may not match your purpose but my goal is to use advanced neural networks implemented in pytorch_frame in existing sklearn pipeline.
This PR allows pytorch_frame to be used on top of existing scikit-learn code without having to heavily modify the existing code. Since many people use sklearn Pipeline, especially on Kaggle, it is easy to verify performance changes by changing or assembling the estimator in other people's code to my NeuralNetPytorchFrame. I am convinced that this will be very valuable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants