Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] FIX Unit tests on Windows #85

Merged
merged 5 commits into from Jan 3, 2018
Merged

Conversation

rth
Copy link
Contributor

@rth rth commented Jan 2, 2018

This PR aims to fix #82

It includes,

The latest Appveyor output can be seen here some of the failures will go away once #83 is merged..

@maciejkula
Copy link
Owner

I suspect both have something to do with numpy integers being 64-bit on Linux but not (apparently) on Windows.

@maciejkula
Copy link
Owner

It seems we are very efficiently duplicating each other's work :)

@rth
Copy link
Contributor Author

rth commented Jan 2, 2018

@maciejkula yes, I just realised this is a duplicate of your PR #84.

I can remove the redundant parts, up to you how to split this work, I wouldn't mind helping a bit :)

@maciejkula
Copy link
Owner

OK, I like your set up better, so let's roll with what's here and I'll close the other PR.

With respect to the int/long issue: I think I would prefer to track down the root issue (why do we have ints anywhere at all?) and leave the layers without casting. Is there a way you could roll back the embedding layer changes only so that I could see where it fails in Appveyor?

@maciejkula
Copy link
Owner

What surprises me is that I have explicit type conversions here, but it looks like this gets ints anyway.

That's probably a fruitful line of research if you have pdb handy.

@maciejkula maciejkula mentioned this pull request Jan 2, 2018
@rth
Copy link
Contributor Author

rth commented Jan 2, 2018

Great!

With respect to the int/long issue: I think I would prefer to track down the root issue (why do we have ints anywhere at all?) and leave the layers without casting. Is there a way you could roll back the embedding layer changes only so that I could see where it fails in Appveyor?

Yes, sure, done.

What surprises me is that I have explicit type conversions here, but it looks like this gets ints anyway.

I don't get it either, but I'm also not used to debugging PyTorch code. The function deeper in the traceback that raises the exception (e.g. BloomEmbedding._get_hashed_indices) also doesn't get ints all the time, it gets mostly longs and sometimes ints and then it fails. I can try to debug this some more tomorrow if needed.

@maciejkula
Copy link
Owner

Great! I'm of limited use because I don't have access to a Windows machine, so can't easily make it fail without going to Appveyor all the time.

I think the key will be dropping into a debugger where it fails and going up the stack until we hit the source of the problem.

@maciejkula
Copy link
Owner

maciejkula commented Jan 3, 2018

In hindsight, so obvious. Thanks a lot for the fixes! Are you happy for me to merge?

If so, I'll push a new version out to Conda tonight.

@rth
Copy link
Contributor Author

rth commented Jan 3, 2018

Yes, everything appears to be due to randint. Opened an issue in numpy/numpy#10317 .
Sure, feel free to merge. Thanks for the quick review!

@maciejkula
Copy link
Owner

🎉

(I have seen similar things with arange, and I'm not sure if there is an underlying rationale to defaults changing between version/platforms of numpy.)

@maciejkula maciejkula merged commit ca5d8fd into maciejkula:master Jan 3, 2018
@rth rth deleted the windows-ci branch January 3, 2018 10:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unit tests on Windows
2 participants