Is this available on windows? #150
Comments
Hi @snaik2016, we haven't tried building it on Windows. We expect some issues because we generate code at runtime and the calling convention differs with the Windows compiler and platform. There may be some compiler incompatibility issues as well. However, we don't expect any major changes. Windows support is not currently planned for the near future, but we welcome external contributions. Thanks
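To illustrate the calling-convention point above, here is a minimal sketch of how a runtime code generator might pick the register for each integer argument. The `Abi` enum and `intArgRegister` helper are hypothetical names made up for this example and are not part of FBGEMM; the register orders are those of the System V AMD64 and Microsoft x64 conventions.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper for illustration only; not FBGEMM code.
enum class Abi { SystemV, Win64 };

// Returns the register that carries the index-th integer/pointer argument,
// or an empty string when that argument is passed on the stack.
inline std::string intArgRegister(Abi abi, std::size_t index) {
  // System V AMD64 (Linux, macOS): six integer argument registers.
  static const char* const kSysV[] = {"rdi", "rsi", "rdx", "rcx", "r8", "r9"};
  // Microsoft x64 (Windows): four integer argument registers.
  static const char* const kWin64[] = {"rcx", "rdx", "r8", "r9"};
  if (abi == Abi::SystemV) {
    return index < 6 ? kSysV[index] : "";
  }
  return index < 4 ? kWin64[index] : "";
}

// ABI of the platform this translation unit is compiled for.
inline Abi hostAbi() {
#if defined(_WIN32)
  return Abi::Win64;
#else
  return Abi::SystemV;
#endif
}
```

Generated code that hard-codes `rdi` as the first argument would read the wrong register on Windows. The set of callee-saved registers also differs: on Windows `rdi`, `rsi` and `xmm6` through `xmm15` must be preserved, so generated prologues and epilogues would likely need adjusting as well.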
I tried with #238. Since I don't have much knowledge of assembly code, I cannot proceed any further. Could you please take a look there?
@peterjc123 Thanks for your contributions. Let me look at your PR. I have been making some progress on this recently with d06ea70, c960d51, fc3dfe3 and others.
@dskhudia @peterjc123 @snaik2016 Actually, I've made FBGEMM fully functional on Windows (https://github.com/marian-nmt/FBGEMM). I could rebase onto the latest branch if needed.
Summary: Pull Request resolved: pytorch#240 For pytorch#150 Reviewed By: jspark1105 Differential Revision: D19324136 fbshipit-source-id: aa6148c2993d5ab9cb9d72d7f8a0942c4f0e454a
Summary: Pull Request resolved: pytorch#241 For pytorch#150 Some changes picked from pytorch#238 with minor modifications. Thanks a lot for your contributions. Reviewed By: jspark1105 Differential Revision: D19331181 fbshipit-source-id: 08dd3a8ffd65f781e3bda8d241b9eb0221f03669
Summary: This PR includes an intrinsics-based implementation of the FP16 kernels for the Windows build. AVX2 is slightly slower than or similar in speed to the inline assembly version. AVX512 is slower than the inline assembly version for large m sizes. Either way, both are much faster than the reference implementation. I've done accuracy tests for AVX2 and AVX512. This is for pytorch#150. Pull Request resolved: pytorch#254 Differential Revision: D19460473 Pulled By: jspark1105 fbshipit-source-id: b3dcb2e4a19cf315ae9e339959a659bd19562fa6
Summary: This PR includes an intrinsics-based implementation of the FP16 kernels for the Windows build. AVX2 is slightly slower than or similar in speed to the inline assembly version. AVX512 is slower than the inline assembly version for large m sizes. Either way, both are much faster than the reference implementation. I've done accuracy tests for AVX2 and AVX512. This is for pytorch#150. Pull Request resolved: pytorch#254 Differential Revision: D19460996 Pulled By: jspark1105 fbshipit-source-id: 4f5248140cd2f3ee3a40c1e330c2be3085ed718a
Summary: This PR includes an intrinsics-based implementation of the FP16 kernels for the Windows build. AVX2 is slightly slower than or similar in speed to the inline assembly version. AVX512 is slower than the inline assembly version for large m sizes. Either way, both are much faster than the reference implementation. I've done accuracy tests for AVX2 and AVX512. This is for pytorch#150. Pull Request resolved: pytorch#254 Differential Revision: D19460996 Pulled By: jspark1105 fbshipit-source-id: 4c29863aec069ce009c85d67cc42269f6acf569e
Summary: This PR includes an intrinsics-based implementation of the FP16 kernels for the Windows build. AVX2 is slightly slower than or similar in speed to the inline assembly version. AVX512 is slower than the inline assembly version for large m sizes. Either way, both are much faster than the reference implementation. I've done accuracy tests for AVX2 and AVX512. This is for pytorch#150. Pull Request resolved: pytorch#254 Differential Revision: D19460996 Pulled By: jspark1105 fbshipit-source-id: 537766a226a3e8436bee259a1240a775e819c34b
Summary: This PR includes an intrinsics-based implementation of the FP16 kernels for the Windows build. AVX2 is slightly slower than or similar in speed to the inline assembly version. AVX512 is slower than the inline assembly version for large m sizes. Either way, both are much faster than the reference implementation. I've done accuracy tests for AVX2 and AVX512. This is for #150. Pull Request resolved: #254 Reviewed By: shz0116 Differential Revision: D19460996 Pulled By: jspark1105 fbshipit-source-id: 3542918b4224c057a7a239cfbe4b44f6ad618b5e
@snaik2016 We are trying to port the code to the Windows platform. One question is whether you need both static and shared FBGEMM libraries, or just static. Do you know how important a shared library is on Windows compared with a static one?
@shz0116 In my use case in marian-nmt, I include the FBGEMM files directly in my marian VS project. The parent VS project generates a shared library (DLL), so I am using a shared library (I use all the files in FBGEMM). I define 'FBGEMM_EXPORTS' in the project.
@ykim362 So you have not seen the problem in https://github.com/pytorch/FBGEMM/issues/266?
@shz0116 I don't have a problem in my forked branch (https://github.com/marian-nmt/FBGEMM). I might need to merge the latest FBGEMM branch into marian-nmt soon, so I will let you know how it goes.
@shz0116 I can see one difference between the old version and the latest version. 'FBGEMM_EXPORTS' is forcibly defined in some .cc files in the latest version, for example FBGEMM/src/PackAWithQuantRowOffset.cc, line 7 in 27cb280.
Could this cause potential mismatches between dllimport and dllexport?
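For readers unfamiliar with the dllimport/dllexport question raised above, the following is a minimal single-file sketch of the usual export-macro pattern. It assumes a hypothetical library called `mylib`; `MYLIB_API`, `MYLIB_EXPORTS`, `MYLIB_STATIC` and `mylib_add` are illustrative names and not FBGEMM's actual macros or functions.

```cpp
// --- mylib.cc (top): the library's own source requests dllexport ---
// Hypothetical names throughout; this mirrors the general pattern only.
#define MYLIB_EXPORTS

// --- mylib.h: shared by the library and its consumers ---
#if defined(_WIN32) && !defined(MYLIB_STATIC)
#  if defined(MYLIB_EXPORTS)
#    define MYLIB_API __declspec(dllexport)  // building the DLL
#  else
#    define MYLIB_API __declspec(dllimport)  // consuming the DLL
#  endif
#else
#  define MYLIB_API  // static build or non-Windows: no decoration needed
#endif

MYLIB_API int mylib_add(int a, int b);

// --- mylib.cc (continued): the definition sees the dllexport declaration ---
int mylib_add(int a, int b) { return a + b; }
```

In this pattern a mismatch arises when a translation unit that defines a function sees its declaration as dllimport, or when a consumer of the DLL accidentally compiles with the export macro defined. Whether forcing the macro inside individual source files triggers that depends on how the files are pulled into the final project, which is the open question in this thread.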
I guess the static build is okay if the size is not too big. Otherwise, it will fail in the linking stage because PyTorch is already large enough.
The shared build should be working in #268.
FBGEMM works on Windows now. Closing this issue.
Can this library be built on Windows?