-
Notifications
You must be signed in to change notification settings - Fork 360
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MSVC fixes #841
MSVC fixes #841
Conversation
@@ -159,13 +161,21 @@ def gen_function_decl(func_attrs: Dict[str, Any]) -> str: | |||
return; | |||
} | |||
// Determine stride for each input dimension | |||
{% if is_windows %} | |||
{{index_type}}* input_strides = ({{index_type}}*) malloc(input_rank * sizeof(int64_t)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @hlky !
Here, both input_rank and output_rank are actually constants. They could be codegen-ed ahead-of-time instead of being passed around at runtime. In this case, the malloc cost can be avoided. Do you want to fix in this way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ipiszy Thanks for the suggestion. input_rank and output_rank are now codegen-ed.
Thanks @hlky for the MSVC fix! |
Added another fix related to included constants. |
@ipiszy has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
|
It seems there is an issue with your environment, please confirm:
|
okay: |
FX2AIT is different. If you did not clone with submodules you should clone again
|
when running |
It seems you do not have the submodules. |
executed these two commands; do I merge with this PR now? or do I run the setup scripts first? |
|
merged it, I'm sure I did it right. now when running |
It seems you do not have the submodules. Refer to the previous instructions. |
just did.. it's different now however. I think it works? I'm not sure if what I got was an error; it made a bunch of files and acted normally. |
You will see something like this if it is working
Initial profiling will take some time. The last text before compilation begins will be something like
Compilation will take several minutes, if you have low CPU core count it can take a while. Alternatively at this point you can cancel out and load the solution in Visual Studio |
after the |
No, when you run a compilation script. |
didn't try yet, I will now. |
I can't believe this; |
@hlky will there be an easy tutorial for using AIT on SDXL after you and Comfy figure this out? I only managed to get this to work on my linux dual boot |
tbh, when I initially implemented CMake version, I did not test whether AITemplate can be installed via setup.py, but I made sure that the binary & tests work by running ones from the AITemplate directory. Try to manually change the reference in a python file that uses aitemplate, will it work in this way, at least? |
Yep, there were definite problems with it |
@hlky |
@alexanderguzhva Malloc was replaced in this commit, it's all codegen'd instead. |
As I cannot test on Windows: What's the state of this PR? There were a lot of comments after the initial review. Should we merge this if it doesn't introduce problems on Linux based CI? @alexanderguzhva @hlky |
I tried reinstalling MSVC, did it entirely. I uninstalled all previous CUDA files, ensured I selected .......idk why this doesn't work.. tried reinstalling everything, following your instructions precisely, still gets this error. |
@kadeng It's ready for review/merge. |
oh, am I the only one having trouble with this? well, the Comfy custom node will have the modules pre-compiled, right? |
This PR comprises 5 fixes related to CMake MSVC.
builder_cmake list of source fix
Issue occurs when compiling profilers.
Also fixed for absolute path work_dir.
MSVC aligned_storage fix
Adds
-D_DISABLE_EXTENDED_ALIGNED_STORAGE
to compile options.MSVC Conv2d common narrowing conversion
error C2398: Element '11': conversion from 'int64_t' to 'int' requires a narrowing conversion
Error refers to
AITemplate/python/aitemplate/backend/cuda/conv2d/common.py
Line 70 in 048c9e7
is_windows
passed to template so that fix is only applied on windows.MSVC tensor/expand.py fix
Encountered when compiling CLIP which uses expand ops.
ModelContainerGenerator enable is_windows/main_templates.py fix windll.h include
Error before enabling
is_windows
forModelContainerGenerator
:After enabling
is_windows
:windll.h
include was in the wrong place. It is moved outside of namespace.