
MSVC fixes #841

Closed
wants to merge 10 commits into from

Conversation

hlky
Contributor

@hlky hlky commented Jul 23, 2023

This PR comprises 5 fixes related to CMake MSVC.

builder_cmake list-of-sources fix

Issue occurs when compiling profilers.

File "aitemplate\backend\cuda\builder_cmake.py", line 262, in make_profilers
  build_dir = Path(source).parent / test_name

File "aitemplate\backend\cuda\builder_cmake.py", line 271, in make_profilers
  CMAKE_SOURCE_FILES=_files_as_str("../" + str(Path(source).name)),

TypeError: expected str, bytes or os.PathLike object, not list

Also fixed for an absolute-path work_dir.
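The shape of the fix can be sketched as a small guard that accepts either a single source or a list of sources before deriving the profiler build directory — the helper and call site below are illustrative, not the PR's actual code:

```python
from pathlib import Path


def _normalize_sources(source):
    """Accept either a single path-like source or a list of them,
    always returning a list of Path objects."""
    if isinstance(source, (list, tuple)):
        return [Path(s) for s in source]
    return [Path(source)]


def profiler_build_dir(source, test_name):
    # Derive the build directory from the first source; calling
    # Path(...) on the raw list is what raised the TypeError.
    sources = _normalize_sources(source)
    return sources[0].parent / test_name
```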

MSVC aligned_storage fix

error : static assertion failed with "You've instantiated std::aligned_storage<Len, Align> with an extended alignment (in other words, Align > alignof(max_align_t)). Before VS 2017 15.8, the member "type" would non-conformingly have an alignment of only alignof(max_align_t). VS 2017 15.8 was fixed to handle this correctly, but the fix inherently changes layout and breaks binary compatibility (*only* for uses of aligned_storage with extended alignments). To suppress this error, please define either (1) _ENABLE_EXTENDED_ALIGNED_STORAGE to confirm that you want a type with an extended alignment, or (2) _DISABLE_EXTENDED_ALIGNED_STORAGE to get the old non-conforming behavior."

Adds -D_DISABLE_EXTENDED_ALIGNED_STORAGE to compile options.
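In spirit, the change appends the define to the MSVC compile options; this is a minimal sketch with a hypothetical option list, not AITemplate's actual option plumbing:

```python
# Workaround define for MSVC's std::aligned_storage extended-alignment
# static_assert (see the error text above).
MSVC_DEFINES = ["-D_DISABLE_EXTENDED_ALIGNED_STORAGE"]


def msvc_compile_options(base_options):
    """Return base_options plus the aligned_storage workaround,
    skipping defines that are already present."""
    options = list(base_options)
    for define in MSVC_DEFINES:
        if define not in options:
            options.append(define)
    return options
```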

MSVC Conv2d common narrowing conversion

error C2398: Element '11': conversion from 'int64_t' to 'int' requires a narrowing conversion
Error refers to

{{indent}} *out_ch, // typename LayoutC::Stride::Index ldt

is_windows is passed to the template so that the fix is only applied on Windows.
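The is_windows branch can be mimicked with plain string formatting — an explicit narrowing cast is emitted only on Windows to silence C2398 (function and variable names here are illustrative):

```python
def emit_ldt_arg(out_ch_expr: str, is_windows: bool) -> str:
    """Emit the ldt argument for the Conv2d epilogue. On Windows an
    explicit cast avoids MSVC's C2398 narrowing-conversion error when
    an int64_t value initializes an int field."""
    if is_windows:
        return f"static_cast<int>(*{out_ch_expr}),  // ldt"
    return f"*{out_ch_expr},  // ldt"
```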

MSVC tensor/expand.py fix

Encountered when compiling CLIP which uses expand ops.

expand_4.cu(99): error : expression must have a constant value
  int64_t input_strides[input_rank];
                                   ^
expand_4.cu(99): note #2689-D: the value of parameter "input_rank" (declared at line 72) cannot be used as a constant
  int64_t input_strides[input_rank];
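Since input_rank is known at codegen time, the final version of this fix bakes it into the generated source as a literal, so MSVC never sees a variable-length array. A rough sketch of that kind of codegen, with illustrative names:

```python
def render_strides_decl(input_rank: int, index_type: str = "int64_t") -> str:
    """Emit a fixed-size array declaration: the rank becomes a literal
    in the generated source, so the array bound is a constant."""
    return f"{index_type} input_strides[{input_rank}];"
```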

ModelContainerGenerator enable is_windows/main_templates.py fix windll.h include

Error before enabling is_windows for ModelContainerGenerator:

model_container_base.obj : error LNK2019: unresolved external symbol "unsigned char const * const _binary_constants_bin_start" (?_binary_constants_bin_start@@3QBEB) referenced in function "public: __cdecl ait::ModelContainerBase::ModelContainerBase(unsigned __int64,unsigned __int64,unsigned __int64,unsigned __int64,unsigned __int64,class AITemplateAllocator &)" (??0ModelContainerBase@ait@@QEAA@_K0000AEAVAITemplateAllocator@@@Z) [build\model.vcxproj]
         model_container_base.obj : error LNK2019: unresolved external symbol "unsigned char const * const _binary_constants_bin_end" (?_binary_constants_bin_end@@3QBEB) referenced in function "public: __cdecl ait::ModelContainerBase::ModelContainerBase(unsigned __int64,unsigned __int64,unsigned __int64,unsigned __int64,unsigned __int64,class AITemplateAllocator &)" (??0ModelContainerBase@ait@@QEAA@_K0000AEAVAITemplateAllocator@@@Z) [build\model.vcxproj]
         build\Release\model.dll : fatal error LNK1120: 2 unresolved externals [build\model.vcxproj]

After enabling is_windows:

error : identifier "GetConstantsBin" is undefined [build\objlib.vcxproj]

The windll.h include was in the wrong place; it has been moved outside of the namespace.
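The gist, sketched as codegen: the platform include must be emitted before the namespace opens so that GetConstantsBin is declared at global scope (the template text and function name below are illustrative):

```python
def render_model_container(body: str, is_windows: bool) -> str:
    """Emit the model-container source, placing platform includes
    outside the ait namespace rather than inside it."""
    include = '#include "windll.h"\n' if is_windows else ""
    return f"{include}namespace ait {{\n{body}\n}}  // namespace ait\n"
```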

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jul 23, 2023
@@ -159,13 +161,21 @@ def gen_function_decl(func_attrs: Dict[str, Any]) -> str:
return;
}
// Determine stride for each input dimension
{% if is_windows %}
{{index_type}}* input_strides = ({{index_type}}*) malloc(input_rank * sizeof(int64_t));
@ipiszy
Contributor

Thanks @hlky !
Here, both input_rank and output_rank are actually constants. They could be codegen-ed ahead of time instead of being passed around at runtime; in that case the malloc cost can be avoided. Do you want to fix it this way?

@hlky
Contributor Author

@ipiszy Thanks for the suggestion. input_rank and output_rank are now codegen-ed.

@ipiszy
Contributor

ipiszy commented Jul 24, 2023

Thanks @hlky for the MSVC fix!

@hlky
Contributor Author

hlky commented Jul 25, 2023

Added another fix related to included constants.

@facebook-github-bot
Contributor

@ipiszy has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@Shaistrong

Shaistrong commented Aug 2, 2023

fatal error C1083: Cannot open include file: 'dlfcn.h': No such file or directory

Any idea what this is??

@hlky
Contributor Author

hlky commented Aug 2, 2023

It seems there is an issue with your environment. Please confirm:

  • Windows version
  • Visual Studio version and edition. Visual Studio 2022 Community/Professional is tested; a build-tools-only install has issues with CMake finding the CUDA .props. Select "Desktop development with C++" when installing VS.
  • CUDA version. If Visual Studio is installed after CUDA, reinstall CUDA for the Visual Studio integration, or extract the files from the installer and copy them manually. 12.x is tested.
  • You are using the "x64 Native Tools Command Prompt for VS 2022" and have run set AIT_USE_CMAKE_COMPILATION=1.
  • AITemplate is installed.
  • Python version, and that you are using the correct Python install when running compilation scripts. Creating a venv and then explicitly specifying the path to that venv's python is recommended.
  • Conda is not tested and may interact with the VS environment.

@Shaistrong

Shaistrong commented Aug 2, 2023

It seems there is an issue with your environment, please confirm: […]

okay:
  • Windows 11
  • VS 2022 Desktop; by "x64 Native Tools Command Prompt for VS 2022" I'm assuming the Developer Command Prompt?
  • CUDA 11.8
  • idk if AIT is installed; I get this error when running setup.py in the FX2AIT directory.
  • Python 3.10.9, no venv; will try in a min.

@hlky
Contributor Author

hlky commented Aug 2, 2023

I get this error when running setup.py in the FX2AIT directory.

FX2AIT is different.

If you did not clone with submodules, you should clone again:
git clone --recursive https://github.com/facebookincubator/AITemplate
or update the submodules:
git submodule update --init --recursive

This PR needs to be merged into a branch, then:

cd python
python setup.py bdist_wheel
pip install dist\aitemplate-0.3.dev0-py3-none-any.whl

@Shaistrong

python setup.py bdist_wheel

When running python setup.py bdist_wheel on a fresh install with this PR merged, I get this error(s?):
shutil.Error: [('../3rdparty\\composable_kernel\\library\\src\\tensor_operation_instance\\gpu\\batched_gemm_add_relu_gemm_add\\device_batched_gemm_add_relu_gemm_add_xdl_cshuffle_f16_f16_f16_f16_gmk_gnk_gno_gmo_instance.cpp', './aitemplate/3rdparty\\composable_kernel\\library\\src\\tensor_operation_instance\\gpu\\batched_gemm_add_relu_gemm_add\\device_batched_gemm_add_relu_gemm_add_xdl_cshuffle_f16_f16_f16_f16_gmk_gnk_gno_gmo_instance.cpp', "[Errno 2] No such file or directory: './aitemplate/3rdparty\\\\composable_kernel\\\\library\\\\src\\\\tensor_operation_instance\\\\gpu\\\\batched_gemm_add_relu_gemm_add\\\\device_batched_gemm_add_relu_gemm_add_xdl_cshuffle_f16_f16_f16_f16_gmk_gnk_gno_gmo_instance.cpp'"), ('../3rdparty\\composable_kernel\\library\\src\\tensor_operation_instance\\gpu\\batched_gemm_add_relu_gemm_add\\device_batched_gemm_add_relu_gemm_add_xdl_cshuffle_f16_f16_f16_f16_gmk_gnk_gon_gmo_instance.cpp', './aitemplate/3rdparty\\composable_kernel\\library\\src\\tensor_operation_instance\\gpu\\batched_gemm_add_relu_gemm_add\\device_batched_gemm_add_relu_gemm_add_xdl_cshuffle_f16_f16_f16_f16_gmk_gnk_gon_gmo_instance.cpp', "[Errno 2] No such file or directory: './aitemplate/3rdparty\\\\composable_kernel\\\\library\\\\src\\\\tensor_operation_instance\\\\gpu\\\\batched_gemm_add_relu_gemm_add\\\\device_batched_gemm_add_relu_gemm_add_xdl_cshuffle_f16_f16_f16_f16_gmk_gnk_gon_gmo_instance.cpp'")]

@hlky
Contributor Author

hlky commented Aug 2, 2023

It seems you do not have the submodules. Clone again:
git clone --recursive https://github.com/facebookincubator/AITemplate
or update the submodules: git submodule update --init --recursive

@Shaistrong

It seems you do not have the submodules. git clone --recursive https://github.com/facebookincubator/AITemplate or update submodules git submodule update --init --recursive

executed these two commands; do I merge with this PR now? or do I run the setup scripts first?

@hlky
Contributor Author

hlky commented Aug 2, 2023

executed these two commands

or :)
Yes, you merge this PR before you install; otherwise you will not have the fixes from this PR.

@Shaistrong

executed these two commands

or :) Yes, you merge this PR before you install; otherwise you will not have the fixes from this PR.

Merged it, I'm sure I did it right. Now when running python setup.py bdist_wheel it spits this at me:
raise Error(errors) shutil.Error: [('../3rdparty\\composable_kernel\\library\\src\\tensor_operation_instance\\gpu\\batched_gemm_add_relu_gemm_add\\device_batched_gemm_add_relu_gemm_add_xdl_cshuffle_f16_f16_f16_f16_gmk_gnk_gno_gmo_instance.cpp', './aitemplate/3rdparty\\composable_kernel\\library\\src\\tensor_operation_instance\\gpu\\batched_gemm_add_relu_gemm_add\\device_batched_gemm_add_relu_gemm_add_xdl_cshuffle_f16_f16_f16_f16_gmk_gnk_gno_gmo_instance.cpp', "[Errno 2] No such file or directory: './aitemplate/3rdparty\\\\composable_kernel\\\\library\\\\src\\\\tensor_operation_instance\\\\gpu\\\\batched_gemm_add_relu_gemm_add\\\\device_batched_gemm_add_relu_gemm_add_xdl_cshuffle_f16_f16_f16_f16_gmk_gnk_gno_gmo_instance.cpp'"), ('../3rdparty\\composable_kernel\\library\\src\\tensor_operation_instance\\gpu\\batched_gemm_add_relu_gemm_add\\device_batched_gemm_add_relu_gemm_add_xdl_cshuffle_f16_f16_f16_f16_gmk_gnk_gon_gmo_instance.cpp', './aitemplate/3rdparty\\composable_kernel\\library\\src\\tensor_operation_instance\\gpu\\batched_gemm_add_relu_gemm_add\\device_batched_gemm_add_relu_gemm_add_xdl_cshuffle_f16_f16_f16_f16_gmk_gnk_gon_gmo_instance.cpp', "[Errno 2] No such file or directory: './aitemplate/3rdparty\\\\composable_kernel\\\\library\\\\src\\\\tensor_operation_instance\\\\gpu\\\\batched_gemm_add_relu_gemm_add\\\\device_batched_gemm_add_relu_gemm_add_xdl_cshuffle_f16_f16_f16_f16_gmk_gnk_gon_gmo_instance.cpp'"), ('../3rdparty\\composable_kernel\\library\\src\\tensor_operation_instance\\gpu\\batched_gemm_softmax_gemm_permute\\device_batched_gemm_bias_softmax_gemm_permute_xdl_cshuffle_bf16_bf16_bf16_bf16_gmk_gnk_gno_gmo_instance.cpp', './aitemplate/3rdparty\\composable_kernel\\library\\src\\tensor_operation_instance\\gpu\\batched_gemm_softmax_gemm_permute\\device_batched_gemm_bias_softmax_gemm_permute_xdl_cshuffle_bf16_bf16_bf16_bf16_gmk_gnk_gno_gmo_instance.cpp', "[Errno 2] No such file or directory: 
'../3rdparty\\\\composable_kernel\\\\library\\\\src\\\\tensor_operation_instance\\\\gpu\\\\batched_gemm_softmax_gemm_permute\\\\device_batched_gemm_bias_softmax_gemm_permute_xdl_cshuffle_bf16_bf16_bf16_bf16_gmk_gnk_gno_gmo_instance.cpp'"), ('../3rdparty\\composable_kernel\\library\\src\\tensor_operation_instance\\gpu\\batched_gemm_softmax_gemm_permute\\device_batched_gemm_bias_softmax_gemm_permute_xdl_cshuffle_f16_f16_f16_f16_gmk_gnk_gno_gmo_instance.cpp', './aitemplate/3rdparty\\composable_kernel\\library\\src\\tensor_operation_instance\\gpu\\batched_gemm_softmax_gemm_permute\\device_batched_gemm_bias_softmax_gemm_permute_xdl_cshuffle_f16_f16_f16_f16_gmk_gnk_gno_gmo_instance.cpp', "[Errno 2] No such file or directory: '../3rdparty\\\\composable_kernel\\\\library\\\\src\\\\tensor_operation_instance\\\\gpu\\\\batched_gemm_softmax_gemm_permute\\\\device_batched_gemm_bias_softmax_gemm_permute_xdl_cshuffle_f16_f16_f16_f16_gmk_gnk_gno_gmo_instance.cpp'"), ('../3rdparty\\composable_kernel\\library\\src\\tensor_operation_instance\\gpu\\batched_gemm_softmax_gemm_permute\\device_batched_gemm_softmax_gemm_permute_xdl_cshuffle_bf16_bf16_bf16_bf16_gmk_gnk_gno_gmo_instance.cpp', './aitemplate/3rdparty\\composable_kernel\\library\\src\\tensor_operation_instance\\gpu\\batched_gemm_softmax_gemm_permute\\device_batched_gemm_softmax_gemm_permute_xdl_cshuffle_bf16_bf16_bf16_bf16_gmk_gnk_gno_gmo_instance.cpp', "[Errno 2] No such file or directory: '../3rdparty\\\\composable_kernel\\\\library\\\\src\\\\tensor_operation_instance\\\\gpu\\\\batched_gemm_softmax_gemm_permute\\\\device_batched_gemm_softmax_gemm_permute_xdl_cshuffle_bf16_bf16_bf16_bf16_gmk_gnk_gno_gmo_instance.cpp'"), ('../3rdparty\\composable_kernel\\library\\src\\tensor_operation_instance\\gpu\\batched_gemm_softmax_gemm_permute\\device_batched_gemm_softmax_gemm_permute_xdl_cshuffle_f16_f16_f16_f16_gmk_gnk_gno_gmo_instance.cpp', 
'./aitemplate/3rdparty\\composable_kernel\\library\\src\\tensor_operation_instance\\gpu\\batched_gemm_softmax_gemm_permute\\device_batched_gemm_softmax_gemm_permute_xdl_cshuffle_f16_f16_f16_f16_gmk_gnk_gno_gmo_instance.cpp', "[Errno 2] No such file or directory: './aitemplate/3rdparty\\\\composable_kernel\\\\library\\\\src\\\\tensor_operation_instance\\\\gpu\\\\batched_gemm_softmax_gemm_permute\\\\device_batched_gemm_softmax_gemm_permute_xdl_cshuffle_f16_f16_f16_f16_gmk_gnk_gno_gmo_instance.cpp'")]

@hlky
Contributor Author

hlky commented Aug 2, 2023

It seems you do not have the submodules. Refer to the previous instructions.

@Shaistrong

It seems you do not have the submodules. Refer to the previous instructions.

just did.. it's different now however. I think it works? I'm not sure if what I got was an error; it made a bunch of files and acted normally.

@hlky
Contributor Author

hlky commented Aug 2, 2023

You will see something like this if it is working

INFO:aitemplate.backend.build_cache_base:Build cache disabled
2023-08-02 23:38:07,807 INFO <aitemplate.testing.detect_target> Set target to CUDA
AIT latent_output shape: [[1, 2], [1, 640], [1, 640], [4]]
2023-08-02 23:38:21,917 INFO <aitemplate.compiler.compiler> Start to compile AIT model. test_dir='A:/unet_v1'
2023-08-02 23:38:21,918 INFO <aitemplate.backend.target> Loading profile cache from: C:\Users\user\.aitemplate\cuda.db
2023-08-02 23:38:21,931 INFO <aitemplate.backend.profiler_cache> table_name='cuda_gemm_3' exists in the db
2023-08-02 23:38:21,932 INFO <aitemplate.backend.profiler_cache> table_name='cuda_conv_3' exists in the db
2023-08-02 23:38:21,933 INFO <aitemplate.backend.profiler_cache> table_name='cuda_conv3d_3' exists in the db
2023-08-02 23:38:27,142 INFO <aitemplate.compiler.compiler> optimized graph elapsed time: 0:00:02.678736

Initial profiling will take some time. The last text before compilation begins will be something like

2023-08-02 23:40:12,977 INFO <aitemplate.backend.codegen> generated 1 function srcs
2023-08-02 23:40:13,001 INFO <aitemplate.compiler.compiler> folded constants elapsed time: 0:00:00.031982
2023-08-02 23:40:13,307 INFO <aitemplate.compiler.transform.memory_planning> Workspace shared_size=1048576000 unique_size=0
2023-08-02 23:40:13,307 INFO <aitemplate.compiler.transform.memory_planning> max_blob=6888756480 constant_offset=1719042304
2023-08-02 23:40:13,610 INFO <aitemplate.backend.codegen> generated 237 function srcs
2023-08-02 23:40:16,267 INFO <aitemplate.backend.codegen> generated 8 library srcs
2023-08-02 23:40:16,299 INFO <aitemplate.backend.cuda.builder_cmake> Executing "C:/Program Files/Microsoft Visual Studio/2022/Professional/Common7/IDE/CommonExtensions/Microsoft/CMake/CMake/bin/cmake.EXE" -B "A:/unet_v1/build" -S "A:/unet_v1"
2023-08-02 23:40:19,507 INFO <aitemplate.backend.cuda.builder_cmake> Executing msbuild "A:/unet_v1/build/unet_v1.sln" -m /property:Configuration=Release

Compilation will take several minutes; with a low CPU core count it can take a while. Alternatively, at this point you can cancel out and load the solution in Visual Studio.

@Shaistrong

You will see something like this if it is working

after the setup.py in AITemplate/python?

@hlky
Contributor Author

hlky commented Aug 2, 2023

No, when you run a compilation script.

@Shaistrong

Shaistrong commented Aug 2, 2023

No, when you run a compilation script.

didn't try yet, I will now.

@Shaistrong

No, when you run a compilation script.

I can't believe this; ModuleNotFoundError: No module named 'aitemplate'
I'm done goddamit.

@Shaistrong

@hlky will there be an easy tutorial for using AIT on SDXL after you and Comfy figure this out? I only managed to get this to work on my linux dual boot

@alexanderguzhva

No, when you run a compilation script.

I can't believe this; ModuleNotFoundError: No module named 'aitemplate' I'm done goddamit.

tbh, when I initially implemented the CMake version, I did not test whether AITemplate can be installed via setup.py, but I made sure that the binary & tests work by running them from the AITemplate directory. Try manually changing the reference in a Python file that uses aitemplate; will it work that way, at least?
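One way to try this suggestion without installing the wheel is to point Python at the in-repo package before importing; the checkout path below is purely illustrative:

```python
import sys
from pathlib import Path

# Hypothetical checkout location; adjust to wherever you cloned AITemplate.
AIT_REPO = Path(r"C:\src\AITemplate")

# Prepend the in-repo python package directory so `import aitemplate`
# resolves without a pip install.
sys.path.insert(0, str(AIT_REPO / "python"))
```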

@alexanderguzhva

  • build tools only install has issues with CMake finding CUDA .props

Yep, there were definite problems with it

@alexanderguzhva

@hlky
In your commit MSVC tensor/expand.py fix I see a malloc, but I don't see a free, so it is a memory leak. Why not use std::vector instead?

@hlky
Contributor Author

hlky commented Aug 3, 2023

@alexanderguzhva The malloc was replaced in this commit; it's all codegen'd instead.

@kadeng
Contributor

kadeng commented Aug 3, 2023

As I cannot test on Windows: What's the state of this PR? There were a lot of comments after the initial review. Should we merge this if it doesn't introduce problems on Linux based CI? @alexanderguzhva @hlky

@Shaistrong

It seems there is an issue with your environment, please confirm: […]

I tried reinstalling MSVC, did it entirely. I uninstalled all previous CUDA files, ensured I selected Desktop development with C++, then after MSVC was installed I switched to CUDA 12.2.

Then I cloned THIS PR, opened the Developer Command Prompt, cd'ed to this PR, ran python -m venv venv, then git submodule update --init --recursive, then cd python, then python setup.py bdist_wheel. It shoots this error:

(venv) C:\Users\joker\OneDrive\Desktop\A.I\AITemplate\python>python setup.py bdist_wheel
Traceback (most recent call last):
  File "C:\Users\joker\OneDrive\Desktop\A.I\AITemplate\python\setup.py", line 58, in <module>
    shutil.copytree("../3rdparty", "./aitemplate/3rdparty")
  File "C:\Users\joker\AppData\Local\Programs\Python\Python310\lib\shutil.py", line 559, in copytree
    return _copytree(entries=entries, src=src, dst=dst, symlinks=symlinks,
  File "C:\Users\joker\AppData\Local\Programs\Python\Python310\lib\shutil.py", line 513, in _copytree
    raise Error(errors)
shutil.Error: [('../3rdparty\\composable_kernel\\library\\src\\tensor_operation_instance\\gpu\\batched_gemm_add_relu_gemm_add\\device_batched_gemm_add_relu_gemm_add_xdl_cshuffle_f16_f16_f16_f16_gmk_gnk_gno_gmo_instance.cpp', './aitemplate/3rdparty\\composable_kernel\\library\\src\\tensor_operation_instance\\gpu\\batched_gemm_add_relu_gemm_add\\device_batched_gemm_add_relu_gemm_add_xdl_cshuffle_f16_f16_f16_f16_gmk_gnk_gno_gmo_instance.cpp', "[Errno 2] No such file or directory: './aitemplate/3rdparty\\\\composable_kernel\\\\library\\\\src\\\\tensor_operation_instance\\\\gpu\\\\batched_gemm_add_relu_gemm_add\\\\device_batched_gemm_add_relu_gemm_add_xdl_cshuffle_f16_f16_f16_f16_gmk_gnk_gno_gmo_instance.cpp'"), ('../3rdparty\\composable_kernel\\library\\src\\tensor_operation_instance\\gpu\\batched_gemm_add_relu_gemm_add\\device_batched_gemm_add_relu_gemm_add_xdl_cshuffle_f16_f16_f16_f16_gmk_gnk_gon_gmo_instance.cpp', './aitemplate/3rdparty\\composable_kernel\\library\\src\\tensor_operation_instance\\gpu\\batched_gemm_add_relu_gemm_add\\device_batched_gemm_add_relu_gemm_add_xdl_cshuffle_f16_f16_f16_f16_gmk_gnk_gon_gmo_instance.cpp', "[Errno 2] No such file or directory: './aitemplate/3rdparty\\\\composable_kernel\\\\library\\\\src\\\\tensor_operation_instance\\\\gpu\\\\batched_gemm_add_relu_gemm_add\\\\device_batched_gemm_add_relu_gemm_add_xdl_cshuffle_f16_f16_f16_f16_gmk_gnk_gon_gmo_instance.cpp'")]

...idk why this doesn't work. I tried reinstalling everything, following your instructions precisely, and I still get this error.

@hlky
Contributor Author

hlky commented Aug 3, 2023

@kadeng It's ready for review/merge.

@Shaistrong

@kadeng It's ready for review/merge.

oh, am I the only one having trouble with this? well, the Comfy custom node will have the modules pre-compiled, right?


6 participants