[Cutlass][Build] Clean up temporary files after build #16700

Open
wants to merge 1 commit into
base: main

Lunderberg
Contributor

Prior to this commit, compiling cutlass kernels wrote temporary files to the working directory, and those files remained on disk after the build finished. This commit changes the default behavior so that cutlass builds write their files to a temporary directory, which is cleaned up on completion.

Since this change on its own would make it difficult to view and debug the generated cutlass code, this commit also maintains a copy of the generated source in the cutlass module. This can be inspected with `for ext in mod.attrs['external_modules']: print(ext.get_source())`.
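As a rough illustration of the pattern described above (build in a temporary directory, but keep the generated source in memory so it can still be inspected), here is a minimal sketch using only the Python standard library. The function `build_in_temp_dir`, the file name `kernel.cu`, and the returned dictionary are hypothetical stand-ins for illustration, not TVM's actual API or this PR's implementation.

```python
import tempfile
from pathlib import Path


def build_in_temp_dir(generated_source: str) -> dict:
    """Hypothetical build step: NOT the TVM/cutlass implementation."""
    with tempfile.TemporaryDirectory() as tmp:
        src_path = Path(tmp) / "kernel.cu"  # assumed file name
        src_path.write_text(generated_source)
        # A real build would invoke the compiler on src_path here.
        # Keep a copy of the source so it outlives the directory.
        artifact = {"source": src_path.read_text(), "build_dir": tmp}
    # Leaving the `with` block deletes the directory and its contents.
    return artifact


artifact = build_in_temp_dir("// generated kernel code")
print(artifact["source"])
```

The source string stays available after the build directory is gone, which is the property the retained copy in the module is meant to provide.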

@masahi
Member

masahi commented Mar 11, 2024

@Lunderberg I think it is better to keep the build dir by default. When we offload matmul or conv2d to cutlass, the build directory contains a cache and many executables for kernel profiling (like below). We want to keep them to make subsequent builds faster.

$ ls tmp                                                                                                                       
cutlass.o                                                        cutlass_tensorop_h1688gemm_128x128_32x2_tn_align8             
cutlass_conv2d_cache.pickle                                      cutlass_tensorop_h1688gemm_128x128_32x2_tt_align1             
cutlass_gemm_cache.pickle                                        cutlass_tensorop_h1688gemm_128x128_32x2_tt_align2             
cutlass_tensorop_h16816fprop_optimized_128x128_32x3_nhwc_align4  cutlass_tensorop_h1688gemm_128x128_32x2_tt_align8             
cutlass_tensorop_h16816fprop_optimized_128x128_32x3_nhwc_align8  cutlass_tensorop_h1688gemm_128x256_32x2_tn_align1             
cutlass_tensorop_h16816fprop_optimized_128x128_32x4_nhwc_align4  cutlass_tensorop_h1688gemm_128x256_32x2_tn_align2             
cutlass_tensorop_h16816fprop_optimized_128x128_32x4_nhwc_align8  cutlass_tensorop_h1688gemm_128x256_32x2_tn_align8             
cutlass_tensorop_h16816fprop_optimized_128x128_32x5_nhwc_align4  cutlass_tensorop_h1688gemm_128x256_32x2_tt_align1            
cutlass_tensorop_h16816fprop_optimized_128x128_32x5_nhwc_align8  cutlass_tensorop_h1688gemm_128x256_32x2_tt_align2            
cutlass_tensorop_h16816fprop_optimized_128x128_64x3_nhwc_align4  cutlass_tensorop_h1688gemm_128x256_32x2_tt_align8             
cutlass_tensorop_h16816fprop_optimized_128x128_64x3_nhwc_align8  cutlass_tensorop_h1688gemm_128x64_32x2_tn_align1              
cutlass_tensorop_h16816fprop_optimized_128x128_64x4_nhwc_align4  cutlass_tensorop_h1688gemm_128x64_32x2_tn_align2             
cutlass_tensorop_h16816fprop_optimized_128x128_64x4_nhwc_align8  cutlass_tensorop_h1688gemm_128x64_32x2_tn_align8             
cutlass_tensorop_h16816fprop_optimized_128x256_32x3_nhwc_align4  cutlass_tensorop_h1688gemm_128x64_32x2_tt_align1             
cutlass_tensorop_h16816fprop_optimized_128x256_32x3_nhwc_align8  cutlass_tensorop_h1688gemm_128x64_32x2_tt_align2             
cutlass_tensorop_h16816fprop_optimized_128x256_64x3_nhwc_align4  cutlass_tensorop_h1688gemm_128x64_32x2_tt_align8             
cutlass_tensorop_h16816fprop_optimized_128x256_64x3_nhwc_align8  cutlass_tensorop_h1688gemm_256x128_32x2_tn_align1            
cutlass_tensorop_h16816fprop_optimized_128x64_32x6_nhwc_align4   cutlass_tensorop_h1688gemm_256x128_32x2_tn_align2            
cutlass_tensorop_h16816fprop_optimized_128x64_32x6_nhwc_align8   cutlass_tensorop_h1688gemm_256x128_32x2_tn_align8            
cutlass_tensorop_h16816fprop_optimized_128x64_64x3_nhwc_align4   cutlass_tensorop_h1688gemm_256x128_32x2_tt_align1            
cutlass_tensorop_h16816fprop_optimized_128x64_64x3_nhwc_align8   cutlass_tensorop_h1688gemm_256x128_32x2_tt_align2            
cutlass_tensorop_h16816fprop_optimized_256x128_32x3_nhwc_align4  cutlass_tensorop_h1688gemm_256x128_32x2_tt_align8
...

@Lunderberg
Contributor Author

Good point. In that case, we should make it explicit that these are intended as cache directories. Currently, they are named tmp on disk and tmp_dir in the code. Every indication is that they are temporary directories that weren't cleaned up correctly, not cache directories that may be reused by the next build.
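One way to make the distinction explicit is sketched below, assuming a caller-supplied cache path: a named cache directory persists between builds, and a scratch directory is used (and cleaned up) only when no cache is requested. The names `build_directory`, `load_cache`, `save_cache`, and `profile_cache.pickle` are hypothetical and chosen for illustration; they are not taken from the TVM code base.

```python
import pickle
import tempfile
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator, Optional


@contextmanager
def build_directory(cache_dir: Optional[str] = None) -> Iterator[Path]:
    """Yield a build directory (hypothetical helper, not part of TVM)."""
    if cache_dir is not None:
        # Explicit cache: created if missing, left on disk for reuse.
        path = Path(cache_dir)
        path.mkdir(parents=True, exist_ok=True)
        yield path
    else:
        # No cache requested: scratch space, removed on exit.
        with tempfile.TemporaryDirectory() as tmp:
            yield Path(tmp)


def load_cache(directory: Path, name: str = "profile_cache.pickle") -> dict:
    """Return cached profiling results, or an empty dict if none exist."""
    cache_file = directory / name
    if not cache_file.exists():
        return {}
    with cache_file.open("rb") as f:
        return pickle.load(f)


def save_cache(directory: Path, cache: dict, name: str = "profile_cache.pickle") -> None:
    """Write profiling results so a later build can reuse them."""
    with (directory / name).open("wb") as f:
        pickle.dump(cache, f)
```

With this shape, a directory that survives a build is one the caller asked for by name, so it cannot be mistaken for a temporary directory that was not cleaned up.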
