[Dev] Enhance Operator Cache to support multi-thread environments #205
This pull request introduces several changes to improve thread safety, enhance scheduler functionality, and refactor code for better readability and maintainability. The most important changes are the addition of a lock to the `OperatorCache` class, a `ThreadPoolExecutor` that uses a variable number of workers, and a new fine-grained matrix multiplication scheduler.

**Thread Safety Enhancements:**

- `bitblas/cache/operator.py`: Added a `cache_locker` using `threading.RLock` to synchronize access to the cache in methods such as `add`, `get`, `clear`, and `save_into_database`.
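A minimal sketch of the locking pattern described above, assuming `cache_locker` guards a plain dict (the internal storage and method bodies here are illustrative, not the PR's actual implementation):

```python
import threading

class OperatorCache:
    """Cache guarded by a reentrant lock, per the PR's thread-safety change."""

    def __init__(self):
        self._cache = {}  # illustrative backing store
        # An RLock (rather than Lock) lets a method that already holds the
        # lock call another locked method without deadlocking, e.g. a
        # save_into_database that internally calls get.
        self.cache_locker = threading.RLock()

    def add(self, key, op):
        with self.cache_locker:
            self._cache[key] = op

    def get(self, key):
        with self.cache_locker:
            return self._cache.get(key)

    def clear(self):
        with self.cache_locker:
            self._cache.clear()
```

Because `RLock` is reentrant, adding further locked convenience methods that call `add`/`get` internally stays safe.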
**Scheduler Improvements:**

- `bitblas/base/utils.py`: Modified `ThreadPoolExecutor` to use a variable number of workers (`max_workers`) instead of a fixed number (4).
- `bitblas/ops/base_scheduler.py`: Added a `get_hardware_aware_configs` method that raises `NotImplementedError` for hardware-aware tuning.
- `bitblas/ops/general_matmul/tilelang/dense/matmul_simt.py`: Introduced a new `MatmulFineGrainSIMTScheduler` class for fine-grained matrix multiplication scheduling.
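The worker-count change can be sketched as follows; `parallel_build` and its signature are hypothetical stand-ins for the code in `bitblas/base/utils.py`:

```python
import os
from concurrent.futures import ThreadPoolExecutor

def parallel_build(configs, build_fn, max_workers=None):
    # Previously the pool was hard-coded to 4 workers; taking max_workers
    # lets callers scale the pool, defaulting to the machine's core count
    # capped by the amount of work available.
    if max_workers is None:
        max_workers = min(len(configs), os.cpu_count() or 1)
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(build_fn, configs))
```

Threads (rather than processes) suit this workload because compilation largely releases the GIL while waiting on the backend compiler.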
**Code Refactoring:**

- `bitblas/ops/general_matmul/tilelang/dense/matmul_tensorcore.py`: Renamed from `matmul.py`; added imports and methods for hardware-aware configurations.
- `bitblas/ops/operator.py`: Refactored several methods for better readability, including `apply_fast_tuning`, `hardware_aware_finetune`, and `_build_default_module`.

**Additional Changes:**

- `bitblas/ops/general_matmul/tilelang/dense/__init__.py`: Updated imports to include `matmul_simt` and `matmul_tensorcore`.
- `bitblas/ops/operator.py`: Added an import of `tl_apply_and_build` from `bitblas.tl.tuner`.
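The hardware-aware tuning hook follows a standard base-class pattern: the base scheduler raises `NotImplementedError`, and concrete schedulers override it. The sketch below assumes this shape; the method arguments and the returned config dicts are illustrative, not the PR's actual signature:

```python
class BaseScheduler:
    def get_hardware_aware_configs(self, arch=None, topk=10):
        # Base class only declares the hook; schedulers that support
        # hardware-aware tuning must override this method.
        raise NotImplementedError(
            f"{type(self).__name__} does not support hardware-aware tuning")

class TensorCoreScheduler(BaseScheduler):
    """Hypothetical concrete scheduler overriding the tuning hook."""

    def get_hardware_aware_configs(self, arch=None, topk=10):
        # A real implementation would enumerate candidate tile shapes
        # for the target architecture; fixed values here for illustration.
        return [{"block_M": 64, "block_N": 64}] * topk
```

Raising in the base class (instead of returning an empty list) surfaces unsupported-scheduler bugs immediately rather than silently skipping tuning.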