[autoparallel] add rotor C version #1658
Merge ColossalAI
Daily merge
…ossalAI into feature/add_rotor_c_version
setup.py
Outdated
@@ -191,6 +192,12 @@ def cuda_ext_helper(name, sources, extra_cuda_flags, extra_cxx_flags=[]):
extra_cxx_flags = ['-std=c++14', '-lcudart', '-lcublas', '-g', '-Wno-reorder', '-fopenmp', '-march=native']
ext_modules.append(cuda_ext_helper('cpu_adam', ['cpu_adam.cpp'], extra_cuda_flags, extra_cxx_flags))

if build_auto_ckpt_ext:
There is no way for the user to specify this parameter.
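One possible way to address this, as a sketch only, is to read the flag from an environment variable in setup.py. The variable name BUILD_AUTO_CKPT_EXT and the helper env_flag are assumptions for illustration; this PR does not define either.

```python
import os


def env_flag(name, default=False, environ=None):
    """Interpret an environment variable such as BUILD_AUTO_CKPT_EXT=1 as a boolean."""
    environ = os.environ if environ is None else environ
    raw = environ.get(name)
    if raw is None:
        # Variable not set: keep the default behaviour.
        return default
    return raw.strip().lower() in ("1", "true", "yes", "on")


# Hypothetical variable name; the user would opt in with
#   BUILD_AUTO_CKPT_EXT=1 pip install .
build_auto_ckpt_ext = env_flag("BUILD_AUTO_CKPT_EXT")
```

Passing the environment as an optional argument keeps the helper easy to test without touching os.environ.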
setup.py
Outdated
@@ -191,6 +192,12 @@ def cuda_ext_helper(name, sources, extra_cuda_flags, extra_cxx_flags=[]):
extra_cxx_flags = ['-std=c++14', '-lcudart', '-lcublas', '-g', '-Wno-reorder', '-fopenmp', '-march=native']
ext_modules.append(cuda_ext_helper('cpu_adam', ['cpu_adam.cpp'], extra_cuda_flags, extra_cxx_flags))

if build_auto_ckpt_ext:
print(os.path.join(this_dir, "colossalai/fx/passes/algorithms/dynamic_programs.c"))
Remove this print statement.
setup.py
Outdated
if build_auto_ckpt_ext:
print(os.path.join(this_dir, "colossalai/fx/passes/algorithms/dynamic_programs.c"))
ext_modules.append(
Extension("c_version_dp",
Do not name it this way; the name is quite meaningless.
I've fixed those problems. Currently, we build the C extension the first time a user runs the rotor solver; only if building the C extension fails do we fall back to the Python version of the dynamic programming.
I will write a unit test that compares the solutions of the Python and C versions of the dynamic programs.
I will convert this PR to draft until it is ready for review again.
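For the unit test, a differential check along these lines might work. This is a sketch under assumptions: reference_table and fast_table are toy stand-ins for the Python and C dynamic programs (they are not the rotor tables), and find_mismatch is a hypothetical helper. The idea is only to feed both versions the same random inputs and report the first input on which they disagree.

```python
import random


def reference_table(costs):
    """Toy stand-in for the Python version: table[i][j] = sum(costs[i..j])."""
    n = len(costs)
    return [[sum(costs[i:j + 1]) if j >= i else 0 for j in range(n)] for i in range(n)]


def fast_table(costs):
    """Toy stand-in for the C version: same table, computed from prefix sums."""
    n = len(costs)
    prefix = [0]
    for c in costs:
        prefix.append(prefix[-1] + c)
    return [[prefix[j + 1] - prefix[i] if j >= i else 0 for j in range(n)] for i in range(n)]


def find_mismatch(ref_fn, new_fn, trials=200, seed=0):
    """Return the first random input on which the two versions differ, else None."""
    rng = random.Random(seed)  # fixed seed keeps failures reproducible
    for _ in range(trials):
        costs = [rng.randint(0, 50) for _ in range(rng.randint(1, 8))]
        if ref_fn(costs) != new_fn(costs):
            return costs
    return None


assert find_mismatch(reference_table, fast_table) is None
```

Integer inputs allow exact comparison; if the real tables hold floats, a tolerance-based comparison would likely be needed instead.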
try:
    from .dynamic_programs_C_version import persistent_compute_table
    CVERSION = True
except ModuleNotFoundError:
    import subprocess
    import os
    print("dynamic_programs_C_version hasn't been built! Building library...")
    this_dir = os.path.dirname(os.path.abspath(__file__))
    result = subprocess.Popen(f'python {os.path.join(this_dir, "build_c_ext.py")} build_ext --build-lib={this_dir}',
                              stdout=subprocess.PIPE,
                              stderr=subprocess.PIPE,
                              shell=True)
    if result.wait() == 0:
        print("dynamic_programs_C_version has been built!")
        from .dynamic_programs_C_version import persistent_compute_table
        CVERSION = True
    else:
        print("dynamic_programs_C_version built failed! Using python version!")
        CVERSION = False
This should be put inside the solver_rotor function.
okay
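A rough sketch of what moving this into a function could look like, so that nothing is built at import time. The helper name load_compute_table and its parameters are hypothetical, and the standard logging module stands in for the colossalai logger. It also passes the build command as an argument list (for example starting with sys.executable) rather than a shell string, and lets subprocess.run collect the output so a verbose build cannot fill the pipes and block.

```python
import importlib
import logging
import subprocess
import sys

logger = logging.getLogger(__name__)  # stand-in for the colossalai logger


def load_compute_table(module_name, build_cmd, fallback):
    """Return persistent_compute_table from the C module, or `fallback`.

    module_name: absolute name of the compiled extension module (hypothetical)
    build_cmd:   argument list that builds it, e.g. [sys.executable, "build_c_ext.py", ...]
    fallback:    the pure-Python implementation
    """
    try:
        return importlib.import_module(module_name).persistent_compute_table
    except ModuleNotFoundError:
        logger.info("%s has not been built; building it now", module_name)

    try:
        # List-form arguments avoid shell=True; capture_output drains both pipes.
        result = subprocess.run(build_cmd, capture_output=True, text=True)
    except OSError as exc:
        logger.warning("could not start the build (%s); using the Python version", exc)
        return fallback

    if result.returncode != 0:
        logger.warning("build failed; using the Python version:\n%s", result.stderr)
        return fallback

    # Make the import system notice the freshly built file.
    importlib.invalidate_caches()
    try:
        return importlib.import_module(module_name).persistent_compute_table
    except ImportError:
        logger.warning("build succeeded but import failed; using the Python version")
        return fallback
```

The solver would call this once on first use and keep the returned callable, which replaces the module-level CVERSION flag.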
except ModuleNotFoundError:
    import subprocess
    import os
    print("dynamic_programs_C_version hasn't been built! Building library...")
You can replace print with the colossalai logger.
okay
…ossalAI into feature/add_rotor_c_version
What's New?
In this PR I add a C extension for the rotor dynamic programming table calculation and remove the repeated MetaInfoProp from the rotor solver. For now, you should execute the following code before running solver rotor.
As we might use rotor for a simplified offload strategy, this modification is crucial: we might run the solver over the same graph several times for the auto capper heuristic, and we don't need to run MetaInfoProp over and over again. In fact, with the new C version of the dynamic programming, MetaInfoProp itself takes more time than the dynamic programming.
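The intended calling pattern can be illustrated with a toy sketch. Everything here is hypothetical: annotate stands in for the one-time MetaInfoProp pass, solve stands in for the solver, and the graph is a plain dict rather than a real traced graph. The point is only that the metadata pass runs once and several solver runs with different budgets reuse its result.

```python
def annotate(graph):
    """Hypothetical stand-in for MetaInfoProp: the expensive, one-time pass."""
    graph["meta"] = {name: size * 2 for name, size in graph["sizes"].items()}
    graph["annotate_calls"] = graph.get("annotate_calls", 0) + 1
    return graph


def solve(graph, budget):
    """Hypothetical stand-in for the solver: reads metadata, never recomputes it."""
    if "meta" not in graph:
        raise ValueError("run annotate(graph) before solve(graph, budget)")
    return [name for name, cost in graph["meta"].items() if cost <= budget]


# Annotate once, then try several budgets over the same graph.
graph = annotate({"sizes": {"a": 1, "b": 4, "c": 8}})
plans = {budget: solve(graph, budget) for budget in (2, 8, 16)}
```

Raising an error when the metadata is missing makes the new precondition explicit instead of silently producing a wrong plan.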