
[MetaSchedule] [CUDA target] Did you forget to bind? #43

Closed

Civitasv opened this issue Jul 5, 2023 · 8 comments
Civitasv commented Jul 5, 2023

Currently, the parameters I am using are as follows:

import os

from tvm import meta_schedule as ms


def do_all_tune(mod, target):
    tuning_dir = "gpu3090"
    tuning_record = "database_tuning_record.json"
    tuning_workload = "database_workload.json"
    cooldown_interval = 150
    trial_cnt = 2000

    local_runner = ms.runner.LocalRunner(cooldown_sec=cooldown_interval, timeout_sec=10)
    database = ms.tir_integration.tune_tir(
        mod=mod,
        target=target,
        work_dir=tuning_dir,
        max_trials_global=trial_cnt,
        max_trials_per_task=2,
        runner=local_runner,
        special_space={},
    )
    # Remove any previous dump before pruning the database into the JSON files.
    if os.path.exists(tuning_record):
        os.remove(tuning_record)
    if os.path.exists(tuning_workload):
        os.remove(tuning_workload)
    database.dump_pruned(
        ms.database.JSONDatabase(
            path_workload=tuning_workload,
            path_tuning_record=tuning_record,
        )
    )

Could you kindly share the parameters you are using to generate the log? I'm curious to know.

Civitasv changed the title from "Can I know the parameters you used to tuning?" to "Could you please provide the parameters you used for tuning?" on Jul 5, 2023

Civitasv (Author) commented Jul 5, 2023

After finishing tuning, I use:

with args.target, db, tvm.transform.PassContext(opt_level=3):
    mod_deploy = relax.transform.MetaScheduleApplyDatabase(enable_warning=True)(mod)

It will show many warnings like:

[17:27:57] /home/wyc/husen/sandbox/tvm/src/relax/transform/meta_schedule.cc:162: Warning: Tuning record is not found for primfunc: matmul23
[17:27:57] /home/wyc/husen/sandbox/tvm/src/relax/transform/meta_schedule.cc:162: Warning: Tuning record is not found for primfunc: fused_conv2d14_add24_add25
[17:27:57] /home/wyc/husen/sandbox/tvm/src/relax/transform/meta_schedule.cc:162: Warning: Tuning record is not found for primfunc: take
[17:27:57] /home/wyc/husen/sandbox/tvm/src/relax/transform/meta_schedule.cc:162: Warning: Tuning record is not found for primfunc: fused_conv2d37_add34_add35_divide7
[17:27:57] /home/wyc/husen/sandbox/tvm/src/relax/transform/meta_schedule.cc:162: Warning: Tuning record is not found for primfunc: fused_conv2d24_add10
[17:27:57] /home/wyc/husen/sandbox/tvm/src/relax/transform/meta_schedule.cc:162: Warning: Tuning record is not found for primfunc: fused_conv2d7_add10
[17:27:57] /home/wyc/husen/sandbox/tvm/src/relax/transform/meta_schedule.cc:162: Warning: Tuning record is not found for primfunc: fused_matmul28_add27_add28
[17:27:57] /home/wyc/husen/sandbox/tvm/src/relax/transform/meta_schedule.cc:162: Warning: Tuning record is not found for primfunc: fused_matmul11_add11_strided_slice4
[17:27:57] /home/wyc/husen/sandbox/tvm/src/relax/transform/meta_schedule.cc:162: Warning: Tuning record is not found for primfunc: fused_conv2d4_add10_add12
[17:27:57] /home/wyc/husen/sandbox/tvm/src/relax/transform/meta_schedule.cc:162: Warning: Tuning record is not found for primfunc: fused_matmul9_add8_gelu

Then when I call relax.build, it shows the typical "Did you forget to bind?" error.
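
Concretely, the failing step is just the build call (a minimal sketch; mod_deploy and args.target are the module and target from above):

import tvm
from tvm import relax

# Fails with "Did you forget to bind?" for any CUDA PrimFunc that was left
# unscheduled, i.e. no tuning record was applied and no thread binding exists.
ex = relax.build(mod_deploy, target=args.target)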

I don't know why this happens.

cc @tqchen @MasterJH5574

Civitasv changed the title from "Could you please provide the parameters you used for tuning?" to "[MetaSchedule] [CUDA target] Did you forget to bind?" on Jul 5, 2023

nineis7 commented Jul 7, 2023

You can use diffusers==0.15.0 and the problems may all be solved. ^^

Civitasv (Author) commented Jul 7, 2023

> You can use diffusers==0.15.0 and the problems may all be solved. ^^

Thanks for your help! But I've tried this, and it still doesn't work.

(two screenshots of the error output were attached)

MasterJH5574 (Collaborator) commented:

Hi @Civitasv, thanks for the question! We used meta_schedule.relax_integration.tune_relax to tune the IRModule mod_deploy.

I guess the mismatch you observed is because both the TIR extraction of tune_relax and MetaScheduleApplyDatabase will “normalize” each TIR function, while tune_tir does not. You can try tune_relax and see if it works for your case.

MasterJH5574 (Collaborator) commented:

The use of tune_relax can look something like this:

ms.relax_integration.tune_relax(
    mod=mod_deploy,
    target=tvm.target.Target("apple/m1-gpu-restricted"),  # for WebGPU 256-thread limitation
    params={},
    builder=ms.builder.LocalBuilder(
        max_workers=os.cpu_count(),
    ),
    runner=ms.runner.LocalRunner(timeout_sec=60),
    work_dir="log_db",
    max_trials_global=50000,
    max_trials_per_task=2000,
)
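
After tuning finishes, the resulting database can be applied back to the module the same way as before (a minimal sketch, assuming the default database file names that tuning writes under work_dir):

import tvm
from tvm import relax
from tvm import meta_schedule as ms

# Load the tuned records from "log_db" and rewrite every PrimFunc that has one;
# functions without a record keep the warning (and later the bind error) above.
db = ms.database.JSONDatabase(
    path_workload="log_db/database_workload.json",
    path_tuning_record="log_db/database_tuning_record.json",
)
with tvm.target.Target("apple/m1-gpu-restricted"), db, tvm.transform.PassContext(opt_level=3):
    mod_deploy = relax.transform.MetaScheduleApplyDatabase(enable_warning=True)(mod_deploy)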

Civitasv (Author) commented:

Thanks for your reply.
I've tried this, but sadly, it still doesn't work. Following your advice, I've changed my configuration as follows:

def do_all_tune(mod, target):
    tuning_dir = "gpu3090_workdir"
    tuning_record = "gpu3090/database_tuning_record.json"
    tuning_workload = "gpu3090/database_workload.json"
    cooldown_interval = 0
    trial_cnt = 100
    trial_per = 2

    local_runner = ms.runner.LocalRunner(cooldown_sec=cooldown_interval, timeout_sec=60)
    database = ms.relax_integration.tune_relax(
        mod=mod,
        target=target,
        work_dir=tuning_dir,
        max_trials_global=trial_cnt,
        max_trials_per_task=trial_per,
        runner=local_runner,
        params={},
    )
    if os.path.exists(tuning_record):
        os.remove(tuning_record)
    if os.path.exists(tuning_workload):
        os.remove(tuning_workload)
    database.dump_pruned(
        ms.database.JSONDatabase(
            path_workload=tuning_workload,
            path_tuning_record=tuning_record,
        )
    )

It still shows the same warnings as in #43 (comment).

I wonder if it is relevant to the max_trials_global and max_trials_per_task options.

Civitasv (Author) commented:

> I wonder if it is relevant to the max_trials_global and max_trials_per_task options.

Yes, it is relevant. With trial_cnt = 10000 and trial_per = 2000, only the take operator is still wrong.
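
For reference, that configuration is the same as above with only the trial budgets raised (a sketch; all other arguments unchanged):

# Larger budgets so that (almost) every extracted task receives a tuning record.
trial_cnt = 10000  # max_trials_global
trial_per = 2000   # max_trials_per_task

database = ms.relax_integration.tune_relax(
    mod=mod,
    target=target,
    work_dir=tuning_dir,
    max_trials_global=trial_cnt,
    max_trials_per_task=trial_per,
    runner=local_runner,
    params={},
)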

MasterJH5574 (Collaborator) commented:

Thanks @Civitasv! Glad that it works :-)
