AOT recording produces an empty file. #5639

Open
aespielberg opened this issue Aug 5, 2022 · 16 comments
Labels: potential bug (something that looks like a bug but not yet confirmed)

Comments

@aespielberg

aespielberg commented Aug 5, 2022

Working on Ubuntu 20.04, CUDA 11.7, Taichi 1.0.4.

When running the example described here, the record.yml file is empty.

Full code:

import taichi as ti


ti.aot.start_recording('record.yml')
ti.init(arch=ti.cc)
loss = ti.field(float, (), needs_grad=True)
x = ti.field(float, 233, needs_grad=True)

@ti.kernel
def compute_loss():
   for i in x:
       loss[None] += x[i]**2

@ti.kernel
def do_some_works():
   for i in x:
       x[i] -= x.grad[i]

with ti.ad.Tape(loss):
   compute_loss()
do_some_works()

Here is the output:

[Taichi] version 1.0.4, llvm 10.0.0, commit 2827db2c, linux, python 3.9.7
[I 08/04/22 23:20:05.565 3135938] [action_recorder.cpp:start_recording@26] ActionRecorder: start recording to [record.yml]
[W 08/04/22 23:20:05.569 3135938] [misc.py:adaptive_arch_select@747] Arch=[<Arch.cc: 3>] is not supported, falling back to CPU

And the yml file is empty. The log also contains a warning that Arch.cc is not supported; is there a reason for that?

@aespielberg aespielberg added the potential bug Something that looks like a bug but not yet confirmed label Aug 5, 2022
@jim19930609
Contributor

Hi aespielberg,
I guess you'll have to add a call to ti.aot.stop_recording(), which triggers the serialization.

@jim19930609 jim19930609 self-assigned this Aug 5, 2022
@jim19930609
Contributor

In addition, we also have a more up-to-date AOT interface which is well maintained:

@ti.kernel
def run():
    ......

m = ti.aot.Module(ti.cpu)
m.add_kernel(run, template_args={'arr': arr})
m.save(dir_name, 'whatever')

Is there any specific reason that you're using ti.aot.start_recording('record.yml')? If not, we strongly recommend switching to our latest AOT interface.

@aespielberg
Author

I tried stop_recording which did not help; I will try the new interface soon and I will post the output of stop_recording.

@aespielberg
Author

aespielberg commented Aug 5, 2022

Okay, adding ti.aot.stop_recording() to the end of that script yields the output:

[Taichi] version 1.0.4, llvm 10.0.0, commit 2827db2c, linux, python 3.9.7
[I 08/05/22 13:43:05.420 3301647] [action_recorder.cpp:start_recording@26] ActionRecorder: start recording to [record.yml]
[W 08/05/22 13:43:05.421 3301647] [misc.py:adaptive_arch_select@747] Arch=[<Arch.cc: 3>] is not supported, falling back to CPU
[Taichi] Starting on arch=x64
[I 08/05/22 13:43:05.731 3301647] [action_recorder.cpp:stop_recording@33] ActionRecorder: stop recording

If I change the code instead to:

import taichi as ti



ti.init(arch=ti.cc)
loss = ti.field(float, (), needs_grad=True)
x = ti.field(float, 233, needs_grad=True)

@ti.kernel
def compute_loss():
   for i in x:
       loss[None] += x[i]**2

@ti.kernel
def do_some_works():
   for i in x:
       x[i] -= x.grad[i]

with ti.ad.Tape(loss):
   compute_loss()



m = ti.aot.Module(ti.cpu)
m.add_kernel(do_some_works)
m.save('.', 'record')

Then I get three metadata files and a .ll file. I haven't tried to compile them yet to see whether they work correctly. Is there documentation on how to use these files and this new interface? I'm sorry, I am having trouble finding it.

@jim19930609
Contributor

Hi aespielberg,
Apologies, there's no official documentation for the serialized AOT files yet, since AOT has not been officially released.

In short:

  1. Kernels (i.e. do_some_works() and compute_loss()) and some internal helper functions are compiled into LLVM IR and written to .ll files.
  2. Data structures (like ti.field) and some descriptive information are stored in metadata files. Personally, I always read and analyze the metadata files for debugging or validation purposes; they're fairly human-friendly.
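
To get a quick overview of what m.save(...) wrote, a small stdlib-only sketch like the one below can group the files in the output directory by extension. It makes no assumption about the file names Taichi actually emits; `summarize_output_dir` is a hypothetical helper (not part of Taichi), and the demo file names are placeholders.

```python
import tempfile
from collections import defaultdict
from pathlib import Path


def summarize_output_dir(directory):
    """Group the regular files directly inside `directory` by file extension.

    Hypothetical inspection helper; it is not part of Taichi.
    """
    groups = defaultdict(list)
    for path in sorted(Path(directory).iterdir()):
        if path.is_file():
            groups[path.suffix or "<no extension>"].append(path.name)
    return dict(groups)


if __name__ == "__main__":
    # Demo on placeholder files; these names are made up, not Taichi's real output.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("kernels.ll", "meta_a.json", "meta_b.json"):
            Path(tmp, name).write_text("placeholder")
        for suffix, names in summarize_output_dir(tmp).items():
            print(suffix, names)
```

Pointing the helper at the directory passed to m.save() (for example '.') should list the .ll file next to the metadata files, which can then be opened in a text editor.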

@aespielberg
Author

Thank you, I understand the basic idea now; this is helpful. Overall, I am also wondering whether the following is currently possible in Taichi (even if not documented yet):

  1. Compile kernels and call them in Python later without having to re-compile.
  2. Compile code that calls multiple kernels in succession.
  3. Compute gradients of kernels or save gradient computations.

I know it's not officially released yet, but if there is any example code anywhere of just using the output files of simple functions, that would be super helpful.

@jim19930609
Contributor

jim19930609 commented Aug 9, 2022

Hi aespielberg,
The answer is yes! We do support all three features, although some of them haven't been released yet.

  1. We have an offline-cache mechanism, to be released soon. For early access, you can turn it on via: ti.init(arch=...., offline_cache=True). By default, the cached files are stored under ~/.cache/taichi.
    (Note that the offline cache is only implemented on the CPU and CUDA backends; other backends such as Vulkan and OpenGL do not have it yet.)

  2. For AOT purposes, we have a compute graph, which lets you trace the order of kernel execution and then execute the kernels in the same order from C++ code. A small example you can start with is: https://github.com/taichi-dev/taichi-aot-demo/tree/master/comet . You can ignore most of the code, especially the guiHelper, and focus on comet_run().

  3. Our autodiff feature is designed for this! @erizmr is an expert on this topic. @erizmr, could you provide some simple examples for @aespielberg to start with?

@erizmr
Contributor

erizmr commented Aug 9, 2022

Hi @aespielberg, Taichi supports computing gradients of kernels using reverse mode (forward mode will be supported in the v1.1 release). The computed gradients are stored in .grad, a field attached to the primal field; e.g., in your example, the gradients d loss / dx are stored in x.grad. There are some starter examples and guidance in the docs: https://docs.taichi-lang.org/docs/differentiable_programming which might help.
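
As a sanity check for the kernels in this thread, the expected contents of x.grad can be worked out by hand: with loss = sum over i of x[i]**2, the gradient is d loss / d x[i] = 2 * x[i]. The following is a plain-Python sketch of that gradient and of the update performed by do_some_works(); it does not use Taichi, and the function names are illustrative only.

```python
def loss_and_grad(x):
    """Return loss = sum(x_i ** 2) and its analytic gradient 2 * x_i."""
    loss = sum(v ** 2 for v in x)
    grad = [2.0 * v for v in x]
    return loss, grad


def apply_update(x, grad):
    """Mirror do_some_works() from the thread: x[i] -= x.grad[i]."""
    return [v - g for v, g in zip(x, grad)]


if __name__ == "__main__":
    x = [1.0, -2.0, 0.5]
    loss, grad = loss_and_grad(x)
    print(loss, grad, apply_update(x, grad))
```

Since the gradient is 2 * x, one update step with this (implicit) step size of 1 simply flips the sign of every entry. Comparing the values Taichi leaves in x.grad after the Tape block against 2 * x is one way to validate a run.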

@qiao-bo qiao-bo closed this as completed Aug 12, 2022
@aespielberg
Author

Hi @erizmr sorry, maybe there was a misunderstanding - I know how to use backward computation, I was simply wondering if there was an example on using it with AOT. (Very excited about forward mode btw.)

@jim19930609 Thank you for the cache info and comet example, those are very useful, and this looks like very cool functionality. If I may ask a few follow-up questions:

  1. I see the .ll files created in the cache folder, with name mangling. I am not sure how to navigate this. If I want to just call a particular compiled kernel again from python, in a different file (without the original definition), what is the best way to do this? Also, I am going to guess that this will throw errors if the kernels refer to globals that are not present?
  2. Is there a way to specify the file that this cache is set to? I see a reference to get_repo_dir() in https://github.com/taichi-dev/taichi/blob/master/taichi/program/compile_config.h but I'm not sure how to set this.

By the way, I know the conversation has diverged and this ticket was closed, but was the original issue resolved (and how)?

@jim19930609
Contributor

jim19930609 commented Aug 16, 2022

Hi @aespielberg,
For the original issue, could you try the following code? I was able to get "record.yml" locally and was wondering whether you can reproduce this result:

import taichi as ti


ti.aot.start_recording('record.yml')
ti.init(arch=ti.cc)
loss = ti.field(float, (), needs_grad=True)
x = ti.field(float, 233, needs_grad=True)

@ti.kernel
def compute_loss():
   for i in x:
       loss[None] += x[i]**2

@ti.kernel
def do_some_works():
   for i in x:
       x[i] -= x.grad[i]

with ti.ad.Tape(loss):
   compute_loss()
do_some_works()

ti.aot.stop_recording()

The file obtained locally: record.zip

Let me re-open this issue until it's verified to work.

@jim19930609 jim19930609 reopened this Aug 16, 2022
@jim19930609
Contributor

As for the other 2 issues:

  1. Loading a cached kernel in a separate file and then executing it sounds more like AOT, which is slightly different from the offline cache (AOT has more flexibility). @ailzhang please correct me if I'm wrong, but I don't think we have exposed the AOT runtime interfaces to Python. It does seem like a fairly useful feature, though. Could you file a feature request with us?
  2. You can set the cache directory in the following manner: ti.init(arch=ti.cpu, offline_cache_file_path="/tmp/aot/")
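
Putting the two options from this thread together, a minimal sketch might look like the following. The keyword names offline_cache and offline_cache_file_path are taken from the comments above and may differ in other Taichi versions; `offline_cache_kwargs` and `init_with_offline_cache` are hypothetical helper names, and Taichi is imported lazily so the first helper can be used without it installed.

```python
def offline_cache_kwargs(cache_dir):
    """Build the two ti.init keyword arguments mentioned in this thread."""
    return {
        "offline_cache": True,
        "offline_cache_file_path": str(cache_dir),
    }


def init_with_offline_cache(cache_dir):
    """Initialize Taichi on CPU with the offline cache redirected to cache_dir.

    Assumes the installed Taichi version accepts both keyword arguments.
    """
    import taichi as ti  # lazy import: only needed when this function is called

    ti.init(arch=ti.cpu, **offline_cache_kwargs(cache_dir))
    return ti


if __name__ == "__main__":
    # Only prints the arguments; call init_with_offline_cache() to actually use them.
    print(offline_cache_kwargs("/tmp/aot/"))
```

Note that, as stated above, this only avoids recompilation when the same kernels are defined and run again; it does not load kernels in a file that lacks their definitions.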

@jim19930609
Contributor

As for autodiff, I haven't tried any AOT demo with autodiff so far, and I feel like this is another feature request for AOT. Do we have any thoughts or future plans regarding autodiff-AOT? @ailzhang @erizmr

@aespielberg
Author

I get an empty record.yml and the following output:

$ python test.py
[Taichi] version 1.0.4, llvm 10.0.0, commit 2827db2c, linux, python 3.9.7
[I 08/16/22 01:32:38.688 2259730] [action_recorder.cpp:start_recording@26] ActionRecorder: start recording to [record.yml]
[W 08/16/22 01:32:38.700 2259730] [misc.py:adaptive_arch_select@747] Arch=[<Arch.cc: 3>] is not supported, falling back to CPU
[Taichi] Starting on arch=x64
[I 08/16/22 01:32:39.009 2259730] [action_recorder.cpp:stop_recording@33] ActionRecorder: stop recording

@jim19930609
Contributor

Interesting, I was using a Linux machine with a nightly Taichi wheel locally.

May I know which OS and Taichi version you are using, so that I can reproduce this?

@aespielberg
Author

This is Taichi 1.0.4 on Ubuntu 20.04.3 LTS, running in Anaconda Python 3.9.7.

@jim19930609
Contributor

Verified that ti.aot.start_recording() and ti.aot.stop_recording() generate an empty YAML file with Taichi 1.0.4 and 1.1.0. This is because we turned off the TI_WITH_CC option when building the release package. One possible solution is to build Taichi from source: https://docs.taichi-lang.org/docs/dev_install.

Or, to make your life easier, I can send you a working Python wheel by email (it is too large for GitHub's file size limit). If you'd prefer a pre-built wheel, please let me know which Python version you are using.
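
Because the recorder fails silently here (the log reports start and stop, yet nothing is written), a script can guard against this by checking the file after ti.aot.stop_recording(). The following stdlib-only sketch shows such a check; `recording_is_empty` is a hypothetical helper, not part of Taichi.

```python
import os
import tempfile


def recording_is_empty(path):
    """Return True if the recording file is missing or has zero bytes."""
    return not os.path.isfile(path) or os.path.getsize(path) == 0


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        record = os.path.join(tmp, "record.yml")
        open(record, "w").close()  # simulate the empty file from this issue
        if recording_is_empty(record):
            print("record.yml is empty; the cc backend may be unavailable in this build")
```

In a real script the check would run on 'record.yml' right after ti.aot.stop_recording(), alongside watching the log for the "falling back to CPU" warning.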
