
Load TorchScript modules #644

Merged: 12 commits, Jul 2, 2022

Conversation

NiklasGustafsson
Contributor

With this PR, TorchSharp supports loading modules that were created using torch.jit.{trace,script} in PyTorch.

No support for traced or scripted functions yet. The PR also does not support tracing or scripting in TorchSharp.
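
For context, a minimal sketch of the scenario this enables, assuming the managed entry point is exposed as torch.jit.load (the native call is shown in the diff excerpts below) and using placeholder file names:

// Hypothetical usage; "model.script.pt" would be produced in PyTorch
// with torch.jit.script(model).save(...) or torch.jit.trace.
var module = torch.jit.load("model.script.pt");
var input = torch.rand(10);
var output = module.forward(input);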

    return this;
}

#if false // These functions "work," but the native code doesn't seem to find any interesting information.
Member

#if false

We'll keep this code for now? What is the next step here?

Contributor Author

I want to try to figure out why the scripts don't seem to contain any information on element type or dimensions.

{
    if (!System.IO.File.Exists(filename))
        throw new System.IO.FileNotFoundException(filename);
    return new ScriptModule(THSJIT_load(filename));
Member

THSJIT_load(filename)

What happens if I pass a file path that exists but does not contain a valid script? Does the native side throw somehow? How do we recognize this failure case?

Contributor Author

It will blow up in some fashion. We're just interfacing with the libtorch support for TorchScript, so the behavior is whatever libtorch does. I'll add a negative unit test.
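
A sketch of what such a negative test could look like, assuming an xUnit-style test (which the TorchSharp test suite uses) and that the native failure surfaces as some managed exception; the exact exception type is left open here:

[Fact]
public void LoadingNonScriptFileShouldThrow()
{
    // Placeholder file name: the file exists on disk but is not a valid TorchScript archive.
    System.IO.File.WriteAllText("not_a_script.txt", "this is not TorchScript");
    Assert.ThrowsAny<System.Exception>(() => torch.jit.load("not_a_script.txt"));
}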

tarekgh previously approved these changes Jul 2, 2022
Member

@tarekgh left a comment

Added a few comments. LGTM otherwise.


public unsafe override Tensor forward(Tensor tensor)
{
    using (var parray = new PinnedArray<IntPtr>()) {
Contributor Author

@tarekgh -- is there some way to avoid the new[]{} and use stackalloc here instead?

Member

@tarekgh Jul 4, 2022

You don't need to use PinnedArray at all. You may try the following, which allows writing the whole method without any unsafe code and without any managed allocations.

// Assumes the P/Invoke for THSJIT_Module_forward is declared to take 'ref IntPtr'
// (rather than IntPtr) for the tensor buffer, so no unsafe cast is needed.
ReadOnlySpan<IntPtr> buffer = stackalloc IntPtr[1] { tensor.Handle };
var res = THSJIT_Module_forward(handle, ref MemoryMarshal.GetReference(buffer), 1);
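
Put together, the whole method could then look roughly like this sketch (MemoryMarshal comes from System.Runtime.InteropServices; the ref-based interop signature is the assumption noted above):

public override Tensor forward(Tensor tensor)
{
    // One-element buffer on the stack: no managed allocation and nothing to pin.
    ReadOnlySpan<IntPtr> buffer = stackalloc IntPtr[1] { tensor.Handle };
    var res = THSJIT_Module_forward(handle, ref MemoryMarshal.GetReference(buffer), 1);
    return new Tensor(res);
}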

Added unit test for a script module formed from a function in PyTorch.
@GeorgeS2019

GeorgeS2019 commented Jul 2, 2022

I assume this is related to this functionality: torch.jit.save

For consistency, would it make sense to rename the extension *.dat in the unit tests to *.pt?

  • test/TorchSharpTest/func.script.pt
  • test/TorchSharpTest/l1000_100_10.script.pt
  • test/TorchSharpTest/linrelu.script.pt
  • test/TorchSharpTest/scripted.script.pt

This will help separate three different file formats.

*.pth and *.data [the TorchSharp-specific formats for import and export] are discussed in saveload.md.

With this new implementation, *.pt is now introduced in saveload.md.

Instead of using model.ts in the unit tests, it seems the consensus is to keep it as model.pt (for TorchScript files, load or save).

@NiklasGustafsson
Contributor Author

The file extensions used in unit tests don't really matter.

@NiklasGustafsson
Contributor Author

NiklasGustafsson commented Jul 2, 2022

But, yes, there will now be three file formats:

  1. *.pth, which TorchSharp does not support. This is for saving PyTorch modules via pickling.
  2. *.pt, which TorchSharp will load and save, but not create from scratch. This is for saving PyTorch modules via TorchScript.
  3. *.ts/.data/.whatever, which is for saving model weights only, in the TorchSharp-specific format (both loadable formats are sketched below).
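
A hypothetical sketch of how the two loadable formats differ in use (torch.jit.load follows the diff above; Module.load and the MyModel type are stand-ins for the TorchSharp weight-loading path and a user-defined module):

// Format 2: a TorchScript module created in PyTorch, loaded as a ScriptModule.
var scripted = torch.jit.load("model.script.pt");

// Format 3: TorchSharp-format weights, restored into a module constructed in code.
var model = new MyModel();        // hypothetical user-defined Module
model.load("model_weights.ts");   // TorchSharp-specific weight format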

@GeorgeS2019

I think *.ts is a good extension for the TorchSharp-specific format, i.e. the files involving importsd.py and exportsd.py.

@GeorgeS2019

*.pt, which TorchSharp will load and save, but not create from scratch. This is for saving PyTorch modules via TorchScript.

Following the consensus practice of the Rust LibTorch wrapper, *.pt is meant for TorchScript: created with torch.jit.save in PyTorch, then loaded and saved in the Rust wrapper. I believe these can all be interchanged; I hope this is right.

If so, *.pt could be a good interchange format between PyTorch, TorchSharp, and perhaps other LibTorch wrappers.

@GeorgeS2019

Another thought!

.pt is used between PyTorch and LibTorch C++ wrappers, including TorchSharp.

Since .pt includes the network architecture and trained weights, .pt could be converted into ONNX using TorchSharp for interchange with ML.NET.

Likewise, that interchange could be done in ML.NET.

However, I think it makes more sense to have a utility to convert .pt to ONNX in TorchSharp.

@NiklasGustafsson
Contributor Author

If so, *.pt could be a good interchange format between PyTorch, TorchSharp, and perhaps other LibTorch wrappers.

That's what it's for. Please note that TorchSharp will lack the ability to create TorchScript files, even after this PR. We will just be able to load and save.


@GeorgeS2019

TorchSharp will lack the ability to create TorchScript files,

Agreed; however, TorchSharp is able to load TorchScript files and save TorchScript files back through the libtorch C++ API => the result is loadable by PyTorch or other LibTorch wrappers.
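
A round-trip sketch of that flow; torch.jit.save is an assumed managed name for the native save entry point, and the file names are placeholders:

// Load a TorchScript module produced by PyTorch, then write it back out.
// The saved file should remain loadable from PyTorch or other LibTorch wrappers.
var module = torch.jit.load("model.pt");
torch.jit.save(module, "model_roundtrip.pt");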

@NiklasGustafsson
Contributor Author

Since .pt includes the network architecture and trained weights, .pt could be converted into ONNX

I believe that's exactly how the ONNX exporter for PyTorch works. Since you have to start with PyTorch (not TorchSharp) in order to get a TorchScript file, I don't know why you would want to export to ONNX from TorchSharp. What is the scenario you have in mind?

@GeorgeS2019

GeorgeS2019 commented Jul 2, 2022

I don't know why you would want to export to ONNX from TorchSharp. What is the scenario you have in mind?

The exported ONNX models help keep the .NET community WITHIN .NET :-)

Imagine!

Many PyTorch communities are sharing models using the TorchScript format.

TorchSharp could serve as the starting point for importing these TorchScript files, allowing the .NET community to do further training or inference within TorchSharp, OR to export them to ONNX for inference using ML.NET.

@GeorgeS2019

By doing that, this addresses the need for deep learning support in ML.NET, based on a survey done in April 2021.

@NiklasGustafsson
Contributor Author

NiklasGustafsson commented Jul 2, 2022

TorchSharp could serve as the starting point for importing these TorchScript files, allowing the .NET community to do further training or inference within TorchSharp, OR to export them to ONNX for inference using ML.NET.

Not until you can start from scratch in TorchSharp. Right now, you cannot have a TorchScript file without starting in Python, where you can also create an ONNX file from the same model. Once we can do torch.jit.script() in TorchSharp, sure, but until then, this is still very limited functionality.

@GeorgeS2019

Many PyTorch communities are sharing models using the TorchScript format.

We do not have to start from scratch in TorchSharp :-)

@GeorgeS2019

Likewise, many .NET users do not care where ONNX models come from as long as they are available for use in ML.NET.

@NiklasGustafsson
Contributor Author

NiklasGustafsson commented Jul 2, 2022

In order to write an ONNX exporter, we would need to go through the same work that we need to do in order to export to TorchScript. So, until we're in a position to do that, ONNX export will also have to wait. I agree that it's important, but it's separate from this functionality, which allows you to import TorchScript, modify it, and save it, but nothing else.

One thing at a time.

@GeorgeS2019

GeorgeS2019 commented Jul 2, 2022

Agreed, ONE THING AT A TIME.

The final goal of keeping DEEP machine learning within .NET is happening!
