Suggesting a roadmap for v0.1 #12

Closed
3 of 26 tasks
ehsanmok opened this issue Mar 4, 2019 · 13 comments

@ehsanmok
Contributor

ehsanmok commented Mar 4, 2019

Hi Laurent

First of all, I wanted to thank you again for making this happen. Given the pace of development, and since I would love to see an amazing NN crate for Rust, below are my suggestions for a v0.1 release.

  • Improve error handling.
  • Various idiomatic Rust improvements.
  • More unit test coverage.
  • Improve overall documentation (see the doc-comment sketch after this list):
    • For module-level docs, use //!
    • Add doc examples to the more important methods/functions.
    • Cross-reference modules.
  • Decouple implementations from codegen.
  • Complete tutorials, at least matching the ocaml-torch equivalents.
  • Integration with Rust ndarray.
  • GPU build and testing:
    • Local
    • CI (no free option)
  • Cover as much of the PyTorch API as possible (see how it goes):
    • Linalg ops for dense and sparse tensors.
    • Add as many NN ops as possible in nn.
    • Initializers.
    • Data loading and augmentations.
    • Multiprocessing with rayon.
    • Distributed (though it's harder).
  • PyTorch extensions: C++ <--> C <--> Rust.
  • Sub-crates core, vision, model_zoo, ffi inside tch through a virtual workspace manifest.
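A minimal, generic sketch of the documentation conventions from the list above (crate and item names are hypothetical, not tch code): //! comments document the enclosing module, while /// doc examples are compiled and run by cargo test, so they double as tests.

//! Helpers for preparing input data.
//!
//! Module-level docs written with `//!` render as this module's front
//! page in the generated rustdoc.

/// Doubles every element of a slice.
///
/// # Example
///
/// ```
/// use my_crate::double_all; // hypothetical crate name
/// assert_eq!(double_all(&[1, 2]), vec![2, 4]);
/// ```
pub fn double_all(xs: &[i64]) -> Vec<i64> {
    xs.iter().map(|x| x * 2).collect()
}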

Since you've put in a lot of effort so far, and I guess functionality-wise you want this crate to mimic your other similar projects, please let us know of any other plans so we can be on the same page.

@LaurentMazare
Owner

Thanks for putting this together; indeed there is quite a bit of polishing needed to reach v0.1. To me the most important point is improving tutorials/examples (and documentation). This should help define good patterns for the library. I think good starting points are trying to port existing pytorch tutorials or the examples that I've put together in the ocaml version - this includes GAN, RL, finetuning models, etc. I feel that design choices should be driven by usage/feedback - and hopefully the api in its current form can already be used to do pretty nice things.
Currently we have three main examples:

As for the scope, I feel that torch-sys should be the unsafe C bindings, and tch should stay as close as possible to the C++ api but provide a type-safe api on top that could be used to build/train most models. There is certainly room for some higher-level api that would be better typed and more Rust idiomatic, e.g. the type system currently does not provide information on the type of elements in a tensor, the device, or the dimensions. It would probably be quite fun to do, but to me it's a bit out of scope for now.
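As an illustration of what such a better-typed layer could look like (a hypothetical sketch, not the current API): phantom type parameters carry the element type and device at compile time, while the underlying tch::Tensor stays untyped.

use std::marker::PhantomData;
use tch::Tensor;

// Hypothetical compile-time tags; a fuller design would also encode dimensions.
struct F32;
struct Cpu;

struct TypedTensor<Elem, Dev> {
    inner: Tensor,
    _marker: PhantomData<(Elem, Dev)>,
}

impl<Elem, Dev> TypedTensor<Elem, Dev> {
    // Addition only type-checks for matching element type and device,
    // turning a class of runtime errors into compile-time ones.
    fn add(&self, other: &Self) -> Self {
        TypedTensor {
            inner: &self.inner + &other.inner,
            _marker: PhantomData,
        }
    }
}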

More specific points:

  • API coverage should be pretty good already (as it's automatically generated); do you have specific points in mind?
  • GPU training should already work, I'm using it for most examples. We should indeed consider having it supported in CI but I'm not sure if there is some free option for this.
  • In the ocaml version I implemented most of the vision models that can be found in the python api torch::vision. I'll probably try to add some of these soon - and will provide pre-trained weights so that they can be used out of the box.
  • Jupyter interaction would be a nice thing to have/experiment with.

I've also posted on users.rust-lang.org to see if we can get more feedback.

@ehsanmok
Contributor Author

ehsanmok commented Mar 4, 2019

In general, I'd like to help make the crate more idiomatic. Several conversions, error handling, more test coverage, Rust ndarray integration (added to the list above), etc. can be improved, and I've already started with the most important ground-level improvements :)

... tch should stay as close as possible to the C++ api

I haven't used the torch C++ API (still unstable, I think!). I thought you were targeting more of the Python API, since that is what people are more interested in; the selling point for the Python DL community to come to Rust could be much greater than for the C++ PyTorch community.

... the type system currently does not provide information on the type of elements in a tensor, the device, the dimensions. It would probably be quite fun to do, but to me it's a bit out of scope for now.

Absolutely! I'm working on this part now along with other idiomatic improvements that I see.

API coverage should be pretty good already (as it's automatically generated), do you have specific points in mind?

Well, compared to the PyTorch Python API there's a lot to be covered, but step by step :)

GPU training should already work, I'm using it for most examples. We should indeed consider having it supported in CI but I'm not sure if there is some free option for this.

Great! Yes, CI won't be free AFAIK.

Jupyter interaction would be a nice thing to have/experiment with.

That'd be great to test out and add some tutorials for! It seems evcxr is the most promising option, though I haven't used it.

@LaurentMazare
Owner

Re rust ndarray integration, to start with I think it should probably be done in a separate crate that would depend on both ndarray and tch. I would like to have as few dependencies as possible for now - maybe if the result is small it could be integrated later though (or if tch ends up having lots of dependencies anyway).
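For illustration, such a bridge crate could start with one-way copies along these lines (a hedged sketch: array_to_tensor is a hypothetical helper, while Tensor::of_slice and reshape are existing tch calls):

use ndarray::ArrayD;
use tch::Tensor;

/// Copies a contiguous ndarray into a tensor of the same shape.
fn array_to_tensor(arr: &ArrayD<f32>) -> Option<Tensor> {
    // `as_slice` only succeeds for standard-layout (contiguous) arrays.
    let data = arr.as_slice()?;
    let shape: Vec<i64> = arr.shape().iter().map(|&d| d as i64).collect();
    Some(Tensor::of_slice(data).reshape(&shape))
}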

Re being more idiomatic, that's probably a good goal. Would you have some pointers on what is not idiomatic enough yet? I already made some changes to use the failure crate, but there is certainly more that can be done.

Re C++ vs Python, I think it's probably easier to convince C++ people to use Rust than Python folks, but that's a minor point. The main point is that as we're binding to the C++ api, it's simpler to mimic this. That being said, the two apis are not that different. Happy to know which bits of the python api you're missing the most. Tensor operations I would think are already decently covered, same for optimizers; for torch::vision I plan to add a bit more (although I may move it to a separate crate too).

And just to emphasize again: I think that more examples/tutorials would be a great way to show what can be done with this in rust and also give us more insights on how to structure things.

Finally a point I forgot in my previous message is writing extensions. There is a C++ tutorial on how to do this and if it's possible to do it in rust it seems like a nice selling point as it's difficult for python to compete here.
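To make the extension idea concrete, here is a hypothetical sketch of the Rust side: a cdylib exposing a C ABI function that a thin C++ shim built with PyTorch's extension tooling could call. The function name and the raw-buffer interface are assumptions, not an established pattern.

/// Applies ReLU in place to a raw f32 buffer handed over from C++.
///
/// # Safety
/// The caller must pass a valid pointer to `len` initialized f32 values
/// that no other code accesses during the call.
#[no_mangle]
pub unsafe extern "C" fn relu_inplace(data: *mut f32, len: usize) {
    let slice = std::slice::from_raw_parts_mut(data, len);
    for x in slice.iter_mut() {
        *x = x.max(0.0);
    }
}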

@ehsanmok
Contributor Author

ehsanmok commented Mar 4, 2019

One thing that can help is making the tch crate host some sub-crates through a workspace in one repository. This helps with separation of concerns and some of your legitimate dependency management concerns. Potentially, we can have a core crate, a torchvision crate, later ffi, etc., mimicking PyTorch.

rust ndarray integration, to start with I think it should probably be done in a separate crate that would have dependencies on both ndarray and tch.

Adding some conversion support should be enough, like the PyTorch/NumPy conversions. If you're worried about dependencies, it can at least be behind a feature.
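A sketch of that feature-gated alternative, assuming ndarray is declared as an optional dependency in Cargo.toml with a feature of the same name (module and helper names are illustrative):

// Compiled only when users opt in, e.g.
// tch = { version = "...", features = ["ndarray"] }
#[cfg(feature = "ndarray")]
pub mod ndarray_bridge {
    // conversion helpers such as `array_to_tensor` from the sketch in the
    // earlier comment would live here instead of in a separate crate
}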

being more idiomatic, that's probably a good goal. Would you have some pointers on what is not idiomatic enough yet ?

I'll send you a WIP PR soon showing exactly what I mean.

I think it's probably easier to convince C++ people to use Rust than Python folks

In the broader scope, yes! I think there are folks (me included) who have already used PyTorch in Python but are not happy when it comes to static typing, type/memory safety, deployment issues, etc. That was also mentioned by Yann LeCun in his recent interview about programming languages, type safety, etc. So my point is that this crate has great potential for adoption by those folks.

And just to emphasize again: I think that more examples/tutorials would be a great way to show what can be done with this in rust and also give us more insights on how to structure things.

Absolutely! I'm 100% with you. Examples/tutorials are great, no doubt, and the more reimplementations we can have in Rust, the more awesome it'd be :)

Finally a point I forgot in my previous message is writing extensions. There is a C++ tutorial on how to do this and if it's possible to do it in rust it seems like a nice selling point as it's difficult for python to compete here.

Sure! That'd be great. I have some ideas, though I need to test their feasibility first, as this can lead to some new-ish territory.

@ehsanmok
Contributor Author

ehsanmok commented Mar 7, 2019

@LaurentMazare I'm now more familiar with the code base, so I've just updated the proposed list above. Sorry if I wasn't specific enough initially; I started with small things to familiarize myself in the meantime.

Please let me know if these are helpful and whether you want me to continue with this.

@LaurentMazare
Owner

@ehsanmok Thanks for all your work on this. Overall I feel that it's still early days for such a detailed roadmap. I think I need one or two more weeks playing with the library and porting examples to better understand the proper use cases/abstractions.

@LaurentMazare
Owner

Closing this as things have diverged substantially since this list was created. A lot of the list made it in, and the crate is pretty usable right now and up to date with PyTorch v1.1.0, so I'll craft a 0.1 release soonish.

@arilou
Contributor

arilou commented Jun 30, 2021

Does tch-rs support rayon (i.e. multi-threading), for example for getting the probabilities of an image from a pre-trained net as in the sample?

For example, can this code (ripped from the Readme) run from many threads, or does it need a Mutex to wrap access to resnet18?

// Apply the forward pass of the model to get the logits and convert them
// to probabilities via a softmax.
let output = resnet18
    .forward_t(&image.unsqueeze(0), /* train = */ false)
    .softmax(-1);

// Finally print the top 5 categories and their associated probabilities.
for (probability, class) in imagenet::top(&output, 5).iter() {
    println!("{:50} {:5.2}%", class, 100.0 * probability)
}

Thanks,
-- Jon.

@arilou
Contributor

arilou commented Jul 3, 2021

Going to answer myself here: the answer is no, because Tensor is !Sync
https://docs.rs/tch/0.0.8/tch/struct.Tensor.html#synthetic-implementations

Is there any reason not to make it Sync?
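For reference, the Mutex workaround from my original question would look something like this (a hypothetical sketch; Mutex<M> is Sync whenever M is Send, so a Send-but-!Sync model can be shared across rayon workers this way). Note that the lock is held across each forward call, so this serializes the actual work:

use rayon::prelude::*;
use std::sync::Mutex;

/// Runs `forward` over `inputs` from rayon's thread pool, funnelling all
/// access to the Send-but-!Sync model through a Mutex.
fn map_with_locked_model<M, T, R>(
    model: &Mutex<M>,
    inputs: Vec<T>,
    forward: impl Fn(&M, T) -> R + Sync,
) -> Vec<R>
where
    M: Send,
    T: Send,
    R: Send,
{
    inputs
        .into_par_iter()
        .map(|x| forward(&*model.lock().unwrap(), x))
        .collect()
}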

@LaurentMazare
Owner

I'm not sure there would be much to be gained by applying something like rayon on top of a forward pass: the operations there are already parallelized by the torch c++ api to use all available cores on your machine. As noted in this thread, it could actually go the other way and harm performance.
As to why tensors are not Sync, an issue is that the aliasing model of pytorch is not properly represented in the rust type system, so some storage may be shared between tensors and lead to unsound behavior. In practice this is not much of an issue, but doing more operations across threads is likely to trigger more race conditions (though probably having Send is already an issue here).

@arilou
Contributor

arilou commented Jul 3, 2021

I see, thank you for the fast reply :) (I'm new to the ML stuff.) I was just wondering, given the thread you linked.

If I have a pre-trained model for image classification, then there is no need for me to parallelize the work per image (I have thousands of images I want to classify); instead, the parallelism will happen during the classification phase (working with the pre-trained model). Did I understand things correctly?

@LaurentMazare
Owner

I think that's right: all the operations done by your model (matrix multiplications, convolutions, etc.) will already run in parallel using all the cores of your CPU (or potentially the execution units of your GPU), and you can also process a large number of images in the same batch to ensure that the overhead of going from one image to the next is not too large.
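As a small illustration of the batching approach (a hedged sketch; classify_batch is a hypothetical helper, while Tensor::stack and ModuleT::forward_t are existing tch APIs):

use tch::{nn::ModuleT, Tensor};

/// Stacks per-image [C, H, W] tensors into one [N, C, H, W] batch and
/// runs a single forward pass, letting libtorch parallelize internally.
fn classify_batch<M: ModuleT>(model: &M, images: &[Tensor]) -> Tensor {
    let batch = Tensor::stack(images, 0);
    // probabilities can then be obtained with a softmax as in the
    // Readme snippet above
    model.forward_t(&batch, /* train = */ false)
}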

@arilou
Contributor

arilou commented Jul 5, 2021

Thanks! I will give it a try
