Suggesting a roadmap for v0.1 #12

Closed
3 of 26 tasks
ehsanmok opened this issue Mar 4, 2019 · 13 comments

@ehsanmok
Contributor

ehsanmok commented Mar 4, 2019

Hi Laurent

First of all, I wanted to thank you again for making this happen. Given the pace of development, and since I would love to see an amazing NN crate for Rust, below are my suggestions for a v0.1 release.

  • Improve error handling.
  • Various idiomatic Rust improvements.
  • More unit test coverage.
  • Improve overall documentation (see the doc-comment sketch after this list):
    • For module-level docs, use //!
    • Add doc examples to the more important methods/functions.
    • Cross-reference modules.
  • Decouple implementations from codegen.
  • Complete tutorials, at least matching the ocaml-torch equivalents.
  • Integration with Rust ndarray.
  • GPU build and testing:
    • Local
    • CI (no free option)
  • Cover as much of the PyTorch API as possible (see how it goes):
    • Linalg ops for dense and sparse tensors.
    • Add as many NN ops as possible in nn.
    • Initializers.
    • Data loading and augmentations.
    • Multiprocessing with rayon.
    • Distributed (though it's harder).
  • PyTorch extensions: C++ <--> C <--> Rust.
  • Sub-crates core, vision, model_zoo, ffi inside tch through a virtual workspace manifest.
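A minimal, generic sketch of the documentation conventions from the list above (crate and item names are hypothetical, not tch code): //! comments document the enclosing module, while /// doc examples are compiled and run by cargo test, so they double as tests.

//! Helpers for preparing input data.
//!
//! Module-level docs written with `//!` render as this module's front
//! page in the generated rustdoc.

/// Doubles every element of a slice.
///
/// # Example
///
/// ```
/// use my_crate::double_all; // hypothetical crate name
/// assert_eq!(double_all(&[1, 2]), vec![2, 4]);
/// ```
pub fn double_all(xs: &[i64]) -> Vec<i64> {
    xs.iter().map(|x| x * 2).collect()
}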

Since you've put in a lot of effort so far, and I guess functionality-wise you want this crate to mimic your other similar projects, please let us know of any other plans so we can be on the same page.

@LaurentMazare
Owner

Thanks for putting this together; indeed there is quite a bit of polishing needed to reach v0.1. To me the most important point is improving tutorials/examples (and documentation). This should help define good patterns for the library. I think good starting points are trying to port existing pytorch tutorials or the examples that I've put together in the ocaml version - this includes GAN, RL, finetuning models, etc. I feel that design choices should be driven by usage/feedback - and hopefully the api in its current form can already be used to do pretty nice things.
Currently we have three main examples:

As for the scope, I feel that torch-sys should be the unsafe C bindings, and tch should stay as close as possible to the C++ api but provide a type-safe api on top that could be used to build/train most models. There is certainly room for some higher-level api that would be better typed and more Rust idiomatic, e.g. the type system currently does not provide information on the type of elements in a tensor, the device, or the dimensions. It would probably be quite fun to do, but to me it's a bit out of scope for now.
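As an illustration of what such a better-typed layer could look like (a hypothetical sketch, not the current API): phantom type parameters carry the element type and device at compile time, while the underlying tch::Tensor stays untyped.

use std::marker::PhantomData;
use tch::Tensor;

// Hypothetical compile-time tags; a fuller design would also encode dimensions.
struct F32;
struct Cpu;

struct TypedTensor<Elem, Dev> {
    inner: Tensor,
    _marker: PhantomData<(Elem, Dev)>,
}

impl<Elem, Dev> TypedTensor<Elem, Dev> {
    // Addition only type-checks for matching element type and device,
    // turning a class of runtime errors into compile-time ones.
    fn add(&self, other: &Self) -> Self {
        TypedTensor {
            inner: &self.inner + &other.inner,
            _marker: PhantomData,
        }
    }
}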

More specific points:

  • API coverage should be pretty good already (as it's automatically generated); do you have specific points in mind?
  • GPU training should already work, I'm using it for most examples. We should indeed consider having it supported in CI but I'm not sure if there is some free option for this.
  • In the ocaml version I implemented most of the vision models that can be found in the python api torch::vision. I'll probably try to add some of these soon - and will provide pre-trained weights so that they can be used out of the box.
  • Jupyter interaction would be a nice thing to have/experiment with.

I've also posted on users.rust-lang.org to see if we can get more feedback.

@ehsanmok
Contributor Author

ehsanmok commented Mar 4, 2019

In general, I'd like to help make the crate more idiomatic. Several conversions, error handling, more test coverage, Rust ndarray integration (added to the list above), etc. can be improved, and I've already started with the most important ground-level improvements :)

... tch should stay as close as possible to the C++ api

I haven't used the torch C++ API (still unstable, I think!). I thought you were targeting more of the Python API, since that is what people are more interested in; the selling point for the Python DL community to come to Rust could be much greater than for the C++ PyTorch community.

... the type system currently does not provide information on the type of elements in a tensor, the device, the dimensions. It would probably be quite fun to do, but to me it's a bit out of scope for now.

Absolutely! I'm working on this part now along with other idiomatic improvements that I see.

API coverage should be pretty good already (as it's automatically generated), do you have specific points in mind?

Well, compared to the PyTorch Python API there's a lot to be covered, but step by step :)

GPU training should already work, I'm using it for most examples. We should indeed consider having it supported in CI but I'm not sure if there is some free option for this.

Great! Yes, CI won't be free AFAIK.

Jupyter interaction would be a nice thing to have/experiment with.

That'd be great to test out and add some tutorials for! It seems evcxr is the most promising option, though I haven't used it.

@LaurentMazare
Owner

Re rust ndarray integration, to start with I think it should probably be done in a separate crate that would depend on both ndarray and tch. I would like to have as few dependencies as possible for now - maybe if the result is small it could be integrated later though (or if tch ends up having lots of dependencies anyway).
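For illustration, such a bridge crate could start with one-way copies along these lines (a hedged sketch: array_to_tensor is a hypothetical helper, while Tensor::of_slice and reshape are existing tch calls):

use ndarray::ArrayD;
use tch::Tensor;

/// Copies a contiguous ndarray into a tensor of the same shape.
fn array_to_tensor(arr: &ArrayD<f32>) -> Option<Tensor> {
    // `as_slice` only succeeds for standard-layout (contiguous) arrays.
    let data = arr.as_slice()?;
    let shape: Vec<i64> = arr.shape().iter().map(|&d| d as i64).collect();
    Some(Tensor::of_slice(data).reshape(&shape))
}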

Re being more idiomatic, that's probably a good goal. Would you have some pointers on what is not idiomatic enough yet? I already made some changes to use the failure crate, but there is certainly more that can be done.

Re C++ vs Python, I think it's probably easier to convince C++ people to use Rust than Python folks, but that's a minor point. The main point is that as we're binding to the C++ api, it's simpler to mimic this. That being said, the two apis are not that different. Happy to know which bits of the python api you're missing the most. Tensor operations I would think are already decently covered, same for optimizers; for torch::vision I plan to add a bit more (although I may move it to a separate crate too).

And just to emphasize again: I think that more examples/tutorials would be a great way to show what can be done with this in rust and also give us more insights on how to structure things.

Finally a point I forgot in my previous message is writing extensions. There is a C++ tutorial on how to do this and if it's possible to do it in rust it seems like a nice selling point as it's difficult for python to compete here.
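To make the extension idea concrete, here is a hypothetical sketch of the Rust side: a cdylib exposing a C ABI function that a thin C++ shim built with PyTorch's extension tooling could call. The function name and the raw-buffer interface are assumptions, not an established pattern.

/// Applies ReLU in place to a raw f32 buffer handed over from C++.
///
/// # Safety
/// The caller must pass a valid pointer to `len` initialized f32 values
/// that no other code accesses during the call.
#[no_mangle]
pub unsafe extern "C" fn relu_inplace(data: *mut f32, len: usize) {
    let slice = std::slice::from_raw_parts_mut(data, len);
    for x in slice.iter_mut() {
        *x = x.max(0.0);
    }
}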

@ehsanmok
Contributor Author

ehsanmok commented Mar 4, 2019

One thing that can help is making the tch crate host some sub-crates through a workspace in one repository. This helps with separation of concerns and some of your legitimate dependency management concerns. Potentially, we can have a core crate, a torchvision crate, later ffi, etc., mimicking PyTorch.

rust ndarray integration, to start with I think it should probably be done in a separate crate that would have dependencies on both ndarray and tch.

Adding some conversion support should be enough, like the PyTorch/NumPy conversions. If you're worried about dependencies, it can at least be behind a feature.
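A sketch of that feature-gated alternative, assuming ndarray is declared as an optional dependency in Cargo.toml with a feature of the same name (module and helper names are illustrative):

// Compiled only when users opt in, e.g.
// tch = { version = "...", features = ["ndarray"] }
#[cfg(feature = "ndarray")]
pub mod ndarray_bridge {
    // conversion helpers such as `array_to_tensor` from the sketch in the
    // earlier comment would live here instead of in a separate crate
}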

being more idiomatic, that's probably a good goal. Would you have some pointers on what is not idiomatic enough yet ?

I'll send you a WIP PR soon showing exactly what I mean.

I think it's probably easier to convince C++ people to use Rust than Python folks

In the broader scope, yes! I think there are folks (me included) who have already used PyTorch in Python but are not happy when it comes to static typing, type/memory safety, deployment issues, etc. That was also mentioned by Yann LeCun in his recent interview about programming languages, type safety, etc. So my point is that this crate has great potential for adoption by those folks.

And just to emphasize again: I think that more examples/tutorials would be a great way to show what can be done with this in rust and also give us more insights on how to structure things.

Absolutely! I'm 100% with you. Examples/tutorials are great, no doubt, and the more reimplementations we can have in Rust, the more awesome it'd be :)

Finally a point I forgot in my previous message is writing extensions. There is a C++ tutorial on how to do this and if it's possible to do it in rust it seems like a nice selling point as it's difficult for python to compete here.

Sure! That'd be great. I have some ideas, though I need to test their feasibility first, as this can lead to some new-ish territory.

@ehsanmok
Contributor Author

ehsanmok commented Mar 7, 2019

@LaurentMazare I'm now more familiar with the code base, so I've just updated the proposed list above. Sorry if I wasn't specific enough initially; I started with small things to familiarize myself in the meantime.

Please let me know if these are helpful and whether you want me to continue with this.

@LaurentMazare
Owner

@ehsanmok Thanks for all your work on this. Overall I feel that it's still early days for such a detailed roadmap. I think I need one or two more weeks playing with the library and porting examples to better understand the proper use cases/abstractions.

@LaurentMazare
Owner

Closing this as things have diverged substantially since this list was created. A lot of the list made it in, and the crate is pretty usable right now and up to date with PyTorch v1.1.0, so I'll craft a 0.1 release soonish.

@arilou
Contributor

arilou commented Jun 30, 2021

Does tch-rs support rayon (i.e. multi-threading), for example for getting the probabilities of an image from a pre-trained net as in the sample?

For example, can this code (ripped from the Readme) run from many threads, or does it need a Mutex to wrap access to resnet18?

// Apply the forward pass of the model to get the logits and convert them
// to probabilities via a softmax.
let output = resnet18
    .forward_t(&image.unsqueeze(0), /* train = */ false)
    .softmax(-1);

// Finally print the top 5 categories and their associated probabilities.
for (probability, class) in imagenet::top(&output, 5).iter() {
    println!("{:50} {:5.2}%", class, 100.0 * probability)
}

Thanks,
-- Jon.

@arilou
Contributor

arilou commented Jul 3, 2021

Going to answer myself here: the answer is no, because Tensor is !Sync
https://docs.rs/tch/0.0.8/tch/struct.Tensor.html#synthetic-implementations

Is there any reason not to make it Sync?
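For reference, the Mutex workaround from my original question would look something like this (a hypothetical sketch; Mutex<M> is Sync whenever M is Send, so a Send-but-!Sync model can be shared across rayon workers this way). Note that the lock is held across each forward call, so this serializes the actual work:

use rayon::prelude::*;
use std::sync::Mutex;

/// Runs `forward` over `inputs` from rayon's thread pool, funnelling all
/// access to the Send-but-!Sync model through a Mutex.
fn map_with_locked_model<M, T, R>(
    model: &Mutex<M>,
    inputs: Vec<T>,
    forward: impl Fn(&M, T) -> R + Sync,
) -> Vec<R>
where
    M: Send,
    T: Send,
    R: Send,
{
    inputs
        .into_par_iter()
        .map(|x| forward(&*model.lock().unwrap(), x))
        .collect()
}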

@LaurentMazare
Owner

I'm not sure there would be much to be gained by applying something like rayon on top of a forward pass: the operations there are already parallelized by the torch c++ api to use all available cores on your machine. As noted in this thread, it could actually go the other way and harm performance.
As to why tensors are not Sync, an issue is that the aliasing model of pytorch is not properly represented in the rust type system, so some storage may be shared between tensors and lead to unsound behavior. In practice this is not much of an issue, but doing more operations across threads is likely to trigger more race conditions (though probably having Send is already an issue here).

@arilou
Contributor

arilou commented Jul 3, 2021

I see, thank you for the fast reply :) (I'm new to the ML stuff.) I was just wondering, given the thread you linked.

If I have a pre-trained model for image classification, then there is no need for me to parallelize the work per image (I have thousands of images I want to classify); instead, the parallelism will happen during the classification phase (working with the pre-trained model). Did I understand things correctly?

@LaurentMazare
Owner

I think that's right: all the operations done by your model (matrix multiplications, convolutions, etc.) will already run in parallel using all the cores of your CPU (or potentially the execution units of your GPU), and you can also process a large number of images in the same batch to ensure that the overhead of going from one image to the next is not too large.
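As a small illustration of the batching approach (a hedged sketch; classify_batch is a hypothetical helper, while Tensor::stack and ModuleT::forward_t are existing tch APIs):

use tch::{nn::ModuleT, Tensor};

/// Stacks per-image [C, H, W] tensors into one [N, C, H, W] batch and
/// runs a single forward pass, letting libtorch parallelize internally.
fn classify_batch<M: ModuleT>(model: &M, images: &[Tensor]) -> Tensor {
    let batch = Tensor::stack(images, 0);
    // probabilities can then be obtained with a softmax as in the
    // Readme snippet above
    model.forward_t(&batch, /* train = */ false)
}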

@arilou
Contributor

arilou commented Jul 5, 2021

Thanks! I will give it a try
