What do we want to build? #1
I want to note that, while it works great, https://github.com/twistedfall/opencv-rust is not particularly user-friendly or 'clean' in Rust terms. Maybe we could have a look at it? |
I think the use case that could make Rust shine is deployment. Currently the de-facto "mainstream" stack is Python-based (scikit-learn, NumPy, pandas + your DL framework of choice: TensorFlow, Torch...). It shines for fast prototyping, because Python, but it sucks for industrialization (and deployment), because... Python. I really think Rust would do great. In that area, I kinda like TensorFlow Serving, but it forces you to have a separate service (that you call with their protobuf/RPC).
|
I'm currently building a large project with Rust (I mention it here: https://users.rust-lang.org/t/interest-for-nlp-in-rust/15331/9), where I am doing the data engineering in Rust (lots of string metrics). [tl;dr: I found lots of disparate projects with 50% of what I needed for string metrics, but instead rolled my own, trying to incorporate previous work and give credit.] I want to feed the feature vectors to Julia to experiment with what I want to use for classification and modelling, and then I'll want to be able to use Rust for inference/classification etc. I had to pause development for business reasons, but I'm starting again; one of my biggest issues was not ML-related but finding a nice pattern for parallel file download (seems like it should be simple, but maybe I'm spoiled by Go's simplicity, lol).

From this real-world project's point of view, as well as from my time spent thinking in the abstract and surveying the ML ecosystem in Rust (about a year), I would think that a focus on data engineering in general and on serving models is the way to go (this also seems to be a widely shared sentiment). In a practical sense, I would like to see Rust jobs for data engineers and machine learning engineers... that is, the bookends of a typical data science project: serving the data and serving the model. That is, targeting software developers, infrastructure, computational math, and data people.

Trying to convince research scientists to use Rust would be wasted effort; for most of these people software is a secondary skill, so they need something easy to learn, dynamically typed, with a REPL... I've watched this play out in the Python/R/Matlab versus Julia world, and while IMO Julia has a lot to offer current Python/R/Matlab devs and is similar enough to those languages, getting that group of people to use Julia is not easy; I can't imagine what it'd be like proposing Rust. Here are some challenges I see:
|
Really looking forward to digging into ndarray. Though I've had a slight delay, I'm writing up ndarray examples for the Grokking Deep Learning book, where Andrew Trask introduces deep learning with only NumPy. He's expressed interest and welcomed the examples... :) |
A standardized tokenization implementation! Tokenization fills the role of "turn the text into fixed vectors" that you'd feed into standard models. As an NLP practitioner and Rust user, I see tokenization as an incredibly important step in the pipeline, a big barrier to new people trying to apply NLP, and a place where lots of small bugs creep in, due to non-standard implementations, that take forever to find. Having a standard implementation for the simpler tokenization methods (like regex matching) would make NLP problems much more approachable in Rust.
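To make "simpler tokenization methods (like regex matching)" concrete, here is a minimal sketch of one, assuming the regex crate (the pattern is only an illustration, not a proposed standard):

use regex::Regex;

// Split text into word-ish tokens, keeping apostrophes inside tokens (e.g. "don't").
fn tokenize(text: &str) -> Vec<&str> {
    let re = Regex::new(r"\w+(?:'\w+)?").unwrap();
    re.find_iter(text).map(|m| m.as_str()).collect()
}

fn main() {
    assert_eq!(tokenize("Don't panic, tokenize!"), vec!["Don't", "panic", "tokenize"]);
}

A standard crate would mostly be about agreeing on the interface and the edge cases (Unicode, punctuation, case), not the regex itself. |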
One part of machine learning where Rust could shine right now is simulation for reinforcement learning. For instance, if I'm training an agent to play blackjack, the biggest bottleneck is the agent "playing" blackjack over and over to collect enough data for training. Rayon and Actix could be used to create fast, performant game "environments" now, without needing an established ML ecosystem.
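A minimal sketch of that idea with rayon (the "blackjack" below is a stand-in for real game logic, and all names are made up):

use rayon::prelude::*;

struct Episode {
    states: Vec<u8>,
    reward: f32,
}

// Toy episode: "hit" until the hand total reaches 17.
fn play_one(seed: u64) -> Episode {
    let mut total = (seed % 10) as u8 + 2;
    let mut states = vec![total];
    while total < 17 {
        total += (seed % 9) as u8 + 1;
        states.push(total);
    }
    Episode { states, reward: if total <= 21 { 1.0 } else { -1.0 } }
}

fn main() {
    // A million simulated episodes, spread across all cores by one combinator.
    let episodes: Vec<Episode> = (0..1_000_000u64).into_par_iter().map(play_one).collect();
    let total_reward: f32 = episodes.iter().map(|e| e.reward).sum();
    println!("{} episodes, mean reward {}", episodes.len(), total_reward / episodes.len() as f32);
}
|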
I agree with @DhruvDh, using Rust to simulate environments for RL agents would be great. Having something akin to OpenAI's gym interface would be really nice. Many RL researchers are still going to want to use Python and all the associated deep learning libraries, so I would love to see RL environments rendered in Rust that could be interfaced with both Python and Rust for agents. Edit: I imagine that algorithms like Monte Carlo Tree Search would be really useful if they were written in Rust. I would not want to wait on Python to handle that bit.
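A gym-style interface could be as small as one trait; a strawman (all names hypothetical, not an existing crate's API), which could then be exposed to Python via bindings while staying native for Rust agents:

trait Environment {
    type State;
    type Action;

    /// Start a new episode and return the initial state.
    fn reset(&mut self) -> Self::State;

    /// Apply an action; returns (next_state, reward, done).
    fn step(&mut self, action: Self::Action) -> (Self::State, f32, bool);
}
|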
Along these lines, I am working on a hobby project (link), which does this. It isn't quite ready for even an alpha release yet, but I am in the final stages of cleaning up the API with the intent to publish it. |
Things Rust definitely needs:
- const generics

Things that we might want but I'm not sure: |
I've been thinking of building a Rust deep learning / GPU compute library on top of the TVM framework for a while now. I think it could address a lot of the things @flo-dhalluin is talking about. TVM's an amazing project that's currently flying a bit under the radar. It's an open-source deep learning compiler - it compiles deep neural nets / large array operations to run on the GPU (or on OpenCL, or FPGA, or TPU, or WebGL...). You define an AST of computations via its API, and it spits out a small (<5 MB) shared library containing just the operations you wanted, on whatever acceleration framework and target platform you want. It currently has a working Rust runtime library, which lets you call a compiled model from Rust. That's pretty neat, and I don't think it would be too hard to build some really cool tools on top of it. My dream is something like:
// a crate based on tvm
// `cargo build` will (by default) download + checksum a prebuilt TVM library
// that this links to, so that you don't have to wait for a whole compiler to compile.
// The download will only be ~50mb -- way smaller and easier than lots of other deep
// learning frameworks. It will also support running code on things besides cuda!
// The output binary won't need to link the compiler (by default) and will therefore be
// only a few megabytes.
extern crate tvmrs;
// a procedural macro that converts Rust code to Relay IR.
// Relay IR is TVM's high-level IR for defining neural networks / computation chains,
// sorta like a tensorflow Graph. It's also not too dissimilar to Rust.
// The macro will compile the IR with TVM at build-time, and link the resulting artifacts
// to this rust library.
tvmrs::accelerate! {
// stateless operation
fn relu_downsample(x: Tensor[c, n, h, w]) -> Tensor[c, n, h/2, w/2] {
relu(downsample(x))
}
// stateful operation
struct Block<oc> {
conv: Conv2d<3,3,oc>,
elu: Elu
}
impl Op for Block<oc> {
fn run(self, input: Tensor[c, n, h, w]) -> Tensor[oc, n, h, w] {
self.elu(self.conv(input))
}
}
fn swap_channels(x: Tensor[2, n, h, w]) -> Tensor[2, n, h, w] {
// a low-level tensor operation defined as a TVM Tensor expression.
let out = compute!(x.shape, |cc, nn, hh, ww| x[(cc + 1) % 2, nn, hh, ww]);
out
}
// a sequential network container.
sequential! Network {
#[opencl] Block<5>, // run on opencl
#[opencl] relu_downsample,
#[opencl] Conv2d::new(1,1,2),
#[rust] debug, // call a normal rust function
#[cpu] swap_channels // run this part on CPU to maximize throughput
}
// Compute a derivative of the network.
// Relay IR is designed to be differentiable.
derivative! NetworkDerivative (Network);
}
// a normal rust function
fn debug(x: Tensor) {
...
}
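// One binary might drive the training loop: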
fn main() {
tvmrs::training_loop! {
net: Network,
dnet: NetworkDerivative,
epochs: 37,
training_data: dataset! {...},
valid_data: dataset! {...},
...
}
}
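// ...while a separate inference binary might look like: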
fn main() {
let input = tvmrs::ndarray_from_stdin();
let output = Network::load_params("params.bin").run(input);
println!("{:?}", output);
}

(Further reading: Introduction to Relay, TVM Tensor expressions.) All of this is of course pending mountains of bikeshedding; I have no idea what the final API will look like. One of the nifty things here is that this isn't limited to deep learning models. TVM can handle pretty much any algorithm made of large array operations. So if you wanted to run your SVM on GPU, you can do that pretty easily! Steps to take here:
If people are interested in this implementation path we could throw a repo together and start work. I mainly want this because I don't want to be stuck using Python and CUDA all the time for my deep learning research :))) |
A few months ago I started a crate of my own for deep learning. My goal is to have a library which:
It's currently totally useless. Right now I'm in the process of adding a Vulkan backend (I have a few thousand lines of work-in-progress code on my disk which I've not pushed yet); once I finish that in a few weeks I plan to build it up further so that I can train CIFAR-10 to at least ~90% accuracy, add some model import/export functionality (probably from/to the ONNX format), and only then will it be actually usable for something practical. Some people would call this a waste of time and effort, and, well, I do agree that it would probably be more productive not to do this completely from scratch as I'm doing (e.g. by using TVM as kazimuth said), but I don't really care - I'm just trying to scratch my own itch. |
@kazimuth while I love the snippets you've shown here, a lot of my love for Rust exists because of all the compile-time checks the compiler does, and the wonderfully easy-to-comprehend error messages. I feel that if one is using Rust just as a way to compose and run functionality defined in other languages, then there isn't much to gain here; might as well just use Python. And TVM looks more like a tool for deploying neural nets rather than training them, which is very useful, but I would prefer to do both in Rust. There's also tch-rs - bindings to PyTorch's libtorch. Something else that is also interesting is dual_num which, as best I understand it, is some fancy math that might eventually give us automatic differentiation.
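The dual-number trick behind that "fancy math" is small enough to sketch: carry a derivative alongside each value and let the arithmetic rules propagate it exactly. A toy illustration of the idea (not dual_num's actual API):

use std::ops::{Add, Mul};

#[derive(Clone, Copy, Debug)]
struct Dual {
    re: f64,  // the value
    eps: f64, // the derivative carried alongside it
}

impl Add for Dual {
    type Output = Dual;
    fn add(self, o: Dual) -> Dual {
        Dual { re: self.re + o.re, eps: self.eps + o.eps }
    }
}

impl Mul for Dual {
    type Output = Dual;
    fn mul(self, o: Dual) -> Dual {
        // product rule: (uv)' = u'v + uv'
        Dual { re: self.re * o.re, eps: self.eps * o.re + self.re * o.eps }
    }
}

fn constant(c: f64) -> Dual { Dual { re: c, eps: 0.0 } }

fn main() {
    // Differentiate f(x) = x*x + 3x at x = 2 by seeding eps = 1.
    let x = Dual { re: 2.0, eps: 1.0 };
    let y = x * x + constant(3.0) * x;
    println!("f(2) = {}, f'(2) = {}", y.re, y.eps); // prints 10 and 7
}
|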
@koute the long-term road-map is amazing, but I don't get why you'd bother putting effort into the TensorFlow backend. Admittedly I don't have enough know-how to imagine what a native backend would look like and the kind of work it would need. |
@DhruvDh The TensorFlow backend will most likely be removed in the future. Currently it is there for a few reasons:
|
Some cool stuff coming to light. Is anyone familiar with the work presented at c4ML? https://www.c4ml.org/ Here are some references to work being done in Swift and Julia. (Note: Rust, Swift, and Julia were all at the top of the list for Google's TensorFlow project that eventually became Swift for TensorFlow.) I don't know of any projects in Rust along these lines... of course, they are also all funded (Google, and Julia Computing). |
@DhruvDh that's a fair criticism, but really that's a problem whenever you want to use a hardware accelerator: you're always going to be calling into a language with different semantics from the host. Using Rust for glue gives you type-safety, performance, and lovely tooling. E.g. it's dead-simple to write a parallel image preprocessing pipeline in Rust, whereas with Python you need a load of hacks (FFI, multiprocessing) to get acceptable performance. Also, you're free to define new low-level operations in Rust; users shouldn't ever need to use another language :) And yeah, currently TVM's publicity is oriented around deployment, because that's where there's a gap in the Python ecosystem. There's no reason their compiler wouldn't work for training too, though. @jbowles I've worked with some of those projects; see my comment, I think we can borrow some of that work. Also CC @nhynes
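To illustrate the preprocessing point: with rayon, parallelising a pipeline is one combinator change (the file layout and the normalize step below are made up for the example):

use rayon::prelude::*;
use std::fs;

// Pretend preprocessing: scale raw bytes into [0, 1] floats.
fn normalize(bytes: &[u8]) -> Vec<f32> {
    bytes.iter().map(|&b| b as f32 / 255.0).collect()
}

fn main() {
    let paths: Vec<String> = (0..1000).map(|i| format!("data/img_{i}.raw")).collect();

    // `.par_iter()` instead of `.iter()` is the whole parallelism story here.
    let batches: Vec<Vec<f32>> = paths
        .par_iter()
        .filter_map(|p| fs::read(p).ok()) // skip files that fail to load
        .map(|bytes| normalize(&bytes))
        .collect();

    println!("preprocessed {} images", batches.len());
}
|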
Other thought: I wonder what interactive scientific programming would look like in Rust? There's a Jupyter kernel, but I'm not sure how usable it is. It might be that Rust should just be used for high-performance kernels and the like, and be easy to call from other languages, like you lay out in your presentation @LukeMathWalker. |
Wow, there really is a lurking interest 😛 This is just great. The discussion has explored several different directions, so I'd like to give more details on what I envision (and where that need comes from).

I strongly align with @flo-dhalluin: I think Rust can really shine in delivering an end-to-end production workflow. Tackling this challenge requires the building blocks I mentioned (n-dimensional arrays, dataframes) and some others that have been brought up (e.g. running code on different types of hardware, easy interop, reading/writing a lot of different formats). Certain capabilities can be borrowed from other languages; others we should probably port and develop natively in Rust (a sufficiently large zoo of preprocessing techniques and standard models).

While I do understand the interest in the deep learning area, I don't think it's realistic to kickstart an effort to make Rust a primary language for NN development: we should definitely be able to deploy and run NN models (the TVM project is an excellent example here), but I don't think we would be adding a lot of value by chasing huge projects like TensorFlow or PyTorch.

Summing it up, the minimum working prototype that I have in mind to show off what Rust can do goes along these lines:
If you could manage to get the experience right, I am quite sure the interest in Rust for this kind of use case would skyrocket. |
I agree; however, you're looking at it from the perspective of a data scientist who wants to fill in the gaps of their existing workflow and augment their ML pipeline with Rust. I'm looking at it from the perspective of a Rust developer who just wants to augment their existing application with a little ML, without going through the hoops of exporting their data, processing it through a mainstream ML framework, and serializing it back so that it can be used by the application again. In other words, my personal interest lies not in filling a gap in the existing ML ecosystem (although that's also most certainly worthwhile!), but in filling a gap in the Rust ecosystem by creating value for existing Rust users (and perhaps the users of other languages) so that they could take advantage of ML in a plug-and-play fashion with a minimal amount of fuss. (Which is why things like wide hardware and platform support, simplicity, and a lack of non-Rust dependencies - so it's easy to build and cross-compile - are important.) |
I can volunteer to work on tokenizers, string distance metrics, and/or a one-hot encoding package for rust-ml. I've already been working on the first two, as I have real-world projects that need them, so I can double up. As far as a one-hot package goes, I'm interested in learning how efficient one-hot encoding is done under the hood, and I have a use for the package as well. Here are some tests from my current tokenizer work (a rough one-hot sketch follows them):
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn on_word_splitter() {
fn word_split(c: char) -> bool {
match c {
'\n' | '|' | '-' => true,
_ => false,
}
}
let res = TokenizerNaive::word_splitter("HelLo|tHere", &word_split);
assert_eq!(res, vec!["HelLo", "tHere"])
}
#[test]
fn on_tokens_lower_filter() {
fn tokens_filter(c: char) -> bool {
match c {
'-' | '|' | '*' | ')' | '(' | '&' => true,
_ => false,
}
}
let res = TokenizerNaive::tokens_lower_with_filter("|HelLo tHere", &tokens_filter);
assert_eq!(res, " hello there");
let res1 = TokenizerNaive::tokens_lower_with_filter("HelLo|tHere", &tokens_filter);
assert_eq!(res1, "hello there");
let res2 = TokenizerNaive::tokens_lower_with_filter("HelLo tHere", &tokens_filter);
assert_eq!(res2, "hello there");
let res6 =
TokenizerNaive::tokens_lower_with_filter("****HelLo *() $& )(tH*ere", &tokens_filter);
assert_eq!(res6, " hello $ th ere");
}
#[test]
fn on_pre_process() {
let res = TokenizerNaive::pre_process("Hotel & Ristorante Bellora");
assert_eq!(res, "hotel ristorante bellora");
let res1 = TokenizerNaive::pre_process("Auténtico Hotel");
assert_eq!(res1, "auténtico hotel");
let res2 = TokenizerNaive::pre_process("Residence Chalet de l'Adonis");
assert_eq!(res2, "residence chalet de l adonis");
let res6 = TokenizerNaive::pre_process("HOTEL EXCELSIOR");
assert_eq!(res6, "hotel excelsior");
let res6 = TokenizerNaive::pre_process("Kotedzai Trys pusys,Pylimo ");
assert_eq!(res6, "kotedzai trys pusys pylimo");
let res6 = TokenizerNaive::pre_process("Inbursa Cancún Las Américas");
assert_eq!(res6, "inbursa cancún las américas");
}
#[test]
fn on_tokens_alphanumeric() {
let res3 = TokenizerNaive::tokens_alphanumeric("|HelLo tHere");
assert_eq!(res3, " HelLo tHere");
let res4 = TokenizerNaive::tokens_alphanumeric("HelLo|tHere");
assert_eq!(res4, "HelLo tHere");
let res5 = TokenizerNaive::tokens_alphanumeric("HelLo * & )(tHere");
assert_eq!(res5, "HelLo tHere");
}
#[test]
fn on_tokens_lower() {
let res = TokenizerNaive::tokens_lower_str("HelLo tHerE");
assert_eq!(res, "hello there")
}
#[test]
fn on_tokens_simple() {
assert_eq!(
TokenizerNaive::chars("hello there"),
["h", "e", "l", "l", "o", " ", "t", "h", "e", "r", "e"]
);
assert_eq!(
TokenizerNaive::chars("hello there").concat(),
String::from("hello there")
)
}
#[test]
fn on_similarity_identity() {
assert_eq!(TokenCmp::new_from_str("hello", "hello").similarity(), 100);
}
#[test]
fn on_similarity_high() {
assert_eq!(TokenCmp::new_from_str("hello b", "hello").similarity(), 83);
assert_eq!(
TokenCmp::new_from_str("this is a test", "this is a test!").similarity(),
97
);
assert_eq!(
TokenCmp::new_from_str("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear").similarity(),
91
);
}
#[test]
fn on_token_sequencer() {
let an = AlphaNumericTokenizer;
let one = an.sequencer("Marriot &Beaches Resort|").join(" ");
let two = an.sequencer("Marriot& Beaches^ Resort").join(" ");
assert_eq!(one, two);
}
#[test]
fn on_token_sort() {
let s1 = "Marriot Beaches Resort foo";
let s2 = "Beaches Resort Marriot bar";
assert_eq!(TokenCmp::new_from_str(s1, s2).similarity(), 62);
let sim = token_sort(s1, s2, &TokenCmp::new_sort, &TokenCmp::similarity);
assert_eq!(sim, 87);
}
#[test]
fn on_token_sort_again() {
let s1 = "great is scala";
let s2 = "java is great";
assert_eq!(TokenCmp::new_from_str(s1, s2).similarity(), 37);
let sim = token_sort(s1, s2, &TokenCmp::new_sort_join, &TokenCmp::similarity);
assert_eq!(sim, 81);
}
#[test]
fn on_amstel_match_for_nate() {
let sabre = "INTERCONTINENTAL AMSTEL AMS";
let ean = "InterContinental Amstel Amsterdam";
assert_eq!(TokenCmp::new_from_str(sabre, ean).similarity(), 20);
assert_eq!(TokenCmp::new_from_str(sabre, ean).partial_similarity(), 14);
assert_eq!(
token_sort(sabre, ean, &TokenCmp::new_sort, &TokenCmp::similarity),
79
);
assert_eq!(
token_sort(
sabre,
ean,
&TokenCmp::new_sort,
&TokenCmp::partial_similarity
),
78
);
}
#[test]
fn on_partial_similarity_identity() {
let t = TokenCmp::new_from_str("hello", "hello");
assert_eq!(t.partial_similarity(), 100);
}
#[test]
fn on_partial_similarity_high() {
let t = TokenCmp::new_from_str("hello b", "hello");
assert_eq!(t.partial_similarity(), 100);
}
#[test]
fn on_similarity_and_whitespace_difference() {
let t1 = TokenCmp::new_from_str("hello bar", "hello");
let t2 = TokenCmp::new_from_str("hellobar", "hello");
let sim1 = t1.similarity();
let sim2 = t2.similarity();
assert_ne!(sim1, sim2);
assert!(sim1 < sim2);
assert_eq!(sim1, 71);
assert_eq!(sim2, 77);
}
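And the rough one-hot sketch mentioned above, using ndarray (dense for clarity; a real crate would probably want a sparse representation):

use ndarray::Array2;
use std::collections::HashMap;

// Build a vocabulary from the tokens, then mark one column per token per row.
fn one_hot(tokens: &[&str]) -> (Array2<f32>, HashMap<String, usize>) {
    let mut vocab: HashMap<String, usize> = HashMap::new();
    for t in tokens {
        let next = vocab.len();
        vocab.entry((*t).to_string()).or_insert(next);
    }
    let mut m = Array2::<f32>::zeros((tokens.len(), vocab.len()));
    for (row, t) in tokens.iter().enumerate() {
        m[[row, vocab[*t]]] = 1.0;
    }
    (m, vocab)
}

fn main() {
    let (m, vocab) = one_hot(&["hotel", "resort", "hotel"]);
    assert_eq!(m.shape(), &[3, vocab.len()]);
    println!("{}", m);
}
|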
This is a very cool idea :) Question: what would a general tokenizer API look like? We might want to brainstorm a list of goals / requirements for the design before we start writing code. Maybe in another issue?
Do you think it would be possible to do something with a trait-based approach here? Like the Rust pattern of building up a stack of combinators - you get a lot of composability that way. |
@kazimuth yes, I think that would be the way: allow the user to compose a tokenizer. I do have a trait-based approach (ideas I got from this Text-Analysis-in-Rust-Tokenization) in my current project, but those are in service of tokenizing for comparing token similarity. With full-blown tokenization, an API should support allowing a user to compose the various things they need (e.g., a char filter, normalizing text, etc.) like your example. The hard part I'm really referring to is the output of the tokenization. For example, I have a function returning Vec<std::borrow::Cow<'a, str>>; I'm new enough to Rust to still not totally understand all the consequences of using Cow. For comparison, the scikit-learn vectorizers fold tokenization and the text-to-numbers step into one object:
texts = ["foo bar", "bar foo zaz", "did bar", "zaz bar jazz", "good jazz zaxx"]
tfidf = TfidfVectorizer(min_df=2, max_df=0.5, ngram_range=(1, 2))
features = tfidf.fit_transform(texts)
pd.DataFrame(features.todense(), columns=tfidf.get_feature_names())
d_vtz = CountVectorizer()
print(d_vtz.fit_transform(texts))
h_vtz = HashingVectorizer()
print(h_vtz.fit_transform(texts))
It seems what one would want in Rust is a tokenizer that returns vectors of tokens that can just be "plugged in" to lots of different ways of turning text into numbers.
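A strawman of that "plugged in" shape - a Tokenizer trait producing Cow tokens, plus a CountVectorizer-ish consumer that depends only on the trait (all names illustrative):

use std::borrow::Cow;
use std::collections::HashMap;

trait Tokenizer {
    fn tokenize<'a>(&self, text: &'a str) -> Vec<Cow<'a, str>>;
}

struct Whitespace;

impl Tokenizer for Whitespace {
    fn tokenize<'a>(&self, text: &'a str) -> Vec<Cow<'a, str>> {
        text.split_whitespace().map(Cow::Borrowed).collect()
    }
}

// Any text-to-numbers step can consume the trait without knowing the impl.
fn count<T: Tokenizer>(tok: &T, texts: &[&str]) -> HashMap<String, usize> {
    let mut counts = HashMap::new();
    for t in texts {
        for token in tok.tokenize(t) {
            *counts.entry(token.into_owned()).or_insert(0) += 1;
        }
    }
    counts
}

fn main() {
    let counts = count(&Whitespace, &["foo bar", "bar foo zaz"]);
    assert_eq!(counts["bar"], 2);
}
|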
I'd like to see more individual specialised components that form part of an ML pipeline, rather than anything monolithic attempting to implement too much at once. This gives Rust a chance to build up its ML strengths, slowly replacing individual parts of a mature ML pipeline. Having e.g. Python bindings to those components would also allow them to start getting used and proving their benefit, without needing a 100% switch to Rust. Modules/components I'd love to see:
|
An article I found very interesting, from 2 years ago, is this one: http://athemathmo.github.io/2016/09/07/typesystem-machine-learning.html
My loyalty is divided, to say the least: I'd love to be able to host 100% of my workflow in Rust, because I strongly believe in the language's potential and in the potential of the tooling around it.
Thanks to the strong packaging and distribution story provided by Rust, the effort of fleshing out algorithms and preprocessing tools can be extremely distributed: once there is a set of agreed-upon traits as interfaces, we can leverage the influx of people who are fascinated and allow them to be productive and develop new crates without having to worry about the fundamentals. |
@kazimuth that Jupyter kernel is usable; I'm starting to learn AI with it here:
Seems to me one of the more difficult problems of doing this in Rust is getting common types and traits defined for the different packages to interface with. If I'm not mistaken @LukeMathWalker, you seem to point towards using ndarray as basically NumPy. I'm all on board with that. What if there were something like a core package that defined some of the core traits, structs, and types? I can see lots of pros/cons for doing that. |
@jbowles RE: tokenizer API - this gets at a broader problem with a simple function-style interface. It might make sense to just start implementing without a core crate of traits, and once we've smacked into enough walls in the design space, we can figure out what the interfaces to our systems tend to look like, and retrofit a core design around that.
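For reference, the "core crate" under discussion might start as nothing more than a few traits over ndarray types - purely a strawman, with names borrowed loosely from scikit-learn:

use ndarray::{Array1, Array2};

trait Transformer {
    fn transform(&self, x: &Array2<f64>) -> Array2<f64>;
}

trait Estimator {
    type Model: Predictor;
    fn fit(&self, x: &Array2<f64>, y: &Array1<f64>) -> Self::Model;
}

trait Predictor {
    fn predict(&self, x: &Array2<f64>) -> Array1<f64>;
}

Whether the input should be a concrete Array2 or something generic is exactly the kind of wall worth smacking into early. |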
Although I'm not sure that Rust is going to usurp Python and C++ as the de-facto ML programming model, it's definitely a worthy goal. Along those lines, I think that flashlight (and the underlying arrayfire library) has an interface that we might want to emulate. In any case, the real key feature of PyTorch and JAX is the expressivity of Python backed by a high-performance JIT tensor compiler. I'm pretty sure it's possible to do something similar in Rust by writing a compiler plugin that tracks the types+ops of ndarrays and provides the data to a JIT compiler. Maybe something like:
#[jit]
fn mlp(
data: &Array<2, f32>,
weights: Vec<&Array<2, f32>>,
labels: &Array<1, u8>
) -> f32 {
let fc1 = data.dot(weights[0]); // fn dot -> Array<D, T, Op=gemm>
Array::pointwise_max(0, fc1) // Array<D, T, Op=Max<0, fc1>>
}

This is just a sketch and depends on how const generics actually pan out, but the idea is that a compiler plugin can find the #[jit] functions and feed their tracked ops to a JIT compiler. |
Hey everyone, just an FYI: a few weeks ago I proposed a new working group for machine learning on the internals call for new working groups. This discussion has me very excited for the future of Rust! Let me know if there's anything I can help with. |
Very interesting discussion; looking forward to seeing more ML libraries in Rust. On the tch-rs side, the biggest drawback is that the current API doesn't enforce immutability properly (see this issue) and so is unsafe in that respect. It hasn't been much of an issue so far, but it would be nicer to fix it; hopefully this could be done with a bit of an API change. I guess the goal is less ambitious than what has been proposed so far in this thread, but maybe this could provide a decent alternative in between, especially for cases where interop with the Python API of PyTorch is important. |
Just wanted to say: keep up the good work! The binding looks very promising. I played a bit with it a few weeks ago and it was still a bit hard to get data in and out of tensors (for someone new to the crate). Anyway, really looking forward to using tch-rs! |
@danieldk just to mention that |
@LaurentMazare that's really nice! However, the current API requires a copy of the tensor data to go from a Torch tensor to an ndarray array. For Tensorflow there is https://docs.rs/ndarray-tensorflow/0.2.0/ndarray_tensorflow/ - this allows one to use and modify Tensorflow tensors through ndarray views. |
The whole approach of working with graphs of operations is horrible. It's in effect the badly-typed guts of a compiler. Humans shouldn't ever touch that DAG; it's an internal artefact of rewriting terms that describe the user domain into something the computer can run efficiently. I would really like to see good support for differentiating expressions. Here is a nice bit of work done in Haskell: https://arxiv.org/pdf/1804.00746.pdf - I'm not sure it is directly applicable to Rust, but perhaps it is. The key intuition is that complex differentiable expressions can be built up from a small core language of trivially differentiable expressions and combinators over these.
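The "small core language plus combinators" intuition is easy to sketch for a single variable (a toy, not the paper's construction):

// Core language: constants, the variable, addition, multiplication.
#[derive(Clone)]
enum Expr {
    Const(f64),
    X,
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

use Expr::*;

// Each core form is trivially differentiable; composites follow by recursion.
fn d(e: &Expr) -> Expr {
    match e {
        Const(_) => Const(0.0),
        X => Const(1.0),
        Add(a, b) => Add(Box::new(d(a)), Box::new(d(b))),
        // product rule
        Mul(a, b) => Add(
            Box::new(Mul(Box::new(d(a)), b.clone())),
            Box::new(Mul(a.clone(), Box::new(d(b)))),
        ),
    }
}

fn eval(e: &Expr, x: f64) -> f64 {
    match e {
        Const(c) => *c,
        X => x,
        Add(a, b) => eval(a, x) + eval(b, x),
        Mul(a, b) => eval(a, x) * eval(b, x),
    }
}

fn main() {
    // f(x) = x*x + 2x, so f'(3) = 2*3 + 2 = 8
    let f = Add(
        Box::new(Mul(Box::new(X), Box::new(X))),
        Box::new(Mul(Box::new(Const(2.0)), Box::new(X))),
    );
    assert_eq!(eval(&d(&f), 3.0), 8.0);
}
|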
I think there are some good reasons to expose the computation DAG, and most of them have something to do with data locality. While there are many semantically equivalent ways to distribute the computations of a DAG, the programmer may know which ops to keep on the same device and which can be efficiently moved. I'd love a system that does that optimization for me, but barring that, we need to be able to say things like "these two ops go on the same device", and I think that means we need an exposed DAG at some point. Also, I just skimmed that whole paper and I don't really understand what it's promising to improve. Admittedly I'm not a mathematician and couldn't really follow the proofs, but the central assertion seems to be that one can use a type system like Haskell's to make it nicer to implement FAD and RAD. It's a cleaner implementation, way less code! But that doesn't seem like the hard part to me, or the important part. If the implementation of my ops required 100x as much code to write and used dirty mutation tricks under the hood, but the final product ran 4% faster, I'd be completely happy to make that tradeoff. I'm probably missing something important, though; please let me know if that's the case! |
I'm not a good enough coder to write 100x more code and have any confidence that it is working correctly ;) The issue of where an operation executes is an interesting one. It is somewhat orthogonal to what operation is executing, although clearly some families of operations will run faster in one place than another, and there are locality issues in reducing the latency of moving contexts. I think in a more "get the computer to work it out" framework, you'd have first-class things that are the devices that expose operations, and you'd compose those operations. You could also have some enum of providers to be compiled down to dyns, minimizing some expression cost function perhaps. This is where tagless encodings of operations really shine, since you can "run" transformations on the expressions themselves. But I've not yet tried anything like this in Rust. As I say, it may not be the right abstraction for the language, or may unavoidably require higher-kinded types or intrusive macros to make it work, which gets us back towards the 100x extra code. |
Actually, 100x is extreme. I think at that point I might be getting interested in trading off a few points for a code reduction. Usually, though, I am willing to write a lot more code, and go through a lot of tedium, to drag out a few more points of speedup. The machine time just costs a lot more than my time; the tradeoff is almost always worth it, and any system that leaves perf on the table is not an option for my use case. Though for a lot of use cases I'm sure it would be fine. Probably a more important question than which side of the tradeoff you want to take is: can we find a zero-cost abstraction for AD? IMO that is the true spirit of Rust; we won't really be done until we've undermined the whole idea that there is a convenience/efficiency tradeoff. [1] As for AD, I actually don't see that "how you derive your AD" and "whether the end-user API is to construct a computation graph or a computable expression" are much related. It feels like the idea in the paper is that there is a trait or monad that each computable expression can implement which would allow the automatic recovery of exact derivatives. I suppose the same idea could be applied to "ops" (edges of a computation graph). [1] https://boats.gitlab.io/blog/post/zero-cost-abstractions/ |
I think one of the things to keep in mind is that this isn't an either-or scenario for this kind of work. We could easily expose the underlying structures and have crates owned by a WG that abstract those layers away, or vice versa. |
I'm very excited to hear about everything discussed here, since I'm a fan of Rust. |
@deg4uss3r Just joining the discussion at the moment, but I'm a little curious about plans moving forward, since this issue has been fairly quiet this past year. I saw that the machine learning WG application got put off a little bit, since GameDev was given priority in the queue (although it got started and seems to be moving along as of late last year), but wanted to see if there was a timeline for reaching back out to the larger Rust organization regarding resubmitting or getting new consideration of the existing application.

Also, to toss in my own two cents, since I haven't seen anyone else with quite my direction of interest: as opposed to a data science or NLP background, my primary interest is in the application of ML to control systems, particularly soft real-time control with respect to visual or stereo imagery. Rust is great for a lot of this. It's extremely quick, cross-compiling for embedded ARM boards is really easy, and the error handling is typically fairly ergonomic for catching and dealing with problems, with `?` and/or `Result` into `match` statements. I've done a little work writing clustering algorithms for RGB images and for cross-library compatibility between image formats to integrate with OpenCV, both for the `opencv-rust` binding and directly across the FFI boundary, but am really interested in being able to deploy a control system enabled by neural nets, deep learning, etc. natively from Rust.

I also want to point this out: while Rust has a reputation for being difficult for new programmers, I consider it to be my first real programming language, apart from writing some bash scripts. One of the reasons I was able to do this is that the documentation surrounding Rust, in terms of both the language and standard library as well as most of the core crates through the ecosystem, is superb, and the tooling is both simple and powerful. As a result, I've been able to pick up topics like image processing and file serialization much more quickly than if I were trying to sort through the C++11/14/17 standard library, and to brute-force things that would have taken much too long otherwise in a language like Python or MATLAB. From first installing the language to flashing `#![no_std]` programs onto a microcontroller, the structures in place for learning how to write code that is safe, fast, and easy to understand in a number of domains have made a world of difference.

In that thread, I think that potentially makes Rust a really good choice for anyone interested in integrating some machine learning into their systems, and I would love to see an end result that enables a similar story. My personal idea would be something similar to Embedded Rust's "Discovery" Book, starting from "hi there, please add these 1-2 dependencies to your `Cargo.toml` file", through building a classifier that does a decent job at classifying the MNIST data set, or something else along those lines. It also seems like having a Book like that would be really helpful for giving both members of the Rust ML community and programmers at large a jumping-off point from which they can start branching into other areas, and, in the immediate future, present a concrete objective around which to organize and drive towards.

Obviously, much, much easier said than done, but hey, that's what I'd like to (help) build. |
@quietlychris, I have a lot of the same ideas as you and have really been looking forward to getting something going here. To me, edge ML and soft real-time control systems are areas where Rust can excel. I have actually been meaning to write a blog post about it, as IMO this is the place where a killer Rust framework can be created (like Go has Docker, Ruby has Rails, etc.). |
@quietlychris I definitely still plan on leading the working group, but I haven't heard anything from the Governance WG either. As I understood it, they would ping us on the ticket when they were ready to help guide the next official WG into this space. I know they had a lot of things to figure out on the Governance side (as I check in on their channel from time to time), so I am unsure if this is still on their list for after they establish themselves or if it has accidentally fallen by the wayside.

As for your follow-on points, that is the direction I would like to go as well: start with "nothing" and slowly write the Rust ML Book around the classic first problem of classifying MNIST. As we go along and find libraries that are out of date/missing/could be improved, we do so, hopefully taking ownership of them or at a minimum revitalizing them along the way. @Shock-1 I also agree with edge, since it's been blogged about a lot and (as @quietlychris said) it's fairly easy to get Rust onto some micro-controllers, and even mobile devices. I am also REALLY excited about using Rust's error handling here.

My plan forward from today is to ping Florian on the ticket about the creation of the working group, but I want to give that a week or so, since I know Mozilla just had its winter all-hands and it's best to let the dust settle before asking for new/more work ;). I hope to have something for this discussion group in about a week, no more than two. And, for what it is worth, I do not think the other proposed group (cryptography) has moved forward officially yet either. |
@deg4uss3r That sounds like a plan to me. I don't see any particular rush in getting the WG structure up and running right at this very moment, but touching base with the Governance group when things are a little quieter does seem like a solid next step. As I mentioned, I'm still fairly new to ML, so I'm planning on personally laying out the MNIST classification problem in Python (and maybe MATLAB/Octave or Julia) to make sure I have a strong handle on what the somewhat canonical workflows and results look like with those tools. Your comment about working on or building libraries along the way is right along the lines of what I was thinking as well, and something I would definitely be on board with. I believe that while const generics aren't going to be stable in the immediate future, they're fairly well into the RFC pipeline (rust-lang #44580), which as I understand it could be helpful for writing numerical backends.

@Shock-1 It's wonderful to see other people interested in the same sort of thing, and coming from a similar background (my formal education was also in mechanical engineering, although my control systems courses mostly focused on classical theory like PID controllers and dynamical systems, not neural nets or other ML components). I honestly hadn't much considered edge deployments in terms of smartphones, but it makes a lot of sense in leveraging Rust's multithreading to pull performance out of some of those hexa- and octacore processors showing up. Your insight about new Rust users coming in from Python is valuable too - I hadn't seen that information before. As I mentioned a little earlier, I'm going to be spending some time brushing up on other languages' frameworks, but will hopefully have something available for you to contribute to at some point moving forward. |
Hello, it's been a while since I last checked up on the discussion here, but I'm glad it kept going and that many people are interested in both hybrid and pure Rust DL. I've written up my thoughts about the current state of DL and what I think would be required for a pure-Rust DL MVP. It may be of interest to someone here. |
@tiberiusferreira Thanks for posting that write-up! It was great to see a fairly comprehensive look at some of those topics. I think your insights into moving computations onto GPUs make a lot of sense in terms of the options that are available at the moment; I had looked into some of the Rust-native options before. I'm just starting to work through PyTorch's system in Python, but once I get a little more comfortable with understanding that, I'm definitely going to be looking into replicating it from Rust. |
Hi All, |
It was mentioned before by @davechallis, but has any work been done on scaling and normalisation? I'm about to start implementing a new machine-learning-based crate, and I find myself reaching for L1/L2 norms again and thinking there should be a crate with fundamentals like that in it.
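For example, row-wise L2 normalisation over an ndarray matrix is only a few lines - exactly the kind of fundamental such a crate could own (a sketch, not an existing API):

use ndarray::{Array2, Axis};

// Scale each row to unit L2 norm, leaving all-zero rows untouched.
fn l2_normalize(mut x: Array2<f64>) -> Array2<f64> {
    for mut row in x.axis_iter_mut(Axis(0)) {
        let norm = row.iter().map(|v| v * v).sum::<f64>().sqrt();
        if norm > 0.0 {
            row.mapv_inplace(|v| v / norm);
        }
    }
    x
}
|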
I work on data analysis. I am a fan of Rust and Julia. I recently wrote the hnsw_rs crate and am considering writing a parallel stochastic gradient descent.
Just now my point of view is to write the base blocks in Rust and keep the upper part in Julia. Interfacing both languages is easy, and keeping things interactive is really nice.
Surely some collaboration will occur. |
@jean-pierreBoth Nice! I'd love to see how you are using Julia + Rust... hope to look through HnswAnn.jl soon. |
@jean-pierreBoth have you seen the rust-cv org? They have an hnsw crate and might be working on other areas you're interested in: https://github.com/rust-cv/hnsw. Could be worth collaborating - many hands make light work, etc. |
@deg4uss3r I saw your post on the WG governance issue and @XAMPPRocky's response about emailing the core team, but was wondering if you've heard anything back through another channel that I might have missed. Obviously, there's a lot going on at the moment, but in large part because of that, my own availability for working on projects like this has increased a bit for the foreseeable future. Of course, it's totally understandable that while that might be the case for some, other people in the chain might not be in the same position, so if this needs to take a backseat for the moment, not a problem. |
@quietlychris I don't have a follow-up on that response, but I wouldn't block your work on the core approval. There's no reason you can't organise and start having meetings and discussions. I'm not part of the approval process, but I think it would probably go quicker if it could be shown that a group was already regularly active and working, since one of the concerns often brought up around creating new teams/groups is whether they will remain active after the initial interest. Being able to show that a group is regularly productive would largely address that. |
@quietlychris Sorry for the delay, but @XAMPPRocky is correct: I have not heard anything. I'd say let's start hashing out what we want the group to look like, where to organize, how, etc. on another thread. |
@XAMPPRocky @deg4uss3r Okay, that's good to know, and makes a lot of sense! I started a new issue with some of those points for discussion. I'm looking forward to hearing what people have to say. Thanks! |
Just noticed that my ideas around building out a high-level neural net framework similar to Keras were not at all original :) @koute - how much progress have you made on your end with your high-level interface? @deg4uss3r and @quietlychris - my main area of interest/focus is a pure Rust implementation (as opposed to bindings for TensorFlow et al.), given the guarantees on memory safety implicit in Rust as well as the ability to use next-gen cross-platform CPU/GPU interfaces directly (i.e. Vulkan Compute and SPIR-V) without having to go through intermediate languages based on C/C++ (like SYCL). I'm aware that things are still getting organised, but I noticed that there is already a Deep Learning workstream defined. I've always viewed the neural net as the base structure of deep learning, so I wondered if this would naturally fit into that workstream or if it should be captured as a separate (but dependent) one. What's your view? |
Welcome!
I created this repository as a discussion hub for the ML ecosystem in Rust, "following" a talk I gave at the Rust meetup in London (slides).
I do believe that Rust has great potential in this area, but to fully realize this potential we need to provide building blocks: we need to tackle those shared challenges that, once removed, will enable more and more people to just come to Rust and build what they want to build.
The three building blocks I do see as fundamental for an ML ecosystem are:
I have spent the last year, when it comes to open-source contributions, enhancing n-dimensional arrays: direct contributions to ndarray, statistical routines on top of it (ndarray-stats) and tutorials to help people get into the Rust scientific ecosystem from Python, Julia or R. I do believe that ndarray is in more than good shape when it comes to fulfilling NumPy's role in the Rust ecosystem.
There is now movement as well when it comes to dataframes - a discussion is taking place at rust-dataframe/discussion#1 to explore use cases and potential designs. (The idea of opening this repository comes directly from this experiment of community-led design for dataframes.)
Given that one of the two data structures usually consumed by ML models is ready (n-dimensional arrays) and the other one is baking (dataframes), I think it's time to start thinking about what to do with the ML-specific piece.
I don't want to steer the debate too much with the opening post (I'll chip in once the discussion starts), but the questions I'd like to see tackled are: