This repository has been archived by the owner. It is now read-only.

What do we want to build? #1

Open
LukeMathWalker opened this issue Apr 28, 2019 · 108 comments

Comments

Member

@LukeMathWalker LukeMathWalker commented Apr 28, 2019

Welcome!

I created this repository as a discussion hub for the ML ecosystem in Rust, "following" a talk I gave at the Rust meetup in London (slides).

I do believe that Rust has great potential in this area, but to fully realize this potential we need to provide building blocks: we need to tackle those shared challenges that, once removed, will enable more and more people to just come to Rust and build what they want to build.

The three building blocks I do see as fundamental for an ML ecosystem are:

  • n-dimensional arrays;
  • dataframes;
  • an ML model interface.

I have spent the last year, when it comes to open-source contributions, enhancing n-dimensional arrays: direct contributions to ndarray, statistical routines on top of it (ndarray-stats) and tutorials to help people get into the Rust scientific ecosystem from Python, Julia or R. I do believe that ndarray is in more than good shape when it comes to fulfilling NumPy's role in the Rust ecosystem.

There is now movement as well when it comes to dataframes - a discussion is taking place at rust-dataframe/discussion#1 to explore use cases and potential designs. (The idea of opening this repository comes directly from this experiment of community-led design for dataframes).

Given that one of the two data structures that are usually consumed by ML models is ready (n-dimensional arrays) and the other one is baking (dataframes) I think it's time to start thinking about what to do with the ML-specific piece.

I don't want to steer the debate too much with the opening post (I'll chip in once the discussion starts), but the questions I'd like to see tackled are:

  • what use-cases could make Rust shine in the ML ecosystem?
  • what are the basic capabilities that have to be built to enable the usage of Rust for ML workloads?
  • how should we structure such a project? A core library with few traits and a set of separate crates tackling different aspects? A large battery-included scikit-learn equivalent?
  • why do you want to use Rust for ML?

@Kibouo Kibouo commented May 1, 2019

I want to note that, while it works great, https://github.com/twistedfall/opencv-rust is not particularly user-friendly or 'clean' in rust terms.

Maybe we could have a look at it?


@flo-dhalluin flo-dhalluin commented May 1, 2019

I think the use case that could make Rust shine is deployment. Currently the de facto "mainstream" stack is Python-based (scikit-learn, np, pandas + your DL framework of choice: TensorFlow, Torch...). It shines for fast prototyping, because Python, but it sucks for industrialization (and deployment), because... Python. I really think Rust would do great. In that area, I kinda like TensorFlow Serving, but it forces you to have a separate service (that you call with their protobuf/RPC).
So:

  • nice conventions for training/inference
  • standard ways of serializing and loading models, and exposing them to more "enterprisey" stacks, either with some kind of FFI (e.g. jvm <-> jni) or RPC.
  • with all the goodies required for industrial setups (monitoring, robustness, ease of deployment...).

Member

@jbowles jbowles commented May 1, 2019

I'm currently building a large project with rust (i mention it here: https://users.rust-lang.org/t/interest-for-nlp-in-rust/15331/9), where I am doing the data engineering in rust (lots of string metrics). [tldr; I found lots of disparate projects with 50% of what I needed for string metrics but instead rolled my own, trying to incorporate previous work and give credit] I want to feed the feature vectors to Julia to experiment with what I want to use for classification and modelling, and then I'll want to be able to use rust for inference/classification etc.... I had to pause development for business reasons, but I'm starting again: one of my biggest issues was not ML related but finding a nice pattern for parallel file download (seems like it should be simple, but maybe I'm spoiled by go's simplicity lol).

From this real-world project point of view, as well from my time spent thinking in the abstract and surveying the ML ecosystem in rust (about a year), I would think that a focus on data engineering in general and serving models is the way to go (this also seems to be a widely shared sentiment). In a practical sense I would like to see rust jobs for data engineer and machine learning engineer... that is, the bookends of a typical data science project; serving the data and serving the model.

That is, targeting software developers, infrastructure, computational math, and data people. Trying to convince research scientists to use Rust would be wasted effort; for most of these people software is a secondary skill and so they need something easy to learn, dynamically typed, with a REPL... I've watched this play out in the Python/R/Matlab versus Julia world... and while IMO Julia has a lot to offer current python/r/matlab devs and is similar enough to those languages, trying to get that group of people to use Julia is not easy, i can't imagine what it'd be like proposing Rust.

Here are some challenges I see:

  • Dataframes: figuring out what to do with missing data is a challenge (i watched julia community struggle with that this last year).
  • LinearAlgebra: ndarray, nalgebra are both active projects... is there duplicated effort? (there are others as well).
  • Rust types more friendly for math: I've seen the power in Julia of being able to specify AbstractArray as a type, or have a Real as a type, that allows you to build generic functions that accept a vector of float32 or float64.
  • Swift: Google and numerous well-known people (Chris Lattner [LLVM, Swift], Jeremy Howard [fast.ai]) have put their support behind Swift for TensorFlow. IMO Swift has a really long way to go. But for Rust, tackling areas that the swift-for-tf project is not focusing on is good.
  • Support for Julia: integration with Python is a necessity; but if there is a competitor for the research scientist in the Python world it is Julia, and I'd imagine keeping an eye on playing well with Julia could be a benefit. Competition here is hard to forecast, and Julia and Rust are on really different ends of the spectrum; while Julia pushes solving the "two language" problem, I see no problem using Rust and Julia in the same project. I doubt competition is an issue, not like with Swift.
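
As a rough illustration of the "types more friendly for math" point, the sketch below writes functions once and uses them with both f32 and f64. The `Real` trait here is a hypothetical, hand-rolled stand-in defined only for this example; crates such as num-traits offer a fuller abstraction along these lines.

```rust
use std::ops::{Add, Div, Mul};

/// Hypothetical `Real` trait: only the arithmetic this sketch needs.
pub trait Real: Copy + Add<Output = Self> + Mul<Output = Self> + Div<Output = Self> {
    fn zero() -> Self;
    fn from_usize(n: usize) -> Self;
}

impl Real for f32 {
    fn zero() -> Self {
        0.0
    }
    fn from_usize(n: usize) -> Self {
        n as f32
    }
}

impl Real for f64 {
    fn zero() -> Self {
        0.0
    }
    fn from_usize(n: usize) -> Self {
        n as f64
    }
}

/// Arithmetic mean of a slice; `None` for an empty slice.
pub fn mean<T: Real>(xs: &[T]) -> Option<T> {
    if xs.is_empty() {
        return None;
    }
    let sum = xs.iter().fold(T::zero(), |acc, &x| acc + x);
    Some(sum / T::from_usize(xs.len()))
}

/// Dot product of two slices; `None` if the lengths differ.
pub fn dot<T: Real>(a: &[T], b: &[T]) -> Option<T> {
    if a.len() != b.len() {
        return None;
    }
    Some(a.iter().zip(b).fold(T::zero(), |acc, (&x, &y)| acc + x * y))
}
```

The same `mean` and `dot` accept `&[f32]` and `&[f64]` without duplication, which is close in spirit to accepting an AbstractArray of Real in Julia, although the trait bounds have to be spelled out explicitly.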

Member

@jbowles jbowles commented May 1, 2019

I do believe that ndarray is in more than good shape when it comes to fulfilling NumPy's role in the Rust ecosystem.

Really looking forward to digging into ndarray. Though I've had a slight delay, I'm writing up ndarray examples for the Grokking Deep Learning book where andrew trask introduces deep learning with only numpy. He's expressed interest and welcomed the examples... :)


@soaxelbrooke soaxelbrooke commented May 1, 2019

A standardized tokenization implementation!

Tokenization fills the role of "turn the text into fixed vectors" that you'd feed into standard models. As an NLP practitioner and Rust user, tokenization is an incredibly important step in the pipeline, a big barrier to new people trying to apply NLP, and a place where lots of small bugs creep in due to non-standard implementations that take forever to find. Having a standard implementation for the simpler tokenization methods (like regex matching) would make NLP problems much more approachable in Rust.
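
To make the "simpler tokenization methods" concrete, here is a minimal sketch using only the standard library. The function names `tokenize` and `encode` are made up for this example, and splitting on every non-alphanumeric character is an assumption that stands in for a regex rule.

```rust
use std::collections::HashMap;

/// Split text into lowercase tokens on any non-alphanumeric character.
pub fn tokenize(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|tok| !tok.is_empty())
        .map(|tok| tok.to_lowercase())
        .collect()
}

/// Map tokens to integer ids, growing the vocabulary as new tokens appear.
pub fn encode(tokens: &[String], vocab: &mut HashMap<String, usize>) -> Vec<usize> {
    tokens
        .iter()
        .map(|tok| {
            // A new token gets the next free id; a known token keeps its id.
            let next_id = vocab.len();
            *vocab.entry(tok.clone()).or_insert(next_id)
        })
        .collect()
}
```

Even a rule this small hides decisions (what to do with apostrophes, digits, or case), which is where the non-standard implementations mentioned above tend to diverge.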


@DhruvDh DhruvDh commented May 1, 2019

One part of machine learning where Rust could shine right now is simulation for Reinforcement learning.

For instance, if I am training an agent to play blackjack, the biggest bottleneck here is the "playing" blackjack over and over by the agent to collect enough data for training.

Rayon and Actix could be used to create fast and performant game "environments" now, without need for an established ML ecosystem.


@yngtodd yngtodd commented May 1, 2019

I agree with @DhruvDh, using Rust to simulate environments for RL agents would be great.

Having something akin to OpenAI's gym interface would be really nice. Many RL researchers are going to still want to use Python and all the associated deep learning libraries. So, I would love to see RL environments rendered in Rust that could be interfaced with both Python and Rust for agents.
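
A gym-like interface could look roughly like the sketch below. Everything in it (the `Environment` trait, `Step`, the toy `Corridor` environment and `run_episode`) is hypothetical and only loosely modelled on the reset/step convention; it is not an existing crate's API.

```rust
/// Result of a single environment step.
#[derive(Debug, Clone, PartialEq)]
pub struct Step<O> {
    pub observation: O,
    pub reward: f64,
    pub done: bool,
}

/// Hypothetical gym-style interface: reset, then step until `done`.
pub trait Environment {
    type Observation;
    type Action;
    fn reset(&mut self) -> Self::Observation;
    fn step(&mut self, action: Self::Action) -> Step<Self::Observation>;
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Move {
    Left,
    Right,
}

/// Toy environment: walk along a corridor; the goal is the rightmost cell.
pub struct Corridor {
    length: usize,
    position: usize,
}

impl Corridor {
    pub fn new(length: usize) -> Self {
        assert!(length >= 2, "corridor needs at least two cells");
        Corridor { length, position: 0 }
    }
}

impl Environment for Corridor {
    type Observation = usize;
    type Action = Move;

    fn reset(&mut self) -> usize {
        self.position = 0;
        self.position
    }

    fn step(&mut self, action: Move) -> Step<usize> {
        self.position = match action {
            Move::Left => self.position.saturating_sub(1),
            Move::Right => (self.position + 1).min(self.length - 1),
        };
        let done = self.position == self.length - 1;
        // Small penalty per move, reward of 1.0 on reaching the goal.
        let reward = if done { 1.0 } else { -0.01 };
        Step { observation: self.position, reward, done }
    }
}

/// Run one episode with the given policy and return the total reward.
pub fn run_episode<E: Environment>(
    env: &mut E,
    mut policy: impl FnMut(&E::Observation) -> E::Action,
    max_steps: usize,
) -> f64 {
    let mut observation = env.reset();
    let mut total = 0.0;
    for _ in 0..max_steps {
        let step = env.step(policy(&observation));
        total += step.reward;
        observation = step.observation;
        if step.done {
            break;
        }
    }
    total
}
```

Because the trait uses associated types, the same `run_episode` loop works for any environment, and a Python binding layer would only need to wrap `reset` and `step`.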

Edit: I imagine that algorithms like Monte Carlo Tree Search would be really useful if they were written in Rust. I would not want to wait on Python to handle that bit.


@masonk masonk commented May 1, 2019

if I am training an agent to play blackjack, the biggest bottleneck here is the "playing" blackjack over and over by the agent to collect enough data for training.

Along these lines, I am working on a hobby project (link), which does this. It isn't quite ready for even an alpha release yet, but I am in the final stages of cleaning up the API with the intent to publish it.


@masonk masonk commented May 1, 2019

Things Rust definitely needs:

  • const generics
  • 16-bit floats
  • GATs (for efficient, non-copying iterators)

Things that we might want but I'm not sure:

  • Standard Inference + Train traits
  • Standard data frames trait
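
To show why const generics matter for ML code: a minimal form of the feature has since been stabilized (Rust 1.51), and it allows array shapes to be checked at compile time. The `Matrix` type below is a made-up illustration, not any crate's API.

```rust
/// Fixed-size matrix whose shape is part of its type.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Matrix<const R: usize, const C: usize> {
    pub data: [[f64; C]; R],
}

impl<const R: usize, const C: usize> Matrix<R, C> {
    pub fn new(data: [[f64; C]; R]) -> Self {
        Matrix { data }
    }

    /// (R x C) times (C x K) gives (R x K); mismatched inner dimensions
    /// are rejected by the type checker rather than at run time.
    pub fn matmul<const K: usize>(&self, rhs: &Matrix<C, K>) -> Matrix<R, K> {
        let mut out = [[0.0; K]; R];
        for i in 0..R {
            for j in 0..K {
                for k in 0..C {
                    out[i][j] += self.data[i][k] * rhs.data[k][j];
                }
            }
        }
        Matrix { data: out }
    }

    /// Transpose swaps the two shape parameters in the return type.
    pub fn transpose(&self) -> Matrix<C, R> {
        let mut out = [[0.0; R]; C];
        for i in 0..R {
            for j in 0..C {
                out[j][i] = self.data[i][j];
            }
        }
        Matrix { data: out }
    }
}
```

Multiplying a 2x3 matrix by another 2x3 matrix with this type fails to compile, which is the kind of shape error that otherwise only shows up when the code runs.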


@kazimuth kazimuth commented May 1, 2019

I've been thinking of building a rust deep learning / GPU compute library on top of the TVM framework for a while now. I think it could address a lot of the things @flo-dhalluin is talking about. TVM's an amazing project that's currently flying a bit under the radar. It's an open source deep learning compiler - it compiles deep neural nets / large array operations to run on the GPU (or on OpenCL, or FPGA, or TPU, or WebGL...). You define an AST of computations via its API, and it spits out a small (<5mb) shared library containing just the operations you wanted, on whatever acceleration framework and target platform you want.

It currently has a working Rust runtime library, which lets you call a compiled model from Rust. It integrates with ndarray, and will let you e.g. take in an ndarray::Array, move it to a GPU, run whatever numerical operations you want on it, and get the result back as an ndarray::Array again.

That's pretty neat, and I don't think it would be too hard to build some really cool tools on top of it. My dream is something like:

lib.rs:

// a crate based on tvm
// `cargo build` will (by default) download + checksum a prebuilt TVM library
// that this links to, so that you don't have to wait for a whole compiler to compile.
// The download will only be ~50mb -- way smaller and easier than lots of other deep
// learning frameworks. It will also support running code on things besides cuda!
// The output binary won't need to link the compiler (by default) and will therefore be
// only a few megabytes.
extern crate tvmrs;

// a procedural macro that converts Rust code to Relay IR.
// Relay IR is TVM's high-level IR for defining neural networks / computation chains,
// sorta like a tensorflow Graph. It's also not too dissimilar to Rust.
// The macro will compile the IR with TVM at build-time, and link the resulting artifacts
// to this rust library.
tvmrs::accelerate! {

  // stateless operation
  fn relu_downsample(x: Tensor[c, n, h, w]) -> Tensor[c, n, h/2, w/2] {
     relu(downsample(x))
  }

  // stateful operation
  struct Block<oc> {
    conv: Conv2d<3,3,oc>,
    elu: Elu
  }
  impl Op for Block<oc> {
    fn run(self, input: Tensor[c, n, h, w]) -> Tensor[oc, n, h, w] {
       self.elu(self.conv(input))
    }
  }

  fn swap_channels(x: Tensor[2, n, h, w]) -> Tensor[2, n, h, w] {
    // a low-level tensor operation defined as a TVM Tensor expression.
    let out = compute!(x.shape, |cc, nn, hh, ww| x[(cc + 1) % 2, nn, hh, ww]);
    out
  }
 
  // a sequential network container.
  sequential! Network {
     #[opencl] Block<3,3,5>, // run on opencl
     #[opencl] relu_downsample,
     #[opencl] Conv2d::new(1,1,2),
     #[rust] debug,    // call a normal rust function
     #[cpu] swap_channels // run this part on CPU to maximize throughput
  }

  // Compute a derivative of the network.
  // Relay IR is designed to be differentiable.
  derivative! NetworkDerivative (Network);
}

// a normal rust function
fn debug(x: Tensor) {
  ...
}

train.rs:

fn main() {
  tvmrs::training_loop! {
    net: Network,
    dnet: NetworkDerivative,
    epochs: 37,
    training_data: dataset! {...},
    valid_data: dataset! {...},
    ...
  }
}

run.rs:

fn main() {
   let input = tvmrs::ndarray_from_stdin();
   let output = Network::load_params("params.bin").run(input);
   println!("{:?}", output);
}

(Further reading: Introduction to Relay, TVM Tensor expressions)

All of this is of course pending mountains of bikeshedding, i have no idea what the final API will look like.

One of the nifty things here is that this isn't limited to deep learning models. TVM can handle pretty much any algorithm made of large array operations. So if you wanted to run your SVM on GPU, you can do that pretty easily!

Steps to take here:

  • Talk to the TVM people and see what they think of all this. We could do this work under their umbrella or in a fresh project.
  • Write Rust bindings to the TVM compiler (instead of just the runtime). TVM is written in C++ but is designed to be easy to bind, a lot of the work has already been done here.
  • Design an API like my sketch above that wraps the bindings in some way that makes them easy to use for training + deployment.
  • Build up cargo tooling to allow e.g. prebuilt binary downloads, TVM's auto-tuner support, etc.
  • Beef up TVM's autodifferentiation support. TVM can differentiate Relay IR, but a lot of derivatives aren't actually implemented yet. We could also roll our own autodifferentiation system and just use TVM for compilation; I'd prefer to avoid duplicating work tho.
  • Start writing non-deep-learning algorithms with this system as well, to kick the tires.

If people are interested in this implementation path we could throw a repo together and start work.

I mainly want this because I don't want to be stuck using Python and CUDA all the time for my deep learning research :)))


@koute koute commented May 1, 2019

A few months ago I started a crate of my own for deep learning. My goal is to have a library which:

  • Supports both inference and training.
  • Supports the most common deep neural network architectures.
  • Is GPU accelerated.
  • Doesn't use CUDA.
  • Supports every mainstream platform (Linux, MacOS, iOS, Android, Windows, WebAssembly) and hardware (AMD, NVIDIA, Intel GPUs) with a single codebase, and uses the same kernels for consistent results.
  • Is written in pure Rust so that it's trivial to cross-compile.
  • Has a simple to use Keras-like API.
  • Is small and simple enough that it can be reasonably understood and tested end-to-end. (Otherwise you risk situations like e.g. with TensorFlow where for two whole versions their dropout layer was completely broken.)

It's currently totally useless. Right now I'm in the process of adding a Vulkan backend (I have a few thousand lines of work-in-progress code on my disk which I've not pushed yet); once I finish that in a few weeks I plan to build it up further so that I can train CIFAR-10 up to at least ~90% accuracy, add some model import/export functionality (probably from/to the ONNX format), and only then will it be actually usable for something practical.

Some people would call this a waste of time and effort, and, well, I do agree that it would be probably more productive to not do this completely from scratch as I'm doing (e.g. by using TVM as kazimuth said), but I don't really care - I'm just trying to scratch my own itch.


@DhruvDh DhruvDh commented May 1, 2019

@kazimuth while I love the snippets you've shown here, a lot of my love for Rust exists because of all the compile-time checks the compiler does, and the wonderfully easy to comprehend error messages. I feel that if one is using Rust just as a way to compose and run functionality defined in other languages, then there isn't much to gain here. Might as well just use Python.

And TVM looks more like a tool for deploying neural nets rather than training them; which is very useful but I would prefer to do both in Rust.

There's also tch-rs - bindings to PyTorch's libtorch.

Something else that is also interesting is dual_num, which, as best I understand it, is some fancy math (dual numbers) that might eventually let us do automatic differentiation.
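
For readers unfamiliar with the idea, here is a minimal sketch of forward-mode automatic differentiation with dual numbers. The `Dual` type and `derivative` helper are written from scratch for illustration and are not the dual_num crate's API.

```rust
use std::ops::{Add, Mul};

/// A dual number a + b*eps with eps^2 = 0: `value` carries f(x),
/// `deriv` carries f'(x).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Dual {
    pub value: f64,
    pub deriv: f64,
}

impl Dual {
    /// The input variable: its derivative with respect to itself is 1.
    pub fn variable(x: f64) -> Self {
        Dual { value: x, deriv: 1.0 }
    }

    /// A constant: its derivative is 0.
    pub fn constant(c: f64) -> Self {
        Dual { value: c, deriv: 0.0 }
    }

    /// Chain rule for sine.
    pub fn sin(self) -> Self {
        Dual { value: self.value.sin(), deriv: self.deriv * self.value.cos() }
    }
}

impl Add for Dual {
    type Output = Dual;
    /// Sum rule.
    fn add(self, rhs: Dual) -> Dual {
        Dual { value: self.value + rhs.value, deriv: self.deriv + rhs.deriv }
    }
}

impl Mul for Dual {
    type Output = Dual;
    /// Product rule.
    fn mul(self, rhs: Dual) -> Dual {
        Dual {
            value: self.value * rhs.value,
            deriv: self.deriv * rhs.value + self.value * rhs.deriv,
        }
    }
}

/// Evaluate the derivative of `f` at `x` in a single forward pass.
pub fn derivative(f: impl Fn(Dual) -> Dual, x: f64) -> f64 {
    f(Dual::variable(x)).deriv
}
```

For f(x) = x^3 + 2x, `derivative(|x| x * x * x + Dual::constant(2.0) * x, 2.0)` evaluates to 14.0. Forward mode needs one pass per input variable, so training large networks usually relies on reverse mode instead.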


@DhruvDh DhruvDh commented May 1, 2019

@koute the long term road-map is amazing but I don't get why bother putting effort into the tensorflow backend. Admittedly I don't have enough know-how to imagine what a native backend would look like and the kind of work it would need.


@koute koute commented May 1, 2019

@DhruvDh The TensorFlow backend will be most likely removed in the future. Currently it is there for a few reasons:

  • I wanted to quickly get something working to experiment with, and to be able to first work on the general interface of the library (e.g. defining the neural network graph, getting data in and out, etc.)
  • I can use it to write a comprehensive test suite and then cross-check that with my own backend. ML algorithms are very hard to write correctly, so I want the extra insurance not only that my algorithms match what I have on paper, but also that they match another widely used framework. (Although from the number of bugs I've encountered when dealing with TensorFlow, it probably would have been better to pick a different framework...)

Member

@jbowles jbowles commented May 1, 2019

Some cool stuff coming to light. Is anyone familiar with work presented at c4ML? https://www.c4ml.org/
I don't think any of the presentations were using Rust... but certainly this is a space Rust could be competitive with. With that in mind, are any of the Rust compiler team interested in ML?

Here are some references to work being done in Swift and Julia (Note, Rust, Swift, Julia were all top of the list for google's tensorflow project that eventually became swift-for-tf).
(e.g., automatic differentiation, differentiable programming... https://github.com/tensorflow/swift/blob/master/docs/AutomaticDifferentiation.md, https://juliacomputing.com/blog/2019/02/19/growing-a-compiler.html). Swift MLIR (https://drive.google.com/file/d/1hUeAJXcAXwz82RXA5VtO5ZoH8cVQhrOK/view) and Julia Zygote (https://www.julialang.org/blog/2018/12/ml-language-compiler).

I don't know of any projects in Rust along these lines ^^ ... of course, they are also all funded (google, and julia computing).


@DhruvDh DhruvDh commented May 1, 2019

@koute yeah makes sense.

@jbowles There was this internals thread about Automatic Differentiation here.

Member

@jbowles jbowles commented May 1, 2019

@ehsanmok may be interested in this discussion ^^

thanks @DhruvDh


@kazimuth kazimuth commented May 1, 2019

@DhruvDh that's a fair criticism, but really that's a problem whenever you want to use a hardware accelerator. You're always going to be calling into a language with different semantics from the host. Using Rust for glue gives you type-safety, performance, and lovely tooling. e.g. it's dead-simple to write a parallel image preprocessing pipeline in Rust, whereas with python you need a load of hacks (FFI, multiprocessing) to get acceptable performance. Also, you're free to define new low-level operations in Rust; users shouldn't ever need to use another language :)
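
As a small illustration of the parallel preprocessing claim, the sketch below normalizes a batch of images across worker threads using only the standard library (`std::thread::scope`). The function names and the "scale 8-bit pixels to [0, 1]" step are assumptions made for the example; in practice a crate like Rayon would typically replace the manual chunking.

```rust
use std::thread;

/// Scale 8-bit pixel values to the range [0, 1].
pub fn normalize(image: &[u8]) -> Vec<f32> {
    image.iter().map(|&p| f32::from(p) / 255.0).collect()
}

/// Normalize every image, splitting the batch across `workers` threads.
/// Output order matches input order.
pub fn preprocess_parallel(images: &[Vec<u8>], workers: usize) -> Vec<Vec<f32>> {
    let workers = workers.max(1);
    // Ceiling division so every image lands in some chunk; at least 1
    // because `chunks(0)` would panic.
    let chunk_size = ((images.len() + workers - 1) / workers).max(1);

    thread::scope(|scope| {
        // Spawn all workers first so they run concurrently.
        let handles: Vec<_> = images
            .chunks(chunk_size)
            .map(|chunk| {
                scope.spawn(move || {
                    chunk.iter().map(|img| normalize(img)).collect::<Vec<_>>()
                })
            })
            .collect();

        // Join in spawn order to preserve the original ordering.
        handles
            .into_iter()
            .flat_map(|handle| handle.join().expect("worker thread panicked"))
            .collect()
    })
}
```

Scoped threads can borrow `images` directly, so no copying or reference counting is needed, and the borrow checker rules out data races at compile time.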

And yeah, currently TVM's publicity is oriented around deployment, because that's where there's a gap in the python ecosystem. There's no reason their compiler wouldn't work for training too, though.

@jbowles I've worked with some of those projects; see my comment, I think we can borrow some of that work.

also CC @nhynes


@kazimuth kazimuth commented May 1, 2019

Other thought: I wonder what interactive scientific programming would look like in Rust? There's a jupyter kernel but i'm not sure how usable it is.

It might be that rust should just be used for high-performance kernels and stuff, and be easy to call from other languages like you lay out in your presentation @LukeMathWalker.

Member Author

@LukeMathWalker LukeMathWalker commented May 1, 2019

Wow, there really is a lurking interest 😛 This is just great.

The discussion has explored several different directions, I'd like to give more details on what I do envision (and where that need comes from).

I strongly align with @flo-dhalluin: I think Rust can really shine in delivering an end-to-end production workflow.
Rust has incredible potential when it comes to the beginning (data pipelines, preprocessing) and the end (performance web servers, using multiple protocols) of the ML workflow.
Establishing early on a way to get the whole workflow is going to be a key prerequisite for adoption - filling a painful gap in the ML ecosystem at large, delivering a top-notch experience with great tooling.

Tackling this challenge requires the building blocks I mentioned (n-dimensional arrays, dataframes) and some others that have been brought up (e.g. running code on different types of hardware, easy interop, reading/writing to a lot of different formats).

Certain capabilities can be borrowed from other languages, others we should probably port and develop natively in Rust (a sufficiently large zoo of preprocessing techniques and standard models).

While I do understand the interest in the Deep Learning area, I don't think it's realistic to kickstart an effort to make Rust a primary language for NN development: we should definitely be able to deploy and run NN models (the TVM project is an excellent example here), but I don't think we would be adding a lot of value by chasing huge projects like TensorFlow or PyTorch.
There are a lot of things in the TensorFlow ecosystem, instead, that are extremely interesting (e.g. TensorFlow serving) but they do end up locking you into TensorFlow itself: if we could replicate those conveniences in a framework-agnostic fashion, we could definitely capture a need in that space.

Summing it up, the minimum working prototype that I have in mind to show off what Rust can do goes along these lines:

  • Huge datasets as input;
  • Heavy-weight, massively parallel data preprocessing pipeline (e.g. NLP or images would be good candidates);
  • Very simple model to be trained on top of the pipeline output;
  • Configuration-based deployment of the serialized model using Rocket: you just define very basic things in a YAML file (e.g. HTTP vs gRPC, monitoring, logging, etc.) and you get a fully working web server that serves your model. This will have to rely on a sufficiently general Model trait.

If we could manage to get the experience right, I am quite sure the interest in Rust for these kinds of use cases would skyrocket.


@koute koute commented May 1, 2019

While I do understand the interest in the Deep Learning area, I don't think it's realistic to kickstart an effort to make Rust a primary language for NN development: we should definitely be able to deploy and run NN models (the TVM project is an excellent example here), but I don't think we would be adding a lot of value by chasing huge projects like TensorFlow or PyTorch.

I agree, however, you're looking at it from a perspective of a data scientist who wants to fill in the gaps of their existing workflow and augment their ML pipeline with Rust. I'm looking at it from a perspective of a Rust developer who just wants to augment their existing application with a little ML without going through the hoops of exporting their data, processing it through a mainstream ML framework, and serializing it back so that it can be used by the application again.

In other words - my personal interest lies in not filling a gap in the existing ML ecosystem (although that's also most certainly worthwhile!), but in filling a gap in the Rust ecosystem by creating value for existing Rust users (and perhaps the users of other languages) so that they could take advantage of ML in a plug-and-play fashion with minimal amount of fuss. (Which is why things like wide hardware and platform support, simplicity, lack of non-Rust dependencies so it's easy to build and cross-compile, etc. is important.)

Member

@jbowles jbowles commented May 1, 2019

I can volunteer work to rust-ml for tokenizers, string distance metrics, and/or onehot encoding package. I've already been working on the first two as I have real-world projects that need these so I can double up. As far as a onehot package I'm interested to learn more how efficient onehot encoding is done under the hood and have a use for the package as well.

  • string distance metrics (jaro, jaro-winkler, ngram, qgram, ratcliff-obershelp)

  • tokenizers: for one, rust is awesome for writing tokenizers. But IME it's kinda hard to write general tokenizers since their use is often highly dependent on per-project needs (for example I wrote this [https://github.com/jbowles/nlpt-tkz] and used it for a project and it's not found much use since). Or if there were consensus on using something like the NLTK tokenizers as a guide, I don't mind working on those either. If there is a need for things like the examples below then I can cherry-pick these out of my current project (a hotel and product matching thing) for a rust-ml package... these were written specifically for string comparison and not the typical tokenization found in NLP pipelines, but it would not be too hard to adapt them to accept and return a specific data type...

#[cfg(test)]
mod tests {
    use super::*;
    #[test]
    fn on_word_splitter() {
        fn word_split(c: char) -> bool {
            match c {
                '\n' | '|' | '-' => true,
                _ => false,
            }
        }
        let res = TokenizerNaive::word_splitter("HelLo|tHere", &word_split);
        assert_eq!(res, vec!["HelLo", "tHere"])
    }
    #[test]
    fn on_tokens_lower_filter() {
        fn tokens_filter(c: char) -> bool {
            match c {
                '-' | '|' | '*' | ')' | '(' | '&' => true,
                _ => false,
            }
        }
        let res = TokenizerNaive::tokens_lower_with_filter("|HelLo tHere", &tokens_filter);
        assert_eq!(res, " hello there");

        let res1 = TokenizerNaive::tokens_lower_with_filter("HelLo|tHere", &tokens_filter);
        assert_eq!(res1, "hello there");

        let res2 = TokenizerNaive::tokens_lower_with_filter("HelLo tHere", &tokens_filter);
        assert_eq!(res2, "hello there");

        let res6 =
            TokenizerNaive::tokens_lower_with_filter("****HelLo *() $& )(tH*ere", &tokens_filter);
        assert_eq!(res6, "    hello     $    th ere");
    }

    #[test]
    fn on_pre_process() {
        let res = TokenizerNaive::pre_process("Hotel & Ristorante Bellora");
        assert_eq!(res, "hotel ristorante bellora");

        let res1 = TokenizerNaive::pre_process("Auténtico Hotel");
        assert_eq!(res1, "auténtico hotel");

        let res2 = TokenizerNaive::pre_process("Residence Chalet de l'Adonis");
        assert_eq!(res2, "residence chalet de l adonis");

        let res6 = TokenizerNaive::pre_process("HOTEL EXCELSIOR");
        assert_eq!(res6, "hotel excelsior");

        let res6 = TokenizerNaive::pre_process("Kotedzai Trys pusys,Pylimo ");
        assert_eq!(res6, "kotedzai trys pusys pylimo");

        let res6 = TokenizerNaive::pre_process("Inbursa Cancún Las Américas");
        assert_eq!(res6, "inbursa cancún las américas");
    }

    #[test]
    fn on_tokens_alphanumeric() {
        let res3 = TokenizerNaive::tokens_alphanumeric("|HelLo tHere");
        assert_eq!(res3, " HelLo tHere");

        let res4 = TokenizerNaive::tokens_alphanumeric("HelLo|tHere");
        assert_eq!(res4, "HelLo tHere");

        let res5 = TokenizerNaive::tokens_alphanumeric("HelLo * & )(tHere");
        assert_eq!(res5, "HelLo       tHere");
    }

    #[test]
    fn on_tokens_lower() {
        let res = TokenizerNaive::tokens_lower_str("HelLo tHerE");
        assert_eq!(res, "hello there")
    }

    #[test]
    fn on_tokens_simple() {
        assert_eq!(
            TokenizerNaive::chars("hello there"),
            ["h", "e", "l", "l", "o", " ", "t", "h", "e", "r", "e"]
        );
        assert_eq!(
            TokenizerNaive::chars("hello there").concat(),
            String::from("hello there")
        )
    }

    #[test]
    fn on_similarity_identity() {
        assert_eq!(TokenCmp::new_from_str("hello", "hello").similarity(), 100);
    }

    #[test]
    fn on_similarity_high() {
        assert_eq!(TokenCmp::new_from_str("hello b", "hello").similarity(), 83);
        assert_eq!(
            TokenCmp::new_from_str("this is a test", "this is a test!").similarity(),
            97
        );
        assert_eq!(
            TokenCmp::new_from_str("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear").similarity(),
            91
        );
    }
    #[test]
    fn on_token_sequencer() {
        let an = AlphaNumericTokenizer;
        let one = an.sequencer("Marriot &Beaches Resort|").join(" ");
        let two = an.sequencer("Marriot& Beaches^ Resort").join(" ");
        assert_eq!(one, two);
    }
    #[test]
    fn on_token_sort() {
        let s1 = "Marriot Beaches Resort foo";
        let s2 = "Beaches Resort Marriot bar";
        assert_eq!(TokenCmp::new_from_str(s1, s2).similarity(), 62);
        let sim = token_sort(s1, s2, &TokenCmp::new_sort, &TokenCmp::similarity);
        assert_eq!(sim, 87);
    }
    #[test]
    fn on_token_sort_again() {
        let s1 = "great is scala";
        let s2 = "java is great";
        assert_eq!(TokenCmp::new_from_str(s1, s2).similarity(), 37);
        let sim = token_sort(s1, s2, &TokenCmp::new_sort_join, &TokenCmp::similarity);
        assert_eq!(sim, 81);
    }
    #[test]
    fn on_amstel_match_for_nate() {
        let sabre = "INTERCONTINENTAL AMSTEL AMS";
        let ean = "InterContinental Amstel Amsterdam";
        assert_eq!(TokenCmp::new_from_str(sabre, ean).similarity(), 20);
        assert_eq!(TokenCmp::new_from_str(sabre, ean).partial_similarity(), 14);
        assert_eq!(
            token_sort(sabre, ean, &TokenCmp::new_sort, &TokenCmp::similarity),
            79
        );

        assert_eq!(
            token_sort(
                sabre,
                ean,
                &TokenCmp::new_sort,
                &TokenCmp::partial_similarity
            ),
            78
        );
    }

    #[test]
    fn on_partial_similarity_identity() {
        let t = TokenCmp::new_from_str("hello", "hello");
        assert_eq!(t.partial_similarity(), 100);
    }

    #[test]
    fn on_partial_similarity_high() {
        let t = TokenCmp::new_from_str("hello b", "hello");
        assert_eq!(t.partial_similarity(), 100);
    }

    #[test]
    fn on_similarity_and_whitespace_difference() {
        let t1 = TokenCmp::new_from_str("hello bar", "hello");
        let t2 = TokenCmp::new_from_str("hellobar", "hello");
        let sim1 = t1.similarity();
        let sim2 = t2.similarity();
        assert_ne!(sim1, sim2);
        assert!(sim1 < sim2);
        assert_eq!(sim1, 71);
        assert_eq!(sim2, 77);
    }
}


@kazimuth kazimuth commented May 2, 2019

Summing it up, the minimum working prototype that I have in mind to show off what Rust can do goes along these lines:

This is a very cool idea :)

Question: what would a general Model trait look like? I think the challenge is striking a balance between generality and specificity; you don't want to tie people down too much, but you need some sort of understanding of what you're doing to be able to use it in a general context.

We might want to brainstorm a list of goals / requirements for the design, before we start writing code. Maybe in another issue?

@jbowles

But IME it's kinda hard to write general tokenizers since their use is often highly dependent on per-project needs

Do you think it would be possible to do something with a trait-based approach here? Like, the rust pattern of building up a stack of combinators, you get Parallel<Lower<UnicodeSplitter<...>>> and it ends up with near-handwritten performance? I don't know much about NLP, so forgive me if I'm missing stuff here.
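To make the combinator idea concrete, here is a minimal sketch of what a stack like Lower<...> could look like. The Tokenizer trait and the WhitespaceSplitter, Lower and MinLen types are hypothetical names invented for this illustration; they do not come from any existing crate. For simplicity each stage returns a Vec<String>, so this version allocates between stages and would not by itself reach hand-written performance.

```rust
/// Hypothetical trait: one stage of a tokenization pipeline.
pub trait Tokenizer {
    /// Splits `text` into owned tokens.
    fn tokenize(&self, text: &str) -> Vec<String>;
}

/// Innermost stage: splits on Unicode whitespace.
pub struct WhitespaceSplitter;

impl Tokenizer for WhitespaceSplitter {
    fn tokenize(&self, text: &str) -> Vec<String> {
        text.split_whitespace().map(str::to_owned).collect()
    }
}

/// Combinator: lowercases every token produced by the wrapped stage.
pub struct Lower<T>(pub T);

impl<T: Tokenizer> Tokenizer for Lower<T> {
    fn tokenize(&self, text: &str) -> Vec<String> {
        self.0
            .tokenize(text)
            .into_iter()
            .map(|token| token.to_lowercase())
            .collect()
    }
}

/// Combinator: drops tokens shorter than `min_len` characters.
pub struct MinLen<T> {
    pub inner: T,
    pub min_len: usize,
}

impl<T: Tokenizer> Tokenizer for MinLen<T> {
    fn tokenize(&self, text: &str) -> Vec<String> {
        self.inner
            .tokenize(text)
            .into_iter()
            .filter(|token| token.chars().count() >= self.min_len)
            .collect()
    }
}
```

Because every combinator is generic over the stage it wraps, a value such as MinLen { inner: Lower(WhitespaceSplitter), min_len: 3 } is resolved statically, with no dynamic dispatch between stages.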

@jbowles
Member

@jbowles jbowles commented May 2, 2019

@kazimuth yes i think that would be the way; allow the user to compose a tokenizer.

The TokenizerNaive i showed above is naive specifically because it is not trait based; it does some text normalization for the user, allowing the user to build and pass in a function for char matching/filtering.

I do have a trait-based approach (ideas i got from this Text-Analysis-in-Rust-Tokenization) in my current project but those are in service of tokenizing for comparing token similarity.

With full-blown tokenization an API should support allowing a user to compose the various things they need (e.g., a char filter, normalizing text, etc...) like your example. The hard part I'm really referring to is the output of the tokenization. For example,

I have a function, sequencer, that returns a Vec of tokens:

Vec<std::borrow::Cow<'a, str>>;

First, I'm new enough to rust to still not totally understand all the consequences of using Cow :) ... and also instead of a Vec<> it likely needs to return a different kind of vector that plays well with onehot or word embeddings, etc... If you are familiar with python scikit-learn think of the "Vectorizers" it has for turning arrays of strings into arrays of numbers [IMO this is always the hardest part of NLP]

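import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer, HashingVectorizer, TfidfVectorizer

texts = ["foo bar", "bar foo zaz", "did bar", "zaz bar jazz", "good jazz zaxx"]

# TF-IDF over unigrams and bigrams found in at least 2 and at most half of the documents
tfidf = TfidfVectorizer(min_df=2, max_df=0.5, ngram_range=(1, 2))
features = tfidf.fit_transform(texts)
# get_feature_names_out() replaces get_feature_names() in scikit-learn >= 1.0
pd.DataFrame(features.todense(), columns=tfidf.get_feature_names_out())

# Raw token counts per document
d_vtz = CountVectorizer()
print(d_vtz.fit_transform(texts))

# Stateless hashing of tokens into a fixed number of columns
h_vtz = HashingVectorizer()
print(h_vtz.fit_transform(texts))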

It seems what one would want in rust is a tokenizer that returns vectors of tokens that can just be "plugged in" to lots of different ways to turn text into numbers.
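As a rough illustration of "plugging in" tokenizer output, here is a minimal count vectorizer sketch in plain Rust. The CountVectorizer type and its fit/transform methods are hypothetical and only mirror the scikit-learn names above; this is not an existing crate API. It assumes documents are already tokenized, and accepts any token type that can be viewed as a str.

```rust
use std::collections::BTreeMap;

/// Hypothetical count vectorizer: learns a vocabulary, then maps
/// tokenized documents to vectors of token counts.
pub struct CountVectorizer {
    /// token -> column index, assigned in sorted token order
    vocabulary: BTreeMap<String, usize>,
}

impl CountVectorizer {
    /// Builds the vocabulary from already-tokenized documents.
    pub fn fit<T: AsRef<str>>(docs: &[Vec<T>]) -> Self {
        let mut vocabulary: BTreeMap<String, usize> = BTreeMap::new();
        for doc in docs {
            for token in doc {
                vocabulary.entry(token.as_ref().to_owned()).or_insert(0);
            }
        }
        // A BTreeMap iterates in key order, so columns follow sorted tokens.
        for (index, column) in vocabulary.values_mut().enumerate() {
            *column = index;
        }
        CountVectorizer { vocabulary }
    }

    /// Counts known tokens in one document; unknown tokens are ignored.
    pub fn transform<T: AsRef<str>>(&self, doc: &[T]) -> Vec<u32> {
        let mut counts = vec![0u32; self.vocabulary.len()];
        for token in doc {
            if let Some(&column) = self.vocabulary.get(token.as_ref()) {
                counts[column] += 1;
            }
        }
        counts
    }
}
```

A real implementation would more likely return a sparse matrix or an ndarray type than a Vec<u32>; the dense vector here just keeps the sketch dependency-free.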

@davechallis

@davechallis davechallis commented May 2, 2019

I'd like to see more individual specialised components that form part of an ML pipeline, rather than anything monolithic attempting to implement too much at once.

This gives Rust a chance to build up its ML strengths, slowly replacing individual parts of a mature ML pipeline. Having e.g. python bindings to those components would also allow them to start getting used and proving benefit, without needing a 100% switch to Rust.

Modules/components I'd love to see:

  • text vectorisation (e.g. fast/parallel versions of count/tfidf vectorisers)
  • dimensionality reduction (e.g. PCA, tSNE)
  • scaling/normalisation
  • hyperparameter optimisation
  • data structure interop (e.g. to/from pandas/arrow/parquet etc.)

@LukeMathWalker
Member Author

@LukeMathWalker LukeMathWalker commented May 2, 2019

Question: what would a general Model trait look like? I think the challenge is striking a balance between generality and specificity; you don't want to tie people down too much, but you need some sort of understanding of what you're doing to be able to use it in a general context.

We might want to brainstorm a list of goals / requirements for the design, before we start writing code. Maybe in another issue?

An article I found very interesting, from 2 years ago, is this one: http://athemathmo.github.io/2016/09/07/typesystem-machine-learning.html
It's from the author of rusty-machine if I am not mistaken. We should definitely brainstorm a list of goals and requirements here before starting to write code out. It would also be worthwhile to see what features in the lang team pipeline could be useful for us.

I agree, however, you're looking at it from a perspective of a data scientist who wants to fill in the gaps of their existing workflow and augment their ML pipeline with Rust. I'm looking at it from a perspective of a Rust developer who just wants to augment their existing application with a little ML without going through the hoops of exporting their data, processing it through a mainstream ML framework, and serializing it back so that it can be used by the application again.

In other words - my personal interest lies in not filling a gap in the existing ML ecosystem (although that's also most certainly worthwhile!), but in filling a gap in the Rust ecosystem by creating value for existing Rust users (and perhaps the users of other languages) so that they could take advantage of ML in a plug-and-play fashion with minimal amount of fuss. (Which is why things like wide hardware and platform support, simplicity, lack of non-Rust dependencies so it's easy to build and cross-compile, etc. is important.)

My loyalty is divided, to say the least: I'd love to be able to host 100% of my workflow in Rust because I strongly believe in the language potential and in the potential of the tooling around it.
I wouldn't say though that our goals are at odds @koute : it's just a matter of deciding in which order we should be tackling challenges.
A good set of crates for preprocessing and deployment is going to be just as necessary for a purely Rust-based workflow as they are for a mixed-language workflow.
Once they are established, we can then shift focus to porting more and more models and algorithms to Rust.
I wholeheartedly agree with @davechallis:

I'd like to see more individual specialised components that form part of an ML pipeline, rather than anything monolithic attempting to implement too much at once.
This gives Rust a chance to build up its ML strengths, slowly replacing individual parts of a mature ML pipeline. Having e.g. python bindings to those components would also allow them to start getting used and proving benefit, without needing a 100% switch to Rust.

Thanks to the strong packaging and distribution story provided by Rust, the effort of fleshing out algorithms and preprocessing tools can be extremely distributed: once there is a set of agreed-upon traits as interfaces, we can leverage the influx of people who are fascinated and allow them to be productive and develop new crates without having to worry about the fundamentals.
That's why I think it's strategic to have a pure Rust implementation of DataFrames and n-dimensional arrays, for instance.
We don't need a huge monolith like SciPy or Scikit-learn.

@swfsql

@swfsql swfsql commented May 2, 2019

@kazimuth that Jupyter kernel is usable; I'm starting to learn ai with it in here:
https://github.com/swfsql/deep-learning-coursera (by oxidizing python code)
(Currently, only the first assignment is in Rust)

@jbowles
Member

@jbowles jbowles commented May 2, 2019

This gives Rust a chance to build up its ML strengths, slowly replacing individual parts of a mature ML pipeline. Having e.g. python bindings to those components would also allow them to start getting used and proving benefit, without needing a 100% switch to Rust.
💯

Seems to me one of the more difficult problems doing this in rust is getting common types and traits defined for different packages to interface with. If I'm not mistaken @LukeMathWalker you seem to point towards using Ndarray as basically numpy. I'm all on board with that.

What if there were something like a core package that defined some of the core traits and structs and types? I can see lots of pros/cons for doing that.

@kazimuth

@kazimuth kazimuth commented May 2, 2019

@jbowles RE: tokenizer API
Hm, I see the challenge there. Well, for one thing you should probably use Iterators in between operations instead of Vecs, or design a similar trait to Iterator; that should reduce the problem of having to have big buffers between each transformation. Then I think the path would be to pick-and-choose input requirements for each operation, and then operations output whatever they want. E.g. HashVectorizer takes impl Iterator<Item = impl Deref<Target = str>>, and then users can pass in an iterator of &str, String, Cow<str>, whatever.
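A minimal sketch of that signature, assuming a free function called hash_vectorize (a made-up name, not an existing API) that applies the hashing trick to whatever string-like tokens the iterator yields. It uses the standard library's DefaultHasher, whose exact output is not guaranteed to stay the same across Rust releases, so bucket assignments should not be persisted.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::ops::Deref;

/// Hypothetical hashing vectorizer: counts tokens into `n_buckets` columns
/// picked by hashing each token, so no vocabulary has to be stored.
pub fn hash_vectorize<I, S>(tokens: I, n_buckets: usize) -> Vec<u32>
where
    I: Iterator<Item = S>,
    S: Deref<Target = str>,
{
    assert!(n_buckets > 0, "need at least one bucket");
    let mut counts = vec![0u32; n_buckets];
    for token in tokens {
        // Hash the underlying str, so &str, String and Cow<str> all agree.
        let text: &str = &*token;
        let mut hasher = DefaultHasher::new();
        text.hash(&mut hasher);
        let bucket = (hasher.finish() % n_buckets as u64) as usize;
        counts[bucket] += 1;
    }
    counts
}
```

Since the function consumes an iterator, it can sit at the end of a lazy tokenization chain without any intermediate Vec of tokens.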

This gets at a broader problem with a simple function-y Model(Input) -> Output trait; it works for in-memory datasets, but once your dataset is large enough that you want to start streaming / distributing work over multiple machines, the abstraction sorta breaks down. We could instead do something graphy, where you just have nodes that ingest and spit out streams of data... but then we'll have to work with something graphy, with nodes that ingest and spit out streams of data :P
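For reference, the simple function-y shape could be sketched roughly as below. The Model trait and the MeanThreshold toy model are hypothetical and exist only for illustration; nothing here has been agreed on in this thread.

```rust
/// Hypothetical "function-y" model interface: one input in, one output out.
pub trait Model {
    type Input;
    type Output;

    /// Produces a prediction for a single input.
    fn predict(&self, input: &Self::Input) -> Self::Output;
}

/// Toy model: flags a feature vector whose mean exceeds a cutoff.
pub struct MeanThreshold {
    pub cutoff: f64,
}

impl Model for MeanThreshold {
    type Input = Vec<f64>;
    type Output = bool;

    fn predict(&self, input: &Self::Input) -> Self::Output {
        if input.is_empty() {
            // No features: treat as below the cutoff.
            return false;
        }
        let mean = input.iter().sum::<f64>() / input.len() as f64;
        mean > self.cutoff
    }
}
```

With an in-memory dataset this composes with ordinary iterators, e.g. rows.iter().map(|row| model.predict(row)). The trait says nothing about batching, streaming or distribution, which is exactly where the abstraction starts to strain.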

It might make sense to just start implementing without a core crate of traits, and once we've smacked into enough walls in the design space, we can figure out what the interfaces to our systems tend to look like, and retrofit a core design around that.

@nhynes

@nhynes nhynes commented May 2, 2019

Although I'm not sure that Rust is going to usurp Python and C++ as the de-facto ML programming model, it's definitely a worthy goal. Along those lines, I think that flashlight (and the underlying arrayfire library) has an interface that we might want to emulate.

In any case, the real key feature of PyTorch and JAX is the expressivity of Python backed by a high-performance JIT tensor compiler. I'm pretty sure it's possible to do something similar in Rust by writing a compiler plugin that tracks the types+ops of ndarrays and provides the data to a JIT compiler.

Maybe something like

#[jit]
fn mlp(
    data: &Array<2, f32>,
    weights: Vec<&Array<2, f32>>,
    labels: &Array<1, u8>
) -> f32 {
    let fc1 = data.dot(weights[0]); // fn dot -> Array<D, T, Op=gemm>
    Array::pointwise_max(0, fc1) // Array<D, T, Op=Max<0, fc1>> 
}

This is just a sketch and depends on how const generics actually pan out, but the idea is that a compiler plugin can find the #[jit] functions and either pre-compile them or add them to a runtime cache and replace the original definition with a call into the cache. This is not dissimilar to TVM's hybrid mode. We probably don't want to write a tensor compiler, so we could offload that to TVM and link in the static library.

@deg4uss3r
Member

@deg4uss3r deg4uss3r commented Jun 11, 2019

Hey everyone just an FYI a few weeks ago I proposed a new working group for machine learning on the internal calls for new working groups.

This discussion has me very excited for the future of Rust! Let me know if there's anything I can help with.

@LaurentMazare

@LaurentMazare LaurentMazare commented Jun 18, 2019

Very interesting discussion, looking forward to seeing more ML libraries in rust.
As a bit of a shameless plug I worked on PyTorch bindings for rust in the tch-rs crate.
On the pro side it supports a large part of PyTorch API (the binding code being automatically generated). Examples include various computer vision architectures (resnet, densenet, inception, etc), some RL (DQN and A2C to play atari games), yolo-v3 for object detection, etc... These can either be trained in pure rust (with gpu support) or weights from the python implementation can be imported.
So far working with Rust has been pretty enjoyable - it's certainly nice to have a good type system compared to implementing the same thing in python.

The biggest drawback is that the current api doesn't enforce immutability properly (see this issue) and so is unsafe in that respect. It hasn't been much of an issue so far but it would be nicer to fix this, hopefully this could be done with a bit of an api change.
Another missing bit is interop with ndarray, this would make exchanging data with rust a lot easier so I hope we'll get this pretty quickly.

I guess the goal is less ambitious than what has been proposed so far in this thread but maybe this could provide a decent alternative in between, especially for cases where interop with the python api of PyTorch is important.

@danieldk

@danieldk danieldk commented Jun 18, 2019

As a bit of a shameless plug I worked on PyTorch bindings for rust in the tch-rs crate.

Just wanted to say: keep up the good work! The binding looks very promising.

I played a bit with it a few weeks ago and it was still a bit hard to get data in and out of tensors (for someone new to the crate), so ndarray interop would be awesome.

Anyway, really looking forward to using tch-rs!

@LaurentMazare

@LaurentMazare LaurentMazare commented Jul 1, 2019

I played a bit with it a few weeks ago and it was still a bit hard to get data in and out of tensors (for someone new to the crate), so ndarray interop would be awesome.

@danieldk just to mention that ndarray interop has just been added to the main branch - it's not in crates.io yet though (and kudos to @grtlr for the PR).

@danieldk

@danieldk danieldk commented Jul 16, 2019

@LaurentMazare that's really nice! However, the current API requires a copy of the tensor data to go from a Torch tensor to an Array (and vice versa). For Tensorflow, we implemented a small wrapper around Tensor that provides an ndarray ArrayView:

https://docs.rs/ndarray-tensorflow/0.2.0/ndarray_tensorflow/

This allows one to use and modify Tensorflow Tensors in-place using the ndarray API.

@drdozer

@drdozer drdozer commented Sep 22, 2019

The whole approach of working with graphs of operations is horrible. It's in effect a badly-typed guts of a compiler. Humans shouldn't ever touch that DAG. It's an internal artefact of rewriting terms describing the user domain into something the computer can run efficiently.

I would really like to see good support for differentiating expressions. Here is a nice bit of work done in haskell: https://arxiv.org/pdf/1804.00746.pdf

I'm not sure it is directly applicable to rust, but perhaps it is. The key intuition is that complex differentiable expressions can be built up from a small core language of trivially differentiable expressions and combinators over these.
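To give a feel for that intuition in Rust, here is a minimal sketch using forward-mode differentiation with dual numbers. Note this is a swapped-in, simpler technique and not the categorical construction from the linked paper; the Dual type and its methods are invented for this example. Each primitive (constant, variable, +, *, sin) knows its own derivative, and composite expressions get theirs for free by composition.

```rust
use std::ops::{Add, Mul};

/// A value paired with its derivative with respect to one input variable.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Dual {
    pub val: f64,
    pub der: f64,
}

impl Dual {
    /// The independent variable x, for which dx/dx = 1.
    pub fn var(x: f64) -> Self {
        Dual { val: x, der: 1.0 }
    }

    /// A constant, whose derivative is 0.
    pub fn constant(c: f64) -> Self {
        Dual { val: c, der: 0.0 }
    }

    /// sin, with the chain rule applied: (sin u)' = u' * cos u.
    pub fn sin(self) -> Self {
        Dual { val: self.val.sin(), der: self.der * self.val.cos() }
    }
}

impl Add for Dual {
    type Output = Dual;
    /// Sum rule: (u + v)' = u' + v'.
    fn add(self, rhs: Dual) -> Dual {
        Dual { val: self.val + rhs.val, der: self.der + rhs.der }
    }
}

impl Mul for Dual {
    type Output = Dual;
    /// Product rule: (u * v)' = u' * v + u * v'.
    fn mul(self, rhs: Dual) -> Dual {
        Dual {
            val: self.val * rhs.val,
            der: self.der * rhs.val + self.val * rhs.der,
        }
    }
}
```

For example, with x = Dual::var(2.0), the expression x * x + Dual::constant(3.0) * x carries both f(2) and f'(2) for f(x) = x^2 + 3x, without any explicit graph being built by the user.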

@masonk

@masonk masonk commented Sep 23, 2019

I think there are some good reasons to expose the computation DAG, and most of them have something to do with data locality. While there are many semantically equivalent ways to distribute the computations of a DAG, the programmer may know which ops to keep on the same device and which can be efficiently moved.

I'd love a system that does that optimization for me, but barring that, we need to be able to say things like "these two ops go on the same device" and I think that means we need to have an exposed DAG at some point.
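A minimal sketch of what "exposed enough to pin ops to devices" might mean, with every name (Device, Node, Graph, cross_device_edges) made up for illustration. The graph is just a flat list of nodes that reference earlier nodes, each carrying an explicit device annotation chosen by the programmer.

```rust
/// Where a node's computation runs. Hypothetical, for illustration only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Device {
    Cpu,
    Gpu(u8),
}

/// One op in the graph, with the indices of the nodes it reads from.
pub struct Node {
    pub op: &'static str,
    pub inputs: Vec<usize>,
    pub device: Device,
}

/// A computation DAG stored as a flat list; a node may only read earlier nodes.
#[derive(Default)]
pub struct Graph {
    pub nodes: Vec<Node>,
}

impl Graph {
    /// Adds a node pinned to `device` and returns its index.
    pub fn add(&mut self, op: &'static str, inputs: &[usize], device: Device) -> usize {
        assert!(
            inputs.iter().all(|&i| i < self.nodes.len()),
            "inputs must refer to existing nodes"
        );
        self.nodes.push(Node { op, inputs: inputs.to_vec(), device });
        self.nodes.len() - 1
    }

    /// Counts edges whose two ends sit on different devices,
    /// i.e. the data transfers this placement implies.
    pub fn cross_device_edges(&self) -> usize {
        let mut transfers = 0;
        for node in &self.nodes {
            for &input in &node.inputs {
                if self.nodes[input].device != node.device {
                    transfers += 1;
                }
            }
        }
        transfers
    }
}
```

An automatic placer could minimise something like cross_device_edges; until one exists, the same structure lets the programmer state "these two ops go on the same device" directly.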

Also, I just skimmed that whole paper and I don't really understand what it's promising to improve. Admittedly I'm not a mathematician and couldn't really follow proofs, but the central assertion seems to be that one can use a type system like Haskell's to make it nicer to implement FAD and RAD. It's a cleaner implementation, way less code! But that doesn't seem like the hard part to me, or the important part. If the implementation of my ops required 100x as much code to write and use dirty mutation tricks under the hood, but the final product runs 4% faster, I'm completely happy to make that tradeoff.

I'm probably missing something important though, please let me know if that's the case!

[1] https://www.tensorflow.org/guide/extend/op

@drdozer

@drdozer drdozer commented Sep 23, 2019

I'm not a good enough coder to write 100x more code and have any confidence that it is working correctly ;) The issue of where an operation executes is an interesting one. It is somewhat orthogonal to what operation is executing, although clearly some families of operations will run faster in one place than another, and there are locality issues to reduce latency from moving contexts.

I think in a more "get the computer to work it out" framework, you'd have first-class things that are the devices that expose operations, and then compose these operations.

You could also have some enum of providers to be compiled down to dyns minimizing some expression cost function perhaps. This is where tagless encodings of operations really shine, since you can "run" transformations on the expressions themselves. But I've not yet tried anything like this in rust. As I say, it may not be the right abstraction for the language, or may unavoidably require higher kinded types or intrusive macros to make it work, which gets us back towards the 100x extra code.

@masonk

@masonk masonk commented Sep 23, 2019

Actually, 100x is extreme. I think at that point I might be getting interested in trading off a few points for a code reduction. Usually though I am willing to write a lot more code, and go through a lot of tedium, to drag out a few more points of speedup. The machine time just costs a lot more than my time; the tradeoff is almost always worth it, and any system that leaves perf on the table is not an option for my use case. Though for a lot of use cases I'm sure it would be fine.

Probably a more important question than which side of the tradeoff you want to take is: can we find a zero cost abstraction for AD? IMO that is the true spirit of Rust; we won't really be done until we've undermined the whole idea that there is a convenience/efficiency tradeoff. [1]

As for AD, I actually don't see that "how you derive your AD" and "whether the end-user API is to construct a computation graph or a computable expression" are much related. It feels like the idea in the paper is that there is a trait or monad that each computable expression can implement which would allow the automatic recovery of exact derivatives. I suppose the same idea could be applied to "ops" (edges of a computation graph).

[1] https://boats.gitlab.io/blog/post/zero-cost-abstractions/

@deg4uss3r
Member

@deg4uss3r deg4uss3r commented Sep 23, 2019

I think one of things to keep in mind is this isn’t an either or scenario for this kind of work. We could easily expose the underlying structures and have crates that are owned by a WG that abstract those layers away, or vice versa.

@dearsxx0918

@dearsxx0918 dearsxx0918 commented Oct 22, 2019

I'm very excited to hear about what you've discussed, since I'm a fan of Rust.
But back in the real world, it's hard to change the view of people who are using Python/C++ to develop ML-related applications. I think the most disappointing thing is that there are still no basic tools.
I would like to contribute to basic tools like opencv/cuda/tensorflow etc. Please contact me if there is any need.

@quietlychris
Member

@quietlychris quietlychris commented Feb 5, 2020

@deg4uss3r Just joining the discussion at the moment, but I'm a little curious about plans moving forward, since this issue has been fairly quiet since this past year. I saw that the machine learning WG application got put off a little bit, since GameDev was given priority in the queue (although it got started and seems to be moving along as of late last year), but wanted to see if there was a timeline for reaching back out to the larger Rust organization regarding resubmitting or getting new consideration of the existing application.

Also, to toss in my own two cents, since I haven't seen anyone else with quite my direction of interest: as opposed to a data science or NLP background, my primary interest is in the application of ML to control systems, particularly soft real-time control with respect to visual or stereo imagery. Rust is great for a lot of this. It's extremely quick, cross-compiling for embedded ARM boards is really easy, and the error handling is typically fairly ergonomic for catching and dealing with problems with ? and/or Result in match statements. I've done a little work writing clustering algorithms for RGB images and for cross-library compatibility between image formats to integrate with OpenCV both for the opencv-rust binding and directly across the FFI boundary, but am really interested in being able to deploy a control system enabled by neural nets, deep learning, etc. natively from Rust.
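As a small illustration of the error-handling point, here is a sketch of the two styles side by side: propagating a failure with ? and handling a Result with match. The function names and the comma-separated "sensor line" format are hypothetical, chosen only to resemble the kind of input a control loop might parse.

```rust
use std::num::ParseFloatError;

/// Parses a comma-separated line such as "0.5, 1.5" into readings.
/// The `?` operator returns early with the error from the first bad field.
pub fn parse_readings(line: &str) -> Result<Vec<f64>, ParseFloatError> {
    let mut readings = Vec::new();
    for field in line.split(',') {
        readings.push(field.trim().parse::<f64>()?);
    }
    Ok(readings)
}

/// Handles the Result with `match`, falling back to 0.0 when the line
/// cannot be parsed, so a control loop can keep running on bad input.
pub fn mean_or_zero(line: &str) -> f64 {
    match parse_readings(line) {
        Ok(readings) if !readings.is_empty() => {
            readings.iter().sum::<f64>() / readings.len() as f64
        }
        _ => 0.0,
    }
}
```

Whether falling back to a default is acceptable depends entirely on the system; the point is only that the failure path is explicit in the types and cannot be ignored by accident.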

I also want to point this out: while Rust has a reputation for being difficult for new programmers, I consider it to be my first real programming language, apart from writing some bash scripts. One of the reasons I was able to do this is that the documentation surrounding Rust, in terms of both the language and standard library as well as most of the core crates throughout the ecosystem, is superb, and the tooling is both simple and powerful. As a result, I've been able to pick up topics like image processing and file serialization much more quickly than if I were trying to sort through the C++11/14/17 standard library, and to brute-force things that would otherwise have taken much too long in a language like Python or MATLAB. From first installing the language to flashing #![no_std] programs onto a microcontroller, the structures in place for learning how to write code that is safe, fast, and easy to understand in a number of domains have made a world of difference.

In that vein, I think that potentially makes Rust a really good choice for anyone interested in integrating some machine learning into their systems, and would love to see an end result that enables a similar story. My personal idea would be something similar to Embedded Rust's "Discovery" Book, starting from "hi there, please add these 1-2 dependencies to your Cargo.toml file", through building a classifier that does a decent job at classifying the MNIST data set, or something else along those lines. It also seems like having a Book like that would be really helpful for giving both members of the Rust ML community and programmers at large a jumping-off point at which they can start branching into other areas, and in the immediate future, present a concrete objective around which to organize and drive towards.

Obviously, much, much, easier said than done, but hey, that's what I'd like to (help) build.

@Shock-1

@Shock-1 Shock-1 commented Feb 5, 2020

@quietlychris, I have a lot of the same ideas as you and have really been looking to get something going on this. To me, Edge ML and soft real-time control systems are something Rust can excel in. I have actually been meaning to write a blog post about it, as imo this is the place where a killer Rust framework can be created (like Go has Docker, Ruby has Rails etc.).
Rust was also one of my first languages, and honestly, in my view the ownership system mostly creates difficulty for programmers used to another language.
It's been said again and again by various people how Rust marries high-level abstractions with performant code by nudging you in the right direction. The AI and ML fields have been dominated by Python, which doesn't really work when taken to the embedded level. If someone needs to take their skills to embedded, they will most likely have to face C or C++.
On the other hand, if one looks over at the Rust surveys, the majority of users are coming from Python and finding themselves comfortable. We can create an ecosystem that incentivizes them to try Rust out and we can take significant market share here.
Not only this, ML algorithms that have to run locally on mobile phones are also an important market that checks the boxes of being resource constrained and something that will benefit highly from multithreaded execution (there's a trend among recent phone processors of increasing core counts).
Also, I am an ME undergraduate student with most of my work in control systems and some experience with the NN sort. Start something and I will surely help/join.

@deg4uss3r
Member

@deg4uss3r deg4uss3r commented Feb 5, 2020

@quietlychris I definitely still plan on leading the working group, but I haven't heard anything from the Governance WG either. As I understood it, they would ping us on the ticket when they were ready to help guide the next official WG into this space. I know they had a lot of things to figure out on the Governance side (as I check in on their channel from time to time), so I am unsure if this is still on their list just after they establish themselves or if it has accidentally fallen by the wayside.

As for your follow-on points, I believe that is the direction I would like to go as well. Start with "nothing" and slowly start writing the Rust ML Book to do the classic first problem of classifying MNIST. As we go along and find that libraries are out of date/missing/could be improved, we do so, hopefully taking ownership of them or at minimum revitalizing them as we go along.

@Shock-1 I also agree with edge since it's been blogged about a lot and (as @quietlychris said) it's fairly easy to get Rust onto some micro-controllers, and even mobile devices. I am also REALLY excited about using Rust's error handling, or utilizing the std::result::Result or std::option::Option enums, to create transparent models that you can inspect at any stage to see/understand what the model is doing.

My plan forward from today is to ping Florian on the ticket about the creation of the working group, but I want to give that a week or so since I know Mozilla just had its winter all-hands and it's best to let the dust settle before asking for new/more work ;). I hope to have something for this discussion group in about a week, no more than two. And, for what it is worth, I do not think the other proposed group (cryptography) has moved forward officially yet either.

@quietlychris
Member

@quietlychris quietlychris commented Feb 5, 2020

@deg4uss3r That sounds like a plan to me. I don't see any particular rush in terms of getting the WG structure up and running right at this very moment, but touching base with the Governance group when things are a little quieter does seem like a solid next step to me. As I mentioned, I'm still fairly new to ML, so am planning on personally laying out the MNIST classification problem in Python (and maybe MATLAB/Octave or Julia) to make sure I have a strong handle on what the somewhat canonical workflows and results look like with those tools.

Your comment about working on or building libraries along the way in the process is right along the lines of what I was thinking as well, and something I would definitely be on board with. I believe that while constant generics aren't going to be stable in the immediate future, they're fairly well into the RFC pipeline (rust-lang #44580), which I seem to understand could be helpful for writing numerical backends.

@Shock-1 It's wonderful to see other people interested in the same sort of thing, and coming from a similar background (my formal education was also in mechanical engineering, although my control systems courses mostly focused on classical theory like PID controllers and dynamical systems, not neural nets or other ML components). I honestly hadn't much considered edge deployments in terms of smartphones, but it makes a lot of sense in leveraging Rust's multithreading to pull performance out of some of those hexa- and octa-core processors showing up. Your insights into new Rust users coming in from Python are valuable too--I hadn't seen that information before. As I mentioned a little earlier, I'm going to be spending some time brushing up on other languages' frameworks, but will hopefully have something available for you to contribute to at some point moving forward.

@tiberiusferreira

@tiberiusferreira tiberiusferreira commented Feb 11, 2020

Hello, it's been a while since I last checked up on the discussion here, but I'm glad it kept going and many people are interested in both hybrid and pure Rust DL.

I've written my thoughts about the current state of DL and what I think would be required for a pure Rust DL MVP. It may be of interest to someone here.

https://tiberiusferreira.github.io/blog/posts/current_deep_learning_ecosystem_from_a_rust_developer_perspective/

@quietlychris
Member

@quietlychris quietlychris commented Feb 13, 2020

@tiberiusferreira Thanks for posting that write-up! It was great to see a fairly comprehensive look at some of those topics. I think your insights into moving computations onto GPUs make a lot of sense in terms of the options that are available at the moment. I had looked into some of the Rust-native code (like the emu macro system written by calebwin), but most of the ecosystem is still at a very early stage of development. In general, my particular preferences are not hugely partial to calling to system installations since I've had some bad experiences ending up with OS-level dependency hell, but for large backends like the ones I usually associate with ML, it might make more sense than pulling in fresh instances with Cargo.toml every time you wanted to start a new project. arrayfire-rust in particular caught my eye, since it looks like it's being maintained by the Arrayfire team themselves.

I'm just starting to work through Pytorch's system in Python, but once I get a little more comfortable with understanding that, I'm definitely going to be looking into replicating that with tch-rs to see how my own experience matches up with what you've described.

@aeroaks

@aeroaks aeroaks commented Feb 20, 2020

Hi All,
Maybe we should get ourselves noticed here -> https://discuss.ossdata.org/

@xd009642

@xd009642 xd009642 commented Feb 26, 2020

It was mentioned before by @davechallis, but has any work been done on scaling and normalisation? I'm about to start implementing a new machine learning based crate and I find myself reaching for L1/L2 norm again and thinking there should be a crate with some fundamentals like that in it.
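For concreteness, the kind of fundamentals meant here could be as small as the sketch below, written over plain f64 slices so it has no dependencies. The function names are made up for this example; a shared crate would presumably be generic over ndarray types rather than slices.

```rust
/// L1 norm: the sum of absolute values.
pub fn l1_norm(v: &[f64]) -> f64 {
    v.iter().map(|x| x.abs()).sum()
}

/// L2 (Euclidean) norm: the square root of the sum of squares.
pub fn l2_norm(v: &[f64]) -> f64 {
    v.iter().map(|x| x * x).sum::<f64>().sqrt()
}

/// Scales `v` to unit L2 norm. A zero vector is returned unchanged,
/// since it has no direction to preserve.
pub fn l2_normalize(v: &[f64]) -> Vec<f64> {
    let norm = l2_norm(v);
    if norm == 0.0 {
        v.to_vec()
    } else {
        v.iter().map(|x| x / norm).collect()
    }
}
```

For the vector [3.0, -4.0] the L1 norm is 7 and the L2 norm is 5, so normalising gives [0.6, -0.8].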

@jean-pierreBoth

@jean-pierreBoth jean-pierreBoth commented Feb 27, 2020

@jbowles
Member

@jbowles jbowles commented Feb 27, 2020

@jean-pierreBoth Nice! I'd love to see how you are using Julia + Rust... hope to look through HnswAnn.jl soon.
Any other examples or maybe a blog post?

@xd009642

@xd009642 xd009642 commented Feb 27, 2020

@jean-pierreBoth have you seen the rust-cv org? They did an hnsw crate (https://github.com/rust-cv/hnsw) and might be working on other areas you're interested in, so it could be worth collaborating. Many hands make light work, etc.

@quietlychris
Member

@quietlychris quietlychris commented Mar 13, 2020

@deg4uss3r I saw your post on the WG governance issue and @XAMPPRocky's response about emailing the core team, but was wondering if you've heard anything back through another channel that I might have missed. Obviously, there's a lot going on at the moment, but in large part because of that, my own availability for working on projects like this has increased a bit for the foreseeable future. Of course, it's totally understandable that while that might be the case for some, other people in the chain might not be in the same position, so if this needs to take a backseat for the moment, not a problem.

@XAMPPRocky

@XAMPPRocky XAMPPRocky commented Mar 14, 2020

@quietlychris I don't have a follow up on that response, but I wouldn't block your work on the core approval. There's no reason you can't organise and start having meetings and discussions. I'm not part of the approval process but I think it would probably go quicker if it could be shown a group was already regularly active and working. Since one of the concerns that is often brought up around creating new teams/groups is whether they will remain active after initial interest. So being able to show that a group is regularly productive, would largely address that.

@deg4uss3r
Member

@deg4uss3r deg4uss3r commented Mar 14, 2020

@quietlychris Sorry for the delay but @XAMPPRocky is correct, I have not heard anything. I'd say let's start hashing out what we want the group to look like, where to organize, how, etc. on another thread.

@quietlychris
Member

@quietlychris quietlychris commented Mar 15, 2020

@XAMPPRocky @deg4uss3r Okay, that's good to know, and makes a lot of sense! I started a new issue with some of those points for discussion. I'm looking forward to hearing what people have to say. Thanks!

@xflowXen

@xflowXen xflowXen commented Apr 4, 2020

Just noticed that my ideas around building out a high level neural net framework similar to Keras were not at all original :)

@koute - how much progress have you made on your end with your high level interface?

@deg4uss3r and @quietlychris - My main area of interest/focus is in a pure Rust implementation (as opposed to bindings for Tensorflow et al.) given the guarantees on memory safety implicit in Rust as well as the ability to start using next-gen cross platform CPU/GPU interfaces directly (i.e. Vulkan Compute and SPIR-V) without having to use intermediate languages based on C/C++ (like SYCL).

I'm aware that things are still getting organised but noticed that there is already a Deep Learning workstream defined. I've always viewed the neural net as the deep learning base structure so wondered if this would naturally fit into that workstream or if it should be captured as a separate (but dependent) one. What's your view?
