Things I've had to look up starting to use SciRust #20

Open
daniel-vainsencher opened this issue Aug 6, 2015 · 9 comments

Comments

@daniel-vainsencher
Contributor

I've contributed a little code, so I'm not as fresh as a typical new user, but I've still had to hunt around for the following as I tried to convert some numerical code to use SciRust. I think our goal should be to be friendly towards someone coming to numerical code from Python, R, or Matlab; the intersection of numerical developers and Rust experts is pretty small.

  • Where is there a simple usage example? The best I found is the tests in src/matrix/matrix.rs, but the imports there look different.
  • extern crate scirust; // This is easy enough. But what else do I need? Hmm, not recognizing Matrix.
  • use scirust::matrix::matrix::Matrix; // Ugh. Lots of repetition. The core data structures should be easier to access than this. Maybe they are, but I didn't find the shortcut.
  • use scirust::matrix::traits::Shape; // Just to use num_rows, which is pretty basic. If I want to use matrices, I definitely want to know how many rows they have. This wasn't hard to find though, because rustc tells me.
  • Matrix::ones // I wanted to use this function, and it wasn't trivial. For example, adding a use statement for it ("use scirust::matrix::matrix::Matrix::ones;") gave me "error: ones is not directly importable [E0253]". Are there friendlier ways to expose this function? It is very common in my usage, and we should treat this as a UI issue.
  • use scirust::matrix::random::rand_std_normal; // I should have figured this one out faster by now, but I still doubt it needs to be this deep. Again, for me it is extremely common to want lots of Gaussians, almost as common as ones. Also, is there a logic to when we use "rand" vs. "random"?
  • cell_iter // Why is this not just plain iter? It sounds reasonable to me that the elements of a matrix are the cells (though it is true that NumPy iterates over rows by default).
  • I was also surprised that X[i] actually compiles. Is there no way to require two indices to access a matrix? Would it be better to forgo indexing syntax, just to avoid the potential for bugs? Or maybe implement Index for pair types?
@shailesh1729
Contributor

I will try to answer some of the questions.

Online documentation is hosted at http://indigits.github.io/scirust/doc/scirust/index.html.
It is not always up to date, though: generation is not automatic, so I have to regenerate it from time to time and commit it to the gh-pages branch.

Actually, you don't have to import so many things. There is a small module called scirust::api. So typically, I write use scirust::api::*; and I am good to go. See an example in main.rs.

use scirust::api::*;
let a = matrix_cw_f64(2,2, [1., 4., 2., 8.].as_slice());
println!("{}", a);
let b = vector_f64([3.0, 6.0].as_slice());
let result = GaussElimination::new(&a, &b).solve();
match result {
    Ok(x) => println!("{}", x),
    Err(e) => println!("{}", e),
}
let m = matrix_cw_c64(2,2, [].as_slice());
println!("{}", m);

All the necessary traits are imported automatically the moment you add use scirust::api::*;. Even using Matrix::ones becomes easy after this.
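The re-export pattern behind such an api module can be sketched as follows. This is a minimal illustration, not SciRust's actual source layout: the module tree, the `dense` module name, and the simplified f64-only `Matrix` are assumptions made for the example.

```rust
// Hypothetical crate layout; module and item names are illustrative only.
mod matrix {
    pub mod traits {
        /// Shape queries shared by matrix-like types.
        pub trait Shape {
            fn num_rows(&self) -> usize;
            fn num_cols(&self) -> usize;
        }
    }

    pub mod dense {
        use super::traits::Shape;

        /// A minimal dense matrix of f64 values (stand-in for the real type).
        pub struct Matrix {
            pub rows: usize,
            pub cols: usize,
            pub data: Vec<f64>,
        }

        impl Matrix {
            /// Builds a rows x cols matrix filled with 1.0.
            pub fn ones(rows: usize, cols: usize) -> Matrix {
                Matrix { rows, cols, data: vec![1.0; rows * cols] }
            }
        }

        impl Shape for Matrix {
            fn num_rows(&self) -> usize { self.rows }
            fn num_cols(&self) -> usize { self.cols }
        }
    }
}

/// One flat module that re-exports the commonly used types and traits.
mod api {
    pub use super::matrix::dense::Matrix;
    pub use super::matrix::traits::Shape;
}

// A single glob import brings both the type and the trait into scope.
use api::*;

fn main() {
    let m = Matrix::ones(2, 3);
    // num_rows and num_cols come from the Shape trait, imported via the glob.
    println!("{} x {}", m.num_rows(), m.num_cols());
}
```

With this layout, a user never has to spell out the deep paths; only the flat module needs to be kept in sync when items move.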

To take care of the typical cases of construction, I have introduced a whole set of constructor functions in constructors.rs. Using them is much more convenient than struggling with the static methods of a generic structure.
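The idea of free constructor functions wrapping a generic type's static methods can be sketched like this. The names `Mat`, `filled`, `ones_f64`, and `zeros_f64` are hypothetical and chosen for the example; they are not claimed to match what constructors.rs contains.

```rust
/// Hypothetical generic matrix with column-major storage.
#[derive(Debug, Clone, PartialEq)]
pub struct Mat<T> {
    rows: usize,
    cols: usize,
    data: Vec<T>,
}

impl<T: Clone> Mat<T> {
    /// Generic associated function: the caller must name or infer `T`.
    pub fn filled(rows: usize, cols: usize, value: T) -> Mat<T> {
        Mat { rows, cols, data: vec![value; rows * cols] }
    }

    /// Returns (rows, cols).
    pub fn shape(&self) -> (usize, usize) {
        (self.rows, self.cols)
    }

    /// Read-only view of the underlying storage.
    pub fn data(&self) -> &[T] {
        &self.data
    }
}

/// Free function with the element type fixed to f64.
pub fn ones_f64(rows: usize, cols: usize) -> Mat<f64> {
    Mat::filled(rows, cols, 1.0)
}

/// Free function with the element type fixed to f64.
pub fn zeros_f64(rows: usize, cols: usize) -> Mat<f64> {
    Mat::filled(rows, cols, 0.0)
}

fn main() {
    // The free function needs no type annotation...
    let a = ones_f64(2, 3);
    // ...while the generic form has to pin down T somehow.
    let b = Mat::<f64>::filled(2, 3, 1.0);
    assert_eq!(a, b);
    println!("{:?} {:?}", a.shape(), zeros_f64(1, 2).data());
}
```

Unlike an associated function such as Matrix::ones, a free function can be imported directly with a use statement, which sidesteps the E0253 error mentioned at the top of the thread.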

@shailesh1729
Contributor

I have been thinking that general usage examples and the design principles behind SciRust should be captured in the form of an open book. I started a bit of work on it but haven't yet invested much time: http://shailesh1729.gitbooks.io/scirust-design-guide/content/. Since the design itself is currently quite fluid, there is less motivation to write material that may have to be rewritten later. But we can discuss this further.

@shailesh1729
Contributor

Yes, indexing is a problem at the moment. In Python, special support for multi-indices was provided at the language level for NumPy. I think Rust allows compiler plugins, which can be used to introduce additional library-specific syntax. If someone could work on that, then a[i, j] style syntax should be possible to support. But I haven't had the heart to experiment with compiler plugins yet.

@shailesh1729
Contributor

On cell_iter: since different kinds of iterators are possible, it might be a better idea to give their implementations explicit long names. We can then introduce shorter names using type aliases. A simple

type Iterator = ColumnWiseCellIterator;

would do the job.
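A minimal sketch of this long-name-plus-alias approach is shown below. The `Mat` type, its column-major storage, and the alias name `CellIter` are assumptions for illustration; the alias is called `CellIter` rather than `Iterator` here so that it does not shadow the standard `Iterator` trait in the same scope.

```rust
/// Hypothetical matrix with column-major storage.
pub struct Mat {
    rows: usize,
    cols: usize,
    data: Vec<f64>,
}

/// Explicitly named iterator: visits the cells column by column.
pub struct ColumnWiseCellIterator<'a> {
    inner: std::slice::Iter<'a, f64>,
}

impl<'a> Iterator for ColumnWiseCellIterator<'a> {
    type Item = f64;

    fn next(&mut self) -> Option<f64> {
        // Storage is already column-major, so a linear walk is column-wise.
        self.inner.next().copied()
    }
}

/// Short alias for the default traversal order.
pub type CellIter<'a> = ColumnWiseCellIterator<'a>;

impl Mat {
    /// Builds a matrix from values listed column by column.
    pub fn from_columns(rows: usize, cols: usize, data: Vec<f64>) -> Mat {
        assert_eq!(data.len(), rows * cols, "data length must be rows * cols");
        Mat { rows, cols, data }
    }

    /// Returns (rows, cols).
    pub fn shape(&self) -> (usize, usize) {
        (self.rows, self.cols)
    }

    /// Long, explicit name.
    pub fn cell_iter(&self) -> ColumnWiseCellIterator<'_> {
        ColumnWiseCellIterator { inner: self.data.iter() }
    }

    /// Short name that forwards to the default traversal.
    pub fn iter(&self) -> CellIter<'_> {
        self.cell_iter()
    }
}

fn main() {
    let m = Mat::from_columns(2, 2, vec![1.0, 4.0, 2.0, 8.0]);
    let cells: Vec<f64> = m.iter().collect();
    println!("{:?} {:?}", m.shape(), cells);
}
```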

Similarly, consider calculating the mean over a data matrix. The standard MATLAB convention is to store each data sample in one row of the data matrix, where the columns are the different components of a data vector. Thus, the default mean computes the mean over each column separately. There is an optional argument through which you can compute the mean over each row instead.

I have been using the convention of mean_cw and mean_rw for column-wise and row-wise operations. You will see _cw and _rw appearing in a lot of places. On top of this, an additional function mean can be defined which simply calls mean_cw.

Rust doesn't support default values for function arguments. The suggested approach in such cases is to provide different functions (as of when I last checked).

MATLAB lets you specify the dimension over which work is to be done as an argument to most functions. Since MATLAB's fundamental data type is the N-D array, this is okay.

For matrices, most operations are typically either over rows or over columns. So I feel that specifying _cw and _rw could be a good convention. This is of course up for debate and discussion.
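The _cw / _rw convention described above can be sketched as follows. The `Mat` type and its column-major layout are assumptions for the example; only the names mean_cw, mean_rw, and mean come from the discussion.

```rust
/// Hypothetical matrix with column-major storage.
pub struct Mat {
    rows: usize,
    cols: usize,
    data: Vec<f64>,
}

impl Mat {
    /// Builds a non-empty matrix from values listed column by column.
    pub fn from_columns(rows: usize, cols: usize, data: Vec<f64>) -> Mat {
        assert!(rows > 0 && cols > 0, "matrix must be non-empty");
        assert_eq!(data.len(), rows * cols, "data length must be rows * cols");
        Mat { rows, cols, data }
    }

    /// Column-wise mean: one value per column.
    pub fn mean_cw(&self) -> Vec<f64> {
        self.data
            .chunks(self.rows) // each chunk is one column
            .map(|col| col.iter().sum::<f64>() / self.rows as f64)
            .collect()
    }

    /// Row-wise mean: one value per row.
    pub fn mean_rw(&self) -> Vec<f64> {
        (0..self.rows)
            .map(|r| {
                let sum: f64 = (0..self.cols)
                    .map(|c| self.data[c * self.rows + r])
                    .sum();
                sum / self.cols as f64
            })
            .collect()
    }

    /// Default mean: forwards to the column-wise version, as in MATLAB.
    pub fn mean(&self) -> Vec<f64> {
        self.mean_cw()
    }
}

fn main() {
    // Columns are [1, 3], [2, 4], [3, 5]; rows are [1, 2, 3] and [3, 4, 5].
    let m = Mat::from_columns(2, 3, vec![1.0, 3.0, 2.0, 4.0, 3.0, 5.0]);
    println!("cw: {:?}", m.mean_cw());
    println!("rw: {:?}", m.mean_rw());
}
```

Two explicitly named functions plus a forwarding default stand in for the optional dimension argument that Rust function signatures cannot express.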

I didn't have any plans to extend the Matrix data structure to become an N-D array. If we have to introduce N-D arrays at a later point, I was thinking that they could be developed as a separate structure which could seamlessly interact with matrices but would not provide operations that are applicable only to matrices (like linear algebra operations).

@shailesh1729
Contributor

I will try to add a set of simple usage examples to the main documentation, on the front page itself. Since examples in the documentation are verified when running the unit tests, they should stay well maintained.

@daniel-vainsencher
Contributor Author

Thanks for the write up!
IMO, readme.md should be the root of all documentation, and have a minimal example itself.

@daniel-vainsencher
Contributor Author

About indexing, I think we can already support syntax like a[(i,j)] simply by implementing Index<(isize, isize)>. Then we could change the existing single-index Index to panic unless the matrix has a single row or a single column. What do you think?
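A rough sketch of this proposal, under stated assumptions: `Mat` is a hypothetical column-major stand-in for SciRust's Matrix, and the bounds checks and panic messages are illustrative choices, not existing library behavior.

```rust
use std::ops::Index;

/// Hypothetical column-major matrix; not SciRust's actual `Matrix`.
pub struct Mat {
    rows: usize,
    cols: usize,
    data: Vec<f64>,
}

impl Mat {
    /// Builds a matrix from values listed column by column.
    pub fn from_columns(rows: usize, cols: usize, data: Vec<f64>) -> Mat {
        assert_eq!(data.len(), rows * cols, "data length must be rows * cols");
        Mat { rows, cols, data }
    }
}

/// Pair indexing: m[(row, col)].
impl Index<(isize, isize)> for Mat {
    type Output = f64;

    fn index(&self, (r, c): (isize, isize)) -> &f64 {
        assert!(r >= 0 && (r as usize) < self.rows, "row index out of bounds");
        assert!(c >= 0 && (c as usize) < self.cols, "column index out of bounds");
        &self.data[(c as usize) * self.rows + (r as usize)]
    }
}

/// Single indexing: only allowed when the matrix is a row or column vector.
impl Index<isize> for Mat {
    type Output = f64;

    fn index(&self, i: isize) -> &f64 {
        assert!(
            self.rows == 1 || self.cols == 1,
            "single-index access requires a single row or a single column"
        );
        assert!(i >= 0, "index must be non-negative");
        &self.data[i as usize]
    }
}

fn main() {
    let m = Mat::from_columns(2, 2, vec![1.0, 4.0, 2.0, 8.0]);
    let (i, j): (isize, isize) = (1, 0);
    println!("m[(1, 0)] = {}", m[(i, j)]);

    let v = Mat::from_columns(1, 3, vec![3.0, 6.0, 9.0]);
    let k: isize = 1;
    println!("v[1] = {}", v[k]);
    // m[k] would panic here, because m has two rows and two columns.
}
```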

@daniel-vainsencher
Contributor Author

It isn't very pretty, but it is much nicer than get, I think.

@shailesh1729
Contributor

The documentation has been updated with a couple of examples:
http://indigits.github.io/scirust/doc/scirust/index.html

README.md already points to it, though the README itself doesn't have examples yet. I will add those at some point.

scirust_main.rs has also been restored. It was commented out at a time when the Rust language was going through last-minute changes and the code was breaking all over the place, and then I forgot to restore it.
