Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert SciPy sparse matrix to R #158

Merged
merged 7 commits into from
Aug 8, 2017
Merged

Convert SciPy sparse matrix to R #158

merged 7 commits into from
Aug 8, 2017

Conversation

binxiangni
Copy link
Contributor

I did some minor modification to support the conversion of SciPy sparse matrix to R. It seems that I mistook the setup in my unit test so that compiler errors occurred. Any hints to correct it?

@eddelbuettel
Copy link
Member

eddelbuettel commented Aug 7, 2017

Any hints to correct it?

Yes. We now use the reticulate package, but in the (older) Travis setup I still use we have to explicitly tell Travis about it in the .travis.yml file. We may of course also need SciPy and all that. But first step reticulate,

@eddelbuettel
Copy link
Member

I get the tests to run if I

  • make sure R package reticulate is installed
  • Debian/Ubuntu package python-scipy is installed

But one test still fails:

>   coo <- sp$coo_matrix(mat)
>   dgT <- methods::as(M, "dgTMatrix")
>   checkEquals(dgT, py2r(coo), msg="coo2dgt")
Error in checkEquals(dgT, py2r(coo), msg = "coo2dgt") : 
  Attributes: < Componenti: Mean relative difference: 1.2 >
Attributes: < Componentj: Mean relative difference: 1.2 >
Attributes: < Componentx: Mean relative difference: 0.5714286 >
coo2dgt
> 

even though

> py2r(coo)
3 x 3 sparse Matrix of class "dgTMatrix"
          
[1,] 1 . 4
[2,] . . 5
[3,] 2 3 6
> dgT
3 x 3 sparse Matrix of class "dgTMatrix"
          
[1,] 1 . 4
[2,] . . 5
[3,] 2 3 6
> M
     [,1] [,2] [,3]
[1,]    1    0    4
[2,]    0    0    5
[3,]    2    3    6
> 

If I comment the corresponding test out, and make some corrections to DESCRIPTION and NAMESPACE then the package tests.

@binxiangni
Copy link
Contributor Author

binxiangni commented Aug 7, 2017

Well, I'm checking what happens in coo2dgt.

> str(py2r(coo))
Formal class 'dgTMatrix' [package "Matrix"] with 6 slots
  ..@ i       : int [1:6] 0 0 1 2 2 2
  ..@ j       : int [1:6] 0 2 2 0 1 2
  ..@ Dim     : int [1:2] 3 3
  ..@ Dimnames:List of 2
  .. ..$ : NULL
  .. ..$ : NULL
  ..@ x       : num [1:6] 1 4 5 2 3 6
  ..@ factors : list()
> str(dgT)
Formal class 'dgTMatrix' [package "Matrix"] with 6 slots
  ..@ i       : int [1:6] 0 2 2 0 1 2
  ..@ j       : int [1:6] 0 0 1 2 2 2
  ..@ Dim     : int [1:2] 3 3
  ..@ Dimnames:List of 2
  .. ..$ : NULL
  .. ..$ : NULL
  ..@ x       : num [1:6] 1 2 3 4 5 6
  ..@ factors : list()

The problem exists because coo is row-oriented, while dgT(created by new) is column-oriented.

@eddelbuettel
Copy link
Member

I have a commit pending which "works here" in the sense of

  • having uncommented that failing test
  • made corrections to NAMESPACE and DESCRIPTION
  • made py2r() internal for now
  • add python-scipy to Travis

But I fear we may get an old SciPy on the older Travis setup -- so even if it "works for me here" we may not get a working setup at Travis.

@eddelbuettel
Copy link
Member

eddelbuettel commented Aug 7, 2017

Ok, see this branch started from your PR. It just passed a test run on Travis.

There are still a number of open issues. Let's not rush this.

@binxiangni
Copy link
Contributor Author

binxiangni commented Aug 7, 2017

After checking the unit tests for sparse matrix conversion in RcppArmadillo I wrote before, str for dgTMatrix in Matrix might have 3 cases:

  1. column-oriented;
  2. row-oriented;
  3. repeated (i, j) pairs or non-sorted.

Personally, I think the current conversion result for SciPy coo sparse matrix is fine and whether to pass the unit test just depends on the generation method of dgTMatrix and its str. So maybe we can just leave the conversion as is now and pick up another dgTMatrix with the same str in the unit test.

@eddelbuettel
Copy link
Member

The three cases are not randomly selected though, are they?

Could we do a weaker comparison like this one:

R> methods::as(M, "dgRMatrix") == methods::as(M, "dgTMatrix")
3 x 3 Matrix of class "lgeMatrix"
     [,1] [,2] [,3]
[1,] TRUE TRUE TRUE
[2,] TRUE TRUE TRUE
[3,] TRUE TRUE TRUE
R>
R> all(methods::as(M, "dgRMatrix") == methods::as(M, "dgTMatrix"))
[1] TRUE
R>  

@binxiangni
Copy link
Contributor Author

Most dgTMatrix is column-oriented, including created by as and operations; new can create a dgTMatrix by specifying a dgTMatrix's i, j, x, so the str of dgTMatrix created this way can be any one of cases listed above.

Not quite understood the reason for using weaker comparison. Do you mean comparing the non-zero values for different kinds of sparse matrix in this SciPy sparse matrix conversion? like the conversion results of cscmatrix and coomatrix?

@eddelbuettel
Copy link
Member

eddelbuettel commented Aug 7, 2017

So can you create one that would match what we get via SciPy ?

@binxiangni
Copy link
Contributor Author

binxiangni commented Aug 7, 2017

Done. And local tests pass.

@eddelbuettel
Copy link
Member

Are you ok with my other changes? Ie py2r() is not exported as it is a) a little specialized and b) has not manual page yet.

We can leave things the way they are now, or set this up as a demo/ example instead. No strong feelings either way.

@binxiangni
Copy link
Contributor Author

Yes, that's fine.

R/py2r.R Outdated
## You should have received a copy of the GNU General Public License
## along with RcppArmadillo. If not, see <http://www.gnu.org/licenses/>.

requireNamespace("reticulate", quietlty=TRUE)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be removed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Also did some gentle renaming as py2r as bit too generic.

@eddelbuettel
Copy link
Member

eddelbuettel commented Aug 7, 2017

@binxiangni Can you rebase / pull one more time to get my last commit over here too?

Otherwise I can merge as is and then PR from my branch to just get its final commit over.

@binxiangni
Copy link
Contributor Author

Done. Sorry for the late action. Still in China, so there is a time difference.

@eddelbuettel
Copy link
Member

Oh, I was unaware you were away....

@eddelbuettel eddelbuettel merged commit 19f2ced into RcppCore:master Aug 8, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants