Skip to content

Conversation

@mengxr
Copy link
Contributor

@mengxr mengxr commented Mar 3, 2015

I don't know why this was removed in #353. It would be nice to add it back for users to construct CSCMatrix directly.

@gabeos
Copy link
Contributor

gabeos commented Mar 3, 2015

hey @mengxr, that constructor was initially something I put in to overcome an issue in creating CSCMatrices in some of the UFunc operators, but removed it when it became unnecessary (although I forget exactly what changed..)

Regardless, there are two issues:

  1. That constructor assumes that the data array given contains only active values, and doesn't check for 0's. Perhaps fine for an internal constructor, but I don't think it should be exposed the general code as part of the interface because people will misuse it.
  2. I actually think that in general CSCMatrices shouldn't really expose a direct constructor in general. In addition to the above-mentioned assumption, getting the correct indices and pointers for the array is more complex by far than what a user should have to consider, and I think leaving the constructor in will prompt people to try to use it (and likely bang their head on a desk) rather than CSCBuilder, or one of the constructor methods in the companion object.

Do you have a specific use case for the constructor that wouldn't be better handled by a companion object contructor method, added CSCBuilder functionality, or a separate UFunc altogether?

@dlwh
Copy link
Member

dlwh commented Mar 3, 2015

In general, in breeze I want to have escape hatches for when users who
"know what they are doing" need access to the underlying representation for
efficiency. Hence, DenseVector exposes the data array, even though it's
possible to mess up the indexing.

So, if there's a good use case for assembling the pieces of a CSCMatrix
back into a CSC matrix, I'm ok with that. Serialization seems like a
reasonable reason.

On Tue, Mar 3, 2015 at 3:09 PM, gabeos notifications@github.com wrote:

hey @mengxr https://github.com/mengxr, that constructor was initially
something I put in to overcome an issue in creating CSCMatrices in some of
the UFunc operators, but removed it when it became unnecessary (although I
forget exactly what changed..)

Regardless, there are two issues:

  1. That constructor assumes that the data array given contains only active
    values, and doesn't check for 0's. Perhaps fine for an internal
    constructor, but I don't think it should be exposed the general code as
    part of the interface because people will misuse it.
  2. I actually think that in general CSCMatrices shouldn't really expose a
    direct constructor in general. In addition to the above-mentioned
    assumption, getting the correct indices and pointers for the array is more
    complex by far than what a user should have to consider, and I think
    leaving the constructor in will prompt people to try to use it (and likely
    bang their head on a desk) rather than CSCBuilder, or one of the
    constructor methods in the companion object.

Do you have a specific use case for the constructor that wouldn't be
better handled by a companion object contructor method, added CSCBuilder
functionality, or a separate UFunc altogether?


Reply to this email directly or view it on GitHub
#375 (comment).

@gabeos
Copy link
Contributor

gabeos commented Mar 3, 2015

okay, makes sense. any thoughts on the zero issue? I guess since it's expected to only be used by people who know what they're doing, they can choose the trade-off between the up front cost of extracting the empty values, and the higher cost of operations with more active values.
On Tue, Mar 03, 2015 at 3:16 PM, David Hall notifications@github.com wrote:In general, in breeze I want to have escape hatches for when users who
"know what they are doing" need access to the underlying representation for
efficiency. Hence, DenseVector exposes the data array, even though it's
possible to mess up the indexing.

So, if there's a good use case for assembling the pieces of a CSCMatrix
back into a CSC matrix, I'm ok with that. Serialization seems like a
reasonable reason.

On Tue, Mar 3, 2015 at 3:09 PM, gabeos notifications@github.com wrote:

hey @mengxr https://github.com/mengxr, that constructor was initially
something I put in to overcome an issue in creating CSCMatrices in some of
the UFunc operators, but removed it when it became unnecessary (although I
forget exactly what changed..)

Regardless, there are two issues:

  1. That constructor assumes that the data array given contains only active
    values, and doesn't check for 0's. Perhaps fine for an internal
    constructor, but I don't think it should be exposed the general code as
    part of the interface because people will misuse it.
  2. I actually think that in general CSCMatrices shouldn't really expose a
    direct constructor in general. In addition to the above-mentioned
    assumption, getting the correct indices and pointers for the array is more
    complex by far than what a user should have to consider, and I think
    leaving the constructor in will prompt people to try to use it (and likely
    bang their head on a desk) rather than CSCBuilder, or one of the
    constructor methods in the companion object.

Do you have a specific use case for the constructor that wouldn't be
better handled by a companion object contructor method, added CSCBuilder
functionality, or a separate UFunc altogether?


Reply to this email directly or view it on GitHub
#375 (comment).

—Reply to this email directly or view it on GitHub.

@dlwh
Copy link
Member

dlwh commented Mar 3, 2015

yeah, I think we just let people shoot themselves in the foot if they want.
Maybe we add a scaladoc saying "don't use this constructor unless you know
what you are doing?"

On Tue, Mar 3, 2015 at 3:39 PM, gabeos notifications@github.com wrote:

okay, makes sense. any thoughts on the zero issue? I guess since it's
expected to only be used by people who know what they're doing, they can
choose the trade-off between the up front cost of extracting the empty
values, and the higher cost of operations with more active values.
On Tue, Mar 03, 2015 at 3:16 PM, David Hall notifications@github.com
wrote:In general, in breeze I want to have escape hatches for when users who
"know what they are doing" need access to the underlying representation for
efficiency. Hence, DenseVector exposes the data array, even though it's
possible to mess up the indexing.

So, if there's a good use case for assembling the pieces of a CSCMatrix
back into a CSC matrix, I'm ok with that. Serialization seems like a
reasonable reason.

On Tue, Mar 3, 2015 at 3:09 PM, gabeos notifications@github.com wrote:

hey @mengxr https://github.com/mengxr, that constructor was initially
something I put in to overcome an issue in creating CSCMatrices in some
of
the UFunc operators, but removed it when it became unnecessary (although
I
forget exactly what changed..)

Regardless, there are two issues:

  1. That constructor assumes that the data array given contains only
    active
    values, and doesn't check for 0's. Perhaps fine for an internal
    constructor, but I don't think it should be exposed the general code as
    part of the interface because people will misuse it.
  2. I actually think that in general CSCMatrices shouldn't really expose a
    direct constructor in general. In addition to the above-mentioned
    assumption, getting the correct indices and pointers for the array is
    more
    complex by far than what a user should have to consider, and I think
    leaving the constructor in will prompt people to try to use it (and
    likely
    bang their head on a desk) rather than CSCBuilder, or one of the
    constructor methods in the companion object.

Do you have a specific use case for the constructor that wouldn't be
better handled by a companion object contructor method, added CSCBuilder
functionality, or a separate UFunc altogether?


Reply to this email directly or view it on GitHub
#375 (comment).

—Reply to this email directly or view it on GitHub.


Reply to this email directly or view it on GitHub
#375 (comment).

@mengxr
Copy link
Contributor Author

mengxr commented Mar 3, 2015

Agree with @dlwh. CSC is a standard format for sparse matrix. We should just let people shoot themselves in the foot if they don't follow the specification. Having no public constructor makes it hard to convert data already in CSC format to breeze's CSCMatrix. I'll put a doc to warn users.

@mengxr mengxr force-pushed the csc-constructor branch from 36fa164 to f3c452f Compare March 4, 2015 00:02
dlwh added a commit that referenced this pull request Mar 4, 2015
add CSCMatrix's constructor back
@dlwh dlwh merged commit 7adc0a4 into scalanlp:master Mar 4, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants