Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add support to sparse linear algebra. #85

Merged
merged 19 commits into from
Nov 15, 2020
Merged

add support to sparse linear algebra. #85

merged 19 commits into from
Nov 15, 2020

Conversation

nychiang
Copy link
Collaborator

@nychiang nychiang commented Sep 23, 2020

Adds an NLP general sparse problem formulation and interface, new sparse linear algebra, and backend linear solvers (MA57 and StrumPack). The PR focuses on software and some math aspects and does not fully support device computations. These will be addressed by a future PR.

Tidx *irow, *jcol;
Tval *values;
Tval *values;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Tval *values;
Tval* values_;

Perhaps it would be good to follow HiOp style guidelines.

* @pre 'this' must have exactly, or more than 'n_rows' rows
* @pre 'this' must have exactly, or more cols than 'src'
*/
void hiopMatrixSparseTriplet::copyRowsFromSrcToDest(const hiopMatrix& src_gen,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a unit test for this kernel.

@@ -141,14 +141,45 @@ void hiopMatrixSparseTriplet::addSubDiagonal(const double& alpha, long long star
assert(false && "not needed");
}

void hiopMatrixSparseTriplet::copySubDiagonalEleFromVec(const long long& start_on_dest_diag, const long long& num_elems,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this kernel needs a unit test.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a unit test into test/testMatrixSparse.cpp.
The corresponding code is added into test/LinAlg/matrixTestsSparseTriplet.cpp
However, the corresponding function verifyAnser(...) hasn't been implemented. Will set a follow-up meeting with Asher @ashermancinelli to discuss this issue.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update: @nychiang and I discussed our unit test conventions, so he should now be able to update accordingly.

@pelesh
Copy link
Collaborator

pelesh commented Sep 23, 2020

It would be good to take a look at RAJA snapshot branch to get a heads up on possible merge conflicts. RAJA PR will be submitted soon, so it would be good to coordinate.

@cnpetra
Copy link
Collaborator

cnpetra commented Sep 23, 2020

@pelesh RAJA PR will be merged first in master

const double* x, bool new_x, double* cons)
{
const double* s = x+ns_;
const double* y = x+2*ns_;

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nychiang : end of line is changed in very many places. This will make merging extremely difficult and laborious. I know why it happened ;) I suggest you somehow fix this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to run command dos2unit, but it seems that I need to fix this issue manually. :-)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ouch... maybe core.autocrlf settings of git will help?

//sum 0.5 {x_i*(x_{i}-1) : i=1,...,ns} + 0.5 y'*Qd*y + 0.5 s^T s
bool eval_grad_f(const long long& n, const double* x, bool new_x, double* gradf)
{
//! assert(ns>=4); assert(Q->n()==ns/4); assert(Q->m()==ns/4);
//x_i - 0.5
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nychiang : also here, I guess \n changed with \r\n

@nychiang
Copy link
Collaborator Author

Just push another commit, in which I fix the bug and white-space issue.
Ex6 now is working fine. Its usage is:
nlpSparse_ex6.exe n
where n>=3 is a the number of variables provided by the user

@nychiang
Copy link
Collaborator Author

NOTE: need to fix the hard-coded path for Metis/Ma57 in CMake file before merging into the master branch

virtual const local_ordinal_type* getRowIndices(const hiop::hiopMatrixSparse* a) = 0;
virtual const local_ordinal_type* getColumnIndices(const hiop::hiopMatrixSparse* a) = 0;
virtual local_ordinal_type getLocalSize(const hiop::hiopVector* x) = 0;
virtual int verifyAnswer(hiop::hiopMatrix* A, real_type answer) = 0;
virtual int verifyAnswer(hiop::hiopMatrix* A, local_ordinal_type nnz_st, local_ordinal_type nnz_ed, const double answer) = 0;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ashermancinelli Please verify. I add one function to verify the answers for a sparse matrix, since the other "verifyAnswer" requires a dense input matrix "A". See file matrixTestsSparseTriplet.cpp for examples. NOTE that since this branch is developed based on 'master', it doesn't has the RAJA implementation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if you're just checking a single element, probably better to use getLocalElement and check result

}
}

assert(A.n() >= B.n());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these be assertions or failure conditions for the test?

tests/LinAlg/matrixTestsSparse.hpp Outdated Show resolved Hide resolved
@@ -432,10 +500,12 @@ class MatrixTestsSparse : public TestBase
virtual real_type getLocalElement(const hiop::hiopMatrix* a, local_ordinal_type i, local_ordinal_type j) = 0;
virtual real_type getLocalElement(const hiop::hiopVector* x, local_ordinal_type i) = 0;
virtual real_type* getMatrixData(hiop::hiopMatrixSparse* a) = 0;
virtual real_type getMatrixData(hiop::hiopMatrixSparse* a, local_ordinal_type i, local_ordinal_type j) = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why create another function instead of using getLocalElement?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getLocalElement in class matrixTestsSparseTriplet assumes that the destination matrix "hiop::hiopMatrix* a" is a dense matrix.
See

auto mat = dynamic_cast<const hiop::hiopMatrixDense*>(A);

tests/LinAlg/matrixTestsSparseTriplet.cpp Show resolved Hide resolved
virtual const local_ordinal_type* getRowIndices(const hiop::hiopMatrixSparse* a) = 0;
virtual const local_ordinal_type* getColumnIndices(const hiop::hiopMatrixSparse* a) = 0;
virtual local_ordinal_type getLocalSize(const hiop::hiopVector* x) = 0;
virtual int verifyAnswer(hiop::hiopMatrix* A, real_type answer) = 0;
virtual int verifyAnswer(hiop::hiopMatrix* A, local_ordinal_type nnz_st, local_ordinal_type nnz_ed, const double answer) = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if you're just checking a single element, probably better to use getLocalElement and check result


auto val = getMatrixData(&B);

fail += verifyAnswer(&B,0,B_nnz_st,B_val);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be more thorough, we should have another verifyAnswer that takes a lambda for sparse mats, eg

fail += verifyAnswer(&B,
  [=] (local_ordinal_type i, local_ordinal_type j) -> real_type
  {
    if(i==0 && j==B_nnz_st) return B_val;
    else if // other two conditions...
    else if // other two conditions...
    else return 0.;
  });

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am trying to check multiple values in a sparse matrix, not a single value.
The nonzero indices for these nonzero starts from nnz_st to nnz_ed.
I can change it to a lambda function like you mentioned, but I need to assign it to a different name, or use different declaration.
This is because the current implementation assumes the input matrix is dense

int MatrixTestsSparseTriplet::verifyAnswer(
hiop::hiopMatrix* Amat,
std::function<real_type(local_ordinal_type, local_ordinal_type)> expect)
{
auto A = dynamic_cast<hiop::hiopMatrixDense*>(Amat);
assert(A->get_local_size_n() == A->n() && "Matrix should not be distributed");
const local_ordinal_type M = A->get_local_size_m();
const local_ordinal_type N = A->get_local_size_n();
int fail = 0;
for (local_ordinal_type i=0; i<M; i++)
{
for (local_ordinal_type j=0; j<N; j++)
{
if (!isEqual(getLocalElement(A, i, j), expect(i, j)))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll submit a pr against your branch to fix this

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure. I tried to not touch your existing code and only add my code into the repository.
I believer it will be really helpful if we can fix this issue.
About verifying the answer for a sparse matrix, I do have some different idea. Currently we only verify the nonzero values given by the nonzero index, but I think we may also want to verify the nonzero pattern, For example, verify the kth nonzero is located on some particular row and col. This may be not urgent, since now we assume the sparsity pattern won't change. If we use sparse matrices correctly, we can skip check the pattern. However, if there is an error in the sparsity pattern when copy to/from another matrix/vector, it will be hard to detect. Please let me know your opinion about this. Cheers,

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way I see it there is little use for a getElement method for a sparse matrix simply because such method is cumbersome to implement and use. Method verifyAnswer is always implementation specific, so it should access matrix data directly, no need to use getElement or similar.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verifying sparsity pattern should not be difficult either. Again, I would implement that by accessing matrix data directly and not try to implement getElement method for that. Method getElement is design to retrieve and check only one single element in a vector or matrix. If you are checking multiple elements, you should be accessing matrix data directly. This is also true for dense matrices.

@nychiang
Copy link
Collaborator Author

nychiang commented Oct 7, 2020 via email

@ashermancinelli
Copy link
Contributor

Could you respond via github in the future? Email responses create a lot of garbage. It would be appropriate to pass num entries per row to the test from the test driver.

@nychiang
Copy link
Collaborator Author

nychiang commented Oct 7, 2020

I noticed it. :-)
just deleted all the email garbage in my reply.

@pelesh
Copy link
Collaborator

pelesh commented Nov 4, 2020

I believe this PR should target develop rather than master branch.

@cnpetra cnpetra marked this pull request as ready for review November 10, 2020 23:52
@cnpetra cnpetra changed the base branch from master to develop November 10, 2020 23:57
@cnpetra
Copy link
Collaborator

cnpetra commented Nov 14, 2020

Does this also address issue #79 ?

@nychiang
Copy link
Collaborator Author

NO. The patch is ready. Do you want to address issue #79 in this PR?

@cnpetra
Copy link
Collaborator

cnpetra commented Nov 14, 2020

NO. The patch is ready. Do you want to address issue #79 in this PR?

let's address it after the merge as this PR blocks other efforts as well and it is better to close it asap

@cnpetra cnpetra merged commit ec5bf1d into develop Nov 15, 2020
@cnpetra cnpetra deleted the nywrk branch April 7, 2021 21:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants