Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lapacke/internal: add conv package with DpbToColMajor and DpbToRowMajor #64

Merged
merged 1 commit into from Oct 31, 2019

Conversation

vladimir-ch
Copy link
Member

With this and appropriate changes in lapacke the tests in #63 pass with both OpenBLAS and MKL.
I guess we'll have to do the same for the S, C, Z variants. Do we want to generate or copy/paste/adjust manually? Resigned, I'm for the latter.

Please take a look.

@vladimir-ch vladimir-ch force-pushed the lapacke/dpb-row-col-conv branch 2 times, most recently from ed083c0 to 5f9563a Compare August 18, 2019 23:11
@vladimir-ch
Copy link
Member Author

It's failing in Dlantr ... Why only sometimes? Why not locally on my machine? Where is my hair-tearing emoji when I need it? I'll take a look at it tomorrow.

@vladimir-ch
Copy link
Member Author

Why not locally on my machine?

Because modules.

Why only sometimes?

Not sure but with the latest test it fails also locally but the reason is different:

=== RUN   TestDlantr
--- FAIL: TestDlantr (0.00s)
    dlantr.go:134: norm=O,uplo=L,diag=N,m=5,n=3,lda=6: unexpected result; got 2.5630634709733586, want 4.541163635975173
    dlantr.go:134: norm=O,uplo=L,diag=N,m=5,n=4,lda=4: unexpected result; got 1.9755724091341136, want 2.8687395090872854
    dlantr.go:134: norm=O,uplo=U,diag=N,m=5,n=4,lda=4: unexpected result; got 2.828438928225644, want 3.2323302465805064
    dlantr.go:134: norm=O,uplo=L,diag=N,m=10,n=3,lda=3: unexpected result; got 5.708694305235774, want 6.708694305235774
    dlantr.go:134: norm=O,uplo=L,diag=N,m=10,n=3,lda=6: unexpected result; got 5.863030002610436, want 6.0156656655521905
    dlantr.go:134: norm=O,uplo=U,diag=N,m=10,n=3,lda=3: unexpected result; got 1.6075342107957737, want 2.7287199868376697
    dlantr.go:134: norm=O,uplo=L,diag=N,m=10,n=4,lda=4: unexpected result; got 5.786618199796923, want 6.786618199796923
    dlantr.go:134: norm=O,uplo=L,diag=N,m=10,n=4,lda=7: unexpected result; got 5.081402974965211, want 8.295042414527586
    dlantr.go:134: norm=O,uplo=U,diag=N,m=10,n=4,lda=4: unexpected result; got 3.1372238273987727, want 4.757874472806222

The same failure is with MKL. I'll check if the 1-norm test is correct.

@kortschak
Copy link
Member

ᕙ(⇀‸↼‶)ᕗ

@vladimir-ch
Copy link
Member Author

Adding or removing a t.Log line changes the failure from the above output to the "unexpected modification" ... ᕙ(⇀‸↼‶)ᕗ

@kortschak
Copy link
Member

@vladimir-ch
Copy link
Member Author

Oh, I hadn't realized ... functionality-wise, it's not different but some time ago I made changes in lapacke to avoid importing blas package and those conversion routines would bring it back plus blas/blas64. If you think it's not a big deal, then we use that and can close this.

@vladimir-ch
Copy link
Member Author

Oh, I hadn't realized ... functionality-wise, it's not different but some time ago I made changes in lapacke to avoid importing blas package and those conversion routines would bring it back plus blas/blas64. If you think it's not a big deal, then we use that and can close this.

Regarding the local Dlantr failures, they went away after a reboot .....

@kortschak
Copy link
Member

There is little code in blas64 that the linker won't elide if it's not called (just the methods on the types - though maybe this includes all the gonum BLAS implementations because of gonum.Implementation?), and I think the vast majority of imports of netlib would also import blas64 anyway, so I don't think there is a terrible problem.

Now I'm not so sure, mainly because of gonum.Implementation. I think leave it as your code here.

@vladimir-ch
Copy link
Member Author

I'm getting the flaky Dlantr failures locally again and they persist also on Travis. I don't know where they come from and what to do with them. @kortschak , can you reproduce them? Do you have any idea? Even though they're not related to this PR, I'd like to clear them before merging.

@kortschak
Copy link
Member

On my workstation, I'm getting SIGILL. I've rebuilt OpenBLAS and Go with a variety of versions with no change. I'll try on the laptop later.

@kortschak
Copy link
Member

OK. Now that I've correctly built this I cannot replicate the Dlantr failures.

@vladimir-ch
Copy link
Member Author

But it does fail on Travis, so there is something going on. I don't know why it's only the Dlantr test and I don't see anything extraordinary in the test. To me it looks like some concurrency issue in how the test is run. For example, sometimes I get this failure (I log locally the contents of the colsum slice):

dlantr.go:133: norm=0,uplo=L,diag=N,m=0,n=3,lda=3: unexpected result; got 0, want 1, colsum [1 1 1]

So the result is correct but the expectation not. But there is no way the test writes those 1s into colsum! m is 0 and the diagonal is non-unit. What am I missing?

@vladimir-ch
Copy link
Member Author

I still (again) think the failures are a glitch because some stale/incompatible file is being picked up from somewhere during the build. I've upgraded the system which has go1.13, purged $GOCACHE and $GOPATH/pkg, unset $GOPATH, ran the test again and the failures are gone (again). Is it possible that something similar is going on on Travis? I remember there is some kind of build cache but what's its lifetime?

@vladimir-ch
Copy link
Member Author

I've deleted all caches using Travis' web UI, let's see what happens.

@vladimir-ch
Copy link
Member Author

Deleting the caches didn't help and my inexplicable local failures are back ... I'm out of ideas ... ᕙ(⇀‸↼‶)ᕗ

@vladimir-ch
Copy link
Member Author

OMG, I found it. It's the usual suspect, work. Since we are row-major, for our Go implementation it makes sense to use work of length n for the L-1 norm (MaxColumSum). However, LAPACKE just forwards work to the Fortran reference which is column-major and uses it for L-infinity norm (MaxRowSum) and needs length of m. PR follows shortly.

@vladimir-ch
Copy link
Member Author

Fix is in #67

@kortschak
Copy link
Member

Do you want to rebase this and the other outstanding PR onto master?

@vladimir-ch
Copy link
Member Author

Done.

@codecov-io
Copy link

codecov-io commented Oct 31, 2019

Codecov Report

Merging #64 into master will increase coverage by 0.32%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #64      +/-   ##
==========================================
+ Coverage    29.9%   30.23%   +0.32%     
==========================================
  Files           2        4       +2     
  Lines        6384     6414      +30     
==========================================
+ Hits         1909     1939      +30     
  Misses       4039     4039              
  Partials      436      436
Impacted Files Coverage Δ
lapack/lapacke/internal/conv/conv.go 100% <100%> (ø)
lapack/lapacke/internal/conv/pb.go 100% <100%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 029f6d2...c621836. Read the comment docs.

@vladimir-ch vladimir-ch merged commit eccb959 into master Oct 31, 2019
@vladimir-ch vladimir-ch deleted the lapacke/dpb-row-col-conv branch October 31, 2019 11:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants