Need some help on weird outputs of the benchmark demo. #11

AMSSWANGYIJIA · 2023-04-10T08:33:29Z

Hi Dr. Chen, we recently ran some benchmark tests on CKTSO. The test matrices are drawn from real circuit designs, with sizes ranging from 1.0E+04 up to 1.3E+07. We used the benchmark demos from both NICSLU and CKTSO and ran into a strange situation. When we run ./benchmark add20.mtx #nthreads, the output is fine and looks like this:
Analysis time = 4900 us.
Factorization average time = 100 us, min time = 82 us.
Refactorization average time = 49 us, min time = 47 us.
Solve average time = 10 us, min time = 9 us.
Residual = 2.47485e-10.
Transposed solve average time = 12 us, min time = 11 us.
Residual = 2.44494e-10.
NNZ(L) = 9867, NNZ(U) = 7472.
Factorization flops = 133187, solve flops = 32283.
Determinent = 5.86668*10^(-3351).
Memory usage = 646989 bytes, max memory usage = 646989 bytes.

However, taking CKTSO as an example, when we run ./benchmark ourcircuitmatrix #nthreads, the output looks like this:
Analysis time = 0 us.
Factorization average time = 0 us, min time = 0 us.
Refactorization average time = 0 us, min time = 0 us.
Solve average time = 0 us, min time = 0 us.
Residual = 7062.9.
Transposed solve average time = 0 us, min time = 0 us.
Residual = 7062.9.
NNZ(L) = 0, NNZ(U) = 0.
Factorization flops = 106382044954745, solve flops = 10.
Determinent = 4.67441e-310*10^(6.91969e-310).
Memory usage = 1480 bytes, max memory usage = 61772 bytes.

I really don't know how to tune the numerous parameters in CKTSO or NICSLU. Could you please give me some advice on tuning the solver? Other popular direct solvers such as KLU and PARDISO solved all of our test cases without much tuning, so I'm really confused by this result.

AMSSWANGYIJIA · 2023-04-10T08:35:53Z

PS. Many diagonal elements are zero-valued.

chenxm1986 · 2023-04-10T08:42:22Z

It looks like this is not a solver tuning problem. An error occurred and the solver process was not performed at all, which is why the reported times are all 0 and the other values are random. Please check the return value of CKTSO_Analyze.

AMSSWANGYIJIA · 2023-04-10T08:55:11Z

The return value of CKTSO_Analyze = -3

AMSSWANGYIJIA · 2023-04-10T08:58:21Z

Our matrix data are in Matrix Market file format, with the header line
"%%MatrixMarket matrix coordinate real general"

chenxm1986 · 2023-04-10T09:02:48Z

-3 means invalid matrix. Does your matrix have duplicated entries (for example, two ai[i] entries with the same value, so that the two corresponding ax[i] values need to be accumulated)? This case is not allowed in CKTSO, and it is the most likely reason. The other possible reasons are all related to out-of-range values or a singular matrix (including ap[0]!=0, ap[n]<=0, ap[n]<n, any ap[i]<0, any ap[i]>ap[i+1], any ai[i]<0, or any ai[i]>=n).
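
The conditions listed above can be checked before the arrays are handed to the solver. The following is only a minimal sketch in Python, not CKTSO's own validation code: the function name `check_compressed` is hypothetical, and it assumes that `ap` holds n + 1 segment pointers and that duplicates are defined as repeated `ai` values within one `ap` segment (one row or one column, depending on the storage mode).

```python
def check_compressed(n, ap, ai):
    """Return a list of problems found in compressed arrays; empty means none.

    Hypothetical helper that mirrors the conditions named in the comment
    above; it is not part of CKTSO.
    """
    problems = []
    if len(ap) != n + 1:
        return [f"len(ap) is {len(ap)}, expected n + 1 = {n + 1}"]
    if ap[0] != 0:
        problems.append("ap[0] != 0")
    if ap[n] <= 0 or ap[n] < n:
        problems.append(f"ap[n] = {ap[n]} is too small (fewer than n entries)")
    for i in range(n):
        if ap[i] < 0 or ap[i] > ap[i + 1]:
            problems.append(f"ap is negative or decreasing at i = {i}")
    if ap[n] > len(ai):
        problems.append("ap[n] exceeds len(ai)")
    if problems:
        return problems  # pointers are unusable, so skip the per-segment checks

    for i in range(n):
        seen = set()  # indices already met in segment i
        for k in range(ap[i], ap[i + 1]):
            j = ai[k]
            if j < 0 or j >= n:
                problems.append(f"ai[{k}] = {j} out of range [0, {n})")
            elif j in seen:
                problems.append(f"duplicate index {j} in segment {i}")
            else:
                seen.add(j)
    return problems
```

An empty result only means that none of these particular conditions was hit; it does not guarantee that the solver will accept the matrix.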

chenxm1986 · 2023-04-10T09:11:17Z

Another possible cause is the code that reads the mtx file. The reader provided in the demos may not handle every case; it is simple and provided for demo purposes only. You may want to check whether the file was read correctly, or use your own reader.
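
For cross-checking a reader, a small independent parser can help. The sketch below is an assumption-laden illustration in Python, not the demo's reader: the name `read_mtx_coordinate` is hypothetical, and it only accepts the "coordinate real general" header, skips `%` comment lines, converts the 1-based Matrix Market indices to 0-based, and verifies the declared entry count. It deliberately keeps duplicated entries so they can be detected afterwards.

```python
def read_mtx_coordinate(stream):
    """Parse a 'coordinate real general' Matrix Market stream.

    Returns (nrows, ncols, triplets) with 0-based (row, col, value) triplets.
    Hypothetical helper for illustration only.
    """
    header = [tok.lower() for tok in stream.readline().split()]
    if header != ["%%matrixmarket", "matrix", "coordinate", "real", "general"]:
        raise ValueError(f"unsupported header: {header}")

    # Skip comment and blank lines until the size line "nrows ncols nnz".
    line = stream.readline()
    while line and (line.startswith("%") or not line.strip()):
        line = stream.readline()
    nrows, ncols, nnz = (int(tok) for tok in line.split())

    triplets = []
    for line in stream:
        if not line.strip():
            continue
        r, c, v = line.split()
        i, j = int(r) - 1, int(c) - 1  # Matrix Market indices are 1-based
        if not (0 <= i < nrows and 0 <= j < ncols):
            raise ValueError(f"index out of range: {line.strip()}")
        triplets.append((i, j, float(v)))

    if len(triplets) != nnz:
        raise ValueError(f"expected {nnz} entries, found {len(triplets)}")
    return nrows, ncols, triplets
```

It can be used with `open(path)` or, for a quick test, with an `io.StringIO` object wrapping the file text.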

AMSSWANGYIJIA · 2023-04-10T13:43:25Z

Dr. Chen, thank you for your kind help. The real problem was the duplicated entries. I fixed the bug and everything works fine now. Many of these matrix files are dumped for C or C++ scientific computing libraries that assemble a matrix in a so-called "Add_Values" mode, in which duplicate values are accumulated automatically, so this situation is very common. In my humble opinion, CKTSO could provide a simple tool to check such "problematic" Matrix Market files and transform them into valid ones.
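
The kind of accumulation described here can be sketched in a few lines. This is only an illustration in Python under stated assumptions, not the fix actually applied and not a CKTSO tool: the name `triplets_to_compressed` is hypothetical, the input is a list of 0-based (i, j, value) triplets for an n-by-n matrix, and the first index i is treated as the major (segment) index of the compressed output.

```python
def triplets_to_compressed(n, triplets):
    """Sum duplicated (i, j) entries and build compressed arrays ap, ai, ax.

    Hypothetical helper; i is used as the major index, so each ap segment
    holds the sorted j indices of one i.
    """
    acc = {}
    for i, j, v in triplets:
        # "Add_Values"-style assembly: repeated positions are accumulated.
        acc[(i, j)] = acc.get((i, j), 0.0) + v

    ap = [0] * (n + 1)
    ai, ax = [], []
    for (i, j) in sorted(acc):  # sorted by i, then by j
        ap[i + 1] += 1          # count entries per segment
        ai.append(j)
        ax.append(acc[(i, j)])
    for i in range(n):          # prefix sum turns counts into pointers
        ap[i + 1] += ap[i]
    return ap, ai, ax
```

The dictionary keeps the sketch short; for matrices with up to 1.3E+07 entries, a sort-based or in-place variant would likely be preferable.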

chenxm1986 · 2023-04-11T03:24:37Z

Yes, this is a practical requirement in some applications. We actually already have a simple internal tool that can remove duplicated entries. It needs to be called before every factorization, which adds some latency and memory usage, especially when the matrix is very sparse. I am now thinking about how to integrate this function into CKTSO to eliminate these overheads.
I have a question about your case. You mentioned that KLU and PARDISO can solve your matrices, but KLU does not allow duplicated entries either (I did not check PARDISO carefully). How did you handle duplicated entries in KLU?

AMSSWANGYIJIA · 2023-04-11T03:30:19Z

Well, the answer is quite simple: both KLU and MKL PARDISO were called through PETSc APIs, and the matrix was assembled using PETSc's own "Add_values-like" APIs.

AMSSWANGYIJIA · 2023-04-11T03:31:41Z

PETSc matrix APIs handle these duplicate entries automatically.

AMSSWANGYIJIA closed this as completed Apr 11, 2023
