Hello,
I am noticing that the speedup is very poor once the number of processors
increases beyond 4. The problem is easy to see: the serial Cholesky
factorization (not the PICF) used in the IPM always takes the same amount of
time for a fixed problem size, independent of the number of processors. If
this step were done in parallel, the speedup might improve.
Here are the results I got on a cluster of 8-core Xeon nodes connected by an
InfiniBand network.
CPUS    TIME   TRAIN      IO     ICF     IPM  B EST.     OUT     EFF   SpdUp
   2  16.507  13.684   1.547   3.529   3.344   6.810   1.277   1.000   2.000
   4  13.955  12.398   0.755   3.197   2.749   6.451   0.803   0.591   2.366
   8  12.015  10.929   0.515   2.466   2.490   5.974   0.570   0.343   2.748
  16   6.425   5.437   0.529   1.137   2.356   1.944   0.459   0.321   5.138
  24   5.082   4.181   0.163   0.512   2.396   1.272   0.739   0.271   6.496
  32   5.684   4.269   1.050   0.968   2.307   0.994   0.365   0.182   5.808
Also, there seem to be some issues if the input file gets too large:
the program never gets through the PICF.
Thanks,
Patrick Nichols
Original issue reported on code.google.com by PatJNichols@gmail.com on 5 Aug 2008 at 5:44
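The SpdUp and EFF columns in the table above can be reproduced from the TIME column. The following is a minimal sketch, not part of the original report. It assumes the 2-CPU run is the baseline and is counted as a speedup of 2.000, which is inferred from the table and not stated by the reporter. The function name and the `ipm_share` value are illustrative only.

```python
# (cpus, total TIME, IPM time) copied from the table in the report.
runs = [
    (2, 16.507, 3.344),
    (4, 13.955, 2.749),
    (8, 12.015, 2.490),
    (16, 6.425, 2.356),
    (24, 5.082, 2.396),
    (32, 5.684, 2.307),
]


def speedup_and_efficiency(runs):
    """Return (cpus, speedup, efficiency, ipm_share) for each run.

    Assumption: the first run is the baseline and its speedup is
    defined to equal its CPU count (2.000 for the 2-CPU run).
    """
    base_cpus, base_time, _ = runs[0]
    rows = []
    for cpus, total, ipm in runs:
        speedup = base_cpus * base_time / total
        efficiency = speedup / cpus
        ipm_share = ipm / total  # fraction of wall time spent in the IPM
        rows.append((cpus, speedup, efficiency, ipm_share))
    return rows


rows = speedup_and_efficiency(runs)
for cpus, speedup, efficiency, ipm_share in rows:
    print(f"{cpus:>4} {speedup:6.3f} {efficiency:6.3f} {ipm_share:6.1%}")
```

Because the IPM time stays between roughly 2.3 and 3.3 seconds while the total time shrinks, its share of the run grows with the CPU count, which is consistent with the reporter's explanation of the poor scaling.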
Hi Patrick,
What is your dataset size, and what value did you use for -rank_ratio?
For now, Cholesky Factorization is serial because it usually works on
a small matrix. In our experiment on the RCV 800k dataset, we set rank_ratio
to 0.001, so the matrix that CF works on is 800*800. I suspect you set
rank_ratio to a large value, which may cause the bad speedup. You could
decrease rank_ratio and try again.
In fact, we did consider Parallel Cholesky Factorization, but it would be
even slower on distributed machines because it requires a lot of communication.
For most problems, rank_ratio is chosen so that the matrix CF works on is small.
Original comment by baihong...@gmail.com on 19 Aug 2008 at 8:22
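To illustrate the reply above: the side length of the matrix handled by the serial Cholesky step is the dataset size times rank_ratio (800k * 0.001 = 800 in the RCV example). The sketch below uses hypothetical helper names that are not part of PSVM, and it assumes a dense Cholesky factorization, whose cost grows with the cube of the matrix dimension.

```python
def cf_matrix_dim(num_samples, rank_ratio):
    """Side length of the matrix the serial Cholesky step works on.

    Hypothetical helper: follows the relation given in the reply,
    dim = num_samples * rank_ratio, rounded to an integer.
    """
    return max(1, round(num_samples * rank_ratio))


def relative_cf_cost(dim, reference_dim=800):
    """Cost of a dense Cholesky factorization relative to a reference size.

    Assumes the usual cubic scaling of dense Cholesky, so doubling the
    dimension multiplies the work by about eight.
    """
    return (dim / reference_dim) ** 3


dim = cf_matrix_dim(800_000, 0.001)
print(dim)                         # dimension for the RCV 800k example
print(relative_cf_cost(2 * dim))   # cost after doubling rank_ratio
```

Since this step is not parallelized, any growth in its cost adds directly to the wall time at every processor count.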
Thanks,
That seems to help the speedup drastically. One quick question: I noticed that
the resulting threshold/bias for my training data set seems to change with
different rank_ratio parameters. My naive impulse is to assume that this is
bad. Is this true?
Pat
Original comment by PatJNichols@gmail.com on 23 Aug 2008 at 10:34
The result changes because the Interior Point Method requires an approximation
to make the problem solvable, and -rank_ratio controls this approximation.
Generally, the larger rank_ratio is, the better the result, but there is a
trade-off between time and accuracy.
Setting rank_ratio so that number_of_data * rank_ratio = 1000 is generally enough.
Original comment by baihong...@gmail.com on 24 Aug 2008 at 7:31
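The rule of thumb in the last comment can be turned around to pick a rank_ratio for a given dataset. This is a small sketch with a hypothetical function name. The target of 1000 comes from the comment, and capping the ratio at 1.0 for small datasets is an assumption made here.

```python
def suggested_rank_ratio(num_samples, target_rank=1000):
    """Pick rank_ratio so that num_samples * rank_ratio is about target_rank.

    Hypothetical helper: the cap at 1.0 is an assumption for datasets
    with fewer samples than the target rank.
    """
    if num_samples <= 0:
        raise ValueError("num_samples must be positive")
    return min(1.0, target_rank / num_samples)


print(suggested_rank_ratio(800_000))
print(suggested_rank_ratio(500))
```

The returned value would then be passed as the -rank_ratio option mentioned in the thread.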