P2: Vector-matrix multiplication #44
Comments
…, vector-matrix multiplication), and P4 (#46, deflation). Not yet tested, but this is what an initial proof of concept looks like.
@magsol

    File "/home/targol/anaconda2/lib/python2.7/R1DL_Pyspark.py", line 216, in <module>
        S = S.apply(deflate, keepDType = True, keepIndex = True)
    TypeError: apply() got an unexpected keyword argument 'keepDType'

The log file is as follows:
Can you figure out what it means? iPhone'd
@magsol
No. Read the error message carefully. It's complaining about unrecognized parameter names. Check the thunder documentation and see if you can figure out how to fix it. iPhone'd
Ok, let me check it.
Actually, the problem was just a spelling error! In 'keepDType', the 'T' must be changed to a lowercase 't'. I'll correct it in the main file.
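The TypeError above can be reproduced with plain Python: keyword arguments are case-sensitive, so 'keepDType' and 'keepDtype' are different names. The signature below is a stand-in for illustration, not thunder's actual API:

```python
# Stand-in for thunder's Series.apply; the parameter names here are
# assumptions for illustration, not thunder's actual signature.
def apply(func, keepDtype=False, keepIndex=False):
    return func

try:
    # Wrong capitalisation ('T' instead of 't') raises a TypeError,
    # just like the traceback above.
    apply(lambda x: x, keepDType=True, keepIndex=True)
except TypeError as e:
    print(e)  # apply() got an unexpected keyword argument 'keepDType'

# Correct spelling works fine.
apply(lambda x: x, keepDtype=True, keepIndex=True)
```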
I've now tested the code again on the small test1 pattern, and the z there is much better than our previous z file! Now z.txt is a sparse matrix. I'm going to test the bigger data set; the results for the first small data set are as follows:
Excellent work!
@magsol -49.625731 -90.950085 -107.148851 -22.263390 -27.960949 -74.573206 -35.491131 -1.820312 106.864215 14.072931 171.595561 -93.018838 -3.851441 281.873055 212.104157 375.934794 -69.247916 -79.771974 -27.565432 335.760112 330.057942 255.645971 129.561707 23.689732 39.457266 338.431347 358.045253 -16.198390 211.919775 120.124855 66.542751 282.075863 378.395402 -94.307979 -2.779630 -11.584412 185.832728 279.141163 101.102970 -99.788754 -82.138987 99.249246 175.284746 101.319492 -94.943044 -29.128951 26.582609 -22.439812 16.184655 -30.774730 -42.659585 -28.481978 -76.469311 -137.889147 -69.109695 -74.959590 -93.705282 -121.603436 -149.070855 -55.650968 4.239743 -17.991413 -64.647887 -55.436329 -55.543341 -233.434969 -226.427454 -73.695304 -141.986671 -140.047461 -242.440411 -280.187721 -196.235706 89.043456 -22.907281 -11.296129 -80.976172 -138.241792 -352.324480 -125.427455 43.500121 -186.793748 -112.535951 -205.595161 -278.406738 -371.797682 -80.563537 48.026023 287.180729 178.378065 121.456420 87.679904 -109.481793 -114.439424 11.187516 282.435522 -78.271834 -78.662650 -222.487548 -393.253565 |
I'm not sure what that means. How does it compare to what we see in the milestone 2 output?
The answers to those two questions:
If the quality changes with the size (i.e. the results are better with
Still, we need more testing. I'm on the road again tomorrow, but almost
If you and Xiang could start working on unit tests, that would be great.
@magsol
Hmm, that's a good question. However the fact that the very nature of the
On Mon, Dec 28, 2015 at 2:44 PM MOJTABAFA notifications@github.com wrote:
@magsol

    File "/home/targol/spark-1.5.2-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/rdd.py", line 2089, in <genexpr>
    File "/home/targol/anaconda2/lib/python2.7/R1DL_Pyspark.py", line 26, in <lambda>
        .map(lambda x: np.array(map(float, x.strip().split("\t")))) \
    ValueError: could not convert string to float:

Moreover, it's really difficult and time-consuming for me to test on my laptop because of a lack of resources.
It looks like there's a non-float character that we're trying to cast to a float, e.g. float("?") or something like that. Nonetheless, I hear you loud and clear. I'm sorry I haven't had time to finish setting up my cluster, but that's still in progress. I should have some news for you today or tomorrow.
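One way to track down the offending token is to wrap the per-line parse and report exactly which string failed to convert. This is a debugging sketch; the tab-separated format is taken from the lambda in the traceback above:

```python
def parse_line(line):
    """Parse one tab-separated line into a list of floats; on failure,
    report exactly which token could not be converted."""
    tokens = line.strip().split("\t")
    values = []
    for tok in tokens:
        try:
            values.append(float(tok))
        except ValueError:
            # Re-raise with the bad token included, so the Spark logs
            # show which character caused the failure.
            raise ValueError("could not convert string to float: %r" % tok)
    return values  # the real pipeline wraps this in np.array(...)

print(parse_line("1.0\t2.5\t-3.0"))  # [1.0, 2.5, -3.0]
```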
This Spark primitive is a little trickier than #20. This is due to the fact that the matrix will be row-distributed, but in vector-matrix multiplication, the columns of the matrix are multiplied. Still, this can be done in a fairly straightforward manner.

- Broadcast the vector u to be multiplied, e.g. sc.broadcast(u).
- Run a .flatMap over the RDD: each row i emits its elements scaled by u[i], keyed by column index (hence .flatMap instead of map).
- .reduceByKey will then sum up the values for each key, which correspond to the elements of the resulting vector u.
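The steps above can be sketched locally without a Spark cluster. The Spark version would be roughly `rdd.flatMap(...).reduceByKey(add)` with u broadcast first; the function and variable names below are illustrative, not the project's actual code:

```python
from operator import add

def vector_matrix_multiply(indexed_rows, u):
    """Compute u^T * A, where A is a row-distributed matrix given as
    (row_index, row) pairs. Local sketch of the Spark pattern:
    broadcast u, flatMap rows into (column_index, partial_product)
    pairs, then reduceByKey(add)."""
    # "flatMap" stage: row i contributes (j, u[i] * A[i][j]) for every column j
    pairs = [(j, u[i] * a_ij)
             for i, row in indexed_rows
             for j, a_ij in enumerate(row)]
    # "reduceByKey(add)" stage: sum the partial products for each column index
    result = {}
    for j, v in pairs:
        result[j] = add(result.get(j, 0.0), v)
    return [result[j] for j in sorted(result)]

A = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
u = [1.0, 0.5, 2.0]
print(vector_matrix_multiply(list(enumerate(A)), u))  # [12.5, 16.0]
```

Each key in the reduce stage is a column index, so the summed values are exactly the entries of the output vector.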