Handling missing data in decomposition #4
Comments
Nice posts. Would you like to create a pull request to add support for missing values? Currently, Robust Tensor PCA handles missing values but Tucker and CP don't.
I'd love to do a PR. Sadly, at this point in time, I don't know where the masking would need to be done in the code. Would you have a clue? If the factorisation can be reduced to least squares, this should be trivial.
Just FYI - I have (scipy/numpy) code that handles this (see link below). Agree it would be a nice addition to tensorly! https://github.com/ahwillia/tensortools/blob/master/tensortools/least_squares.py
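For context, the masked ("censored") least-squares idea behind that code can be sketched in plain NumPy. This is an illustrative sketch, not tensorly's API — the function name and setup are mine:

```python
import numpy as np

def censored_lstsq(A, B, M):
    """Solve min_X ||M * (A @ X - B)||_F^2 where M is a boolean mask on B.

    Each column of B may have a different set of observed rows, so each
    column of X is its own least-squares problem restricted to those rows.
    """
    X = np.empty((A.shape[1], B.shape[1]))
    for j in range(B.shape[1]):
        rows = M[:, j]                 # observed rows for column j
        X[:, j] = np.linalg.lstsq(A[rows], B[rows, j], rcond=None)[0]
    return X

# Sanity check on noiseless data: the observed rows determine X exactly.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 3))
X_true = rng.standard_normal((3, 4))
B = A @ X_true
M = rng.random(B.shape) > 0.3          # ~70% of entries observed
X_hat = censored_lstsq(A, B, M)
print(np.allclose(X_hat, X_true))      # True
```

In an ALS-style CP or Tucker solver, each factor update is a least-squares problem of this form against the unfolded tensor, so masking the unfolded entries in the same way is one route to missing-value support.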
Agreed, let's make that happen! Robust tensor PCA in TensorLy already handles missing values; ideally this should be the case for all decompositions.
Hi! Has it been worked on? If not, I would like to start working on it.
You're welcome to take a crack at it @ShivangiM!
Hi @ShivangiM, any luck with this?
@JeanKossaifi not yet, been busy lately.
Hi all, I am wondering whether robust_pca handles missing values as intended. I understand the requested format for the missing-value mask, but it seems that in the underlying data array X the missing entries cannot be NaN, so you have to put some numerical value at the missing points in X. However, I have noticed that the results are sensitive to the particular numerical value used at those points, which I think cannot be the intended behaviour? Is there an assumed value that missing points must have? Sorry for the lack of code, I am on a mobile as I am not allowed GitHub at work!
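For what it's worth, a common convention is to build the mask from the NaNs and zero-fill the unobserved entries before calling the decomposition. If the solver truly respects the mask, the fill value should not matter at all; the sensitivity described above suggests it currently leaks in. This NumPy snippet is only illustrative preprocessing, not tensorly code:

```python
import numpy as np

# Build a boolean observation mask from NaNs, then zero-fill the holes so
# the data array contains only finite values. Any finite fill value should
# be equivalent if the algorithm genuinely ignores masked entries.
X = np.array([[1.0, np.nan],
              [3.0, 4.0]])
mask = ~np.isnan(X)                 # True where the data is observed
X_filled = np.where(mask, X, 0.0)   # replace NaNs with a neutral value
print(X_filled)                     # [[1. 0.] [3. 4.]]
```

`X_filled` and `mask` (cast to 0/1 if required) are then what gets passed to the masked decomposition.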
Hi, I am trying to use CP decomposition on my experimental data (part of the data is missing), and I noticed that the parafac function provides a "mask" parameter for handling missing values. It works when the tensor is 2-dimensional, e.g. tl.tensor([[1., 2.], [3., 4.]]), with a mask array of the same shape as the tensor whose values are 0/1. However, when I repeat this with a higher-dimensional tensor, e.g. tl.tensor([ [[1., 2.], [3., 4.]], [[5., 6.], [7., 8.]] ]), it doesn't work. The error is as follows:
It seems that the direct problem is in
I don't really understand it. What should I do to achieve my goal? I would be grateful for help, ideally with a code example. Thanks very much for anyone's suggestions.
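Independent of whatever is failing inside parafac, one generic workaround for missing entries is EM-style imputation: fit the model on a filled-in array, replace the missing entries with the model's reconstruction, and repeat. Here is a minimal NumPy sketch on a matrix, using a rank-1 truncated SVD as a stand-in for the decomposition (the same loop shape works with a tensor and parafac):

```python
import numpy as np

rng = np.random.default_rng(1)
u = rng.standard_normal(8)
v = rng.standard_normal(5)
X_true = np.outer(u, v)                   # exactly rank-1 ground truth
mask = rng.random(X_true.shape) > 0.2     # ~80% of entries observed
X = np.where(mask, X_true, 0.0)           # start by zero-filling the holes

for _ in range(100):
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    X_hat = s[0] * np.outer(U[:, 0], Vt[0])   # rank-1 reconstruction
    X = np.where(mask, X_true, X_hat)         # re-impute the missing entries

err = np.abs(X - X_true).max()
print(err)   # shrinks toward 0 as the imputations converge
```

This avoids the mask code path entirely, at the cost of running the decomposition many times.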
@milanlanlan I would suggest opening a separate issue report for this. It looks like
Fixed by #173 |
Hi, One of the use cases of matrix and tensor factorization is movie recommendation, where the matrix/tensor is sparse. I tried TensorLy with missing data and it fails.
I was wondering if handling of missing data could be added to the decomposition routines. I wrote a couple of blog posts on how to handle missing entries in matrix factorization, both with a least-squares-based implementation and with gradient-descent-based solutions.
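The gradient-descent flavour mentioned above can be sketched in a few lines of NumPy: only the observed entries contribute to the loss ||M * (X - U V^T)||_F^2. Everything here (rank, learning rate, iteration count) is illustrative, not a tuned implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 10, 6, 2
X = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))  # rank-2 data
M = rng.random((m, n)) > 0.25          # ~75% of entries observed
U = 0.1 * rng.standard_normal((m, r))  # small random initialisation
V = 0.1 * rng.standard_normal((n, r))

loss0 = np.linalg.norm(M * (U @ V.T - X))   # loss at the near-zero init
lr = 0.02
for _ in range(5000):
    R = M * (U @ V.T - X)              # residual on observed entries only
    # simultaneous gradient step on both factors
    U, V = U - lr * (R @ V), V - lr * (R.T @ U)

loss = np.linalg.norm(M * (U @ V.T - X))
print(loss)    # far smaller than loss0 once the factors have fit the data
```

The least-squares variant replaces the gradient step with an exact masked solve for one factor at a time while the other is held fixed.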