normalization_notebooks #2

Open
wants to merge 3 commits into base: main
Conversation

caglayantuna
Member

Comparison of possible normalization methods

This PR is related to issue #264. Following our discussion, I ran some experiments to compare possible normalization methods for tensor decomposition, and I am sharing the notebooks so we can settle on the normalization method for tensorly.

In our PR #281, we suggested normalizing the factors after the inner iteration by using the cp_normalize function in non_negative_parafac, non_negative_parafac_hals and parafac. As we discussed, there are several options for normalizing the factors and no single established way in the literature.

To normalize the factors during the inner iteration, we suggest normalizing the last factor after the error computation, since the weights are used to compute the MTTKRP and are removed from the iprod computation. The following normalization methods are applied to an audio dataset and a hyperspectral satellite image (see the sketch after this list):

  1. Normalization at each outer loop (PR)
  2. Normalization at each inner loop iteration
  3. Normalization at the very end
  4. Options 2 and 3 together

In addition, the parafac experiments include the function's own built-in normalization.
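To make the options concrete, here is a minimal sketch (not the notebook code) of option 3 using tensorly's cp_normalize, which pulls the column norms of every factor into the weights; the random tensor and the rank are illustrative only.

```python
import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac
from tensorly.cp_tensor import cp_normalize

# Illustrative data: a small random third-order tensor.
tensor = tl.tensor(np.random.rand(20, 30, 40))

# Option 3: run the solver untouched, then normalize once at the very end.
cp = parafac(tensor, rank=5, normalize_factors=False)
weights, factors = cp_normalize(cp)

print(weights)                                        # one scale per component
print([np.linalg.norm(f, axis=0) for f in factors])   # columns are now unit-norm
```

Options 1 and 2 call the same normalization inside the outer or inner loop, respectively, and option 4 combines options 2 and 3.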

Conclusion

In the experiments, we show the weights and the average of the factors to assess numerical stability. We also report the processing time and RMSE for each experiment. In short, there is no significant difference between the methods, so we can select one of them to update our PR and tensorly.

@JeanKossaifi
Member

Thanks @caglayantuna ! This is super useful / interesting.
Would it make sense to try it on random tensors with different magnitudes?
I'm also curious to see the weights over time.

In regular CP I noticed that, after normalization, some of the weight values were very large, which can be problematic, especially if we use lower-precision dtypes (e.g. float32, float16, etc.).
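A small illustrative check along these lines (not taken from the notebooks; the sizes and rank are arbitrary) would be to decompose random float32 tensors at different magnitudes and print the recovered weights:

```python
import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac

for scale in (1e-3, 1.0, 1e3):
    # Random tensor at a given magnitude, in lower precision.
    tensor = tl.tensor(scale * np.random.rand(20, 20, 20).astype(np.float32))
    weights, _ = parafac(tensor, rank=3, normalize_factors=True)
    print(scale, weights)
```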

@cohenjer

Let's run a few more tests, including random tensors, then chat about which method to keep in tensorly, and then change all the relevant functions to use the same normalization style.

Personally, I don't think the way we normalize will impact the results' stability much (at least I have not observed it in practical applications). On the other hand, normalization has a very small computational cost, so it cannot hurt to overdo it a little. I think we could keep solution 1 (normalize all factors after each outer loop), always using the CPTensor.normalize routine (in tensorly we sometimes use dedicated code; this PR also fixes that). The key point is that we propose keeping the scaling stored in the weights variable, which means the least-squares update in ALS will not load all the energy onto each factor at each inner update.
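For reference, a hedged sketch of what this normalization step does (mirroring cp_normalize; the function below is illustrative, not tensorly's implementation): every factor's columns are rescaled to unit norm and the norms are folded into the weights, so the reconstruction is unchanged while no single factor carries all the energy.

```python
import numpy as np

def normalize_into_weights(weights, factors, eps=1e-12):
    """Rescale each factor's columns to unit norm, folding the norms into weights."""
    for i, factor in enumerate(factors):
        norms = np.linalg.norm(factor, axis=0)
        weights = weights * norms
        factors[i] = factor / np.maximum(norms, eps)  # guard against zero columns
    return weights, factors

# Option 1 calls this once per outer ALS iteration; option 2 would call it
# after every per-mode (inner) update instead.
```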

@caglayantuna
Member Author

I have added new notebooks with random tensors of different magnitudes. I have also increased the number of experiments to see whether there is any difference in the variance of the error or the processing time in the other notebooks. Across all these experiments, we do not see any significant difference between the normalization methods. I will wait for you to select one of the methods so I can update the PR.
