# NMF Question #3

Closed · opened Dec 1, 2013 · 6 comments · 2 participants

### mjbock commented Dec 1, 2013

I am hoping to use your NMF package for chemical fingerprinting. Basically, chemical data from a series of samples is used to determine the end-member compositions and the contribution of each end member to each sample. For the input matrix, rows are samples and columns are the chemical concentrations. The concentrations are normalized, meaning the sum of each row is 1. The NMF output should exhibit closure, meaning the rows of `h` should sum to 1.

I have been reviewing the source code and have been unable to determine if an option is available to impose closure (rows in `h` sum to 1). Is this implemented? If not, any advice on how best to attempt this myself? I would consider myself an intermediate R user.

Thanks for your time,
Mike
Owner

### renozao commented Dec 11, 2013

Hi Mike, this type of constraint is not exactly implemented in any of the built-in algorithms, but I believe there are some NMF algorithms out there that allow for this. I can think of three ways of implementing it:

1. Apply the 'Frobenius' (or 'lee' if used for clustering) algorithm to the transposed problem, with `rescale = TRUE` (the default); this imposes that the columns of W sum up to one. Then transpose the result:

   ```
   x <- rmatrix(20, 10)
   tmp <- nmf(t(x), 3, 'lee')
   res <- t(tmp)
   rowSums(coef(res))
   ```

   If this does not give you what you want, you could modify any NMF algorithm so that it scales the rows of H to sum up to one and applies the inverse scaling to the columns of W, which does not change the actual objective value: W H = W D^-1 D H. So you define your own update rule, e.g., based on the function `nmf_update.lee_R`, and define a new algorithm with it:

   ```
   setNMFMethod('mynmf', 'Frobenius',
       Update = function(i, v, x, ...){
           # add complete re-scaling here
           # return updated model
           x
       }, overwrite = TRUE)
   # you can now call nmf(x, 3, 'mynmf')
   ```

2. Add this "soft" constraint via a regularisation term which penalises having the row sums of H far from 1. This requires modifying the update rules for the entries of H.

3. Add a constraint to the optimisation problem, and solve it as a constrained optimisation problem. This also requires modifying the update rules for the entries of H.

Please let me know if this helped.
Renaud
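The rescaling identity W H = (W D^-1)(D H) behind option 1 can be checked numerically in plain R, without the NMF package. In this sketch `w` and `h` are random stand-ins for the factors, not actual `nmf()` output:

```r
# Rescale the rows of H to sum to 1 and apply the inverse scaling to the
# columns of W; the product W H is unchanged: W H = (W D^-1)(D H),
# where D = diag(1 / rowSums(H)).
set.seed(1)
w <- matrix(runif(20 * 3), 20, 3)   # stand-in basis matrix (20 x 3)
h <- matrix(runif(3 * 10), 3, 10)   # stand-in coefficient matrix (3 x 10)

s  <- rowSums(h)
h2 <- h / s                          # divides row i of h by s[i]
w2 <- w * rep(s, each = nrow(w))     # equivalent to w %*% diag(s)

stopifnot(all.equal(rowSums(h2), rep(1, 3)))  # closure holds
stopifnot(all.equal(w %*% h, w2 %*% h2))      # objective value unchanged
```

This is why the rescaling can be applied inside the update step without affecting the objective: only the scaling between the two factors changes, not their product.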

### mjbock commented Jan 14, 2014

I apologize for my late reply; my original reply got hung up in our e-mail system. I utilized a modified approach that worked quite well. Rather than modify any of the NMF methods, I created a small procedure that scales the rows of `h` to one and calculates `w`:

```
PMF <- nmf(X, k, method = 'lee', seed = 'nndsvd')
w <- basis(PMF)
h <- coef(PMF)
h2 <- sweep(h, 1L, rowSums(h), "/", check.margin = FALSE)
w2 <- H.c %*% ginv(h2)
```

That works like a charm and allows me to change the NMF calculation methods and still get what I want. An embarrassingly simple solution.

Thanks,
Mike

### renozao commented Jan 15, 2014

Sure. I guess you meant `X` instead of `H.c` in your sample code. Some comments though:

- You have to be aware that it is unlikely that the factorization you get (w2, h2) is a -- local -- optimum of the constrained optimisation problem you want to solve. In this case, w2 is optimal for the sub-problem min_w |x - w h2| given h2, but probably not for the complete problem min_{w, h} |x - w h|. Moreover, you may even get negative values in w2.
- Even the rescaling I suggested (within the update step), which is commonly performed in NMF algorithms, sort of messes with the optimality conditions, although it does not affect the objective value per se.

### mjbock commented Jan 15, 2014

Not quite, I left out a line:

```
# X = input matrix
# k = number of end members
PMF <- nmf(X, k, method = 'lee', seed = 'nndsvd')
w <- basis(PMF)
h <- coef(PMF)
H.c <- w %*% h
h2 <- sweep(h, 1L, rowSums(h), "/", check.margin = FALSE)
w2 <- H.c %*% ginv(h2)
```

So this should just be a simple re-scaling of the PMF result. I found that the results closely match those obtained using Polytopic Vector Analysis, another receptor modeling technique. However, I will experiment with adding this type of rescaling into the optimization when I get some time and need a more robust result. Thanks for your help.
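This post-hoc procedure can be sketched end to end without running `nmf()` itself, using random non-negative matrices in place of `basis(PMF)` and `coef(PMF)`; only `MASS::ginv` is needed beyond base R:

```r
library(MASS)  # for ginv (Moore-Penrose generalised inverse)

set.seed(42)
w <- matrix(runif(20 * 3), 20, 3)   # stand-in for basis(PMF)
h <- matrix(runif(3 * 10), 3, 10)   # stand-in for coef(PMF)
H.c <- w %*% h                      # reconstructed data matrix

# Impose closure: scale each row of h to sum to 1, then recompute w
h2 <- sweep(h, 1L, rowSums(h), "/", check.margin = FALSE)
w2 <- H.c %*% ginv(h2)

stopifnot(all.equal(rowSums(h2), rep(1, 3)))             # closure holds
stopifnot(all.equal(w2 %*% h2, H.c, tolerance = 1e-6))   # reconstruction preserved
```

The second check works here because `h2` has full row rank, so `ginv(h2) %*% h2` projects onto the row space of `h` and leaves `H.c = w %*% h` unchanged; as noted in the next comment, this does not guarantee `w2` is optimal, or even non-negative, in general.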

### renozao commented Jan 15, 2014

I wonder if this is not equivalent to doing the simpler rescaling:

```
d <- diag(1 / rowSums(h))
h <- d %*% h
w <- w %*% solve(d)
```

(note `solve(d)`, the matrix inverse, rather than the elementwise `1/d`, which would turn the off-diagonal zeros into `Inf`), since, in matrix terms, we have:

    w2 = H.c h2^-1 = H.c (d h)^-1 = w h (h^-1 d^-1) = w d^-1

This is strictly true if h is invertible (which is not the case here), but does not hold in general for the generalised inverse. However, the special form of d (diagonal positive) may still make it work.
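A quick numeric check suggests the conjecture does hold when `h` has full row rank (the generic case for k < n): then `ginv(d %*% h) = ginv(h) %*% solve(d)` and `h %*% ginv(h)` is the identity, so the ginv-based rescaling collapses to `w %*% solve(d)`. A minimal sketch with random stand-in matrices:

```r
library(MASS)  # for ginv

set.seed(7)
w <- matrix(runif(20 * 3), 20, 3)
h <- matrix(runif(3 * 10), 3, 10)   # full row rank with probability 1
H.c <- w %*% h

d  <- diag(1 / rowSums(h))
h2 <- d %*% h                       # rows of h2 sum to 1

w2.ginv <- H.c %*% ginv(h2)         # rescaling via the generalised inverse
w2.diag <- w %*% solve(d)           # the simpler diagonal rescaling

stopifnot(all.equal(w2.ginv, w2.diag, tolerance = 1e-6))
```

A single numeric example is of course not a proof, but the full-row-rank argument above shows why the two agree whenever `h %*% t(h)` is invertible.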


### renozao commented Mar 20, 2014

I am closing this, but feel free to add more comments on the subject.


### Aashilbatavia referenced this issue Nov 22, 2017


#### Error when running NMF #101
