NMF Question #3

Closed
mjbock opened this Issue · 6 comments

2 participants

@mjbock

I am hoping to use your NMF package for chemical fingerprinting. Basically, chemical data from a series of samples are used to determine the end-member compositions and the contribution of each end member to each sample. For the input matrix, rows are samples and columns are chemical concentrations. The concentrations are normalized, meaning each row sums to 1. The NMF output should exhibit closure, meaning the rows of h should sum to 1. I have been reviewing the source code and have been unable to determine whether an option is available to impose closure (rows of h sum to 1). Is this implemented? If not, do you have any advice on how best to attempt this myself? I would consider myself an intermediate R user.

Thanks for your time
Mike

@renozao
Owner

Hi Mike,

this type of constraint is not exactly implemented in any of the built-in algorithms, but I believe there are some NMF algorithms out there that allow for it. I can think of a few ways of implementing it:

  • apply the 'Frobenius' algorithm (or 'lee' if used for clustering) to the transposed problem with rescale = TRUE (the default), which imposes that the columns of W sum to one, and then transpose the result:
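library(NMF)
x <- rmatrix(20, 10)
# factorise the transposed problem; with rescale = TRUE (default) the columns of W sum to one
tmp <- nmf(t(x), 3, 'lee')
# transpose the model back, so the former W becomes the new H
res <- t(tmp)
# the rows of H should now sum to one
rowSums(coef(res))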
  • if this does not give you what you want, you could modify any NMF algorithm so that it scales the rows of H to sum to one and applies the inverse scaling to the columns of W, which does not change the objective value: W H = W D^-1 D H. To do so, you define your own update rule, e.g., based on the function nmf_update.lee_R, and define a new algorithm with it:
setNMFMethod('mynmf', 'Frobenius', Update  = function(i, v, x, ...){

  # add complete re-scaling here

  # return updated model
  x
}, overwrite = TRUE)

# you can now call
nmf(x, 3, 'mynmf')
  • add this as a "soft" constraint via a regularisation term that penalises row sums of H that are far from 1. This requires modifying the update rules for the entries of H.
  • add a hard constraint to the optimisation problem, and solve it as a constrained optimisation problem. This also requires modifying the update rules for the entries of H.
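
The rescaling identity from the second option can be checked on plain matrices. The following is only a minimal base R sketch, not code from the NMF package: `rescale_closure` is a hypothetical helper name, and W and H are random stand-ins for a fitted basis and coefficient matrix. Here D is the diagonal matrix holding the reciprocals of the row sums of H.

```r
# Hypothetical helper (not part of the NMF package): rescale a factorisation
# so that the rows of H sum to one while the product W %*% H is unchanged.
rescale_closure <- function(W, H) {
  s <- rowSums(H)              # row sums of H, i.e. the diagonal of D^-1
  stopifnot(all(s > 0))        # rescaling is undefined for an all-zero row
  list(
    W = sweep(W, 2L, s, "*"),  # W D^-1: multiply column j of W by s[j]
    H = sweep(H, 1L, s, "/")   # D H: divide row j of H by s[j]
  )
}

# Random non-negative stand-ins for a fitted W (20 x 3) and H (3 x 10)
set.seed(1)
W <- matrix(runif(20 * 3), nrow = 20)
H <- matrix(runif(3 * 10), nrow = 3)

scaled <- rescale_closure(W, H)
rowSums(scaled$H)                          # row sums after rescaling
all.equal(W %*% H, scaled$W %*% scaled$H)  # product before vs. after
```

Inside a custom update rule, a step like this would presumably be applied after each update of W and H, but how to read and write the factors on the model object depends on the NMF package internals and is not shown here.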

Please, let me know if this helped.

Renaud

@mjbock
@renozao
Owner
@mjbock

Not quite, left out a line:
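# X = input matrix
# k = number of end members
library(NMF)
library(MASS)  # provides ginv()

PMF <- nmf(X, k, method = 'lee', seed = 'nndsvd')
w <- basis(PMF)
h <- coef(PMF)
H.c <- w %*% h  # fitted (reconstructed) matrix
h2 <- sweep(h, 1L, rowSums(h), "/", check.margin = FALSE)  # rows of h2 sum to 1
w2 <- H.c %*% ginv(h2)  # basis matching the rescaled h2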

So this should just be a simple re-scaling of the PMF result. I found that the results closely match those obtained using Polytopic Vector Analysis, another receptor modeling technique. However, I will experiment with adding this type of rescaling into the optimization when I get some time and need a more robust result.
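
As a sanity check on the "simple re-scaling" reading, the pseudoinverse step can be compared with a direct column scaling of w. This is a sketch under stated assumptions rather than part of the original workflow: w and h are random stand-ins for basis(PMF) and coef(PMF), and the comparison relies on h having full row rank, so that h2 %*% ginv(h2) is the identity.

```r
library(MASS)  # provides ginv()

# Random non-negative stand-ins for basis(PMF) (20 x 3) and coef(PMF) (3 x 10)
set.seed(42)
w <- matrix(runif(20 * 3), nrow = 20)
h <- matrix(runif(3 * 10), nrow = 3)

fitted <- w %*% h                           # same role as H.c above
h2 <- sweep(h, 1L, rowSums(h), "/")         # rows of h2 sum to 1
w2_ginv <- fitted %*% ginv(h2)              # pseudoinverse route
w2_scaled <- sweep(w, 2L, rowSums(h), "*")  # direct route: scale column j of w by rowSums(h)[j]

all.equal(w2_ginv, w2_scaled)               # compare the two routes
all.equal(fitted, w2_scaled %*% h2)         # the fitted matrix is preserved
```

If the two routes agree, the column scaling avoids the pseudoinverse and the MASS dependency altogether. When h is rank deficient, the two routes can differ.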

Thanks for your help.

@renozao
Owner

I am closing this, but feel free to add more comments on the subject.

@renozao renozao closed this