Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error in optim(init, fn = loglik_per_TF, gr = deriv_per_TF, lower = 10^(-16), : L-BFGS-B needs finite values of 'fn' #1

Closed
rdbcasillas opened this issue Nov 20, 2019 · 6 comments
Labels

Comments

@rdbcasillas
Copy link

Hello,

Thanks for creating BiFET.

After creating the two GRanges objects as specified in the tutorial (only difference being that I used HINT instead of PIQ), I came across this error when running the following command.

> result <- calculate_enrich_p(input1_GR, input2_GR)
[1] "PWM 1 out of 1 PWMs"
Error in optim(init, fn = loglik_per_TF, gr = deriv_per_TF, lower = 10^(-16),  : 
  L-BFGS-B needs finite values of 'fn'

I have verified that my inputs look exactly like the inputs GRmotif and GRpeaks provided with the package. The two inputs can be downloaded from this dropbox link.

Any help is greatly appreciated. Thanks!

@nlawlor
Copy link
Collaborator

nlawlor commented Nov 20, 2019

Hi rdbcasillas,

Thanks for your question and for using BiFET! It looks like the input footprinting calls file that you provided are for only one transcription factor motif "KLF6". Currently, BiFET must be supplied with >1 unique transcription factor motifs/PWMs as the function calculate_enrich_p first calculates a "TF binding matrix M where i_th row represents i_th PWM and j_th column represetns j_th peak".

I hope this helps :)
Nathan

@rdbcasillas
Copy link
Author

Hi Nathan,

Thanks for responding. I added another TF STAT3 but I am getting the same error. Here is what I get this time:

result <- calculate_enrich_p(input1_GR, input2_GR)
[1] "PWM 1 out of 2 PWMs"
Error in optim(init, fn = loglik_per_TF, gr = deriv_per_TF, lower = 10^(-16),  : 
  L-BFGS-B needs finite values of 'fn'

I have updated the dropbox link with new data.

@nlawlor
Copy link
Collaborator

nlawlor commented Nov 21, 2019

Hmm, that's strange. I'm able to run the calculate_enrich_p function using the package test data with 2 and even 1 TF, so please disregard my previous comments about that.

Looks like this could be an issue with the parameters in the optimization function within calculate_enrich_p similar to what is shown here: Stack Overflow

@youna2 Can you please have a look at the provided data?

@nlawlor
Copy link
Collaborator

nlawlor commented Nov 21, 2019

Hi rdbcasillas,

I revisited your input peak file and noticed that your GC content values seemed strange.

# range of GC content values in peaks
range(in1$GC)
[1] 0.000000 0.213918
# how many peaks have a GC content of 0
length(which(in1$GC == 0))
[1] 55433

I'm not sure what species your peak data originated from, but it seems strange that your max GC content is ~21% (seems low; I think the average GC content in human genomes is usually between 40-60%) and that there are 50k peaks with a GC content of exactly zero. Perhaps there were some issues in calculating GC content in your input peak file?

Either way, this is an issue the developers and I will address so that users do not encounter the same error.

For now, a quick fix can be to remove the peaks in your dataset with a GC content equal to zero:

# load input files
in1 <- readRDS("Downloads/BiFet files/input1.rds")
in2 <- readRDS("Downloads/BiFet files/input2.rds")
# identify and remove peaks in peak dataset with GC content equal to zero
id_high <- which(in1$GC > 0)
in1_sel <- in1[id_high]
res <- calculate_enrich_p(in1_sel, in2)

Please give this a try and let me know if you still get the same error!

@rdbcasillas
Copy link
Author

Removing rows with GC content equal to zero worked! Thanks Nathan.

I'm not sure what species your peak data originated from

I am playing with mouse data. The peak file was generated by MACS2. I will have to check with other datasets if this low GC content is an anomaly or not.

@nevelsk90
Copy link

Hi,

Thanks for making the tool,

Unfortunately I get the same error even when GC content is not equal zero.
Please download R object with input files using the link : https://drive.google.com/open?id=19HZIcR5CGBwDod-Ht0jSnHBsYynGPhJB

I should say that the file containing instances of motifs (GRmotif) is NOT a result of footprint calling but of motif scanning with FIMO. But if I understand correctly BiFET should be able to peocess these results as well.

Thank you in advance and regards
TIm

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants