Feature request: tip dates functionality #27
If I have a user with a use case: definitely! Could I use you for this? If yes, I'll take a look at how to implement this on Friday, October 19th 2018. If it is easy, I may add it that day. If it is too hard, I will give priority to getting babette accepted by rOpenSci and then put on CRAN.
No use case, so I worked on getting babette accepted by rOpenSci. If someone volunteers a use case, let me know.
Peter Durr has volunteered to help. 🎉
Email from Peter Durr and example files: [...] I appreciate that you probably only wanted some example files. Anyway, attached are three files which will give you - I trust - a good example on which to base the tip dating function within your beautier library:
The challenge I found was that creating the date file for BEAUti is very crude. For the file to be uploaded and used to build the height (the number of decimal years before the most recent common sample), the user must upload a file with two columns/fields:
The very restrictive format of the permitted upload file means that the upload often fails, with no error message explaining why. This is especially a problem with the requirement for tab-separation, as BEAUti does not accept a simple TSV export from Excel. Instead I needed to run it through various steps to get it to work, hence the "4" in the file's name!

In practice, because uploading a separate date file is so hard, all of the tutorials on producing a time-tree in BEAST I have seen use the tip-dating tool, which extracts the date from the fasta header. This does have the advantage that the fasta sequence file and the date file are always in the same order, which is a potential problem if the two files are uploaded separately. However, this puts the effort back into producing a complex header, with all the risk of introducing errors while manipulating the concatenation. I am also guessing that implementing this complex interface using R functions will be a lot of work for you, and would need a complex R function with lots of arguments.

So, thinking it through, I would like to recommend the following for babette/beautier: The input date file:
This, I think, will make the preparation of the date file very easy, but more importantly it would allow for some validation at import/parsing:
To make the above practical, I have attached as the fourth file a CSV date file exported from Excel containing just the GenBank accession ID and the year (date). [...]
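To make the validation idea in the email above concrete, here is a minimal Python sketch. It is not beautier's implementation (beautier is an R package), and the specific checks are only guesses at the kind of validation meant. It assumes a headerless two-column CSV of sequence ID and decimal year. The function names `read_tip_dates`, `heights` and `to_tab_separated`, and the accession IDs in the example, are made up for illustration.

```python
import csv
import io


def read_tip_dates(csv_text, fasta_ids):
    """Parse headerless 'id,date' CSV text and check it against the FASTA IDs.

    Assumed checks (hypothetical): two columns per row, numeric dates,
    no duplicate IDs, and exactly one date for every sequence.
    """
    dates = {}
    for line_no, row in enumerate(csv.reader(io.StringIO(csv_text)), start=1):
        if not row or all(not cell.strip() for cell in row):
            continue  # skip blank lines
        if len(row) != 2:
            raise ValueError(f"line {line_no}: expected 2 columns, got {len(row)}")
        seq_id, raw_date = row[0].strip(), row[1].strip()
        try:
            date = float(raw_date)
        except ValueError:
            raise ValueError(
                f"line {line_no}: date {raw_date!r} is not a number"
            ) from None
        if seq_id in dates:
            raise ValueError(f"line {line_no}: duplicate ID {seq_id!r}")
        dates[seq_id] = date
    missing = sorted(set(fasta_ids) - set(dates))
    if missing:
        raise ValueError("no date for sequence(s): " + ", ".join(missing))
    unknown = sorted(set(dates) - set(fasta_ids))
    if unknown:
        raise ValueError("date given for unknown ID(s): " + ", ".join(unknown))
    return dates


def heights(dates):
    """Height of each tip: decimal years before the most recent sample."""
    newest = max(dates.values())
    return {seq_id: newest - date for seq_id, date in dates.items()}


def to_tab_separated(dates, fasta_ids):
    """Render 'ID<TAB>date' lines in the same order as the FASTA file."""
    return "".join(f"{seq_id}\t{dates[seq_id]}\n" for seq_id in fasta_ids)


# Example with made-up accession IDs
ids = ["AB000001", "AB000002", "AB000003"]
dates = read_tip_dates("AB000001,2010\nAB000002,2015\nAB000003,2012.5\n", ids)
print(heights(dates))
print(to_tab_separated(dates, ids), end="")
```

Because the output is written in the order of the FASTA IDs, the two files cannot drift out of step, and a missing or malformed date raises an error that names the offending line or ID rather than failing silently.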
This is very helpful! I will add an argument called 'tip_dates' that requires a data frame. Let the parsing be done by the caller 🌈 [edit: will follow Peter's idea to use a filename instead]
Got halfway; will finish on the 16th (p = 25%), 23rd (p = 50%) or 30th (p = 99%) of November.
Done. Not tested to the bone, but I was able to reproduce the file supplied by Peter.
From @ksw9 at this Issue: