TODO.txt

TODO
* Bug fix (kind of): if a user quits during the estimation of the starting tree, funny things
    happen. We need to put in some kind of check for this. Basically just make sure that
    if we're thinking of using old results, then we check that the starting tree file has
    something in it. If it doesn't, quit with an error message and ask the user to 
    use force-restart
* Usability: change the 'we aborted' error message to suggest trying force-restart. A lot of 
    user errors are fixed this way, there have been a few queries on email and on the user 
    group.
* Add RAxML windows executable and check it works.
* Make RAxML mac universal binary and check it works.
* Change name of clustering to 'strict-clustering' in the program and the manual
* Change name of greediest to 'relaxed-clustering' in the program and the manual
* Change name of greediest-cutoff to something better.

* Change --save-phyml to --save-phylofiles in the manual
* merge phyml_models.py and raxml_models.py into one, have a 'model' class, make models
   know to which lists they belong, and let them return their own command lines when asked
* change phyml folder to phylofiles folder

DONE


Brett test enhancing:
---------------------

TODO 
* change name back to full_name in subset reporting
* Only makes sense to report ONE IC in the greedy algo output...
* make progress part of the reporter too
* change name back to full_name in subset reporting
* think about using __slots__ in subset and scheme?
* Consider dropping part_subsets in schemes (only need it for checking...)
* think about ditching schemes as we go, and subset info?


DONE
* Only makes sense to report ONE IC in the greedy algo output...
* make progress part of the reporter too
* Make the output match (why aren't schemes numbered the same...)
* move path creation into config
* make results member of scheme? or keep a paired list
* put all of the summarising of results info (finding best) into the AnalysisResults
* move subset writing out into reporter
* finish making reports fleshed out
* Final wiritng takes Analysis results
* think about reporter, how should it look? with config?
* USE Analysis results to produce a test_object_compartor.
* The write methods for comparining that into AnalysisResults (with epsilon differences)
* Do a folder diff on example folder, making sure that these changes
*


Done
____
* add 'clustering' search option to the manual
* add --additional_cmd_line option so that we can pass e.g. precision controllers to PhyML and RAxML
  and we could also then use the RAxML Pthreads version, and tell it how many processors to use.
  helpful for massive datasets on desktops I suspect.
* when we fail to load output from RAxML, we need to go and delete ALL files that contain the 
  run_id in question. Right now we're not doing this properly so running stuff again is failing...
* if we're not saving phylofiles, we need to do a better job deleting RAxML files more generally, right now 
  stuff is stacking up in the phyml folder. Best way to do that is as above, and maybe we should just add a function
  into phyml.py and raxml.py, something like delete_analysis_files(run_ID). That way we can keep things in check.
- add --raxml to manual, include all the model definitions
- the numbers of columns are wrong when we warn that there are missing columns. Need to add 1 to the final number.
- we don't save phyml output by default now, but we have the option to
- we have universal line break support now
- flag a warning if ANYTHING in the .cfg file has changed. Do this by putting a copy of the .cfg file in the 'analysis' folder, loading it and comparing it to the current one
- fixed all scheme and subset number calculators
- mrbayes and raxml now options for model spec in the config file
- Get better .cfg parsing, so that we get useful error messages
- semicolons now needed, fixes some bugs in .cfg input
- add a more efficient Bell-number counter when counting schemes
- Use all CPU's default (rather than 1)
- Only use the user schemes if 'search' == 'user'
- check that Brett's implementation of source alignment and filtered alignment works.
- build initial tree and branchlengths NOT from the source alignment, but from the un-partitioned alignment. These can be different e.g. if I exclude some sites from all of my partitions. This is really important.
- Ordered scheme output in partitions in codon order (in scheme.py.write_summary)
- implemented Greedy search algorithm. Right now it works on whatever metric you specify in model_selection
- tidied up output of all_schemes.txt, best_schemes.txt, and the subset output files too
- Added model_selection option, AIC, AICc, or BIC now implemented
- Added branch_lengths option 'linked' or 'unlinked'
- Fixed calling of PhyML so that it actually constrains brlens
- PhyML constrained_lens option fixed by Stephane
- Added an error message if you specify a partition with sites that aren't in the alignment
- Fixed partition definition to include single sites
- optional charset at beginning of partition def