Make a List of Reusable Classes/Functions from Mothur #4

Closed
azmfaridee opened this Issue Apr 30, 2012 · 12 comments

3 participants
Owner

azmfaridee commented Apr 30, 2012

Mothur is a complex piece of software and has a lot of classes and functions that can be reused rather than reinvented again and again. We need to make a list of them and decide how they can be reused.

Child Issues: #5

@ghost ghost assigned kdiverson Apr 30, 2012

Collaborator

kdiverson commented Apr 30, 2012

You could also outline what functions you'll need and then see how many of them already exist. You'll want to talk to Sarah (https://github.com/mothur-westcott) about this.

Owner

azmfaridee commented Apr 30, 2012

@kdiverson: That is a good idea.
@mothur-westcott: Sarah, give us a buzz if you are getting notifications from this tagging.

I'm going to create more issues to break the complex tasks down into atomic ones. Right now I'm creating new issues and updating them according to the chats we've had over the last month, so that we don't have to go through our chat history and can just come to this GitHub issue tracker for a summary of everything that is happening.

Collaborator

mothur-westcott commented Apr 30, 2012

Yup, I am getting these. I can certainly help you find some functions and classes to reuse. Here are a few to get you started, as well as some background.

InputData and SharedRAbundVector handle reading a shared file. They are set up to allow the user to select certain groups to process, as well as to handle mothur's smart distancing reading. I can explain more of what that means later. A good example of their use in mothur can be found in the execute function of cooccurrencecommand.cpp. The lookup vector returned in line 171 is a representation of the shared file: lookup.size() = number of rows in the file, and lookup[0]->getNumBins() = number of columns.
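To make the rows-and-columns picture concrete, here is a minimal sketch of walking such a lookup vector. SharedRow is a hypothetical stand-in, not mothur's SharedRAbundVector: only getNumBins() is named above, and getAbundance(), getGroup() and columnTotals() are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one row of a shared file (one group).
// Only getNumBins() comes from the discussion; the rest is assumed.
class SharedRow {
public:
    SharedRow(std::string group, std::vector<int> abunds)
        : group_(std::move(group)), abunds_(std::move(abunds)) {}
    int getNumBins() const { return static_cast<int>(abunds_.size()); }
    int getAbundance(int bin) const { return abunds_.at(static_cast<std::size_t>(bin)); }
    const std::string& getGroup() const { return group_; }
private:
    std::string group_;
    std::vector<int> abunds_;
};

// Sum each column (OTU) over all rows (groups).
std::vector<int> columnTotals(const std::vector<SharedRow*>& lookup) {
    std::vector<int> totals;
    if (lookup.empty()) return totals;
    const int numBins = lookup[0]->getNumBins();  // number of columns
    totals.assign(static_cast<std::size_t>(numBins), 0);
    for (const SharedRow* row : lookup) {         // one iteration per row
        assert(row->getNumBins() == numBins);     // every row has the same width
        for (int bin = 0; bin < numBins; ++bin) {
            totals[static_cast<std::size_t>(bin)] += row->getAbundance(bin);
        }
    }
    return totals;
}
```

With three rows of five bins each, lookup.size() is 3 and lookup[0]->getNumBins() is 5, and columnTotals() returns one total per OTU.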

A little background on mothur's container classes. Here are the 5 most commonly used types: listvector, rabundvector, sabundvector, ordervector, and sharedrabundvector. Take these 13 sequences and their groups as an example:

a group1
b group2
c group3
d group1
e group2
f group3
g group1
h group2
i group3
j group1
k group2
l group3
m group1
listvector    = a,b,c,d,e,f    g,h,i    j,k    l    m
rabundvector  = 6              3        2      1    1
sabundvector  = 2    1    1    0    0    1
ordervector   = 1 1 1 1 1 1 2 2 2 3 3 4 5
sharedrabund  = 0.03 group1 5 2 1 1 0 1
              = 0.03 group2 5 2 1 1 0 0
              = 0.03 group3 5 2 1 0 1 0

The rabund is the number of sequences in each OTU; in a sharedrabund this is divided by group. The sabund is the number of OTUs with each abundance, so in our example there are 2 with abundance 1, 1 with abundance 2, 1 with abundance 3, 0 with abundance 4, 0 with abundance 5 and 1 with abundance 6. The ordervector contains the bin number abund times, so 1 is in it 6 times because otu1 has 6 members, 2 is in there 3 times because otu2 has 3 members, and so on. The container classes have functions to create the other container types from themselves.
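The relationships between these containers can be sketched in a few lines of standard C++. This is not mothur's code: the types and the function names (toRabund, toSabund, toOrder, toSharedRabund, exampleList, exampleGroups) are made up for illustration, and the data is the 13-sequence example above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins: an OTU is a list of sequence names.
using Otu = std::vector<std::string>;
using ListVec = std::vector<Otu>;

// rabund: number of sequences in each OTU.
std::vector<int> toRabund(const ListVec& list) {
    std::vector<int> rabund;
    for (const Otu& otu : list) rabund.push_back(static_cast<int>(otu.size()));
    return rabund;
}

// sabund: element k-1 holds the number of OTUs with abundance k.
std::vector<int> toSabund(const std::vector<int>& rabund) {
    if (rabund.empty()) return {};
    const int maxAbund = *std::max_element(rabund.begin(), rabund.end());
    std::vector<int> sabund(static_cast<std::size_t>(maxAbund), 0);
    for (int abund : rabund) {
        if (abund > 0) ++sabund[static_cast<std::size_t>(abund - 1)];
    }
    return sabund;
}

// ordervector: the 1-based bin number repeated once per member of that bin.
std::vector<int> toOrder(const std::vector<int>& rabund) {
    std::vector<int> order;
    for (std::size_t bin = 0; bin < rabund.size(); ++bin) {
        assert(rabund[bin] >= 0);
        order.insert(order.end(), static_cast<std::size_t>(rabund[bin]),
                     static_cast<int>(bin) + 1);
    }
    return order;
}

// sharedrabund: abundance of each OTU split by group, keyed by group name.
std::map<std::string, std::vector<int>> toSharedRabund(
        const ListVec& list, const std::map<std::string, std::string>& groupOf) {
    std::map<std::string, std::vector<int>> shared;
    for (const auto& entry : groupOf) shared[entry.second].assign(list.size(), 0);
    for (std::size_t bin = 0; bin < list.size(); ++bin) {
        for (const std::string& seq : list[bin]) ++shared[groupOf.at(seq)][bin];
    }
    return shared;
}

// The example data: sequences a..m, assigned to group1..group3 in rotation.
ListVec exampleList() {
    return {{"a", "b", "c", "d", "e", "f"}, {"g", "h", "i"}, {"j", "k"}, {"l"}, {"m"}};
}

std::map<std::string, std::string> exampleGroups() {
    std::map<std::string, std::string> groupOf;
    const std::string names = "abcdefghijklm";
    for (std::size_t i = 0; i < names.size(); ++i) {
        groupOf[std::string(1, names[i])] = "group" + std::to_string(i % 3 + 1);
    }
    return groupOf;
}
```

Feeding exampleList() and exampleGroups() through these functions reproduces the rabund, sabund and ordervector rows shown above, and the per-group abundances that follow the OTU count (5) in each sharedrabund row.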

Collaborator

mothur-westcott commented Apr 30, 2012

You may also want to make an issue for mothur's common practices, like the error catching mechanisms, signal captures and logging functions?

Owner

azmfaridee commented May 1, 2012

> You may also want to make an issue for mothur's common practices, like the error catching mechanisms, signal captures and logging functions?

@kdiverson @mothur-westcott Created Issue #5 for that. Also, if you think other members of the community would be helpful, please let them know. They would have to bear the trouble of signing into GitHub, and most of them might prefer a wiki or blog where it's easier to comment, but I think an issue tracker like this is a much better solution, as it has more developer-friendly tools to keep us organized.

Owner

azmfaridee commented May 1, 2012

There are three types of engines: BatchEngine, InteractEngine and ScriptEngine. In any single run only one of them is instantiated, and most of the other commands are controlled from it. New commands are created by CommandFactory in the following manner:

Command* command = cFactory->getCommand(commandName, options);

So we'd need to add our new command to the CommandFactory. All commands extend the Command class.

We'd just need to override the following virtual methods from the Command class when we extend it:

virtual string getCommandName() = 0;
virtual string getCommandCategory() = 0;
virtual string getHelpString() = 0;
virtual string getCitation() = 0;
virtual string getDescription() = 0;

virtual map<string, vector<string> > getOutputFiles() { return outputTypes; }
virtual vector<string> setParameters() = 0; //to fill parameters
virtual vector<CommandParameter> getParameters() { return parameters; }

virtual int execute() = 0;
virtual void help() = 0;
virtual ~Command() { }
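
As a rough sketch of how the pieces fit together, the snippet below pairs a cut-down Command base class with a factory function. It is not mothur's code: this Command keeps only three of the pure virtuals, ExampleCommand and its strings are placeholders, and the free function getCommand() only stands in for CommandFactory::getCommand.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Simplified stand-in for mothur's Command base class (three pure virtuals only).
class Command {
public:
    virtual std::string getCommandName() = 0;
    virtual std::string getCommandCategory() = 0;
    virtual int execute() = 0;
    virtual ~Command() {}
};

// Hypothetical new command; the name and category strings are placeholders.
class ExampleCommand : public Command {
public:
    explicit ExampleCommand(const std::string& options) : options_(options) {}
    std::string getCommandName() override { return "example.command"; }
    std::string getCommandCategory() override { return "General"; }
    // Real work would go here; this sketch only checks that options were given.
    int execute() override { return options_.empty() ? 1 : 0; }
private:
    std::string options_;
};

// Hypothetical factory: maps a command name to a concrete Command.
// Returns a null pointer for names it does not know.
std::unique_ptr<Command> getCommand(const std::string& commandName,
                                    const std::string& options) {
    if (commandName == "example.command") {
        return std::unique_ptr<Command>(new ExampleCommand(options));
    }
    return nullptr;
}
```

Adding a command then comes down to writing the derived class and adding one branch to the factory; the engine only ever sees the Command interface.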
Collaborator

mothur-westcott commented May 1, 2012

You can't override those functions. Doing so would make your new command unable to work with certain parts of mothur, as well as with a GUI we have in the works. The execute function is where you will do most of your work; it is similar to the main function in a standalone program. The command name, description, help and citation are fairly straightforward. The category is most likely Hypothesis Testing; your thoughts, Kathryn? The getOutputTypes, setParameters and getParameters functions have very specific formats to work with the GUI. commandparameter.h describes how the setParameters are set up, and all the commands have this function, so there are lots of examples to look at. Creating a command template that would make this interface easier is on my to-do list. I could probably get that to you before the coding time begins. The format of the command is probably more of a mothur integration task, but thinking about it early will save headaches later, :).

Owner

azmfaridee commented May 1, 2012

@mothur-westcott Some of these functions are pretty straightforward, no? For example, this is an excerpt from sharedcommand.h:

class SharedCommand : public Command {

public:
    SharedCommand(string);  
    SharedCommand();
    ~SharedCommand();

    vector<string> setParameters();
    string getCommandName()         { return "make.shared";             }
    string getCommandCategory()     { return "OTU-Based Approaches";    }
    string getHelpString(); 
    string getCitation() { return "http://www.mothur.org/wiki/Make.shared"; }
    string getDescription()     { return "make a shared file from a list and group file"; }

    int execute(); 
    void help() { m->mothurOut(getHelpString()); }  

Aren't we effectively overriding these functions? What's wrong with that?

What challenges might we face if I create a new setParameters() for my new command? How much could it differ from SharedCommand::setParameters(), assuming that I insert the appropriate CommandParameters for that command depending on the combination of arguments?

vector<string> SharedCommand::setParameters(){  
    try {
        CommandParameter plist("list", "InputTypes", "", "", "none", "none", "none",false,true); parameters.push_back(plist);
        CommandParameter pgroup("group", "InputTypes", "", "", "none", "none", "none",false,true); parameters.push_back(pgroup);
        //CommandParameter pordergroup("ordergroup", "InputTypes", "", "", "none", "none", "none",false,false); parameters.push_back(pordergroup);
        CommandParameter plabel("label", "String", "", "", "", "", "",false,false); parameters.push_back(plabel);
        CommandParameter pgroups("groups", "String", "", "", "", "", "",false,false); parameters.push_back(pgroups);
        CommandParameter pinputdir("inputdir", "String", "", "", "", "", "",false,false); parameters.push_back(pinputdir);
        CommandParameter poutputdir("outputdir", "String", "", "", "", "", "",false,false); parameters.push_back(poutputdir);

        vector<string> myArray;
        for (int i = 0; i < parameters.size(); i++) {   myArray.push_back(parameters[i].name);      }
        return myArray;
    }
    catch(exception& e) {
        m->errorOut(e, "SharedCommand", "setParameters");
        exit(1);
    }
}
Collaborator

mothur-westcott commented May 1, 2012

I'm sorry, I thought when you said override, you meant to change the command class so that the functions were not pure.

Your setParameters would probably look like:

CommandParameter pshared("shared", "InputTypes", "", "", "none", "none", "none",false,true); parameters.push_back(pshared);     
CommandParameter pinputdir("inputdir", "String", "", "", "", "", "",false,false); parameters.push_back(pinputdir);
CommandParameter poutputdir("outputdir", "String", "", "", "", "", "",false,false); parameters.push_back(poutputdir);
CommandParameter plabel("label", "String", "", "", "", "", "",false,false); parameters.push_back(plabel);
CommandParameter pisalwaystogether("isalwaystogether", "String", "", "", "", "", "",false,false); parameters.push_back(pisalwaystogether);
CommandParameter pgroups("groups", "String", "", "", "", "", "",false,false); parameters.push_back(pgroups);
CommandParameter pmethod("method", "Multiple", "randomforest", "randomforest", "", "", "",false,false); parameters.push_back(pmethod);

We may add others as the questions we want to ask with this command become more clear.
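
For reference, here is a minimal sketch of how those declarations could sit inside a setParameters() function. The CommandParameter struct below is a stand-in whose field names are guesses at what the nine constructor arguments mean (commandparameter.h is the authority), and SketchCommand is a hypothetical class name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for mothur's CommandParameter; field names are assumptions.
struct CommandParameter {
    std::string name, type, options, optionsDefault;
    std::string chooseOnlyOneGroup, chooseAtLeastOneGroup, linkedGroup;
    bool multipleSelectionAllowed, required;
    CommandParameter(const std::string& n, const std::string& t, const std::string& o,
                     const std::string& d, const std::string& only,
                     const std::string& atLeast, const std::string& linked,
                     bool multi, bool req)
        : name(n), type(t), options(o), optionsDefault(d),
          chooseOnlyOneGroup(only), chooseAtLeastOneGroup(atLeast), linkedGroup(linked),
          multipleSelectionAllowed(multi), required(req) {}
};

// Hypothetical command that registers the parameters listed above.
class SketchCommand {
public:
    // Fills the parameter list and returns the parameter names in order.
    std::vector<std::string> setParameters() {
        parameters.clear();
        parameters.push_back(CommandParameter("shared", "InputTypes", "", "", "none", "none", "none", false, true));
        parameters.push_back(CommandParameter("inputdir", "String", "", "", "", "", "", false, false));
        parameters.push_back(CommandParameter("outputdir", "String", "", "", "", "", "", false, false));
        parameters.push_back(CommandParameter("label", "String", "", "", "", "", "", false, false));
        parameters.push_back(CommandParameter("isalwaystogether", "String", "", "", "", "", "", false, false));
        parameters.push_back(CommandParameter("groups", "String", "", "", "", "", "", false, false));
        parameters.push_back(CommandParameter("method", "Multiple", "randomforest", "randomforest", "", "", "", false, false));

        std::vector<std::string> names;
        for (std::size_t i = 0; i < parameters.size(); ++i) {
            names.push_back(parameters[i].name);
        }
        return names;
    }

    std::vector<CommandParameter> getParameters() const { return parameters; }

private:
    std::vector<CommandParameter> parameters;
};
```

Calling setParameters() returns the seven parameter names in declaration order, with shared as the only required one.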

Owner

azmfaridee commented May 1, 2012

> I'm sorry, I thought when you said override, you meant to change the command class so that the functions were not pure.

So just to make sure we are on the same page: by overriding we mean the object-oriented programming concept of function overriding. So basically there are no complex issues; I just create a new class that extends the Command class and override the functions just as other classes like SharedCommand do, right?

Collaborator

mothur-westcott commented May 1, 2012

Right, :)

Collaborator

mothur-westcott commented May 9, 2012

I added 2 new files to the mothur repository, newcommandtemplate.h and newcommandtemplate.cpp. They should help with the integration into mothur.

@azmfaridee azmfaridee closed this Sep 1, 2012
