
Biopipe FAQ
-----------
v. 1.0

This FAQ maintained by:
* Shawn Hoon <shawnh@fugu-sg.org>


---------------------------------------------------------------------------

Contents

---------------------------------------------------------------------------

0. About this FAQ

   Q0.1: What is this FAQ?
   Q0.2: How is it maintained?

1. Biopipe questions

   Q1.1: How do I start?
   Q1.2: What are the methods that I have to insert in the
         datahandler table?
   Q1.3: What should be inserted in the field 'name' in the argument table?
   Q1.4: When I create the database for the pipeline should I manually
         create the tables?
   Q1.5: In the input table, there is a field called 'name'. What's that?
   Q1.6: There are 2 dbnames, one in the analysis table and the other
         in the dbadaptor table. Are they the same?
   Q1.7: The pipeline ran well the first time but didn't the second
         time. What could have gone wrong?
   Q1.8: The pipeline is not running. All the tables are populated
         correctly. What went wrong?
   Q1.9: There are many tables that I left empty. Don't we need to put
         something in them?
  Q1.10: What about the output of the pipeline?


---------------------------------------------------------------------------

0. About this FAQ

---------------------------------------------------------------------------



   Q0.1: What is this FAQ?

      A: It is the list of Frequently Asked Questions about Biopipe.


   Q0.2: How is it maintained?

      A: This FAQ was generated using a Perl script and an XML file.
All the files are in the Bioperl distribution directory doc/faq, so
do not edit this file! Edit the file faq.xml and run:

% faq.pl -text faq.xml

The XML structure was originally used by the Perl XML project.
Their website seems to have vanished, though. The XML and
modifying scripts were copied from Michel Rodriguez's web site
http://www.xmltwig.com/xmltwig/XML-Twig-FAQ.html and modified to
our needs.


---------------------------------------------------------------------------

1. Biopipe questions

---------------------------------------------------------------------------



   Q1.1: How do I start?

      A: First install the components, as described in
http://www.biopipe.org/bioperl-pipeline-install.html. Next steps:

- Create a database for the pipeline, e.g. genscan_pipe, following
the Biopipe schema of course (a sketch of this step follows below)

- Update the database settings in PipeConf.pm

- Populate the database tables (see "Populating the Tables" in
INSTALL for more info)
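
For example, assuming MySQL and that you are in the directory
containing the schema file (the database name here is illustrative):

% mysqladmin create genscan_pipe
% mysql genscan_pipe < biopipelinedb-mysql.sql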


   Q1.2: What are the methods that I have to insert in the
         datahandler table?

      A: The methods in the datahandler table are used to fetch the
input for the runnable and to store the runnable's output. If the
input object lives in a database, a new dbadaptor is instantiated
and its methods are used to retrieve the data, which becomes the
input to the runnable. The output of the runnable (an object) then
has to be stored back in a database, so the methods that do this
have to be specified as well. Example:


+----------------+--------------+---------------------------+------+
| datahandler_id | iohandler_id | method                    | rank |
+----------------+--------------+---------------------------+------+
|              1 |            1 | get_Contig_by_internal_id |    1 |
|              2 |            1 | perl_primary_seq          |    2 |
|              3 |            2 | get_ScoreAdaptor          |    1 |
|              4 |            2 | store_by_PID              |    2 |
+----------------+--------------+---------------------------+------+

The method get_Contig_by_internal_id fetches the contig from the
database by its internal id. Then perl_primary_seq is called on
the contig to obtain a Seq object, which is the input to the
runnable. The output of the runnable (Genscan), a SeqFeature
object, has to be stored back to the database (or to another
database). Hence the database adaptor obtains a ScoreAdaptor via
the get_ScoreAdaptor method, and the ScoreAdaptor stores the
output in the database via the store_by_PID method.
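
As a sketch, the rows above could be loaded with an INSERT along
these lines (the column names are taken from the table header;
consult biopipelinedb-mysql.sql for the authoritative definitions):

mysql> INSERT INTO datahandler (datahandler_id, iohandler_id, method, rank)
    -> VALUES (1, 1, 'get_Contig_by_internal_id', 1),
    ->        (2, 1, 'perl_primary_seq', 2),
    ->        (3, 2, 'get_ScoreAdaptor', 1),
    ->        (4, 2, 'store_by_PID', 2);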


   Q1.3: What should be inserted in the field 'name' in the argument table?

      A: The argument table is for the arguments required by the methods in
the datahandler table. Example:


+-------------+----------------+--------+------+--------+
| argument_id | datahandler_id | name   | rank | type   |
+-------------+----------------+--------+------+--------+
|           1 |              1 | INPUT  |    1 | SCALAR |
|           2 |              4 | OUTPUT |    1 | SCALAR |
+-------------+----------------+--------+------+--------+

Since only two methods take arguments (in this case), there are
only two entries in the argument table. In the 'name' field, the
entries are 'INPUT' and 'OUTPUT'. The methods with the
corresponding datahandler_ids 1 and 4 are the ones that retrieve
the input and store the output, respectively.
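
A matching sketch of the INSERT (again, the column names come from
the table above; check the schema for the exact definitions):

mysql> INSERT INTO argument (argument_id, datahandler_id, name, rank, type)
    -> VALUES (1, 1, 'INPUT', 1, 'SCALAR'),
    ->        (2, 4, 'OUTPUT', 1, 'SCALAR');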


   Q1.4: When I create the database for the pipeline should I manually
         create the tables?

      A: No. Users can load the biopipelinedb-mysql.sql schema when
they create a new database, so the tables do not have to be created
manually. Go to the directory where the schema file is and type
these commands (assuming the dbname is testdb):

mysql> use testdb;
Database changed
mysql> source biopipelinedb-mysql.sql;


   Q1.5: In the input table, there is a field called 'name'. What's that?

      A: When inputs are fetched from the database by a certain
field, the value of that field is what goes into the 'name' field
of the input table.

For example, if the method get_Contig_by_internal_id is used, then
the internal_id is the name in this case. The name of every single
input has to be populated into the input table before running the
pipeline.
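
As a purely illustrative sketch, loading two contigs by internal id
might look like this (the real input table almost certainly has more
columns than shown here; check biopipelinedb-mysql.sql):

mysql> INSERT INTO input (input_id, name)
    -> VALUES (1, '100'),
    ->        (2, '101');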


   Q1.6: There are 2 dbnames, one in the analysis table and the other
         in the dbadaptor table. Are they the same?

      A: No. The db in the analysis table is for analyses like BLAST
that require a database to search against. The dbname in the
dbadaptor table is the database(s) where the input is retrieved
from or the output is stored.


   Q1.7: The pipeline ran well the first time but didn't the second
         time. What could have gone wrong?

      A: There could be many reasons for this. The most probable one
is that you did not empty the output and job tables after running
the pipeline the first time. You might get a message like this:

Tests Completed. Starting Pipeline
Fetched 0 jobs
Waking up and run again!

The two tables have to be reset every time the pipeline is run.
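
For example, at the mysql prompt (this simply empties the two
tables named above):

mysql> DELETE FROM output;
mysql> DELETE FROM job;
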

   Q1.8: The pipeline is not running. All the tables are populated
         correctly. What went wrong?

      A: Again, as in Q1.7, there could be more than one reason for
this. One possibility is that you are getting messages like this:

Tests Completed. Starting Pipeline
Fetched 2 jobs
opening bsub command line:
bsub -o
/data0/tmp//2/genscan_pipe.job_1.Genscan.1025852231.749.out -e
/data0/tmp//2/genscan_pipe.job_1.Genscan.1025852231.749.err -q
/usr/users/savikalpa/src/bioperl-pipeline//Bio/Pipeline/runner.pl
1 2
couldn't submit jobs 1 2 to LSF.
Waking up and run again!

This could be due to a problem in sending the job to LSF. Try
running the pipeline locally (i.e. perl PipelineManager.pl -l);
if it runs, that confirms the problem is with LSF submission.
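
To isolate LSF further, you can also try submitting a trivial job
by hand with bsub (the output paths here are illustrative):

% bsub -o /tmp/lsf_test.out -e /tmp/lsf_test.err echo "LSF ok"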


   Q1.9: There are many tables that I left empty. Don't we need to put
         something in them?

      A: Some of the tables like the job and output tables will be
automatically populated as the pipeline runs.


  Q1.10: What about the output of the pipeline?

      A: The output of the pipeline (e.g. gene objects) can be stored
in any database of choice (a dbadaptor for that database is needed,
of course).

---------------------------------------------------------------------------
Copyright (c) 2002-2003 Open Bioinformatics Foundation. You may
distribute this FAQ under the same terms as Perl itself.
