sheenamt/Bio-AGP
NAME
    Bio::AGP::LowLevel - functions for dealing with AGP files

SYNOPSIS
     $lines_arrayref = agp_parse('my_agp_file.agp');

     agp_write( $lines_arrayref => 'my_agp_file.agp' );

DESCRIPTION
    Low-level functions for parsing and writing AGP files, plus a few
    small validation helpers.

FUNCTIONS
    All functions below are EXPORT_OK.

  str_in
      Usage: print "it's valid" if str_in($thingy,qw/foo bar baz/);
      Desc : return 1 if the first argument is string-equal to at least one
             of the subsequent arguments, 0 otherwise
      Ret  : 1 or 0
      Args : string to search for, list of strings to search in
      Side Effects: none

      I kept writing this over and over in validation code and got sick of it.
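
      The behaviour described above can be sketched as follows. This is a
      standalone stand-in with the hypothetical name str_in_sketch, not the
      module's source, and it assumes the comparison is a plain
      case-sensitive eq.

```perl
use strict;
use warnings;

# Hypothetical stand-in for str_in (not the module's code): returns 1 if
# the first argument is string-equal to any later argument, else 0.
sub str_in_sketch {
    my ( $needle, @haystack ) = @_;
    for my $candidate (@haystack) {
        return 1 if $needle eq $candidate;
    }
    return 0;
}

# The component type letters listed under agp_parse.
my @valid_types = qw/A D F G N O P U W/;

print str_in_sketch( 'W', @valid_types ) ? "valid\n" : "invalid\n";    # valid
print str_in_sketch( 'w', @valid_types ) ? "valid\n" : "invalid\n";    # invalid
```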

  is_filehandle
      Usage: print "it's a filehandle" if is_filehandle($my_thing);
      Desc : check whether the given thing is usable as a filehandle.
             This lives in a module because a filehandle might be a
             GLOB, an IO::Handle, or an Apache::Upload object.
      Ret  : true if it is a filehandle, false otherwise
      Args : a single thing
      Side Effects: none
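
      A check along these lines could look like the sketch below. The name
      is_filehandle_sketch is hypothetical and the logic is an assumption
      based only on the description above; the Apache::Upload case is left
      out because that class is not available outside mod_perl.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed reftype);
use IO::Handle;

# Hypothetical stand-in for is_filehandle (not the module's code).
sub is_filehandle_sketch {
    my ($thing) = @_;
    return 0 unless defined $thing;

    # Glob references, including lexical handles created by open().
    return 1 if ( reftype($thing) || '' ) eq 'GLOB';

    # Objects derived from IO::Handle.
    return 1 if blessed($thing) && $thing->isa('IO::Handle');

    return 0;
}

my $text = "chr1\t1\t500\n";
open my $fh, '<', \$text or die "cannot open in-memory handle: $!";

print is_filehandle_sketch($fh)               ? "yes\n" : "no\n";    # yes
print is_filehandle_sketch('my_agp_file.agp') ? "yes\n" : "no\n";    # no
```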

  agp_parse
      Usage: my $lines = agp_parse('~/myagp.agp',validate_syntax => 1, validate_identifiers => 1);
      Desc : parse an AGP file
      Args : filename or filehandle, followed by a hash-style list of options:
                           validate_syntax => if true, treat any syntax
                              error in the file as an error
                           validate_identifiers => if true, treat any
                              identifier that CXGN::Tools::Identifiers
                              does not recognize as an error
                              (implies validate_syntax)
                           error_array => an arrayref.  If given, error
                              descriptions are pushed onto this array
                              instead of being printed to STDERR with warn
      Ret  : undef if error, otherwise return an
             arrayref containing line records, each of which is like:
             { comment => 'text' } if a comment,
             or if a data line:
             {  objname  => the name of the object being assembled
                           (same for every record),
                ostart   => start coordinate for this component (object),
                oend     => end coordinate for this component   (object),
                partnum  => the part number appearing in the 4th column,
                linenum  => the line number in the file,
                type     => letter type present in the file (/[ADFGNOPUW]/),
                typedesc => description of the type, one of:
                             - (A) active_finishing
                             - (D) draft
                             - (F) finished
                             - (G) wgs_finishing
                             - (N) known_gap
                             - (O) other
                             - (P) predraft
                             - (U) unknown_gap
                             - (W) wgs_contig
                ident    => identifier of the component, if any,
                length   => length of the component,
                is_gap   => 1 if the line is some kind of gap, 0 if it
                            is covered by a component,
                gap_type => one of:
                     fragment: a gap between two sequence contigs (also
                        called a "sequence gap"),
                     clone: a gap between two clones that do not overlap,
                     contig: a gap between clone contigs (also called a
                        "layout gap"),
                     centromere: a gap inserted for the centromere,
                     short_arm: a gap inserted at the start of an
                        acrocentric chromosome,
                     heterochromatin: a gap inserted for an especially
                        large region of heterochromatic sequence (may also
                        include the centromere),
                     telomere: a gap inserted for the telomere,
                     repeat: an unresolvable repeat,
                cstart   => start coordinate relative to the component,
                cend     => end coordinate relative to the component,
                linkage  => 'yes' or 'no', only set for type of 'N',
                orient   => '+', '-', 0, or 'na'
                            orientation of the component
                            relative to the object,
             }

      Side Effects: unless error_array is given, will print error
                    descriptions to STDERR with warn()
      Example:
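
      A minimal sketch of consuming the returned records. The records
      below are hand-built in the documented layout so that the snippet
      runs without the module; with the module installed, the arrayref
      would presumably come from agp_parse() instead. The helper name
      tally_lengths and all values are invented for illustration.

```perl
use strict;
use warnings;

# Hand-built records in the layout documented above (illustrative values).
my $lines = [
    { comment => 'example assembly' },
    {   objname => 'chr1', ostart => 1, oend => 500, partnum => 1,
        linenum => 2, type => 'F', typedesc => 'finished',
        ident => 'clone_A', length => 500, is_gap => 0,
        cstart => 1, cend => 500, orient => '+' },
    {   objname => 'chr1', ostart => 501, oend => 600, partnum => 2,
        linenum => 3, type => 'N', typedesc => 'known_gap',
        length => 100, is_gap => 1, gap_type => 'clone', linkage => 'yes' },
    {   objname => 'chr1', ostart => 601, oend => 900, partnum => 3,
        linenum => 4, type => 'W', typedesc => 'wgs_contig',
        ident => 'contig_B', length => 300, is_gap => 0,
        cstart => 1, cend => 300, orient => '-' },
];

# Hypothetical helper: total bases covered by components versus gaps.
sub tally_lengths {
    my ($records) = @_;
    my ( $covered, $gap ) = ( 0, 0 );
    for my $rec (@$records) {
        next if exists $rec->{comment};    # comment records carry no coordinates
        if   ( $rec->{is_gap} ) { $gap     += $rec->{length} }
        else                    { $covered += $rec->{length} }
    }
    return ( $covered, $gap );
}

my ( $covered, $gap ) = tally_lengths($lines);
printf "covered=%d gap=%d\n", $covered, $gap;    # covered=800 gap=100
```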

  agp_write
      Usage: agp_write($lines,$file);
      Desc : writes a properly formatted AGP file
      Args : arrayref of line records to write, in the same format as
             those returned by agp_parse above;
             filename or filehandle to write to
      Ret  : nothing meaningful

      Side Effects: dies on failure.  If given a filehandle, does
                    not close it
      Example:

  agp_format_part( $record )
    Format a single AGP part line (string terminated with a newline) from
    the given record hashref.
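
    As a rough illustration of the mapping from a record to a
    tab-delimited line, the sketch below joins the documented fields in
    standard AGP column order. The name format_part_sketch is
    hypothetical and this is not the module's implementation: it assumes
    gap lines carry length, gap_type and linkage after the type column,
    that linkage is set, and that no trailing empty column is written.

```perl
use strict;
use warnings;

# Hypothetical stand-in for agp_format_part (not the module's code).
sub format_part_sketch {
    my ($r) = @_;

    # Columns shared by component and gap lines.
    my @common = @{$r}{qw/objname ostart oend partnum type/};

    # Remaining columns depend on whether the line is a gap.
    my @rest = $r->{is_gap}
        ? ( $r->{length}, $r->{gap_type}, $r->{linkage} )
        : ( $r->{ident}, $r->{cstart}, $r->{cend}, $r->{orient} );

    return join( "\t", @common, @rest ) . "\n";
}

# Illustrative records with invented values.
my $component = {
    objname => 'chr1', ostart => 1, oend => 500, partnum => 1, type => 'F',
    ident => 'clone_A', cstart => 1, cend => 500, orient => '+', is_gap => 0,
};
my $gap_line = {
    objname => 'chr1', ostart => 501, oend => 600, partnum => 2, type => 'N',
    length => 100, gap_type => 'clone', linkage => 'yes', is_gap => 1,
};

print format_part_sketch($component);
print format_part_sketch($gap_line);
```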

  agp_contigs
      Usage: my @contigs = agp_contigs( agp_parse($agp_filename) );
      Desc : extract and number contigs from a parsed AGP file
      Args : arrayref of AGP lines, like those returned by agp_parse() above
      Ret  : list of contigs, in the same order as they occur in the
             file, formatted as:
                [ agp_line_hashref, agp_line_hashref, ... ],
                [ agp_line_hashref, agp_line_hashref, ... ],
                ...
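
      The shape of the return value can be illustrated with the sketch
      below. It assumes that a contig is a run of consecutive component
      lines bounded by gap lines, which the description above does not
      state explicitly; contigs_sketch is a hypothetical name, not the
      module's code, and the records are trimmed to the fields the sketch
      needs.

```perl
use strict;
use warnings;

# Hypothetical stand-in for agp_contigs: split component lines into
# groups wherever a gap line occurs (an assumed definition of "contig").
sub contigs_sketch {
    my ($lines) = @_;
    my ( @contigs, @current );
    for my $rec (@$lines) {
        next if exists $rec->{comment};
        if ( $rec->{is_gap} ) {
            push @contigs, [@current] if @current;    # close the open contig
            @current = ();
        }
        else {
            push @current, $rec;
        }
    }
    push @contigs, [@current] if @current;            # trailing contig, if any
    return @contigs;
}

# Illustrative input with invented identifiers.
my $lines = [
    { comment => 'example' },
    { ident => 'clone_A', is_gap => 0 },
    { ident => 'clone_B', is_gap => 0 },
    { gap_type => 'contig', is_gap => 1 },
    { ident => 'clone_C', is_gap => 0 },
];

my @contigs = contigs_sketch($lines);
my $n       = 0;
for my $contig (@contigs) {
    $n++;
    print "contig $n: ", join( ', ', map { $_->{ident} } @$contig ), "\n";
}
# contig 1: clone_A, clone_B
# contig 2: clone_C
```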

AUTHOR(S)
    Robert Buels

    Sheena Scroggins
