# GenePattern Python Library Documentation

This notebook provides reference documentation for the GenePattern Python Library and its member classes. It is intended for readers who are already familiar with programming in Python and want a quick reference while coding.

Programmers looking for an introductory tutorial will be better served by downloading and exploring the *GenePattern Python Tutorial* notebook or the *GenePattern Files in Python* notebook.

In [4]:
import gp
gp.__version__

'1.3.1'

## GPServer

In [5]:
help(gp.GPServer)

Help on class GPServer in module gp:

class GPServer(builtins.object)
 |  Wrapper for data needed to make server calls.
 |  
 |  Wraps the server url, username and password, and provides helper function
 |  to construct the authorization header.
 |  
 |  Methods defined here:
 |  
 |  __init__(self, url, username, password)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  __str__(self)
 |      Return str(self).
 |  
 |  authorization_header(self)
 |      Returns a string containing the authorization header used to authenticate
 |      with GenePattern. This string is included in the header of subsequent
 |      requests sent to GenePattern.
 |  
 |  get_task_list(self)
 |      Queries the GenePattern server and returns a list of GPTask objects,
 |      each representing one of the modules installed on the server. Useful
 |      for determining which are available on the server.
 |  
 |  run_job(self, job_spec, wait_until_done=True)
 |      Runs a job defi
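
The help text above says that `authorization_header` returns a string that is sent in the header of later requests. As a rough illustration only, the sketch below builds an HTTP Basic authorization value from a username and password using the standard library. That the server expects the Basic scheme is an assumption here, and `basic_auth_header` is a hypothetical helper, not a function in `gp`.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build an HTTP Basic authorization value (hypothetical helper, not gp code)."""
    # Join the credentials as "username:password" and base64-encode the bytes.
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return f"Basic {token}"


header = basic_auth_header("user", "pass")
print(header)
```

Base64 is an encoding, not encryption, so a header like this should only be sent over HTTPS.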

## GPTask

In [6]:
help(gp.GPTask)

Help on class GPTask in module gp:

class GPTask(GPResource)
 |  Describes a GenePattern task (module or pipeline).
 |  
 |  The constructor retrieves data transfer object (DTO) describing task from GenePattern server.
 |  The DTO contains general task information (LSID, Category, Description, Version comment),
 |  a parameter list and a list of initial values.  Class includes getters for each of these
 |  components.
 |  
 |  Method resolution order:
 |      GPTask
 |      GPResource
 |      builtins.object
 |  
 |  Methods defined here:
 |  
 |  __init__(self, server_data, name_or_lsid, task_dict=None)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  get_description(self)
 |      :return: Returns the task's description as a string
 |  
 |  get_lsid(self)
 |      :return: Returns the task's LSID as a string
 |  
 |  get_name(self)
 |      :return: Returns the task's name as a string
 |  
 |  get_parameters(self)
 |      :return: Returns a list of GPTaskP

## GPJob

In [7]:
help(gp.GPJob)

Help on class GPJob in module gp:

class GPJob(GPResource)
 |  A running or completed job on a Gene Pattern server.
 |  
 |  Contains methods to get the info of the job, and to wait on a running job by
 |  polling the server until the job is completed.
 |  
 |  Method resolution order:
 |      GPJob
 |      GPResource
 |      builtins.object
 |  
 |  Methods defined here:
 |  
 |  __init__(self, server_data, uri)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  get_child_jobs(self)
 |      Queries the GenePattern server for child jobs of this job, creates GPJob
 |      objects representing each of them and assigns the list of them to the
 |      GPJob.children property. Then return this list.
 |  
 |  get_comments(self)
 |      Returns the comments for the job, querying the
 |      server if necessary.
 |  
 |  get_file(self, name)
 |      Returns the output file with the specified name, if no output files
 |      match, returns None.
 |  
 |  get_info(se
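
The class description above mentions waiting on a running job by polling the server until it completes. The sketch below shows one common way to write such a loop, with a growing delay between polls and an overall timeout. It runs against a hypothetical `FakeJob` stand-in rather than a real server, and `wait_until_done`, `is_finished`, and the delay values are assumptions for illustration, not the `gp` implementation.

```python
import time


class FakeJob:
    """Hypothetical stand-in for a job; reports done after a fixed number of polls."""

    def __init__(self, polls_until_done: int):
        self.remaining = polls_until_done
        self.poll_count = 0

    def is_finished(self) -> bool:
        # Each call simulates one status query to the server.
        self.poll_count += 1
        self.remaining -= 1
        return self.remaining <= 0


def wait_until_done(job, initial_delay=0.01, max_delay=0.05, timeout=5.0):
    """Poll job.is_finished() with a growing delay until it returns True."""
    deadline = time.monotonic() + timeout
    delay = initial_delay
    while not job.is_finished():
        if time.monotonic() >= deadline:
            raise TimeoutError("job did not finish before the timeout")
        time.sleep(delay)
        # Back off so that a long-running job is not queried too often.
        delay = min(delay * 2, max_delay)
    return job


job = wait_until_done(FakeJob(polls_until_done=3))
print(job.poll_count)
```

Capping the delay keeps the wait responsive for short jobs while limiting the request rate for long ones.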

## GPJobSpec

In [8]:
help(gp.GPJobSpec)

Help on class GPJobSpec in module gp:

class GPJobSpec(builtins.object)
 |  Data needed to make a request to perform a job on a Gene Pattern server
 |  
 |  Encapsulates the data needed to make a server call to run a job.  This
 |  includes the LSID of the job, and the parameters.  Helper methods set
 |  the LSID and parameters.
 |  
 |  Methods defined here:
 |  
 |  __init__(self, server_data, lsid)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  set_parameter(self, name, values, group_id=None)
 |      Sets the value of a parameter for the GPJobSpec
 |      :param name: name of the parameter
 |      :param values: list of values for the parameter
 |      :param group_id: optional parameter group ID
 |      :return:
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references 
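
To make the idea of a job specification concrete, the sketch below models an object that holds an LSID plus a list of named parameters, each with a list of values and an optional group ID, as the `set_parameter` docstring describes. `JobSpecSketch`, the `groupId` key, the JSON layout, and the placeholder LSID are all assumptions for illustration; none of it is taken from the `gp` source.

```python
import json


class JobSpecSketch:
    """Hypothetical illustration of a job specification; not the gp implementation."""

    def __init__(self, lsid):
        self.lsid = lsid
        self.params = []

    def set_parameter(self, name, values, group_id=None):
        # Accept a single value or a sequence, and always store a list.
        if not isinstance(values, (list, tuple)):
            values = [values]
        entry = {"name": name, "values": list(values)}
        if group_id is not None:
            entry["groupId"] = group_id  # assumed key name
        # Replace an earlier entry with the same name and group instead of duplicating it.
        self.params = [
            p for p in self.params
            if not (p["name"] == name and p.get("groupId") == group_id)
        ]
        self.params.append(entry)

    def to_json(self):
        # Assumed payload shape: the LSID plus the accumulated parameter list.
        return json.dumps({"lsid": self.lsid, "params": self.params}, sort_keys=True)


spec = JobSpecSketch("urn:lsid:example:placeholder")  # placeholder LSID
spec.set_parameter("input.file", "data.gct")
spec.set_parameter("threshold", [0.5])
spec.set_parameter("threshold", [0.9])  # overwrites the earlier value
print(spec.to_json())
```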

## GPTaskParam

In [9]:
help(gp.GPTaskParam)

Help on class GPTaskParam in module gp:

class GPTaskParam(builtins.object)
 |  Encapsulates single parameter information.
 |  
 |  The constructor's input parameter is the data transfer object
 |  associated with a single task parameter (i.e., element from list
 |  returned by GPTask.getParameters)
 |  
 |  Methods defined here:
 |  
 |  __init__(self, task, dto)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  allow_choice_custom_value(self)
 |      Returns boolean indicating whether choice parameter supports custom value.
 |      
 |      If choice parameter supports custom value, user can provide parameter value
 |      other than those provided in choice list.
 |  
 |  allow_multiple(self)
 |      Return whether the parameter allows multiple values or not
 |      :return: Return True if the parameter allows multiple values, otherwise False
 |  
 |  get_alt_description(self)
 |      Returns the alternate description of a parameter.
 |      Only pipeli

## GPFile

In [10]:
help(gp.GPFile)

Help on class GPFile in module gp:

class GPFile(GPResource)
 |  A file on a Gene Pattern server.
 |  
 |  Wraps the URI of the file, and contains methods to download the file.
 |  
 |  Method resolution order:
 |      GPFile
 |      GPResource
 |      builtins.object
 |  
 |  Methods defined here:
 |  
 |  __init__(self, server_data, uri)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  get_name(self)
 |      Returns the file name of the output file
 |  
 |  get_url(self)
 |      Returns the URL to the GPFile
 |  
 |  open(self)
 |      Opens the URL associated with the GPFile and returns a file-like object
 |      with three extra methods:
 |      
 |          * geturl() - return the ultimate URL (can be used to determine if a
 |              redirect was followed)
 |      
 |          * info() - return the meta-information of the page, such as headers
 |      
 |          * getcode() - return the HTTP status code of the response
 |  
 |  read(self)
 | 

## GPResource

In [11]:
help(gp.GPResource)

Help on class GPResource in module gp:

class GPResource(builtins.object)
 |  Base class for resources on a Gene Pattern server.
 |  
 |  Wraps references to resources on a Gene Pattern server, which are all
 |  defined by a URI.  Subclasses can implement custom logic appropriate for
 |  that resources such as downloading a file or info for a running or completed
 |  job.
 |  
 |  Methods defined here:
 |  
 |  __init__(self, uri)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)
 |  
 |  ----------------------------------------------------------------------
 |  Data and other attributes defined here:
 |  
 |  uri = None



## GPException

In [12]:
help(gp.GPException)

Help on class GPException in module gp:

class GPException(builtins.Exception)
 |  An exception raised by GenePattern and returned to the user
 |  
 |  Method resolution order:
 |      GPException
 |      builtins.Exception
 |      builtins.BaseException
 |      builtins.object
 |  
 |  Methods defined here:
 |  
 |  __init__(self, value)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  __str__(self)
 |      Return str(self).
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)
 |  
 |  ----------------------------------------------------------------------
 |  Methods inherited from builtins.Exception:
 |  
 |  __new__(*args, **kwargs) from builtins.type
 |      Create and return a new object.  See help(type) for accurate signature.
 |  
 |  ----------------------------------------------------------------------
 |  

# GenePattern Data Package

In addition to the core GenePattern features of defining analyses and launching jobs, the GenePattern Python Library also includes functionality for reading common GenePattern file formats and importing the data from those files into pandas DataFrames.

In [13]:
import gp.data

## GCT Files

In [14]:
help(gp.data.GCT)

Help on class GCT in module gp.data:

class GCT(builtins.object)
 |  Wraps and represents a GCT file, importing the associated data
 |  into a pandas dataframe.
 |  
 |  For more information on the GCT format see:
 |  http://software.broadinstitute.org/cancer/software/genepattern/file-formats-guide
 |  
 |  :gct_obj: The GCT file. Accepts a file-like object, a file path, a URL to the file
 |            or a string containing the raw data.
 |  
 |  Methods defined here:
 |  
 |  __init__(self, gct_obj)
 |      Create a wrapper object for the GCT file
 |  
 |  col_count(self)
 |      Return the number of data columns in the GCT file
 |  
 |  row_count(self)
 |      Return the number of data rows in the GCT file
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)
 |  
 |  ----
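
For readers unfamiliar with the GCT layout, the sketch below parses a tiny, made-up GCT 1.2 document using only the standard library. It relies on the usual GCT structure: a version line, a line with the row and column counts, a tab-separated header beginning with `Name` and `Description`, and then one row per feature. `parse_gct` and the sample data are hypothetical and do not reflect how `gp.data.GCT` works internally; the class above loads the data into a pandas DataFrame instead.

```python
import csv
import io

# A tiny, made-up GCT document: version line, dimensions line, header row, data rows.
GCT_TEXT = (
    "#1.2\n"
    "2\t3\n"
    "Name\tDescription\tsample_a\tsample_b\tsample_c\n"
    "gene_1\tfirst gene\t1.0\t2.0\t3.0\n"
    "gene_2\tsecond gene\t4.5\t5.5\t6.5\n"
)


def parse_gct(text):
    """Parse GCT text into (sample_names, rows); a stdlib-only sketch."""
    lines = text.splitlines()
    if len(lines) < 3 or not lines[0].startswith("#1."):
        raise ValueError("missing GCT version line or header")
    # The second line holds the number of data rows and data columns.
    n_rows, n_cols = (int(x) for x in lines[1].split("\t")[:2])
    reader = csv.reader(io.StringIO("\n".join(lines[2:])), delimiter="\t")
    header = next(reader)
    samples = header[2:]  # skip the Name and Description columns
    rows = {}
    for record in reader:
        if not record:
            continue
        name, _description, *values = record
        rows[name] = [float(v) for v in values]
    if len(rows) != n_rows or len(samples) != n_cols:
        raise ValueError("dimensions line does not match the data")
    return samples, rows


samples, rows = parse_gct(GCT_TEXT)
print(samples)
print(rows["gene_2"])
```

Here `len(rows)` and `len(samples)` play the role of the `row_count` and `col_count` values described in the help text.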

## ODF Files

In [15]:
help(gp.data.ODF)

Help on class ODF in module gp.data:

class ODF(builtins.object)
 |  Wraps and represents an ODF file, importing the associated data
 |  into a pandas dataframe.
 |  
 |  For more information on the ODF format see:
 |  http://software.broadinstitute.org/cancer/software/genepattern/file-formats-guide
 |  
 |  :odf_obj: The ODF file. Accepts a file-like object, a file path, a URL to the file
 |            or a string containing the raw data.
 |  
 |  Methods defined here:
 |  
 |  __init__(self, odf_obj)
 |      Create a wrapper object for the ODF file
 |  
 |  col_count(self)
 |      Return the number of data columns in the ODF file
 |  
 |  count_header_blanks(self, lines, count)
 |      Count the number of blank lines in the header
 |  
 |  row_count(self)
 |      Return the number of data rows in the ODF file
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |  
 |  __dict__
 |      dictionary for instance variables (i

# GenePattern Modules Package

Finally, the GenePattern Python Library includes preliminary functionality for wrapping Python methods as GenePattern modules. The classes and enumerations for this are defined in the modules package.

In [16]:
import gp.modules

## GPTaskSpec

In [17]:
help(gp.modules.GPTaskSpec)

Help on class GPTaskSpec in module gp.modules:

class GPTaskSpec(builtins.object)
 |  Specification needed to create a new GenePattern module
 |  
 |  Methods defined here:
 |  
 |  __init__(self, name=None, description='', version_comment='', author='', institution='', categories=[], privacy=<Privacy.PRIVATE: 'private'>, quality=<Quality.DEVELOPMENT: 'development'>, file_format=[], os=<OS.ANY: 'any'>, cpu=<CPU.ANY: 'any'>, language='Python', user=None, support_files=[], documentation='', license='', lsid=None, version=1, lsid_authority=0, command_line=None, parameters=[])
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  create_zip(self, clean=True, increment_version=True, register=True)
 |      Creates a GenePattern module zip file for upload and installation on a GenePattern server
 |      :param clean: boolean
 |      :return:
 |  
 |  validate(self)
 |      Perform some basic checks to help ensure that the specification is valid.
 |      Throws an exc

## GPParamSpec

In [18]:
help(gp.modules.GPParamSpec)

Help on class GPParamSpec in module gp.modules:

class GPParamSpec(builtins.object)
 |  Specification needed to create a parameter for a new GenePattern module
 |  
 |  Methods defined here:
 |  
 |  __init__(self, name=None, description='', optional=<Optional.REQUIRED: ''>, type=<Type.TEXT: 'TEXT'>, choices={}, value='', default_value='', file_format=[], min_values=0, max_values=1, flag='', prefix_when_specified=False)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  manifest_repr(self, p_num)
 |      Builds a manifest string representation of the parameters and returns it
 |      :param p_num: int
 |      :return: string
 |  
 |  validate(self)
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)



## Included Enumerations

In [19]:
help(gp.modules.CPU)

Help on class CPU in module gp.modules:

class CPU(StringEnum)
 |  An enumeration.
 |  
 |  Method resolution order:
 |      CPU
 |      StringEnum
 |      builtins.str
 |      enum.Enum
 |      builtins.object
 |  
 |  Data and other attributes defined here:
 |  
 |  ALPHA = <CPU.ALPHA: 'alpha'>
 |  
 |  ANY = <CPU.ANY: 'any'>
 |  
 |  INTEL = <CPU.INTEL: 'intel'>
 |  
 |  POWERPC = <CPU.POWERPC: 'powerpc'>
 |  
 |  SPARC = <CPU.SPARC: 'sparn'>
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from enum.Enum:
 |  
 |  name
 |      The name of the Enum member.
 |  
 |  value
 |      The value of the Enum member.
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from enum.EnumMeta:
 |  
 |  __members__
 |      Returns a mapping of member name->value.
 |      
 |      This mapping lists all enum members, including aliases. Note that this
 |      is a read-only view of

In [20]:
help(gp.modules.OS)

Help on class OS in module gp.modules:

class OS(StringEnum)
 |  An enumeration.
 |  
 |  Method resolution order:
 |      OS
 |      StringEnum
 |      builtins.str
 |      enum.Enum
 |      builtins.object
 |  
 |  Data and other attributes defined here:
 |  
 |  ANY = <OS.ANY: 'any'>
 |  
 |  LINUX = <OS.LINUX: 'linux'>
 |  
 |  MAC = <OS.MAC: 'mac'>
 |  
 |  WINDOWS = <OS.WINDOWS: 'windows'>
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from enum.Enum:
 |  
 |  name
 |      The name of the Enum member.
 |  
 |  value
 |      The value of the Enum member.
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from enum.EnumMeta:
 |  
 |  __members__
 |      Returns a mapping of member name->value.
 |      
 |      This mapping lists all enum members, including aliases. Note that this
 |      is a read-only view of the internal mapping.



In [21]:
help(gp.modules.Optional)

Help on class Optional in module gp.modules:

class Optional(StringEnum)
 |  An enumeration.
 |  
 |  Method resolution order:
 |      Optional
 |      StringEnum
 |      builtins.str
 |      enum.Enum
 |      builtins.object
 |  
 |  Data and other attributes defined here:
 |  
 |  OPTIONAL = <Optional.OPTIONAL: 'on'>
 |  
 |  REQUIRED = <Optional.REQUIRED: ''>
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from enum.Enum:
 |  
 |  name
 |      The name of the Enum member.
 |  
 |  value
 |      The value of the Enum member.
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from enum.EnumMeta:
 |  
 |  __members__
 |      Returns a mapping of member name->value.
 |      
 |      This mapping lists all enum members, including aliases. Note that this
 |      is a read-only view of the internal mapping.



In [22]:
help(gp.modules.Privacy)

Help on class Privacy in module gp.modules:

class Privacy(StringEnum)
 |  An enumeration.
 |  
 |  Method resolution order:
 |      Privacy
 |      StringEnum
 |      builtins.str
 |      enum.Enum
 |      builtins.object
 |  
 |  Data and other attributes defined here:
 |  
 |  PRIVATE = <Privacy.PRIVATE: 'private'>
 |  
 |  PUBLIC = <Privacy.PUBLIC: 'public'>
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from enum.Enum:
 |  
 |  name
 |      The name of the Enum member.
 |  
 |  value
 |      The value of the Enum member.
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from enum.EnumMeta:
 |  
 |  __members__
 |      Returns a mapping of member name->value.
 |      
 |      This mapping lists all enum members, including aliases. Note that this
 |      is a read-only view of the internal mapping.



In [23]:
help(gp.modules.Quality)

Help on class Quality in module gp.modules:

class Quality(StringEnum)
 |  An enumeration.
 |  
 |  Method resolution order:
 |      Quality
 |      StringEnum
 |      builtins.str
 |      enum.Enum
 |      builtins.object
 |  
 |  Data and other attributes defined here:
 |  
 |  DEVELOPMENT = <Quality.DEVELOPMENT: 'development'>
 |  
 |  PREPRODUCTION = <Quality.PREPRODUCTION: 'preproduction'>
 |  
 |  PRODUCTION = <Quality.PRODUCTION: 'production'>
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from enum.Enum:
 |  
 |  name
 |      The name of the Enum member.
 |  
 |  value
 |      The value of the Enum member.
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from enum.EnumMeta:
 |  
 |  __members__
 |      Returns a mapping of member name->value.
 |      
 |      This mapping lists all enum members, including aliases. Note that this
 |      is a read-only vi

In [24]:
help(gp.modules.Type)

Help on class Type in module gp.modules:

class Type(StringEnum)
 |  An enumeration.
 |  
 |  Method resolution order:
 |      Type
 |      StringEnum
 |      builtins.str
 |      enum.Enum
 |      builtins.object
 |  
 |  Data and other attributes defined here:
 |  
 |  DIRECTORY = <Type.DIRECTORY: 'DIRECTORY'>
 |  
 |  FILE = <Type.FILE: 'FILE'>
 |  
 |  FLOATING_POINT = <Type.FLOATING_POINT: 'Floating Point'>
 |  
 |  INTEGER = <Type.INTEGER: 'Integer'>
 |  
 |  PASSWORD = <Type.PASSWORD: 'PASSWORD'>
 |  
 |  TEXT = <Type.TEXT: 'TEXT'>
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from enum.Enum:
 |  
 |  name
 |      The name of the Enum member.
 |  
 |  value
 |      The value of the Enum member.
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from enum.EnumMeta:
 |  
 |  __members__
 |      Returns a mapping of member name->value.
 |      
 |      This m
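
Each enumeration above lists `StringEnum`, `builtins.str`, and `enum.Enum` in its method resolution order, which means its members are also ordinary strings. The sketch below re-creates that pattern with the standard library so the behaviour can be tried without `gp` installed. The classes here are local re-definitions for illustration, with member values copied from the `Privacy` help output, and are not imported from `gp.modules`.

```python
from enum import Enum


class StringEnum(str, Enum):
    """Enum base whose members are also str instances (mirrors the MRO shown above)."""


class Privacy(StringEnum):
    PRIVATE = "private"
    PUBLIC = "public"


# Members compare equal to their plain string values.
print(Privacy.PRIVATE == "private")

# Looking a member up by value returns the existing member object.
print(Privacy("public") is Privacy.PUBLIC)

# Members can be passed anywhere a str is expected.
print(isinstance(Privacy.PUBLIC, str))

# __members__ maps member names to members.
print(sorted(Privacy.__members__))
```

Mixing in `str` lets a member be written directly into text output such as a manifest, while code can still refer to the named constant.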