.. module:: eugene
.. automodule:: eugene
   :noindex:

API

preprocess

from eugene import preprocess

This module lets users interact with and modify SeqData objects to prepare them for model training and other steps of the workflow. There are three main classes of preprocessing functions: sequence preprocessing, train-test splitting, and target preprocessing.

.. module:: eugene.preprocess
.. currentmodule:: eugene

Sequence preprocessing

.. autosummary::
   :toctree: api/

   preprocess.make_unique_ids_sdata
   preprocess.pad_seqs_sdata
   preprocess.ohe_seqs_sdata
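
As a rough illustration of what padding and one-hot encoding mean for sequences, here is a minimal standalone sketch in plain Python. It does not call EUGENe and is not its implementation; the helper names `pad_seq` and `one_hot_encode`, the `ACGT` alphabet order, and the choice of `N` as the pad character are all assumptions made for this example.

```python
# Conceptual sketch only; this is NOT EUGENe's implementation.
ALPHABET = "ACGT"  # assumed DNA alphabet and channel order


def pad_seq(seq, length, pad_char="N"):
    """Right-pad (or truncate) a sequence to a fixed length."""
    return seq[:length].ljust(length, pad_char)


def one_hot_encode(seq, alphabet=ALPHABET):
    """Return one row per position; characters outside the alphabet become all zeros."""
    index = {base: i for i, base in enumerate(alphabet)}
    encoded = []
    for base in seq.upper():
        row = [0] * len(alphabet)
        if base in index:
            row[index[base]] = 1
        encoded.append(row)
    return encoded


seqs = ["ACGT", "GA"]
max_len = max(len(s) for s in seqs)
padded = [pad_seq(s, max_len) for s in seqs]
encoded = [one_hot_encode(s) for s in padded]
```

Padding first gives every encoded sequence the same shape, which is what batched model training generally requires.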

Train-test splitting

.. autosummary::
   :toctree: api/

   preprocess.train_test_chrom_split
   preprocess.train_test_homology_split
   preprocess.train_test_random_split
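
The difference between a chromosome-based split and a random split can be sketched in a few lines of plain Python. This is a conceptual example under assumed inputs (a list of chromosome labels, one per sequence); the helper names `chrom_split` and `random_split` are hypothetical and do not reflect the signatures of the EUGENe functions listed above.

```python
import random


def chrom_split(chroms, test_chroms):
    """Return (train_idx, test_idx) so that no chromosome appears in both sets."""
    held_out = set(test_chroms)
    train_idx = [i for i, c in enumerate(chroms) if c not in held_out]
    test_idx = [i for i, c in enumerate(chroms) if c in held_out]
    return train_idx, test_idx


def random_split(n, test_frac=0.2, seed=0):
    """Return (train_idx, test_idx) from a seeded shuffle of range(n)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(n * test_frac)
    return sorted(idx[n_test:]), sorted(idx[:n_test])


chroms = ["chr1", "chr2", "chr1", "chr8", "chr2"]
chrom_train, chrom_test = chrom_split(chroms, test_chroms=["chr8"])
rand_train, rand_test = random_split(len(chroms), test_frac=0.2, seed=0)
```

Holding out whole chromosomes avoids placing neighboring or overlapping genomic regions on both sides of the split, which a purely random split cannot guarantee.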

Target preprocessing

.. autosummary::
   :toctree: api/

   preprocess.clamp_targets_sdata
   preprocess.scale_targets_sdata
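
To make the two target transformations concrete, the sketch below clamps values to a fixed range and then standardizes them to zero mean and unit variance. It is a generic illustration, not EUGENe's code: the helper names, the use of fixed bounds for clamping, and the choice of z-score scaling are assumptions for this example.

```python
import statistics


def clamp(values, lower, upper):
    """Limit every value to the closed interval [lower, upper]."""
    return [min(max(v, lower), upper) for v in values]


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        # All values are identical, so only centering is meaningful.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


targets = [-5.0, 0.5, 12.0]
clamped = clamp(targets, lower=0.0, upper=10.0)
scaled = standardize([1.0, 2.0, 3.0])
```

Clamping before scaling keeps extreme outliers from dominating the mean and standard deviation used in the rescaling step.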

dataload

from eugene import dataload

This module helps users prepare their SeqData objects for model training and other steps of the workflow (e.g. augmentation).

.. module:: eugene.dataload
.. currentmodule:: eugene

SeqData utilities

.. autosummary::
   :toctree: api/

   dataload.concat_sdatas
   dataload.add_obs

Augmentation

.. autosummary::
   :toctree: api/

   dataload.RandomRC
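
Random reverse-complement augmentation, the idea that the name `RandomRC` suggests, can be sketched on plain strings as follows. This is a conceptual example only; the functions `reverse_complement` and `random_rc` and the probability parameter `p` are assumptions for illustration and do not describe the interface of the EUGENe class.

```python
import random

# Translation table mapping each base to its complement (case preserved).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Complement every base, then reverse the sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def random_rc(seq, p=0.5, rng=random):
    """Return the reverse complement with probability p, else the sequence unchanged."""
    return reverse_complement(seq) if rng.random() < p else seq


augmented = [random_rc(s, p=0.5, rng=random.Random(0)) for s in ["AACG", "TTGA"]]
```

Because both strands of DNA carry the same regulatory information, showing a model either orientation at random is a common way to encourage strand-invariant predictions.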

models

from eugene import models

This module lets users build and initialize several neural network architectures designed for biological sequences.

Blocks

Blocks are composed to create architectures in EUGENe. The arguments accepted by the dense_kwargs and recurrent_kwargs arguments of all built-in models are documented in the DenseBlock and RecurrentBlock classes, respectively. See the towers section for more information on the conv_kwargs argument.

.. module:: eugene.models
.. currentmodule:: eugene
.. autosummary::
   :toctree: api/classes

   models.DenseBlock
   models.Conv1DBlock
   models.RecurrentBlock

Towers

The Conv1DTower class is currently used for all built-in CNNs. It will be deprecated in the future in favor of the more general Tower class. For now, the arguments accepted by the cnn_kwargs argument of all built-in CNNs are documented in the Conv1DTower class.

.. autosummary::
   :toctree: api/classes

   models.Tower
   models.Conv1DTower

LightningModules

.. autosummary::
   :toctree: api/classes

   models.SequenceModule
   models.ProfileModule

Initialization

.. autosummary::
   :toctree: api/

   models.init_weights
   models.init_motif_weights

Zoo

Arguments for the cnn_kwargs, recurrent_kwargs and dense_kwargs of all models can be found in the Conv1DTower, RecurrentBlock and DenseBlock classes, respectively. See the blocks section and the towers section for more information. The Satori architecture currently uses the MultiHeadAttention layer; see eugene.models.base._layers for more information on the mha_kwargs argument.

.. module:: eugene.models.zoo
.. currentmodule:: eugene
.. autosummary::
   :toctree: api/classes

   models.zoo.FCN
   models.zoo.dsFCN
   models.zoo.CNN
   models.zoo.dsCNN
   models.zoo.RNN
   models.zoo.dsRNN
   models.zoo.Hybrid
   models.zoo.dsHybrid
   models.zoo.TutorialCNN
   models.zoo.DeepBind
   models.zoo.ResidualBind
   models.zoo.Kopp21CNN
   models.zoo.DeepSEA
   models.zoo.Basset
   models.zoo.FactorizedBasset
   models.zoo.DanQ
   models.zoo.Satori
   models.zoo.Jores21CNN
   models.zoo.DeepSTARR
   models.zoo.BPNet
   models.zoo.DeepMEL
   models.zoo.scBasset

Utilities

.. autosummary::
   :toctree: api/

   models.list_available_layers
   models.get_layer
   models.load_config

train

from eugene import train

Training procedures for data and models.

.. module:: eugene.train
.. currentmodule:: eugene
.. autosummary::
   :toctree: api/

   train.fit
   train.fit_sequence_module
   train.hyperopt

evaluate

from eugene import evaluate

Evaluation functions for trained models, including both prediction helpers and metrics.

.. module:: eugene.evaluate
.. currentmodule:: eugene

Predictions

.. autosummary::
   :toctree: api/

   evaluate.predictions
   evaluate.predictions_sequence_module
   evaluate.train_val_predictions
   evaluate.train_val_predictions_sequence_module

interpret

from eugene import interpret

Interpretation suite of EUGENe, currently broken into filter visualization, feature attribution, and in silico experimentation.

.. module:: eugene.interpret
.. currentmodule:: eugene

Filter interpretation

.. autosummary::
   :toctree: api/

   interpret.generate_pfms_sdata
   interpret.filters_to_meme_sdata
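
Filter visualization commonly summarizes the subsequences that most strongly activate a convolutional filter as a position frequency matrix (PFM). The sketch below shows only that counting step on a hypothetical list of equal-length subsequences. Whether the functions above work exactly this way is an assumption, and `position_frequency_matrix` is a name invented for this example.

```python
def position_frequency_matrix(subseqs, alphabet="ACGT"):
    """Count how often each base occurs at each position of aligned subsequences."""
    length = len(subseqs[0])
    counts = [{base: 0 for base in alphabet} for _ in range(length)]
    for subseq in subseqs:
        for pos, base in enumerate(subseq):
            if base in counts[pos]:  # skip characters outside the alphabet
                counts[pos][base] += 1
    return counts


# Hypothetical subsequences assumed to activate one filter strongly.
activating = ["ACG", "ACT", "AAG"]
pfm = position_frequency_matrix(activating)
```

A PFM of this form is the kind of per-position count table that motif file formats such as MEME are built from.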

Attribution analysis

.. autosummary::
   :toctree: api/

   interpret.attribute_sdata

Global importance analysis (GIA)

.. autosummary::
   :toctree: api/

   interpret.positional_gia_sdata
   interpret.motif_distance_dependence_gia
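
The idea behind a positional global importance analysis can be sketched without any model: implant a motif at each position of a set of background sequences and record the average change in score. In the example below, `toy_score` is a hypothetical stand-in for a trained model, and `implant` and `positional_gia` are illustrative names rather than EUGENe functions.

```python
def toy_score(seq):
    """Hypothetical stand-in for a trained model: counts occurrences of 'GATA'."""
    return sum(seq[i:i + 4] == "GATA" for i in range(len(seq) - 3))


def implant(seq, motif, pos):
    """Overwrite the bases of seq starting at pos with the motif."""
    return seq[:pos] + motif + seq[pos + len(motif):]


def positional_gia(backgrounds, motif, score_fn):
    """Mean score change from implanting the motif at each valid position."""
    length = len(backgrounds[0])
    baseline = [score_fn(s) for s in backgrounds]
    effects = []
    for pos in range(length - len(motif) + 1):
        deltas = [
            score_fn(implant(s, motif, pos)) - b
            for s, b in zip(backgrounds, baseline)
        ]
        effects.append(sum(deltas) / len(deltas))
    return effects


backgrounds = ["CCCCCC", "TTTTTT"]
effects = positional_gia(backgrounds, motif="GATA", score_fn=toy_score)
```

Averaging over many background sequences is what makes the estimate "global": it reflects the effect of the motif itself rather than of any single sequence context.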

Generative

.. autosummary::
   :toctree: api/

   interpret.evolve_seqs_sdata
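
In silico evolution is usually described as a greedy loop: score every single-base mutant, keep the best one, and repeat. The sketch below follows that description with a hypothetical scoring function in place of a trained model. The names `toy_score` and `evolve`, and the greedy strategy itself, are assumptions for illustration rather than a description of EUGENe's implementation.

```python
def toy_score(seq):
    """Hypothetical stand-in for a trained model: counts occurrences of 'GATA'."""
    return sum(seq[i:i + 4] == "GATA" for i in range(len(seq) - 3))


def evolve(seq, score_fn, rounds=1, alphabet="ACGT"):
    """Greedily apply the single-base mutation that most increases the score."""
    for _ in range(rounds):
        best_seq, best_score = seq, score_fn(seq)
        for pos in range(len(seq)):
            for base in alphabet:
                if base == seq[pos]:
                    continue
                mutant = seq[:pos] + base + seq[pos + 1:]
                mutant_score = score_fn(mutant)
                if mutant_score > best_score:
                    best_seq, best_score = mutant, mutant_score
        if best_seq == seq:
            break  # no single mutation improves the score
        seq = best_seq
    return seq


evolved = evolve("GATCAAAA", toy_score, rounds=3)
```

With a real model, `score_fn` would wrap a forward pass on the encoded sequence; the loop structure stays the same.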

plot

from eugene import plot

Plotting suite in EUGENe for multiple aspects of the workflow.

.. module:: eugene.plot
.. currentmodule:: eugene

Categorical plotting

.. autosummary::
   :toctree: api/

   plot.countplot
   plot.histplot
   plot.boxplot
   plot.violinplot
   plot.scatterplot

Training summaries

.. autosummary::
   :toctree: api/

   plot.metric_curve
   plot.loss_curve
   plot.training_summary

Performance

.. autosummary::
   :toctree: api/

   plot.performance_scatter
   plot.confusion_mtx
   plot.auroc
   plot.auprc
   plot.performance_summary

Sequences

.. autosummary::
   :toctree: api/

   plot.seq_track
   plot.multiseq_track
   plot.filter_viz
   plot.multifilter_viz

Global importance analysis (GIA)

.. autosummary::
   :toctree: api/

   plot.positional_gia_plot
   plot.distance_cooperativity_gia_plot

utils

.. module:: eugene.utils
.. currentmodule:: eugene

File I/O

.. autosummary::
   :toctree: api/

   utils.make_dirs