Correct spelling and other errors in readme files
veghp committed Jan 7, 2021
1 parent 5b9435a commit e529c71
Showing 5 changed files with 16 additions and 14 deletions.
6 changes: 3 additions & 3 deletions dnachisel/README.md
@@ -23,13 +23,13 @@ Here we explain the relationship between the main classes in DNA Chisel. The nex
Class methods describe how _Locations_ are created, extended, merged, intersected, imported from Genbank locations, etc.
- **SpecEvaluation** is a class describing the result of the evaluation of a *DnaOptimizationProblem* by a *Specification*. Contains a score, a message, a list of *Locations* of sub-optimal regions. Several evaluations can be grouped using classes *ProblemConstraintsEvaluations* and *ProblemObjectivesEvaluations*, which implement methods for printing or exporting as Genbank a set of evaluations.
- **SequencePattern** is a class for representing pattern. Methods define how to parse a pattern (such as "BsmBI_site") and find the pattern in a sequence.
- **MutationSpace** is a class to represent the possible mutations at different locations of the sequence for a given problem. It is initialized at the creation of the problem by the problem's constraints. A MutationSpace is basically a list of *MutationChoices* applying at various locations of the sequence. *MutationSpace* methods describe how to extract variants from the mutation space (this is used by DnaOptimizationProblem's solver to explore new sequence variants).A *MutationChoice* comprises a location and a set of sequence choices for this location, and the methods define how to extract an random choice, how to merge *MutationChoices* together, etc.
- **MutationSpace** is a class to represent the possible mutations at different locations of the sequence for a given problem. It is initialized at the creation of the problem by the problem's constraints. A MutationSpace is basically a list of *MutationChoices* applying at various locations of the sequence. *MutationSpace* methods describe how to extract variants from the mutation space (this is used by DnaOptimizationProblem's solver to explore new sequence variants). A *MutationChoice* comprises a location and a set of sequence choices for this location, and the methods define how to extract an random choice, how to merge *MutationChoices* together, etc.

## Code organization

- **biotools/** contains many methods and data tables related to biology and sequence manipulation, either used in the core DNA Chisel classes, or very helpful when writing DNA Chisel scripts (see this folder's README for more).
- **builtin specifications/** contains all the built-in _Specification_ subclasses, one per file. While most files define a directlyusable *Specification* subclass (CodonOptimize, EnforceGCContent, etc.), some files encode generic subclasses meant to be in turn subclassed (*CodonSpecification*, *TerminalSpecification*)
- **DnaOptimizationProblem/** contains the code for the *DnaOptimizationProblem* and *CircularDnaOptimizationProblem* classes. As *DnaOptimizationProblem* implements the solver and is very big, the methods in this class have been regrouped into "mixins" in other files (see this folder's README for more).
- **builtin specifications/** contains all the built-in _Specification_ subclasses, one per file. While most files define a directly usable *Specification* subclass (CodonOptimize, EnforceGCContent, etc.), some files encode generic subclasses meant to be in turn subclassed (*CodonSpecification*, *TerminalSpecification*)
- **DnaOptimizationProblem/** contains the code for the *DnaOptimizationProblem* and *CircularDnaOptimizationProblem* classes. As *DnaOptimizationProblem* implements the solver and is very big, the methods in this class have been regrouped into "mixins" in other files.
- **MutationSpace/** contains the implementation of the *MutationSpace* and *MutationChoice* classes.
- **reports/** contains methods to generate plots, PDF reports, etc. from a _DnaOptimizationProblem_ (before and after its optimization). It also contains assets (logo, stylesheet, template) for the PDF report.
- **Specification/** contains the implementation of the *Specification* class (some methods for this class are regrouped in file *FeatureRepresentationMixin.py* to make smaller files). The *SpecEvaluation/* subfolder contains the implementations of *SpecEvaluation*, *SpecEvaluations*, *ProblemConstraintEvaluations*, *ProblemObjectiveEvaluations*.
12 changes: 7 additions & 5 deletions dnachisel/biotools/README.md
@@ -6,12 +6,14 @@ DNA Chisel scripts.


- **biotables.py** provides tables (=dictionaries) of biological data, such as genetic code, IUPAC nucleotide definitions, etc.
- **blast_sequence** contains a practical BLAST method (using NCBI+). It is used in AvoidBlastMatches but could be used anywhere else.
- **enzymes_operations** is for enzyme-related methods. Currently only "list_common_enzymes", which can be practical.
- **formatting_operations** contains methods to format strings and numericals, used throughout the library.
- **blast_sequence.py** contains a practical BLAST method (using NCBI+). It is used in AvoidBlastMatches but could be used anywhere else.
- **bowtie.py** contains methods for using Bowtie in AvoidMatches.
- **enzymes_operations.py** is for enzyme-related methods. Currently only "list_common_enzymes", which can be practical.
- **formatting_operations.py** contains methods to format strings and numericals, used throughout the library.
- **gc_content.py** contains a method implementing (windowed) GC content and is notably used by EnforceGCContent.
- **genbank_operations** contains many Genbank and Biopython record related methods, used intensively in the library and examples.
- **genbank_operations.py** contains many Genbank and Biopython record related methods, used intensively in the library and examples.
- **indices_operations.py** contains methods to group or ungroup segments and sets of indices, which are used to handle breach *Locations* in the core code.
- **random_sequences.py** contains methods for generating random sequences.
- **sequences_differences.py** contains methods for comparing sequences and pointing at locations that differ.
- **sequences_operations.py** contains methods for manipulating "ATGC" strings representing sequences (methods include reverse_complement, reverse_translate, etc.)
- **data/** contains data files used by the code in this folder (genetic code, iupac definitions, etc.)
- **data/** contains data files used by the code in this folder (genetic code, IUPAC definitions, etc.)
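
As an illustration of the kind of helper listed above, windowed GC content can be sketched in plain Python. This is a minimal sketch only, not the code in **gc_content.py**: the function names `gc_content` and `windowed_gc_content` and their signatures are assumptions made for this example.

```python
def gc_content(sequence):
    """Return the fraction of G and C nucleotides in an "ATGC" string."""
    sequence = sequence.upper()
    if not sequence:
        return 0.0
    return sum(sequence.count(nucleotide) for nucleotide in "GC") / len(sequence)


def windowed_gc_content(sequence, window):
    """Return the GC fraction of every window of length `window`.

    Hypothetical helper: a sliding count is updated in O(1) per step
    instead of recounting each window from scratch.
    """
    sequence = sequence.upper()
    if window <= 0 or window > len(sequence):
        raise ValueError("window must be between 1 and len(sequence)")
    is_gc = [1 if nucleotide in "GC" else 0 for nucleotide in sequence]
    count = sum(is_gc[:window])
    fractions = [count / window]
    for i in range(window, len(sequence)):
        # Add the nucleotide entering the window, drop the one leaving it.
        count += is_gc[i] - is_gc[i - window]
        fractions.append(count / window)
    return fractions
```

A specification such as EnforceGCContent could then, in principle, compare each window's fraction against its allowed bounds; how the library actually does this is not shown in this commit.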
4 changes: 2 additions & 2 deletions dnachisel/biotools/data/README.md
@@ -1,5 +1,5 @@
# Bio data files

- **complements.csv**: Gives the reverse complements of nucleotides, including degenerate IUPAC notation nucleotides (A=>T, C=>G, M=>K, R=>Y...)
- **iupac_notation.csv**: Gives the ATGC nucleotides associated with each letter of the IUPAC notation (A=>A, "C=>C", B=>CGT, D=>AGT)
- **nucleotide_to_regexpr.csv** Gives for each letter of the IUPAC alphabet all the letters of the IUPAC alphabet which could be a match (A=>A, W=>[ATW], H=>[ACHMTWY])
- **iupac_notation.csv**: Gives the ATGC nucleotides associated with each letter of the IUPAC notation (A=>A, C=>C, B=>CGT, D=>AGT...)
- **nucleotide_to_regexpr.csv**: Gives for each letter of the IUPAC alphabet all the letters of the IUPAC alphabet which could be a match (A=>A, W=>[ATW], H=>[ACHMTWY]...)
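
The three tables described above are related, and the relationship can be sketched in Python. This is an illustrative sketch under assumptions, not how the CSV files are generated or loaded: the dictionary `IUPAC_NOTATION` is typed in by hand from the standard IUPAC nucleotide codes, and the helper names `matching_letters`, `to_regex_class` and `complement` are hypothetical.

```python
# Standard IUPAC nucleotide codes (assumed to mirror iupac_notation.csv).
IUPAC_NOTATION = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def matching_letters(letter):
    """Return every IUPAC letter whose nucleotides are all allowed by `letter`."""
    allowed = set(IUPAC_NOTATION[letter])
    return "".join(sorted(
        other for other, nucleotides in IUPAC_NOTATION.items()
        if set(nucleotides) <= allowed
    ))


def to_regex_class(letter):
    """Format the matching letters as a regular-expression fragment."""
    letters = matching_letters(letter)
    return letters if len(letters) == 1 else "[" + letters + "]"


def complement(letter):
    """Return the IUPAC letter standing for the complemented nucleotides."""
    swap = {"A": "T", "T": "A", "C": "G", "G": "C"}
    target = {swap[nucleotide] for nucleotide in IUPAC_NOTATION[letter]}
    for candidate, nucleotides in IUPAC_NOTATION.items():
        if set(nucleotides) == target:
            return candidate
    raise KeyError(letter)
```

With these definitions, `to_regex_class("W")` gives `[ATW]` and `to_regex_class("H")` gives `[ACHMTWY]`, matching the examples quoted for **nucleotide_to_regexpr.csv**, while `complement("M")` gives `K` as in **complements.csv**.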
2 changes: 1 addition & 1 deletion dnachisel/reports/README.md
@@ -1,4 +1,4 @@
# Reports module

This module is contains templates and methods for reports generation.
This module is contains templates and methods for report generation.
It notably implements the end-to-end method ``optimize_with_report``.
6 changes: 3 additions & 3 deletions docs/genbank/genbank_notes.rst
@@ -34,10 +34,10 @@ e.g. ``~MySpec(..., boost=2)``.
src='../_static/images/genbank_annotations/boosting.png'></img>


Multiple specifications in a same annotation
--------------------------------------------
Multiple specifications in the same annotation
----------------------------------------------

If you want to write less annotations, you can define several specifications in
If you want to write fewer annotations, you can define several specifications in
a single feature, separating them with the ``&`` symbol. For instance, to
conserve a gene while getting rid of CpG islands and keeping the global GC%
between 45% and 55%: