Merge pull request #10 from AlfonsoJan/dev
Made the rust code better
AlfonsoJan committed Feb 19, 2024
2 parents a21791e + 56b3571 commit 8a76494
Showing 7 changed files with 165 additions and 70 deletions.
8 changes: 8 additions & 0 deletions CHANGES.md
@@ -0,0 +1,8 @@
# Changelog

All notable changes to this project will be documented in this file.

## [1.0.3] - 2024-02-19

* Broke down the Rust function into multiple functions
* Added information about the input in the readme
2 changes: 1 addition & 1 deletion Cargo.lock

Some generated files are not rendered by default.

2 changes: 1 addition & 1 deletion Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "sbsgenerator"
version = "1.0.2"
version = "1.0.3"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
29 changes: 23 additions & 6 deletions README.md
@@ -4,19 +4,15 @@
<h2 align="center">SBSGenerator</h2>

<p align="center">
<a href="https://github.com/AlfonsoJan/sbsgenerator/"><img alt="Actions Status" src="https://img.shields.io/badge/docs-latest-blue.svg"></a>
<a href="https://github.com/alfonsojan/sbsgenerator/actions"><img alt="Actions Status" src="https://github.com/alfonsojan/sbsgenerator/actions/workflows/deploy.yml/badge.svg"></a>
<a href="https://github.com/alfonsojan/sbsgenerator/blob/main/LICENSE"><img alt="License: MIT" src="https://black.readthedocs.io/en/stable/_static/license.svg"></a>
<a href="https://pypi.org/project/sbsgenerator/"><img alt="PyPI" src="https://img.shields.io/pypi/v/sbsgenerator"></a>
<a href="https://pypi.python.org/pypi/sbsgenerator/"><img alt="PyPI" src="https://img.shields.io/pypi/pyversions/sbsgenerator.svg"></a>
<a href="https://github.com/alfonsojan/sbsgenerator/blob/main/LICENSE"><img alt="License: MIT" src="https://black.readthedocs.io/en/stable/_static/license.svg"></a>
</p>

_SBSGenerator_ is a Python package for bioinformaticians and researchers working in genomics. It provides tools for generating, analyzing, and interpreting single base substitution (SBS) mutations from Variant Call Format (VCF) files, with a focus on ease of use, efficiency, and scalability. SBSGenerator is written in a hybrid of Python and Rust and uses the PyO3 library to combine Python's flexibility with Rust's performance, which helps it handle large-scale genomic data.

- [Installation](#installation)
- [Usage](#usage)
- [Contributing](#contributing)
- [License](#license)

## Installation

```bash
@@ -35,6 +31,25 @@ The `SBSGenerator` package is designed to facilitate the generation and analysis
- Context 7: The dataframe contains all of the following pyrimidine single nucleotide variants, NNN[{C > A, G, or T} or {T > A, G, or C}]NNN.
*64 (4x4x4) possible preceding trinucleotides x 6 pyrimidine variants x 64 (4x4x4) possible following trinucleotides = 24576 total combinations.*
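
The arithmetic for a given context can be checked by enumerating the labels directly. The following is a minimal sketch, not the package's own code: `enumerate_contexts` and the `LEFT[REF>ALT]RIGHT` label layout are assumptions made for illustration.

```python
import itertools

NUCLEOTIDES = "ACGT"
# The six pyrimidine-centred substitutions: C>A, C>G, C>T, T>A, T>C, T>G
SUBSTITUTIONS = [f"{ref}>{alt}" for ref in "CT" for alt in NUCLEOTIDES if alt != ref]


def enumerate_contexts(context: int) -> list:
    """Hypothetical helper: list every mutation label for an odd context size."""
    if context < 3 or context % 2 == 0:
        raise ValueError("context must be an odd number of at least 3")
    # Number of flanking bases on each side of the mutated base
    flank = (context - 1) // 2
    flanks = ["".join(p) for p in itertools.product(NUCLEOTIDES, repeat=flank)]
    return [
        f"{left}[{sub}]{right}"
        for left in flanks
        for sub in SUBSTITUTIONS
        for right in flanks
    ]


labels = enumerate_contexts(7)
print(len(labels))  # 64 * 6 * 64 = 24576
print(labels[0])    # AAA[C>A]AAA
```

The same helper gives 96 labels for context 3 and 1536 for context 5, since each extra pair of flanking bases multiplies the count by 16.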

### VCF INPUT FILE FORMAT

This tool currently supports only VCF (Variant Call Format) input. Each input file should provide the following fields:

| Name | Description | Data type |
|:-:|:-:|:-:|
| Type | Represents the type of mutation. | str |
| Gene | Indicates the specific gene associated with the mutation. | str |
| PMID | Refers to the PubMed ID of the associated research paper. | str |
| Genome | Specifies the genome version used for mapping. | str |
| Mutation Type | Describes the type of mutation. | str |
| Chromosome | Represents the chromosome number where the mutation occurs. | str |
| Start Position | Indicates the starting position of the mutation on the chromosome. | str |
| End Position | Represents the ending position of the mutation on the chromosome. | str |
| Reference Allele | Denotes the original allele at the mutation site. | str |
| Mutant Allele | Represents the altered allele resulting from the mutation. | str |
| Method | Describes the method used for mutation detection. | str |



```python
from sbsgenerator import generator
@@ -50,6 +65,8 @@ sbsgen = generator.SBSGenerator(
ref_genome=ref_genome
)
sbsgen.count_mutations()
# The attribute count_samples holds the SBS matrix
sbsgen.count_samples
```

## Contributing
1 change: 1 addition & 0 deletions pyproject.toml
@@ -39,6 +39,7 @@ dependencies = [

[project.urls]
Homepage = "https://github.com/AlfonsoJan/sbsgenerator"
Changelog = "https://github.com/AlfonsoJan/sbsgenerator/blob/main/CHANGES.md"

[project.optional-dependencies]
test = ["pytest"]
24 changes: 20 additions & 4 deletions python/sbsgenerator/generator.py
@@ -33,7 +33,7 @@
import itertools


def generate_mutation_list():
def generate_mutation_list() -> list:
"""
Generate a list of mutations in the format 'C>A', 'C>G', etc.
@@ -60,7 +60,7 @@ def create_sort_regex(context: int) -> str:
return rf"({r_string})"


def increase_mutations(context: int) -> list[str]:
def increase_mutations(context: int) -> list:
"""
Increases mutations in a given column based on a specified context.
@@ -71,7 +71,7 @@ def increase_mutations(context: int) -> list[str]:
list: A list of increased mutations based on the specified context.
"""
if context < 3:
raise ValueError("Context must be aleast 3")
raise ValueError("Context must be at least 3")
nucleotides = ["A", "C", "G", "T"]
combinations = list(itertools.product(nucleotides, repeat=context - 3))
# Generate new mutations based on the context and combinations
@@ -95,7 +95,23 @@ class NotADirectoryError(Exception):
pass


def validate_input(func):
def validate_input(func: callable) -> callable:
"""
Decorator function that validates the input parameters of the decorated function.
Args:
func (function): The function to be decorated.
Returns:
function: The decorated function.
Raises:
ValueError: If the 'context' parameter is not an odd number greater than 1.
TypeError: If the 'vcf_files' parameter is not a list or tuple.
FileNotFoundError: If any of the 'vcf_files' do not exist.
NotADirectoryError: If the 'ref_genome' parameter is not a valid directory.
"""

@wraps(func)
def wrapper(context, vcf_files, ref_genome, **kwargs):
# Check if context is an odd number greater than 1 and an integer
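
The diff view cuts off before the wrapper body. As a rough sketch only, the checks listed in the docstring could be written along these lines. `validate_input_sketch` and `describe` are hypothetical names, and the built-in `NotADirectoryError` stands in for the module's own exception class of the same name.

```python
import os
from functools import wraps


def validate_input_sketch(func):
    """Hypothetical stand-in for validate_input; not the project's actual code."""

    @wraps(func)
    def wrapper(context, vcf_files, ref_genome, **kwargs):
        # context must be an integer, odd, and greater than 1
        if not isinstance(context, int) or context <= 1 or context % 2 == 0:
            raise ValueError("context must be an odd integer greater than 1")
        # vcf_files must be a list or tuple, and every entry must exist
        if not isinstance(vcf_files, (list, tuple)):
            raise TypeError("vcf_files must be a list or tuple")
        for path in vcf_files:
            if not os.path.isfile(path):
                raise FileNotFoundError(f"{path} does not exist")
        # ref_genome must be an existing directory (built-in exception used here)
        if not os.path.isdir(ref_genome):
            raise NotADirectoryError(f"{ref_genome} is not a directory")
        return func(context, vcf_files, ref_genome, **kwargs)

    return wrapper


@validate_input_sketch
def describe(context, vcf_files, ref_genome, **kwargs):
    return f"context={context}, files={len(vcf_files)}"


print(describe(3, [], os.getcwd()))  # context=3, files=0
```

Running the checks in a decorator keeps them out of the decorated function's body, and `wraps` preserves that function's name and docstring.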