add readme
milescsmith committed Feb 3, 2022
1 parent 1f4d345 commit c142343
Showing 7 changed files with 94 additions and 80 deletions.
26 changes: 25 additions & 1 deletion CHANGELOG.md
@@ -1,24 +1,43 @@
# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [0.2.0] - 2022-02-02

### Added

- A real README.md

### Changed

- Removed the conditional requirement on `importlib_metadata`, since Python >=3.8 is now required

## [0.1.11] - 2022-02-02

### Changed

- updated dependency versions

## [0.1.10] - 2021-12-20

### Changed

- updated dependency versions


## [0.1.9] - 2021-04-20

### Changed

- updated dependency versions

## [0.1.8] - 2021

### Changed

- renamed `logging.py` to `logger.py` to avoid colliding with the standard module name
- alter how the PED file is written: for whatever reason, `Path().write_text()` was not working,
but using an open-file context manager with `writelines()` does.
@@ -27,19 +46,24 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
## [0.1.7] - 2021-02-22

### Added

- CHANGELOG.md
- more logging info

### Fixed

- add missing import of liftPed to plinkliftover.__main__ from plinkliftover.liftover
- if the liftOver executable cannot be found, raise an error

## [0.1.6] - 2021-02-22

### Fixed
- replace `set` and `tuple` in type hints with their `typing.Set` and `typing.Tuple` counterparts

- replace `set` and `tuple` in type hints with their `typing.Set` and `typing.Tuple` counterparts

[0.2.0]: https://github.com/olivierlacan/keep-a-changelog/compare/0.1.11...0.2.0
[0.1.11]: https://github.com/olivierlacan/keep-a-changelog/compare/0.1.10...0.1.11
[0.1.10]: https://github.com/olivierlacan/keep-a-changelog/compare/0.1.9...0.1.10
[0.1.9]: https://github.com/olivierlacan/keep-a-changelog/compare/0.1.8...0.1.9
[0.1.8]: https://github.com/olivierlacan/keep-a-changelog/compare/0.1.7...0.1.8
[0.1.7]: https://github.com/olivierlacan/keep-a-changelog/compare/0.1.6...0.1.7
83 changes: 46 additions & 37 deletions README.md
@@ -1,75 +1,84 @@
# plinkliftover
# PLINKLiftOver

<div align="center">
<div align="justified">

[![Build status](https://github.com/milescsmith/plinkliftover/workflows/build/badge.svg?branch=master&event=push)](https://github.com/milescsmith/plinkliftover/actions?query=workflow%3Abuild)
[![Python Version](https://img.shields.io/pypi/pyversions/plinkliftover.svg)](https://pypi.org/project/plinkliftover/)
[![Dependencies Status](https://img.shields.io/badge/dependencies-up%20to%20date-brightgreen.svg)](https://github.com/milescsmith/plinkliftover/pulls?utf8=%E2%9C%93&q=is%3Apr%20author%3Aapp%2Fdependabot)

[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[![Security: bandit](https://img.shields.io/badge/security-bandit-green.svg)](https://github.com/PyCQA/bandit)
[![Pre-commit](https://img.shields.io/badge/pre--commit-enabled-brightgreen?logo=pre-commit&logoColor=white)](https://github.com/milescsmith/plinkliftover/blob/master/.pre-commit-config.yaml)
[![Semantic Versions](https://img.shields.io/badge/%F0%9F%9A%80-semantic%20versions-informational.svg)](https://github.com/milescsmith/plinkliftover/releases)
[![License](https://img.shields.io/github/license/milescsmith/plinkliftover)](https://github.com/milescsmith/plinkliftover/blob/master/LICENSE)
![Alt](https://repobeats.axiom.co/api/embed/8d9c682229fb45f45eef3f300367eb33a44bd347.svg "Repobeats analytics image")

Awesome `plinkliftover` is a Python cli/package created with https://github.com/TezRomacH/python-package-template
**PLINKLiftOver** is a utility enabling [liftOver](http://genome.ucsc.edu/cgi-bin/hgLiftOver)
to work on genomics files from [PLINK](https://www.cog-genomics.org/plink/),
allowing one to update the coordinates from one genome reference version to
another.

</div>

## Installation

PLINKLiftOver requires
* Python 3.8 or later
* The command line version of [liftOver](http://genome.ucsc.edu/cgi-bin/hgLiftOver),
installed and on the system path
* An appropriate [chain file](http://hgdownload.soe.ucsc.edu/downloads.html#liftover)
* The [MAP file](https://zzz.bwh.harvard.edu/plink/data.shtml) from a PLINK
dataset

```bash
pip install -U plinkliftover
```

or install with `Poetry`
or install the development version with

```bash
poetry add plinkliftover
pip install -U git+https://github.com/milescsmith/plinkliftover.git
```

Then you can run
## Usage

```bash
plinkliftover --help
Usage: plinkliftover [OPTIONS] MAPFILE CHAINFILE

Converts genotype data stored in plink's PED+MAP format from one genome
build to another, using liftOver.
Arguments:
MAPFILE The plink MAP file to `liftOver`. [required]
CHAINFILE The location of the chain files to provide to `liftOver`.
[required]
Options:
--pedfile TEXT Optionally remove "unlifted SNPs" from the plink
PED file after running `liftOver`.
--datfile TEXT Optionally remove 'unlifted SNPs' from a data
file containing a list of SNPs (e.g. for
--exclude or --include in `plink`)
--prefix TEXT The prefix to give to the output files.
--liftoverexecutable TEXT The location of the `liftOver` executable.
-v, --version Prints the version of the plinkliftover package.
--help Show this message and exit.
```
For example
```bash
plinkliftover --name Roman
plinkliftover updating.map hg19ToHg38.over.chain.gz
```
or if installed with `Poetry`:
### Note!
```bash
poetry run plinkliftover --help
```
By default, [PLINK 2.0](https://www.cog-genomics.org/plink/2.0/) does not
use/create the required MAP file. It can be generated using PLINK 1.9 by
```bash
poetry run plinkliftover --name Roman
plink --bfile original --recode --out to_update
```
## 📈 Releases

You can see the list of available releases on the [GitHub Releases](https://github.com/milescsmith/plinkliftover/releases) page.

We follow [Semantic Versions](https://semver.org/) specification.

We use [`Release Drafter`](https://github.com/marketplace/actions/release-drafter). As pull requests are merged, a draft release is kept up-to-date listing the changes, ready to publish when you’re ready. With the categories option, you can categorize pull requests in release notes using labels.

For Pull Request this labels are configured, by default:

| **Label** | **Title in Releases** |
| :-----------------------------------: | :---------------------: |
| `enhancement`, `feature` | 🚀 Features |
| `bug`, `refactoring`, `bugfix`, `fix` | 🔧 Fixes & Refactoring |
| `build`, `ci`, `testing` | 📦 Build System & CI/CD |
| `breaking` | 💥 Breaking Changes |
| `documentation` | 📝 Documentation |
| `dependencies` | ⬆️ Dependencies updates |

You can update it in [`release-drafter.yml`](https://github.com/milescsmith/plinkliftover/blob/master/.github/release-drafter.yml).

GitHub creates the `bug`, `enhancement`, and `documentation` labels for you. Dependabot creates the `dependencies` label. Create the remaining labels on the Issues tab of your GitHub repository, when you need them.
where `original` is the prefix for the bed/bim/fam files and `to_update` is the prefix to give the new files.
## 🛡 License
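Note on the README requirements: `liftOver` must be installed and on the system path, and the 0.1.7 changelog entry says a missing executable raises an error. A minimal sketch of such a check, assuming `shutil.which` is used for the lookup; `find_liftover` is a hypothetical helper for illustration, not a function taken from the package:

```python
import shutil
from pathlib import Path


def find_liftover(name: str = "liftOver") -> Path:
    """Return the path of the liftOver executable, or raise if it is not on PATH.

    Hypothetical helper: the name and the exception type are assumptions.
    """
    found = shutil.which(name)
    if found is None:
        raise FileNotFoundError(
            f"Could not find `{name}` on the system path; install it or pass its location explicitly."
        )
    return Path(found)
```

Failing early like this gives the user a clear message before any MAP or BED files are written.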
12 changes: 6 additions & 6 deletions poetry.lock


5 changes: 2 additions & 3 deletions pyproject.toml
@@ -6,7 +6,7 @@ build-backend = "poetry.masonry.api"

[tool.poetry]
name = "plinkliftover"
version = "0.1.11"
version = "0.2.0"
description = "Converts genotype data stored in plink's PED+MAP format from one genome build to another, using liftOver"
readme = "README.md"
authors = [
@@ -32,8 +32,7 @@ classifiers = [ # Update me
"plinkliftover" = "plinkliftover.__main__:app"

[tool.poetry.dependencies]
python = "^3.8"
importlib_metadata = {version = "^3.4.0", python = "<3.8"}
python = ">=3.8.3,<4.0"
typer = "^0.4.0"
rich = "^10.2.1"
psutil = "^5.8.0"
6 changes: 1 addition & 5 deletions src/plinkliftover/__init__.py
@@ -1,10 +1,6 @@
# type: ignore[attr-defined]
"""`plinkliftover` Converts genotype data stored in plink's PED+MAP format from one genome build to another, using liftOver"""
try:
from importlib.metadata import PackageNotFoundError, version
except ImportError: # pragma: no cover
from importlib_metadata import PackageNotFoundError, version

from importlib.metadata import PackageNotFoundError, version

try:
__version__ = version(__name__)
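Note on the `__init__.py` change: with Python >=3.8 required, `importlib.metadata` is always available from the standard library, so the `importlib_metadata` fallback can go. A minimal sketch of the version lookup, assuming an uninstalled package should map to a placeholder string; `installed_version` and the `"unknown"` default are hypothetical, since the rest of the hunk is not shown:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name: str, default: str = "unknown") -> str:
    """Look up the installed version of a distribution.

    Falls back to `default` (an assumed placeholder) when the distribution
    is not installed, e.g. when running from a source checkout.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return default
```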
12 changes: 3 additions & 9 deletions src/plinkliftover/__main__.py
@@ -5,9 +5,9 @@
from pathlib import Path

import typer
from plinkliftover import __version__
from plinkliftover.liftover import bed2map, liftBed, liftDat, liftPed, map2bed
from plinkliftover.logger import plo_logger as logger
from . import __version__
from .liftover import bed2map, liftBed, liftDat, liftPed, map2bed
from .logger import plo_logger as logger
from rich.console import Console

app = typer.Typer(
@@ -66,9 +66,6 @@ def main(
# Show usage message if user hasn't provided any arguments, rather
# than giving a non-descript error message with the usage()

lifted_set = set()
unlifted_set = set()

mapfile = Path(mapfile)
oldbed = mapfile.with_suffix(".bed")
map2bed(mapfile, oldbed)
@@ -94,14 +91,11 @@ def main(
)

newbed = Path(f"{mapfile}.bed")
# unlifted = Path(f"{prefix}.unlifted")
lifted_set, unlifted_set, lb_status = liftBed(
fin=oldbed,
fout=newbed,
chainfile=chainfile,
liftOverPath=liftOverPath,
unlifted_set=unlifted_set,
lifted_set=lifted_set,
)

if lb_status:
30 changes: 11 additions & 19 deletions src/plinkliftover/liftover.py
@@ -23,7 +23,7 @@
from pathlib import Path
from subprocess import check_output

from plinkliftover.logger import plo_logger as logger
from .logger import plo_logger as logger
from rich.console import Console
from typer import progressbar

@@ -36,10 +36,10 @@ def map2bed(fin: Path, fout: Path) -> bool:
f"Converting [green]MAP[/] file [yellow]{fin.name}[/] file to [green]UCSC BED[/] file [blue]{fout.name}[/]..."
)
lines = fin.read_text().split("\n")
output = []
output = list()
with progressbar(lines) as map_lines:
for ln in map_lines:
if len(x := ln.split()) == 4:
for line in map_lines:
if len(x := line.split()) == 4:
chrom, rs, _, pos = x
output.append(f"chr{chrom}\t{int(pos)-1}\t{int(pos)}\t{rs}")
else:
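Note on `map2bed`: each four-column MAP line (chromosome, SNP ID, genetic distance, base-pair position) becomes a one-base BED interval. BED coordinates are 0-based and half-open, which is why the start is `pos - 1` and the end is `pos`. A minimal per-line sketch of that conversion; `map_line_to_bed` is a hypothetical helper, and returning `None` for malformed lines is an assumption because the `else:` branch is not shown in the hunk:

```python
from typing import Optional


def map_line_to_bed(line: str) -> Optional[str]:
    """Convert one PLINK MAP line to a UCSC BED line, or None if malformed."""
    fields = line.split()
    if len(fields) != 4:
        return None  # assumption: skip lines that do not have exactly 4 columns
    chrom, rs, _, pos = fields
    # 1-based MAP position -> 0-based, half-open BED interval of length 1
    return f"chr{chrom}\t{int(pos) - 1}\t{int(pos)}\t{rs}"
```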
@@ -53,8 +53,6 @@ def liftBed(
fout: Path,
chainfile: Path,
liftOverPath: Path,
unlifted_set: Set[str],
lifted_set: Set[str],
) -> Tuple[Set[str], Set[str], bool]:
console.print(f"Lifting [green]BED[/] file [blue]{fin.name}[/]...")
params = {
@@ -70,18 +68,14 @@ def liftBed(
unlifted_lines = Path(params["UNLIFTED"]).read_text().split("\n")
console.print(f"Processing [red]unlifted[/] {fout.name}.unlifted")
with progressbar(unlifted_lines) as unlifted:
for ln in unlifted:
if len(ln) == 0 or ln[0] == "#":
continue
unlifted_set.add(ln.strip().split()[-1])
print("Using new set comprehension for 'unlifted_set'")
unlifted_set = {ln.strip().split()[-1] for ln in unlifted if len(ln) > 0 and ln[0] != "#"}

console.print(f"Processing [red]new[/] {fout.name}")
new_bed_lines = Path(params["NEW"]).read_text().split("\n")
with progressbar(new_bed_lines) as new_bed:
for ln in new_bed:
if len(ln) == 0 or ln[0] == "#":
continue
lifted_set.add(ln.strip().split()[-1])
print("Using new set comprehension for 'lifted_set'")
lifted_set = {ln.strip().split()[-1] for ln in new_bed if len(ln) != 0 and ln[0] != "#"}

return lifted_set, unlifted_set, True
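Note on the set comprehensions in `liftBed`: both collect the last whitespace-separated field (the SNP ID) of every line that is neither empty nor a `#` comment. A minimal standalone sketch of that filter; `ids_from_bed_lines` is a hypothetical name, and unlike the committed code it also skips whitespace-only lines, which would otherwise raise an `IndexError` on `split()[-1]`:

```python
from typing import Iterable, Set


def ids_from_bed_lines(lines: Iterable[str]) -> Set[str]:
    """Collect the SNP ID (last column) from non-blank, non-comment BED lines."""
    return {
        ln.split()[-1]
        for ln in lines
        if ln.strip() and not ln.startswith("#")
    }
```

Building the sets inside the function also removes the need to pass in pre-allocated `lifted_set` and `unlifted_set` arguments, which is what this commit drops from the signature.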

@@ -91,7 +85,7 @@ def bed2map(fin: Path, fout: Path) -> bool:
f"Converting lifted [green]BED[/] [blue]{fin.name}[/] file back to [green]MAP[/] [yellow]{fout.name}[/]..."
)
bed_lines = fin.read_text().split("\n")
output = []
output = list()
with progressbar(bed_lines) as lines:
for ln in lines:
if len(x := ln.split()) == 4:
@@ -105,7 +99,7 @@ def bed2map(fin: Path, fout: Path) -> bool:
def liftDat(fin: Path, fout: Path, lifted_set: Set[str]) -> bool:
console.print(f"Updating [green]DAT[/] file [pink]{fin.name}[/]...")
lines = fin.read_text().split("\n")
output = []
output = list()
with progressbar(lines) as dat_lines:
for ln in dat_lines:
if len(ln) == 0 or ln[0] != "M":
@@ -133,12 +127,11 @@ def liftPed(

console.print(f"Updating [green]PED[/] file [orange]{fin.resolve()}[/]...")
lines = fin.read_text().split("\n")
output = []
output = list()
with progressbar(lines) as liftped_lines:
for ln in liftped_lines:
if ln.strip() != "":
f = ln.strip().split()
# l = len(f)
f = f[:6] + [
f[i * 2] + " " + f[i * 2 + 1] for i in range(3, len(f) // 2)
]
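Note on the list expression in `liftPed`: the first six fields of a PED line are kept as they are, and the remaining fields are joined two at a time so that each marker's pair of alleles becomes a single element. A minimal sketch of that step in isolation; `pair_alleles` is a hypothetical helper name:

```python
from typing import List


def pair_alleles(ped_line: str) -> List[str]:
    """Keep the six leading PED columns and join the rest into allele pairs."""
    f = ped_line.strip().split()
    # Fields 6 and 7 form the first marker, 8 and 9 the second, and so on.
    return f[:6] + [f[i * 2] + " " + f[i * 2 + 1] for i in range(3, len(f) // 2)]
```

With the alleles paired, element `6 + k` of the result corresponds to marker `k` of the MAP file, which makes it straightforward to drop markers that failed to lift.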
@@ -151,7 +144,6 @@ def liftPed(
a = "\t".join(f[:6])
b = "\t".join(newmarker)
output.append(f"{a}\t{b}\n")
# print marker[:10]
console.print(
f"Writing new [green]PED[/] data to [light_slate_blue]{fout.resolve()}[/]"
)
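Note on writing the PED output: the 0.1.8 changelog entry says the file is written through an open-file context manager with `writelines()` rather than `Path().write_text()`. The write itself is outside this hunk, so the following is only a sketch of that pattern under that assumption; `write_lines` is a hypothetical helper:

```python
from pathlib import Path
from typing import Iterable


def write_lines(fout: Path, lines: Iterable[str]) -> None:
    """Write pre-terminated lines to `fout` using a context manager."""
    with fout.open("w") as handle:
        # writelines() adds no separators, so each line must already end in "\n"
        handle.writelines(lines)
```

This matches the loop above, where each entry appended to `output` already ends in `\n`.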
