The phiperio package provides utilities to import, validate, and
manage PhIP-Seq datasets, including standardized conversion pipelines,
data checks, and access to cached peptide metadata.
You can install the development version of phiperio from GitHub with
either pak or devtools:
# install.packages("pak")
pak::pak("Polymerase3/phiperio")
# or, using devtools:
# install.packages("devtools")
devtools::install_github("Polymerase3/phiperio")
For guided walk-throughs, see the pkgdown vignettes:
- Importing long tidy data (convert_standard): cross-sectional and longitudinal tidy inputs.
- Importing multiple files at once: batch ingest of many CSV/Parquet files with sample_id_from_filenames.
- Importing legacy PhIP-Seq data (convert_legacy): classic wide matrices (exist/fold_change/raw counts) plus sample/timepoint metadata.
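The name sample_id_from_filenames suggests that sample identifiers can be taken from the input file names; the vignette documents the actual behavior. As a rough illustration of that idea only, the base R sketch below derives one identifier per file by dropping the directory and extension. The file paths are hypothetical, and the code does not call phiperio or reproduce its implementation.

```r
# Illustration only: base R, independent of phiperio's own implementation.
# Hypothetical input files, one per sample.
files <- c("data/sampleA.csv", "data/sampleB.parquet")

# Drop the directory and the extension to get one identifier per file.
sample_ids <- tools::file_path_sans_ext(basename(files))
```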
phiperio focuses on reliable ingest and validation of PhIP-Seq data,
so downstream analyses start from a clean, standardized base. Key
features include:
- DuckDB backend + Parquet first: uses DuckDB under the hood and
writes/reads Parquet by default as the transaction layer between the
phiperdata source and phiperio, giving fast I/O and great interoperability.
- Scales to millions of rows: lazy database pipelines and Parquet storage let you work efficiently with very large PhIP-Seq datasets.
- Import helpers for common PhIP-Seq inputs and peptide metadata
(peptide library cached and maintained in the companion
phiper repo).
- Strong validation and consistency checks to catch data issues early.
- Lightweight, reproducible pipelines to standardize raw inputs into
<phip_data> objects.
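To make the DuckDB and Parquet points above concrete, here is a minimal sketch of the general pattern: write a table to Parquet through DuckDB, then aggregate it in place so that only the summary reaches R. It uses the DBI and duckdb packages directly and does not call phiperio; the table and its column names (sample_id, peptide_id, exist) are assumptions chosen for illustration.

```r
# Sketch only: uses DBI + duckdb directly, not phiperio's own API.
library(DBI)

# Hypothetical long-format PhIP-Seq table (column names are illustrative).
phip_long <- data.frame(
  sample_id  = c("s1", "s1", "s2", "s2"),
  peptide_id = c("p1", "p2", "p1", "p2"),
  exist      = c(1L, 0L, 1L, 1L)
)

con <- dbConnect(duckdb::duckdb())  # in-memory DuckDB database
parquet_path <- tempfile(fileext = ".parquet")

# Expose the data frame to DuckDB and write it out as Parquet.
duckdb::duckdb_register(con, "phip_long", phip_long)
dbExecute(con, sprintf(
  "COPY (SELECT * FROM phip_long) TO '%s' (FORMAT PARQUET)", parquet_path
))

# Query the Parquet file in place; only the aggregated result is pulled into R.
hits <- dbGetQuery(con, sprintf(
  "SELECT sample_id, SUM(exist) AS n_hits
   FROM read_parquet('%s')
   GROUP BY sample_id
   ORDER BY sample_id", parquet_path
))

dbDisconnect(con, shutdown = TRUE)
```

Because the aggregation runs inside DuckDB, the same query shape applies to Parquet files far larger than available memory.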
Spotted a bug or want to request a feature? Please open an issue: https://github.com/Polymerase3/phiperio/issues