Skip to content


Switch branches/tags

Latest commit


Git stats


Failed to load latest commit information.
Latest commit message
Commit time

ArcPP - Archaeal Proteome Project

Modern proteomics approaches can explore whole proteomes within a single mass spectrometry (MS) run. However, the enormous amount of MS data generated often remains incompletely analyzed due to a lack of sophisticated bioinformatic tools and expertise needed from a diverse array of fields. In particular, in the field of microbiology, efforts to combine large-scale proteomic datasets have so far largely been missing. Thus, despite their relatively small genomes, the proteomes of most archaea remain incompletely characterized. This in turn undermines our ability to gain a greater understanding of archaeal cell biology.

Therefore, we have initiated the Archaeal Proteome Project (ArcPP), a community effort that works towards a comprehensive analysis of archaeal proteomes. Starting with the model archaeon Haloferax volcanii, using state-of-the-art bioinformatic tools, we have:

  • reanalyzed 26 Mio. spectra
  • optimized the analysis using parameter sweeps, multiple search engines implemented in Ursgal, and the combination of resultsthrough the combined PEP approach
  • thoroughly controlled false discovery rates for high confidence protein identifications using the picked protein FDR approach and limiting FDR to 0.5%
  • identified more than 45k peptides, corresponding to 3069 proteins (>75% of the proteome) with a median sequence coverage of 55%.
  • analyzed N-terminal protein processing, including N-terminal acetylation and signal peptide cleavage
  • performed a detailed glycoproteomic analysis, identifying >230 glycopeptides corresponding to 45 glycoproteins

Benefiting from the established bioinformatic infrastructure, we will follow up on this analysis focusing on H. volcanii proteogenomics as well as the characterization of various post-translational modifications. Furthermore, ArcPP will integrate quantitative results obtained from the individual datasets in order to identify common regulatory mechanisms. These studies on the H. volcanii proteome can serve as a blueprint for comprehensive proteomic analyses performed on a diverse range of archaea and bacteria.


Initial publication: Schulze, S.; Adams, Z.; Cerletti, M. et al. (2020). The Archaeal Proteome Project advances knowledge about archaeal cell biology through comprehensive proteomics. Nat Commun 11, 3145.

Glycoproteomic analysis using ArcPP: Schulze, S.; Pfeiffer, F.; Garcia, B.A.; Pohlschroder, M. (2021). Comprehensive glycoproteomics shines new light on the complexity and extent of glycosylation in archaea. PLOS Biol.

Step-by-step description of sample preparation and bioinformatic analysis: Schulze, S.; Pohlschroder, M. (2021). Proteomic sample preparation and data analysis in line with the Archaeal Proteome. (coming soon)


Get the latest version via GitHub:

as a .zip package:

or via git clone::

git clone

Further result files can found here:

An interactive, searchable representation of protein and peptide identifications can be found at:

How to use

For looking at the meta data, database and result files, of course no specific software is required.

However, if you want to execute the analysis scripts, Python 3.5 or higher is required as well as the Python package Ursgal. Further information on the installation and usage of Ursgal can be found here:

Furthermore, docstrings in the scripts explain the basic functions and steps in the analysis pipeline.

Questions and Participation

If you encounter any problems you can open up issues at GitHub, or contact us directly by email. For any contributions, fork us at and open up pull requests.


Version 1.3.0 (06-2021)

  • Full integration of PXD021827

Version 1.2.0 (05-2021)

  • Integration of PXD021827 (meta data)
  • Integration of PXD021874 (meta data + results)
  • Including glycoproteomic analysis scripts
  • Including as a generalized example analysis script

Version 1.1.0 (05-2020)

  • Including analysis of Natrialba magadii as part of PXD009116

Version 1.0.0 (05-2020)

  • Initial ArcPP release


Copyright 2019-today by authors and contributors in alphabetical order

  • Zach Adams
  • Micaela Cerletti
  • Rosana De Castro
  • Sébastien Ferreira-Cerca
  • Christian Fufezan
  • Benjamin A. Garcia
  • María Inés Giménez
  • Michael Hippler
  • Zivojin Jevtic
  • Robert Knüppel
  • Georgio Legerme
  • Christof Lenz
  • Anita Marchfelder
  • Julie Maupin-Furlow
  • Roberto A. Paggi
  • Friedhelm Pfeiffer
  • Ansgar Poetsch
  • Mechthild Pohlschroder
  • Stefan Schulze
  • Henning Urlaub


Dr. Stefan Schulze Department of Biology, University of Pennsylvania 201 Leidy Laboratories, 3740 Hamilton Walk, Philadelphia, PA 19104 email:


While all analyses for ArcPP have been performed with greatest care, errors in the underlying software, raw datasets or database cannot be ruled out completely. Therefore, the source code used for this project is available here and should be inspected if in doubt, or used to verify results. As common practice in science: be critical and never trust results blindly.


The content of this project itself is licensed under the Creative Commons Attribution 3.0 Unported license, and the underlying source code used to format and display that content is licensed under the GNU Lesser General Public License v3.0.


The Archaeal Proteome project - advancing knowledge about archaeal cell biology through comprehensive proteomics







No packages published