A modern, PHP 8.4+ library for creating, manipulating and validating BagIt bags – the standard packaging format for digital preservation in a Private LOCKSS Network (PLN).
🔐 Ensure integrity · 📋 Generate manifests · 🌐 Fetch remote content · ✅ Validate bags
BagIt PHP is a PHP implementation of the BagIt 0.96 specification, a draft of the format later published as RFC 8493 (BagIt 1.0). A "bag" is a self‑describing, integrity‑checked container that bundles digital payload files together with their metadata, which makes it well suited to archival storage, network transfer, and long‑term digital preservation.
This Wizdam Edition is a complete modernization of the original scholarslab/BagItPHP library, fully refactored for PHP 8.4+ compatibility, PSR‑4 autoloading, and seamless integration with Composer. It is purpose‑built to support the LOCKSS PLN (Private LOCKSS Network) ecosystem, where integrity‑verified content packages are critical for distributed preservation workflows.
| 🔧 Feature | 📝 Description |
|---|---|
| Bag Compilation | Create new bags from scratch with a single constructor call |
| Manifest Generation | Automatically generate manifest-{sha1,md5}.txt and tagmanifest-{sha1,md5}.txt |
| Tag File Handling | Read and write bagit.txt, bag-info.txt and custom tag files |
| Remote Fetching | Download files listed in fetch.txt via HTTP (optional) |
| Validation | Full integrity checking – checksum verification, required file/directory presence |
| Compression | Export bags as .tgz (gzip) or .zip archives |
| Dual Hash Support | Choose between SHA‑1 and MD5 hashing algorithms on the fly |
| Extended Bag Mode | Optional creation of bag-info.txt, fetch.txt, and tag manifests |
| PHP 8.4+ Native | Fully compatible with modern PHP – type‑safe, deprecation‑free |
| PSR‑4 Autoloading | PSR‑4 namespaced under Wizdam\BagIt for immediate Composer autoloading |
Install with Composer:

    composer require wizdam/bagit-php

Alternatively, clone the repository and include the autoloader:

    git clone https://github.com/mokesano/bagit-php.git
    cd bagit-php
    composer install

Requirements:
- PHP ≥ 8.4
- PEAR Archive_Tar ≥ 1.4 (automatically pulled by Composer)
Creating a basic bag:

    <?php
    require_once 'vendor/autoload.php';

    use Wizdam\BagIt\BagIt;

    // Create a fresh bag in the "my-bag" directory
    $bag = new BagIt('my-bag');

    // Add a file to the payload
    $bag->addFile('/path/to/source/document.pdf', 'document.pdf');

    // Update checksums and manifests
    $bag->update();

    // Export as a gzipped tarball
    $bag->package('my-bag'); // creates my-bag.tgz

    echo "✅ Bag created successfully!\n";

Creating an extended bag with metadata and remote fetching:

    <?php
    require_once 'vendor/autoload.php';

    use Wizdam\BagIt\BagIt;

    // Metadata to store in bag-info.txt
    $bagInfo = [
        'Source-Organization'  => 'University of Virginia Library',
        'Bagging-Date'         => date('Y-m-d'),
        'External-Description' => 'Preservation copy of a scholarly article',
    ];

    // Create an extended bag with metadata and remote fetching enabled
    $bag = new BagIt(
        bag: 'extended-bag',
        validate: true,   // run checksum validation after creation
        extended: true,   // generate optional tag files
        fetch: true,      // download files listed in fetch.txt
        bagInfoData: $bagInfo,
    );

    // Add files and register remote URLs
    $bag->addFile('localfile.xml', 'data/localfile.xml');
    $bag->fetch->add('https://example.org/remote-file.pdf', 'data/remote-file.pdf');

    // Add more metadata
    $bag->setBagInfoData('Internal-Sender-Identifier', 'archive-2025-001');

    // Finalize the bag
    $bag->update();
    $bag->package('extended-bag', 'zip'); // creates extended-bag.zip

Opening and validating an existing bag:

    <?php
    require_once 'vendor/autoload.php';

    use Wizdam\BagIt\BagIt;

    // Open and validate an existing bag
    $bag = new BagIt('path/to/existing-bag.tgz');

    // Check validity
    if ($bag->isValid()) {
        echo "✅ Bag is valid!\n";

        // List all payload files
        foreach ($bag->getBagContents() as $file) {
            echo "  📄 $file\n";
        }

        // Retrieve remote files if needed
        $bag->fetch->download();
    } else {
        echo "❌ Validation errors found:\n";
        foreach ($bag->getBagErrors() as $error) {
            echo "  ⚠️ {$error['file']}: {$error['message']}\n";
        }
    }

The library is organized into four source files, each providing a distinct capability:
| 📁 File | 🏷️ Class / Functions | 📋 Responsibility |
|---|---|---|
| src/bagit.php | BagIt, BagItException | Core bag logic – creation, file management, validation, compression |
| src/bagit_manifest.php | BagItManifest | Reading, writing, and validating manifest checksum files |
| src/bagit_fetch.php | BagItFetch | Managing fetch.txt entries and downloading remote payload files |
| src/bagit_utils.php | rls(), rrmdir(), tmpdir(), endsWith(), filterArrayMatches() | Filesystem and string utility functions |
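To make the fetch.txt handling mentioned for src/bagit_fetch.php more concrete, here is a minimal standalone sketch of parsing fetch.txt content. It assumes the RFC 8493 line format (URL, length, file path, separated by whitespace, with `-` allowed for an unknown length). The function name `parseFetchFile` and the returned array shape are hypothetical and are not the BagItFetch API.

```php
<?php
declare(strict_types=1);

/**
 * Parse the text of a fetch.txt file into structured entries.
 * Hypothetical helper for illustration only; not part of the library.
 *
 * @return array<int, array{url: string, length: ?int, filename: string}>
 */
function parseFetchFile(string $contents): array
{
    $entries = [];
    foreach (preg_split('/\R/', $contents) as $line) {
        $line = trim($line);
        if ($line === '') {
            continue; // ignore blank lines
        }
        // Each line is "URL LENGTH FILEPATH"; LENGTH may be "-" when unknown.
        $parts = preg_split('/\s+/', $line, 3);
        if (count($parts) !== 3) {
            continue; // skip malformed lines in this sketch
        }
        [$url, $length, $filename] = $parts;
        $entries[] = [
            'url'      => $url,
            'length'   => $length === '-' ? null : (int) $length,
            'filename' => $filename,
        ];
    }
    return $entries;
}

// Example input with one known and one unknown length.
$sample = "https://example.org/remote-file.pdf 1024 data/remote-file.pdf\n"
        . "https://example.org/other.xml - data/other.xml\n";
$entries = parseFetchFile($sample);
```

A real implementation would likely report malformed lines as validation errors rather than skipping them silently.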
| Method | Description |
|---|---|
| __construct($bag, $validate, $extended, $fetch, $bagInfoData) | Initialize a bag – create new or open existing |
| addFile($src, $dest) | Copy a file into the bag's data/ directory |
| update() | Regenerate all manifests and sanitize file names |
| package($destination, $method) | Compress to .tgz or .zip archive |
| validate() | Run full integrity checks (returns error list) |
| isValid() | Returns true if no validation errors exist |
| getBagContents() | List all files in the data directory |
| getBagErrors() | Return array of validation errors |
| getBagInfo() | Retrieve bag version, encoding, and hash algorithm |
| setHashEncoding($algo) | Switch between 'sha1' and 'md5' |
| setBagInfoData($key, $value) | Add or update metadata key–value pairs |
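The checksum part of validation can be sketched independently of the library. The example below assumes a manifest whose lines have the form `<checksum> <path relative to the bag root>` and compares each entry with the file on disk. The function `verifyManifest` is a hypothetical helper, and the `file`/`message` error keys are chosen only to match the usage example in this README; the library's actual checks may differ.

```php
<?php
declare(strict_types=1);

/**
 * Compare every manifest entry against the corresponding file on disk.
 * Hypothetical helper for illustration only; not part of the library.
 *
 * @return array<int, array{file: string, message: string}>
 */
function verifyManifest(string $bagDir, string $manifestFile, string $algo = 'sha1'): array
{
    $manifestPath = $bagDir . '/' . $manifestFile;
    if (!is_file($manifestPath)) {
        return [['file' => $manifestFile, 'message' => 'Manifest file is missing']];
    }

    $errors = [];
    $lines = file($manifestPath, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    foreach ($lines as $line) {
        // Split into checksum and path; the path itself may contain spaces.
        $parts = preg_split('/\s+/', trim($line), 2);
        if (count($parts) !== 2) {
            $errors[] = ['file' => $manifestFile, 'message' => 'Malformed manifest line'];
            continue;
        }
        [$expected, $path] = $parts;
        $full = $bagDir . '/' . $path;
        if (!is_file($full)) {
            $errors[] = ['file' => $path, 'message' => 'Listed in manifest but missing from bag'];
            continue;
        }
        if (!hash_equals(strtolower($expected), hash_file($algo, $full))) {
            $errors[] = ['file' => $path, 'message' => 'Checksum mismatch'];
        }
    }
    return $errors;
}

// Build a tiny throwaway bag in the system temp directory.
$bagDir = sys_get_temp_dir() . '/bagit-verify-' . uniqid();
mkdir($bagDir . '/data', 0777, true);
file_put_contents($bagDir . '/data/a.txt', 'hello');
file_put_contents($bagDir . '/manifest-sha1.txt', sha1('hello') . " data/a.txt\n");

$clean = verifyManifest($bagDir, 'manifest-sha1.txt');    // file matches its checksum

file_put_contents($bagDir . '/data/a.txt', 'tampered');   // change the payload
$tampered = verifyManifest($bagDir, 'manifest-sha1.txt'); // mismatch is reported
```

A complete validator would also check for payload files that are absent from the manifest and for the required bagit.txt file and data/ directory.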
A BagIt "bag" is a hierarchical file layout:
    my-bag/
    ├── bagit.txt               # BagIt version and encoding
    ├── bag-info.txt            # Metadata (extended bags)
    ├── manifest-sha1.txt       # Checksums of payload files
    ├── tagmanifest-sha1.txt    # Checksums of tag files
    ├── fetch.txt               # Remote file URLs (optional)
    └── data/                   # Payload – the actual content
        ├── document.pdf
        ├── image.jpg
        └── subdir/
            └── another-file.xml
The specification ensures that each bag carries its own integrity verification data, making it self‑contained and suitable for archiving, network exchange, and long‑term preservation.
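As an illustration of how that integrity data is produced, the following standalone sketch walks a bag's data/ directory and builds the lines of a payload manifest. It is a sketch under stated assumptions, not the library's implementation: `buildPayloadManifest` is a hypothetical name, and sorting entries by path is a convenience choice here rather than a requirement.

```php
<?php
declare(strict_types=1);

/**
 * Build payload manifest lines ("<checksum> data/<path>") for a bag directory.
 * Hypothetical helper for illustration only; not part of the library.
 *
 * @return string[]
 */
function buildPayloadManifest(string $bagDir, string $algo = 'sha1'): array
{
    $dataDir = $bagDir . '/data';
    $checksums = [];

    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dataDir, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        if (!$file->isFile()) {
            continue;
        }
        // Manifest paths are relative to the bag root and use forward slashes.
        $relative = substr($file->getPathname(), strlen($dataDir) + 1);
        $relative = 'data/' . str_replace('\\', '/', $relative);
        $checksums[$relative] = hash_file($algo, $file->getPathname());
    }

    ksort($checksums); // stable, path-ordered output

    $lines = [];
    foreach ($checksums as $path => $checksum) {
        $lines[] = $checksum . ' ' . $path;
    }
    return $lines;
}

// Build a tiny throwaway bag in the system temp directory.
$bagDir = sys_get_temp_dir() . '/bagit-manifest-' . uniqid();
mkdir($bagDir . '/data/subdir', 0777, true);
file_put_contents($bagDir . '/data/document.txt', 'hello');
file_put_contents($bagDir . '/data/subdir/another-file.xml', '<a/>');

$lines = buildPayloadManifest($bagDir);
file_put_contents($bagDir . '/manifest-sha1.txt', implode("\n", $lines) . "\n");
```

Passing `'md5'` as the second argument would produce the lines for manifest-md5.txt instead.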
📘 Learn more: RFC 8493 – The BagIt File Packaging Format
We welcome contributions! Please review our Contributing Guidelines before submitting a pull request.
Coding Standards:
- PHP code must follow PSR‑1 guidelines
- JavaScript follows Crockford's conventions
- All new features require updated unit tests and documentation
This project adheres to the Contributor Covenant Code of Conduct. By participating, you agree to uphold these standards.
Security is taken seriously. Please do not publicly disclose any vulnerabilities.
- Reporting: Send vulnerability reports to security@sangia.org
- Acknowledgment: The lead maintainer will respond within 48 hours
- Advisories: Published at GitHub Security Advisories
Full details are in our Security Policy.
This project is distributed under the GNU General Public License v3.0 (GPL‑3.0‑only). See LICENSE for the full text.
Note: The original BagItPHP library by the University of Virginia / Scholars' Lab was released under Apache License 2.0. This fork retains attribution headers in source files and is relicensed under GPL‑3.0‑only for the Wizdam Edition.
| 🏷️ Attribution | 🔗 Reference |
|---|---|
| Original Authors | Wayne Graham, Eric Rochester – University of Virginia Scholars' Lab |
| Original Repository | scholarslab/BagItPHP (no longer maintained) |
| Wizdam Edition Maintainer | Rochmady (mokesano) |
| BagIt Specification | RFC 8493 – J. Kunze, J. Littman, E. Madden, J. Scancella |
| LOCKSS Program | lockss.org – Stanford University |
| Dependency | PEAR Archive_Tar |
Made with ❤️ for the digital preservation community
© 2026 Rochmady. Licensed under GPL‑3.0‑only.