The BitCurator Access project developed tools to help libraries, archives, and museums provide web-based and local access to born-digital materials held on disk images. BitCurator Access tools simplify access to raw and forensically packaged disk images, allowing users to incorporate these objects into access environments while preserving original order and relevant environmental context. Built on open source digital forensics software libraries, these tools enable detailed analysis of file and file system provenance, the quality and accessibility of files, metadata in files and the file system, and residual or hidden data.
BitCurator Access focused on four areas of interest related to accessing born-digital collections:
- Web-based access to raw and forensically packaged disk images
- Redaction of file items, metadata and hidden data from disk images
- OS and executable virtualization for legacy disk images
- Transforming and using digital forensics metadata in collecting environments
The bitcurator-access-webtools project is a Flask application that allows users to browse file systems in raw and forensically packaged disk images within a web browser. The application can parse raw and E01-packaged images containing FAT16, FAT32, NTFS, HFS+, and EXT 2/3/4 file systems, and allows users to navigate the file system contents, download individual files, and search the contents within a simple web interface.
For more information on the design of the application, along with instructions on how to obtain and build the software, see the BitCurator Access Webtools page. Visit our Screencast Tutorials for walkthroughs of past releases.
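As a rough illustration of what distinguishing the containers and file systems listed above involves, the following sketch checks well-known on-disk signatures. It is not taken from the webtools code: the function names `is_e01` and `guess_filesystem` are hypothetical, and the sketch assumes a single, partitionless volume whose first 2 KiB have already been read from a raw image.

```python
# Hypothetical helpers, not part of bitcurator-access-webtools.
# Assumption: `volume` holds the first 2 KiB of a raw, partitionless volume.

EWF_SIGNATURE = b"EVF\x09\x0d\x0a\xff\x00"  # first 8 bytes of an E01 segment file


def is_e01(header: bytes) -> bool:
    """Return True if the bytes start with the EWF (E01) signature."""
    return header.startswith(EWF_SIGNATURE)


def guess_filesystem(volume: bytes) -> str:
    """Guess the file system from common signature offsets."""
    # NTFS stores its OEM ID at offset 3 of the boot sector.
    if volume[3:11] == b"NTFS    ":
        return "NTFS"
    # FAT type strings are informational, but they are a common heuristic.
    if volume[82:90] == b"FAT32   ":
        return "FAT32"
    if volume[54:62] == b"FAT16   ":
        return "FAT16"
    # The HFS+ volume header starts at offset 1024 with the signature "H+".
    if volume[1024:1026] == b"H+":
        return "HFS+"
    # The ext superblock starts at 1024; its magic 0xEF53 sits 56 bytes in,
    # stored little-endian.
    if volume[1080:1082] == b"\x53\xef":
        return "EXT 2/3/4"
    return "unknown"
```

Signature checks like these only identify a candidate format. Walking directories, downloading files, and handling partition tables or E01 decompression would normally be left to a digital forensics library rather than hand-written parsing.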
The BitCurator Access Redaction project builds on existing disk image redaction and Digital Forensics XML tools to provide collecting institutions with software to redact strings and byte sequences identified in disk images. The software also includes a Python API that institutions can use to build custom redaction workflows on top of tools such as lightgrep.
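To make the idea of redacting a byte sequence concrete, here is a minimal sketch that overwrites every occurrence of a pattern in a raw image with a fill byte, leaving the file size and all other offsets unchanged. It is not the project's Python API and does not use lightgrep or Digital Forensics XML: the function name `redact_bytes` and its parameters are hypothetical. It applies only to raw images, because E01 containers are compressed and checksummed, and it should be run on a working copy.

```python
import mmap
import os


def redact_bytes(image_path: str, pattern: bytes, fill: bytes = b"\x00") -> int:
    """Overwrite each occurrence of `pattern` in place; return the match count.

    Hypothetical helper for illustration only. Operates directly on the
    file, so run it against a copy of the raw image.
    """
    if not pattern:
        raise ValueError("pattern must not be empty")
    if len(fill) != 1:
        raise ValueError("fill must be exactly one byte")
    if os.path.getsize(image_path) == 0:
        return 0  # an empty file cannot be memory-mapped

    count = 0
    with open(image_path, "r+b") as f, mmap.mmap(f.fileno(), 0) as mm:
        pos = mm.find(pattern)
        while pos != -1:
            end = pos + len(pattern)
            # Same-length replacement keeps every other offset intact.
            mm[pos:end] = fill * len(pattern)
            count += 1
            # Resume after the redacted span so the loop always advances.
            pos = mm.find(pattern, end)
        mm.flush()
    return count
```

A call such as `redact_bytes("copy-of-image.raw", b"555-0100")` would return the number of spans overwritten. Same-length overwriting is chosen here so that file system structures elsewhere in the image still point to valid offsets.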
Virtualized Access Environments
Born-digital materials that contain executable content or bootable operating systems often require virtualized hardware support to remain accessible. Between 2014 and 2016, the BitCurator Access project explored a range of methods to provision virtual machines providing access to disk images extracted from legacy media, including the use of the bwFLA Emulation-as-a-Service platform developed at the University of Freiburg.
You can find a more detailed explanation of the intended use cases and related technologies in the paper Functional Access to Forensic Disk Images in a Web Service (also in the Proceedings of iPres 2015).
This wiki, documentation, and other materials generated by the BitCurator team are licensed under Creative Commons Attribution 4.0 International (CC BY 4.0). All other software included in the BitCurator environment is distributed in accordance with original licenses. See our GitHub repositories for licenses associated with specific projects.
Development, Funding, and Partners
Grants from the Andrew W. Mellon Foundation supported the BitCurator project (a partnership between the School of Information and Library Science at the University of North Carolina at Chapel Hill and the Maryland Institute for Technology in the Humanities) through September 2014, and the BitCurator Access project through September 2016. A grant from the Andrew W. Mellon Foundation currently supports the BitCurator NLP project (2016-2018).