This repository has been archived by the owner. It is now read-only.
DataRefuge workflow for DataRescue events

DataRescue Workflow

This guide describes the workflow used for DataRescue activities, as developed by the DataRefuge project and EDGI. It applies both to in-person events and to people working remotely. It explains the process that a URL or dataset goes through from the time it is identified, whether by a Seeder flagging it as "uncrawlable" or by other means, until it is made available as a record in the CKAN data catalog. The process has several stages and is designed for smooth hand-offs, so that each phase is handled by someone with expertise in that area, while the data is tracked throughout for security.

Are you looking for the actual documentation?

We have moved the documentation to a more user-friendly format. You can now find the guide at

Note that the guide is still a work in progress; screenshots and other additions will follow shortly.

Contributing to this guide

Suggestions and improvements are welcome! All changes to the guide are managed through this GitHub repository. Please check our contribution guidelines for details.


DataRescue is a broad, grassroots effort with support from numerous local and nationwide networks. DataRefuge and EDGI partner with local organizers in supporting these events. See more of our institutional partners on the DataRefuge home page.
