Grateful Data isn't programming code, but an online tutorial about data acquisition, cleaning and enriching, using publicly accessible data on the band the Grateful Dead as examples. Read the Wiki to find out how to use the sample data.
ASCAP-Data-Extended.google-refine.tar.gz
IA-Data.xlsx
IMDB-titles.txt
LICENSE
README.md
dead-dates-skos.rdf


Grateful Data

Welcome to Grateful Data. Grateful Data isn't programming code (per se), but an online tutorial about data acquisition, cleaning and enriching, using publicly accessible data on the band the Grateful Dead as examples.

Getting Started

To get started, you will need a program called OpenRefine. If you don't already have it, I recommend downloading and installing version 2.8, which was released in 2017 and is currently the stable version.

OpenRefine's website, with download links and installation instructions, is here.

Once OpenRefine is installed on your machine, click on DOWNLOAD ZIP to save the practice/sample data locally. Then click the Wiki link at the top of the page to jump right in.
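If you want to confirm that the download is complete before opening anything in OpenRefine, a quick check along these lines may help. This is only a minimal sketch: the file names come from the repository's file listing, while the function name `missing_sample_files` and the archive name in the usage note are my own assumptions, not part of the tutorial.

```python
import zipfile
from pathlib import Path

# Sample data files as they appear in the repository listing.
EXPECTED_FILES = {
    "ASCAP-Data-Extended.google-refine.tar.gz",
    "IA-Data.xlsx",
    "IMDB-titles.txt",
    "dead-dates-skos.rdf",
}


def missing_sample_files(zip_path):
    """Return a sorted list of expected sample files absent from the ZIP.

    Hypothetical helper for illustration only.
    """
    with zipfile.ZipFile(zip_path) as archive:
        # Downloaded archives usually nest files under a top-level folder,
        # so compare base names rather than full paths.
        present = {Path(name).name for name in archive.namelist()}
    return sorted(EXPECTED_FILES - present)
```

Calling `missing_sample_files("Grateful-Data-master.zip")` (the archive name is assumed; use whatever your browser saved) returns an empty list when every sample file is present.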

A Note About Licensing

The instructions in this tutorial are licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0. More information is available here, but the short version is to treat it like a tape trader would: share and share alike, but not for commercial purposes, and always with appropriate credit. If you want to build on what I've done, fantastic, but only under the same license.

A Note About Data Licenses

With the growth of publicly available datasets, the research data community (including Harvard’s Dataverse, Figshare, and Dryad) recommends CC0, or "No Rights Reserved", for data and database licensing, especially for scientific data. However, not every dataset is released under CC0, so always check the terms of use for any data you acquire.

Grateful Data uses sample data culled from the Internet Archive and the American Society of Composers, Authors and Publishers (ASCAP) ACE Repertory Search. The Internet Archive's terms of use clearly state that access to their data is granted for scholarship and research purposes; however, ASCAP's terms of service state:

ASCAP owns all rights, title and interest in and to its ACE Repertory Search, and any copyrights, database rights and/or other intellectual property and/or proprietary rights therein. Information from ACE may be downloaded, reproduced and/or used solely for evaluation purposes (either for your own benefit or the benefit of a third party for whom you are providing legal, accounting or other professional services) and may not be sold, offered for sale, marketed, promoted, advertised, commercialized or used as the basis to create derivative works, directly or indirectly, in any manner. By accessing ACE, you acknowledge and agree that you shall comply, and take steps to ensure that any third party for whom you are providing legal, accounting or other professional services in connection with information obtained from ACE agrees to comply, with the above ACE Terms of Use and any additional terms and conditions that we provide to you in connection with ACE and other products and services we may offer or make available to you.

Because of this, I will provide instructions on how to acquire ASCAP's data, but will not make the data itself available in this repository.

Info, Questions and Comments

I am Scott Carlson, the Metadata Coordinator of Fondren Library at Rice University. I received my MLIS from Dominican University (River Forest, IL) and an Archives Certificate in Digital Stewardship from Simmons College (Boston, MA). I am also the co-founder of Indie Preserves, a website that provides practical preservation advice to independent music labels and bands. My favorite show is 1977-05-22, The Sportatorium, Pembroke Pines, Florida.

Email me at: sjc5@rice.edu