- A-1. Curator of the dataset (the person or organization who originally authored, created, or gathered the data)
- A-2. Contact person for questions about the dataset
- A-2a. Name
- A-2b. Email
- A-3. How should the dataset be cited
- A-4. What papers/products have been produced based on the dataset (if none, enter "none yet")
- C-1. Name
- C-2. URL
- C-3. Size of community
- C-4. Description of community
- E-1. Does the dataset include human subjects data, broadly defined?
- E-2. Context of collection (e.g., did human subjects know the data was being collected for research?)
- E-3. Informed consent collected [Y, N]
- E-3a. How?
- E-3b. For what data uses?
- E-4. De-identification steps taken
- E-5. Contains sensitive data [Y, N]
- DS-1. “Original” use of the dataset (e.g., research questions addressed, unit(s) of analysis)
- DS-2. URL of dataset
- DS-3. Format (of the files themselves)
- DS-4. Size of the files (order of magnitude, e.g., MB, GB, TB)
- DS-5. N of X (count for each type of item included, e.g., users, tweets, chat entries)
- DS-6. Processing
- DS-6a. “Raw” from source
- DS-6b. Processed by someone
- DS-6c. Name of processor
- DS-6d. Link to processing script or workflow description
- A-5. License under which the data can be used
- DS-7. Version of the dataset
- DS-8. How the data was originally collected (e.g., which Twitter API was used)
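
For anyone who wants to keep these answers in machine-readable form, here is a minimal sketch of how a subset of the checklist might be stored as a Python record and checked for unanswered items. The class name `DatasetRecord`, the attribute names, and the choice of which items to include are all assumptions made for illustration; the checklist itself does not prescribe any storage format.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class DatasetRecord:
    """Hypothetical record covering a subset of the checklist items."""
    curator: Optional[str] = None            # A-1
    contact_name: Optional[str] = None       # A-2a
    contact_email: Optional[str] = None      # A-2b
    citation: Optional[str] = None           # A-3
    license: Optional[str] = None            # A-5
    human_subjects: Optional[bool] = None    # E-1
    informed_consent: Optional[bool] = None  # E-3
    sensitive_data: Optional[bool] = None    # E-5
    url: Optional[str] = None                # DS-2
    file_format: Optional[str] = None        # DS-3
    version: Optional[str] = None            # DS-7


def missing_fields(record):
    """Return the names of items still unanswered (None or a blank string)."""
    missing = []
    for f in fields(record):
        value = getattr(record, f.name)
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(f.name)
    return missing


# Example values are placeholders, not real dataset metadata.
record = DatasetRecord(
    curator="Example Lab",
    contact_name="A. Curator",
    human_subjects=False,
)
print(missing_fields(record))
```

Note that a Y/N item answered with `False` counts as answered; only `None` or a blank string is reported as missing.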