Data Quality Procedure

Charlie Hagedorn edited this page Apr 8, 2020 · 99 revisions

Read the entire document first if you are new to the way our data process works. Do not enter/change/etc data in any Google Sheet or Form before reading this wiki.


Data Flow

The following explains how data entered in the Google Form reaches the site.

  1. Entries from the Donation Site Google Form appear in the Form-Responses 1 sub-sheet. This sub-sheet collects all the raw data. A team of data moderators edits the sheet to make the entries "Publication Ready."
  2. Entries with an "x" placed in the "Approved" cell are added to the live website by a moderator.

⚠️ Important! DO NOT delete an entry or row from the spreadsheet. ⚠️


Overview of Moderator Roles

We have four moderator roles. Many of us switch between these roles as needed.

  • Quick-Moderators manage the approval of new locations that need supplies onto the live site. We watch the incoming recipient list and list most of them onto the website immediately. Instructions for quick-moderators

  • Data-Moderators comb through the Sheet carefully, verifying the quality of entries, checking the entries for clarity, cleaning up formatting, and ensuring that only qualified recipients are posted. Instructions for data-moderators

  • Update/Removal-Moderators respond quickly to requests for updates of listed information or to remove sites that no longer want donated PPE.

  • De-duplication Moderators remove duplicate and conflicting entries in our records. Instructions for de-dup moderators

Your job as a Quick-moderator

This is a fun and easy role in which we start new moderators. You help hospitals and other recipients get listed for donations immediately. The goal is to get most entries up quickly.

  1. New Recipients appear at the bottom of the Form Responses Google Sheet-tab.

  2. When they appear, the "Final Address" field will be blank. A script that runs once per minute will fill it in for you. It is important to wait for this to be filled in -- consistent computer-generated addresses help us to identify duplicates quickly.

  3. Compare the "Final Address" field with the "Drop off address", "City", and "State".

  • If they look functionally the same, then the Final Address is correct. If the "Street Address for Dropoff" field contains information that has been stripped from the Final Address ("Building B", "Attn: 7th floor", "Attn:Sherlock Holmes", etc.) add it to the "Drop-off instructions" field in a way that makes sense.

  • If the "Final Address" is not a complete address (missing street, city, or state in US), either fix the address or, if insufficient info was provided by the submitter, do not approve.

  • If you notice any typos in the "Street Address for Dropoff" field, you can fix them up, then delete the "Final Address" field. (It will repopulate automatically.)

  4. Put an F (for "Fast") into the "Mod Status" column (Column D).

  5. If the Recipient looks like a clear candidate for PPE (e.g. hospitals, medical centers, EMS, doctor's offices), place an "x" in the "Approved" column (Column A). If you're not sure whether the Recipient is a candidate, do not approve them. We want to get obvious recipients up fast; other moderators will handle ambiguous Recipients.

  6. Publish the entry by loading this URL: https://us-central1-findthemasks.cloudfunctions.net/reloadsheetdata

  7. You're done. The entry will go live at findthemasks.com within 5 minutes.
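
The quick-moderator checklist above can be read as a small decision function. This is only a sketch: the work is done by hand in the Google Sheet, and the function name, the dictionary keys used for the row, and the `CLEAR_CANDIDATE_HINTS` keyword list are hypothetical stand-ins rather than part of any project script.

```python
# Hypothetical sketch of the quick-moderator checklist. Moderators do this by
# hand in the Google Sheet; the field names and keywords here are assumptions.

# Illustrative keywords for "clear candidates"; not an official list.
CLEAR_CANDIDATE_HINTS = ("hospital", "medical center", "doctor")


def quick_moderate(row):
    """Return the cell updates a quick-moderator would make for one row."""
    final_address = row.get("Final Address", "").strip()
    if not final_address:
        # The address script has not filled the field yet: wait, change nothing.
        return {}

    updates = {"Mod Status": "F"}  # "F" for "Fast"
    name = row.get("Name", "").lower()
    if any(hint in name for hint in CLEAR_CANDIDATE_HINTS):
        updates["Approved"] = "x"
    # Anything ambiguous stays unapproved for other moderators to review.
    return updates
```

Keyword matching is far cruder than a moderator's judgement, and the sketch does not check whether the Final Address is complete; it only illustrates the order of the checks.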

Thanks to increasing automation, this role can be fun and easy. Your job will be to type "F" and "x" to approve Recipients and wait for new Recipients to submit their information. You're helping get donors information as fast as possible, which is vital during this global pandemic.

If you have extra time, feel free to work upward from the bottom and correct simple capitalization/spelling errors in the Name, City, and Drop-off instructions fields. Don't worry about duplicate entries -- process them like any other entry. De-duplication moderators will reconcile them later.

Please do keep an eye on #data Slack, to see if we need to stop publishing (sometimes the developers ask us to pause) and to watch/learn-from the more-experienced moderators. Join the Slack here.


Your job as a Data-moderator

Data-moderators ensure data in each entry is "Publication Ready", which means it is compliant with our schema format and is of high-quality. If you think a Recipient should be listed, approve it with an "x" in the approval column (Column A). If not, leave that field blank.

  1. Verify: Look over entries which have been submitted.
  • Does the name seem complete?
  • Is the address correct? Does it point to the right place? Be skeptical. Adjust it if it needs adjusting. If you're not sure about the entry, Google the name and address to try to understand a little bit about the Recipient entry. There is no need to correct any capitalization in the "Street" or "City" fields.
  • Clean up the formatting of the Drop-off instructions: fix capitalization and punctuation. Feel free to add structure and line-breaks (Ctrl+Enter) to the Drop-off instructions. If instructions are vague, make them clear to the extent that you can. If you are unsure, don't change an instruction.
  • Link: If the entry includes a URL, format it as a link with descriptive text.
  2. Approve: This is the most-important job of the Data-Moderator.
  • If the Recipient qualifies, which will be the vast majority of Recipients, place an "x" in the Approved column (Column A). If not, remove any "x" that may be present and add a letter-code to column B ("Reason not published"). Ask in #data Slack if you have questions -- we are happy to help.
  3. Update "Mod Status" field: add an "M" to column D.

  4. Reload the Publish URL (see bottom of this document) to publish it immediately if you wish.

Have questions as a data-moderator? Ask them in the Slack #data channel. Join the Slack here.


Recent changes for data-moderators

  • As of 3/24, each entry will indicate if it is a request to add a new donation site, update an existing donation site, or remove an existing donation site. Sections below outline procedures for each.

Adding a new donation site

If Column I in the Form-Responses 1 sub-sheet says "Add a new donation site", then:

  1. Review data in the entry and make it "Publication-Ready", which means it's compliant with the schema format and is high-quality before it is published to the site. See Entry Formatting section below for details.

  2. Approve entry to go-live. If no duplicate entry is found & the entry is "Publication-Ready", then add an "x" to the "Approved" column on that sheet.

  • If you decide that an entry should not be published for some reason, add a letter to the "Why not published" column: "D" for dupe, "N" for not-approved organization, or "G" for garbage if the data is bad. Don't worry too much about which letter to use if it's a gray area. If unsure, add a "?" and a comment so that others can take a look.
  3. Push new entries to the website. When the entry is ready to go, load this URL: https://us-central1-findthemasks.cloudfunctions.net/reloadsheetdata. The new entry will go live within 5 minutes.
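
The approve-or-explain decision above can be written down as a lookup table plus one helper. This is a sketch for illustration: the codes come from this section, but `decide` and the tuple it returns are hypothetical and not part of the project's tooling.

```python
# Letter codes for the "Why not published" column, as listed in this section.
WHY_NOT_PUBLISHED = {
    "D": "duplicate",
    "N": "not-approved organization",
    "G": "garbage (bad data)",
    "?": "unsure; add a comment so others can take a look",
}


def decide(publish, reason=None):
    """Return (approved_cell, why_not_published_cell) for one entry.

    Hypothetical helper: an approved entry gets an "x" and no reason code;
    an unapproved entry gets a blank "Approved" cell and one of the codes.
    """
    if publish:
        return ("x", "")
    if reason not in WHY_NOT_PUBLISHED:
        raise ValueError(f"unknown reason code: {reason!r}")
    return ("", reason)
```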

Your job as a De-duplicator Moderator

These instructions were last updated on March 28, 10 a.m. PDT

Using the Dupe Locations tab and filter view

If 2+ entries have an exact-location match AND they are approved, then they will appear at the top of the Dupe Locations Google Sheet tab. Note that due to the wackiness of filter views, only one person can act as de-dupe moderator at a time.
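
As a rough illustration of what the Dupe Locations tab reports, the count can be sketched in a few lines. The assumptions here are that each row is a dictionary keyed by column header and that "exact-location match" means identical "Final Address" strings; the real tab is a Google Sheet, not this function.

```python
from collections import Counter


def dupe_locations(rows):
    """Return {final_address: count} for approved addresses seen 2+ times."""
    counts = Counter(
        row["Final Address"]
        for row in rows
        # Only approved rows with a filled-in address take part in the count.
        if row.get("Approved", "").strip().lower() == "x" and row.get("Final Address")
    )
    return {address: n for address, n in counts.items() if n >= 2}
```

Because the comparison is exact, "Avenue" and "Ave" count as different addresses, which is why the manual search for similar addresses in the steps below is still needed.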

  1. Go to the Dupe locations tab. If any row has a count = 2, note the first part of that address, e.g. "580 West 8th Street". (You can either jot this down or keep it in your head.)

  2. Go to the de-duping filter view

  3. Click on the filter icon next to 'Final Address'

  4. Click on 'clear'

  5. In the search box, type the first few characters of the address that was listed in the 'Dupe locations' tab.

  6. Put checks next to the addresses that are similar (you may see 'Avenue' and 'Ave').

  7. Compare the data for the duplicate entries to determine if you need to augment or update which entry is live.

  • Check to see if ‘Type of request’ is marked as ‘Edit an existing entry.’ Take that into consideration when you decide how to combine information. Sometimes it’s selected by mistake but other times the person filling out the form really is trying to update instructions.

  • If the new entry's information is better than the previous entry, then

    • add an "x" to the "Approved" column for the new entry

    • remove the "x" from the previous entry

    • add a 'D' in the 'Why not published' column if it is a complete duplicate.

    • add a ‘DU’ in the 'Why not published' column if you are updating the new entry with some of the information from this entry.

    • make sure that any additional supplies in the ‘What do you need’ column of the previous entry are included in the new entry

    • leave the following comment on the previous entry's approved cell: "Superseded by [ROW #]"

    • make sure all the entries you’ve reconciled are marked ‘FM’ (for full moderation) in the Mod Status column. Only the one that is staying approved needs to be formatted properly. See instructions for Data-moderator.

  • If the previous entry's information is better than the new entry, then

    • leave the "Approved" column for the new entry blank
    • add a 'D' in the 'Why not published' column if it is a complete duplicate
    • add a ‘DU’ in the 'Why not published' column if you are updating the previous entry with some of the information from the new entry
    • make sure that any additional supplies in the ‘What do you need’ column of the new entry are included in the previous entry
    • leave the following comment on the new entry's approved cell: "Duplicate of [ROW #]"
    • make sure all the entries you’ve reconciled are marked ‘FM’ (for full moderation) in the Mod Status column. Only the one that is staying approved needs to be formatted properly. See instructions for Data-moderator.
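
The reconciliation in step 7 can be sketched as one function that folds the entry being dropped into the entry being kept. Several things here are assumptions for illustration: rows are dictionaries keyed by column header, the 'What do you need' cell is a comma-separated list, and `reconcile` itself is a hypothetical name. Moderators make these edits by hand.

```python
def reconcile(keep, drop, keep_row, keep_is_newer, complete_duplicate):
    """Fold `drop` into `keep` (both mutated) and return the sheet comment."""
    # Carry over any supplies that only the dropped entry mentions.
    wanted = [s.strip() for s in keep.get("What do you need", "").split(",") if s.strip()]
    for supply in drop.get("What do you need", "").split(","):
        supply = supply.strip()
        if supply and supply not in wanted:
            wanted.append(supply)
    keep["What do you need"] = ", ".join(wanted)

    keep["Approved"] = "x"
    drop["Approved"] = ""
    # 'D' for a complete duplicate, 'DU' when information was merged across.
    drop["Why not published"] = "D" if complete_duplicate else "DU"
    # Every reconciled entry is marked as fully moderated.
    keep["Mod Status"] = drop["Mod Status"] = "FM"

    # Comment text for the approved cell of the entry that was dropped.
    prefix = "Superseded by" if keep_is_newer else "Duplicate of"
    return f"{prefix} {keep_row}"
```

The supply comparison is case-sensitive, so "Gloves" and "gloves" would both be kept; a moderator would tidy that up when formatting the surviving entry.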

Q: What if the duplicate entries have different drop off instructions? (and neither record seems to have better info than the other)

Answer: Combine the information and surface both instructions in the entry that's Approved. See Row 1155 for example. Sometimes the information may be conflicting. Use your best judgment. Mark ‘FZ’ in column C (Mod Status) to indicate further discussion is required.


Useful Links:


Coordinating between moderators

  1. If another editor has click-focus on a cell, don't edit it unless they have left it untouched for more than 3 minutes.

  2. If you are co-editing large blocks of backlog, call out your blocks on the #data Slack. Saying "I'm entering rows 800-815" will help. No need to call out blocks longer than 15 at a time.

  3. If you are co-editing with one other moderator on entries that are streaming in, let one address the backlog from the top, while the other jumps to the most-recent entry and works upward.


Moderator Shift Scheduling

Once we are caught up on our backlog, we are interested in having volunteers to update new entries as they come in. Feel free to sign up for 4-hour blocks. The United States mostly sleeps at night, so it is okay to let the night-time (~midnight-5 am Pacific) go un-moderated for now.


⚠️ Note that the following sections contain information that is partially obsolete and will be updated on 3/26. ⚠️

Your job as a Remove/Update moderator

These instructions are partially obsolete, as of 3/26. We will update them on 3/26.

In this role, you help to keep our listings up to date. On ~3/24, we added a new question for our recipients, asking whether they were requesting a new listing, an updated listing, or that existing listings be removed.

Scan the relevant Column in Form Responses 1 (Column I, at 2:10pm, 3/25) for "Edit an existing site" or "Remove an existing site". When you find a row with one of them, do the following:

Remove an existing site: Search the "moderated" sheet for all entries above the "remove" entry. If any have an "x" in the "approve" column (A), remove the "x", and leave a Google Sheet comment referencing the removal entry, e.g. "Removed by request 1155". If the removal request itself has been approved with an "x", remove the "x" and leave the comment "removal request".

Edit an existing site: Search the "moderated" sheet for the relevant institution in entries above the "update" entry. Remove the approval "x" for any older entries, and leave a Google Sheet comment referencing the new entry, e.g. "Updated by request 1155". If it seems prudent to include information from older entries in the new, updated entry, add that information to the entry on the Form Responses sheet.

Have questions? Ask them in the Slack #data channel. Join the Slack here.

Updating an existing donation site

If Column I in the Form-Responses 1 sub-sheet says "Edit existing donation site", then

  1. Follow the procedure outlined in the Finding Duplicate Entries section.

Removing an existing donation site

If Column I in the Form-Responses 1 sub-sheet says "Remove existing donation site", then:

  1. Find the exact entry in Moderated sub-sheet and remove the "x" from the "Approved" column for that entry.
  2. Push updated entries to the website. When the entry is ready to go, load this URL: https://us-central1-findthemasks.cloudfunctions.net/reloadsheetdata. The change will go live within 5 minutes.

⚠️ Never delete an entry! Never delete a row! Entries are referred to by their row numbers. ⚠️


Entry Formatting (Publication-Ready)

This step is important, as it ensures that the data are compliant with the schema format and are high-quality before being published to the site. For each entry review the following:


Address

In the "City" field:

  • For New York City locations, specify the borough as well as the city, e.g. "New York - Bronx" or "New York - Manhattan"
  • Make sure there are no extra spaces after the city name. With our current script, differences in extra spaces around the same city name will cause the script to create a new listing for the same city.
  • Check for proper capitalization of the city name.
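
To see why stray spaces matter, here is a minimal sketch of the whitespace clean-up. `normalize_city` is a hypothetical helper, not the site's script; it only shows the normalization a moderator applies by hand.

```python
def normalize_city(city):
    """Trim both ends and collapse runs of whitespace inside a city name."""
    return " ".join(city.split())


# "Seattle " and "Seattle" differ as raw strings, so a script that groups
# by the raw value would treat them as two cities. After normalization
# they compare equal.
assert "Seattle " != "Seattle"
assert normalize_city("Seattle ") == normalize_city("Seattle")
```

Capitalization is left alone on purpose: automatic title-casing would mangle names such as "McAllen", so that check is better done by eye.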

Drop-off Instructions

  • If there is a link in the instructions, convert the plain text to a clickable link using <a href="https://www.url.com">link text</a> tags. Tip: A period after </a> sometimes breaks the formatting, so put it before </a>.
  • Check the contributed text for unmistakable errors, fix capitalization where needed, etc.
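
The link conversion described above, including the tip about the trailing period, can be sketched with a regular expression. This is an illustration under assumptions: `linkify` and `URL_PATTERN` are hypothetical names, only `http://` and `https://` URLs are recognized, and the URL itself is reused as the link text, where a moderator would normally write something more descriptive.

```python
import html
import re

# Simple pattern for http(s) URLs; a sentence-ending period is handled below.
URL_PATTERN = re.compile(r"https?://[^\s<>\"]+")


def linkify(text):
    """Wrap each bare URL in `text` in an <a href="..."> tag."""

    def to_anchor(match):
        url = match.group(0)
        trailing = ""
        # A sentence-ending period is not part of the URL ...
        if url.endswith("."):
            url, trailing = url[:-1], "."
        # ... and, following the tip above, it goes before </a>, not after.
        safe = html.escape(url)
        return f'<a href="{safe}">{safe}{trailing}</a>'

    return URL_PATTERN.sub(to_anchor, text)
```

For example, `linkify("Details at https://example.org/ppe.")` returns `Details at <a href="https://example.org/ppe">https://example.org/ppe.</a>`.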

Q: What do I do when the drop-off instructions field is empty?

Answer: Leave it empty.


What entities qualify as recipients?

Moderators will encounter entries where it is unclear whether or not a recipient needs PPE more than an emergency room or ICU. At present, our guidelines are very permissive. They may change as we get guidance from medical professionals.

Our no delete policy means that the status of entries can always be revisited.

Until we develop more-nuanced guidelines, we must rely on moderators' judgement. A useful guideline may be "will this recipient have significant contact with patients?"

If you're vexed, please ask on the #data Slack channel for extra opinions. If everyone is vexed, then the answer is "yes" -- the entry should be included.

  • Dentists: If the clinic is continuing emergency procedures, then approve. Check their website for info, or call the clinic to verify.

  • Individual home-health providers: If they appear to be operating professionally, they are approved.

  • Eye Doctors: At this time, they are approved. Mark FMZ any that you're uncertain about.

  • Pharmacies: We have been converging on 'no' for this.

  • International: We list entities in any United States state or territory (multiple entries in the Mariana Islands!). We can direct Canadian and European entities to our sister findthemasks sites.


Updating Recipients

Any edits should be done on "Form Responses 1," not on "moderated."

If a Recipient has submitted repeat entries:

  • Use the most complete / updated data.

  • Remove "x" from Column A for all duplicates; do not delete them.

  • Put a comment in the empty column A box to label it as a repeat. This will help other moderators know why it has been un-approved.

  • Refresh the site's data.json file for changes to take effect, by loading the following URL: https://us-central1-findthemasks.cloudfunctions.net/reloadsheetdata
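
Moderators normally load the publish URL in a browser. For anyone who prefers a script, a minimal sketch follows. It assumes that a plain, unauthenticated GET request is all the endpoint needs, which this document implies but does not state. The function name and the injectable `opener` parameter are hypothetical conveniences.

```python
import urllib.request

# Publish URL quoted in this document.
RELOAD_URL = "https://us-central1-findthemasks.cloudfunctions.net/reloadsheetdata"


def reload_sheet_data(opener=urllib.request.urlopen, timeout=30):
    """Load the publish URL once and return the HTTP status code.

    `opener` defaults to urllib's urlopen; passing a stand-in makes it
    possible to try the function without touching the live site.
    """
    with opener(RELOAD_URL, timeout=timeout) as response:
        return response.status
```

Check the #data Slack before running anything like this, since the developers sometimes ask moderators to pause publishing.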