WhoPaysWriters

UPDATE: WhoPaysWriters.com asked that their data not be posted on a third-party site, so the datasets have been removed. Please email me with any questions.


A data scrape and analysis of WhoPaysWriters.com. A summary of the results can be found here. Collected for an article in the Columbia Journalism Review. Questions and suggestions for improvement are welcome: kevinrmcelwee@gmail.com.


WhoPaysWriters.com is an anonymous platform where freelance journalists post details about their compensation. There were approximately 3000 submissions to the site from 2012 to 2018, making it the largest publicly available dataset of its kind. Journalists submit not only their pay but also information about their rights, their relationship with the editor, and other contextual data.

scrapeWPW.py

This script creates three kinds of CSVs:

  • publications.csv, which lists all publications scraped from the opening webpage.
  • A CSV created for each publication's page under the data folder.
  • allData_raw.csv, which is one CSV of everything in data.

The script requires that the user download ChromeDriver in addition to its Python packages.
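
The last step, merging the per-publication CSVs under data into a single file, could look roughly like the sketch below. It is not the code from scrapeWPW.py: the function name combine_csvs is hypothetical, and it assumes the per-publication files may not all share exactly the same columns, so missing values are written as empty strings.

```python
import csv
from pathlib import Path


def combine_csvs(data_dir, out_path):
    """Concatenate every CSV in data_dir into one file; return rows written.

    Hypothetical helper, not taken from scrapeWPW.py.
    """
    # Build the file list up front so the output file is never re-read.
    paths = sorted(Path(data_dir).glob("*.csv"))

    # Collect the union of column names, preserving first-seen order.
    fieldnames = []
    for path in paths:
        with path.open(newline="", encoding="utf-8") as f:
            for name in csv.DictReader(f).fieldnames or []:
                if name not in fieldnames:
                    fieldnames.append(name)

    count = 0
    with open(out_path, "w", newline="", encoding="utf-8") as out:
        # restval fills columns a file lacks; extra unnamed cells are ignored.
        writer = csv.DictWriter(
            out, fieldnames=fieldnames, restval="", extrasaction="ignore"
        )
        writer.writeheader()
        for path in paths:
            with path.open(newline="", encoding="utf-8") as f:
                for row in csv.DictReader(f):
                    writer.writerow(row)
                    count += 1
    return count
```

Calling combine_csvs("data", "allData_raw.csv") from the repository root would write the combined file next to the data folder rather than inside it, so a rerun does not pick up its own output.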

Clean_Data.ipynb

Cleans data for analysis. Beyond routine cleaning, I made the following decisions:

  • I replaced most other entries with NaNs.
  • I dropped everything with fewer than 100 words.
  • I dropped all fiction and poetry entries.
  • I removed entries for 2019.
  • I cut potential spam and unreasonable outliers, addressing them on a case-by-case basis.

This notebook creates allData_clean.csv, which is ultimately used for analysis.
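
As a rough illustration of the mechanical filters in the list above (word count, genre, and year), here is a standard-library sketch. It is not the notebook's code: the column names wordCount, genre, and year are assumptions, as is the choice to drop rows where the word count or year is missing or unparseable. The case-by-case spam and outlier review is not modeled.

```python
def keep_entry(row, min_words=100, excluded=("fiction", "poetry"),
               dropped_year=2019):
    """Return True if a submission survives the filters described above.

    Column names (wordCount, genre, year) are hypothetical.
    """
    try:
        words = int(row["wordCount"])
        year = int(row["year"])
    except (KeyError, TypeError, ValueError):
        # Assumption: rows without a usable word count or year are dropped.
        return False
    genre = str(row.get("genre", "")).strip().lower()
    return words >= min_words and genre not in excluded and year != dropped_year


def clean(rows):
    """Apply keep_entry to an iterable of dict rows."""
    return [row for row in rows if keep_entry(row)]
```

The rows can come from csv.DictReader over allData_raw.csv, since keep_entry converts the string values itself.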

Explore_Data.ipynb

Explores most two-variable relationships and creates appropriate graphs for study. Also creates publications_rank.csv, which uses rankings from totalPaid, wordRate, daysToBePaid, and paymentDifficulty to rank publications with more than 7 submissions.
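
One way such a combined ranking could be computed is sketched below. This is an illustration under stated assumptions, not the notebook's method: it assumes a publication column, that higher totalPaid and wordRate are better while lower daysToBePaid and paymentDifficulty are better, and that the four per-metric ranks are simply averaged. Ties within a metric are broken arbitrarily.

```python
from collections import defaultdict
from statistics import mean

# (column, higher_is_better); the direction of each metric is an assumption.
METRICS = [
    ("totalPaid", True),
    ("wordRate", True),
    ("daysToBePaid", False),
    ("paymentDifficulty", False),
]


def rank_publications(rows, min_submissions=7):
    """Return (publication, average rank) pairs, best first.

    Hypothetical helper; only publications with more than
    min_submissions entries are ranked.
    """
    by_pub = defaultdict(list)
    for row in rows:
        by_pub[row["publication"]].append(row)
    eligible = {p: rs for p, rs in by_pub.items() if len(rs) > min_submissions}

    ranks = defaultdict(list)
    for column, higher_is_better in METRICS:
        # Average each metric per publication, then rank the averages.
        means = {p: mean(float(r[column]) for r in rs)
                 for p, rs in eligible.items()}
        ordered = sorted(means, key=means.get, reverse=higher_is_better)
        for position, pub in enumerate(ordered, start=1):
            ranks[pub].append(position)

    return sorted(((p, mean(r)) for p, r in ranks.items()), key=lambda x: x[1])
```

Averaging ranks rather than raw values keeps the four metrics comparable even though they are measured in different units.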
