The Selenium scraper used to collect data from one million Medium articles.
Repository files: img, Data_cleaning.ipynb, Medium_Author_Leaderboard.ipynb, Medium_EDA.ipynb, Medium_EDA_expanded.ipynb, README.md

Analyzing_Medium

What is Medium?

Medium is a blogging platform where writers and readers share their ideas. With a strong following in the tech community, it is a place where people come to learn from professionals and industry experts. I began writing on Medium very recently, inspired to write about data science and machine learning. For more information, check out my writing here.

This Project

In this project I collected data on 1.4 million unique Medium stories from 95 of the most popular writing subjects. I used this data to answer the following questions:

  1. What do I need to know about Medium as a writer and as a reader? (source)
  2. Who are the top Data-Science writers on Medium? (source)
  3. How can Medium writers measure the performance of their stories? How can they compare their performance to that of similar writers? (source)

After answering these questions, I wrote a story detailing my findings in Medium's largest tech publication, freeCodeCamp (496k subscribers). The full article can be found here. I then published the full dataset for public use by the Medium community. All 1.4 million data points are freely available on Kaggle. My introductory article, describing the dataset and how I collected it, can be found here.

This repository is a collection of everything I found while analyzing the Medium data. For a list of key findings, see the next section.

My Findings


1. Most stories on Medium receive very little reader engagement.


2. Stories are short (a 2-3 minute read).


3. Most authors wrote only one story, and a quarter of stories were published in a publication.


4. The top 1% of stories received more than two thousand claps.


5. Authors can compare their stories to the top 1% of stories in their writing topic.


6. The most-clapped stories on freeCodeCamp far outrank other, larger, publications.


7. Here are the 100 most-clapped data-science writers on Medium over the last year (I am 41st).
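
Findings 4 and 5 rest on a percentile comparison: find the clap count that marks the top 1% of stories, overall or within a topic, and see where a given story falls relative to it. The sketch below illustrates that idea in plain Python. It is not the code from the notebooks in this repository; the function names, the nearest-rank percentile method, and the `(topic, claps)` input shape are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def top_percent_threshold(claps, percent=1.0):
    """Clap count a story must reach to sit in the top `percent` of stories.

    Uses the nearest-rank percentile method (an assumption; the notebooks
    may compute percentiles differently).
    """
    if not claps:
        raise ValueError("claps must not be empty")
    ordered = sorted(claps)
    # Nearest-rank: 1-based position of the (100 - percent)th percentile.
    rank = math.ceil((100 - percent) * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def thresholds_by_topic(stories, percent=1.0):
    """Map each topic to its top-`percent` clap threshold.

    `stories` is an iterable of (topic, claps) pairs (hypothetical shape).
    """
    by_topic = defaultdict(list)
    for topic, claps in stories:
        by_topic[topic].append(claps)
    return {
        topic: top_percent_threshold(values, percent)
        for topic, values in by_topic.items()
    }


def percentile_rank(claps, story_claps):
    """Percentage of stories with a clap count at or below `story_claps`."""
    if not claps:
        raise ValueError("claps must not be empty")
    at_or_below = sum(1 for c in claps if c <= story_claps)
    return 100 * at_or_below / len(claps)
```

An author could pass the clap counts for their topic to `percentile_rank` along with one of their own stories to see how it compares, or use `thresholds_by_topic` to look up the top 1% cutoff for each topic.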