This is the GitHub repository for Backlinko's 2020 SEO Jobs Report.
- 📝 The full data report can be found below.
- 🔨 The study was conducted with the statistical programming language R.
- 📊 The code for the analysis and plots
- 💾 The datasets (excluding LinkedIn due to its size)
- Cédric Scherer and Daniel Kupka (both frontpagedata.com)
- Brian Dean (backlinko.com)
We use two data sets:
- Glassdoor data (original) with 2,651 observations
- LinkedIn data (original) with 144,519 observations
  - subset for "SEO": 5,856 observations
  - subset for "SEO" and English-speaking offers: 984 observations
The LinkedIn data contain job offers from around the world, while the Glassdoor data cover only jobs in the US. The LinkedIn subset restricted to job offers containing the term `SEO` (or `seo`) comprises 5,856 offers overall, 984 of them from English-speaking countries (USA, Canada, UK, Ireland, Australia, South Africa) and 862 from the USA and the UK (links starting with `www.linkedin.com`).
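The subsetting described above can be sketched as follows. This is a minimal Python illustration, not the report's R code; the record fields (`title`, `description`, `url`), the sample offers, and the choice to match `SEO` as a whole word in title or description are all assumptions.

```python
import re

# Hypothetical records; the field names are assumptions for illustration.
offers = [
    {"title": "SEO Manager", "description": "Lead our search team.",
     "url": "www.linkedin.com/jobs/1"},
    {"title": "Content Writer", "description": "Experience with seo is a plus.",
     "url": "ca.linkedin.com/jobs/2"},
    {"title": "Accountant", "description": "Prepare monthly reports.",
     "url": "www.linkedin.com/jobs/3"},
]

# Match "SEO" or "seo" as a whole word, ignoring case (an assumption).
SEO_PATTERN = re.compile(r"\bseo\b", re.IGNORECASE)


def mentions_seo(offer):
    """Return True if the title or description contains the term SEO."""
    text = f"{offer['title']} {offer['description']}"
    return bool(SEO_PATTERN.search(text))


seo_offers = [o for o in offers if mentions_seo(o)]

# Offers whose link starts with www.linkedin.com (USA and UK in the report).
us_uk_offers = [o for o in seo_offers if o["url"].startswith("www.linkedin.com")]

print(len(seo_offers), len(us_uk_offers))
```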
We merged both data sets and kept as many variables as possible, manually creating new variables for both data sets (Glassdoor: `seniority` and `employment type`; LinkedIn: `sector`) based on text matching of job titles and descriptions. We also removed as many duplicated entries as possible by matching job title, employer and job location. The final worldwide data set contains 7,051 observations.
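A minimal Python sketch of merging and deduplicating on job title, employer and location is shown here. It is not the report's R code: the field names, the sample records, and the normalization (trimming and lower-casing before comparison) are assumptions.

```python
# Hypothetical, already harmonized records from both sources.
glassdoor = [
    {"title": "SEO Manager", "employer": "Acme", "location": "Austin, TX",
     "source": "Glassdoor"},
]
linkedin = [
    {"title": "seo manager ", "employer": "ACME", "location": "Austin, TX",
     "source": "LinkedIn"},
    {"title": "SEO Analyst", "employer": "Acme", "location": "Austin, TX",
     "source": "LinkedIn"},
]

KEY_FIELDS = ("title", "employer", "location")


def dedup_key(offer):
    """Normalize the key fields so trivial differences still match."""
    return tuple(offer[field].strip().lower() for field in KEY_FIELDS)


def merge_and_deduplicate(*datasets):
    """Concatenate the datasets and keep the first offer seen for each key."""
    seen = set()
    merged = []
    for dataset in datasets:
        for offer in dataset:
            key = dedup_key(offer)
            if key not in seen:
                seen.add(key)
                merged.append(offer)
    return merged


merged = merge_and_deduplicate(glassdoor, linkedin)
print([offer["title"] for offer in merged])
```

Because the first occurrence wins, the order in which the data sets are passed decides which source's record is kept for a duplicate.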
Because the job offers were collected from all over the world, the worldwide data set includes many non-English terms. We therefore also merged the Glassdoor data with the English subset of the LinkedIn data, again keeping as many variables as possible by manually creating new variables for both data sets. The final "All English" data set contains 2,569 observations.
The Glassdoor data are cleaner than the LinkedIn data with regard to job titles and descriptions. Consequently, some plots work better with the Glassdoor data alone, so for now we provide both versions (the merged "All English" data set and the Glassdoor data set).
Also, the Glassdoor data contain information that is missing from the LinkedIn data, such as `estimated salary range`, `rating`, `employer`, `industry`, and `size (no. of employees)`.
We analyzed the data on job titles using text mining techniques. In a first step, we tokenized the job titles into single words and visualized their frequency. Stop words and words that appeared fewer than 7 times were removed to make the graph easier to grasp.
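The tokenizing and filtering step can be sketched as below. This is a Python illustration under stated assumptions, not the report's R code: the tiny stop-word list, the letters-only tokenizer, and the function name `title_word_counts` are hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"and", "of", "the", "for", "in", "a"}


def title_word_counts(titles, min_count=7):
    """Count words across job titles, dropping stop words and rare words."""
    counts = Counter()
    for title in titles:
        tokens = re.findall(r"[a-z]+", title.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    # Keep only words that appear at least `min_count` times.
    return {word: n for word, n in counts.items() if n >= min_count}


titles = ["SEO Manager", "Head of SEO", "SEO and Content Specialist"]
print(title_word_counts(titles, min_count=2))  # {'seo': 3}
```

With the default threshold of 7, the three sample titles would yield an empty result, which is why the usage line lowers `min_count`.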
In a second step, we analyzed sequences of words in the job title. The sorted bar plot shows the most popular consecutive sequences of words (5 or more occurrences), colored by category.
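Counting consecutive word sequences can be sketched in the same spirit. The sequence length `n=2` is an assumption (the report does not state which lengths were used), and the tokenizer and function name are hypothetical; the report itself was produced in R.

```python
import re
from collections import Counter


def title_ngrams(titles, n=2, min_count=5):
    """Return (sequence, count) pairs for word sequences of length n,
    sorted by count, keeping those seen at least `min_count` times."""
    counts = Counter()
    for title in titles:
        tokens = re.findall(r"[a-z]+", title.lower())
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return [(gram, c) for gram, c in counts.most_common() if c >= min_count]


titles = ["Senior SEO Manager", "SEO Manager", "SEO Content Manager"]
print(title_ngrams(titles, n=2, min_count=2))  # [('seo manager', 2)]
```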
Are technical terms in job titles more popular? Which technical and non-technical terms are most popular?
We manually classified the terms into technical and non-technical positions, removing all words that are not specific to either category:
- technical ~ `analy|special|engine|develop|technic|optimi`
- non-technical ~ `manage|direct|writ|consult|coordinat|edito|market|sale|social|strateg|supervis`
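Applying the two patterns above to single words could look like the following Python sketch. The patterns are the ones listed; lower-casing the word first and checking the technical pattern before the non-technical one are assumptions, as is the function name `classify_word`.

```python
import re
from collections import Counter

TECHNICAL = re.compile(r"analy|special|engine|develop|technic|optimi")
NON_TECHNICAL = re.compile(
    r"manage|direct|writ|consult|coordinat|edito|market|sale|social|strateg|supervis"
)


def classify_word(word):
    """Return 'technical', 'non-technical', or None if neither pattern matches."""
    word = word.lower()
    if TECHNICAL.search(word):
        return "technical"
    if NON_TECHNICAL.search(word):
        return "non-technical"
    return None  # the word is dropped from the comparison


words = ["Analyst", "Manager", "Engineer", "Remote", "Strategist"]
labels = [classify_word(w) for w in words]
per_category = Counter(label for label in labels if label is not None)
print(per_category)
```

Because the patterns are stems, they match anywhere inside a word: `engine` covers "Engineer" and "Engineering", and `manage` covers "Manager" and "Management".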
The modified stacked bar plot shows the number of words found per job category and, as an additional stacked bar next to it, the most common words per category (with labels for words that occurred at least 20 times). The height of each stack also indicates the count; the width is arbitrary.
Note: For now we focused on job offers from the US in terms of cities, states and counties. This decision was based on two reasons: first, we believe US data is of most interest; second, we center all our analyses on job offers from English-speaking countries only, so a world map would be inconsistent with that focus. Of course, if you think it is valuable, we can also create a world map.
-> Counts of unique companies per revenue class
I tokenized the descriptions and removed stop words and numbers, and manually removed nonsense and non-skill-related words. There might be more such words; if we keep this analysis, we can take a closer look. The word clouds show the 75 most common words per group.
The following graphs show counts of unique companies per rating, first for each rating score found in the data (including 1 decimal place) and then as a histogram grouped into ranges of 1.
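Both views of the rating counts can be sketched as follows. This is a Python illustration with hypothetical data, not the report's R code; the bin labels and the choice to put a rating of exactly 5 into the top bin are assumptions.

```python
import math
from collections import Counter

# Hypothetical (company, rating) pairs; one company may post several offers.
offers = [("Acme", 3.7), ("Acme", 3.7), ("Globex", 4.2),
          ("Initech", 4.9), ("Umbrella", 5.0)]

# Keep one rating per company so that each company is counted once.
rating_by_company = dict(offers)

# View 1: unique companies per exact rating score (one decimal place).
per_score = Counter(rating_by_company.values())


def rating_bin(rating, top=5):
    """Map a rating to a range of width 1; the maximum goes into the top bin."""
    lower = min(math.floor(rating), top - 1)
    return f"{lower}-{lower + 1}"


# View 2: unique companies per range of 1.
per_bin = Counter(rating_bin(r) for r in rating_by_company.values())
print(per_score)
print(per_bin)
```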
Note: Since there were many companies with the highest rating of 5 (see plots before), we decided to modify the conditions and to remove companies with revenues below $10 million. There were only seven companies with the lowest possible rating anyway, so we did not apply any filtering in that case.
This is a simple wordcloud of words mentioned in the job descriptions with a frequency of 10 or more. That way, we can scan through the list and select those that are of interest.
Words that occurred at least 100 times:
Words that occurred at least 1000 times:
Note: Consecutive sequences of words (n-grams) in the job descriptions did not yield any insights; they consist mostly of stock phrases and filler words.
We extracted from the job descriptions the required/desired degree:
- Bachelors ~ `b.ba.|b.sc.|bba|bsc|\\bbachelor\\b`
- Masters ~ `m.ba.|m.sc.|mba|msc|[(to|will)]\\s\\bmaster\\b`
- Doctorates ~ `ph.d.|phd|doctora`
In total we found 792 positions mentioning Bachelors, 204 Masters and only 11 belonging to the Doctorate category (out of 2,651 job offer descriptions).
Afterwards, we determined the minimum degree required (Doctorate > Masters > Bachelors), yielding 792 Bachelors, 146 Masters and 7 Doctorates.
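The two steps (find every degree mentioned, then keep the lowest one) can be sketched in Python as below. The patterns here are simplified, hypothetical stand-ins and deliberately not the report's exact expressions; the function names are assumptions as well.

```python
import re

# Simplified, hypothetical patterns; they are not the report's expressions.
DEGREE_PATTERNS = {
    "Bachelors": re.compile(r"\bb\.?sc\b|\bbba\b|\bbachelor", re.IGNORECASE),
    "Masters": re.compile(r"\bm\.?sc\b|\bmba\b|\bmaster'?s degree\b", re.IGNORECASE),
    "Doctorate": re.compile(r"\bph\.?d\b|\bdoctora", re.IGNORECASE),
}

# Ordered from the lowest to the highest degree.
DEGREE_ORDER = ["Bachelors", "Masters", "Doctorate"]


def degrees_mentioned(description):
    """Return every degree category found in the description, lowest first."""
    return [d for d in DEGREE_ORDER if DEGREE_PATTERNS[d].search(description)]


def minimum_degree(description):
    """Return the lowest degree mentioned, or None if none is mentioned."""
    found = degrees_mentioned(description)
    return found[0] if found else None


text = "Bachelor's degree required; MBA preferred."
print(degrees_mentioned(text))  # ['Bachelors', 'Masters']
print(minimum_degree(text))     # Bachelors
```

An offer that mentions both a Bachelors and a Masters degree is counted once under Bachelors, which is why the Masters and Doctorate counts drop in the second step while the Bachelors count stays the same.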
For example, "SEO Manager" by Quizlet in San Francisco, CA, with an estimated average salary of $107.5K, lists no degree requirements.
Note: There are many ways one could look at those numbers: absolute, proportional across all job offers and proportional within all that list any educational requirement.
A) Absolute number of job offers requiring a qualification (this gives no information about the overall number of job offers; it could be added, but most of the plot would then be grey = no requirement):
B) Proportion across all job offers (i.e. number of offers with [education] / sum(job offers)):
C) Proportion among all job offers requiring a qualification (i.e. number of offers with [education] / sum(job offers with any education required)):
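The three views can be computed from the minimum-degree counts reported above (792, 146 and 7 out of 2,651 descriptions). The Python sketch below assumes that each offer has at most one minimum degree, so the three counts add up to the number of offers with any requirement; the variable names are hypothetical.

```python
# A) Absolute counts (minimum degree per offer, as reported above).
counts = {"Bachelors": 792, "Masters": 146, "Doctorate": 7}
total_offers = 2651

# Assumption: one minimum degree per offer, so the counts do not overlap.
offers_with_requirement = sum(counts.values())

# B) Share of all job offers.
share_of_all = {degree: n / total_offers for degree, n in counts.items()}

# C) Share of job offers that list any educational requirement.
share_of_required = {
    degree: n / offers_with_requirement for degree, n in counts.items()
}

for degree, n in counts.items():
    print(f"{degree}: {n} offers, {share_of_all[degree]:.1%} of all offers, "
          f"{share_of_required[degree]:.1%} of offers with a requirement")
```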
For now, I use the programming languages listed in the yearly Stack Overflow (SO) survey, plus Julia: JavaScript, HTML/CSS, SQL, Python, Java, Bash/Shell/PowerShell, C#, PHP, C++, TypeScript, C, Ruby, Go, Assembly, Swift, Kotlin, R, VBA, Objective-C, Scala, Rust, Dart, Elixir, Clojure, WebAssembly, Julia.
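Short names such as C, R and Go are easy to mismatch (for example "Java" inside "JavaScript", or "C" inside "C++"). The Python sketch below shows one possible way to guard against that; it is an assumption about how such a search could be done, not the report's R code, and it uses only a subset of the list above.

```python
import re

# Subset of the language list, for illustration only.
LANGUAGES = ["JavaScript", "Java", "C", "C#", "C++", "R", "Go", "SQL", "Python"]


def language_pattern(name):
    """Match the name only when it is not glued to letters, digits, '+' or '#'."""
    return re.compile(rf"(?<![\w+#]){re.escape(name)}(?![\w+#])")


PATTERNS = {name: language_pattern(name) for name in LANGUAGES}


def languages_mentioned(description):
    """Return the listed languages found in the description (case-sensitive)."""
    return [name for name, pattern in PATTERNS.items()
            if pattern.search(description)]


text = "Solid JavaScript and SQL skills; C++ or Python is a plus."
print(languages_mentioned(text))  # ['JavaScript', 'C++', 'SQL', 'Python']
```

Matching is case-sensitive on purpose, but a sentence starting with the verb "Go" would still count as a hit, so short names need a manual check either way.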
We searched for the given tool names within the description of each job offer. The following list of (popular) tools was used for the analysis:
`Bing Webmaster Tools`, `Botify`, `Bright Local`, `Browseo`, `Clusteric`, `ContentKing App`, `DareBoost`, `DeepCrawl`,
`EasyRedir`, `Forecheck`, `Google Analytics`, `Google Mobile-Friendly Test`, `Google PageSpeed Insights`,
`Google Search Console`, `Google XML Sitemaps`, `GTmetrix`, `HeadMasterSEO`, `LinkPatrol`, `Lipperhey`, `OnCrawl`,
`Panguin Tool`, `Raven Tools`, `Screaming Frog`, `Seobility`, `Seomator`, `SERPmetrics`, `Siteliner`, `Topvisor`,
`Varvy SEO Tool`, `Whitespark`, `Woorank`, `Yoast`, `Zadroweb`, `Answer The Public`, `ClearScope`, `Exploding Topics`,
`FAQfox`, `Google Keyword Planner`, `Google Location`, `Google Trends`, `Google Data Studio`, `Gookey`, `GrepWords`,
`HitTail`, `Imforsmb`, `iSpionage`, `Jaaxy`, `Keyword Eye`, `Keyword Revealer`, `Keyword Snatcher`, `Keyworddit`,
`KeywordIn`, `Keywords Everywhere`, `KeywordSpy`, `KeywordTool.io`, `Kombinator`, `kwfinder`, `Long Tail Pro`,
`Power Suggest Pro`, `QuestionDB`, `SanityCheck`, `SECockpit`, `Seed Keywords`, `SEMrush`, `SERPStat`, `SimilarWeb`,
`Soovle`, `SpyFu`, `StoryBase`, `TermExplorer`, `TwinWord`, `UberSuggest`, `Webtexttool`, `Wondersearch`,
`Wordstream's Free Keyword Tools`, `WordTracker`, `Wordtracker Scout`, `Advanced Web Ranking`, `Agency Analytics`,
`AMZ Tracker`, `Authority Labs`, `GeoRanker`, `Microsite Masters`, `NightWatch`, `Pro Rank Tracker`, `Rank Ranger`,
`Rival IQ`, `SE Ranking`, `Search Latte`, `Serpfox`, `SERPs.com`, `SERPWoo`, `Sistrix`, `WebCEO`, `WordTail`,
`Animalz Revive`, `BuzzSumo`, `Can I Rank`, `ClickFlow`, `Google SERP Preview Tool`, `Keys4Up`, `LSIGraph`,
`MarketMuse`, `MetaTags.io`, `nTopic`, `Positionly`, `Ryte`, `SEOptimer`, `TrendSpottr`, `Upcity`, `WordLift`,
`Ahrefs`, `cognitiveSEO`, `Kerboo`, `Majestic`, `Moz`, `MozBar`, `SEO PowerSuite`, `SEOGadget`, `ShareMetric`,
`URL Profiler`, `WebMeUp Backlink Tool`, `Morningfame`, `Social Blade`, `TubeBuddy`, `VidIQ`, `YTCockpit`,
`AuthoritySpy`, `Buzzstream`, `DIBZ`, `disavow.it`, `Domain Hunter Plus`, `GroupHigh`, `HARO`, `JustReachOut`,
`Linkbird`, `Linkody`, `Linkstant`, `MailShake`, `Muck Rack`, `Ninja Outreach`, `Ontolo`, `Pitchbox`, `Remove'em`,
`Rmoov`, `ScrapeBox`, `Tableau`, `qlik`, `Power BI`
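Searching the descriptions for tool names can be sketched as follows. This is a minimal Python illustration with a handful of the tools and made-up descriptions, not the report's R code; case-insensitive substring matching is an assumption.

```python
from collections import Counter

# A few names from the list above, for illustration only.
TOOLS = ["Google Analytics", "Screaming Frog", "SEMrush", "Ahrefs", "Moz"]


def tools_mentioned(description, tools=TOOLS):
    """Return the tools whose name occurs in the description, ignoring case."""
    text = description.lower()
    return [tool for tool in tools if tool.lower() in text]


# Hypothetical job descriptions.
descriptions = [
    "Hands-on experience with Google Analytics and SEMrush.",
    "You know Ahrefs, Semrush and Screaming Frog inside out.",
    "No specific tools required.",
]

# Number of offers mentioning each tool.
offers_per_tool = Counter()
for description in descriptions:
    offers_per_tool.update(tools_mentioned(description))

# Offers naming at least one tool versus none.
with_tools = sum(1 for d in descriptions if tools_mentioned(d))
without_tools = len(descriptions) - with_tools
print(offers_per_tool, with_tools, without_tools)
```

Plain substring matching is simple but can overcount: `Moz` also matches "MozBar", so overlapping names may need word-boundary checks or manual review.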
New version showing proportions none versus 1+:
I am still not sure how to approach this in a better way. For now, I have filtered the descriptions for the following strings:
- `[0-9]+ years experience` and `1 year experience`
- `experience: [0-9]+ year`
- `experience [0-9]+ year`
- `experience of [0-9]+ year`
Note: I checked some matches manually, especially the high values. Two job positions really do require 23+ years of experience, but one match with 30 years of experience was a false positive of a kind that was bound to occur: it was part of the employer description ("the language recruitment specialist with over 30 years experience") and was thus removed.
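The string filters for years of experience can be approximated with two regular expressions, as in the Python sketch below. It is not the report's R code: allowing an optional `+` after the number, ignoring case, and returning the largest value found are assumptions, and `years_of_experience` is a hypothetical name.

```python
import re

EXPERIENCE_PATTERNS = [
    # "3 years experience", "1 year experience", "23+ years experience"
    re.compile(r"([0-9]+)\+? years? experience", re.IGNORECASE),
    # "experience: 5 year(s)", "experience 5 year(s)", "experience of 5 year(s)"
    re.compile(r"experience(?::| of)? ([0-9]+)\+? year", re.IGNORECASE),
]


def years_of_experience(description):
    """Return the largest number of years matched, or None without a match."""
    years = [int(match.group(1))
             for pattern in EXPERIENCE_PATTERNS
             for match in pattern.finditer(description)]
    return max(years) if years else None


print(years_of_experience("Requires 3+ years experience in SEO"))  # 3
print(years_of_experience("Experience: 5 years in a similar role"))  # 5
# The employer description quoted above is matched too, which is why
# high values still need a manual check:
print(years_of_experience(
    "the language recruitment specialist with over 30 years experience"))  # 30
```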
Bars showing average and standard deviation:
For example, "SEO Director" by Etsy in Brooklyn, NY, requires JavaScript, HTML and CSS and has an estimated average salary of $116K (estimated range $108-123K). In the state of New York, the average salary is $71.6K and the median salary is $70K.
Only English-speaking countries:
(The date in parentheses indicates the week that covered the first of each month.)