# Makeover Monday | 2026 W4 Job Titles in Highest Demand - State by State

The data for this week's Makeover Monday challenge comes from the Lightcast article "[Job titles in highest-demand, state by state](https://lightcast.io/resources/blog/most-posted-for-jobs-in-each-us-state)" (published on Nov 19, 2025 by Hannah Grieser & JP Lespinasse). The article provides a high-level view of national and state-level hiring trends and includes an interactive US map that encourages readers to explore the data for themselves. The data used in this notebook and in the final Tableau dashboard was prepared by [Makeover Monday](https://makeovermonday.co.uk/) for use in the Tableau Makeover Monday data visualization challenge.

**How Lightcast Collects their Data**

I was interested in understanding how this data was collected, so I took a closer look at Lightcast’s methodology. Lightcast is a labor market and economic data intelligence company that specializes in large-scale job posting and workforce data across the world.
Their data is compiled from millions of online job postings, employer career sites, and job boards, which are then cleaned and refined by AI and human experts to create a comprehensive database of job posting activity.

See the [About : How We Do It](https://lightcast.io/why-lightcast/about) section of their website for more information.



* **Article Link**: https://lightcast.io/resources/blog/most-posted-for-jobs-in-each-us-state

* **Compiled Data** (*published by Makeover Monday*): https://data.world/makeovermonday/2025w4-most-posted-us-jobs-by-state


In [1]:
import pandas as pd
import pandas_gbq
import numpy as np
import geopandas as gpd #you need this to use the spatial files

In [2]:
df = pd.read_csv('https://query.data.world/s/cexlh3qmmucu3oifgigaual53fahxm?dws=00000',sep=';')

## Exploratory Data Analysis

In [3]:
df.describe(include='all').T

Unnamed: 0,count,unique,top,freq
State Name,50,50,Alabama,1
1st Place,50,2,Registered Nurse,49
2nd Place,50,5,Retail Sales Associate,31
3rd Place,50,8,Tractor-Trailer Truck Driver,16


In [4]:
df.head()

Unnamed: 0,State Name,1st Place,2nd Place,3rd Place
0,Alabama,Registered Nurse,Tractor-Trailer Truck Driver,Retail Sales Associate
1,Alaska,Registered Nurse,Retail Sales Associate,Physician
2,Arizona,Registered Nurse,Retail Sales Associate,Physician
3,Arkansas,Registered Nurse,Tractor-Trailer Truck Driver,Retail Sales Associate
4,California,Registered Nurse,Retail Sales Associate,Sales Representative


In [5]:
first_place = df['1st Place'].unique()
print(first_place)

['Registered Nurse' 'Retail Sales Associate']


In [6]:
second_place = df['2nd Place'].unique()
print(second_place)

['Tractor-Trailer Truck Driver' 'Retail Sales Associate'
 'Software Developer / Engineer' 'Physician' 'Registered Nurse']


In [7]:
third_place = df['3rd Place'].unique()
print(third_place)

['Retail Sales Associate' 'Physician' 'Sales Representative'
 'Tractor-Trailer Truck Driver' 'Customer Service Representative'
 'Retail Store Manager / Supervisor' 'Software Developer / Engineer'
 'Licensed Practical / Vocational Nurse']


# Plan for Tableau Viz: **Top Jobs vs Popular Jobs**

The plan for this visualization is to explore the relationship between top-ranked jobs (1st place by state) and overall job popularity across the US using a simple bar chart paired with a supporting hexmap.

**Have you ever heard of Ranked Choice Voting?**

This was the first thing that came to mind when looking at this data, and I immediately thought: how can I show the interaction between the top job (1st place, the "ranked choice") and the most popular jobs in the USA? Maybe they are the same, maybe they aren't.
There isn't much data processing needed for this. The data is all here; we just need to reshape it to be "long", so the "melting technique" I use for everything applies, and then we just count!
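
A minimal sketch of the melt-then-count idea, using only the five rows shown in `df.head()` above, so the counts are illustrative and not the national totals. The names `sample`, `long`, `top_counts`, and `popular_counts` are placeholders for this example and are not used elsewhere in the notebook.

```python
import pandas as pd

# The five rows shown in df.head() above; not the full 50-state data set.
sample = pd.DataFrame({
    "State Name": ["Alabama", "Alaska", "Arizona", "Arkansas", "California"],
    "1st Place": ["Registered Nurse"] * 5,
    "2nd Place": ["Tractor-Trailer Truck Driver", "Retail Sales Associate",
                  "Retail Sales Associate", "Tractor-Trailer Truck Driver",
                  "Retail Sales Associate"],
    "3rd Place": ["Retail Sales Associate", "Physician", "Physician",
                  "Retail Sales Associate", "Sales Representative"],
})

# Wide -> long: one row per (state, rank) pair.
long = sample.melt(id_vars=["State Name"], var_name="rank_text", value_name="job_title")

# "Top" = how often a job takes 1st place; "popular" = how often it appears at any rank.
top_counts = long.loc[long["rank_text"] == "1st Place", "job_title"].value_counts()
popular_counts = long["job_title"].value_counts()

print(top_counts)
print(popular_counts)
```

Even in this small sample the two views differ: only one job ever takes 1st place, while several jobs show up once the 2nd and 3rd places are counted as well.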

**Hexmap to Highlight the Geospatial Relationship**

My favorite way to show US state data (when you don't really care where in the state the data is coming from) is a hexmap, and I find myself going back to this saved YouTube video over and over:
[Tableau Hexagon Map Tutorial: How to Create Padded & Non-Padded Hex Maps](https://www.youtube.com/watch?v=f4teeqBkwLs).
The hexmap in my viz is not going to do the heavy lifting; it's going to be an aside in the dashboard. What I really want to show is the bar chart.

Here is where it gets complicated: the YouTube tutorial uses a Tableau relationship between two files, which is great, but I don't want to do that. Instead, I will enrich my df_long with the spatial files from https://stanke.co/hexstatespadded/
I don't want to pull two different files into Tableau - I just want one.

## Data Processing

In [8]:
#clean up the column names to snake_case
df_1 = df.copy()
df_1.columns = (
    df.columns
      .astype(str)
      .str.strip()
      .str.replace(r"\s+", "_", regex=True)
      .str.lower()
)

df_1.head()

Unnamed: 0,state_name,1st_place,2nd_place,3rd_place
0,Alabama,Registered Nurse,Tractor-Trailer Truck Driver,Retail Sales Associate
1,Alaska,Registered Nurse,Retail Sales Associate,Physician
2,Arizona,Registered Nurse,Retail Sales Associate,Physician
3,Arkansas,Registered Nurse,Tractor-Trailer Truck Driver,Retail Sales Associate
4,California,Registered Nurse,Retail Sales Associate,Sales Representative


In [9]:
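# Create df_long by unpivoting (melting) the data

# NOTE: 'count' is added before the melt and is not in id_vars, so it is melted too
# (it appears below as rows with rank_text == 'count' and job_title == 1)
df_1['count'] = 1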

df_long = (
    df_1
      .melt(
          id_vars=["state_name"],
          var_name='rank_text',
          value_name='job_title'
      )
)

df_long['rank'] = df_long['rank_text'].str.replace(r'st_place|nd_place|rd_place', '', regex=True) #This replaces '1st_place' with '1', '2nd_place' with '2', etc.
df_long

Unnamed: 0,state_name,rank_text,job_title,rank
0,Alabama,1st_place,Registered Nurse,1
1,Alaska,1st_place,Registered Nurse,1
2,Arizona,1st_place,Registered Nurse,1
3,Arkansas,1st_place,Registered Nurse,1
4,California,1st_place,Registered Nurse,1
...,...,...,...,...
195,Virginia,count,1,count
196,Washington,count,1,count
197,West Virginia,count,1,count
198,Wisconsin,count,1,count


In [10]:
job_title_count = df_long['job_title'].nunique()
job_title = df_long['job_title'].unique()
print(f'{job_title_count} unique jobs')
print(job_title)

10 unique jobs
['Registered Nurse' 'Retail Sales Associate'
 'Tractor-Trailer Truck Driver' 'Software Developer / Engineer'
 'Physician' 'Sales Representative' 'Customer Service Representative'
 'Retail Store Manager / Supervisor'
 'Licensed Practical / Vocational Nurse' 1]
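
The stray `1` in the list above comes from the `count` helper column: it is added before the melt and is not listed in `id_vars`, so it is melted along with the rank columns. One possible way to avoid that, sketched here on a two-row toy frame, is to name the rank columns explicitly with `value_vars` and add `count` after the reshape. The names `toy`, `rank_cols`, and `toy_long` are illustrative only, and storing the rank as an integer is an assumption about what is convenient downstream.

```python
import pandas as pd

# Two rows copied from the snake_case preview above; a toy stand-in for df_1.
toy = pd.DataFrame({
    "state_name": ["Alabama", "Alaska"],
    "1st_place": ["Registered Nurse", "Registered Nurse"],
    "2nd_place": ["Tractor-Trailer Truck Driver", "Retail Sales Associate"],
    "3rd_place": ["Retail Sales Associate", "Physician"],
})

rank_cols = ["1st_place", "2nd_place", "3rd_place"]

# Melt only the rank columns, so no helper column can leak into job_title.
toy_long = toy.melt(
    id_vars=["state_name"],
    value_vars=rank_cols,
    var_name="rank_text",
    value_name="job_title",
)

# Pull the leading digit out of '1st_place', '2nd_place', ... and store it as an integer.
toy_long["rank"] = toy_long["rank_text"].str.extract(r"^(\d+)", expand=False).astype(int)

# Add the helper column after the reshape so it is never melted.
toy_long["count"] = 1

print(toy_long)
```

With this ordering every `job_title` value is a string, and the long frame has exactly three rows per state.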


### Enrich the Data with the Spatial Data

In [11]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [12]:
hex_gdf = gpd.read_file("/content/drive/MyDrive/Makeover Monday/HexStatesPadded/HexStatesPadded.shp")

### Inspect the HexStates Padded Shapefile

We want to see:
 * state_name or state
 * state_abbr
 * geometry << this is the polygon object

The geometry column holds the hex shapes, which fit together so nicely and are arranged so perfectly. I am not going to try to recreate these, as someone else has already done the work!
See https://stanke.co/hexstatespadded/

In [15]:
hex_gdf.columns

Index(['State', 'State_Abbr', 'geometry'], dtype='object')

In [16]:
hex_gdf.head()

Unnamed: 0,State,State_Abbr,geometry
0,Alabama,AL,"POLYGON ((14.06 0.133, 14.06 1.156, 14.99 1.71..."
1,Alaska,AK,"POLYGON ((0.06 10.333, 0.06 11.356, 0.99 11.91..."
2,Arizona,AZ,"POLYGON ((5.06 1.833, 5.06 2.856, 5.99 3.414, ..."
3,Arkansas,AR,"POLYGON ((11.06 1.833, 11.06 2.856, 11.99 3.41..."
4,California,CA,"POLYGON ((3.06 1.833, 3.06 2.856, 3.99 3.414, ..."


You can see that the geometry field is a POLYGON. What we want are its **boundary coordinates**, i.e. the vertices that make up the hexagon shape. To get them, we read the polygon's outer boundary and pull out its points in order. This can be done with **exterior.coords**, which returns the coordinate points that define the outer boundary (exterior ring) of a polygon object.
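
As a small illustration of what `exterior.coords` returns, here is a sketch on a single hand-made hexagon. The coordinates are invented for the example and are not taken from the HexStatesPadded shapefile; `hexagon` and `coords` are placeholder names.

```python
from shapely.geometry import Polygon

# A made-up hexagon; these coordinates are not from HexStatesPadded.shp.
hexagon = Polygon([(0, 1), (0, 2), (1, 2.5), (2, 2), (2, 1), (1, 0.5)])

# Ordered points of the outer boundary (exterior ring).
coords = list(hexagon.exterior.coords)

# The ring is closed: the first point is repeated at the end,
# so six corners come back as seven coordinate pairs.
print(len(coords))
print(coords[0] == coords[-1])
```

The repeated closing point is worth keeping in mind when counting rows: each hexagon contributes seven points, not six.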

In [18]:
rows = []

for _, r in hex_gdf.iterrows():
    state = r["State"]
    # exterior.coords gives the ordered points of the polygon's outer boundary (exterior ring)
    coords = list(r.geometry.exterior.coords)

    for path, (x, y) in enumerate(coords):
        # path is the point order that tells Tableau how to connect the points back into a hexagon
        rows.append({"State": state, "x": x, "y": y, "path": path})

hex_points_clean = pd.DataFrame(rows)

hex_points_clean.head()

Unnamed: 0,State,x,y,path
0,Alabama,14.06,0.133,0
1,Alabama,14.06,1.156,1
2,Alabama,14.99,1.714,2
3,Alabama,15.92,1.156,3
4,Alabama,15.92,0.133,4


Now join hex_points_clean to df_long for the final data and load that into Tableau for visualization!

In [20]:
df_final = pd.merge(
    df_long,
    hex_points_clean,
    left_on='state_name',
    right_on='State',
    how='left'
)

df_final.shape
##note - df_final has 7x the number of rows in df_long: each hexagon's exterior ring has 7 points (6 corners plus the repeated closing point), so 200 x 7 = 1400

(1400, 8)
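
Because this is a left join on state names, a name that is spelled differently in the two sources would silently come through with empty `x`, `y`, and `path` values. A hedged sketch of one way to check for that uses the `indicator` option of `merge`. The frames `jobs` and `hex_points` are toy stand-ins, and 'Atlantis' is a deliberately invented mismatch.

```python
import pandas as pd

# Toy stand-ins for df_long and hex_points_clean.
jobs = pd.DataFrame({
    "state_name": ["Alabama", "Alaska", "Atlantis"],  # 'Atlantis' is a deliberate mismatch
    "job_title": ["Registered Nurse"] * 3,
})
hex_points = pd.DataFrame({
    "State": ["Alabama", "Alabama", "Alaska", "Alaska"],
    "x": [14.06, 14.06, 0.06, 0.06],
    "y": [0.133, 1.156, 10.333, 11.356],
    "path": [0, 1, 0, 1],
})

# indicator=True adds a '_merge' column that says where each row came from.
checked = jobs.merge(hex_points, left_on="state_name", right_on="State",
                     how="left", indicator=True)

# Rows marked 'left_only' had no matching hexagon.
unmatched = checked.loc[checked["_merge"] == "left_only", "state_name"].unique()
print(unmatched)
```

An empty result for `unmatched` would mean every state found its hexagon.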

In [21]:
df_final.head()

Unnamed: 0,state_name,rank_text,job_title,rank,State,x,y,path
0,Alabama,1st_place,Registered Nurse,1,Alabama,14.06,0.133,0
1,Alabama,1st_place,Registered Nurse,1,Alabama,14.06,1.156,1
2,Alabama,1st_place,Registered Nurse,1,Alabama,14.99,1.714,2
3,Alabama,1st_place,Registered Nurse,1,Alabama,15.92,1.156,3
4,Alabama,1st_place,Registered Nurse,1,Alabama,15.92,0.133,4


# Save to BigQuery Table

Save df_final to a dedicated table in my BigQuery warehouse, which I will then connect to Tableau Public through Connected Sheets.

In [23]:
# Write df_final to the target table in the BigQuery dataset
UPLOAD_TO_BQ = False # set True when you actually want to write tables

if UPLOAD_TO_BQ:
  project_id = 'data-projects-478723' #'your_project_id'
  destination_table = 'makeover_monday.highest_demand_job_titles' #'your_dataset.another_new_table'

  pandas_gbq.to_gbq(
      dataframe=df_final,
      destination_table=destination_table,
      project_id=project_id,
      if_exists='replace' ## 'if_exists' options: 'fail', 'replace', 'append'
  )

100%|██████████| 1/1 [00:00<00:00, 6195.43it/s]


# Tableau Dashboard