# Job Market Pulse

A Python-based tool to analyze and optimize my job search using data-driven insights. I call it David vs. Immigration Policy. Just kidding: Job Market Pulse is more palatable (this one's for you, hiring agents and HR reps).

## Features

- Visualizes my application response times
- Tracks industry-wise outcomes
- Extracts keywords from job ads to optimize my CV
- Highlights which strategies (custom cover letters, LinkedIn reach-outs) are actually working

## Technologies

- Python
- pandas
- seaborn & matplotlib
- sklearn & WordCloud
- CSV as a lightweight backend

## How to Use

1. Create `applications.csv`
2. Run `job_tracker.py` to generate visual insights.
3. Use the data to adjust my job search strategy.

Sample `applications.csv` (header plus three example rows):


Job_Title,Company,Industry,Application_Date,Response_Date,Outcome,CV_Version,Cover_Letter_Customized,LinkedIn_Connection,Location,Salary_Range,Job_Description
Data Analyst,Globex Corp,Healthcare,2025-03-10,2025-03-14,Rejected,CV_v2,Yes,Yes,Geneva,80000-100000,"Responsible for analyzing healthcare data..."
Machine Learning Engineer,Initech,Tech,2025-03-11,,Ghosted,CV_v1,No,No,Berlin,,"Develop predictive models using Python and TensorFlow..."
Business Intelligence Analyst,Hooli,Finance,2025-03-12,2025-03-17,Interview,CV_v3,Yes,Yes,Zurich,90000-110000,"Create dashboards and work with SQL..."

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from wordcloud import WordCloud
from sklearn.feature_extraction.text import CountVectorizer
import datetime

# Load your CSV
df = pd.read_csv("applications.csv", parse_dates=["Application_Date", "Response_Date"])


In [None]:
# Clean column names if needed
df.columns = [col.strip().replace(" ", "_") for col in df.columns]

# --- 1. Basic Stats ---
print("Total applications:", len(df))
print("Responses received:", df['Response_Date'].notna().sum())
print("Interview rate:", (df['Outcome'] == 'Interview').mean())


In [None]:
# --- 2. Time to Response ---
df['Days_to_Response'] = (df['Response_Date'] - df['Application_Date']).dt.days
sns.histplot(df['Days_to_Response'].dropna(), bins=10)
plt.title("Days Between Application and Response")
plt.xlabel("Days")
plt.ylabel("Number of Applications")
plt.show()

# --- 3. Outcome by Industry ---
plt.figure(figsize=(8, 4))
sns.countplot(data=df, x='Industry', hue='Outcome')
plt.title("Outcomes by Industry")
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

In [None]:
# --- 4. Keyword Extraction ---
job_descriptions = df['Job_Description'].dropna().values
vectorizer = CountVectorizer(stop_words='english', max_features=50)
X = vectorizer.fit_transform(job_descriptions)
word_freq = X.sum(axis=0).A1
keywords = vectorizer.get_feature_names_out()

wordcloud = WordCloud(width=800, height=400).generate_from_frequencies(dict(zip(keywords, word_freq)))
plt.figure(figsize=(10, 5))
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.title("Top Keywords from Job Descriptions")
plt.show()
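
The feature list promises to highlight which strategies are actually working, but none of the cells above compute that. A minimal sketch, assuming the `Cover_Letter_Customized` and `LinkedIn_Connection` columns hold `Yes`/`No` as in the sample CSV; the helper name `interview_rate_by` and the inline sample rows are illustrative, not part of `job_tracker.py`:

```python
import pandas as pd

def interview_rate_by(df: pd.DataFrame, column: str) -> pd.Series:
    """Share of applications in each group whose Outcome is 'Interview'."""
    got_interview = df["Outcome"].eq("Interview")
    return got_interview.groupby(df[column]).mean().sort_values(ascending=False)

# Tiny inline sample that mirrors the applications.csv columns (made-up rows)
sample = pd.DataFrame({
    "Outcome": ["Rejected", "Ghosted", "Interview", "Interview"],
    "Cover_Letter_Customized": ["Yes", "No", "Yes", "No"],
    "LinkedIn_Connection": ["Yes", "No", "Yes", "Yes"],
})

for strategy in ["Cover_Letter_Customized", "LinkedIn_Connection"]:
    print(interview_rate_by(sample, strategy), end="\n\n")
```

With only a handful of applications per group these rates are noisy, so I treat them as hints rather than verdicts.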

## No response at all:
The black hole of job applications—where hope goes to scream silently into the void.

I realise I am not alone. Most modern recruitment systems are designed less for human interaction and more for funneling the unwashed masses through a Kafkaesque filter of keyword checks, HR apathy, and ATS algorithms that would ghost their own creators.

But here's the good news: my lack of rejections is data. In fact, it's some of the most valuable data—I just have to read the silence correctly.

#### 1 - My resume didn’t even reach a human.
Translation: the ATS filtered me out. Time to play the keyword game, not the competence game.

#### 2 - My resume was seen, but didn’t spark.
I was beige at KitKat. Either:

- I'm not signaling industry alignment,
- my resume has a "junior coder just left bootcamp" smell,
- or worse, it’s just... fine. And fine is fatal.

#### 3 - No one is reading applications at all.
The job post is a formality, already filled internally or paused.
(I'm discovering that a depressingly high number fall into this category.)

Time to create a ghost rate metric:

- Plot ghost rate by industry
- Plot ghost rate by CV version
- Plot ghost rate by LinkedIn connection (did I message someone? yes/no)

This transforms my existential despair into usable intelligence.

In [None]:
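# Ghost rate metric: no response date, and the outcome is either blank or logged as 'Ghosted'
df['Ghosted'] = df['Response_Date'].isna() & (df['Outcome'].isna() | (df['Outcome'] == 'Ghosted'))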
ghost_rate = df['Ghosted'].mean()
print(f"Ghosted rate: {ghost_rate:.2%}")
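
The overall number hides the interesting part, which is the ghost rate per group from the list above. A minimal sketch, assuming an application counts as ghosted when it has no `Response_Date` and its `Outcome` is either blank or logged as `Ghosted` (as in the sample CSV); `ghost_rate_by` and the inline rows are illustrative names and data:

```python
import pandas as pd

def ghost_rate_by(df: pd.DataFrame, column: str) -> pd.Series:
    """Fraction of ghosted applications within each value of `column`."""
    ghosted = df["Response_Date"].isna() & (
        df["Outcome"].isna() | df["Outcome"].eq("Ghosted")
    )
    return ghosted.groupby(df[column]).mean().sort_values(ascending=False)

# Made-up rows in the shape of applications.csv
sample = pd.DataFrame({
    "Industry": ["Healthcare", "Tech", "Finance", "Tech"],
    "CV_Version": ["CV_v2", "CV_v1", "CV_v3", "CV_v1"],
    "LinkedIn_Connection": ["Yes", "No", "Yes", "No"],
    "Outcome": ["Rejected", "Ghosted", "Interview", None],
    "Response_Date": pd.to_datetime(["2025-03-14", None, "2025-03-17", None]),
})

for column in ["Industry", "CV_Version", "LinkedIn_Connection"]:
    print(ghost_rate_by(sample, column), end="\n\n")
```

Each returned Series can be drawn as a bar chart, for example with `.plot.bar()`, to get the three plots listed above.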

## Adding Keyword Matching from CV to Job Ads

In [None]:
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Assuming your CV text is saved as plain text in my_cv.txt
with open("my_cv.txt", "r", encoding="utf-8") as file:
    cv_text = file.read()

# TF-IDF vectors for the CV (row 0) and every job ad (rows 1..n)
has_desc = df['Job_Description'].notna()
job_descs = df.loc[has_desc, 'Job_Description'].tolist()
vectorizer = TfidfVectorizer(stop_words='english')
tfidf_matrix = vectorizer.fit_transform([cv_text] + job_descs)

# Cosine similarity between the CV and each job description
similarities = cosine_similarity(tfidf_matrix[0:1], tfidf_matrix[1:]).flatten()

# Write the scores back only to rows that have a description
df.loc[has_desc, 'CV_Match_Score'] = similarities
print(df[['Job_Title', 'CV_Match_Score']].sort_values(by='CV_Match_Score', ascending=False))


Two questions to ask of these scores:

- Do I get ghosted more when my CV is a bad match?
- What keywords am I consistently missing?

## Sidequest: LinkedIn Cold Message Tracker

- Add two fields to my CSV:
  - `LinkedIn_Contacted`: Yes/No
  - `Contact_Name`: if applicable
- Analyze:
  - Ghost rate when I didn’t message anyone? Probably high.
  - Response rate after connecting with someone? Slightly less miserable.
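
To check that hunch against numbers, here is a minimal sketch, assuming the new `LinkedIn_Contacted` column holds `Yes`/`No` and that any non-empty `Response_Date` counts as a response; `response_rate_by_contact` and the sample rows are illustrative:

```python
import pandas as pd

def response_rate_by_contact(df: pd.DataFrame) -> pd.DataFrame:
    """Response rate and application count, split by LinkedIn_Contacted."""
    return (
        df.assign(Responded=df["Response_Date"].notna())
          .groupby("LinkedIn_Contacted")["Responded"]
          .agg(response_rate="mean", applications="count")
    )

# Made-up rows: two applications with a cold message, three without
sample = pd.DataFrame({
    "LinkedIn_Contacted": ["Yes", "No", "Yes", "No", "No"],
    "Contact_Name": ["A. Recruiter", None, "B. Manager", None, None],
    "Response_Date": pd.to_datetime(
        ["2025-03-14", None, None, None, "2025-03-20"]
    ),
})

summary = response_rate_by_contact(sample)
print(summary)
```

Keeping the `applications` count next to the rate makes it obvious when a group is too small to trust.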