Trendwork: Job Market Intelligence Platform

Trendwork is a data-driven platform designed to aggregate, process, and analyze job listings from multiple job boards. It provides a centralized view of market trends, skill demands, and salary insights by leveraging modern data engineering practices and Artificial Intelligence.

Core Objective

The primary goal of Trendwork is to automate the extraction of job market data and transform it into actionable intelligence. By using advanced NLP and geocoding, it identifies emerging skills and geographic hotspots for various roles.

Architecture

The platform follows a Medallion Architecture (Bronze, Silver, Gold) implemented on Google Cloud Platform to ensure data quality and lineage.

Bronze Layer (Raw Data)

  • Scrapers running on Cloud Run extract raw JSON data from job boards.
  • Data is stored as immutable objects in Google Cloud Storage (see the sketch after this list).
  • Cloud Scheduler triggers automated scraping cycles.
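
As a rough illustration of this flow, the sketch below writes one scrape cycle's raw JSON to Cloud Storage with the search keyword attached as object metadata. The bucket name and keyword are placeholder assumptions, not the repo's actual scraper code.

    # bronze_upload.py -- illustrative sketch of a Bronze-layer write, not the repo's scraper.
    # Assumes google-cloud-storage is installed and Application Default Credentials
    # are available (e.g. via the Cloud Run service account).
    import json
    from datetime import datetime, timezone

    from google.cloud import storage

    BUCKET_NAME = "trendwork-bronze"   # hypothetical; set via terraform.tfvars in practice
    KEYWORD = "data engineer"          # search keyword for this scrape cycle


    def upload_raw_listings(listings: list[dict]) -> str:
        """Write one scrape cycle's raw JSON to GCS as an immutable object."""
        client = storage.Client()
        bucket = client.bucket(BUCKET_NAME)

        # Timestamped object name keeps every cycle's output immutable and traceable.
        ts = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
        blob = bucket.blob(f"raw/{KEYWORD.replace(' ', '_')}/{ts}.json")

        # The keyword is also attached as object metadata so the Silver processor
        # can recover it even if it is missing from the payload itself.
        blob.metadata = {"keyword": KEYWORD}
        blob.upload_from_string(json.dumps(listings), content_type="application/json")
        return blob.name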

Silver Layer (Cleaned Data)

  • A Cloud Function (Processor) is triggered by new files in Cloud Storage.
  • Performs data cleaning, normalization, and deduplication.
  • Loads structured data into BigQuery silver tables.
  • Implements a fail-safe that recovers missing fields (e.g. the search keyword) from file metadata, as sketched below.
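
The sketch below illustrates this flow under assumed names: a CloudEvent-triggered function that reads the new object, falls back to the object's keyword metadata when a listing lacks one, deduplicates by job id, and appends rows to a silver table. It is not the repo's actual processor code.

    # processor_sketch.py -- illustrative Silver-layer processor.
    # Table and field names are assumptions; the repo's actual schema may differ.
    import json

    import functions_framework
    from google.cloud import bigquery, storage

    SILVER_TABLE = "trendwork.silver.job_listings"   # hypothetical table id


    @functions_framework.cloud_event
    def process_raw_file(cloud_event):
        """Triggered by a new object in the Bronze bucket (GCS finalize event)."""
        data = cloud_event.data
        bucket_name, object_name = data["bucket"], data["name"]

        storage_client = storage.Client()
        blob = storage_client.bucket(bucket_name).get_blob(object_name)
        listings = json.loads(blob.download_as_text())

        # Fail-safe: if a listing is missing its keyword, fall back to object metadata.
        fallback_keyword = (blob.metadata or {}).get("keyword", "unknown")

        rows, seen_ids = [], set()
        for item in listings:
            job_id = item.get("id")
            if not job_id or job_id in seen_ids:     # basic deduplication
                continue
            seen_ids.add(job_id)
            rows.append({
                "job_id": job_id,
                "title": (item.get("title") or "").strip(),
                "location": (item.get("location") or "").strip(),
                "keyword": item.get("keyword") or fallback_keyword,
                "source_object": object_name,
            })

        if rows:
            bigquery.Client().insert_rows_json(SILVER_TABLE, rows)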

Gold Layer (Enriched Insights)

  • An Enrichment service uses Vertex AI (Gemini 2.0 Flash) to extract specific information (see the sketch after this list):
    • Skill requirements.
    • Salary ranges.
    • Role summaries.
  • Geocoding services convert location strings into latitude and longitude.
  • Final enriched data is stored in BigQuery gold tables for high-performance querying.
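
A hedged sketch of the Gemini extraction step using the Vertex AI Python SDK is shown below; the prompt, project id, model id, and response handling are illustrative assumptions rather than the repo's exact enrichment service.

    # enrich_sketch.py -- illustrative Gold-layer enrichment call, not the repo's code.
    # Assumes the google-cloud-aiplatform SDK (vertexai) and ADC credentials.
    import json

    import vertexai
    from vertexai.generative_models import GenerationConfig, GenerativeModel

    vertexai.init(project="my-project-id", location="us-central1")  # hypothetical values
    model = GenerativeModel("gemini-2.0-flash-001")

    PROMPT = (
        "From the job description below, return JSON with keys "
        "'skills' (list of strings), 'salary_min', 'salary_max', and 'summary'.\n\n{description}"
    )


    def enrich(description: str) -> dict:
        """Ask Gemini for structured skills, a salary range, and a short summary."""
        response = model.generate_content(
            PROMPT.format(description=description),
            generation_config=GenerationConfig(response_mime_type="application/json"),
        )
        return json.loads(response.text)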

Technical Stack

  • Infrastructure: Terraform for reproducible Infrastructure as Code.
  • Compute: Cloud Run and Cloud Run Functions (Python 3.11).
  • Storage: Google Cloud Storage and BigQuery.
  • AI/ML: Vertex AI (Gemini) for text extraction and analysis.
  • Visualization: Streamlit for interactive dashboards.
  • Observability: OpenTelemetry and Google Cloud Trace for distributed tracing (see the sketch below).
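
As one concrete example from this stack, a service can export OpenTelemetry spans to Cloud Trace via the GCP exporter. The snippet below is a minimal sketch assuming the opentelemetry-sdk and opentelemetry-exporter-gcp-trace packages; the instrumentation and span names are placeholders.

    # tracing_sketch.py -- minimal OpenTelemetry -> Cloud Trace setup (illustrative).
    from opentelemetry import trace
    from opentelemetry.exporter.cloud_trace import CloudTraceSpanExporter
    from opentelemetry.sdk.trace import TracerProvider
    from opentelemetry.sdk.trace.export import BatchSpanProcessor

    # Register a tracer provider that batches spans and ships them to Cloud Trace.
    provider = TracerProvider()
    provider.add_span_processor(BatchSpanProcessor(CloudTraceSpanExporter()))
    trace.set_tracer_provider(provider)

    tracer = trace.get_tracer("trendwork.scraper")  # hypothetical instrumentation name

    with tracer.start_as_current_span("scrape_cycle"):
        pass  # scraping / processing work happens inside the span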

Key Features

  • Automated Scraping: Bypasses modern bot detection mechanisms.
  • AI-Powered Extraction: Transforms unstructured job descriptions into structured skill lists and summaries.
  • Geographic Mapping: Visualizes job density across regions using interactive 3D maps (see the sketch after this list).
  • Real-time Analysis: Dynamic filtering by job keywords and automatic metric calculations.
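
A minimal sketch of how such a 3D density map can be rendered in Streamlit with pydeck is shown below; the sample coordinates and column names are assumptions, not the dashboard's actual schema.

    # map_sketch.py -- illustrative Streamlit + pydeck job-density map.
    # Assumes a DataFrame with 'lat' and 'lon' columns loaded from the gold table.
    import pandas as pd
    import pydeck as pdk
    import streamlit as st

    jobs = pd.DataFrame({"lat": [3.1390, 1.3521], "lon": [101.6869, 103.8198]})  # sample rows

    # HexagonLayer aggregates nearby points into extruded 3D columns of job density.
    layer = pdk.Layer(
        "HexagonLayer",
        data=jobs,
        get_position="[lon, lat]",
        radius=5000,
        elevation_scale=50,
        extruded=True,
    )
    view = pdk.ViewState(latitude=3.1390, longitude=101.6869, zoom=5, pitch=45)
    st.pydeck_chart(pdk.Deck(layers=[layer], initial_view_state=view))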

Business Questions Answered

  • Which job titles are experiencing the highest growth in demand?
  • What are the top technical skills required for specific roles in the current market? (An example query is sketched after this list.)
  • How do salary ranges vary across different geographic locations?
  • Are there emerging roles or skills appearing suddenly in recent postings?
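
Each of these questions maps onto a query over the gold tables. For instance, the skills question could be answered along the lines of the sketch below; the project, dataset, table, and column names are assumptions rather than the actual schema.

    # top_skills_sketch.py -- example gold-table query; schema names are assumptions.
    from google.cloud import bigquery

    QUERY = """
        SELECT skill, COUNT(*) AS postings
        FROM `my-project-id.trendwork_gold.job_listings`, UNNEST(skills) AS skill
        WHERE keyword = @keyword
        GROUP BY skill
        ORDER BY postings DESC
        LIMIT 20
    """

    client = bigquery.Client()
    job = client.query(
        QUERY,
        job_config=bigquery.QueryJobConfig(
            query_parameters=[bigquery.ScalarQueryParameter("keyword", "STRING", "data engineer")]
        ),
    )
    for row in job.result():
        print(row.skill, row.postings)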

Deployment Sequence

To successfully deploy the platform using Terraform, the following order must be observed:

  1. Initial Infrastructure: Configure terraform.tfvars with your project_id and bucket_name.
  2. Container Image: The dashboard service depends on a pre-existing container image. Before applying the dashboard infrastructure, build and push the image to Google Container Registry:
    cd dashboard
    docker build --platform linux/amd64 -t gcr.io/[PROJECT_ID]/jobstreet-dashboard:latest .
    docker push gcr.io/[PROJECT_ID]/jobstreet-dashboard:latest
  3. Terraform Apply: Run terraform apply to provision the Cloud Run services, BigQuery tables, and necessary IAM permissions.

Local Development

  • The dashboard can be run locally using streamlit run dashboard/app.py.
  • Ensure Google Cloud credentials are configured via gcloud auth login.
  • Secrets are managed through Streamlit secrets or environment variables (see the sketch below).
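
A small sketch of how such a lookup might work, with hypothetical key names, is shown below; the actual dashboard may organize its configuration differently.

    # config_sketch.py -- illustrative settings lookup for local dashboard runs.
    import os

    import streamlit as st


    def get_setting(name: str, default: str | None = None) -> str | None:
        """Prefer .streamlit/secrets.toml, fall back to environment variables."""
        try:
            return st.secrets[name]
        except (KeyError, FileNotFoundError):
            return os.environ.get(name, default)


    project_id = get_setting("GCP_PROJECT_ID")  # hypothetical key name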

About

Scrapes job platforms (Jobstreet for now) to obtain job insights.
