
🚀 Job Intelligence Operating System

A production-ready, autonomous job application automation system powered by AI.


📺 Demo

The full demo video shows how the system collects jobs from LinkedIn, GitHub, Naukri, YCombinator, and Wellfound, scores them with AI, and generates personalized outreach emails.

✨ What This Is

  • 🤖 Fully Autonomous - Backend automation system (no UI, no manual intervention)
  • ⏰ Runs Unattended - Schedule via cron for daily/weekly execution
  • 🧠 AI-Powered Decisions - APPLY / APPLY_LATER / WATCH / SKIP with reasoning
  • 📧 Personalized Outreach - Auto-generates tailored emails for each job
  • 🛡️ Production-Grade - Survives failures, maintains state, produces audit logs
  • 📊 Beautiful Terminal UI - Rich progress bars, tables, and analytics

🚫 What This Is NOT

  • ❌ Not an agent framework experiment
  • ❌ Not a demo or prototype
  • ❌ Not dependent on paid APIs (100% local LLM)
  • ❌ Not a spam machine (intelligent filtering + rate limiting)

🏗️ Architecture Principles

  1. Rule-based first, LLM second - Fast, deterministic logic before expensive inference
  2. Idempotent by design - Same input → same output, safe to re-run
  3. Fail gracefully - One broken source doesn't crash the pipeline
  4. Auditable - Every decision has a reason, every action is logged
  5. Modular - Each component is testable, replaceable, inspectable
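
Principle 3 can be illustrated with a small sketch. This is not the project's actual engine code; `collect_all` and `broken_source` are hypothetical names, and each collector is assumed to be a zero-argument callable returning a list of job dicts.

```python
import logging
from typing import Callable

logger = logging.getLogger("pipeline")


def collect_all(
    collectors: dict[str, Callable[[], list[dict]]],
) -> tuple[list[dict], dict[str, str]]:
    """Run every collector, isolating failures so one bad source cannot stop the rest."""
    jobs: list[dict] = []
    failures: dict[str, str] = {}
    for name, collect in collectors.items():
        try:
            jobs.extend(collect())
        except Exception as exc:  # deliberately broad: any source error is contained here
            failures[name] = repr(exc)
            logger.warning("collector %s failed: %s", name, exc)
    return jobs, failures


def broken_source() -> list[dict]:
    """Hypothetical collector that always fails."""
    raise TimeoutError("simulated network timeout")


jobs, failures = collect_all({
    "linkedin": lambda: [{"company": "Acme", "role": "Backend Engineer"}],
    "naukri": broken_source,
})
```

The failing source is recorded in `failures` (which also gives the audit trail that principle 4 asks for) while the jobs from the healthy source are still returned.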

🔄 System Flow

📥 Collect → 🧹 Clean → 🔍 Dedupe → 🎯 Enrich → 📊 Score → 🧠 Decide → 📧 Outreach → 💾 Store
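
One way to read the flow above is as a chain of functions that each take and return a list of jobs. The sketch below shows only the Clean and Dedupe stages; the stage functions, the dict-based job shape, and the dedupe key are illustrative assumptions, not the repository's real API.

```python
from functools import reduce
from typing import Callable

JobDict = dict[str, str]
Stage = Callable[[list[JobDict]], list[JobDict]]


def clean(jobs: list[JobDict]) -> list[JobDict]:
    """Strip stray whitespace from every field."""
    return [{key: value.strip() for key, value in job.items()} for job in jobs]


def dedupe(jobs: list[JobDict]) -> list[JobDict]:
    """Keep the first job seen for each (company, role, url) key, case-insensitively."""
    seen: set[tuple[str, str, str]] = set()
    unique: list[JobDict] = []
    for job in jobs:
        key = (job["company"].lower(), job["role"].lower(), job["url"])
        if key not in seen:
            seen.add(key)
            unique.append(job)
    return unique


def run_pipeline(jobs: list[JobDict], stages: list[Stage]) -> list[JobDict]:
    """Feed the output of each stage into the next one, in order."""
    return reduce(lambda acc, stage: stage(acc), stages, jobs)


raw = [
    {"company": " Acme ", "role": "Backend Engineer", "url": "https://example.com/1"},
    {"company": "acme", "role": "backend engineer ", "url": "https://example.com/1"},
    {"company": "Globex", "role": "AI Engineer", "url": "https://example.com/2"},
]
result = run_pipeline(raw, [clean, dedupe])
```

Because every stage has the same signature, later stages (enrich, score, decide) could be appended to the list without changing `run_pipeline`.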

🛠️ Tech Stack

| Component | Technology | Purpose |
| --- | --- | --- |
| Language | Python 3.11+ | Core runtime |
| LLM | Ollama (llama3.1:8b) | Local AI inference (no API costs) |
| Storage | CSV + SQLite | Primary + queryable storage |
| Scraping | httpx, BeautifulSoup, Playwright | Job data collection |
| Email | SMTP (Gmail) | Automated outreach |
| Terminal UI | Rich library | Beautiful CLI experience |
| CLI | Typer | Command-line interface |
| Scheduler | cron | Automated execution |

🎯 Key Features

Multi-Source Job Collection

  • LinkedIn - Scrapes new grad & early career positions
  • GitHub - Careers page + repos with hiring in README + issues
  • Naukri.com - India's largest job portal
  • YCombinator - Startup job board
  • Wellfound (AngelList) - Startup hiring platform

Deep Profile Analysis

  • GitHub Analysis - Analyzes all 92 repositories with AI insights
  • LinkedIn Scraping - Uses Playwright for dynamic content
  • Auto-send Hiring Alerts - Detects hiring posts and emails immediately
  • CSV Export - Complete job data export for analysis

Intelligent Decision Making

  • Rule-based Scoring - Fast filtering based on experience, tech stack, location
  • AI Reasoning - Uses Ollama for complex job description analysis
  • Match Scoring - 0-100% compatibility score with detailed reasoning
  • Smart Decisions - APPLY (75+), APPLY_LATER (50-74), WATCH (30-49), SKIP (<30)

Beautiful Terminal Reports

  • All Jobs List - Complete table with company, role, location, score, decision
  • Company Breakdown - Top 15 companies with job counts and percentages
  • Location Analysis - Top 10 locations with visual bars
  • Skills Analysis - Most in-demand (top 15) and least common (bottom 10) skills
  • Score Distribution - Excellent/Good/Fair/Poor ranges with counts
  • Summary Statistics - Average score, remote jobs, email found, unique companies

Enhanced Email Notifications

  • Job Cards with URLs - Every job shows clickable "Apply Now" button
  • AI Reasoning - "Why you should apply" for each position
  • Match Scores - Color-coded badges (high/medium/low)
  • Visual Design - Professional HTML email with gradient headers
  • Organized Sections - APPLY jobs first, then other decisions

🚀 Quick Start

# 1. Clone the repository
git clone https://github.com/algsoch/job_agentic.git
cd job_agentic

# 2. Install Ollama and pull the model
curl -fsSL https://ollama.com/install.sh | sh
ollama pull llama3.1:8b

# 3. Install Python dependencies
pip install -e .

# 4. Configure environment (create .env file)
cp .env.example .env
# Edit .env with your settings:
# - SMTP credentials (Gmail app password)
# - GitHub token (for 5000 requests/hour)
# - Job search preferences
# - Enabled collectors (linkedin,github,naukri,ycombinator,wellfound)

# 5. Dry run (no emails sent)
python3 cli.py run --dry-run

# 6. Real run (all sources)
python3 cli.py run

# 7. Check CSV output
cat jobs.csv

# 8. View detailed terminal report
# (automatically shown after pipeline completion)

📁 Directory Structure

job_agentic/
├── 🎯 core/              # Orchestration & state machine
│   ├── engine.py         # Main pipeline orchestrator
│   ├── models.py         # Data models (Job, Decision, PipelineResult)
│   └── state.py          # State management & persistence
├── 🌐 collectors/        # Job scrapers (one per source)
│   ├── base.py           # Abstract collector interface
│   ├── linkedin.py       # LinkedIn job scraper
│   ├── github.py         # GitHub careers + repos + issues
│   ├── naukri.py         # Naukri.com (India) scraper
│   ├── ycombinator.py    # YC job board
│   └── wellfound.py      # Wellfound/AngelList jobs
├── 🧠 intelligence/      # Decision engine
│   ├── rules.py          # Rule-based scoring logic
│   ├── scorer.py         # Job compatibility scoring
│   └── decider.py        # Final decision maker (APPLY/SKIP/etc)
├── 🎨 enrichment/        # Data enrichment & analysis
│   ├── email_finder.py   # Email discovery via Hunter.io/Clearbit
│   ├── profile_report.py # Deep GitHub + LinkedIn analysis
│   └── company_research.py # Company data enrichment
├── 📧 outreach/          # Email automation system
│   ├── composer.py       # Email template & personalization
│   ├── sender.py         # SMTP sending with rate limits
│   └── templates/        # Email templates (HTML/text)
├── 💾 storage/           # Persistence layer
│   ├── csv_store.py      # Primary CSV storage
│   ├── sqlite_store.py   # Queryable SQL database
│   └── backup.py         # Automated daily backups
├── 📊 observability/     # Monitoring & debugging
│   ├── logger.py         # Structured logging
│   ├── metrics.py        # Statistics tracking
│   ├── notifier.py       # Email notifications (completion/errors)
│   └── circuit_breaker.py # Failure protection
├── cli.py                # Typer CLI interface + Rich UI
├── config.py             # Configuration management
├── requirements.txt      # Python dependencies
└── .env.example          # Environment variables template
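
The tree lists collectors/base.py as an abstract collector interface with one module per source. A minimal sketch of what such an interface could look like follows; the class names, the `collect` method, and the `StaticCollector` stand-in are assumptions for illustration, not the contents of the real file.

```python
from abc import ABC, abstractmethod


class BaseCollector(ABC):
    """Hypothetical contract: every source exposes a name and a collect() method."""

    name: str = "base"

    @abstractmethod
    def collect(self) -> list[dict]:
        """Return raw job dicts scraped from this source."""


class StaticCollector(BaseCollector):
    """Stand-in source that returns canned jobs; a real collector would scrape a site."""

    name = "static"

    def __init__(self, jobs: list[dict]) -> None:
        self._jobs = list(jobs)

    def collect(self) -> list[dict]:
        # Tag each job with its source so later stages can group by origin.
        return [dict(job, source=self.name) for job in self._jobs]


collector = StaticCollector([{"company": "Acme", "role": "AI Engineer"}])
collected = collector.collect()
```

Keeping the interface this small is what lets a source be swapped out or tested in isolation, in line with the "Modular" principle.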

📊 Data Model

Each job record contains:

| Field | Type | Description |
| --- | --- | --- |
| job_id | str | Unique hash (prevents duplicates) |
| company | str | Company name |
| role | str | Job title |
| source | str | Collection source (linkedin, github, etc) |
| job_url | str | Application link |
| description | str | Job description text |
| location | str | Job location |
| salary_min/max | int | Salary range (if available) |
| email | str | Hiring manager email |
| email_confidence | float | Email validity score (0-1) |
| score | int | Match score (0-100) |
| decision | Decision | APPLY/APPLY_LATER/WATCH/SKIP |
| reason | str | AI-generated reasoning |
| status | JobStatus | NEW/ENRICHED/DECIDED/SENT |
| applied_on | datetime | Application timestamp |
| scraped_at | datetime | Collection timestamp |
| updated_at | datetime | Last update timestamp |

🧠 Decision Logic

1. Rule-Based Scoring (Fast ⚡)

score = 0
+ 30 points  # Experience match (0-3 years)
+ 20 points  # Tech stack match (Python, React, FastAPI, etc)
+ 15 points  # Location match (Bengaluru, Remote, etc)
+ 10 points  # Company stage match
+ 15 points  # Posted recently (within 7 days)
= Total Score (0-100; the components listed above add up to 90)
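
As a sketch of how these weights might be applied, the function below adds each component when its condition holds. Only the point values come from the list above; the job and preference field names and the matching conditions themselves are assumptions, not the logic in intelligence/rules.py.

```python
from datetime import date, timedelta

# Point values taken from the README; the keys are illustrative names.
WEIGHTS = {
    "experience": 30,
    "tech_stack": 20,
    "location": 15,
    "company_stage": 10,
    "recency": 15,
}


def rule_score(job: dict, prefs: dict, today: date) -> int:
    """Sum the weights of every rule the job satisfies."""
    score = 0
    if job["min_experience"] <= prefs["max_experience"]:
        score += WEIGHTS["experience"]
    if {s.lower() for s in job["skills"]} & {s.lower() for s in prefs["skills"]}:
        score += WEIGHTS["tech_stack"]
    if job["location"].lower() in {loc.lower() for loc in prefs["locations"]}:
        score += WEIGHTS["location"]
    if job["company_stage"] in prefs["company_stages"]:
        score += WEIGHTS["company_stage"]
    if (today - job["posted_on"]).days <= 7:
        score += WEIGHTS["recency"]
    return score


today = date(2025, 1, 10)  # fixed date so the example is reproducible
prefs = {
    "max_experience": 3,
    "skills": ["Python", "FastAPI"],
    "locations": ["Remote", "Bengaluru"],
    "company_stages": {"seed", "series_a"},
}
strong = {
    "min_experience": 1,
    "skills": ["python", "Docker"],
    "location": "remote",
    "company_stage": "seed",
    "posted_on": today - timedelta(days=2),
}
weak = {
    "min_experience": 8,
    "skills": ["COBOL"],
    "location": "Bengaluru",
    "company_stage": "public",
    "posted_on": today - timedelta(days=3),
}
strong_score = rule_score(strong, prefs, today)
weak_score = rule_score(weak, prefs, today)
```

A job that satisfies every rule reaches 90 with these weights, since 30 + 20 + 15 + 10 + 15 = 90.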

2. LLM Reasoning (When Needed 🤖)

  • Ambiguous job descriptions → AI analysis
  • Cultural fit assessment → Sentiment analysis
  • Email personalization → Context-aware writing
  • Profile matching → Deep skill comparison

3. Final Decision Tree

Score >= 75    → ✅ APPLY         (Send email immediately)
Score 50-74    → ⏰ APPLY_LATER   (Review manually first)
Score 30-49    → 👀 WATCH         (Monitor for changes)
Score < 30     → ⏭️ SKIP          (Not a good match)
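
The thresholds above map directly onto a small function. The function name `decide` is hypothetical; the cut-offs are the ones stated in the decision tree.

```python
def decide(score: int) -> str:
    """Map a 0-100 match score to a decision using the README's thresholds."""
    if score >= 75:
        return "APPLY"
    if score >= 50:
        return "APPLY_LATER"
    if score >= 30:
        return "WATCH"
    return "SKIP"


decisions = {score: decide(score) for score in (90, 75, 74, 50, 49, 30, 29, 0)}
```

Checking the boundary values (75, 50, 30) is worthwhile because off-by-one mistakes in threshold code are easy to make and would silently change which jobs receive an email.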

🛡️ Reliability Features

| Feature | Implementation | Benefit |
| --- | --- | --- |
| Idempotency | Job hash (MD5 of company+role+url) | No duplicate applications |
| Circuit Breakers | Pause failing sources after 3 errors | Prevent cascade failures |
| Retry Logic | Exponential backoff (1s, 2s, 4s, 8s) | Handle transient errors |
| Partial Failures | Continue pipeline if one step fails | Maximize job collection |
| Daily Backups | Automated CSV snapshots to backups/ | Data loss prevention |
| Rate Limiting | Max 10 emails/hour (Gmail limits) | Avoid spam filters |
| Error Logging | Structured logs with traceback | Easy debugging |
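
Two rows of this table, retry with exponential backoff and the circuit breaker, can be sketched as below. The delays (1s, 2s, 4s, 8s) and the three-error threshold come from the table; the function and class names, the injectable `sleep` parameter, and the reset-on-success behavior are assumptions and may differ from observability/circuit_breaker.py.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    func: Callable[[], T],
    delays: tuple[float, ...] = (1, 2, 4, 8),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func; after each failure wait the next delay, then make one final attempt."""
    for delay in delays:
        try:
            return func()
        except Exception:
            sleep(delay)
    return func()  # last attempt: any exception now propagates to the caller


class CircuitBreaker:
    """Stop calling a source once it has failed max_failures times in a row."""

    def __init__(self, max_failures: int = 3) -> None:
        self.max_failures = max_failures
        self.failures = 0

    @property
    def is_open(self) -> bool:
        return self.failures >= self.max_failures

    def call(self, func: Callable[[], T]) -> T:
        if self.is_open:
            raise RuntimeError("circuit open: source paused")
        try:
            result = func()
        except Exception:
            self.failures += 1
            raise
        self.failures = 0  # a success resets the consecutive-failure count
        return result


# Demo: a source that fails twice, then succeeds. Sleeps are recorded, not performed.
attempts = {"count": 0}


def flaky() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("transient error")
    return "ok"


waits: list[float] = []
outcome = retry_with_backoff(flaky, sleep=waits.append)


# Demo: a source that is always down trips the breaker after three errors.
def always_down() -> str:
    raise ConnectionError("source down")


breaker = CircuitBreaker(max_failures=3)
for _ in range(3):
    try:
        breaker.call(always_down)
    except ConnectionError:
        pass
```

Passing `sleep` in as a parameter keeps the retry logic testable without real delays.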

⏰ Cron Automation

Daily Job Search (Recommended)

# Run daily at 9 AM
0 9 * * * cd /path/to/job_agentic && /path/to/venv/bin/python3 cli.py run >> /var/log/jobctl.log 2>&1

Weekly Analytics Report

# Weekly summary (Sundays at 10 AM)
0 10 * * 0 cd /path/to/job_agentic && /path/to/venv/bin/python3 cli.py stats --last 7d

Profile Updates (Every 6 Hours)

# Update GitHub/LinkedIn profile every 6 hours
0 */6 * * * cd /path/to/job_agentic && /path/to/venv/bin/python3 cli.py analyze-profile

✅ Production Checklist

Before running in production, ensure:

  • Ollama Installed - ollama list shows llama3.1:8b
  • Gmail App Password - Created and added to .env
  • GitHub Token - Personal access token for 5000 requests/hour
  • Environment Variables - .env file configured with all settings
  • Playwright Browser - playwright install chromium (for LinkedIn)
  • Test Run Completed - python3 cli.py run --dry-run (no errors)
  • Email Sending Tested - Verify emails reach inbox (not spam)
  • Cron Job Scheduled - Automated daily execution configured
  • Log Rotation Configured - Prevent disk space issues
  • Backup Location Verified - backups/ directory accessible
  • Resume File Present - resume.pdf in project root

📈 Monitoring & Analytics

Terminal Report (Auto-generated)

After each run, the system displays:

  • All Jobs Table - Complete list with scores and decisions
  • Company Breakdown - Top 15 hiring companies with percentages
  • Location Analysis - Top 10 locations with visual distribution
  • Skills Demand - Most/least in-demand technologies
  • Score Distribution - Job quality breakdown (Excellent/Good/Fair/Poor)
  • Summary Stats - Avg score, remote jobs, unique companies

Key Metrics to Track

jobs_scraped_per_source     # Collection efficiency
decision_distribution       # APPLY/SKIP ratio
email_send_success_rate     # Outreach effectiveness
source_failure_rate         # Scraper health
pipeline_execution_time     # Performance monitoring
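
Several of these metrics can be derived from the stored job records with a few counters. The sketch below assumes each record is a dict with `source`, `decision`, and `status` keys, and that `email_sent_ok` is a hypothetical boolean flag; none of these names are confirmed by the project's metrics module.

```python
from collections import Counter


def summarize(jobs: list[dict]) -> dict:
    """Compute per-source counts, decision counts, and the email success rate."""
    attempted = [job for job in jobs if job.get("status") == "SENT" or job.get("email_sent_ok") is False]
    succeeded = [job for job in attempted if job.get("email_sent_ok", True)]
    return {
        "jobs_scraped_per_source": dict(Counter(job["source"] for job in jobs)),
        "decision_distribution": dict(Counter(job["decision"] for job in jobs)),
        "email_send_success_rate": len(succeeded) / len(attempted) if attempted else None,
    }


records = [
    {"source": "linkedin", "decision": "APPLY", "status": "SENT"},
    {"source": "linkedin", "decision": "SKIP", "status": "DECIDED"},
    {"source": "github", "decision": "APPLY", "status": "DECIDED", "email_sent_ok": False},
    {"source": "github", "decision": "WATCH", "status": "DECIDED"},
]
summary = summarize(records)
```

Returning `None` for the success rate when no email was attempted avoids reporting a misleading 0% or 100% on dry runs.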

Log Files

  • logs/jobctl.log - Main application log
  • logs/errors.log - Error-only log
  • logs/email_sent.log - Outreach audit trail

🔧 Configuration

Environment Variables (.env)

# SMTP Email Configuration
SMTP_HOST=smtp.gmail.com
SMTP_PORT=587
SMTP_USERNAME=your-email@gmail.com
SMTP_PASSWORD=your-app-password
EMAIL_FROM=your-email@gmail.com
EMAIL_FROM_NAME=Your Name

# GitHub API (for 5000 requests/hour)
GITHUB_TOKEN=ghp_your_personal_access_token

# Job Search Preferences
TARGET_ROLES=Software Engineer,Backend Engineer,AI Engineer
TARGET_LOCATIONS=Remote,Bengaluru,San Francisco
MIN_EXPERIENCE=0
MAX_EXPERIENCE=3
REQUIRED_SKILLS=Python,FastAPI,React,LangChain

# Enabled Collectors (comma-separated)
ENABLED_COLLECTORS=linkedin,github,naukri,ycombinator,wellfound

# Profile Information
YOUR_NAME=Vicky Kumar
YOUR_GITHUB=https://github.com/algsoch
YOUR_LINKEDIN=https://www.linkedin.com/in/algsoch/
YOUR_PORTFOLIO=https://ai-engineer-chatbot.onrender.com/
RESUME_PATH=./resume.pdf
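
A sketch of how these variables might be read in Python follows. It uses only the standard library and assumes the values are already present in the process environment (for example, exported by the shell or loaded by a dotenv tool); `load_settings`, `csv_list`, and the fallback defaults are illustrative and are not taken from config.py.

```python
import os
from typing import Mapping


def csv_list(value: str) -> list[str]:
    """Split a comma-separated setting into trimmed, non-empty items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def load_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Read a subset of the settings shown above, converting types where needed."""
    return {
        "smtp_host": env.get("SMTP_HOST", "smtp.gmail.com"),
        "smtp_port": int(env.get("SMTP_PORT", "587")),
        "target_roles": csv_list(env.get("TARGET_ROLES", "")),
        "target_locations": csv_list(env.get("TARGET_LOCATIONS", "")),
        "min_experience": int(env.get("MIN_EXPERIENCE", "0")),
        "max_experience": int(env.get("MAX_EXPERIENCE", "3")),
        "enabled_collectors": csv_list(env.get("ENABLED_COLLECTORS", "")),
    }


settings = load_settings({
    "SMTP_PORT": "587",
    "TARGET_ROLES": "Software Engineer, Backend Engineer,AI Engineer",
    "ENABLED_COLLECTORS": "linkedin,github,,naukri",
    "MAX_EXPERIENCE": "3",
})
```

Accepting any mapping rather than reading `os.environ` directly makes the parsing easy to exercise with a plain dict, as in the example call.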

🎨 Sample Output

Pipeline Execution

╭─────────────────────────────────────────╮
│  🚀 JOB INTELLIGENCE OPERATING SYSTEM   │
│     Autonomous Job Application Agent    │
╰─────────────────────────────────────────╯

🔍 COLLECTING JOBS FROM ALL SOURCES

┌─ LinkedIn Collector ─────────────────┐
│ Found 35 jobs from LinkedIn          │
│ ✓ Software Engineering, New Grad     │
│ ✓ Backend Engineer - Early Career    │
│ ✓ Full Stack Developer (0-2 years)   │
└──────────────────────────────────────┘

┌─ GitHub Collector ───────────────────┐
│ ✓ Scraping GitHub Careers Page       │
│ ✓ Searching repos with hiring        │
│ ✓ Searching GitHub issues            │
│ Found 12 jobs from GitHub            │
└──────────────────────────────────────┘

📊 Pipeline Complete in 69.1s

╭─ RESULTS ─────────────────────────╮
│ Jobs Collected:      35           │
│ Emails Sent:         0            │
│ Decisions Made:                   │
│   ✅ APPLY:          15           │
│   ⏭️ SKIP:           20           │
│ Success Rate:        42.9%        │
╰───────────────────────────────────╯

📋 ALL JOBS FOUND (35 total)

┌───┬─────────────┬──────────────────────────┬────────────┬──────┬──────────┐
│ # │ Company     │ Role                     │ Location   │ Score│ Decision │
├───┼─────────────┼──────────────────────────┼────────────┼──────┼──────────┤
│ 1 │ Stripe      │ Software Engineer, New.. │ Bengaluru  │  55  │ ✅ APPLY │
│ 2 │ Notion      │ Fullstack Early Career   │ Remote     │  45  │ ✅ APPLY │
│ 3 │ Clear       │ Backend Engineer         │ New York   │  14  │ ⏭️ SKIP  │
└───┴─────────────┴──────────────────────────┴────────────┴──────┴──────────┘

🏢 COMPANY BREAKDOWN (Top 15)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Stripe          3 jobs (8.6%)  ████████
Notion          2 jobs (5.7%)  █████
GitHub          2 jobs (5.7%)  █████

🚀 Future Enhancements

  • More Job Sources - Indeed, Glassdoor, Stack Overflow Jobs
  • Application Tracking - Monitor responses, interviews, rejections
  • A/B Testing - Test different email templates and measure success
  • Telegram/Slack Notifications - Real-time alerts for high-priority matches
  • ML-Based Scoring - Train model on historical application outcomes
  • Resume Tailoring - Auto-generate customized resumes per job
  • Interview Prep - AI-generated company/role-specific prep materials
  • Salary Negotiation - Data-driven compensation recommendations

📜 License

MIT License - See LICENSE file for details

🤝 Contributing

Contributions welcome! Please:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

💬 Support

🌟 Acknowledgments

Built with:


💭 Philosophy

"This system should run for 6 months without human intervention, make intelligent decisions, and never embarrass you with spam."

Built for reliability, not novelty. Production-ready, not prototype.


⭐ Star this repo if you find it useful!

Made with ❤️ by Vicky Kumar
