Argus

A Python-based job search agent that automatically crawls company career pages to find relevant job listings. It supports multiple Applicant Tracking Systems (ATS) and per-user profiles, and provides flexible filtering options.

Features

- Multi-ATS Support: Automatically detects and crawls jobs from:
  - Greenhouse
  - Lever
  - Ashby
  - Workday
  - Amazon (custom API)
  - Google (custom fetcher)
  - TikTok (custom API)
  - Uber (custom API)
  - Custom career pages (via Playwright)
- User Profiles: Support for multiple users with different job preferences
  - Each profile has its own titles, locations, and filters
  - Results are stored separately per profile
  - Optionally customize company lists per profile
- Smart Filtering:
  - Filter by job titles (with fuzzy matching)
  - Filter by location (states, cities, remote)
  - Exclude specific levels (staff, principal, lead, etc.)
- Auto-Detection: Automatically detects ATS type from career URLs and finds direct API endpoints
- Incremental Results: Saves results organized by date and company, avoiding duplicates across runs
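
To illustrate what fuzzy title matching can look like, here is a minimal sketch built on the standard library's difflib. The function names and the 0.8 threshold are assumptions made for this example; the actual logic in filter.py may differ.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so comparisons ignore formatting."""
    return " ".join(text.lower().split())


def title_matches(job_title: str, targets: list[str], threshold: float = 0.8) -> bool:
    """Return True if job_title contains, or closely resembles, any target title."""
    job = normalize(job_title)
    for target in targets:
        wanted = normalize(target)
        # A target that appears verbatim inside a longer title counts as a match.
        if wanted in job:
            return True
        # Otherwise fall back to a similarity ratio to tolerate small differences.
        if SequenceMatcher(None, job, wanted).ratio() >= threshold:
            return True
    return False
```

With this sketch, "Senior Machine Learning Engineer, Search" matches the target "Machine Learning Engineer" by containment, while a misspelled "Machine Learning Enginer" matches through the similarity ratio.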

Installation

# Clone the repository
git clone https://github.com/mshen1019/Argus.git
cd Argus

# (Optional) Create and activate a virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Install Playwright browsers (for custom ATS sites)
python -m playwright install chromium

Quick Start

# Run with the default profile
python run_search.py

# Run with a specific profile
python run_search.py alice

This will:

Load the profile configuration from config/profiles/<name>/
Load companies from the profile or fall back to config/companies.yaml
Crawl all companies and save matching jobs to job_results/<profile>/
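
The fallback described above (profile company list first, global list otherwise) can be sketched as follows. The function name `resolve_config` is hypothetical and not part of Argus; only the file layout comes from this README.

```python
from pathlib import Path


def resolve_config(profile: str, config_dir: Path = Path("config")) -> tuple[Path, Path]:
    """Return (titles_path, companies_path) for a profile.

    The profile's own companies.yaml wins; otherwise the global list is used.
    """
    profile_dir = config_dir / "profiles" / profile
    titles = profile_dir / "titles.yaml"
    if not titles.is_file():
        raise FileNotFoundError(f"profile {profile!r} has no titles.yaml at {titles}")
    own_companies = profile_dir / "companies.yaml"
    companies = own_companies if own_companies.is_file() else config_dir / "companies.yaml"
    return titles, companies
```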

User Profiles

Profiles allow multiple users to have different job search preferences. Each profile is a directory under config/profiles/ containing a titles.yaml file and, optionally, its own companies.yaml.

Profile Structure

config/
├── companies.yaml              # Global company list (shared by all profiles)
└── profiles/
    ├── default/
    │   └── titles.yaml         # Default user preferences
    ├── alice/
    │   ├── titles.yaml         # Alice's job preferences
    │   └── companies.yaml      # (Optional) Alice's custom company list
    └── bob/
        └── titles.yaml         # Bob's job preferences

Creating a New Profile

Create a new directory under config/profiles/:
```
mkdir config/profiles/myprofile
```

Create a titles.yaml with your preferences:

titles:
  - Data Scientist
  - Machine Learning Engineer
  - Research Scientist

locations:
  - California
  - New York
  - Remote

exclude_levels:
  - junior
  - intern

(Optional) Create a companies.yaml in the same directory if this profile should search a different set of companies.

Run your search:

python run_search.py myprofile
# or
python search.py --profile myprofile
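
If you create profiles often, the manual steps above can be scripted. The sketch below is an illustration only: `create_profile` and `TITLES_TEMPLATE` are hypothetical names, and the template values are placeholders taken from the example in this section.

```python
from pathlib import Path

# Starter content for a new profile; edit the values after creation.
TITLES_TEMPLATE = """\
titles:
  - Data Scientist

locations:
  - Remote

exclude_levels:
  - intern
"""


def create_profile(name: str, config_dir: Path = Path("config")) -> Path:
    """Create config/profiles/<name>/titles.yaml unless it already exists."""
    profile_dir = config_dir / "profiles" / name
    profile_dir.mkdir(parents=True, exist_ok=True)
    titles = profile_dir / "titles.yaml"
    # Never overwrite a titles.yaml that the user has already edited.
    if not titles.exists():
        titles.write_text(TITLES_TEMPLATE, encoding="utf-8")
    return titles
```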

Profile Output

Results are stored separately for each profile:

job_results/
├── default/
│   └── 2026-01-25/
│       ├── OpenAI/
│       │   └── jobs.json
│       └── ...
├── alice/
│   └── 2026-01-25/
│       └── ...
└── bob/
    └── 2026-01-25/
        └── ...
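
Because run directories are named with ISO dates, they sort chronologically as plain strings. The following sketch shows how the layout above could be built and queried; `results_dir` and `latest_run` are hypothetical helpers, not functions from store.py.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def results_dir(profile: str, company: str, run_date: Optional[date] = None,
                root: Path = Path("job_results")) -> Path:
    """Build job_results/<profile>/<YYYY-MM-DD>/<company>/ for one run."""
    run_date = run_date or date.today()
    return root / profile / run_date.isoformat() / company


def latest_run(profile: str, root: Path = Path("job_results")) -> Optional[Path]:
    """Return the most recent dated directory for a profile, or None if there is none."""
    profile_dir = root / profile
    if not profile_dir.is_dir():
        return None
    # ISO dates (YYYY-MM-DD) sort chronologically when compared as strings.
    runs = sorted((p for p in profile_dir.iterdir() if p.is_dir()), key=lambda p: p.name)
    return runs[-1] if runs else None
```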

Configuration

Companies (`config/companies.yaml`)

Define the companies to crawl:

companies:
  - name: OpenAI
    career_url: https://jobs.ashbyhq.com/openai
    ats_type: ashby

  - name: Anthropic
    career_url: https://boards.greenhouse.io/anthropic
    ats_type: greenhouse

  - name: Amazon
    career_url: https://www.amazon.jobs
    ats_type: amazon

  - name: Google
    career_url: https://careers.google.com/jobs/results/
    ats_type: google

Supported ATS types:

greenhouse - Greenhouse.io job boards
lever - Lever.co job boards
ashby - Ashby HQ job boards
workday - Workday job sites
amazon - Amazon.jobs (custom API)
google - Google Careers (custom fetcher)
tiktok - TikTok/ByteDance careers (custom API)
uber - Uber careers (custom API)
meta - Meta careers (limited due to bot detection)
custom - Custom career pages (uses Playwright)
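
A quick sanity check of a company list against the types above might look like the sketch below. It assumes the YAML has already been parsed into a dict (for example with PyYAML's `yaml.safe_load`); `validate_companies` is a hypothetical helper, and accepting `unknown` as a placeholder is an assumption based on the "Adding New Companies" section.

```python
SUPPORTED_ATS = {
    "greenhouse", "lever", "ashby", "workday", "amazon",
    "google", "tiktok", "uber", "meta", "custom",
}


def validate_companies(config: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems = []
    for i, entry in enumerate(config.get("companies", [])):
        label = entry.get("name") or f"entry #{i + 1}"
        for key in ("name", "career_url", "ats_type"):
            if not entry.get(key):
                problems.append(f"{label}: missing {key}")
        ats = entry.get("ats_type")
        # "unknown" is tolerated as a placeholder to be fixed later (assumption).
        if ats and ats not in SUPPORTED_ATS and ats != "unknown":
            problems.append(f"{label}: unsupported ats_type {ats!r}")
    return problems
```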

Job Titles & Filters (`config/profiles/<name>/titles.yaml`)

Configure target job titles, locations, and exclusions:

titles:
  - Machine Learning Engineer
  - Senior Machine Learning Engineer
  - Research Scientist
  - Applied Scientist

locations:
  - California
  - Remote

# Exclude specific seniority levels
exclude_levels:
  - staff
  - principal

Available exclusion levels:

staff - Staff-level positions
principal - Principal-level positions
lead - Lead roles
director - Director-level positions
manager - Manager roles
head - Head of department roles
vp - VP-level positions
junior - Junior/Associate positions
intern - Internship positions
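
One way to apply these exclusions is whole-word matching on the job title, so that "intern" does not accidentally exclude "Internal Tools". The sketch below is illustrative: the keyword lists in `LEVEL_KEYWORDS` and the function `is_excluded` are assumptions, not the lists Argus actually uses.

```python
import re

# Keywords per level are illustrative assumptions, not Argus's actual lists.
LEVEL_KEYWORDS = {
    "staff": ["staff"],
    "principal": ["principal"],
    "lead": ["lead"],
    "director": ["director"],
    "manager": ["manager"],
    "head": ["head of"],
    "vp": ["vp", "vice president"],
    "junior": ["junior", "jr", "associate"],
    "intern": ["intern", "internship"],
}


def is_excluded(title: str, exclude_levels: list[str]) -> bool:
    """True if the title contains a keyword for any excluded level as a whole word."""
    for level in exclude_levels:
        # Unknown level names fall back to matching the name itself.
        for keyword in LEVEL_KEYWORDS.get(level, [level]):
            if re.search(rf"\b{re.escape(keyword)}\b", title, flags=re.IGNORECASE):
                return True
    return False
```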

CLI Usage

For more control, use the CLI directly:

# Using profiles (recommended)
python search.py --profile default
python search.py --profile alice --timeout 60

# Using explicit config files
python search.py \
  --companies config/companies.yaml \
  --titles config/profiles/default/titles.yaml \
  --output job_results/custom

Options:

-p, --profile - Profile name (loads from config/profiles/<name>/)
-c, --companies - Path to companies YAML file (overrides profile)
-t, --titles - Path to job titles YAML file (overrides profile)
-o, --output - Output directory for results
--timeout - Request timeout in seconds (default: 30)
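
For reference, the options above map naturally onto argparse. This is a sketch rather than the contents of search.py: the integer type for `--timeout`, the help strings, and the fallback to the `default` profile in `titles_path` are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented options."""
    parser = argparse.ArgumentParser(description="Search company career pages for jobs")
    parser.add_argument("-p", "--profile", help="profile name under config/profiles/")
    parser.add_argument("-c", "--companies", help="path to a companies YAML file")
    parser.add_argument("-t", "--titles", help="path to a job titles YAML file")
    parser.add_argument("-o", "--output", help="output directory for results")
    parser.add_argument("--timeout", type=int, default=30, help="request timeout in seconds")
    return parser


def titles_path(args: argparse.Namespace) -> str:
    """An explicit --titles path overrides the profile's titles.yaml."""
    if args.titles:
        return args.titles
    profile = args.profile or "default"  # assumed fallback when no profile is given
    return f"config/profiles/{profile}/titles.yaml"
```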

Output

Results are organized by profile, then by date, then by company:

job_results/
└── default/
    └── 2026-01-25/
        ├── OpenAI/
        │   ├── jobs.json
        │   └── jobs.csv
        ├── Anthropic/
        │   ├── jobs.json
        │   └── jobs.csv
        └── ...

Each jobs.json contains a list of job records:

[
  {
    "company": "OpenAI",
    "title": "Machine Learning Engineer",
    "url": "https://jobs.ashbyhq.com/openai/abc123",
    "location": "San Francisco, CA",
    "team": "Applied AI",
    "source": "ashby",
    "discovered_at": "2026-01-25T10:30:00"
  }
]
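
Avoiding duplicates across runs can be done by remembering which job URLs were already saved. The sketch below assumes the URL is the deduplication key and uses the record fields shown above; `seen_urls` and `save_new_jobs` are hypothetical names, and store.py may work differently.

```python
import csv
import json
from pathlib import Path

FIELDS = ["company", "title", "url", "location", "team", "source", "discovered_at"]


def seen_urls(profile_dir: Path) -> set[str]:
    """Collect the URLs of every job saved in earlier runs for this profile."""
    urls = set()
    # Layout: <profile_dir>/<date>/<company>/jobs.json
    for path in profile_dir.glob("*/*/jobs.json"):
        for job in json.loads(path.read_text(encoding="utf-8")):
            urls.add(job["url"])
    return urls


def save_new_jobs(jobs: list[dict], out_dir: Path, already_seen: set[str]) -> list[dict]:
    """Write only unseen jobs to out_dir/jobs.json and jobs.csv; return what was written."""
    fresh = [job for job in jobs if job["url"] not in already_seen]
    if not fresh:
        return []
    out_dir.mkdir(parents=True, exist_ok=True)
    (out_dir / "jobs.json").write_text(json.dumps(fresh, indent=2), encoding="utf-8")
    with open(out_dir / "jobs.csv", "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(fresh)
    return fresh
```

Keying on the URL is a simple choice; it would treat a reposted job with a new URL as a new listing.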

Tools

Fix ATS Configuration

Automatically detect and fix incorrect ATS types and career URLs:

python fix_ats_config.py

This will:

Validate each company's career URL
Auto-detect the correct ATS type
Find direct ATS URLs when companies use embedded job boards
Update config/companies.yaml with corrections
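
The detection step can be approximated by looking at the hostname of the career URL. This sketch is not the logic in detector.py: the Greenhouse, Lever, Ashby, and Amazon hosts come from the example URLs in this README, the Workday suffix and the `custom` fallback are assumptions, and `detect_ats` is a hypothetical name.

```python
from urllib.parse import urlparse

# Hostname suffix -> ATS type. The Workday suffix is an assumption.
HOST_RULES = [
    ("greenhouse.io", "greenhouse"),
    ("lever.co", "lever"),
    ("ashbyhq.com", "ashby"),
    ("myworkdayjobs.com", "workday"),
    ("amazon.jobs", "amazon"),
]


def detect_ats(career_url: str) -> str:
    """Guess the ATS type from the URL's hostname; fall back to 'custom'."""
    host = (urlparse(career_url).hostname or "").lower()
    for suffix, ats_type in HOST_RULES:
        # Match the domain itself or any subdomain of it.
        if host == suffix or host.endswith("." + suffix):
            return ats_type
    return "custom"
```

Hostname matching alone cannot find job boards embedded in a company's own domain, which is why a real detector would also need to inspect the page itself.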

Investigate Unverified Companies

For companies that could not be verified automatically, run:

python investigate_unverified.py

Project Structure

Argus/
├── config/
│   ├── companies.yaml          # Global company list
│   └── profiles/               # User profiles
│       ├── default/
│       │   └── titles.yaml
│       ├── Ming/
│       │   └── titles.yaml
│       └── Yaxi/
│           └── titles.yaml
├── Argus/                      # Main package
│   ├── orchestrator.py         # Main orchestration logic
│   ├── filter.py               # Job title/location filtering
│   ├── store.py                # Job persistence
│   ├── registry.py             # Company registry management
│   ├── models.py               # Data models
│   └── ats/                    # ATS-specific adapters
│       ├── greenhouse.py
│       ├── lever.py
│       ├── ashby.py
│       ├── workday.py
│       ├── amazon.py           # Amazon.jobs API
│       ├── google.py           # Google Careers
│       ├── tiktok.py           # TikTok/ByteDance
│       ├── uber.py             # Uber Careers
│       ├── generic.py          # Playwright-based fallback
│       └── detector.py         # ATS auto-detection
├── job_results/                # Output directory
│   ├── default/                # Results for default profile
│   └── <profile>/              # Results for other profiles
├── run_search.py               # Quick runner script
├── search.py                   # CLI entry point
├── fix_ats_config.py           # ATS config fixer tool
├── investigate_unverified.py   # Investigates companies that could not be auto-verified
└── requirements.txt            # Dependencies

Adding New Companies

Find the company's career page URL
Add to config/companies.yaml:

  - name: New Company
    career_url: https://jobs.lever.co/newcompany
    ats_type: lever

If you are unsure of the ATS type, set ats_type to unknown and run fix_ats_config.py.

Supported Companies

The default configuration includes 50+ tech companies:

AI Labs: OpenAI, Anthropic, DeepMind, Mistral AI, Cohere, xAI, Perplexity AI
Big Tech: Google, Meta, Apple, Microsoft, Amazon
Finance: Stripe, Block, Coinbase, Plaid, Brex
Rideshare: Uber, Lyft
Social: TikTok, Pinterest, LinkedIn
And many more...

Requirements

Python 3.9+
httpx
playwright
pyyaml

Usage & Responsibility

This project is intended for personal job search and small-scale use.

Users are responsible for ensuring that their use of this tool complies with the terms of service of the websites they access and with applicable laws and regulations.

This tool performs read-only access to publicly available job postings. It does not automate job applications, form submissions, or authentication flows.

The author does not operate any centralized crawling service and does not collect or store user data.

Contact

For questions, suggestions, or issues, feel free to reach out:

Email: mshen1019@gmail.com
GitHub Issues: https://github.com/mshen1019/Argus/issues

License

MIT
