Mcamin/JobScraper

JobScraper API

A FastAPI microservice that scrapes job listings using JobSpy, persists them in MySQL, and exposes a REST API for querying.
Dependencies and the virtual environment are managed with Poetry. JobScraper is used with an n8n workflow to automate the job application process.


🚀 Features

  • POST /scrape → Run job scraping and persist results
  • GET /jobs → Query stored job postings with filters & pagination
  • GET /jobs/{id} → Fetch individual job
  • Logging (Loguru)
  • Alembic migrations
  • MySQL database
  • Poetry-based dependency management
  • Docker & Docker Compose support
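
A client calling the endpoints above mostly needs to build the right URLs. The following is a minimal sketch of that, not the project's own client. The base URL is taken from the health check example further down; the query parameter names (`search`, `limit`, `offset`) are hypothetical placeholders, so check the generated API docs for the real filter names.

```python
from urllib.parse import urlencode, urljoin

# Assumption: the API listens on the same host/port as the health check example.
BASE_URL = "http://localhost:8000"


def jobs_url(base_url: str = BASE_URL, **filters) -> str:
    """Build a GET /jobs URL, dropping any filter whose value is None."""
    params = {key: value for key, value in filters.items() if value is not None}
    url = urljoin(base_url.rstrip("/") + "/", "jobs")
    return f"{url}?{urlencode(params)}" if params else url


def job_url(job_id: int, base_url: str = BASE_URL) -> str:
    """Build a GET /jobs/{id} URL for a single stored job."""
    return urljoin(base_url.rstrip("/") + "/", f"jobs/{job_id}")


if __name__ == "__main__":
    # Hypothetical filter names, for illustration only.
    print(jobs_url(search="python developer", limit=20, offset=None))
    print(job_url(42))
```

The resulting URLs can be passed to any HTTP client (`urllib.request`, `requests`, `httpx`, or an n8n HTTP Request node).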

⚙️ Local Development (with Poetry)

  1. Install Poetry

    curl -sSL https://install.python-poetry.org | python3 -
    export PATH="$HOME/.local/bin:$PATH"
  2. Install dependencies

    poetry install
  3. Run migrations

    poetry run alembic upgrade head
  4. Start the API

    poetry run uvicorn app.main:app --reload
  5. Visit docs


🐳 Docker Deployment

  1. Build the container

    docker compose build
  2. Run

    docker compose up
  3. Apply migrations in container

    docker compose exec api poetry run alembic upgrade head

🧰 Environment Variables (.env)

Example:

APP_NAME=JobScraper API
APP_ENV=dev
DB_HOST=db
DB_PORT=3306
DB_USER=jobs
DB_PASSWORD=jobs_pw
DB_NAME=jobsdb
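
For orientation, here is one way these variables could be combined into a database URL. This is a sketch under assumptions, not the contents of `app/config.py`: it assumes SQLAlchemy with the PyMySQL driver (hence the `mysql+pymysql` scheme), and `build_db_url` is a hypothetical helper name.

```python
import os
from urllib.parse import quote_plus


def build_db_url(env=os.environ) -> str:
    """Assemble a SQLAlchemy-style MySQL URL from the DB_* variables.

    Assumption: the PyMySQL driver is used; swap the scheme if the
    project relies on a different driver.
    """
    # URL-encode credentials so characters such as '@' or ':' do not break the URL.
    user = quote_plus(env["DB_USER"])
    password = quote_plus(env["DB_PASSWORD"])
    host = env.get("DB_HOST", "db")
    port = int(env.get("DB_PORT", "3306"))
    name = env["DB_NAME"]
    return f"mysql+pymysql://{user}:{password}@{host}:{port}/{name}"


if __name__ == "__main__":
    example = {
        "DB_HOST": "db",
        "DB_PORT": "3306",
        "DB_USER": "jobs",
        "DB_PASSWORD": "jobs_pw",
        "DB_NAME": "jobsdb",
    }
    print(build_db_url(example))
```

Note that `DB_HOST=db` resolves only inside the Docker Compose network; when running the API directly on the host, it would typically point at `localhost` or wherever MySQL is reachable.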

🧠 Common Commands

Task                 Command
Add new dependency   poetry add <package>
Add dev dependency   poetry add --group dev <package>
Remove dependency    poetry remove <package>
Run migrations       poetry run alembic upgrade head
Start dev server     poetry run uvicorn app.main:app --reload
Run tests            poetry run pytest

📦 Project Structure

app/
├─ main.py
├─ models.py
├─ crud.py
├─ schemas.py
├─ db.py
├─ config.py
├─ scraper.py
docs/
├─ openapi.yaml
migrations/
├─ env.py
├─ versions/
tests/
├─ test_smoke.py

🧩 Alembic Migrations

Alembic is already configured for autogeneration based on app.models.

Generate new migration

poetry run alembic revision --autogenerate -m "add new columns"

Apply migrations

poetry run alembic upgrade head

✅ Health Check

curl http://localhost:8000/health

Response:

{"status": "ok"}
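
In scripts (for example, before triggering a scrape from a workflow) it can help to wait until the health check passes. The sketch below polls the endpoint with the standard library only. The function names, retry count, and delay are illustrative choices rather than part of the project; the URL and the expected `{"status": "ok"}` body come from the example above.

```python
import json
import time
import urllib.request
from typing import Callable

HEALTH_URL = "http://localhost:8000/health"


def fetch_health(url: str = HEALTH_URL, timeout: float = 2.0) -> str:
    """Return the raw response body of the health endpoint."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def is_healthy(body: str) -> bool:
    """True only if the body is a JSON object with status == "ok"."""
    try:
        return json.loads(body).get("status") == "ok"
    except (ValueError, AttributeError):
        # Not valid JSON, or valid JSON that is not an object.
        return False


def wait_until_healthy(
    fetch: Callable[[], str] = fetch_health,
    attempts: int = 10,
    delay: float = 1.0,
) -> bool:
    """Poll until the service reports healthy or the attempts run out."""
    for attempt in range(attempts):
        try:
            if is_healthy(fetch()):
                return True
        except OSError:
            # Connection refused, timeout, HTTP error: the service is not ready yet.
            pass
        if attempt < attempts - 1:
            time.sleep(delay)
    return False


if __name__ == "__main__":
    print("healthy" if wait_until_healthy() else "not reachable")
```

Passing `fetch` as a parameter keeps the polling logic testable without a running server.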

🧹 Cleaning Up

docker compose down -v

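Stops and removes the containers and networks. The -v flag also removes the volumes declared in the Compose file, so any data stored in them is lost.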
