@clyrisai/gitresolve

Resolve candidate portfolios, resumes, and URLs into GitHub/GitLab/Bitbucket profiles and repos.

🚀 Enterprise ATS Integrations (Lever, Greenhouse, Ashby)

Are you processing massive CSV dumps from your ATS with custom columns? If you need native zero-configuration mapping for your ATS platform, please Open an Issue and let us know! We are prioritizing specific integrations based on user demand.

Install

GitResolve can be run instantly via npx, or installed globally for permanent CLI use.

Option 1: Zero-Install (On-Demand)

npx @clyrisai/gitresolve --help

Option 2: Global Install (Recommended for frequent use)

Install globally to use the gitresolve command anywhere. You can also install the puppeteer peer dependency at the same time to permanently enable JavaScript SPA rendering:

# Basic global install (fastest)
npm install -g @clyrisai/gitresolve

# Global install WITH JavaScript rendering support
npm install -g @clyrisai/gitresolve puppeteer

Usage

gitresolve [url] [options]

Arguments:
  [url]                     Direct URL to process (portfolio, profile, repo, or resume)

Options:

  General:
    --output-dir <dir>        Write results to a directory organized by candidate (keeps terminal clean)
    --json                    Output raw JSON to stdout
    -V, --version             Output version number
    -h, --help                Display help

  Single URL:
    --type <type>             Hint the type of URL: 'portfolio' or 'resume'

  Batch Processing:
    --all                     Process both portfolios and resumes
    --portfolios              Process portfolio links from CSV
    --resumes                 Process resume PDFs from directory
    --portfolio-csv <path>    Path to portfolio CSV file (default: ./portfolio_links.csv)
    --resumes-dir <path>      Path to resumes directory (default: ./resumes)

  Browser Options:
    --provider <name>         Provider (auto-uses puppeteer if found, else fetch. Force with 'puppeteer', 'browserless', 'fetch')
    --browserless-url <url>   Browserless instance URL

Examples

Single Input (Standard Usage)

Process a direct URL to a candidate's portfolio, GitHub/GitLab/Bitbucket profile, or resume PDF.

# Process a portfolio website
gitresolve https://janedoe.dev

# Process a GitHub profile (discovers all repos)
gitresolve https://github.com/janedoe

# Process a GitLab profile (also works with /users/ routes)
gitresolve https://gitlab.com/janedoe

# Process a repo URL (scrapes the page and resolves the owner)
gitresolve https://github.com/janedoe/my-project

# Process a remote resume PDF
gitresolve https://example.com/resume.pdf

(Tip: If a resume URL doesn't end in .pdf, you can force it to be treated as a resume by adding --type resume)

Batch Processing (Multiple Inputs)

Process a list of candidates from a CSV and a directory of local resumes.

1. Setup Your Data:

  • Create a data/resumes/ folder in your project root and drop in some PDFs.
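  • Create a data/portfolio_links.csv file with a url column containing portfolio links. (Note: This data remains strictly on your machine and is .gitignore'd for privacy. The Usage section lists ./portfolio_links.csv and ./resumes as the defaults, so if the batch run cannot find files under data/, pass --portfolio-csv and --resumes-dir explicitly.)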

2. Run the Batch CLI:

# Process both portfolios and resumes
gitresolve --all

# Process and silently save the aggregated candidates to a directory
gitresolve --all --output-dir results/

Saving Analysis (Candidate Aggregation)

When you process multiple sources (e.g., a portfolio link and a resume PDF) that resolve to the same GitHub candidate, GitResolve will automatically merge them into a single, unified profile.

# Saves aggregated JSON files to the results/ directory (e.g., results/resolved/janedoe.json)
gitresolve --all --output-dir results/

# Skip the terminal UI entirely and dump the merged JSON array to stdout (great for piping)
gitresolve --all --json

Example Aggregated Output (results/resolved/janedoe.json):

{
  "candidateUsername": "janedoe",
  "sources": [
    "https://janedoe.dev",
    "./data/resumes/janedoe.pdf"
  ],
  "sourceTypes": [
    "portfolio",
    "resume_file"
  ],
  "ownerProfile": { 
    "url": "https://github.com/janedoe", 
    "provider": "github", 
    "type": "profile", 
    "username": "janedoe" 
  },
  "confidence": "high",
  "ownedRepos": [],
  "contributions": [],
  "externalRepos": [],
  "allLinks": [],
  "warnings": []
}
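
To make the merge step concrete, here is a minimal sketch of how per-source results could be folded into one record per candidate. It is not GitResolve's actual implementation: the function names mergeByCandidate and dedupe are hypothetical, and keying on provider plus lower-cased username is an assumption. Field names follow the example output above.

```javascript
// Hypothetical helper: drops repeated entries, comparing them by their JSON form
// (the README does not show what an ownedRepos entry looks like).
function dedupe(items) {
  const seen = new Set();
  return items.filter((item) => {
    const key = JSON.stringify(item);
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}

// Hypothetical helper: groups per-source results by the resolved owner profile.
// Assumption: two results belong to the same candidate when provider and
// lower-cased username match.
function mergeByCandidate(results) {
  const byCandidate = new Map();
  for (const result of results) {
    const profile = result.ownerProfile;
    if (!profile || !profile.username) continue; // unresolved sources are skipped
    const key = `${profile.provider}:${profile.username.toLowerCase()}`;
    let merged = byCandidate.get(key);
    if (!merged) {
      merged = {
        candidateUsername: profile.username,
        sources: [],
        sourceTypes: [],
        ownerProfile: profile,
        ownedRepos: [],
      };
      byCandidate.set(key, merged);
    }
    merged.sources.push(result.source);
    merged.sourceTypes.push(result.sourceType);
    merged.ownedRepos = dedupe([...merged.ownedRepos, ...(result.ownedRepos || [])]);
  }
  return [...byCandidate.values()];
}
```

Feeding it a portfolio result and a resume result that both resolve to janedoe yields a single record whose sources and sourceTypes arrays each hold two entries, mirroring the example above.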

How it works

  1. Classifies input to determine if it's a resume file, portfolio site, git profile, or direct repo URL
  2. Scrapes the page using fetch, Puppeteer (for JS-rendered SPAs), or Browserless
  3. Parses PDF resumes by extracting raw text and deeply buried hyperlink annotations
  4. Extracts and sanitizes GitHub, GitLab, and Bitbucket URLs, including PR/Issue links
  5. Disambiguates owners to separate the candidate's actual profile from referenced external repos
  6. Categorizes links into owned repos, contributions (PRs & Issues), and external references
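
Steps 4 to 6 can be illustrated with a small sketch. This is an assumption-laden illustration rather than the library's code: parseGitUrl and categorizeLinks are hypothetical names, only GitHub-style /pull/ and /issues/ paths are treated as contributions, and GitLab /users/ routes are not handled.

```javascript
// Known git hosts, mapped to the provider names used in the result objects.
const PROVIDERS = new Map([
  ['github.com', 'github'],
  ['gitlab.com', 'gitlab'],
  ['bitbucket.org', 'bitbucket'],
]);

// Hypothetical helper: returns { provider, owner, repo, kind } or null
// when the input is not a URL on one of the known git hosts.
function parseGitUrl(raw) {
  let url;
  try {
    url = new URL(raw);
  } catch {
    return null; // not a parseable absolute URL
  }
  const provider = PROVIDERS.get(url.hostname.replace(/^www\./, ''));
  if (!provider) return null;
  const [owner, repo, section] = url.pathname.split('/').filter(Boolean);
  if (!owner) return null;
  if (!repo) return { provider, owner, repo: null, kind: 'profile' };
  // Assumption: only GitHub-style PR and issue paths count as contributions.
  const isContribution = section === 'pull' || section === 'issues';
  return { provider, owner, repo, kind: isContribution ? 'contribution' : 'repo' };
}

// Hypothetical helper: splits links into the three buckets named in step 6,
// relative to the candidate's username.
function categorizeLinks(urls, candidateUsername) {
  const out = { ownedRepos: [], contributions: [], externalRepos: [] };
  const me = candidateUsername.toLowerCase();
  for (const raw of urls) {
    const link = parseGitUrl(raw);
    if (!link || link.kind === 'profile') continue;
    if (link.kind === 'contribution') out.contributions.push(raw);
    else if (link.owner.toLowerCase() === me) out.ownedRepos.push(raw);
    else out.externalRepos.push(raw);
  }
  return out;
}
```

The real tool also has to decide who the candidate is in the first place (step 5); this sketch sidesteps that by taking the username as a parameter.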

Each processed input returns a structured result:

{
  "source": "https://github.com/janedoe",
  "sourceType": "git_profile",
  "ownerProfile": { 
    "url": "https://github.com/janedoe", 
    "provider": "github", 
    "type": "profile", 
    "username": "janedoe" 
  },
  "confidence": "high",
  "ownedRepos": [],
  "contributions": [],
  "externalRepos": [],
  "allLinks": [],
  "warnings": []
}

Supported Input Types

| Input | Example | What Happens |
| --- | --- | --- |
| Portfolio site | https://janedoe.dev | Scrapes page for git links |
| GitHub profile | https://github.com/janedoe | Scrapes profile page, discovers repos |
| GitLab profile | https://gitlab.com/janedoe | Scrapes profile, handles /users/ routes |
| Bitbucket profile | https://bitbucket.org/janedoe | Scrapes profile page |
| Repo URL | https://github.com/user/repo | Scrapes repo page, resolves owner |
| PR/Issue URL | https://github.com/user/repo/pull/42 | Extracts as contribution |
| Resume PDF | ./resume.pdf or URL | Extracts text + hyperlink annotations |
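
As a rough illustration of how inputs like these might be told apart, here is a hedged sketch. The function name classifyInputSketch and its return labels are made up for this example and may not match what the exported classifyInput does or returns. The typeHint parameter mirrors the --type flag.

```javascript
const GIT_HOSTS = new Set(['github.com', 'gitlab.com', 'bitbucket.org']);

// Hypothetical classifier. Labels are illustrative, not the library's sourceType values.
function classifyInputSketch(input, typeHint) {
  // An explicit hint wins, as with --type on the CLI.
  if (typeHint === 'resume' || typeHint === 'portfolio') return typeHint;

  // Anything that is not an http(s) URL is treated as a local path.
  if (!/^https?:\/\//i.test(input)) {
    return input.toLowerCase().endsWith('.pdf') ? 'resume' : 'unknown';
  }

  let url;
  try {
    url = new URL(input);
  } catch {
    return 'unknown';
  }
  if (url.pathname.toLowerCase().endsWith('.pdf')) return 'resume';

  const host = url.hostname.replace(/^www\./, '');
  if (!GIT_HOSTS.has(host)) return 'portfolio';

  const segments = url.pathname.split('/').filter(Boolean);
  if (segments.length === 0) return 'unknown'; // bare host, no user or repo
  if (segments.length === 1) return 'git_profile';
  // Assumption: GitHub-style paths only; GitLab /users/ routes are not covered here.
  if (segments[2] === 'pull' || segments[2] === 'issues') return 'contribution';
  return 'repo';
}
```

A resume served from a URL without a .pdf suffix falls through to 'portfolio' here, which is the situation the --type resume tip addresses.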

Browser Provider Configuration

GitResolve supports three browser providers for fetching portfolio page content. It auto-detects the best available one:

| Provider | Requires | JavaScript Rendering | Best For |
| --- | --- | --- | --- |
| puppeteer | npm install puppeteer | ✅ Full | SPAs, JS-heavy sites |
| browserless | Docker container | ✅ Full | Server environments, CI/CD |
| fetch | Nothing | ❌ None | Static sites, fallback |
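
The auto-detection described in the --provider help text ("auto-uses puppeteer if found, else fetch") could look roughly like the sketch below. The function detectProvider and its importModule parameter are hypothetical; the actual detection logic inside GitResolve is not shown in this README.

```javascript
// Hypothetical sketch: returns the forced provider if one is given, otherwise
// 'puppeteer' when the package can be imported, else 'fetch'.
// importModule is injectable so the fallback can be exercised without
// installing or removing puppeteer.
async function detectProvider(forced, importModule = (name) => import(name)) {
  const known = ['puppeteer', 'browserless', 'fetch'];
  if (forced !== undefined) {
    if (!known.includes(forced)) {
      throw new Error(`Unknown provider: ${forced}`);
    }
    return forced;
  }
  try {
    await importModule('puppeteer'); // rejects when the package is not installed
    return 'puppeteer';
  } catch {
    return 'fetch';
  }
}
```

Calling detectProvider() with no arguments reports whichever of the two defaults applies on the current machine, while detectProvider('browserless') skips detection entirely.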

Browserless Setup

# Start a Browserless Docker container
docker run -d --name browserless -p 3000:3000 ghcr.io/browserless/chromium

# Use it
gitresolve --portfolios --browserless-url http://localhost:3000

Programmatic API

If you are building an app that needs to extract GitHub profiles on the fly, you can import @clyrisai/gitresolve directly into your Node.js/Bun backend.

npm install @clyrisai/gitresolve

import { classifyInput, scrapePortfolio, parseResume, createProvider } from '@clyrisai/gitresolve';

// 1. Set up a browser provider: 'puppeteer', 'browserless', or 'fetch'
//    (top-level await requires an ES module)
const provider = await createProvider('puppeteer');

try {
  // --- Example A: Scrape a portfolio URL ---
  const result = await scrapePortfolio('https://janedoe.dev', provider);
  console.log("Candidate Profile:", result.ownerProfile);
  console.log("Confidence:", result.confidence);
  console.log("Owned Repos:", result.ownedRepos);
  console.log("Contributions:", result.contributions);

  // --- Example B: Parse a resume PDF ---
  const resumeResult = await parseResume('./resumes/candidate1.pdf');
  console.log("Found links in resume:", resumeResult.allLinks.length);

} finally {
  // Always release the provider, even if scraping throws
  await provider.cleanup();
}
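
When processing several URLs in one run, it may help to keep one failure from aborting the rest. The sketch below shows one way to do that. scrapeAll is a hypothetical helper, not part of the package, and it takes the scraping function as a parameter so it makes no assumptions about the library beyond "an async function that takes a URL". Whether scrapePortfolio rejects on failure or reports problems through the warnings field is not stated in this README.

```javascript
// Hypothetical helper: runs `scrape` over each URL in order and collects
// successes and failures separately instead of stopping at the first error.
async function scrapeAll(urls, scrape) {
  const results = [];
  const failures = [];
  for (const url of urls) {
    try {
      results.push(await scrape(url));
    } catch (err) {
      failures.push({
        url,
        message: err instanceof Error ? err.message : String(err),
      });
    }
  }
  return { results, failures };
}
```

Inside the try block of the example above, this could be called as scrapeAll(urls, (url) => scrapePortfolio(url, provider)), so that provider.cleanup() still runs in the finally clause.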

Requirements

  • Node.js >= 18.0.0
  • Puppeteer (Optional, auto-used for SPAs if installed as a peer dependency)

License

MIT © ClyrisAI
