GitHub - 16A9DA/Job-Finder: Find user jobs using resume in pdf and machine learning

Job Finder

Description

This project is a PDF parsing and information extraction tool that converts PDF files (including scanned or image-based PDFs) into structured JSON data. It uses OCR (Optical Character Recognition) via Tesseract to handle scanned documents, and it prompts an AI model such as llama3 to extract key information like names, email addresses, phone numbers, job titles, and countries.

FastAPI receives the files uploaded by users in main.py. FastAPI's UploadFile type delivers the file as bytes, so the pdf_to_json() function wraps those bytes in io.BytesIO() to treat them as a file.
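The bytes-to-file step can be sketched with the standard library alone. This is a minimal illustration, not the code in main.py: the helper names `bytes_as_file` and `looks_like_pdf` are hypothetical, and the sample bytes stand in for what an upload handler would typically obtain from `await file.read()` on an UploadFile.

```python
import io


def bytes_as_file(data: bytes) -> io.BytesIO:
    """Wrap raw upload bytes in a seekable, file-like object (hypothetical helper)."""
    return io.BytesIO(data)


def looks_like_pdf(stream: io.BytesIO) -> bool:
    """Check for the PDF magic header without consuming the stream."""
    start = stream.tell()
    header = stream.read(5)
    stream.seek(start)  # rewind so a PDF reader later sees the whole file
    return header == b"%PDF-"


# Placeholder bytes standing in for an uploaded file; not a real document.
uploaded = b"%PDF-1.4\n% placeholder content\n"
pdf_stream = bytes_as_file(uploaded)
print(looks_like_pdf(pdf_stream))
```

Because the stream is rewound after the header check, it can be handed to a PDF reader or an OCR step afterwards without reopening anything.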

The tool is particularly useful for processing resumes/CVs to generate a profile. The profile information can then be used to find jobs for the user with machine learning: the user is presented with the available jobs.

mlmodel.py trains a machine learning model to predict job positions from user skills and to retrieve relevant job information such as company, job description, and salary. It preprocesses and cleans the dataset, uses TF-IDF vectorization for text processing, and uses a Multi-Layer Perceptron (MLP) classifier for prediction.
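To make the TF-IDF step concrete, here is a small standard-library sketch of the textbook weighting (term frequency times log inverse document frequency). It is an illustration only: mlmodel.py may rely on a library vectorizer with a smoothed variant of the formula, the function names and sample skill strings are hypothetical, and the MLP classifier step is not shown.

```python
import math
from collections import Counter
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    """Lowercase and split on whitespace and commas (a simplifying assumption)."""
    return text.lower().replace(",", " ").split()


def tfidf_vectors(docs: List[str]) -> List[Dict[str, float]]:
    """Return one {term: weight} mapping per document using tf * log(N / df)."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty documents
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Made-up skill strings, not taken from the project's dataset.
skills = ["python sql pandas", "python django", "excel sql"]
for vector in tfidf_vectors(skills):
    print(vector)
```

Terms that appear in fewer documents receive higher weights, which is what lets a classifier tell rarer, more distinctive skills apart from common ones.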

Features

Extracts text from both text-based and image-based PDFs uploaded by the user through FastAPI
Uses OCR (Tesseract) for scanned documents
Extracts: Name, Email, Location, Job, Education, Skills
Outputs structured JSON data that is presented as a profile
Passes the skills from the user's profile to the ML model to suggest available jobs
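The structured JSON output could be recovered from a model reply along the following lines. This is a sketch under assumptions: the lowercase key names mirror the fields listed above but are not confirmed by the project, `parse_profile` is a hypothetical helper, and the reply string is a made-up example of a model wrapping JSON in extra text.

```python
import json
from typing import Any, Dict

# Assumed key names, modeled on the extracted fields listed in the features.
PROFILE_KEYS = ("name", "email", "location", "job", "education", "skills")


def parse_profile(reply: str) -> Dict[str, Any]:
    """Pull the outermost JSON object out of a model reply and keep known keys."""
    start = reply.find("{")
    end = reply.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model reply")
    data = json.loads(reply[start:end + 1])
    # Missing fields become None so the profile always has the same shape.
    return {key: data.get(key) for key in PROFILE_KEYS}


# Made-up reply text for illustration only.
reply = (
    'Here is the extracted data:\n'
    '{"name": "Jane Doe", "email": "jane@example.com", "skills": ["python", "sql"]}'
)
profile = parse_profile(reply)
print(profile["skills"])
```

Keeping a fixed set of keys means the later job-matching step can read `profile["skills"]` without first checking which fields the model happened to return.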

flowchart TD
    A["Upload PDF"] --> B["UploadFile (bytes in FastAPI)"]
    B --> C["io.BytesIO (treat bytes as file)"]
    C --> D["PdfReader / OCR (extract text)"]
    D --> E["AI MODEL to structure information / pdf_to_json()"]
    E --> F["JSON → User Profile "]
    F --> G["Skills from Profile (ML) -> Jobs"]

Requirements

pip install -r requirement.txt

brew install ollama

ollama pull llama3
