Open this notebook in Google Colab: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Riminder/hrflow-cookbook/blob/main/examples/%5BParsing%5D%20parsing_evaluator.ipynb)

##### Copyright 2024 HrFlow's AI Research Department

Licensed under the Apache License, Version 2.0 (the "License");

In [None]:
# Copyright 2024 HrFlow's AI Research Department. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================

Welcome to this Google Colaboratory tutorial for developers. This Jupyter notebook helps you **evaluate CV parsing** performed by HrFlow's AI. It **generates an Excel report** that assesses the parsing accuracy of resumes already parsed and stored in a given HrFlow source.

# Getting Started

In [None]:
!pip install -q -U hrflow==3.3.0a2

In [None]:
import os
from getpass import getpass
from hrflow import Hrflow

In [None]:
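# Path where the Excel report will be written
REPORT_PATH = "./parsing-evaluation.xlsx"

# Values are read with getpass so they are not echoed in the notebook output
API_SECRET = getpass("YOUR_API_SECRET")
API_USER = getpass("USER@EMAIL.DOMAIN")
SOURCE_KEY = getpass("YOUR_SOURCE_KEY")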

client = Hrflow(api_secret=API_SECRET, api_user=API_USER)
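
If you run this notebook repeatedly, typing the three values each time can get tedious. The following is a minimal sketch of a fallback that reads a value from an environment variable when one is set and only prompts otherwise. The helper `read_setting` and the variable name `HRFLOW_API_SECRET` are hypothetical, chosen for this example; they are not part of the HrFlow SDK.

```python
import os
from getpass import getpass


def read_setting(env_name, prompt):
    """Return the environment variable env_name if it is set and non-empty,
    otherwise ask for the value interactively without echoing it."""
    value = os.environ.get(env_name)
    if value:
        return value
    return getpass(prompt)


# Example usage (hypothetical variable name, not defined by the HrFlow SDK):
# API_SECRET = read_setting("HRFLOW_API_SECRET", "YOUR_API_SECRET")
```

The prompt remains the default path, so the notebook behaves as before when no environment variable is defined.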

# 0. 📥 (optional) Parse resumes and store them in a source

Before proceeding, make sure that you have:
- **Created a source**: [Connectors Source Documentation](https://developers.hrflow.ai/docs/connectors-source)
- **Parsed profiles and stored them in this source**: you can run the following code to parse the resumes in a local folder and store them in the source.

In [None]:
STORAGE_DIRECTORY_PATH = "./resumes" # FILL ME with the path where you have your resumes
FAILURES_DIRECTORY_PATH = "./failures" # FILL ME with the path where you want to store the failures

os.makedirs(STORAGE_DIRECTORY_PATH, exist_ok=True)
os.makedirs(FAILURES_DIRECTORY_PATH, exist_ok=True)

In [None]:
results = client.profile.parsing.add_folder(
    source_key=SOURCE_KEY,
    dir_path=STORAGE_DIRECTORY_PATH,
    is_recurcive=True,
    move_failure_to=FAILURES_DIRECTORY_PATH,
    show_progress=True,
    max_requests_per_minute=30,
    min_sleep_per_request=1,
)
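
To get a feel for the size of the job, it can help to count the files in the resume folder before running the call above, and the files in the failures folder afterwards. The sketch below uses only the standard library. The helpers `count_files` and `min_minutes_for` are hypothetical names for this example, and the duration is a rough lower bound that assumes one request per file at the `max_requests_per_minute` limit shown above.

```python
import os


def count_files(dir_path, recursive=True):
    """Count files under dir_path, descending into subfolders if recursive.
    Returns 0 when the directory does not exist."""
    if not os.path.isdir(dir_path):
        return 0
    if recursive:
        return sum(len(files) for _, _, files in os.walk(dir_path))
    return sum(
        1
        for name in os.listdir(dir_path)
        if os.path.isfile(os.path.join(dir_path, name))
    )


def min_minutes_for(n_files, max_requests_per_minute=30):
    """Rough lower bound on the duration in minutes,
    assuming one request per file (an assumption of this sketch)."""
    return n_files / max_requests_per_minute


# Example usage with the folders defined earlier in this notebook:
# n_pending = count_files("./resumes")
# n_failed = count_files("./failures")
# print(f"{n_pending} files, at least {min_minutes_for(n_pending):.1f} min")
```

For instance, at 30 requests per minute, 300 files would need at least 10 minutes under these assumptions.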

# 1. ⭐ Profile Evaluation and Generate Excel Report
This section generates the parsing evaluation report.
It retrieves all profiles from the source, evaluates them, and stores the evaluation in an Excel file.

The Excel workbook is organized into sections covering metadata, personal info, experience, education, and other skills, giving an overall view of parsing accuracy.

In [None]:
from hrflow.utils import generate_parsing_evaluation_report

generate_parsing_evaluation_report(
    client,
    source_key=SOURCE_KEY,
    report_path=REPORT_PATH,
    show_progress=True,
)

The output is an Excel file named `parsing-evaluation.xlsx`, summarizing the parsing accuracy of CVs stored in the specified HrFlow source.

The report contains two sheets:
1. **Definition**: This sheet explains each field and how to interpret the results.
2. **Statistics**: This sheet presents the full set of results.
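
As a quick sanity check that the report was written and contains the expected sheets, the sheet names can be read without any extra dependency. This is a minimal sketch, not part of the HrFlow SDK: the helper `list_sheet_names` is hypothetical, and it assumes the usual `.xlsx` layout in which the workbook part is stored at `xl/workbook.xml` and uses the standard spreadsheet namespace.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace commonly used by the main workbook part of .xlsx files
# (assumption: the report is not saved in the "strict" OOXML variant).
MAIN_NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def list_sheet_names(xlsx_path):
    """Return the sheet names of an .xlsx workbook, in workbook order."""
    # An .xlsx file is a zip archive; the sheet list lives in the workbook part.
    with zipfile.ZipFile(xlsx_path) as archive:
        root = ET.fromstring(archive.read("xl/workbook.xml"))
    return [sheet.get("name") for sheet in root.iter(MAIN_NS + "sheet")]


# Example usage with the report path defined earlier in this notebook:
# print(list_sheet_names("./parsing-evaluation.xlsx"))
```

To explore the contents of the sheets themselves, a spreadsheet application or a library such as pandas is a more convenient choice.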