This notebook shows how to use the functions:

- get_if_qualified_indicators

- *(in progress)* a function, not yet named, that takes a set of postings, uses the function above to extract indicators of what each company wants, builds 'trees of skills' from them, and supports analytics on those trees

These are the tools developed to help people understand how their skill sets suit various jobs, as opposed to the infrastructure in [labeling_helpers](./training/labeling_helpers), which is used to expedite the training of these models.

In [1]:
import os 
import sys
from dotenv import load_dotenv

In [2]:
load_dotenv()
path_to_root_dir = os.getenv('PATH2ROOT_DIR')
sys.path.append(path_to_root_dir)
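
`os.getenv` returns `None` when a variable is unset, and non-string entries in `sys.path` are ignored during import, so a missing `PATH2ROOT_DIR` would typically surface only later as an import error. A minimal sketch of an early guard, assuming the same environment variable; the helper name `add_root_to_path` is hypothetical and not part of the project:

```python
import os
import sys


def add_root_to_path(env_var="PATH2ROOT_DIR"):
    """Append the directory named by env_var to sys.path, failing early if unset."""
    root = os.getenv(env_var)
    if not root:
        # Hypothetical error message; adjust to the project's conventions.
        raise RuntimeError(f"{env_var} is not set; check your .env file")
    if root not in sys.path:
        sys.path.append(root)
    return root
```

Calling `add_root_to_path()` right after `load_dotenv()` would replace the two assignment lines in the cell above.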

In [3]:
import json
from career_fit_tools.job_analytics_tools import get_if_qualified_indicators
from career_fit_tools.get_if_qualified_indicators_helpers import get_symetric_diff


In [4]:
f_name = os.getenv('PATH2SAMPLE_PIPELINE_INPUTS')

with open(f_name, 'r') as file:
    sample_inputs = json.load(file)

## get_if_qualified_indicators

This function takes the following arguments:

- job_descr (str): the job description to extract entities from
- use_sentence_classifier (int): whether to use the sentence classifier to pre-filter the sentences fed into the token classifier (0: do not use the sentence classifier; 1: use the sentence classifier)
- print_descr_w_predictions (int): whether to print the reconstructed posting, with words color-coded according to label (0: do not print; 1: print)

It outputs a list of entities relevant to determining whether a possible applicant would be qualified for the position.
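
The effect of `use_sentence_classifier` can be pictured as a two-stage pipeline: when the flag is 1, only sentences that pass the sentence classifier reach the token classifier. The sketch below is a toy illustration, not the project's implementation; `looks_like_requirement`, `tag_tokens`, and `extract_entities` are hypothetical stand-ins that use keyword rules in place of the trained models.

```python
import re

# Hypothetical stand-in for the sentence classifier: a keyword rule, not a trained model.
REQUIREMENT_CUES = ("experience", "degree", "skills", "proficient")

# Hypothetical stand-in for the token classifier's notion of an entity word.
KNOWN_ENTITIES = {"sql", "python", "r", "statistics"}


def looks_like_requirement(sentence):
    """Return True if the sentence contains a requirement-like cue word."""
    lowered = sentence.lower()
    return any(cue in lowered for cue in REQUIREMENT_CUES)


def tag_tokens(sentence):
    """Return the tokens of the sentence that appear in the toy entity vocabulary."""
    tokens = re.findall(r"[A-Za-z]+", sentence)
    return [tok for tok in tokens if tok.lower() in KNOWN_ENTITIES]


def extract_entities(job_descr, use_sentence_classifier=1):
    """Optionally pre-filter sentences, then collect entities from the remainder."""
    sentences = [s.strip() for s in job_descr.split("\n") if s.strip()]
    if use_sentence_classifier == 1:
        sentences = [s for s in sentences if looks_like_requirement(s)]
    entities = []
    for sentence in sentences:
        entities.extend(tag_tokens(sentence))
    return entities


posting = "Proficient coding skills (SQL/Python/R)\nWe love Python at our office"
print(extract_entities(posting, use_sentence_classifier=1))
print(extract_entities(posting, use_sentence_classifier=0))
```

In this toy version, turning the pre-filter off lets tokens from non-requirement sentences through, so the unfiltered run returns a superset of the filtered one.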

**Below is an example of running the function on the same job posting twice: once with the sentence classifier pre-filtering sentences, and once without it.**

In [5]:
entities_w_sentence_classification = get_if_qualified_indicators(
    sample_inputs[1], use_sentence_classifier=1, print_descr_w_predictions=1
)


Some weights of the model checkpoint at has-abi/distilBERT-finetuned-resumes-sections were not used when initializing DistilBertModel: ['classifier.bias', 'pre_classifier.bias', 'classifier.weight', 'pre_classifier.weight']
- This IS expected if you are initializing DistilBertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing DistilBertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


summary
$53 - $63 per hour
contract
**bachelor** *degree*
category
**computer** *and* *mathematical* *occupations*
reference
1016828
job summary:
Description:

Minimum Qualifications

**Master** **'** **s** *degree* in *statistics* , *economics* , **operations** *research* , **engineering** , or related field
Marketing mix modeling industry experience
8 + years of industry experience in **data** *science* , **measurement** , **marketing** *strategy* & *analytics*
communication skills to 'tell a story' that provides insight into the business
Proficient coding skills ( **SQL** **/** **Python** **/** **R** ) and **database** knowledge
Extensive experience with **predictive** *modeling* *algorithm*

In [6]:
entities_wo_sentence_classification = get_if_qualified_indicators(
    sample_inputs[1], use_sentence_classifier=0, print_descr_w_predictions=1
)

summary 
$ 53 - $ 63 per hour 
contract
**bachelor** *degree*
category
**computer** *and* *mathematical* *occupations*
reference 
1016828 
job summary : 
Description : 

Minimum Qualifications

**Master** **'** **s** *degree* in *statistics* , *economics* , **operations** *research* , **engineering** , or related field
**Marketing** *mix* **modeling** *industry* experience
8 + years of industry experience in **data** *science* , **measurement** , **marketing** *strategy* & *analytics*
communication skills to ' tell a story ' that provides insight into the business
Proficient coding skills ( **SQL** **/** **Python** **/** **R** ) and **database** knowledge
Extensive experience 

**Below are the first 20 entities extracted when using both the sentence and token classifiers:**

In [None]:
for entity in entities_w_sentence_classification[:20]:
    print(entity)

**Below is the symmetric difference of the entities extracted with the sentence classifier versus without it:**

In [8]:
sym_diff = get_symetric_diff(entities_w_sentence_classification, entities_wo_sentence_classification)

Elements only in list1: set()


Elements only in list2: {'Data Science Veteran Status education certifications Fair Chance Ordinance Ordinance', 'Marketing mix', 'vaccine / testing requirements', 'modeling industry'}


Symmetric difference: ['Data Science Veteran Status education certifications Fair Chance Ordinance Ordinance', 'Marketing mix', 'vaccine / testing requirements', 'modeling industry']
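
The printed sets correspond to a standard set comparison. A minimal sketch of how such a comparison could be computed with Python's built-in sets follows; it is an assumption about the behavior of `get_symetric_diff`, not its actual source, and `symmetric_diff_sketch` is a hypothetical name.

```python
def symmetric_diff_sketch(list1, list2):
    """Return the elements that appear in exactly one of the two lists, sorted."""
    set1, set2 = set(list1), set(list2)
    only_in_1 = set1 - set2
    only_in_2 = set2 - set1
    print(f"Elements only in list1: {only_in_1}")
    print(f"Elements only in list2: {only_in_2}")
    # set1 ^ set2 equals the union of the two one-sided differences.
    return sorted(set1 ^ set2)


# Illustrative inputs only; not the full entity lists from the run above.
with_filter = ["SQL", "Python", "data science"]
without_filter = ["SQL", "Python", "data science", "Marketing mix"]
print(symmetric_diff_sketch(with_filter, without_filter))
```

An empty "only in list1" set, as in the output above, means every entity found with the pre-filter was also found without it.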


**Below are examples of the function applied to two additional job descriptions.** In both cases, the sentence and token classifiers are both used.

In [5]:
entities_w_sentence_classification = get_if_qualified_indicators(
    sample_inputs[2], use_sentence_classifier=1, print_descr_w_predictions=1
)



Embedded in a worldwide network **Mercedes** - **Benz** *Research* *&* *Development* North America continuously strives to remain at the forefront of successful **automotive** *research* and *development* .  MBRDNA is headquartered in Silicon Valley , California , with key areas of **Autonomous** *Driving* , **Advanced** *Interaction* *Design* , **Digital** *User* *Experience* , **Machine** *Learning* , **Customer** *Research* , and **Open** *Innovation* .  In Farmington Hills , Michigan , the focus is on **Powertrain** and **eDrive** *technology* as well as in **Long** *Beach* , where the teams test durability of the latest **driver** *assistant* and telematic *systems* .  The Digital H

In [8]:
entities_w_sentence_classification = get_if_qualified_indicators(
    sample_inputs[0], use_sentence_classifier=1, print_descr_w_predictions=1
)

We help the world run better
Our company culture is focused on helping our employees enable innovation by building breakthroughs together. How? We focus every day on building the foundation for tomorrow and creating a workplace that embraces differences, values flexibility, and is aligned to our purpose-driven and future-focused work. We offer a highly collaborative, caring team environment with a strong focus on learning and development, recognition for your individual contributions, and a variety of benefit options for you to choose from.Apply now!
SAP Business Network
The global economy has been rapidly evolving from enterprise - centric to network - centric .  No enterprise does business alone today. Suppliers , service providers , contract manufacturers , logistics pros – companies today rely heavily on their connections to their extended ecosystem to operate effectively .  They need an intelligent and open network that delivers enhanced visibility , greater efficiency , and impro

## Next function

In progress: a tool that takes entities drawn from many job descriptions and systematically reduces the entity space, so that analytics on the result can support meaningful conclusions.
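
Since this tool is still in progress, its design is not documented here. As one hedged illustration of what reducing the entity space might involve, the sketch below normalizes case and whitespace so that surface variants collapse together, then counts how many postings mention each entity. The function names and the normalization rule are assumptions, not the planned implementation.

```python
from collections import Counter


def normalize_entity(entity):
    """Lowercase and collapse whitespace so surface variants map to one key."""
    return " ".join(entity.lower().split())


def count_entities_across_postings(entities_per_posting):
    """Count the number of postings in which each normalized entity appears."""
    counts = Counter()
    for entities in entities_per_posting:
        # A set ensures each posting contributes at most once per entity.
        counts.update({normalize_entity(e) for e in entities})
    return counts


# Illustrative entity lists, one per posting; not real pipeline output.
postings = [
    ["Python", "SQL", "data  science"],
    ["python", "Data Science", "R"],
]
counts = count_entities_across_postings(postings)
print(counts["python"], counts["data science"], counts["sql"])
```

Counting per posting rather than per mention keeps a single repetitive posting from dominating the totals; a real reduction step would likely need more than string normalization, for example grouping synonyms.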