# Jobs Classification

Given *onet.csv* and *alternate-titles.tsv* as template documents and *jobs.json* as the items to classify, perform a classification of jobs.

The result should be *jobs-answers.csv* with the fields "title", "jobdesc", "code", and "matched_title", where "title" and "jobdesc" come from *jobs.json*, and "code" and "matched_title" are the code and title of the best-matching template from *onet.csv*.

Any technique is allowed, though utilizing some techniques from NLP domain is preferred.

In [None]:
!pip install sentence_transformers

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [None]:
import pandas as pd
import re 
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import pairwise_distances
from sentence_transformers import SentenceTransformer
import tqdm 
import numpy as np

In [None]:
def read_data():
  """Load the onet templates and the jobs to classify, adding a cleaned "titledesc" column to each."""
  onet = pd.read_csv("data/onet.csv")
  alt_titles = pd.read_csv("data/alternate-titles.tsv", delimiter="\t")
  # Sort both tables by code and reset the index so that row i of each table
  # refers to the same code (assumes one alternate-titles row per onet code).
  onet = onet.sort_values(by='code').reset_index(drop=True)
  alt_titles = alt_titles.sort_values(by='code').reset_index(drop=True)
  onet['title2'] = alt_titles.alternate_title
  onet['title2'] = onet['title2'].map(str)
  onet["titledesc"] = onet["title2"] + "-" + onet['description'].astype(str)
  # replace every character other than letters and digits with a space
  onet["titledesc"] = onet["titledesc"].map(lambda x: " ".join(re.split("[^a-zA-Z0-9]", x)))

  jobs = pd.read_json("data/jobs.json")
  jobs["titledesc"] = jobs["title"] + "-" + jobs['jobdesc'].astype(str)
  # replace every character other than letters and digits with a space
  jobs["titledesc"] = jobs["titledesc"].map(lambda x: " ".join(re.split("[^a-zA-Z0-9]", x)))
  return onet, jobs

def tfidf_classify(ngram=1):
  """Match each job to its closest onet template using TF-IDF vectors.

  `ngram` is the largest n-gram size used to build the vocabulary (ngram_range=(1, ngram)).
  fit_transform learns the vocabulary and IDF weights from the onet data and returns its
  document-term matrix; transform then maps the jobs data onto that same vocabulary and
  IDF weights without refitting. We compute the pairwise (Euclidean) distances between the
  jobs and onet vectors and take argmin(axis=1) to find the closest description for each job.
  Returns: jobs"""
  onet, jobs = read_data()
  vectorizer = TfidfVectorizer(stop_words='english',ngram_range=(1,ngram))
  onet_transformed = vectorizer.fit_transform(onet.titledesc)
  jobs_transformed = vectorizer.transform(jobs.titledesc)
  
  dists = pairwise_distances(jobs_transformed, onet_transformed)
  closest_descriptions = dists.argmin(axis=1)
  jobs["_score"] = dists.min(axis=1)
  # argmin returns row positions, so index by position (.iloc) rather than by label
  jobs["code"] = [onet.code.iloc[i] for i in closest_descriptions]
  jobs["matched_title"] = [onet.title.iloc[i] for i in closest_descriptions]
  jobs["matched_title2"] = [onet.title2.iloc[i] for i in closest_descriptions]

  return jobs

"""
unigram = "datawarehouse"
bigram = "datawarehouse associate"
"""

def sbert_classify():
  """
  The sbert_classify method applies the Sentence-BERT model (all-mpnet-base-v2), a general-purpose
  sentence-embedding model suited to semantic search.
  Once the onet and jobs data are encoded, we compute the pairwise cosine distance between
  jobs_transformed and onet_transformed. argmin(axis=1) then gives, for each job (row), the index
  of the onet template with the smallest distance.
  Returns: jobs
  """
  onet, jobs = read_data()
  def encode(L):
    vecs = []
    BS = 200
    for i in tqdm.tqdm(range(0,len(L),BS)):
      vecs.append(model.encode(L[i:i+BS]))
    vecs = np.concatenate(vecs,axis=0)
    return vecs 

  model = SentenceTransformer('all-mpnet-base-v2').cuda()
  onet_transformed = encode(list(onet.titledesc))
  jobs_transformed = encode(list(jobs.titledesc))

  dists = pairwise_distances(jobs_transformed, onet_transformed, metric="cosine")
  closest_descriptions = dists.argmin(axis=1)
  jobs["_score"] = dists.min(axis=1)
  # argmin returns row positions, so index by position (.iloc) rather than by label
  jobs["matched_title"] = [onet.title.iloc[i] for i in closest_descriptions]
  jobs["matched_title2"] = [onet.title2.iloc[i] for i in closest_descriptions]

  return jobs


In [None]:
jobs_tfidf1 = tfidf_classify(1)
jobs_tfidf2 = tfidf_classify(2)
jobs_tfidf3 = tfidf_classify(3)

In [None]:
jobs_sbert = sbert_classify()

100%|██████████| 6/6 [00:28<00:00,  4.71s/it]
100%|██████████| 6/6 [00:24<00:00,  4.14s/it]


In [None]:
jobs_tfidf1

Unnamed: 0,_id,_score,jobdesc,title,titledesc,code,matched_title,matched_title2
0,7dab585f01fd,1.265592,We are currently seeking a Warehouse Associate...,Warehouse Associate,Warehouse Associate We are currently seeking a...,15-1243.01,Data Warehousing Specialists,"[""Data Management Specialist"",""Data Storage Sp..."
1,f71bc73edb39,1.275454,"For 70 years, Charles River employees have wor...",Supply Chain Associate II,Supply Chain Associate II For 70 years Charle...,11-3071.04,Supply Chain Managers,"[""Demand Planning Manager"",""Global Supply Chai..."
2,c49152708aa9,1.354367,Looking for an individual who will be responsi...,Azure DevOps Engineer,Azure DevOps Engineer Looking for an individua...,15-1253.00,Software Quality Assurance Analysts and Testers,"[""Application Integration Engineer"",""Applicati..."
3,d1b402364b59,1.215624,Salesforce Administrator: REMOTE – C-4 Analyti...,Salesforce Administrator: REMOTE,Salesforce Administrator REMOTE Salesforce Ad...,11-2022.00,Sales Managers,"[""Account Manager"",""Area Sales Manager"",""Artis..."
4,495cc0d7c26d,1.195856,DescriptionHCR ManorCare provides a range of s...,Floor Care,Floor Care DescriptionHCR ManorCare provides a...,31-1122.00,Personal Care Aides,"[""Aide"",""Blind Aide"",""Blind Escort"",""Care Prov..."
...,...,...,...,...,...,...,...,...
1045,05135334b09a,1.309270,Now Brewing – Future Leaders! #tobeapartnerSta...,assistant store manager,assistant store manager Now Brewing Future L...,13-1041.03,Equal Opportunity Representatives and Officers,"[""Action Officer"",""Affirmative Action Officer ..."
1046,64e6676e0f5b,1.329820,Do you like to work on your feet and keep thin...,Merchandise and Stocking Associate - Sam's Club,Merchandise and Stocking Associate Sam s Clu...,39-7011.00,Tour Guides and Escorts,"[""Admitting Office Escort"",""Adventure Guide"",""..."
1047,719b46a61e7e,1.261629,If you're a licensed CDL Driver looking to adv...,CDL Delivery Truck Driver (458),CDL Delivery Truck Driver 458 If you re a li...,53-3032.00,Heavy and Tractor-Trailer Truck Drivers,"[""Aircraft Refueler"",""Armored Truck Driver"",""A..."
1048,c202907c89ba,1.202378,Truck Driver CDL (OTR) – Hazmat trucker | $117...,Truck Driver CDL (OTR) - Hazmat trucker,Truck Driver CDL OTR Hazmat trucker Truck ...,53-3032.00,Heavy and Tractor-Trailer Truck Drivers,"[""Aircraft Refueler"",""Armored Truck Driver"",""A..."


In [None]:
jobs_sbert

Unnamed: 0,_id,_score,jobdesc,title,titledesc,matched_title,matched_title2
0,7dab585f01fd,0.494148,We are currently seeking a Warehouse Associate...,Warehouse Associate,Warehouse Associate We are currently seeking a...,Logistics Engineers,"[""Acquisition Logistics Engineer"",""Continuous ..."
1,f71bc73edb39,0.355585,"For 70 years, Charles River employees have wor...",Supply Chain Associate II,Supply Chain Associate II For 70 years Charle...,Pharmacy Technicians,"[""Accredited Pharmacy Technician"",""Certified P..."
2,c49152708aa9,0.452614,Looking for an individual who will be responsi...,Azure DevOps Engineer,Azure DevOps Engineer Looking for an individua...,Computer Systems Engineers/Architects,"[""Automation Engineer"",""Computer Systems Archi..."
3,d1b402364b59,0.436016,Salesforce Administrator: REMOTE – C-4 Analyti...,Salesforce Administrator: REMOTE,Salesforce Administrator REMOTE Salesforce Ad...,Computer User Support Specialists,"[""Applications Analyst"",""Automatic Data Proces..."
4,495cc0d7c26d,0.391316,DescriptionHCR ManorCare provides a range of s...,Floor Care,Floor Care DescriptionHCR ManorCare provides a...,Home Health Aides,"[""Care Giver"",""Care Worker"",""Caregiver"",""Certi..."
...,...,...,...,...,...,...,...
1045,05135334b09a,0.480261,Now Brewing – Future Leaders! #tobeapartnerSta...,assistant store manager,assistant store manager Now Brewing Future L...,Baristas,"[""Barista"",""Catering Barista"",""Coffee Bar Atte..."
1046,64e6676e0f5b,0.374481,Do you like to work on your feet and keep thin...,Merchandise and Stocking Associate - Sam's Club,Merchandise and Stocking Associate Sam s Clu...,Fast Food and Counter Workers,"[""Bakery Associate"",""Bistro Team Member"",""Cafe..."
1047,719b46a61e7e,0.330817,If you're a licensed CDL Driver looking to adv...,CDL Delivery Truck Driver (458),CDL Delivery Truck Driver 458 If you re a li...,Light Truck Drivers,"[""Baggageman"",""Bulk Delivery Driver"",""Car Esco..."
1048,c202907c89ba,0.453261,Truck Driver CDL (OTR) – Hazmat trucker | $117...,Truck Driver CDL (OTR) - Hazmat trucker,Truck Driver CDL OTR Hazmat trucker Truck ...,Recreational Vehicle Service Technicians,"[""Custom Van Converter"",""Hitch Technician"",""Ma..."


In [None]:
jobs_tfidf1.to_csv('jobs-tfidf1-answers.csv')  
jobs_tfidf2.to_csv('jobs-tfidf2-answers.csv')  
jobs_tfidf3.to_csv('jobs-tfidf3-answers.csv')  
jobs_sbert.to_csv('jobs-sbert-answers.csv') 

In [None]:
def analyze(idx):
  """Helper function used in results analysis."""
  return f"""**Input:**<br>
  jobs_tfidf1.title[{idx}] = {jobs_tfidf1.title[idx]} <br>
  jobs_tfidf1.jobdesc[{idx}] = {jobs_tfidf1.jobdesc[idx]}<br><br>
  **Output:**<br>
  jobs_tfidf1.matched_title[{idx}] = {jobs_tfidf1.matched_title[idx]} <br>
  jobs_tfidf2.matched_title[{idx}] = {jobs_tfidf2.matched_title[idx]}<br>
  jobs_tfidf3.matched_title[{idx}] = {jobs_tfidf3.matched_title[idx]}<br>
  jobs_sbert.matched_title[{idx}] = {jobs_sbert.matched_title[idx]}<br>"""

In [None]:
analyze(3)

# Result Analysis

Since the dataset is not associated with ground truth labels, I selected a few examples to compare the TF-IDF and SBERT methods.
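
Without labels, one rough sanity check is how often two methods return the same matched title. The sketch below is only an illustration: `agreement_rate` is a hypothetical helper, not part of the notebook, and the three lists hold the matched titles for rows 0 to 3 copied from the examples that follow. Agreement says nothing about which method is correct; it only shows where the methods diverge and are worth inspecting by hand.

```python
def agreement_rate(titles_a, titles_b):
    """Fraction of positions where two methods returned the same matched title."""
    if len(titles_a) != len(titles_b):
        raise ValueError("both sequences must have the same length")
    if len(titles_a) == 0:
        return 0.0
    same = sum(a == b for a, b in zip(titles_a, titles_b))
    return same / len(titles_a)


# Matched titles for rows 0-3, copied from Examples 1-4.
tfidf1 = ["Data Warehousing Specialists", "Supply Chain Managers",
          "Software Quality Assurance Analysts and Testers", "Sales Managers"]
tfidf3 = ["Stockers and Order Fillers", "Supply Chain Managers",
          "Software Developers", "Sales Managers"]
sbert = ["Logistics Engineers", "Pharmacy Technicians",
         "Computer Systems Engineers/Architects",
         "Computer User Support Specialists"]

print(agreement_rate(tfidf1, tfidf3))  # 0.5
print(agreement_rate(tfidf1, sbert))   # 0.0
```

On the full results, the same helper could be called with `jobs_tfidf1.matched_title.tolist()` and `jobs_sbert.matched_title.tolist()`.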

# Example 1

**Input:**<br>
  jobs_tfidf1.title[0] = Warehouse Associate <br>
  jobs_tfidf1.jobdesc[0] = We are currently seeking a Warehouse Associate at our Manassas, VA branch to receive, store, and distribute material, equipment, and products within warehouse and branch by performing the following duties: Provide quality service to customers using clear communication skills.Be knowledgeable regarding all Company products and services.Count, verify and manually unload incoming orders and shipments.Utilize a forklift for moving inventoryVerify and manually load orders on outgoing trucks.Verify and manually load outgoing orders onto customer vehicles.General yard and office maintenance to include cleaning and painting.Maintain neatness and cleanliness of warehouse.Maintain inventory in appropriate/designated storage areas in warehouse.Perform other duties as assigned by management.The Ideal Candidate Will Have:Must have some experience with warehouse, forklift, or other machineryPrefer high school graduate or GEDAbility to lift up to 100 pounds.Operation of forklift.Team player with good customer service skills.Ability to read, speak and write the English language.Must have reliable transportation to work each day.Must be able to physically and safely deliver products on roof.An Equal Opportunity Employerwww.becn.com<br><br>
  **Output:**<br>
  jobs_tfidf1.matched_title[0] = Data Warehousing Specialists <br>
  jobs_tfidf2.matched_title[0] = Stockers and Order Fillers<br>
  jobs_tfidf3.matched_title[0] = Stockers and Order Fillers<br>
  jobs_sbert.matched_title[0] = Logistics Engineers<br>


**Analysis:**<br>
With ngram=1, TF-IDF matched "Data Warehousing Specialists", which is the least appropriate result. With ngram=2 and ngram=3, it matched "Stockers and Order Fillers", which is perhaps the most appropriate of the alternatives. SBERT matched "Logistics Engineers", which, unlike "Data Warehousing Specialists", is not entirely disconnected from the job title and description.

Because TF-IDF relies on exact word matches rather than the semantics and context of the words, unigram TF-IDF returned an incorrect match here; as this example shows, adding higher-order n-grams can help. SBERT, despite not being fine-tuned on this specific corpus, identified a reasonably good match that does not depend solely on lexical overlap.
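
The effect of higher-order n-grams can be illustrated with a small sketch. It only counts shared n-grams rather than computing TF-IDF weights, and the three strings are invented toy text, not rows from *onet.csv* or *jobs.json*; `word_ngrams` and `shared_ngrams` are hypothetical helper names. In `TfidfVectorizer`, `ngram_range=(1, n)` adds such n-gram features on top of the unigram features.

```python
import re


def word_ngrams(text, n):
    """Return the set of word n-grams of exactly length n, lowercased."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def shared_ngrams(a, b, n):
    """N-grams of length n that occur in both texts."""
    return word_ngrams(a, n) & word_ngrams(b, n)


# Toy strings invented for illustration.
job = "warehouse associate unload shipments maintain warehouse inventory"
data_template = "design data warehouse model data warehouse systems"
stock_template = "stock shelves maintain warehouse inventory unload shipments"

for n in (1, 2):
    print(n, "data: ", sorted(shared_ngrams(job, data_template, n)))
    print(n, "stock:", sorted(shared_ngrams(job, stock_template, n)))
```

In this toy case the job text shares the unigram "warehouse" with the data-warehouse template but no bigram, while it shares three bigrams with the stocking template. Bigram features therefore add evidence only for the template that uses "warehouse" in the same context.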

# Example 2

**Input:**<br>
  jobs_tfidf1.title[1] = Supply Chain Associate II <br>
  jobs_tfidf1.jobdesc[1] = For 70 years, Charles River employees have worked together to assist in the discovery, development and safe manufacture of new drug therapies. When you join our family, you will have a significant impact on the health and well-being of people across the globe. Whether your background is in life sciences, finance, IT, sales or another area, your skills will play an important role in the work we perform. In return, we’ll help you build a career that you can feel passionate about.Job Summary The Supply Chain Associate II receives materials, ensures proper documentation of FEFO, MDS and MOS. Tracks overdue material. Adheres to cGMP compliance. Interacts with Manufacturing and QA/QC Departments for GMP materials release.  Essential Responsibilities:•    Receive controlled materials following established procedures. Ensure all materials are received and stored in controlled materials storage locations to ensure cGMP compliance.•    Receive all materials and deliver received shipment to recipients each day.•    Follow Good Manufacturing Practices (GMPs) and Standard Operating Procedures (SOPs) in daily activities across all three Vigene sites.•    Distribute controlled materials to manufacturing following FEFO and complete the required documentation as the materials are picked and distributed. Ensure all expired materials from Inventory are identified each month and discarded following SOP.•    Complete MSD’s and MOS or any other QC required forms.•    Assist Sr. Supply Chain Associate with inventory. Help with yearly cycle counting.•    Tracking of overdue materials. Work with procurement to successfully resolve the documentation required and request certificates or any other required documentation with Suppliers.•    Back-up Sr. 
Supply Chain Associate with picking and staging GMP materials requested by Manufacturing, tech transfer, Plasmid, or other departments.•    Assist in the training of support staff.Job Qualifications •    High School Diploma required. •    2+ years of GMP experience required.•    Knowledge of inventory management systems and understanding of Good Manufacturing Practices in Supply Chain/Manufacturing areas.•    Possess excellent communication and organizational skills.•    Microsoft Office experience (Word, Excel, PowerPoint, Teams etc.)•    Must be able to work flexible hours – must be willing to work outside of normally-scheduled hours as necessary.•    Serve as a backup as needed to drive company vehicle between sites in the Rockville area with valid driver’s license.Vaccine MandateCharles River is a U.S. Federal Contractor.  As a result, we must follow the Presidential Executive Order to mandate vaccinations, and ensure our employees are fully vaccinated against COVID-19.  Our main priority is the wellbeing, health, and safety of our people. We require proof of vaccination from all employees. Anyone with requests for disability-related and/or religious exemptions should contact Talent Acquisition (crrecruitment_US@crl.com) so that information can be provided about the accommodation process at Charles River. About Biologics Testing SolutionsWith more than 50 years of experience and proven regulatory expertise, the Charles River Biologics group can address challenging projects for biotechnology and pharmaceutical companies worldwide. Offering a variety of services such as contamination and impurity testing, protein characterization, bioassays, viral clearance studies and stability and lot release programs, we support clients throughout the biologic development cycle, from the establishment and characterization of cell banks through preclinical and clinical studies to marketed products. 
Whether clients need stand-alone services, a unique package of testing, or insourced support, our Biologics group can create a custom solution to suit their needs.  Each year more than 20,000 biologic testing reports are sent each and over 200 licenses products are supported by our biologics testing solutions team.About Charles RiverCharles River is an early-stage contract research organization (CRO). We have built upon our foundation of laboratory animal medicine and science to develop a diverse portfolio of discovery and safety assessment services, both Good Laboratory Practice (GLP) and non-GLP, to support clients from target identification through preclinical development. Charles River also provides a suite of products and services to support our clients’ clinical laboratory testing needs and manufacturing activities. Utilizing this broad portfolio of products and services enables our clients to create a more flexible drug development model, which reduces their costs, enhances their productivity and effectiveness to increase speed to market.With over 18,000 employees within 100 facilities in over 20 countries around the globe, we are strategically positioned to coordinate worldwide resources and apply multidisciplinary perspectives in resolving our client’s unique challenges. Our client base includes global pharmaceutical companies, biotechnology companies, government agencies and hospitals and academic institutions around the world. At Charles River, we are passionate about our role in improving the quality of people’s lives. Our mission, our excellent science and our strong sense of purpose guide us in all that we do, and we approach each day with the knowledge that our work helps to improve the health and well-being of many across the globe. 
We have proudly supported the development of &gt;80% of the drugs approved by the FDA for the past 3 years.Equal Employment OpportunityCharles River Laboratories is an Equal Opportunity Employer - all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, veteran or disability status. If you are interested in applying to Charles River Laboratories and need special assistance or an accommodation due to a disability to complete any forms or to otherwise participate in the resume submission process, please contact a member of our Human Resources team by sending an e-mail message to crrecruitment_US@crl.com. This contact is for accommodation requests for individuals with disabilities only and cannot be used to inquire about the status of applications. For more information, please visit www.criver.com.<br><br>
  **Output:**<br>
  jobs_tfidf1.matched_title[1] = Supply Chain Managers <br>
  jobs_tfidf2.matched_title[1] = Supply Chain Managers<br>
  jobs_tfidf3.matched_title[1] = Supply Chain Managers<br>
  jobs_sbert.matched_title[1] = Pharmacy Technicians<br>

  **Analysis:**<br>
  TF-IDF with ngram=1, 2, and 3 all matched "Supply Chain Managers", which is perhaps the most appropriate of the alternatives. SBERT matched "Pharmacy Technicians", which is entirely disconnected from the job title; however, the job description contains words such as "drugs" and "Laboratories" that fit this context, so the match is plausible from the description alone.



# Example 3
**Input:**<br>
  jobs_tfidf1.title[2] = Azure DevOps Engineer <br>
  jobs_tfidf1.jobdesc[2] = Looking for an individual who will be responsible for automating the application build and release pipeline as well as ensuring uptime for various projects and environments. The primary focus will be on automating the source code lifecycle from development through distribution. This role will also provide development and operational technical support, rotating on-call, project participation, and continuity support for the applications. The candidate will also participate in detection and troubleshooting of issues that affect delivery of our services across multiple platforms A comprehensive understanding of UnixLinux shell scripting, Windows environment and Oracle operations is required. Experience with Tandem is desired. Responsibilities include  bull Implement tools and automation for build, configuration management, continuous integration (CI), static and dynamic code analysis, deployment and application monitoring. Automate and evolve infrastructure, server, deployment strategies and testing to support and quick turnaround of deployments. bull Work with Development at all levels to assist with implementing CICD pipelines for all Services applications. bull Work closely with Monitoring Engineering to ensure all relevant KPIrsquos are implemented within the monitoring framework. bull Participate in all Production Support activities during incidents and outages as needed. Hands-on technical resource capable of resolving all technical issues within lower and upper environments, and making recommendation for performance and capacity improvements bull Continuously improve our infrastructure to be easy to deploy, scalable, secure and fault-tolerant. Participate in capacity planning, tuning systems stability, provisioning, performance, and scaling of the application infrastructure. 
bull Participate in on-call rotation  Basic Qualifications for Consideration  bull BS degree in Computer Science or equivalent experience bull Minimum of 6+ years of relevant, broad engineering experience is required. bull Works effectively both independently and as a member of a cross functional team bull An obsessive desire to automate as much as possible bull Experience in an agile development environment bull Experience with Azure DevOps andor TFS bull 4+ yearsrsquo experience with Linux and Unix systems bull 4+ yearsrsquo experience with Windows systems bull 3+ yearsrsquo experience supporting services, micro-services, and n-tier systems bull 4+ yearsrsquo experience with scripting languages such as Bash, Python, Perl bull 4+ yearsrsquo experience with software automation technologies bull 4+ yearsrsquo experience with Continuous Integration  Continuous Delivery practices bull 4+ yearsrsquo experience with Infrastructure-As-Code practices bull 4+ yearsrsquo experience with Configuration Management tools such as Ansible, Salt, Chef, Puppet bull 4+ yearsrsquo experience with build tools such as Jenkins, Maven, Ivy, Ant bull 4+ yearsrsquo experience with Version Control tools and platforms such as Git, GitHub, Subversion bull 4+ yearsrsquo experience with Collaboration platforms such as JIRA, Confluence, Wiki, Version one bull 4+ yearsrsquo experience with Containerization technologies such as Docker, Pivotal bull 4 yearsrsquo experience with Virtualization technologies such as KVM, Xen, VMWare bull 4+ yearsrsquo experience with Services such as Apache, NGINX, HAProxy, Varnish, PHP, Tomcat, Node.js, MySQL, MariaDB  Experience with the Tandem non-stop platforms is a plus<br><br>
  **Output:**<br>
  jobs_tfidf1.matched_title[2] = Software Quality Assurance Analysts and Testers <br>
  jobs_tfidf2.matched_title[2] = Software Quality Assurance Analysts and Testers<br>
  jobs_tfidf3.matched_title[2] = Software Developers<br>
  jobs_sbert.matched_title[2] = Computer Systems Engineers/Architects<br>

  **Analysis:**<br>
  TF-IDF matched "Software Quality Assurance Analysts and Testers" with ngram=1 and ngram=2 and "Software Developers" with ngram=3, while SBERT matched "Computer Systems Engineers/Architects". All of these are valid choices.

# Example 4
**Input:**<br>
  jobs_tfidf1.title[3] = Salesforce Administrator: REMOTE <br>
  jobs_tfidf1.jobdesc[3] = Salesforce Administrator: REMOTE – C-4 AnalyticsC-4 Analytics is a fast-growing, private, digital marketing company that excels at helping automotive dealerships increase sales, increase market share, and lower cost per acquisition. We are currently hiring for a Salesforce Administrator: REMOTE, as we look to expand our team and support our growing roster of local and national clients.Who We're Looking For: Salesforce Administrator - REMOTEYou are eager, curious, Salesforce master who understands sales process needs, you have strong communication skills and have a love of organization and teamwork including outreach to sales team members and communication with many different personalities. Love of the automotive industry is a definite plus, as we are a leader in that space.As Salesforce Administrator, you'll be part of a strong sales support team responsible for supporting the larger sales team on all needs required to keep the sales process running smoothly. You will have a significant, measurable impact on the company's sales process and overall sales success.A day in the life of a Salesforce Administrator: REMOTEWork closely with the Chief Sales Officer, other executives and sales operations team members to implement and manage critical sales related processesComplete responsibility for the Salesforce.com system, and additional application integrations (Salesloft, Pardot, Outreach, Zoom Info, etc.)Working closely with the Marketing Team to configure Pardot and develop best practicesProvide reporting and dashboards as requested by leadershipHandle system architecture and design to meet business needsResponsible for high data integrity in SalesforceEnsuring optimal performance of Salesforce systems and productsUpgrading and configuring Salesforce systems and productsManaging Salesforce roles, profiles, sharing rules, workflows, and groupsPerforming database maintenance tasks, including diagnostic tests and duplicate entry 
cleansingEvaluating and installing new Salesforce releases, as well as providing training and supportDocumenting processes, including error reports and changes to field history tablesThis is a fast-paced role with a rapidly growing company. Communication skills (verbal and written), the ability to work in an interrupt-drive environment, multi-tasking, organization, and collaborative skills are prerequisites for success. The Salesforce Administrator must be comfortable leading others, directing others, and working within a highly visible position.Requirements:Bachelor's degree in Computer Science, Management Information Systems, or related fieldSalesforce certified administrator or Salesforce advanced administrator certification2-3 years of experience as a Salesforce administrator in a similar environment Extensive experience in the administration and maintenance of Salesforce systemsProficient with all aspects of Salesforce and Salesforce integrationsExperience in converting historical Salesforce Classic Automations and validation rules to Lightning using Flows and Process BuilderExcellent interpersonal and customer service skillsExcellent organizational skills and attention to detailStrong analytical and problem-solving skillsC-4 Analytics is a full-service advertising and digital marketing company committed to developing innovative solutions for every dealer in every market, and to providing the highest levels of accountability and customer service. 
Key details about our company include:We have 200-plus employees across Client Services, SEO/Content, Paid Search, Creative, Social Media, Product, Sales and Operations teamsWe specialize in automotive digital marketing, and the rest of the industry follows our leadWe have three main offices: our headquarters in Wakefield, MA, and offices in Chicago, IL and Ann Arbor, MIEmployee Perks & Bragging RightsCompetitive salaries and benefits packages, including 401k matchHands-on training opportunities with leading companies like Google and FacebookWeekly Innovation Hours and Lunch-and-Learns for employee development8-time National Best and Brightest Places to Work For WinnerAgency-wide volunteer days and company-sponsored team outingsBest-in-industry client-to-employee ratioWhat our Employees Say:Want to know what makes working at C-4 Analytics so rewarding? Take it from the true experts: our current teammates. Recent surveys about our workplace and culture suggest that our staff loves:The People: It's not just a cliché; we have the best, hungriest and smartest team in the business.The Culture: Teamwork. Camaraderie. Perseverance. We hire for these traits, and it shows.The Growth: We place a real emphasis on training, development and career planning.The Trust: Our managers empower their people and teams to thrive in their own ways.The Challenge: We work in a competitive industry and a dynamic field. You'll never be bored!More About C-4 AnalyticsC-4 Analytics is a full-service advertising and digital marketing company. We take the guesswork out of advertising. We don’t over-promise: we over-deliver. We provide real value to our clients because we really value them as partners. We love Google and Facebook, but also love Instagram and Bing. We innovate, educate and instigate. We are forward-thinking, but we learn from the past. We are results-driven and our strategies drive results. 
We love the practical applications of psychology to marketing, but we aren’t above a good practical joke. We are team players, but we love to crush our competitors. We create an environment of respect and we respect the environment. We are the brains and the good looks. We are very humble. We are nerds, but cool, likeable nerds. We are never gonna give you up. Never gonna let you down. We are all work and all play. We calculated that only 15.8% of visitors who started this paragraph would actually read this far down. We are C-4 Analytics.Want to know more? Want to become part our of SalesOps team? Ready to step up to the challenge? Send us your resume, along with a brief introduction explaining how you can help us continue to grow and deliver the highest level of client service.Powered by JazzHR3md273awV2<br><br>
  **Output:**<br>
  jobs_tfidf1.matched_title[3] = Sales Managers <br>
  jobs_tfidf2.matched_title[3] = Sales Managers<br>
  jobs_tfidf3.matched_title[3] = Sales Managers<br>
  jobs_sbert.matched_title[3] = Computer User Support Specialists<br>

  **Analysis:**<br>
  TF-IDF with ngram=1, 2, and 3 all matched "Sales Managers", which is the least appropriate result. SBERT matched "Computer User Support Specialists", which is the most appropriate of the alternatives.

Because TF-IDF relies on exact word matches rather than the semantics and context of the words, both unigram TF-IDF and the higher n-gram variants returned an incorrect match here. SBERT, despite not being fine-tuned on this specific corpus, identified a reasonably good match that does not depend solely on lexical overlap.

# Conclusion

From these examples, we can see how the bag-of-words TF-IDF method can retrieve entirely incorrect or irrelevant matches (Examples 1 and 4), whereas SBERT returned a match related to the job description in every example, even when better alternatives existed. For optimal performance, we can consider combining bag-of-words lexical matching with semantic similarity measures such as SBERT. As this is essentially an information-retrieval task, we can first use the job title as a query and use BM25 to retrieve the top N (say, 10) relevant job titles from ONET, and then use SBERT to rerank these candidates; this approach is inspired by [1].
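
A minimal sketch of this retrieve-then-rerank idea, under stated assumptions: BM25 is implemented by hand with common default parameters (`k1=1.5`, `b=0.75`), the three template strings are invented toy text, and `token_overlap` is only a stand-in for the reranker so the example runs without a model. In the notebook, the reranker would presumably be the cosine similarity of `model.encode` embeddings. All function names here are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of `query` against every document in `docs`."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    doc_freq = Counter(term for t in tokenized for term in set(t))
    query_terms = set(tokenize(query))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def token_overlap(a, b):
    """Jaccard overlap of token sets; a stand-in for SBERT cosine similarity."""
    sa, sb = set(tokenize(a)), set(tokenize(b))
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0


def retrieve_then_rerank(title, description, docs, rerank_score, top_n=10):
    """Take the top_n BM25 candidates for the title, then pick the best by rerank_score."""
    scores = bm25_scores(title, docs)
    candidates = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)[:top_n]
    return max(candidates, key=lambda i: rerank_score(description, docs[i]))


# Toy templates invented for illustration.
docs = [
    "heavy truck drivers drive tractor trailer trucks to deliver goods",
    "light truck drivers drive small trucks or vans to deliver packages",
    "baristas prepare and serve coffee drinks",
]
best = retrieve_then_rerank("CDL truck driver", "deliver packages in small vans",
                            docs, token_overlap, top_n=2)
print(docs[best])
```

Here BM25 on the title alone keeps both truck-driver templates as candidates and ranks the heavy-truck one slightly higher, and the reranker then selects the light-truck template based on the description.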

[1] Nogueira, Rodrigo, and Kyunghyun Cho. "Passage Re-ranking with BERT." arXiv preprint arXiv:1901.04085 (2019).
