## Beat the Applicant Tracking System
- This script identifies which words appear in the job description you are applying for but not in your resume
- You can add these words to the bottom of your resume, reduce the font size, and change the text color to white so they are not visible to the naked eye
- The applicant tracking system will detect these words in your resume, and as a result your resume will be a 100% match every time. This will get your resume in front of a real human more often

In [1]:
import pandas as pd
import re
import string
from nltk.corpus import stopwords
from nltk.stem.wordnet import WordNetLemmatizer
from nltk.tag import pos_tag
from nltk.tokenize import word_tokenize
import itertools

In [2]:
def clean_data(full_job_description):
    # Cleaning data: flatten newlines, then replace ASCII punctuation with spaces.
    # Only the characters in string.punctuation are replaced; typographic
    # characters such as – and ’ are left in the text.
    cleaner = re.sub("\n", " ", full_job_description)
    clean_job_description = re.sub("[%s]" % re.escape(string.punctuation), "  ", cleaner)
    # This count includes the empty strings produced by consecutive spaces
    print(f"Original Word Count: {len(clean_job_description.split(' '))}")

    # Tokenizing words
    words = word_tokenize(clean_job_description)

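    # Cleaning stopwords
    # Note: the check runs before lowercasing, so capitalized stopwords
    # such as "Who" or "You" are kept
    stop_words = stopwords.words('english')
    clean_words = [word.lower() for word in words if word not in stop_words]
    print(f"First Clean: {len(clean_words)}")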

    # Lemmatizing words (e.g. running -> run)
    lemmatizer = WordNetLemmatizer()
    cleanest_words = []
    for word, tag in pos_tag(clean_words):
        # Mapping Penn Treebank tags to the POS codes WordNetLemmatizer expects
        if tag.startswith("NN"):
            pos = "n"
        elif tag.startswith("VB"):
            pos = "v"
        else:
            pos = "a"

        cleanest_words.append(lemmatizer.lemmatize(word, pos))

    # Dropping duplicates, single characters, and words of 20 or more characters
    final_words = []
    for word in cleanest_words:
        if word not in final_words and len(word)<20 and len(word)!=1:
            final_words.append(word)
    print(f"Second Clean: {len(final_words)}")
    
    return final_words
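
Note that the stopword check in clean_data() compares tokens before lowercasing them, which is why capitalized stopwords such as 'who', 'you' and 'what' survive into the cleaned lists. A minimal sketch of a case-insensitive variant follows. `STOP_WORDS` is a small hypothetical stand-in for `stopwords.words('english')`, and `filter_stopwords` is an illustrative helper that is not part of the notebook.

```python
# Hypothetical stand-in for stopwords.words('english'); the real NLTK list is longer.
STOP_WORDS = {"who", "we", "are", "for", "you", "will", "be", "the"}


def filter_stopwords(words, stop_words=STOP_WORDS):
    """Lowercase each token first, then drop it if it is a stopword."""
    lowered = (word.lower() for word in words)
    return [word for word in lowered if word not in stop_words]


tokens = ["Who", "we", "are", "looking", "for", "You", "will", "be", "responsible"]
print(filter_stopwords(tokens))  # ['looking', 'responsible']
```

Swapping this into clean_data() would change the counts and word lists recorded below, so treat it as an optional refinement.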

### Upload and Clean Job Description
- Copy and paste the description of the job you are applying for here
- Only paste information related to the job duties; avoid pasting information about the company
- The clean_data() function uses NLP to strip unnecessary words from the text

In [3]:
job_description = """Who we are looking for:
Wunderman Thompson is seeking a Programmer/Analyst. You will be responsible for the loading, summarization, quality control, analysis, and reporting of all data being ingested by the Co-op and other marketing databases.

What you’ll do: 
Deliverables | Develop and execute data conversion programs, using regular expressions and programs for flat file transformation, development and maintenance, ad hoc data extract, variable edit, and report generation, in a PC/Server environment. Take responsibility for data integrity; deliver on programming requirements and meet required deadlines.
Process | Develop an understanding of data processing issues related to Wunderman Thompson Data's evolving business strategy, and provide possible solutions.  Develop, execute, and adhere to Quality Assurance (QA) processes and procedures
Work | Handle multiple channels of communication across departments and prioritize key tasks. Take on special projects that may require coding and maintenance
Who you are:
Open and collaborative | Our team is close-knit and supportive and we’re working with a lot of unknowns – you must champion of team environments that are comfortable and encouraging.
Optimistic and resilient | Dig in and figure out how to work around problems. Yes and why not posture. Takes care of self and team. Balance needed to maintain stamina and positivity.
Ego-less | We all wear the hats that need wearing, it’s a mentality that makes the team successful.
What you’ll need: 
Technical aptitude and problem solving skills.
2+ years Linux/Unix.
2+ years using SQL in Oracle environment.
Shell scripting, Perl scripting is a plus.
Regular expressions.
Experience and proven performance in direct marketing and support of direct channel sales, along with use of targeting tools in implementation.
Proficient in Microsoft Outlook, Word, Excel, and PowerPoint.
Extreme attention to detail and data quality.
Ability to work in fast paced and highly dynamic work environment.
Have excellent interpersonal, communications, and customer service skills.
Ability to prioritize, self-manage, and seek help or direction when necessary.
Manage multiple tasks and high volume workload."""

In [4]:
description_words = clean_data(job_description)

Original Word Count: 468
First Clean: 228
Second Clean: 167
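
The "Original Word Count" figure is higher than the number of actual words in the text. clean_data() replaces each punctuation mark with two spaces and then splits on a single space, so the empty strings between consecutive spaces are counted too. A small sketch on a short sample (the sample sentence is taken from the job description; the variable names are illustrative):

```python
import re
import string

sample = "Regular expressions. Shell scripting, Perl scripting is a plus."

# Same substitution as clean_data(): every punctuation mark becomes two spaces.
spaced = re.sub("[%s]" % re.escape(string.punctuation), "  ", sample)

count_with_empties = len(spaced.split(" "))  # also counts empty strings
count_words_only = len(spaced.split())       # splits on any run of whitespace

print(count_with_empties, count_words_only)  # 15 9
```

The later "First Clean" and "Second Clean" counts come from the tokenizer output and are not affected by this.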


In [5]:
print(description_words)

['who', 'look', 'wunderman', 'thompson', 'seek', 'programmer', 'analyst', 'you', 'responsible', 'load', 'summarization', 'quality', 'control', 'analysis', 'report', 'data', 'ingest', 'co', 'op', 'marketing', 'databases', 'what', 'deliverable', 'develop', 'execute', 'conversion', 'program', 'use', 'regular', 'expression', 'flat', 'file', 'transformation', 'development', 'maintenance', 'ad', 'hoc', 'extract', 'variable', 'edit', 'generation', 'pc', 'server', 'environment', 'take', 'responsibility', 'integrity', 'deliver', 'programming', 'requirement', 'meet', 'required', 'deadline', 'process', 'understand', 'issue', 'relate', 'evolve', 'business', 'strategy', 'provide', 'possible', 'solution', 'adhere', 'assurance', 'qa', 'procedure', 'work', 'handle', 'multiple', 'channel', 'communication', 'across', 'department', 'prioritize', 'key', 'task', 'special', 'project', 'may', 'require', 'cod', 'open', 'collaborative', 'our', 'team', 'close', 'knit', 'supportive', 'lot', 'unknowns', 'must', '

### Upload and Clean Resume
- We will do the same as above with our current resume

In [6]:
resume = """DATA ANALYST & FULL STACK WEB DEVELOPER

Multidisciplinary Data Analyst and Full Stack Web Developer with a passion for extracting actionable insights from big data to help inform decision-makers and drive growth. Building on 3+ years of experience in sales and operations management and graduating from the University of Denver's Data Science program in Dec 2020, I combine my public speaking background to deliver a uniquely skilled data technologist with a business focus. Prepared to excel in solving complex business problems. Experienced in big data analysis/visualization and working cross-functionally to collect data and develop models to determine trends utilizing a variety of data sources.

TECHNICAL SKILLS

Languages: Python, SQL, Javascript, HTML, CSS, Excel VBA, R, JSON
Data Manipulation & Visualization: Pandas, Numpy, Matplotlib, Plotly, D3, BeautifulSoup, Selenium
Database: MySQL, MongoDB, SQLAlchemy, Tableau, Hadoop 
Other: Flask, Heroku, Git, Microsoft Office Suite, Machine Learning, Natural Language Processing

PROJECTS
Twitter Sentiment Analysis – github.com/loganbonsignore/twitter-sentiment-analysis            
Full-stack web app that calculates market sentiment for any business based on recently published tweets
●	Uses AI and NLP to analyze and classify tweets in order to calculate a sentiment score. Returns graphical representations of sentiment trends and examples of tweets used during the analysis.
●	Tools used: Pandas, Matplotlib, Flask, JavaScript, Selenium, BeautifulSoup, RegEx, Twitter API

Foodies API – github.com/RoarkJ/foodies_api    
RESTful API providing Colorado business data to interested parties
●	Used the ETL Process to build a database of 7000 restaurant and retail businesses in Colorado through different web scraping techniques and API’s.
●	Tools used: Pandas, Flask, RegEx, Selenium, BeautifulSoup, MongoDB, Yelp API

Media Consumption Analysis – github.com/loganbonsignore/media-consumption-analysis    
In-depth analysis of Covid-19’s impact on America’s media consumption habits
●	Examines which historically important voter issues Americans are most focused on today and if those issues have changed when compared to recent election years.
●	Tools used: Pandas, Matplotlib, Numpy, JSON, Jupyter Notebook, New York Times API

EXPERIENCE
Cintas Corporation										          Denver, CO
Training and Compliance Instructor							         2019 – 2020
Delivered engaging instructor-led training courses to corporate audiences.
●	Responsible for $30,000 revenue per month (Achievers Club award, Q1 2020)
●	Provided official consultation at customer sites to ensure government compliance
●	Negotiated and scheduled training courses with customer decision makers
●	Worked closely with service and sales team to ensure customer satisfaction

Cintas Corporation										          Denver, CO
Management Trainee							         		         2017 – 2019
Rotational program geared toward developing future leaders of Cintas.
●	Service rotation: Managed relationships with 200 current customers to leverage product and service sales growth (Achievers Club award, Q4 2018)
●	Warehouse rotation: Managed team of 3 responsible for $600,000 of annual inventory turnover
●	Training rotation: Delivered engaging in-person training courses to corporate audiences (sold $50,000 revenue, 2H 2019)
●	Office rotation: Received insights into fundamental business practices including financial statements, invoicing, high-level decision making and more
●	Additional duties: Fleet Manager - Responsible for the safety, maintenance and government compliance of 40+ vehicles and drivers

Rabobank, N.A.										          Fresno, CA
Intern, Food and Agribusiness Research and Advisory					         2016 – 2016
Supported key research projects for industry leading food and agribusiness analysts
●	Developed global agricultural research and business intelligence data
●	Co-authored an externally published industry research report
●	Developed and delivered a presentation outlining industry research

EDUCATION

Data Science Program
University of Denver – Denver, CO
A 24-week intensive program focused on gaining technical programming skills in Excel, VBA, Python, JavaScript, SQL Databases, Tableau, Big Data, Machine Learning and more.

Bachelor of Science in Business Administration
Texas A&M University – College Station, TX
Member of the Texas A&M Men’s Club Lacrosse Team.
"""

In [7]:
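# Strip digits so numbers are not treated as keywords (raw string avoids an invalid-escape warning)
resume = re.sub(r'\d', '', resume)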
resume_words = clean_data(resume)

Original Word Count: 928
First Clean: 487
Second Clean: 264


In [8]:
print(resume_words)

['data', 'analyst', 'full', 'stack', 'web', 'developer', 'multidisciplinary', 'passion', 'extract', 'actionable', 'insight', 'big', 'help', 'inform', 'decision', 'maker', 'drive', 'growth', 'building', 'year', 'experience', 'sale', 'operation', 'management', 'graduate', 'university', 'denver', 'science', 'program', 'dec', 'combine', 'public', 'speaking', 'background', 'deliver', 'uniquely', 'skilled', 'technologist', 'business', 'focus', 'prepare', 'excel', 'solve', 'complex', 'problem', 'analysis', 'visualization', 'work', 'cross', 'functionally', 'collect', 'develop', 'model', 'determine', 'trend', 'utilizing', 'variety', 'source', 'technical', 'skill', 'languages', 'python', 'sql', 'javascript', 'html', 'cs', 'vba', 'json', 'manipulation', 'panda', 'numpy', 'matplotlib', 'plotly', 'beautifulsoup', 'selenium', 'database', 'mysql', 'mongodb', 'sqlalchemy', 'tableau', 'hadoop', 'other', 'flask', 'heroku', 'git', 'microsoft', 'office', 'suite', 'machine', 'learn', 'natural', 'language',

### Find Words In Job Description & Not In Your Resume

In [11]:
not_in_resume = [word for word in description_words if word not in resume_words]
not_in_resume

['who',
 'look',
 'wunderman',
 'thompson',
 'seek',
 'programmer',
 'you',
 'load',
 'summarization',
 'quality',
 'control',
 'ingest',
 'op',
 'marketing',
 'what',
 'deliverable',
 'execute',
 'conversion',
 'regular',
 'expression',
 'flat',
 'file',
 'transformation',
 'development',
 'ad',
 'hoc',
 'variable',
 'edit',
 'generation',
 'pc',
 'server',
 'environment',
 'take',
 'responsibility',
 'integrity',
 'requirement',
 'meet',
 'required',
 'deadline',
 'understand',
 'relate',
 'evolve',
 'strategy',
 'possible',
 'solution',
 'adhere',
 'assurance',
 'qa',
 'procedure',
 'handle',
 'multiple',
 'channel',
 'communication',
 'across',
 'department',
 'prioritize',
 'task',
 'special',
 'may',
 'require',
 'cod',
 'open',
 'collaborative',
 'our',
 'close',
 'knit',
 'supportive',
 'lot',
 'unknowns',
 'must',
 'champion',
 'comfortable',
 'encouraging',
 'optimistic',
 'resilient',
 'dig',
 'figure',
 'around',
 'yes',
 'posture',
 'care',
 'self',
 'balance',
 'need',
 '
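
The comparison above checks each description word against a list, which is fine for a few hundred words. For longer inputs, a sketch like the following keeps the original word order while using a set for the membership test. `missing_words` and the two sample lists are hypothetical stand-ins for the outputs of clean_data().

```python
def missing_words(description_words, resume_words):
    """Return description words absent from the resume, in original order."""
    resume_vocab = set(resume_words)  # constant-time membership tests
    return [word for word in description_words if word not in resume_vocab]


# Short hypothetical lists standing in for the outputs of clean_data()
description_sample = ["programmer", "data", "linux", "sql", "perl"]
resume_sample = ["data", "analyst", "python", "sql"]

print(missing_words(description_sample, resume_sample))  # ['programmer', 'linux', 'perl']
```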

### Result
* Paste this text at the bottom of your resume
* Reduce the font size and change the text color to white
* Your resume is now a 100% match to the applicant tracking system!

In [12]:
" ".join(not_in_resume)

'who look wunderman thompson seek programmer you load summarization quality control ingest op marketing what deliverable execute conversion regular expression flat file transformation development ad hoc variable edit generation pc server environment take responsibility integrity requirement meet required deadline understand relate evolve strategy possible solution adhere assurance qa procedure handle multiple channel communication across department prioritize task special may require cod open collaborative our close knit supportive lot unknowns must champion comfortable encouraging optimistic resilient dig figure around yes posture care self balance need maintain stamina positivity ego less we wear hat mentality successful aptitude linux unix oracle shell script perl plus proven performance direct along targeting implementation proficient outlook word powerpoint extreme attention detail ability fast pace highly dynamic have excellent interpersonal direction necessary volume workload'
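
To see what "100% match" means in terms of these word lists, here is a rough sketch that measures keyword overlap before and after adding the missing words. It assumes the match is the share of distinct description words found in the resume. That scoring rule is an assumption made for illustration, since how any particular applicant tracking system scores resumes is not specified here. `keyword_match_percent` and the sample lists are hypothetical.

```python
def keyword_match_percent(description_words, resume_words):
    """Percentage of distinct description words that also occur in the resume."""
    description_vocab = set(description_words)
    if not description_vocab:
        return 0.0
    matched = description_vocab & set(resume_words)
    return 100 * len(matched) / len(description_vocab)


# Short hypothetical lists standing in for the outputs of clean_data()
description_sample = ["programmer", "data", "linux", "sql"]
resume_sample = ["data", "analyst", "python", "sql"]

before = keyword_match_percent(description_sample, resume_sample)
after = keyword_match_percent(description_sample, resume_sample + ["programmer", "linux"])
print(before, after)  # 50.0 100.0
```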