# Entity Extraction with Generative Models

This notebook demonstrates how to use Cohere's generative models to extract entity labels from job descriptions. This is a form of Named Entity Recognition (NER) for entities that are harder to isolate using other NLP methods, and where pre-training gives the model some context on these entities.

In [None]:
!pip install cohere requests tqdm

Import the packages:

In [1]:
import cohere
import pandas as pd
import requests
import datetime
from tqdm import tqdm
import json
pd.set_option('display.max_colwidth', None)

In [2]:
def get_post_titles(**kwargs):
    """Get submission titles from the Pushshift API. Read more: https://github.com/pushshift/api"""
    base_url = "https://api.pushshift.io/reddit/search/submission/"
    # Keyword arguments are passed through as query-string parameters
    response = requests.get(base_url, params=kwargs)
    response.raise_for_status()  # fail early on HTTP errors
    return [post['title'] for post in response.json()['data']]
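
The keyword arguments given to `get_post_titles` end up as query-string parameters on the request URL. The sketch below shows roughly what that encoding looks like for the parameters this notebook uses later (`size`, `after`, `before`, `subreddit`, `sort_type`, `sort`). It uses the standard library's `urlencode` as a stand-in for the encoding that `requests` performs, and `to_epoch_utc` is a hypothetical helper. It also pins the timestamps to UTC, which is an assumption: a naive `datetime` would be interpreted in the local timezone instead.

```python
import datetime
from urllib.parse import urlencode


def to_epoch_utc(year, month, day):
    """Return the Unix timestamp (whole seconds) for midnight UTC on a date."""
    moment = datetime.datetime(year, month, day, tzinfo=datetime.timezone.utc)
    return int(moment.timestamp())


# Same parameter names as the get_post_titles call later in the notebook
params = {
    "size": 5,
    "after": str(to_epoch_utc(2021, 1, 1)),
    "before": str(to_epoch_utc(2022, 1, 1)),
    "subreddit": "jobs",
    "sort_type": "score",
    "sort": "desc",
}

# Encode the parameters the way they would appear after the "?" in the URL
query_string = urlencode(params)
print(query_string)
```

Pinning the timezone keeps the query window the same no matter where the notebook runs.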

In [3]:
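import os

# Create and retrieve a Cohere API key from os.cohere.ai, then store it in an
# environment variable (the name COHERE_API_KEY is an assumption) instead of
# hardcoding it in the notebook
api_key = os.environ['COHERE_API_KEY']
co = cohere.Client(api_key)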

In [4]:
df = pd.read_json('../data/relations_dev.txt')

In [5]:
df

Unnamed: 0,document,tokens,relations
0,"Bachelor's degree in Mechanical Engineering or Physical Science 3+ years track record of developing or specifying fiber optic cables and connector related products Knowledge of fiber optic component, cabling, and interconnect products, technologies, and standards Experience in statistical data analysis Experience with product life cycle management (PLM) process Experience providing solutions to problems and meeting deadlines Experience engaging stakeholders PREFERRED Advanced degree Experience using a software tool for statistical data analysis such as JMP Experience using Agile as product life-cycle management tool Data center or other mission critical development experience","[{'text': 'Bachelor', 'start': 0, 'end': 8, 'token_start': 0, 'token_end': 0, 'entityLabel': 'DIPLOMA'}, {'text': 'Mechanical Engineering', 'start': 21, 'end': 43, 'token_start': 4, 'token_end': 5, 'entityLabel': 'DIPLOMA_MAJOR'}, {'text': 'Physical Science', 'start': 47, 'end': 63, 'token_start': 7, 'token_end': 8, 'entityLabel': 'DIPLOMA_MAJOR'}, {'text': '3+ years', 'start': 64, 'end': 72, 'token_start': 9, 'token_end': 11, 'entityLabel': 'EXPERIENCE'}, {'text': 'developing', 'start': 89, 'end': 99, 'token_start': 15, 'token_end': 15, 'entityLabel': 'SKILLS'}, {'text': 'fiber optic cables', 'start': 114, 'end': 132, 'token_start': 18, 'token_end': 20, 'entityLabel': 'SKILLS'}, {'text': 'connector related products', 'start': 137, 'end': 163, 'token_start': 22, 'token_end': 24, 'entityLabel': 'SKILLS'}]","[{'child': 4, 'head': 0, 'relationLabel': 'DEGREE_IN'}, {'child': 7, 'head': 0, 'relationLabel': 'DEGREE_IN'}, {'child': 15, 'head': 9, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 18, 'head': 9, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 22, 'head': 9, 'relationLabel': 'EXPERIENCE_IN'}]"
1,"10+ years of software engineering work experience. Technical experience in release automation engineering, CI/CD or related roles. Experience building and leading a software organization through product design, delivery and commercialization of consumer electronics devices. Experience recruiting and managing technical teams, including performance management. BS/MS in Computer Science. Experience in leading timeline, multi-partner initiatives. Organizational communication and coordination experience. PREFERRED 5+ years of experience with hands-on technical management, release engineering, tools engineering, DevOps, or related area.","[{'text': '10+ years', 'start': 0, 'end': 9, 'token_start': 0, 'token_end': 2, 'entityLabel': 'EXPERIENCE'}, {'text': 'software engineering', 'start': 13, 'end': 33, 'token_start': 4, 'token_end': 5, 'entityLabel': 'SKILLS'}, {'text': '5+ years', 'start': 515, 'end': 523, 'token_start': 77, 'token_end': 79, 'entityLabel': 'EXPERIENCE'}, {'text': 'technical management', 'start': 552, 'end': 572, 'token_start': 86, 'token_end': 87, 'entityLabel': 'SKILLS'}, {'text': 'release engineering', 'start': 574, 'end': 593, 'token_start': 89, 'token_end': 90, 'entityLabel': 'SKILLS'}, {'text': 'tools engineering', 'start': 595, 'end': 612, 'token_start': 92, 'token_end': 93, 'entityLabel': 'SKILLS'}, {'text': 'DevOps', 'start': 614, 'end': 620, 'token_start': 95, 'token_end': 95, 'entityLabel': 'SKILLS'}, {'text': 'BS/MS', 'start': 361, 'end': 366, 'token_start': 53, 'token_end': 55, 'entityLabel': 'DIPLOMA'}, {'text': 'Computer Science', 'start': 370, 'end': 386, 'token_start': 57, 'token_end': 58, 'entityLabel': 'DIPLOMA_MAJOR'}]","[{'child': 4, 'head': 0, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 86, 'head': 77, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 89, 'head': 77, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 92, 'head': 77, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 95, 'head': 77, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 57, 
'head': 53, 'relationLabel': 'DEGREE_IN'}]"
2,"3+ years Swift & Objective-C and experience with iOS internals Experience building an entire app from scratch and ideally a portfolio of apps featured in the App Store Someone who knows every trick in the book on UI transitions, network communication and memory/battery efficiency Strong UI/design skill experience is a plus","[{'text': '3+ years', 'start': 0, 'end': 8, 'token_start': 0, 'token_end': 2, 'entityLabel': 'EXPERIENCE'}, {'text': 'Swift & Objective-C', 'start': 9, 'end': 28, 'token_start': 3, 'token_end': 7, 'entityLabel': 'SKILLS'}]","[{'child': 3, 'head': 0, 'relationLabel': 'EXPERIENCE_IN'}]"
3,"8+ years experience in software engineering leadership 5+ years people management experience including managing leaders and managing remotely across regions Strategic thinker with proven track record of transforming operations to provide customer experience through innovation and improvement Track record of working with VP, C-level Executives Experience deploying operational support models across enterprise organizations Communication/presentations experience Experience working with all levels of management internally and externally Experience meeting objectives in an entrepreneurial environment Collaboration and relationship-building experience BA/BS degree or higher","[{'text': '8+ years', 'start': 0, 'end': 8, 'token_start': 0, 'token_end': 2, 'entityLabel': 'EXPERIENCE'}, {'text': 'software engineering', 'start': 23, 'end': 43, 'token_start': 5, 'token_end': 6, 'entityLabel': 'SKILLS'}, {'text': '5+ years', 'start': 55, 'end': 63, 'token_start': 8, 'token_end': 10, 'entityLabel': 'EXPERIENCE'}, {'text': 'people management', 'start': 64, 'end': 81, 'token_start': 11, 'token_end': 12, 'entityLabel': 'SKILLS'}, {'text': 'managing leaders', 'start': 103, 'end': 119, 'token_start': 15, 'token_end': 16, 'entityLabel': 'SKILLS'}]","[{'child': 5, 'head': 0, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 11, 'head': 8, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 15, 'head': 8, 'relationLabel': 'EXPERIENCE_IN'}]"
4,"BS degree in Computer Science or related field. 7+ years C++ experience, including C++11 features and principles. 5+ years experience creating software for real-time environments such as games or robotics. 2+ years experience managing software engineers. Proven track record of software development, including shipping one or more products on large code bases that span platforms and tools. Problem solving and optimization experience. Communication experience and demonstrated experience working across disciplines and teams to drive solutions. PREFERRED Hands-on experience with 3D computer vision algorithms including Calibration, SLAM, Reconstruction, Mapping, Localization, Sensor Fusion, State Estimation and Image Processing Experience with designing (products or open-source/git software) of inertial/optical sensing devices Publication in leading workshops or conferences such as CVPR, ECCV, ICCV, SIGGRAPH, ICCP, RSS, ICRA, etc.","[{'text': 'BS', 'start': 0, 'end': 2, 'token_start': 0, 'token_end': 0, 'entityLabel': 'DIPLOMA'}, {'text': 'Computer Science', 'start': 13, 'end': 29, 'token_start': 3, 'token_end': 4, 'entityLabel': 'DIPLOMA_MAJOR'}, {'text': '7+ years', 'start': 48, 'end': 56, 'token_start': 9, 'token_end': 11, 'entityLabel': 'EXPERIENCE'}, {'text': 'C++', 'start': 57, 'end': 60, 'token_start': 12, 'token_end': 12, 'entityLabel': 'SKILLS'}, {'text': 'C++11', 'start': 83, 'end': 88, 'token_start': 16, 'token_end': 16, 'entityLabel': 'SKILLS'}, {'text': '5+ years', 'start': 114, 'end': 122, 'token_start': 21, 'token_end': 23, 'entityLabel': 'EXPERIENCE'}, {'text': 'creating software for real-time environments', 'start': 134, 'end': 178, 'token_start': 25, 'token_end': 31, 'entityLabel': 'SKILLS'}, {'text': 'games', 'start': 187, 'end': 192, 'token_start': 34, 'token_end': 34, 'entityLabel': 'SKILLS'}, {'text': 'robotics', 'start': 196, 'end': 204, 'token_start': 36, 'token_end': 36, 'entityLabel': 'SKILLS'}, {'text': '2+ years', 'start': 206, 'end': 214, 
'token_start': 38, 'token_end': 40, 'entityLabel': 'EXPERIENCE'}, {'text': 'managing software engineers', 'start': 226, 'end': 253, 'token_start': 42, 'token_end': 44, 'entityLabel': 'SKILLS'}]","[{'child': 3, 'head': 0, 'relationLabel': 'DEGREE_IN'}, {'child': 12, 'head': 9, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 16, 'head': 9, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 25, 'head': 21, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 34, 'head': 21, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 36, 'head': 21, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 42, 'head': 38, 'relationLabel': 'EXPERIENCE_IN'}]"
5,"8+ years of development experience with Python and Go or C++ Experience with SDN (e.g. OpenFlow, VPP, Open vSwitch) A proven track record of performing on a winning team with a demonstration of bringing products and services to market. A proven track record of system software development. A proven track record of developing software for extremely large scale, distributed environments. A deep understanding of complex system integration and methodologies for proving the correct operation under nominal and faulted conditions of a hyperscale cloud environment. A deep understanding of developing for high availability and its inherent problems. A deep understanding of containers and container clusters such as kubernetes. A deep understanding of virtual machine environment including VMware, KVM, or Xen. Understanding of Open Systems Interconnection Reference Model (OSI-RM), the TCP/IP stack, and development experience in L2, L3, L4. Preferred Tech and Prof Experience Familiarity with networking protocols which might include: BGP, DHCP, DNS, OSPF, IGMP, IPv4/IPv6, IS-IS, and others. Familiarity with Network function virtualization (NFV) functionality and network encapsulation including L4 Load Balancers, Firewalls, etc. Working knowledge of git and CICD tools such as Jenkins and Zuul. EO Statement IBM is committed to creating a diverse environment and is proud to be an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, gender identity or expression, sexual orientation, national origin, genetics, disability, age, or veteran status. IBM is also committed to compliance with all fair employment practices regarding citizenship and immigration status. 
.","[{'text': '8+ years', 'start': 0, 'end': 8, 'token_start': 0, 'token_end': 2, 'entityLabel': 'EXPERIENCE'}, {'text': 'development', 'start': 12, 'end': 23, 'token_start': 4, 'token_end': 4, 'entityLabel': 'SKILLS'}, {'text': 'Python', 'start': 40, 'end': 46, 'token_start': 7, 'token_end': 7, 'entityLabel': 'SKILLS'}, {'text': 'C++', 'start': 57, 'end': 60, 'token_start': 11, 'token_end': 11, 'entityLabel': 'SKILLS'}]","[{'child': 4, 'head': 0, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 7, 'head': 0, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 11, 'head': 0, 'relationLabel': 'EXPERIENCE_IN'}]"
6,"Strong problem solving and analytical skills in the areas of Linux OS, packaged software, middleware and Cloud virtualization enviornments Strong SW skills with good working knowledge of Cloud, server virtualization hypervisor technology, storage SAN/fabric and network topologies Strong software engineering skills and experience in the areas Python, Java, PySpark, Bash, KShell, SQL database, noSQL database, syslog device data Strong working knowledge of all processor, memory, storage and network statistics analytics and measurement techniques Experience in working with Elastic, which will be used as the spurce of the logs and metrics data for the performance analysis Knowledge of Dynatrace and other application monitoring tools and their API interfaces Must have 2+ years in performance modeling and analysis of IT infrastructure environments that require high availability and high user access concurrency Preferred Tech and Prof Experience Outstanding written and verbal communications - Excellent collaboration skills Proficiency in computer/SoC architecture and performance trade-offs Knowledge and experience with simulators and performance monitoring debugging frameworks/tools Ability to conduct experiments in all phases of design, gathering and analyzing data; and utilize scripting/spread sheet to document and present the results BS / MS in EE/CS is ; 5 Years of related experience EO Statement IBM is committed to creating a diverse environment and is proud to be an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, gender identity or expression, sexual orientation, national origin, genetics, disability, age, or veteran status. IBM is also committed to compliance with all fair employment practices regarding citizenship and immigration status. 
.","[{'text': '2+ years', 'start': 773, 'end': 781, 'token_start': 125, 'token_end': 127, 'entityLabel': 'EXPERIENCE'}, {'text': 'performance modeling', 'start': 785, 'end': 805, 'token_start': 129, 'token_end': 130, 'entityLabel': 'SKILLS'}, {'text': 'IT infrastructure', 'start': 822, 'end': 839, 'token_start': 134, 'token_end': 135, 'entityLabel': 'SKILLS'}, {'text': 'access concurrency', 'start': 898, 'end': 916, 'token_start': 144, 'token_end': 145, 'entityLabel': 'SKILLS'}, {'text': 'BS / MS', 'start': 1351, 'end': 1358, 'token_start': 210, 'token_end': 212, 'entityLabel': 'DIPLOMA'}, {'text': 'EE', 'start': 1362, 'end': 1364, 'token_start': 214, 'token_end': 214, 'entityLabel': 'DIPLOMA_MAJOR'}, {'text': 'CS', 'start': 1365, 'end': 1367, 'token_start': 216, 'token_end': 216, 'entityLabel': 'DIPLOMA_MAJOR'}]","[{'child': 129, 'head': 125, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 134, 'head': 125, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 144, 'head': 125, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 214, 'head': 210, 'relationLabel': 'DEGREE_IN'}, {'child': 216, 'head': 210, 'relationLabel': 'DEGREE_IN'}]"
7,"Experience in backend services, systems, and data infrastructure. BS/MS in Computer Science or a related technical field. 5+ years of work experience architecting, developing, and shipping software products and/or services. 4+ years of experience managing cross-functional engineering teams.","[{'text': 'BS/MS', 'start': 66, 'end': 71, 'token_start': 11, 'token_end': 13, 'entityLabel': 'DIPLOMA'}, {'text': 'Computer Science', 'start': 75, 'end': 91, 'token_start': 15, 'token_end': 16, 'entityLabel': 'DIPLOMA_MAJOR'}, {'text': '5+ years', 'start': 122, 'end': 130, 'token_start': 23, 'token_end': 25, 'entityLabel': 'EXPERIENCE'}, {'text': 'architecting', 'start': 150, 'end': 162, 'token_start': 29, 'token_end': 29, 'entityLabel': 'SKILLS'}, {'text': 'developing', 'start': 164, 'end': 174, 'token_start': 31, 'token_end': 31, 'entityLabel': 'SKILLS'}, {'text': '4+ years', 'start': 224, 'end': 232, 'token_start': 40, 'token_end': 42, 'entityLabel': 'EXPERIENCE'}, {'text': 'managing cross-functional engineering teams', 'start': 247, 'end': 290, 'token_start': 45, 'token_end': 50, 'entityLabel': 'SKILLS'}]","[{'child': 15, 'head': 11, 'relationLabel': 'DEGREE_IN'}, {'child': 29, 'head': 23, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 31, 'head': 23, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 45, 'head': 40, 'relationLabel': 'EXPERIENCE_IN'}]"
8,"experience: 5 + years of full-stack software development experience to include 5+ years of C# (or similar) experience with the ability to contribute to technical architecture across web, mobile, middle tier, data pipeline and storage components. 5 + years of experience delivering scalable and resilient services at large enterprise scale, including experience in data platforms including large-scale analytics on relational, structured and unstructured data. 3+ years of experience as a SWE lead in an agile environment including 1+ years of experience operating in a DevOps model. 2+ years of experience designing secure, scalable and cost-efficient PaaS services on the Microsoft Azure (or similar) platform. Desired attributes: People management experience, in addition to multiple years of experience mentoring early in career talent. Project Dev Lead experience in an agile environment. Partners with Product Owner and Scrum Master to optimize implementation of agile practices. Experience with microservices, serverless architectures, Azure IoT suite, and the Azure event grid. Passion for innovation and a vision for how technology can add ambient intelligence to the workplace. Can articulate the opportunity and challenges of enterprise IoT, from devices to big data Experience shaping a DevOps culture with a vision for building and nurturing an innovative culture while maintaining a Live Site focus. Capability to work independently on the most complex code contributions of a feature. Experience with applying data science and machine learning to unlock new capabilities. Passion for mentoring early in career engineering talent. Expertise in latest web client technology frameworks (e.g., Angular, React, etc.). Deep customer empathy and vision for innovative productivity scenarios Track record as someone who can solve the tough technical challenges If you possess the above, and are a strong perpetuator of an action-oriented and fun work culture this might be the role for you! 
#CSEO #EUSEJOBS","[{'text': '5 + years', 'start': 12, 'end': 21, 'token_start': 2, 'token_end': 4, 'entityLabel': 'EXPERIENCE'}, {'text': 'full-stack software development', 'start': 25, 'end': 56, 'token_start': 6, 'token_end': 10, 'entityLabel': 'SKILLS'}, {'text': '5+ years', 'start': 79, 'end': 87, 'token_start': 14, 'token_end': 16, 'entityLabel': 'EXPERIENCE'}, {'text': 'C#', 'start': 91, 'end': 93, 'token_start': 18, 'token_end': 19, 'entityLabel': 'SKILLS'}, {'text': '5 + years', 'start': 246, 'end': 255, 'token_start': 47, 'token_end': 49, 'entityLabel': 'EXPERIENCE'}, {'text': 'delivering scalable and resilient services', 'start': 270, 'end': 312, 'token_start': 52, 'token_end': 56, 'entityLabel': 'SKILLS'}, {'text': 'data platforms', 'start': 364, 'end': 378, 'token_start': 65, 'token_end': 66, 'entityLabel': 'SKILLS'}, {'text': 'large-scale analytics', 'start': 389, 'end': 410, 'token_start': 68, 'token_end': 71, 'entityLabel': 'SKILLS'}, {'text': '3+ years', 'start': 460, 'end': 468, 'token_start': 80, 'token_end': 82, 'entityLabel': 'EXPERIENCE'}, {'text': 'SWE lead', 'start': 488, 'end': 496, 'token_start': 87, 'token_end': 88, 'entityLabel': 'SKILLS'}, {'text': '1+ years', 'start': 531, 'end': 539, 'token_start': 94, 'token_end': 96, 'entityLabel': 'EXPERIENCE'}, {'text': 'DevOps', 'start': 569, 'end': 575, 'token_start': 102, 'token_end': 102, 'entityLabel': 'SKILLS'}, {'text': '2+ years', 'start': 583, 'end': 591, 'token_start': 105, 'token_end': 107, 'entityLabel': 'EXPERIENCE'}, {'text': 'designing', 'start': 606, 'end': 615, 'token_start': 110, 'token_end': 110, 'entityLabel': 'SKILLS'}]","[{'child': 6, 'head': 2, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 18, 'head': 14, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 52, 'head': 47, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 65, 'head': 47, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 68, 'head': 47, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 87, 'head': 80, 'relationLabel': 
'EXPERIENCE_IN'}, {'child': 102, 'head': 94, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 110, 'head': 105, 'relationLabel': 'EXPERIENCE_IN'}]"
9,"BS in Electrical Engineering or Computer Engineering, or related Engineering Degree, with 3+ years of industry experience. 3+ years of experience with PCB design with both integrated and discrete circuits, and high-speed interfaces. Experience with development and bring-up of data center or Enterprise infrastructure system equipment. Experience with analyzing and resolving system hardware issues. PREFERRED MS in electrical engineering or computer engineering. 5+ years of industry experience. Familiar with storage server architectures with HDD. Experience with end-to-end enabling, design, and deployment cycle of a product. Experience with working with thermal, mechanical, management firmware, system firmware, and software teams during product development. Experience with design and troubleshooting data buses used in server such as PCIe, SAS, SATA, I2C. Experience with industry standard EDA tools for developing PCB designs. Knowledge of using Linux for testing.","[{'text': 'BS', 'start': 0, 'end': 2, 'token_start': 0, 'token_end': 0, 'entityLabel': 'DIPLOMA'}, {'text': 'Electrical Engineering', 'start': 6, 'end': 28, 'token_start': 2, 'token_end': 3, 'entityLabel': 'DIPLOMA_MAJOR'}, {'text': 'Computer Engineering', 'start': 32, 'end': 52, 'token_start': 5, 'token_end': 6, 'entityLabel': 'DIPLOMA_MAJOR'}, {'text': 'Engineering', 'start': 65, 'end': 76, 'token_start': 10, 'token_end': 10, 'entityLabel': 'DIPLOMA_MAJOR'}, {'text': '3+ years', 'start': 90, 'end': 98, 'token_start': 14, 'token_end': 16, 'entityLabel': 'EXPERIENCE'}, {'text': 'industry', 'start': 102, 'end': 110, 'token_start': 18, 'token_end': 18, 'entityLabel': 'SKILLS'}, {'text': '3+ years', 'start': 123, 'end': 131, 'token_start': 21, 'token_end': 23, 'entityLabel': 'EXPERIENCE'}, {'text': 'PCB design', 'start': 151, 'end': 161, 'token_start': 27, 'token_end': 28, 'entityLabel': 'SKILLS'}, {'text': 'integrated', 'start': 172, 'end': 182, 'token_start': 31, 'token_end': 31, 'entityLabel': 'SKILLS'}, 
{'text': 'discrete circuits', 'start': 187, 'end': 204, 'token_start': 33, 'token_end': 34, 'entityLabel': 'SKILLS'}, {'text': 'MS', 'start': 410, 'end': 412, 'token_start': 68, 'token_end': 68, 'entityLabel': 'DIPLOMA'}, {'text': 'electrical engineering', 'start': 416, 'end': 438, 'token_start': 70, 'token_end': 71, 'entityLabel': 'DIPLOMA_MAJOR'}, {'text': 'computer engineering', 'start': 442, 'end': 462, 'token_start': 73, 'token_end': 74, 'entityLabel': 'DIPLOMA_MAJOR'}, {'text': '5+ years', 'start': 464, 'end': 472, 'token_start': 76, 'token_end': 78, 'entityLabel': 'EXPERIENCE'}, {'text': 'industry experience', 'start': 476, 'end': 495, 'token_start': 80, 'token_end': 81, 'entityLabel': 'SKILLS'}]","[{'child': 2, 'head': 0, 'relationLabel': 'DEGREE_IN'}, {'child': 5, 'head': 0, 'relationLabel': 'DEGREE_IN'}, {'child': 10, 'head': 0, 'relationLabel': 'DEGREE_IN'}, {'child': 18, 'head': 14, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 27, 'head': 21, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 31, 'head': 21, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 33, 'head': 21, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 70, 'head': 68, 'relationLabel': 'DEGREE_IN'}, {'child': 73, 'head': 68, 'relationLabel': 'DEGREE_IN'}, {'child': 80, 'head': 76, 'relationLabel': 'EXPERIENCE_IN'}]"
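
Each entry in the `tokens` column is a list of dicts that pair a text span with its `entityLabel`. The sketch below, using values copied from row 2 of the table above, shows one way to pull out `(text, entityLabel)` pairs. The check suggests that `start` and `end` are character offsets into `document`, at least for this row. The helper name `entity_pairs` is hypothetical.

```python
# Beginning of the `document` string from row 2 of the table above
document = "3+ years Swift & Objective-C and experience with iOS internals"

# The `tokens` list from the same row
tokens = [
    {'text': '3+ years', 'start': 0, 'end': 8,
     'token_start': 0, 'token_end': 2, 'entityLabel': 'EXPERIENCE'},
    {'text': 'Swift & Objective-C', 'start': 9, 'end': 28,
     'token_start': 3, 'token_end': 7, 'entityLabel': 'SKILLS'},
]


def entity_pairs(token_records):
    """Return (text, entityLabel) pairs from a list of token dicts."""
    return [(t['text'], t['entityLabel']) for t in token_records]


pairs = entity_pairs(tokens)
print(pairs)

# In this row, slicing the document by start/end recovers each entity text
for t in tokens:
    assert document[t['start']:t['end']] == t['text']
```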


## Preparing examples for the prompt

In our prompt, we'll present the model with examples of the type of output we're after.



In [6]:
# Each example is a flat tuple of alternating field names and values:
# index 1 holds the entity text and index 11 holds its entityLabel.
job_examples = [
    ("text", "Bachelor",
     "start", '0',
     "end", '8',
     "token_start", '0',
     "token_end", '0',
     "entityLabel", "DIPLOMA"),
    ("text", "Mechanical Engineering",
     "start", '21',
     "end", '43',
     "token_start", '4',
     "token_end", '5',
     "entityLabel", "DIPLOMA_MAJOR"),
    ("text", "developing",
     "start", '89',
     "end", '99',
     "token_start", '15',
     "token_end", '15',
     "entityLabel", "SKILLS"),
    ("text", "connector related products",
     "start", '137',
     "end", '163',
     "token_start", '22',
     "token_end", '24',
     "entityLabel", "SKILLS")
]
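
The examples above are flat tuples that alternate field names and values, so the notebook reads the entity text at index 1 and the label at index 11. As a sketch of a more readable option, a tuple of that shape can be turned into a dict and accessed by field name. The helper `tuple_to_dict` is hypothetical and not part of the original notebook.

```python
# One example in the same flat (name, value, name, value, ...) layout
flat_example = ("text", "Bachelor",
                "start", '0',
                "end", '8',
                "token_start", '0',
                "token_end", '0',
                "entityLabel", "DIPLOMA")


def tuple_to_dict(flat):
    """Pair even-indexed items (names) with odd-indexed items (values)."""
    return dict(zip(flat[0::2], flat[1::2]))


record = tuple_to_dict(flat_example)
print(record["text"], record["entityLabel"])

# Name-based access returns the same values as the positional indices
assert record["text"] == flat_example[1]
assert record["entityLabel"] == flat_example[11]
```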
    



## Creating the extraction prompt

We'll create a prompt that demonstrates the task to the model. 

In [8]:
#@title Create the prompt (Run this cell to execute required code) {display-mode: "form"}

class cohereExtractor():
    def __init__(self, examples, example_labels, labels, task_description, example_prompt):
        self.examples = examples              # example input texts
        self.example_labels = example_labels  # expected output for each example
        self.labels = labels                  # possible labels (currently unused)
        self.task_description = task_description
        self.example_prompt = example_prompt

    def make_prompt(self, example):
        # Append the new input with an empty label so the model completes it
        examples = self.examples + [example]
        labels = self.example_labels + [""]
        return (self.task_description +
                "\n---\n".join([examples[i] + "\n" +
                                self.example_prompt +
                                labels[i] for i in range(len(examples))]))

    def extract(self, example):
        extraction = co.generate(
            model='large',
            prompt=self.make_prompt(example),
            max_tokens=10,
            temperature=0.1,
            stop_sequences=["\n"])
        # Strip surrounding whitespace, including a trailing "\n" stop sequence if present
        return extraction.generations[0].text.strip()


cohereJobExtractor = cohereExtractor([e[1] for e in job_examples],
                                     [e[11] for e in job_examples], [],
                                     "",
                                     "extract the EntityLabel:")



In [10]:
# This is what the prompt looks like:
print(cohereJobExtractor.make_prompt('connector related products'))

Bachelor
extract the EntityLabel:DIPLOMA
---
Mechanical Engineering
extract the EntityLabel:DIPLOMA_MAJOR
---
developing
extract the EntityLabel:SKILLS
---
connector related products
extract the EntityLabel:SKILLS
---
connector related products
extract the EntityLabel:
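
The prompt printed above follows a simple few-shot pattern: each example is its text, a newline, and the instruction followed by the label. Blocks are separated by `---`, and the final block leaves the label empty for the model to complete. The standalone function below is a minimal sketch of that pattern. The name `build_prompt` and the query `"Python"` are illustrative assumptions, not part of the notebook.

```python
def build_prompt(example_texts, example_labels, query,
                 instruction="extract the EntityLabel:"):
    """Build a few-shot prompt that ends with an unlabeled query."""
    # One block per labeled example: text, newline, instruction + label
    blocks = [f"{text}\n{instruction}{label}"
              for text, label in zip(example_texts, example_labels)]
    # The final block has no label, so the model is asked to fill it in
    blocks.append(f"{query}\n{instruction}")
    return "\n---\n".join(blocks)


prompt = build_prompt(["Bachelor", "developing"],
                      ["DIPLOMA", "SKILLS"],
                      "Python")
print(prompt)
```

Keeping the separator and instruction string identical across all blocks is what lets the `"\n"` stop sequence end the generation right after the label.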


In [12]:
num_posts = 5

job_list = get_post_titles(size=num_posts, 
      after=str(int(datetime.datetime(2021,1,1,0,0).timestamp())), 
      before=str(int(datetime.datetime(2022,1,1,0,0).timestamp())), 
      subreddit="jobs", 
      sort_type="score", 
      sort="desc")

# Show the list
job_list

['Boss actually tried demanding me to donate to the companies “charity”',
 'Imposter Syndrome',
 'Oh no. How do I say “I enjoyed not having a biweekly mental breakdown over work and school” professionally?',
 "3 minutes into the interview the interviewer goes on a tangent about how I'm not a fit for the role.",
 "My dad, with 35 years experience as an engineer, just discovered the bullshit that is today's application process."]

## Running the model
And now we loop over the posts and process each one of them with our extractor.

In [13]:
results = []
for text in tqdm(job_list):
    try:
        extracted_text = cohereJobExtractor.extract(text)
        results.append(extracted_text)
    except Exception as e:
        print('ERROR: ', e)
        # Keep results aligned with job_list so both lists have the same length
        results.append(None)

100%|████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:07<00:00,  1.55s/it]


Let's look at the results:

In [14]:
pd.DataFrame(data={'text': job_list, 'extracted_text': results})

Unnamed: 0,text,extracted_text
0,Boss actually tried demanding me to donate to the companies “charity”,SKILLS
1,Imposter Syndrome,SKILLS
2,Oh no. How do I say “I enjoyed not having a biweekly mental breakdown over work and school” professionally?,SKILLS
3,3 minutes into the interview the interviewer goes on a tangent about how I'm not a fit for the role.,SKILLS
4,"My dad, with 35 years experience as an engineer, just discovered the bullshit that is today's application process.",SKILLS
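
Every post title in the table above received the same label. The extractor accepts a `labels` argument, but it is passed an empty list and is not used. As a sketch of how generated text could be checked against the labels that appear in the dataset shown earlier (`DIPLOMA`, `DIPLOMA_MAJOR`, `EXPERIENCE`, `SKILLS`), the hypothetical helper below returns `None` for anything outside that set. Treating case-insensitive matches as valid is an assumption.

```python
# Labels that appear in the `tokens` column of the dataset shown earlier
KNOWN_LABELS = ["DIPLOMA", "DIPLOMA_MAJOR", "EXPERIENCE", "SKILLS"]


def normalize_label(raw_text, known_labels=KNOWN_LABELS):
    """Return the matching known label, or None if the text is not one."""
    # Remove surrounding whitespace/newlines and ignore case
    cleaned = raw_text.strip().upper()
    return cleaned if cleaned in known_labels else None


print(normalize_label(" skills\n"))
print(normalize_label("SKILL"))
```

Returning `None` instead of guessing makes unexpected generations easy to count and inspect.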


## How well does this work?
We can better measure the performance of this extraction method using a larger labeled dataset. So let's load a test set of labeled job descriptions:

In [15]:
df = pd.read_json('../data/relations_test.txt')
df

Unnamed: 0,document,tokens,relations
0,"\nCurrently holding a faculty, industry, or government researcher position.\nPh.D. and publications in machine learning, AI, computer science, statistics, applied mathematics, data science, or related technical fields.\nExperience leading a team in solving analytical problems using quantitative approaches.\nExperience manipulating and analyzing data from different sources.\nExperience in theoretical and empirical research and for answering questions with research.\nAbility to communicate research for public audiences of peers.\nKnowledge in a programming language.\nAbility to obtain and maintain work authorization in the country of employment in 2018.\n\nPREFERRED \n1+ year(s) of work experience in a university, industry, or government lab(s), in a role with primary emphasis on AI research.\nExperience driving original scholarship in collaboration with a team.\nFirst-author publications at peer-reviewed AI conferences (e.g. NIPS, CVPR, ICML, ICLR, ICCV, and ACL).\nExperience in developing and debugging in C/C++, Python, C# and/or Java.","[{'text': 'Ph.D.', 'start': 75, 'end': 80, 'token_start': 14, 'token_end': 14, 'entityLabel': 'DIPLOMA'}, {'text': 'machine learning', 'start': 101, 'end': 117, 'token_start': 18, 'token_end': 19, 'entityLabel': 'DIPLOMA_MAJOR'}, {'text': 'AI', 'start': 119, 'end': 121, 'token_start': 21, 'token_end': 21, 'entityLabel': 'DIPLOMA_MAJOR'}, {'text': 'computer science', 'start': 123, 'end': 139, 'token_start': 23, 'token_end': 24, 'entityLabel': 'DIPLOMA_MAJOR'}, {'text': 'statistics', 'start': 141, 'end': 151, 'token_start': 26, 'token_end': 26, 'entityLabel': 'DIPLOMA_MAJOR'}, {'text': 'applied mathematics', 'start': 153, 'end': 172, 'token_start': 28, 'token_end': 29, 'entityLabel': 'DIPLOMA_MAJOR'}, {'text': 'data science', 'start': 174, 'end': 186, 'token_start': 31, 'token_end': 32, 'entityLabel': 'DIPLOMA_MAJOR'}, {'text': '1+ year(s', 'start': 664, 'end': 673, 'token_start': 113, 'token_end': 115, 'entityLabel': 
'EXPERIENCE'}, {'text': 'university', 'start': 699, 'end': 709, 'token_start': 122, 'token_end': 122, 'entityLabel': 'SKILLS'}, {'text': 'industry', 'start': 711, 'end': 719, 'token_start': 124, 'token_end': 124, 'entityLabel': 'SKILLS'}, {'text': 'government lab(s)', 'start': 724, 'end': 741, 'token_start': 127, 'token_end': 129, 'entityLabel': 'SKILLS'}, {'text': 'AI', 'start': 778, 'end': 780, 'token_start': 138, 'token_end': 138, 'entityLabel': 'SKILLS'}]","[{'child': 18, 'head': 14, 'relationLabel': 'DEGREE_IN'}, {'child': 21, 'head': 14, 'relationLabel': 'DEGREE_IN'}, {'child': 23, 'head': 14, 'relationLabel': 'DEGREE_IN'}, {'child': 26, 'head': 14, 'relationLabel': 'DEGREE_IN'}, {'child': 28, 'head': 14, 'relationLabel': 'DEGREE_IN'}, {'child': 31, 'head': 14, 'relationLabel': 'DEGREE_IN'}, {'child': 122, 'head': 113, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 124, 'head': 113, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 127, 'head': 113, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 138, 'head': 113, 'relationLabel': 'EXPERIENCE_IN'}]"
1,"\n2+ years experience in the online advertising or research\nBS/BA Degree in Economics, Statistics, Political Science, History, Psychology\nExperience with presenting and partnering with technical and non-technical teams\nExperience communicating analyses and results to any audience\nExperience working with structured and unstructured data-sets, statistical software such as R, STATA, SPSS, SAS as well as data extraction tools such as Hive and/or SQL\n\nPREFERRED \nExperience with causal measurement, machine learning or lab-based research\nExperience in quantitative field, such as consulting, market research, strategy and planning, or user experience research\nStrong organizational and project management skills\nMasters in quantitative field or a MBA","[{'text': '2+ years', 'start': 1, 'end': 9, 'token_start': 1, 'token_end': 3, 'entityLabel': 'EXPERIENCE'}, {'text': 'online advertising', 'start': 28, 'end': 46, 'token_start': 7, 'token_end': 8, 'entityLabel': 'SKILLS'}, {'text': 'research', 'start': 50, 'end': 58, 'token_start': 10, 'token_end': 10, 'entityLabel': 'SKILLS'}, {'text': 'BS/BA', 'start': 59, 'end': 64, 'token_start': 12, 'token_end': 14, 'entityLabel': 'DIPLOMA'}, {'text': 'Economics', 'start': 75, 'end': 84, 'token_start': 17, 'token_end': 17, 'entityLabel': 'DIPLOMA_MAJOR'}, {'text': 'Statistics', 'start': 86, 'end': 96, 'token_start': 19, 'token_end': 19, 'entityLabel': 'DIPLOMA_MAJOR'}, {'text': 'Political Science', 'start': 98, 'end': 115, 'token_start': 21, 'token_end': 22, 'entityLabel': 'DIPLOMA_MAJOR'}, {'text': 'History', 'start': 117, 'end': 124, 'token_start': 24, 'token_end': 24, 'entityLabel': 'DIPLOMA_MAJOR'}, {'text': 'Psychology', 'start': 126, 'end': 136, 'token_start': 26, 'token_end': 26, 'entityLabel': 'DIPLOMA_MAJOR'}, {'text': 'Masters', 'start': 710, 'end': 717, 'token_start': 126, 'token_end': 126, 'entityLabel': 'DIPLOMA'}, {'text': 'quantitative field', 'start': 721, 'end': 739, 'token_start': 128, 'token_end': 129, 
'entityLabel': 'DIPLOMA_MAJOR'}]","[{'child': 7, 'head': 1, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 10, 'head': 1, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 17, 'head': 12, 'relationLabel': 'DEGREE_IN'}, {'child': 19, 'head': 12, 'relationLabel': 'DEGREE_IN'}, {'child': 21, 'head': 12, 'relationLabel': 'DEGREE_IN'}, {'child': 24, 'head': 12, 'relationLabel': 'DEGREE_IN'}, {'child': 26, 'head': 12, 'relationLabel': 'DEGREE_IN'}, {'child': 128, 'head': 126, 'relationLabel': 'DEGREE_IN'}]"
2,"\nBA/BS\n5+ years of program or project management experience\n2+ years of technical project/program management experience\nTrack record of operating independently\nExperience understanding user needs, gathering requirements, and defining scope\nCommunication experience interacting with a variety of audiences from engineers, to vendors, to research leaders\nTrack record of building cross-functional relationships\n\nPREFERRED \nExperience working with UX Research and/or UX Design","[{'text': '5+ years', 'start': 7, 'end': 15, 'token_start': 5, 'token_end': 7, 'entityLabel': 'EXPERIENCE'}, {'text': 'project management', 'start': 30, 'end': 48, 'token_start': 11, 'token_end': 12, 'entityLabel': 'SKILLS'}, {'text': '2+ years', 'start': 60, 'end': 68, 'token_start': 15, 'token_end': 17, 'entityLabel': 'EXPERIENCE'}, {'text': 'technical project', 'start': 72, 'end': 89, 'token_start': 19, 'token_end': 20, 'entityLabel': 'SKILLS'}, {'text': 'program management', 'start': 90, 'end': 108, 'token_start': 22, 'token_end': 23, 'entityLabel': 'SKILLS'}]","[{'child': 11, 'head': 5, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 19, 'head': 15, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 22, 'head': 15, 'relationLabel': 'EXPERIENCE_IN'}]"
3,"\nCurrently enrolled in a full-time degree program and returning to the program after the completion of the internship.\nPublications or experience in fields related to in machine learning, AI, computer vision, natural language processing, computational neuroscience, optimization, computer science, statistics, applied mathematics, or data science.\nExperience solving analytical problems using quantitative approaches.\nAbility to manipulate and analyze large scale, high-dimensionality data from varying sources.\nExperience in utilizing theoretical and empirical research to solve problems.\nAbility to communicate research for public audiences of peers.\nKnowledge in a programming language.\nAbility to obtain and maintain work authorization in the country of employment in 2018.\n\nPREFERRED \nPursuing a Ph.D. degree in Computer Science or related field\nFirst-author publications at peer-reviewed AI conferences (e.g. NIPS, ICML, ICLR, CVPR, ICCV, ECCV, ACL, NAACL, and EMNLP).\nExperience building systems based on machine learning and/or deep learning methods.\nResearch and software engineer experience demonstrated via an internship, work experience, or coding competitions.\nKnowledge in Python, Lua, C++, C, C# and/or Java.","[{'text': 'Ph.D.', 'start': 801, 'end': 806, 'token_start': 137, 'token_end': 137, 'entityLabel': 'DIPLOMA'}, {'text': 'Computer Science', 'start': 817, 'end': 833, 'token_start': 140, 'token_end': 141, 'entityLabel': 'DIPLOMA_MAJOR'}]","[{'child': 140, 'head': 137, 'relationLabel': 'DEGREE_IN'}]"
4,"\nCurrently enrolled in a full-time degree program and returning to the program after the completion of the internship.\nPublications or experience in fields related to in machine learning, AI, computer vision, natural language processing, computational neuroscience, optimization, computer science, statistics, applied mathematics, or data science.\nExperience solving analytical problems using quantitative approaches.\nAbility to manipulate and analyze large scale, high-dimensionality data from varying sources.\nExperience in utilizing theoretical and empirical research to solve problems.\nAbility to communicate research for public audiences of peers.\nKnowledge in a programming language.\nAbility to obtain and maintain work authorization in the country of employment in 2019.\n\nPREFERRED \nPursuing a Ph.D. degree in Computer Science or related field.\nFirst-author publications at peer-reviewed AI conferences (e.g. NIPS, ICML, ICLR, CVPR, ICCV, ECCV, ACL, NAACL, and EMNLP).\nExperience building systems based on machine learning and/or deep learning methods.\nResearch and software engineer experience demonstrated via an internship, work experience, or coding competitions.\nKnowledge in Python, Lua, C++, C, C# and/or Java.\n","[{'text': 'Ph.D.', 'start': 801, 'end': 806, 'token_start': 137, 'token_end': 137, 'entityLabel': 'DIPLOMA'}, {'text': 'Computer Science', 'start': 817, 'end': 833, 'token_start': 140, 'token_end': 141, 'entityLabel': 'DIPLOMA_MAJOR'}]","[{'child': 140, 'head': 137, 'relationLabel': 'DEGREE_IN'}]"
5,"\nMS in a quantitative field such as Operations Research, Computer Science, Quantitative Finance, Math, Physics or a related Engineering degree\nKnowledge of Statistics & Probability (e.g. Hypothesis testing, Regression, Stochastic modeling, Markov Chains and etc.)\n7+ years experience in building models and developing algorithms for machine learning, statistics, mathematical programming, and simulation in industry and/or academia\n5+ years experience in managing and analyzing large-scale structured and unstructured data using R or Python\n7+ years experience with algorithms and optimizations using CPLEX or related tools\n7+ years experience in SQL and data modeling\n\nPREFERRED \nPhD in Operation Research or Industrial Engineering.\nFamiliarity with object-oriented programming languages (such as C++ or Java) and visualization tools (such as Tableau).\nExperience working with or in support of diverse communities.","[{'text': 'MS', 'start': 1, 'end': 3, 'token_start': 1, 'token_end': 1, 'entityLabel': 'DIPLOMA'}, {'text': 'Operations Research', 'start': 36, 'end': 55, 'token_start': 8, 'token_end': 9, 'entityLabel': 'DIPLOMA_MAJOR'}, {'text': 'Computer Science', 'start': 57, 'end': 73, 'token_start': 11, 'token_end': 12, 'entityLabel': 'DIPLOMA_MAJOR'}, {'text': 'Quantitative Finance', 'start': 75, 'end': 95, 'token_start': 14, 'token_end': 15, 'entityLabel': 'DIPLOMA_MAJOR'}, {'text': 'Math', 'start': 97, 'end': 101, 'token_start': 17, 'token_end': 17, 'entityLabel': 'DIPLOMA_MAJOR'}, {'text': 'Physics', 'start': 103, 'end': 110, 'token_start': 19, 'token_end': 19, 'entityLabel': 'DIPLOMA_MAJOR'}, {'text': 'Engineering', 'start': 124, 'end': 135, 'token_start': 23, 'token_end': 23, 'entityLabel': 'DIPLOMA_MAJOR'}, {'text': '7+ years', 'start': 264, 'end': 272, 'token_start': 48, 'token_end': 50, 'entityLabel': 'EXPERIENCE'}, {'text': 'building models', 'start': 287, 'end': 302, 'token_start': 53, 'token_end': 54, 'entityLabel': 'SKILLS'}, {'text': 'developing 
algorithms', 'start': 307, 'end': 328, 'token_start': 56, 'token_end': 57, 'entityLabel': 'SKILLS'}, {'text': 'machine learning', 'start': 333, 'end': 349, 'token_start': 59, 'token_end': 60, 'entityLabel': 'SKILLS'}, {'text': 'statistics', 'start': 351, 'end': 361, 'token_start': 62, 'token_end': 62, 'entityLabel': 'SKILLS'}, {'text': 'mathematical programming', 'start': 363, 'end': 387, 'token_start': 64, 'token_end': 65, 'entityLabel': 'SKILLS'}, {'text': 'simulation', 'start': 393, 'end': 403, 'token_start': 68, 'token_end': 68, 'entityLabel': 'SKILLS'}, {'text': 'industry', 'start': 407, 'end': 415, 'token_start': 70, 'token_end': 70, 'entityLabel': 'SKILLS'}, {'text': 'academia', 'start': 423, 'end': 431, 'token_start': 72, 'token_end': 72, 'entityLabel': 'SKILLS'}, {'text': '5+ years', 'start': 432, 'end': 440, 'token_start': 74, 'token_end': 76, 'entityLabel': 'EXPERIENCE'}, {'text': 'managing', 'start': 455, 'end': 463, 'token_start': 79, 'token_end': 79, 'entityLabel': 'SKILLS'}, {'text': 'analyzing large-scale structured and unstructured data', 'start': 468, 'end': 522, 'token_start': 81, 'token_end': 88, 'entityLabel': 'SKILLS'}, {'text': 'R', 'start': 529, 'end': 530, 'token_start': 90, 'token_end': 90, 'entityLabel': 'SKILLS'}, {'text': 'Python', 'start': 534, 'end': 540, 'token_start': 92, 'token_end': 92, 'entityLabel': 'SKILLS'}, {'text': '7+ years', 'start': 541, 'end': 549, 'token_start': 94, 'token_end': 96, 'entityLabel': 'EXPERIENCE'}, {'text': 'algorithms', 'start': 566, 'end': 576, 'token_start': 99, 'token_end': 99, 'entityLabel': 'SKILLS'}, {'text': 'optimizations', 'start': 581, 'end': 594, 'token_start': 101, 'token_end': 101, 'entityLabel': 'SKILLS'}, {'text': 'CPLEX', 'start': 601, 'end': 606, 'token_start': 103, 'token_end': 103, 'entityLabel': 'SKILLS'}, {'text': '7+ years', 'start': 624, 'end': 632, 'token_start': 108, 'token_end': 110, 'entityLabel': 'EXPERIENCE'}, {'text': 'SQL', 'start': 647, 'end': 650, 'token_start': 113, 
'token_end': 113, 'entityLabel': 'SKILLS'}, {'text': 'data modeling', 'start': 655, 'end': 668, 'token_start': 115, 'token_end': 116, 'entityLabel': 'SKILLS'}, {'text': 'PhD', 'start': 681, 'end': 684, 'token_start': 120, 'token_end': 120, 'entityLabel': 'DIPLOMA'}, {'text': 'Operation Research', 'start': 688, 'end': 706, 'token_start': 122, 'token_end': 123, 'entityLabel': 'DIPLOMA_MAJOR'}, {'text': 'Industrial Engineering', 'start': 710, 'end': 732, 'token_start': 125, 'token_end': 126, 'entityLabel': 'DIPLOMA_MAJOR'}]","[{'child': 8, 'head': 1, 'relationLabel': 'DEGREE_IN'}, {'child': 11, 'head': 1, 'relationLabel': 'DEGREE_IN'}, {'child': 14, 'head': 1, 'relationLabel': 'DEGREE_IN'}, {'child': 17, 'head': 1, 'relationLabel': 'DEGREE_IN'}, {'child': 19, 'head': 1, 'relationLabel': 'DEGREE_IN'}, {'child': 23, 'head': 1, 'relationLabel': 'DEGREE_IN'}, {'child': 53, 'head': 48, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 56, 'head': 48, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 59, 'head': 48, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 62, 'head': 48, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 64, 'head': 48, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 68, 'head': 48, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 70, 'head': 48, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 72, 'head': 48, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 79, 'head': 74, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 81, 'head': 74, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 90, 'head': 74, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 92, 'head': 74, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 99, 'head': 94, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 101, 'head': 94, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 103, 'head': 94, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 113, 'head': 108, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 115, 'head': 108, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 122, 'head': 120, 'relationLabel': 'DEGREE_IN'}, {'child': 125, 'head': 120, 
'relationLabel': 'DEGREE_IN'}]"
6,"\nPh.D. with 5+ years of experience, MS with 7+ years of experience, or BS with 10+ years of experience in Physics, Electrical Engineering, Computer Science, or a related technical field such us architecting, developing, and launching hardware/software projects and/or services\nDemonstrated knowledge dissemination through authored publications, international conference presentations or shipped products\nML/AI basics, and systems basics, including the requisite programming experience (python or equivalent, and at least one systems-level programming language: C, C++, Java, Go, Rust, or equivalent)\nExperience with data analytics (data collection, storage, cleaning, processing with statistics, visualization, and other data related processes)\nExperience working on communication systems in a research and/or development capacity\nTechnical leadership in leading research efforts with a demonstrated experience handling multiple priorities\n\nPREFERRED \nDeep understanding of how culture and team dynamics create exceptional teams",[],[]
7,"\nGraduating with a Ph.D. in Computer Science or related field or relevant experience.\nFirst-author publications at peer-reviewed AI conferences (e.g. NIPS, CVPR, ICML, ICLR, ICCV, and ACL).\nExperience leading a team in solving analytical problems using quantitative approaches.\nExperience manipulating and analyzing data from different sources.\nExperience in leading theoretical and empirical research and for answering questions with research.\nExperience communicating research for public audiences of peers.\nKnowledge in a programming language.\nMust be able to obtain and maintain work authorization in the country of employment in 2018.\n\nPREFERRED \nPrevious internship(s) and/or research assistantship(s) in an AI research organization.\nExperience in developing and debugging in C/C++, Python, C# and/or Java.\nHas previously completed a postdoctoral role in AI research.","[{'text': 'Ph.D.', 'start': 19, 'end': 24, 'token_start': 4, 'token_end': 4, 'entityLabel': 'DIPLOMA'}, {'text': 'Computer Science', 'start': 28, 'end': 44, 'token_start': 6, 'token_end': 7, 'entityLabel': 'DIPLOMA_MAJOR'}]","[{'child': 6, 'head': 4, 'relationLabel': 'DEGREE_IN'}]"
8,"\n12+ years of research experience, in-house or as a vendor (including applicable consulting experience)\nPh.D in human-computer interaction, human factors, cognitive psychology or related field\nUser safety demonstrated knowledge through authored publications, international conference presentations or shipped products\nExperience collaborating and working across teams to develop concepts that advance the entire product pipeline (hardware, software, computer vision, etc.)","[{'text': '12+ years', 'start': 1, 'end': 10, 'token_start': 1, 'token_end': 3, 'entityLabel': 'EXPERIENCE'}, {'text': 'research', 'start': 14, 'end': 22, 'token_start': 5, 'token_end': 5, 'entityLabel': 'SKILLS'}, {'text': 'Ph.D', 'start': 104, 'end': 108, 'token_start': 22, 'token_end': 24, 'entityLabel': 'SKILLS'}, {'text': 'human-computer interaction', 'start': 112, 'end': 138, 'token_start': 26, 'token_end': 29, 'entityLabel': 'DIPLOMA_MAJOR'}, {'text': 'human factors', 'start': 140, 'end': 153, 'token_start': 31, 'token_end': 32, 'entityLabel': 'DIPLOMA_MAJOR'}, {'text': 'cognitive psychology', 'start': 155, 'end': 175, 'token_start': 34, 'token_end': 35, 'entityLabel': 'DIPLOMA_MAJOR'}]","[{'child': 5, 'head': 1, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 26, 'head': 22, 'relationLabel': 'DEGREE_IN'}, {'child': 31, 'head': 22, 'relationLabel': 'DEGREE_IN'}, {'child': 34, 'head': 22, 'relationLabel': 'DEGREE_IN'}]"
9,"\nPhD with 3+ years, or MS with 5+ years of experience in a field emphasizing people research in organizations (e.g., Industrial/Organizational Psychology, Organizational Behavior, Labor Economics, Management, Policy Analysis, etc.)\n3+ years of experience conducting quantitative people research in an organization or consulting environment, including working with stakeholders to understand and clarify their people research needs, and communicating analyses to technical and non-technical audiences\nExperience managing and analyzing structured and unstructured data\nExperience communicating technical results to technical and non-technical audiences\nExperience operating in a consultative capacity with stakeholders\nExperience defining a roadmap of data requirements, metrics reporting, and research to drive insights\nExperience in communicating and weaving together multiple data sources and deliver insights to stakeholders\nKnowledge of statistical analysis (e.g., descriptive statistics and knowledge with statistical significance), quantitative research, and experience gathering and interpreting data\n\nPREFERRED \n3+ years of experience applying multivariate statistical methods to people research problems in organizations\n5+ years of progressive consulting or corporate work experience focused on using data to uncover insights that drive action and communicating analyses to technical and non-technical audiences\nRelevant experience includes analytics, management consulting, sales operations, finance, compensation, or related fields\nPeople management experience\nExperience developing tools, conducting empirical research, practical implementation, knowledge of best practices in at least two of the following areas: employee engagement, performance management, diversity, compensation (e.g., pay equity), or retention\nExperience conceiving, developing, running pilot tests, implementing, and continuously improving large-scale people processes\nExperience managing and 
analyzing structured and unstructured data with R and SQL","[{'text': '5+ years', 'start': 31, 'end': 39, 'token_start': 10, 'token_end': 12, 'entityLabel': 'EXPERIENCE'}, {'text': 'emphasizing people research in organizations', 'start': 65, 'end': 109, 'token_start': 18, 'token_end': 22, 'entityLabel': 'SKILLS'}, {'text': 'Industrial', 'start': 117, 'end': 127, 'token_start': 26, 'token_end': 26, 'entityLabel': 'SKILLS'}, {'text': 'Organizational Psychology', 'start': 128, 'end': 153, 'token_start': 28, 'token_end': 29, 'entityLabel': 'SKILLS'}, {'text': 'Organizational Behavior', 'start': 155, 'end': 178, 'token_start': 31, 'token_end': 32, 'entityLabel': 'SKILLS'}, {'text': 'Labor Economics', 'start': 180, 'end': 195, 'token_start': 34, 'token_end': 35, 'entityLabel': 'SKILLS'}, {'text': 'Management', 'start': 197, 'end': 207, 'token_start': 37, 'token_end': 37, 'entityLabel': 'SKILLS'}, {'text': 'Policy Analysis', 'start': 209, 'end': 224, 'token_start': 39, 'token_end': 40, 'entityLabel': 'SKILLS'}, {'text': '3+ years', 'start': 232, 'end': 240, 'token_start': 46, 'token_end': 48, 'entityLabel': 'EXPERIENCE'}, {'text': 'conducting quantitative people research', 'start': 255, 'end': 294, 'token_start': 51, 'token_end': 54, 'entityLabel': 'SKILLS'}, {'text': 'working with stakeholders', 'start': 351, 'end': 376, 'token_start': 63, 'token_end': 65, 'entityLabel': 'SKILLS'}, {'text': 'communicating', 'start': 436, 'end': 449, 'token_start': 76, 'token_end': 76, 'entityLabel': 'SKILLS'}, {'text': '3+ years', 'start': 1119, 'end': 1127, 'token_start': 176, 'token_end': 178, 'entityLabel': 'EXPERIENCE'}, {'text': 'statistical methods', 'start': 1164, 'end': 1183, 'token_start': 183, 'token_end': 184, 'entityLabel': 'SKILLS'}, {'text': '5+ years', 'start': 1229, 'end': 1237, 'token_start': 192, 'token_end': 194, 'entityLabel': 'EXPERIENCE'}, {'text': 'progressive consulting', 'start': 1241, 'end': 1263, 'token_start': 196, 'token_end': 197, 'entityLabel': 'SKILLS'}, 
{'text': 'corporate work experience', 'start': 1267, 'end': 1292, 'token_start': 199, 'token_end': 201, 'entityLabel': 'SKILLS'}, {'text': 'using data to uncover insights that drive action', 'start': 1304, 'end': 1352, 'token_start': 204, 'token_end': 211, 'entityLabel': 'SKILLS'}, {'text': 'communicating', 'start': 1357, 'end': 1370, 'token_start': 213, 'token_end': 213, 'entityLabel': 'SKILLS'}]","[{'child': 18, 'head': 10, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 26, 'head': 10, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 28, 'head': 10, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 31, 'head': 10, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 34, 'head': 10, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 37, 'head': 10, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 39, 'head': 10, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 51, 'head': 46, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 63, 'head': 46, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 76, 'head': 46, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 183, 'head': 176, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 196, 'head': 192, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 199, 'head': 192, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 204, 'head': 192, 'relationLabel': 'EXPERIENCE_IN'}, {'child': 213, 'head': 192, 'relationLabel': 'EXPERIENCE_IN'}]"
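
Each row of the table above has a `tokens` list of annotated spans and a `relations` list that links them. The sketch below shows one way these annotations could be turned into readable form. The helper names `entities_by_label` and `resolve_relations` are hypothetical and not part of the original notebook. It also assumes that `head` and `child` in a relation refer to the `token_start` of an annotated span, which is inferred from the rows shown rather than from any documented schema.

```python
from collections import defaultdict


def entities_by_label(tokens):
    """Group annotated entity strings by their entityLabel (hypothetical helper)."""
    grouped = defaultdict(list)
    for token in tokens:
        grouped[token["entityLabel"]].append(token["text"])
    return dict(grouped)


def resolve_relations(tokens, relations):
    """Turn index-based relations into (head_text, label, child_text) triples.

    Assumes 'head' and 'child' match the 'token_start' of an annotated span;
    relations whose endpoints have no matching span are skipped.
    """
    text_by_start = {token["token_start"]: token["text"] for token in tokens}
    triples = []
    for relation in relations:
        head = text_by_start.get(relation["head"])
        child = text_by_start.get(relation["child"])
        if head is not None and child is not None:
            triples.append((head, relation["relationLabel"], child))
    return triples


# Annotations copied from row 7 of the table above.
sample_tokens = [
    {"text": "Ph.D.", "start": 19, "end": 24,
     "token_start": 4, "token_end": 4, "entityLabel": "DIPLOMA"},
    {"text": "Computer Science", "start": 28, "end": 44,
     "token_start": 6, "token_end": 7, "entityLabel": "DIPLOMA_MAJOR"},
]
sample_relations = [{"child": 6, "head": 4, "relationLabel": "DEGREE_IN"}]

print(entities_by_label(sample_tokens))
# {'DIPLOMA': ['Ph.D.'], 'DIPLOMA_MAJOR': ['Computer Science']}
print(resolve_relations(sample_tokens, sample_relations))
# [('Ph.D.', 'DEGREE_IN', 'Computer Science')]
```

If pandas has parsed the `tokens` column into lists of dicts, the first helper could be applied to the whole frame with `df['tokens'].apply(entities_by_label)`. Grouping by label this way gives a per-document reference to compare against whatever the generative model extracts.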


Let's run the extractor on these job descriptions, calling the API in parallel for quicker results: