# QAP Job Poster

This is the main code for posting open jobs for QA Professionals!

At a high level, this notebook walks through the whole process. Running the code will:

* Collect jobs from different sources (e.g., Indeed, Google Jobs)
* Filter relevant jobs using AI
* Post those jobs to the `#jobs` channel in the QAP Slack Community!

### 1. Create the SQLite Database to save our results from collecting jobs

* Import everything we'll need for this notebook
* Delete the database (if it exists) and create a new one

In [1]:
from typing import Dict, List

from pylenium.driver import Pylenium, PyleniumConfig

from jobs import ai, database, indeed, slack
from jobs.models import Job

database.delete()
database.create()
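
The `jobs.database` module is not shown in this notebook, so here is a minimal sketch of what a delete-then-create step could look like with the standard-library `sqlite3` module. The file name `jobs_sketch.db`, the `SCHEMA` string, and the column layout are assumptions for illustration. The columns simply mirror the `IndeedJob` fields printed at the end of the notebook.

```python
import os
import sqlite3
from contextlib import closing

DB_PATH = "jobs_sketch.db"  # hypothetical file name, not taken from the project

# Hypothetical schema: one table per step of the notebook, with columns
# mirroring the IndeedJob fields (origin, title, company, ...).
SCHEMA = """
CREATE TABLE IF NOT EXISTS indeed_jobs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    origin TEXT, title TEXT, company TEXT,
    location TEXT, share_link TEXT, salary TEXT
);
CREATE TABLE IF NOT EXISTS relevant_jobs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    origin TEXT, title TEXT, company TEXT,
    location TEXT, share_link TEXT, salary TEXT
);
"""


def delete(path: str = DB_PATH) -> None:
    """Remove the database file if it exists, so every run starts clean."""
    if os.path.exists(path):
        os.remove(path)


def create(path: str = DB_PATH) -> None:
    """Create a fresh database file containing both tables."""
    with closing(sqlite3.connect(path)) as conn:
        conn.executescript(SCHEMA)
        conn.commit()


delete()
create()
```

Starting from an empty file on every run keeps the tables free of jobs from earlier runs, at the cost of losing any history.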

### 2. Use Pylenium to scrape jobs from different sources

* Currently uses `Indeed.com`
* Eventually, will also use `Google Jobs`

In [2]:
py = Pylenium(PyleniumConfig())
scraped_jobs = indeed.scrape(py, indeed.REMOTE_QA_ENGINEER)

### 3. Save scraped jobs to our `indeed_jobs` table

We don't want to lose these results if an error happens later, or if we want to reuse them for something else.

In [3]:
database.insert_indeed_jobs(scraped_jobs)
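
As a rough illustration of this save step, the sketch below bulk-inserts job records with `sqlite3` and `executemany`. It is not the project's `insert_indeed_jobs`. The `SketchJob` dataclass, the table definition, and the in-memory database are assumptions, with field names taken from the `IndeedJob` output shown later.

```python
import sqlite3
from dataclasses import astuple, dataclass
from typing import List, Optional


@dataclass
class SketchJob:
    """Hypothetical stand-in for the project's IndeedJob model."""
    origin: str
    title: str
    company: str
    location: str
    share_link: str
    salary: Optional[str] = None


def insert_indeed_jobs(conn: sqlite3.Connection, jobs: List[SketchJob]) -> int:
    """Insert all jobs in one transaction and return how many rows were written."""
    rows = [astuple(job) for job in jobs]
    with conn:  # commits on success, rolls back if any insert fails
        conn.executemany(
            "INSERT INTO indeed_jobs "
            "(origin, title, company, location, share_link, salary) "
            "VALUES (?, ?, ?, ?, ?, ?)",
            rows,
        )
    return len(rows)


# Demo against an in-memory database so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE indeed_jobs (origin TEXT, title TEXT, company TEXT, "
    "location TEXT, share_link TEXT, salary TEXT)"
)
saved = insert_indeed_jobs(conn, [
    SketchJob("https://indeed.com", "QA Tester", "Bad Robot Games", "Remote", "link-1"),
    SketchJob("https://indeed.com", "Jr. QA Engineer", "Kyla", "Remote", "link-2"),
])
print("Rows saved:", saved)
```

The `?` placeholders let SQLite handle quoting, which matters for company names containing characters such as `&` or `'`.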

### 4. Use AI to filter irrelevant jobs

Unfortunately, Indeed doesn't do a great job of returning relevant results for its search terms and filters.

Fortunately, we can use AI to not only give us the relevant jobs, but to also rank them:

* Find jobs that would be relevant to Software QA Professionals
* Order them from junior-level to senior-level

In [4]:
# Use the fewest tokens possible to get the relevant jobs from all the scraped jobs
# The AI responds with a string
ai_response: str = ai.get_relevant_jobs(scraped_jobs)

# Convert the string output from the AI to a list of dictionaries that we can use in code
ai_ranked_jobs: List[Dict] = ai.convert_relevant_jobs_to_list(ai_response)
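
The notebook does not show what the AI's string response looks like, so the following is only a sketch of the string-to-list conversion under one assumption: the model replies with a JSON array of objects that each contain `company` and `title`, possibly surrounded by extra text. The function name `parse_ranked_jobs` is hypothetical and is not the project's `convert_relevant_jobs_to_list`.

```python
import json
from typing import Dict, List


def parse_ranked_jobs(ai_response: str) -> List[Dict]:
    """Extract a JSON array from the response and keep only usable entries.

    Assumes the response contains exactly one JSON array; any text before
    the first '[' or after the last ']' is ignored.
    """
    start = ai_response.find("[")
    end = ai_response.rfind("]")
    if start == -1 or end < start:
        raise ValueError("No JSON array found in the AI response")

    parsed = json.loads(ai_response[start:end + 1])

    # Keep only dictionaries that carry both keys needed for matching later.
    return [
        item for item in parsed
        if isinstance(item, dict) and "company" in item and "title" in item
    ]


sample_response = (
    "Here are the relevant jobs:\n"
    '[{"company": "Kyla", "title": "Jr. QA Engineer"}, {"title": "missing company"}]\n'
    "Let me know if you need more."
)
ranked = parse_ranked_jobs(sample_response)
print(ranked)
```

Validating the entries at this point means a malformed reply fails here instead of producing confusing `KeyError`s in the matching step.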

### 5. Use the `ai_ranked_jobs` to match the Jobs in `scraped_jobs`

We can do this by creating a simple "lookup table" keyed on each job's company and title:

In [5]:
def create_jobs_lookup(all_jobs: List[Job]) -> dict:
    """This 'lookup table' makes it easier to find relevant jobs within all jobs."""
    jobs_lookup = {}
    for job in all_jobs:
        key = f"{job.company} | {job.title}"
        if key not in jobs_lookup:
            jobs_lookup[key] = job
    return jobs_lookup

With this lookup, we can get back the full `Job` objects that match the AI's results!

In [6]:
lookup = create_jobs_lookup(scraped_jobs)
jobs = [job for j in ai_ranked_jobs if (job := lookup.get(f"{j['company']} | {j['title']}"))]
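
To make the matching behavior concrete, here is a small worked example with made-up data. `SketchJob`, `job_key`, and the `link-*` values are placeholders for illustration. The key format and the list comprehension follow the cells above.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SketchJob:
    """Hypothetical, trimmed-down stand-in for the project's Job model."""
    company: str
    title: str
    share_link: str


def job_key(company: str, title: str) -> str:
    """Build the same 'company | title' key used by the lookup table."""
    return f"{company} | {title}"


scraped: List[SketchJob] = [
    SketchJob("Kyla", "Jr. QA Engineer", "link-1"),
    SketchJob("Kyla", "Jr. QA Engineer", "link-2"),  # same key: first one wins
    SketchJob("Anonyome Labs", "Software Engineer", "link-3"),
]

lookup: Dict[str, SketchJob] = {}
for job in scraped:
    lookup.setdefault(job_key(job.company, job.title), job)

ai_ranked = [
    {"company": "Kyla", "title": "Jr. QA Engineer"},     # exact match
    {"company": "Kyla", "title": "Junior QA Engineer"},  # reworded: no match
]

# The walrus operator (Python 3.8+) keeps only entries found in the lookup.
matched = [
    job for j in ai_ranked
    if (job := lookup.get(job_key(j["company"], j["title"])))
]
print([job.share_link for job in matched])  # ['link-1']
```

Two things follow from this design: duplicate postings collapse to the first one scraped, and any job whose company or title the AI rewords is silently dropped, so the AI must return both values verbatim.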

Finally, save the jobs found by the AI to our `relevant_jobs` table.

In [7]:
database.insert_relevant_jobs(jobs)

### 6. Send the Top 20 Jobs to the Slack Channel

* Slack limits how many blocks a single message can contain (its `Block Limit`), so we only send the top 20

In [8]:
top_20_jobs = jobs[:20]
payload = slack.create_jobs_payload(top_20_jobs)
response = slack.post_to_channel(payload)

In [9]:
# If this cell runs without error, then the jobs were posted to Slack!
assert response.ok, f"Failed to post jobs to Slack: {response.text}"
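
The payload step in this section can be sketched with Slack's Block Kit as follows. This is not the project's `slack.create_jobs_payload`. The function names, the header text, and the one-section-plus-divider layout are assumptions. With that layout, 20 jobs produce 41 blocks, which stays under Slack's limit of 50 blocks per message.

```python
from typing import Dict, List

MAX_JOBS = 20  # matches the top-20 cut used above


def escape_mrkdwn(text: str) -> str:
    """Escape the three characters Slack treats as control characters."""
    return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")


def build_jobs_payload(jobs: List[Dict[str, str]]) -> Dict:
    """Build a message payload with a header, then a section and divider per job."""
    blocks: List[Dict] = [
        {"type": "header", "text": {"type": "plain_text", "text": "Open QA Jobs"}}
    ]
    for job in jobs[:MAX_JOBS]:
        title = escape_mrkdwn(job["title"])
        company = escape_mrkdwn(job["company"])
        location = escape_mrkdwn(job["location"])
        blocks.append({
            "type": "section",
            "text": {
                "type": "mrkdwn",
                # <url|label> is Slack's link syntax in mrkdwn text
                "text": f"*<{job['share_link']}|{title}>*\n{company} ({location})",
            },
        })
        blocks.append({"type": "divider"})
    return {"blocks": blocks}


# 25 placeholder jobs to show that the payload is capped at MAX_JOBS.
sample_jobs = [
    {
        "title": f"QA Engineer {i}",
        "company": "ES&S Voter Registration LLC",
        "location": "Remote",
        "share_link": f"https://example.com/job/{i}",
    }
    for i in range(25)
]
sketch_payload = build_jobs_payload(sample_jobs)
print("Blocks in payload:", len(sketch_payload["blocks"]))
```

A payload of this shape could then be sent as JSON to a Slack incoming webhook or through the Web API. How the project's `slack.post_to_channel` does this is not shown in the notebook.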

## (Optional) Explore the data

In [10]:
print("Relevant Jobs Found:", len(jobs))
jobs

Relevant Jobs Found: 20


[IndeedJob(origin='https://indeed.com', title='Junior QA Engineer', company='Remote Technology, Inc.', location='Remote', share_link='https://indeed.com/rc/clk?jk=88b705f19f529a30&bb=fLYvD_7LG2S8G6wuMT_izznnKiMZKUXUrAB0eikxJH6-5bSJpMp3uABiRT5eX2EXvv9ZzZb96rnNkGbEUzdFmM_9_YTfIUYeddPyIAhr6PY%3D&xkcb=SoDi67M3FNK1vMzRkR0AbzkdCdPP&fccid=55bfdb22b1429971&vjs=3', salary=None),
 IndeedJob(origin='https://indeed.com', title='Jr. QA Engineer', company='Kyla', location='Remote', share_link='https://indeed.com/rc/clk?jk=b0133ba0f944c93d&bb=_Ot3_dDWnzAC2JfHSA8-p3HOaYy2G1yB33CxksZ0iphEAo4APfNCDhCVz_LRpR849ptztWbeEeQ3VA9_9xfIjKrnRZK-z9F3wR7ZaK7fCmQ%3D&xkcb=SoB-67M3FNKzkOTCqh0KbzkdCdPP&fccid=1f46884a3fc52544&vjs=3', salary=None),
 IndeedJob(origin='https://indeed.com', title='QA Tester', company='Bad Robot Games', location='Remote', share_link='https://indeed.com/rc/clk?jk=4f4c457be06e6c19&bb=D9smhDLbuqbhzoo73rnjKu2S60G-vyhp1JoWhfvxXhrIadUdXsiN2faU7zhXBa4XtBu19uB1jV2LeEIyg9rg1MHBAoF2QA5SVuGitazBOgo%3D

In [11]:
print("Total Jobs Scraped:", len(scraped_jobs))
scraped_jobs

Total Jobs Scraped: 75


[IndeedJob(origin='https://indeed.com', title='Angular Software Engineer', company='ES&S Voter Registration LLC', location='Remote', share_link='https://indeed.com/rc/clk?jk=69b0c9eb016e1f38&bb=fLYvD_7LG2S8G6wuMT_iz9ey8ftR8AjI6qRVXD8bkI4srxGTa5j_5sICe9cQakPijG-WE48t62z-d8RgW8BJuEsLbrDFdGTqtP-cvk0uNck%3D&xkcb=SoAY67M3FNK1vMzRkR0LbzkdCdPP&fccid=dd616958bd9ddc12&vjs=3', salary='$95,000 - $120,000 a year'),
 IndeedJob(origin='https://indeed.com', title='Software Engineer', company='Anonyome Labs', location='Remote', share_link='https://indeed.com/rc/clk?jk=b503805dba3cd9d8&bb=fLYvD_7LG2S8G6wuMT_iz5_zNMhWUQCKFIq24qwQtGPqoXG3H1wjMrXBiA4rUm1kcPCxoCOn5m85BIYAfdtxVP8nAaH2JbEO82Dzvt-4HqI%3D&xkcb=SoCs67M3FNK1vMzRkR0KbzkdCdPP&fccid=e140ab80cac32f76&vjs=3', salary=None),
 IndeedJob(origin='https://indeed.com', title='Frontend Software Engineer', company='Resourcely', location='Remote', share_link='https://indeed.com/rc/clk?jk=0edbac704d39d4f9&bb=fLYvD_7LG2S8G6wuMT_iz5_zNMhWUQCKqaf56NzfKJQ4YNFbHx6zg