---
format:
    html:
        embed-resources: true
---

# Crawling 

## Overview 

In this portion of the project, we will crawl Google Jobs to collect job descriptions for later processing.

We will use the `serpapi` API to crawl Google Jobs. `serpapi` is a paid API, but it has a free tier that should be more than enough for this project. The API lets us search Google programmatically, which has a wealth of practical applications.

For instructions on the API, see the following reference resources:

* [https://serpapi.com/google-jobs-api](https://serpapi.com/google-jobs-api)
* [https://serpapi.com/blog/scrape-google-jobs-organic-results-with-python/](https://serpapi.com/blog/scrape-google-jobs-organic-results-with-python/)
* [https://serpapi.com/integrations/python](https://serpapi.com/integrations/python)

## Starter code 

Here is some starter code:

`Note: uule parameter`

The uule parameter is an encoded location parameter used in Google search queries. It stands for "Unique User Location Encoding" and is used to specify the geographic location from which the search is being conducted. This can influence the search results to be more relevant to the specified location.

This can be set to `'w+CAIQICINVW5pdGVkIFN0YXRlcw'`, which is an encoded string representing a specific location, i.e. the United States. This encoding helps simulate searches as if they are being conducted from that location, which can be useful for testing or gathering location-specific data.
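As a rough illustration of what "encoded" means here, the sketch below decodes the string above. It assumes, based only on inspecting this one value, that the text after `w+` is base64 and that the location name follows a six-byte prefix whose last byte is the name length. That layout is an observation, not a documented format, and `decode_uule_location` is a hypothetical helper.

```python
import base64

def decode_uule_location(uule: str) -> str:
    """Best-effort decode of the location name embedded in a uule value."""
    # Drop the "w+" prefix and restore the base64 padding that the string omits.
    payload = uule.split("+", 1)[1]
    payload += "=" * (-len(payload) % 4)
    raw = base64.urlsafe_b64decode(payload)
    # Assumed layout (from inspecting this one value): byte 5 holds the
    # length of the name, and the name itself starts at byte 6.
    name_length = raw[5]
    return raw[6:6 + name_length].decode("utf-8")

print(decode_uule_location("w+CAIQICINVW5pdGVkIFN0YXRlcw"))  # prints: United States
```

Other uule values may not follow this layout, so treat the helper as a sanity check for this one string.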

In [5]:
from serpapi import GoogleSearch
import json

Save the API key in a centralized location, e.g. `~/.api-keys.json`.

Read it in with the `json` module.

In [2]:
import json
import os

# Expand "~" to the current user's home directory instead of hard-coding it
with open(os.path.expanduser('~/.api-keys.json')) as f:
    keys = json.load(f)
API_KEY = keys['serpapi']
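If several notebooks share the same key file, the loading step can be wrapped in a small helper. This is a minimal sketch, assuming the file is a JSON object that maps service names to keys (for example `{"serpapi": "YOUR_KEY"}`); `load_api_key` is a hypothetical name, not part of any library.

```python
import json
import os

def load_api_key(service: str, path: str = "~/.api-keys.json") -> str:
    """Return the key stored under `service` in a JSON key file."""
    full_path = os.path.expanduser(path)  # "~" becomes the user's home directory
    with open(full_path) as f:
        keys = json.load(f)
    if service not in keys:
        # Fail with a message that names the file, to make typos easy to spot
        raise KeyError(f"No entry named '{service}' in {full_path}")
    return keys[service]
```

With this in place, `API_KEY = load_api_key("serpapi")` would replace the cell above, and the key itself never appears in the notebook.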

In [3]:
search_query = 'data science'
params = {
	'api_key':API_KEY,                          # https://serpapi.com/manage-api-key
	'uule': 'w+CAIQICINVW5pdGVkIFN0YXRlcw',		# encoded location (USA)
	'q': search_query,              			# search query
    'hl': 'en',                         		# language of the search
    'gl': 'us',                         		# country of the search
	'engine': 'google_jobs',					# SerpApi search engine
}

Let's do one search and explore the output.

In [6]:
search = GoogleSearch(params)   			# where data extraction happens on the SerpApi backend
result_dict = search.get_dict() 			# JSON -> Python dict

if 'error' in result_dict:
    print("ERROR FOUND IN SEARCH:", result_dict['error'])

In [7]:
for result in result_dict['jobs_results']:
    print(result)
    # google_jobs_results.append(result)

{'title': 'Senior Director, R&D Data Science and Digital Health, Real-World Evidence and Advanced Analytics', 'company_name': 'Johnson & Johnson', 'location': 'Raritan, NJ', 'via': 'LinkedIn', 'share_link': 'https://www.google.com/search?ibp=htl;jobs&q=data+science&htidocid=xk8fCo2_LI2kOQQJAAAAAA%3D%3D&hl=en-US&shndl=-1&source=sh/x/job/li/m1/1#fpstate=tldetail&htivrt=jobs&htiq=data+science&htidocid=xk8fCo2_LI2kOQQJAAAAAA%3D%3D', 'thumbnail': 'https://serpapi.com/searches/671d8d5e7ebde59febea6cb1/images/38d82cd27cffaaa7d885388f748141ba8257bfd16e379fa3e64fc0cf0bc18251.jpeg', 'extensions': ['1 day ago', 'Full-time', 'Health insurance', 'Dental insurance', 'Paid time off'], 'detected_extensions': {'posted_at': '1 day ago', 'schedule_type': 'Full-time', 'health_insurance': True, 'dental_coverage': True, 'paid_time_off': True}, 'description': 'Description:\n\nJohnson & Johnson Innovative Medicine is recruiting for R&D Data Science, Real-World Evidence (RWE) Programming and Tools. This positi

In web crawling, pagination involves retrieving data across multiple pages of a website. It helps manage large datasets by fetching a limited number of results per request, enabling efficient data extraction without overwhelming the server or exceeding resource limits.

You can use pagination to get more results for a given search, but each request must start where the previous one left off.

You do this by adding the `next_page_token` to the `params` dictionary.

You get the most recent `next_page_token` from `result_dict["serpapi_pagination"]`.

If you don't do pagination, and search 10 times, you will just get the first 10 results over and over again.
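The token hand-off can be sketched without spending any API searches. In the example below, `collect_pages`, `fake_fetch`, and `FAKE_PAGES` are hypothetical stand-ins for the real API call; only the keys `jobs_results`, `serpapi_pagination`, and `next_page_token` come from the responses shown in this notebook.

```python
def collect_pages(fetch_page, max_pages=3):
    """Call fetch_page(token) repeatedly, feeding each page's token into the next call."""
    all_jobs = []
    token = None  # the first request carries no token
    for _ in range(max_pages):
        result = fetch_page(token)
        all_jobs.extend(result.get("jobs_results", []))
        # A missing token means there are no further pages
        token = result.get("serpapi_pagination", {}).get("next_page_token")
        if not token:
            break
    return all_jobs

# Stand-in for the real API: three fake pages keyed by the token that requests them
FAKE_PAGES = {
    None: {"jobs_results": ["job-1", "job-2"], "serpapi_pagination": {"next_page_token": "t1"}},
    "t1": {"jobs_results": ["job-3", "job-4"], "serpapi_pagination": {"next_page_token": "t2"}},
    "t2": {"jobs_results": ["job-5"]},
}

def fake_fetch(token):
    return FAKE_PAGES[token]

print(collect_pages(fake_fetch))
```

Against the real service, `fetch_page` would build the `params` dictionary, add `next_page_token` when a token is given, and return the result dictionary.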

In [8]:
print(result_dict["serpapi_pagination"],"\n")
print(result_dict["serpapi_pagination"]["next_page_token"])

{'next_page_token': 'eyJmYyI6IkVvd0ZDc3dFUVVwSE9VcHJUbkV3TWtNd1dXNUxiSGxSVkU1UE5uTnlRVVpNT1VWMFVXY3liekkyUVdGaVdWUmFNbVoxYjBaSWVGVmZObTlrTkdKSmRVczNRVTlmUlRaVFFWSkZUV3hrTkZwME5XNTZZV1J0WDAxT2JFMTBSRFkxUm1kYWNXSkRPRE4xU0RkV1puTklVamhwU1hST1QyVnNVVzAzU0RkeVJYVlhaVUZYUmtkbFQwUjFWR3gyVUVGNVQyUmZiVWhzYkZCeE1tWkhSMlJNVEZGaFExTldRMjgxV1hNeVZuaGpSa3g0YjBwT2IzUXRiV055ZDBkWk5rcFdTSEV5ZDIxVVJFdEpPVU5aTmtONFpYWlRSVkZ6V2s1eVREWkZNR1JKY1hkVVdsSmZiMjlJYkU1aldHWXpOMDUzTFVsRWJtbEZOVlpmYjBwd2JUQTJMV0Y1VEdReU5qQk1XRXg0TnpWVlFUVjJTVjlPUTFGUmFUSnNUMUpYTWpkZmFYUnlhRlJxY0ZaRGRYSlBjSGg1V1dSamJEZHphMTlrZGtKaWVrdzJOamxUVlhkS2JpMWFWazlWT0hsdlMyMU1iWGhUY214WGRYZG5XRGsxT0VKaFgzWmlVM1JoWmpGUFNscGFhak5LWm1SdU5qbHpSRGxsUVdjd1lYUkhOWE5wYzNsNWN6bERTRlYzWDA1eldGQjZhVUY1VUVGdE4wWlJhamM1YW1SRU5EWjRjazB3VlRsUVF6Rm5iazltZWxjNGJGRmtWa1Y1T0c1NWIxTkZabUp3VlhaQ1EyTXRNMnhrY1hoVFpWWXlSalpMTTFOS05rTkNaRVUwTXpWUk5sVk1jemhFVHpacGRtVXdVMTlKUWtZM09WbElSR0ZmWVhCb2FEQlJUMlEyVW5WdmRsOHhlR0ZIZFhsaE1UZFBTMUZYTVROQ2FFVmlURU5yTlU1Q2JGZGZXbVp

Let's do one more search, this time with pagination, starting where the last search left off.

In [12]:
search_query = 'data science'
params = {
	'api_key':API_KEY,                          # https://serpapi.com/manage-api-key
	'uule': 'w+CAIQICINVW5pdGVkIFN0YXRlcw',		# encoded location (USA)
	'q': search_query,              			# search query
    'hl': 'en',                         		# language of the search
    'gl': 'us',                         		# country of the search
    "num": 10,									# number of results per page
	'engine': 'google_jobs',					# SerpApi search engine
    'next_page_token': result_dict["serpapi_pagination"]["next_page_token"]
}

In [13]:
search = GoogleSearch(params)   			# where data extraction happens on the SerpApi backend
result_dict = search.get_dict() 			# JSON -> Python dict

if 'error' in result_dict:
    print("ERROR FOUND IN SEARCH:", result_dict['error'])

In [14]:
for result in result_dict['jobs_results']:
    print(result)
    # google_jobs_results.append(result)

{'title': 'Principal Data Scientist', 'company_name': 'Hp', 'location': 'Colorado, TX', 'via': 'ZipRecruiter', 'share_link': 'https://www.google.com/search?ibp=htl;jobs&q=data+science&htidocid=Q8aAld1zBtNh7HBHAAAAAA%3D%3D&hl=en-US&shndl=-1&source=sh/x/job/li/m1/1#fpstate=tldetail&htivrt=jobs&htiq=data+science&htidocid=Q8aAld1zBtNh7HBHAAAAAA%3D%3D', 'thumbnail': 'https://serpapi.com/searches/67154854400bd50bf3e5c040/images/72889b0014d22365ed2a0cd83cdc7b94235cb5df4ed96b34b8259ad67bd9691f.gif', 'extensions': ['2 days ago', 'Full-time', 'Dental insurance', 'Health insurance', 'Paid time off'], 'detected_extensions': {'posted_at': '2 days ago', 'schedule_type': 'Full-time', 'dental_coverage': True, 'health_insurance': True, 'paid_time_off': True}, 'description': "Principal Data Scientist\n\nDescription -\n\nHP's Digital and Transformation Organization (D&TO) is focused on building world-class digital capabilities, and our Data Science team is responsible for leading the development the data

## Utility function

Create a utility function to search Google Jobs and save the results to a file.

Here is one sketch of what the function might look like:

- Imports the current date and time using `datetime`.
- Defines `search_google_jobs` to perform a Google Jobs search with a default or custom query.
- Accepts parameters for the search query, pagination token, and verbosity.
- Sets search parameters like API key, location, language, and search engine.
- Appends the pagination token if provided.
- Creates a timestamped output filename based on the query and time.
- Does a search and data extraction.
- Optionally prints the data if `verbose` is `True` and saves results to a JSON file.
- Returns the `next_page_token` for pagination or handles errors.

In [28]:
# INSERT CODE HERE
import os
from datetime import datetime
import re

def search_google_jobs(search_query="data scientist", next_page_token=None, verbose=False, return_jobs=False):
    # Load API key from a centralized location
    path_to_keys = os.path.expanduser("~/.api-keys.json")
    with open(path_to_keys, 'r') as f:
        keys = json.load(f)
    API_KEY = keys['serpapi']

    # Define search parameters
    params = {
        'api_key': API_KEY,
        'uule': 'w+CAIQICINVW5pdGVkIFN0YXRlcw',  # Encoded location for United States
        'q': search_query,
        'hl': 'en',  # Language of search
        'gl': 'us',  # Country of search
        'engine': 'google_jobs',  # SerpApi search engine
        'num': 10  # Number of results per page
    }

    # Append pagination token if provided
    if next_page_token:
        params['next_page_token'] = next_page_token

    # Create a timestamped filename based on query and time, and ensure it's saved to the "data" folder
    timestamp = datetime.now().strftime("%Y-%m-%d-%H-%M-%S")
    formatted_query = re.sub(r'[ /()]', '-', search_query).lower()
    output_filename = f"data/{formatted_query}-{timestamp}.json"

    # Ensure the "data" directory exists
    os.makedirs("data", exist_ok=True)

    try:
        # Perform the search
        search = GoogleSearch(params)
        result_dict = search.get_dict()

        # Check for errors in the result
        if 'error' in result_dict:
            print("Error found in search:", result_dict['error'])
            # Keep the return shape consistent with the return_jobs flag
            return (None, []) if return_jobs else None

        # Extract next page token for pagination (None if there are no more pages)
        next_page_token = result_dict.get("serpapi_pagination", {}).get("next_page_token")
        jobs = result_dict.get('jobs_results', [])

        # Optionally print the results if verbose is True
        if verbose:
            print(json.dumps(jobs, indent=2))

        # Save results to a JSON file in the "data" directory
        with open(output_filename, 'w') as outfile:
            json.dump(jobs, outfile, indent=2)

        # Return next_page_token, plus the job results if requested
        if return_jobs:
            return next_page_token, jobs
        else:
            return next_page_token

    except Exception as e:
        print("An error occurred:", e)
        return (None, []) if return_jobs else None


In [10]:
next_page_token = search_google_jobs(search_query="machine learning engineer", verbose=True)

[
  {
    "title": "Machine Learning Research Engineer, Agent Applications",
    "company_name": "Scale AI",
    "location": "Seattle, WA",
    "via": "LinkedIn",
    "share_link": "https://www.google.com/search?ibp=htl;jobs&q=machine+learning+engineer&htidocid=YATcA0O0R4zdR1UZAAAAAA%3D%3D&hl=en-US&shndl=-1&source=sh/x/job/li/m1/1#fpstate=tldetail&htivrt=jobs&htiq=machine+learning+engineer&htidocid=YATcA0O0R4zdR1UZAAAAAA%3D%3D",
    "thumbnail": "https://serpapi.com/searches/671d8ed174f0a4fc8a98aff6/images/1721331cca54d3a6bd2659205cb921ad0ac69ef13e72054cb9b92c9640189527.jpeg",
    "extensions": [
      "20 hours ago",
      "Full-time",
      "Paid time off",
      "Dental insurance",
      "Health insurance"
    ],
    "detected_extensions": {
      "posted_at": "20 hours ago",
      "schedule_type": "Full-time",
      "paid_time_off": true,
      "dental_coverage": true,
      "health_insurance": true
    },
    "description": "About Scale\n\nAt Scale AI, our mission is to accelerate

In [12]:
next_page_token

'eyJmYyI6IkVxSUZDdUlFUVVwSE9VcHJUbFpMVDFkcFYxUnFUVTV2TWtrMlUxTkZTblJyWVc0MFgxRldibFpWV2pKbGFHeEdjVFZCUkc5d01IWk5jV3RvUVc5VVUyTlVTM2x2Y1VkbVJFRkhTV1ZZY1haSlVtRk1PV05vZDBGd2VWaHRhVVpVWVY5MVVqUTJTekppU3pKeFoyMUVZWFJQTmsxcFMxVmpNSFZOWlVSa1ZEZ3dVRE5tZFhOM1ZtMVhNRGcxWDBaNFRIWk9SMGhUUm14d2IwTTFkblU1TkZwblNGaGlNekpZUVZrMk4zRnlja2xYVTI0NVdWZHZiR05EZUZWRU1HbHJlSHBtYUMxVFJFVnRPR2hqYzBzeE9EWm1ZMFkxWDJkVFJFRTRaamxmYlVsT1lVUkRSMjVxVWs1R1UyTkxWRzVqYVRKQlJETnZVRUZoVEdKU2VqTkxia1UxUVd0MU0zQm1OWE5pTmxNNUxUSkZORWhTWkRaME5UaG1kbUZLWDFGRVpETjZWSFZDTWw4eWQwZFBNeTFhZEVWSWVWWlZZa2cyYlZsU2JYa3dhWEJ5TkdsdVlscHJVMkp0TVU5VGFYRlZOek00YlVwWVJVeDFWVEJKUzNremIwbHNhbVV0VVZabmVFcDBMWGhaTW1SWFFuSlphSE5mU0ROdGJWZEZVR0ZMUTFweVFYaDBlRE42YmpkcExVSnBVMUJ6ZUdoT1ZFMVNSRlF0WXpCTFZsTk9SbDlLVVRBM1pYWnRRMWN4TVZJM1ZXVTVja0pVTWxwU1gwVjRXSEZRY1VWRmNFeGhVMUZTYVc5Uk1tUlRhSFEzZFRGRGVFTkdhRTUwUTI5SmVFTlBibFl3U1Zrd1RFdDZjR0l6YzNWMExUZEZSMGhMWVZsclZqSllUSFZoUlZONVprMDJUMnhFWmxkcFNrOW1hbU5pWkROUE1HUnRNM1F5VXpaMGQya3dZek0zTkZjMVRWcEZTbWRNTjI

In [13]:
search_google_jobs(search_query="machine learning engineer", next_page_token=next_page_token, verbose=True)

[
  {
    "title": "Staff Machine Learning Engineer",
    "company_name": "ServiceNow",
    "location": "Anywhere",
    "via": "Careers | ServiceNow",
    "share_link": "https://www.google.com/search?ibp=htl;jobs&q=machine+learning+engineer&htidocid=ltBhKtzmYa2AVL6NAAAAAA%3D%3D&hl=en-US&shndl=-1&source=sh/x/job/li/m1/1#fpstate=tldetail&htivrt=jobs&htiq=machine+learning+engineer&htidocid=ltBhKtzmYa2AVL6NAAAAAA%3D%3D",
    "thumbnail": "https://serpapi.com/searches/671d916b1fc7c5b27b453a80/images/42984474773b94fccc9d01fa7cd2fd871a6b1cdbceb949d6942f297cf836c9c8.gif",
    "extensions": [
      "9 days ago",
      "Work from home",
      "Full-time",
      "Health insurance",
      "Dental insurance"
    ],
    "detected_extensions": {
      "posted_at": "9 days ago",
      "work_from_home": true,
      "schedule_type": "Full-time",
      "health_insurance": true,
      "dental_coverage": true
    },
    "description": "Company Description\n\nIt all started in sunny San Diego, California in

'eyJmYyI6IkVvd0ZDc3dFUVVwSE9VcHJUbmRrYW5aMU5rVnlVVkowTm1jeGVsbHhTVjlzY21Nd2RGVnRabkEyVHpkTlgyZG1ja0oyZURBNFdFZzVkRnBXTjBwd2VFUlJkREJ6ZUd3M2JXRXhjRUpETTFkeU5XSlNRUzFRTldWaWJtTjBiVFZoZHpGYWVEUk9RbHB1WW1NdFZIVnVWa3QyUVc0NVUzVnFNbEkzZW5oMVRFaEtkbFUwVkZkSU1VcHNka1pOWlU5VE5VVnFXbTlFUkZGT1pFSTBVRTFZYXpaWmVYWkpYMTh4WVd4clN6UkdhWEZvWkVWeGQxcFNjR0pEWWtjeVdGZHNRbTF0YUU1S1QwdGxSMk53VlRWWFdFRkdWR1pVV1V0ak9FbFNkMHR1Y2xoeE5FRjJSM2RqYUZkaFltRmlUakJaTURsUFNVeFNRbVZtZGtOWlVEZENha3RxTnkxVFRVVTVNR0V0WDI5NVh6VmxWRlZ4VkdaTFFtY3pNVUZsVGtKNFh6SnpVMmxRTmxZdFkzQTVVMDkxVERGU2RuTk9PRkJTWTFVNFMxZEdiWG8yVFRGeVJXUkhTa1ZmV0c1RlNtMVVSR1pHUjFkemJIWmtWV1p1VDA5NGVETjJkSEZhU1MxSU0wUkRXbUUyU3pKdUxUQkVPRlJHV0RaeGVuWTFMVXgxT1hOblpETktVVmt0TVZkdWJWVTNTM0ZWYTNoQ1l6Ukxia1ZXVGtWUWF6SnlTa3N3YkRSSlQxRlBlVWhuU2xkaFh6QlRNV0ZaVUROTlNGQnJSMFJqTkVWaWFrbHdRMjVMYlRKbVlWRTNNR1k1Y21KTGJqWlNXbkJvY1VrdGMxUlhlRzFYV2tOcE1uZG5RWGhLYUZOek5YbzVPRXRRY0hGeU5HaFVUbVpEUzJ4c1dtODRTM3A0U0VsaVJrNHdjMHBoWVRGT1FtUm9hSFJvUVhOVlZIVk9SMVJwZWpsT2NGTkdiVVJMV0Z

In [14]:
for result in result_dict['jobs_results']:
    print(result["title"], result["company_name"])

Senior Director, R&D Data Science and Digital Health, Real-World Evidence and Advanced Analytics Johnson & Johnson
Intern Data Science AI&I - PHD (On-site) Mayo Clinic
Data Scientist, Paramount Advertising Paramount
Data Scientist (NO SUBCONTRACTING) VALERE
Data Scientist/Statistician Intern Summer 2025 (MS/PhD) Lubrizol Corporation
Senior Data Scientist, ASE iCloud Data Organization [Executive Communications] Apple
Director of Data Science Cooley LLP
Principal Associate, Data Scientist - Card Partnerships Capital One
Data Science Solution Specialist - Generative AI Deloitte
Data Scientist – Generative AI Passcreator


## Iterate over job titles

These titles reflect a wide range of roles that leverage data science and machine learning skills in various industries and specialties.

- Data Scientist
- Machine Learning Engineer
- Artificial Intelligence Specialist
- Data Analyst
- Business Intelligence Analyst
- Research Scientist (AI/ML)
- Deep Learning Engineer
- NLP Engineer (Natural Language Processing)
- Computer Vision Engineer
- Data Engineer
- Applied Scientist
- Quantitative Analyst (Quant)
- Predictive Modeler
- AI Solutions Architect
- Statistician
- Big Data Engineer
- Data Science Consultant
- Automation Engineer
- Analytics Manager
- Decision Scientist
- Operations Research Analyst
- Robotics Engineer
- Bioinformatics Data Scientist
- Healthcare Data Analyst
- Financial Data Scientist
- Customer Insights Analyst
- Marketing Data Analyst
- Data Strategy Manager
- Cloud AI Engineer
- Computational Scientist
- Fraud Detection Specialist
- Risk Analyst
- Data Architect
- Algorithm Engineer

For each keyword, do three searches using pagination. This will yield around 30 jobs per keyword (assuming at least 30 jobs exist for that keyword). Save the results of each search to a file.

Note: just to be safe, wait one second between requests, e.g. using `time.sleep(1)`.

In [15]:
job_titles = [
    "Data Scientist",
    "Machine Learning Engineer",
    "Artificial Intelligence Specialist",
    "Data Analyst",
    "Business Intelligence Analyst",
    "Research Scientist (AI-ML)",
    "Deep Learning Engineer",
    "NLP Engineer (Natural Language Processing)",
    "Computer Vision Engineer",
    "Data Engineer",
    "Applied Scientist",
    "Quantitative Analyst (Quant)",
    "AI Solutions Architect",
    "Statistician",
    "Big Data Engineer",
    "Data Science Consultant",
    "Automation Engineer",
    "Analytics Manager",
    "Operations Research Analyst",
    "Robotics Engineer",
    "Bioinformatics Data Scientist",
    "Financial Data Scientist",
    "Customer Insights Analyst",
    "Marketing Data Analyst",
    "Data Strategy Manager",
    "Cloud AI Engineer",
    "Computational Scientist",
    "Fraud Detection Specialist",
    "Risk Analyst",
    "Data Architect"
]

print(len(job_titles)*3)

90


Now insert code to iterate over the job titles, and perform the searches.

Be very careful, this needs to be 100% correct before running it, otherwise you will burn through your free searches.

I would recommend doing just one iteration of the loop as a trial run. If that looks good, do the next iteration and carefully check the results. If everything still looks good, do the remaining 28 iterations.

Note: sometimes pagination will return fewer than 10 results, so you may end up with slightly fewer than 30 results per keyword, e.g. 25 to 30.
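One way to check how many results each keyword actually produced is to count the entries in the saved files. The sketch below assumes the files follow the `data/<keyword>-YYYY-MM-DD-HH-MM-SS.json` naming used by the utility function above and that each file holds a JSON list of jobs; `count_jobs_per_keyword` is a hypothetical helper.

```python
import glob
import json
import os
import re
from collections import Counter

# Assumed naming scheme: <keyword>-YYYY-MM-DD-HH-MM-SS.json
FILENAME_PATTERN = re.compile(r"^(.*)-\d{4}(?:-\d{2}){5}\.json$")

def count_jobs_per_keyword(data_dir="data"):
    """Sum the number of saved jobs per keyword across all result files."""
    counts = Counter()
    for path in sorted(glob.glob(os.path.join(data_dir, "*.json"))):
        match = FILENAME_PATTERN.match(os.path.basename(path))
        if not match:
            continue  # skip files that do not follow the naming scheme
        with open(path) as f:
            jobs = json.load(f)
        counts[match.group(1)] += len(jobs)
    return counts
```

Calling `count_jobs_per_keyword()` after the trial run shows at a glance whether a keyword came back with far fewer jobs than expected, before more searches are spent.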

Remember to clean the job titles to remove any characters like spaces, `/` or `()`.

In [24]:
# INSERT CODE HERE
import time

# Assuming the modified search_google_jobs function is already defined

# Define a function to clean job titles
def clean_job_title(title):
    # Remove special characters, such as spaces, slashes, and parentheses, for filename safety
    cleaned_title = re.sub(r'[ /()]', '-', title)
    cleaned_title = re.sub(r'--+', '-', cleaned_title)  # Replace multiple consecutive '-' with a single '-'
    cleaned_title = cleaned_title.strip('-')  # Remove leading and trailing '-'
    return cleaned_title.lower()


In [23]:
# Iterate over the full list of job titles
for job_title in job_titles:
    print('-' * 24)
    print(job_title)
    # Initialize next_page_token as None
    next_page_token = None

    # Perform 3 searches, using pagination
    for page in range(3):
        print(f"SEARCH- {page}")
        # With return_jobs=True the function returns (next_page_token, job_results)
        # and also saves the results to a file in the "data" folder
        next_page_token, job_results = search_google_jobs(
            search_query=job_title,
            next_page_token=next_page_token,
            verbose=False,
            return_jobs=True
        )

        # Print job titles from the search results
        for job in job_results:
            title = job.get('title', 'N/A')
            company = job.get('company_name', 'N/A')
            print(f"{title} : {company}")

        # Check if next_page_token is None, indicating no more pages
        if not next_page_token:
            print(f"No more results for '{job_title}' after page {page + 1}.")
            break  # Exit the pagination loop

        # Wait for 1 second to avoid making requests too quickly
        time.sleep(1)

------------------------
Data Scientist
SEARCH- 0
Data Scientist : KBR
Data Scientist Junior : Parsons Corporation
Data Scientist in Residence : Apziva
Senior Data Scientist - Advanced Analytics and Machine Learning : Highmark Health
Product Data Scientist (Decisions Alliance) : HelloFresh
Palantir Data Scientist Consultant : Deloitte
Data Scientist - Linguistics and Data Modeling : Tahzoo
Data Scientist Junior : 001 Parsons Government Services Inc.
Senior Data Scientist - Proficiency in Python for data analysis and visualization : Resiliency LLC
Sr. Data Scientist (Remote) : Irvine Technology Corporation
SEARCH- 1
Wildfire Data Scientist : Xcel Energy
Research Data Scientist II/Senior : Iambic Therapeutics
Full Stack Data Scientist : Cardinal Health
Data Scientist - 100% US Remote : Sprinklr Inc.
Data Scientist, Product (Marketplace) : Thumbtack
Senior Scientist - Computational Biology/Data Science, AI/ML : Merck & Co., Inc
Data Scientist I - ML : Cotiviti
Data Scientist - Forecasting