# LinkedIn Market Analysis


### Process:
1. Scrape data from Linkedin and Glassdoor, using Selenium.
2. EDA, cleaning, and export to CSV.
3. Compare to previous years' data (2 and 4 years ago) using tableau.

### Additional Datasets:
- 2018: https://www.kaggle.com/datasets/discdiver/data-scientist-general-skills-2018-revised (skills specific)
- 2020: https://www.kaggle.com/datasets/andrewmvd/data-analyst-jobs (jobs, salary, and location)
- 2022: [my submission to Kaggle]

### Reference Notebooks:
- https://www.kaggle.com/code/gawainlai/us-data-science-job-salary-regression-w-visuals (beyond my skill level)
- https://www.kaggle.com/code/discdiver/the-most-in-demand-skills-for-data-scientists (top skills)

## Import Libraries

In [37]:
# selenium imports
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import Select
from selenium.webdriver.common.action_chains import ActionChains

# web scraping imports
from bs4 import BeautifulSoup
import requests

# database imports
import re as re
import time
import pandas as pd
import os
import numpy as np

# import function to save different files to csv (ie, job links)
import csv
import datetime

# import for description scrape
import urllib.request

# import and load file to login to LinkedIn
from dotenv import load_dotenv
load_dotenv()

True

## Scrape LinkedIn for Job Links Using the Selenium Driver

In [38]:
# initialize the LinkedIn scrape

# Options
options = webdriver.ChromeOptions() # init for chrome
options.add_argument('--incognito') # runs chrome in a 'clean slate' window
#options.add_argument('--headless') # runs chromedriver in the background, without opening a window

# Initialize the selenium driver
driver = webdriver.Chrome(options = options, executable_path='./chromedriver')
login_url = "https://www.linkedin.com/uas/login"

# Start the page
driver.get(login_url)
time.sleep(3)

# Target the login elements
email = driver.find_element("id", "username")
password = driver.find_element("id", "password")

# Load env variables
my_email = os.getenv("linkedin_username")
my_password = os.getenv("linkedin_password")

# Input in the form
email.send_keys(my_email)
password.send_keys(my_password)
password.send_keys(Keys.RETURN)

  driver = webdriver.Chrome(options = options, executable_path='./chromedriver')


In [39]:
def scrape_links(data_role, location):
    """ Scrape 40 pages of a LinkedIn job search for job links, using the given data role as the search term """
    
    # SCRAPE 40 PAGES
    for i in range(10): # FIX: change back to 40 for final analysis
        print(f'Scraping {i+1} of 40 pages for {data_role} in {location}.')
        
        # navigate to the correct page
        scrape_url = f"https://www.linkedin.com/jobs/search/?&keywords={data_role}&location={location}&refresh=true&start={i*25}"
        # TEST: https://www.linkedin.com/jobs/search/?&keywords=data%20analyst&location=Los%20Angeles%2C%20California%2C%20United%20States&refresh=true&start=1
        if i == 0:
            scrape_url = f"https://www.linkedin.com/jobs/search/?&keywords={data_role}&location={location}&refresh=true&start={1}"
        driver.get(scrape_url)
        time.sleep(5)

        # convert page text to beautiful soup
        src = driver.page_source
        soup_for_page = BeautifulSoup(src, 'lxml')
        
        # create a list of jobs on the current page, to iterate through after each scrape
        job_links = []
        jobs_on_page = soup_for_page.find_all("a", attrs={"class":"disabled ember-view job-card-container__link job-card-list__title"})
        for k in jobs_on_page: # length of jobs varies by page
            job_links.append(k["href"])
        print(f'Job links collected from page {i+1}:', len(job_links)) # DEBUG
        #print(f'job links from page {i+1}:',job_links) # DEBUG
        
    return job_links

In [40]:
def flatten_2d_list(list_):
    """ Flatten a 2d list of lists into a 1d list """
    list_flattened = [a for y in list_ for a in y]
    return list(set(list_flattened))

In [41]:
# LISTS FOR SCRAPING for job links

# 5 titles taken from market analysis, used to capture all links to scrape with matching search term.
data_roles = ['data analyst','data scientist','data engineer','data architect','data manager']
# removed from final scrape to reduce noise and risk of account ban: 'finance analyst','data warehouse analyst','data manager','data marketing analyst'

# 32 locations chosen from top tech cities across US (excluding search results yielding the same listings on LinkedIn)
locations = ['San Francisco, California, United States','Los Angeles, California, United States','San Jose, California, United States',
             'San Diego, California, United States','Portland, Oregon, United States','Seattle, Washington, United States',
             'Denver, Colorado, United States', 'Colorado Springs, Colorado, United States','Indianapolis, Indiana, United States',
             'New York, New York, United States','Secaucus, New Jersey', 'Boston, Massachusetts, United States', 
             'Baltimore, Maryland, United States','Chicago, Illinois, United States','Philadelphia, Pennsylvania, United States'
             'Phoenix, Arizona, United States','Salt Lake City, Utah, United States','Minneapolis, Minnesota, United States',
             'Detroit, Michigan, United States','Columbus, Ohio, United States','Kansas City, Missouri, United States',
             'Austin, Texas, United States','Dallas, Texas, United States','Houston, Texas, United States', 
             'Atlanta, Georgia, United States','Jackson, Mississippi, United States','Washington, District of Columbia, United States',
             'Charlotte, North Carolina, United States','Raleigh, North Carolina, United States',
             'Jacksonville, Florida, United States','Miami, Florida, United States','Tampa, Florida, United States']

"""
HOW MUCH DATA IS ENOUGH ?
35 locations * 9 titles * 2 pages = 630 pages. At the full 40 pages, the real total of my scrape will be 
12,600 pages. Assuming I don't get banned for scraping 10,000 pages, let alone 100 pages, I will still need 
to scrape the links that come from them. That's 630 pages * 7 links = 4,410 links for the sample and 
12,600 * 25 = 315,000 links for the real scrape. Of course, most jobs will be duplicates, but that doesn't 
change that I will have to wait a long time for data and I may get banned several times before the scrape is
complete. In reality, it may be safer to limit my searches to fewer titles, locations, pages and links.

The new scrape of 5 roles * 32 cities * 40 pages = 6,400 (or 320 for the 2-page sample) is more reasonable.

state_locations = ['Washington', 'California', 'Colorado', 'Texas', 'Illinois', 'Florida', 'Atlanta', 'New York']

global_locations = [Barcelona, Madrid, Berlin, Munich, Amsterdam, London, Dublin, Stockholm, Copenhagen, Oslo,
             Luxembourg, Eindhoven, Manchester, Belfast, Bristol, Paris, Budapest, Bucharest, Warsaw, Prague, 
             Lisbon, Rome, Zurich, vancouver, ontario, montreal, toronto, 
             Melbourne, Moscow, Seoule, Jakarta, Kyiv, tokyo, rejkjavik,
             argentina, mexico city, lima, rio, buenos aires, sao paolo, panama,] 
"""

job_links = [] # init list to capture all job links

# SCRAPE search LinkedIn for each role title and location given above and return a list of up to 1,000 jobs
for title in data_roles:
    for location in locations:
        print(f'Searching for {title} jobs in {location}...')
        job_links.append(scrape_links(title.replace(' ','%20'), location.replace(',','%2C').replace(' ','%20')))
#print(job_links) # DEBUG

# DUPLICATES remove dupliate links and flatten 2d array before scraping
job_links_cleaned = flatten_2d_list(job_links)

Searching for data analyst jobs in San Francisco, California, United States...
Scraping 1 of 40 pages for data%20analyst in San%20Francisco%2C%20California%2C%20United%20States.
Job links collected from page 1: 7
Scraping 2 of 40 pages for data%20analyst in San%20Francisco%2C%20California%2C%20United%20States.
Job links collected from page 2: 7
Scraping 3 of 40 pages for data%20analyst in San%20Francisco%2C%20California%2C%20United%20States.
Job links collected from page 3: 7
Scraping 4 of 40 pages for data%20analyst in San%20Francisco%2C%20California%2C%20United%20States.


KeyboardInterrupt: 

## Export (and Import) Job Links

In [None]:
# export links to csv for future use
with open(f'output/job_links_{datetime.date.today()}.csv', 'w', newline='') as job_links:
    csv_out = csv.writer(job_links)
    csv_out.writerows([job_links_cleaned[index]] for index in range(0, len(job_links_cleaned)))

In [8]:
# import job links from csv
with open('output/job_links_sample.csv', newline='') as f: # change for desired search date
    reader = csv.reader(f)
    job_links_imported = list(reader)

In [12]:
links = flatten_2d_list(job_links_imported)
print(links)

['/jobs/view/3245937555/?eBP=JOB_SEARCH_ORGANIC&recommendedFlavor=ACTIVELY_HIRING_COMPANY&refId=Syt68zSCTGmNiywIn1R4VA%3D%3D&trackingId=sVgDrv1FfLWq0%2F42qsUdDg%3D%3D&trk=flagship3_search_srp_jobs', '/jobs/view/3244805465/?eBP=CwEAAAGC89ex2AnmOMrDc_VarrOleVTK96HnJxzrr5Se00BAIDnkdIjqpN7OjYTGu8PjAToNlj3jENoSr3iCHZgXgshBNJXYdResCkO4L9BphHWzn-SQ41-0c5RlwkMtaFFVQv1tDv_aElFaoRpgAAAWxturYo7aWHxTEW8CMxBfzLIVycmqGIWOHYURBI8CNKnGg0VpICLIUjqjQI9iRKmKAkAAf49e89DJhdXovjSgV3uPqBnvRG_8h-fOi1UAumWy93N-4bouQYaBeeMudwbyIzyBBtp9gwta2xzOMUZNZ7WfmUox8TxX0hLA61JhMNoatJCM1V7ZBrOxttl8YItUYU1BAQhRcINRpDQJQvOZtNMNEvvCWYd2DfzbZBE7RV71&refId=LTvWF2Ay6BwXYTJJHHFrLw%3D%3D&trackingId=YTKgq%2BvPYb%2BpAiw%2B3ZWydw%3D%3D&trk=flagship3_search_srp_jobs', '/jobs/view/3059412442/?eBP=JOB_SEARCH_ORGANIC&recommendedFlavor=SCHOOL_RECRUIT&refId=9V3GUqN1X2oa2yoYpKWC%2BA%3D%3D&trackingId=aN3DDnzo6vWcGvmhYnXRdw%3D%3D&trk=flagship3_search_srp_jobs', '/jobs/view/3244569629/?eBP=JOB_SEARCH_ORGANIC&recommendedFlavor=ACTIVELY_HIRING_C

## Scrape All Job Links Using Beautiful Soup

In [20]:
# SCRAPE FOR ALL DATA SANS DESCRIPTION ?

def scrape_listing(links):
    """ Returns all scraped data for each job listing from the links passed into the function """

    # VARIABLE ASSIGNMENT create lists to store all scraped data (10 criteria)
    titles, companies, locations, remote, post_dates, num_applicants, contract, size, desc, salaries = [], \
        [], [], [], [], [], [], [], [], []
    
    # SCRAPE ALL LINKS scrape all jobs on the current page using passed in links
    for idx, link in enumerate(links[:1000]):
        print(f'\nScraping job {idx} of {len(links)}.') # DEBUG TEXT
        
        # GO TO PAGE Navigate to page
        #print('\n\n\nkey:', key, '\nvalue:', value[0][0]) # DEBUG TEXT
        driver.get(f'https://linkedin.com{link}')
        time.sleep(3)
        
        # SEE FULL PAGE click 'see more' and scroll down
        #see_more() # FIX
        
        # BEAUTFUL SOUP EXTRACTION convert page text to beautiful soup
        src = driver.page_source
        soup = BeautifulSoup(src, 'lxml')
        
        # DATA COLLECTION return data of results on current selected sub-page to new lists
        
        # TITLE
        try:
            titles.append(soup.select('h1.t-24.t-bold.jobs-unified-top-card__job-title')[0].get_text().replace('\n','').strip())
        except:
            print('title could not be found')
            titles.append(None)
         
        # COMPANY
        try:
            companies.append(soup.select('span.jobs-unified-top-card__company-name')[0].get_text().replace('\n','').strip())
        except:
            print('company could not be found')
            companies.append(None)
            
        # LOCATION
        try:
            locations.append(soup.select('span.jobs-unified-top-card__bullet')[0].get_text().replace('\n','').strip())
        except:
            print('location could not be found')
            locations.append(None)
        
        # REMOTE POSITION
        try: # check in header
            remote.append(soup.select('span.jobs-unified-top-card__workplace-type')[0].get_text().replace('\n','').strip())
        except:
            try: # check the description for remote term.
                desc_temp = soup.select('div.jobs-box__html-content.jobs-description-content__text.t-14.t-normal.jobs-description-content__text--stretch')[0].get_text().strip()
                print('remote could not be found in header')
                if 'remote' in desc_temp:
                    remote.append('remote?')
                elif 'hybrid' in desc:
                    remote.append('hybrid?')
                else:
                    remote.append(None)
            except:
                try: # check in title
                    title_temp = soup.select('div.jobs-box__html-content.jobs-description-content__text.t-14.t-normal.jobs-description-content__text--stretch')[0].get_text().strip()
                    if 'remote' in title_temp:
                        remote.append('remote?')
                    else:
                        remote.append(None)
                    print('remote position could not be found in header or description')
                except:
                    print('remote position could not be found in header , description, or title')
                    remote.append(None)
            
        # POST DATE
        try:
            post_dates.append(soup.select('span.jobs-unified-top-card__posted-date')[0].get_text().replace('\n','').strip())
        except:
            print("could not find 'posted date'")
            post_dates.append(None)
        
        # NUMBER of APPLICANTS
        try:
            num_applicants.append(soup.select('span.jobs-unified-top-card__applicant-count')[2].get_text().replace('\n','').strip())
        except:
            try:
                num_applicants.append(soup.select('span.jobs-unified-top-card__bullet')[1].get_text().replace('\n','').strip())
                print("could not find 'number of applicants' in applicant count")
            except:
                print("could not find 'number of applicants' in applicant count or bullet")
                num_applicants.append(None)
        
        # FULL TIME
        try:
            contract.append(soup.select('li.jobs-unified-top-card__job-insight.span').get_text()
                .replace('\n','').strip().rsplit(' ', 1)[-1])
        except:
            try:
                contract.append(soup.select('li.jobs-unified-top-card__job-insight')[0].get_text()
                    .replace('\n','').strip().rsplit(' ', 1)[-1])
                print('could not find "contract type" in job insights.span')
            except:
                print("could not find 'contract type' in job insights.span or job insights")
                contract.append(None)
        
        # COMPANY SIZE
        try:
            size.append(soup.select('li.jobs-unified-top-card__job-insight')[1].get_text().replace('\n','').strip())
        except:
            print("could not find 'company size'")
            size.append(None)
        
        # DESCRIPTION
        try:
            desc.append(soup.select('div.jobs-box__html-content.jobs-description-content__text.t-14.t-normal.jobs-description-content__text--stretch')[0].get_text().strip())
        except:
            
            print("could not find description (probably shouldn't apply!)")
            
        # SALARY
        try:
            salaries.append(soup.select('a.app-aware-link')[6].get_text().replace('\n','').strip())
            if '$' not in salaries[-1]:
                salaries.pop()
                salaries.append(None)
        except:
            try:
                salaries.append(soup.select('p.t-16')[0].get_text().replace('\n','').strip())
                print("could not find 'salary' with #SALARY tag")
            except:
                try:
                    salaries.append(re.find('($.)', desc).replace('\n','').strip())
                    print("could not find 'salary' with #SALARY tag or in p.t-16")
                except:
                    try: 
                        salaries.append(soup.select('li.jobs-unified-top-card__job-insight')[0].get_text()
                            .replace('\n','').strip().rsplit(' ', 1)[0].rstrip(' ·'))
                        print("could not find 'salary' with #SALARY tag, in p.t-16, or in description")
                    except:
                        print("could not find 'salary' with #SALARY tag, in p.t-16, in description, or in full-time")
                        salaries.append(None)

    # DICTIONARY ASSIGNMENT pass data from lists into a dictionary
    dict_from_scrape = {'title':titles, 'company':companies, 'location':locations, 'remote':remote, 
                        'post_date':post_dates, 'num_applicants':num_applicants, 'contract_type':contract, 
                        'company_size':size, 'description':desc, 'salary':salaries}

    # DATAFRAME ASSIGNMENT
    df_from_scrape = pd.DataFrame(dict_from_scrape)
    
    os.system("say -v Monica ayam don escreipin")
    return df_from_scrape

In [4]:
# SCRAPE FOR DATA FROM DESCRIPTION ?

def scrape_listing_with_soup(links):
    """ Returns all scraped data for each job listing from the links passed into the function """

    # VARIABLE ASSIGNMENT create lists to store all scraped data (10 criteria)
    titles, companies, locations, remote, post_dates, num_applicants, contract, size, desc, salaries = [], \
        [], [], [], [], [], [], [], [], []
    
    # SCRAPE ALL LINKS scrape all jobs on the current page using passed in links
    for idx, link in enumerate(links[:10]):
        print(f'\nScraping job {idx+1} of {len(links)}.\n') # DEBUG TEXT
        print(f'https://linkedin.com{link}\n') # DEBUG TEXT
        #opener = urllib.request.FancyURLopener({})
        opener = urllib.request.urlopen(f'https://linkedin.com{link}')
        #with opener.open(f'https://linkedin.com{link}') as f: 
        #    f.read().decode('utf-8')
            #content = f.read()
        #with open('data_page.html', 'r') as f:
            #contents = f.read()
        soup = BeautifulSoup(opener, 'lxml')
        #print('empty soup', soup)

        # DATA COLLECTION return data of results on current selected sub-page to new lists
        
        # TITLE
        try:
            titles.append(soup.select('h1.t-24.t-bold.jobs-unified-top-card__job-title')[0].get_text().replace('\n','').strip())
        except:
            try:
                titles.append(soup.select('h1.top-card-layout__title.font-sans.text-lg.papabear:text-xl.font-bold.leading-open.text-color-text.mb-0.topcard__title').get_text().strip())
            except:
                print('title could not be found')
                titles.append(None)

        # COMPANY
        try:
            companies.append(soup.select('span.jobs-unified-top-card__company-name')[0].get_text().replace('\n','').strip())
        except:
            try:
                companies.append(soup.select('a.topcard__org-name-link.topcard__flavor--black-link').get_text().strip())
            except:    
                print('company could not be found')
                companies.append(None)

        # LOCATION
        try:
            locations.append(soup.select('span.jobs-unified-top-card__bullet')[0].get_text().replace('\n','').strip())
        except:
            try:
                locations.append(soup.select('span.topcard__flavor.topcard__flavor--bullet').get_text().strip())
            except: 
                print('location could not be found')
                locations.append(None)

        # REMOTE POSITION
        try: # check in header
            remote.append(soup.select('span.jobs-unified-top-card__workplace-type')[0].get_text().replace('\n','').strip())
        except:
            try: # check the description for remote term.
                desc_temp = soup.select('div.jobs-box__html-content.jobs-description-content__text.t-14.t-normal.jobs-description-content__text--stretch')[0].get_text().strip()
                print('remote could not be found in header')
                if 'remote' in desc_temp:
                    remote.append('remote?')
                elif 'hybrid' in desc:
                    remote.append('hybrid?')
                else:
                    remote.append(None)
            except:
                try: # check in title
                    title_temp = soup.select('div.jobs-box__html-content.jobs-description-content__text.t-14.t-normal.jobs-description-content__text--stretch')[0].get_text().strip()
                    if 'remote' in title_temp:
                        remote.append('remote?')
                    else:
                        remote.append(None)
                    print('remote position could not be found in header or description')
                except:
                    print('remote position could not be found in header , description, or title')
                    remote.append(None)

        # POST DATE
        try:
            post_dates.append(soup.select('span.jobs-unified-top-card__posted-date')[0].get_text().replace('\n','').strip())
        except:
            print("❌ could not find 'posted date'")
            post_dates.append(None)

        # NUMBER of APPLICANTS
        try:
            num_applicants.append(soup.select('span.jobs-unified-top-card__applicant-count')[2].get_text().replace('\n','').strip())
        except:
            try:
                num_applicants.append(soup.select('span.jobs-unified-top-card__bullet')[1].get_text().replace('\n','').strip())
                print("could not find 'number of applicants' in applicant count")
            except:
                try:
                    num_applicants.append(soup.select('num_applicants__caption').get_text().replace('\n','').strip())
                    print("could not find 'number of applicants' in applicant count or bullet")
                except:
                    print("❌ could not find 'number of applicants' in applicant count, bullet, or caption")
                    num_applicants.append(None)

        # FULL TIME
        try:
            contract.append(soup.select('li.jobs-unified-top-card__job-insight.span').get_text()
                .replace('\n','').strip().rsplit(' ', 1)[-1])
        except:
            try:
                contract.append(soup.select('li.jobs-unified-top-card__job-insight')[0].get_text()
                    .replace('\n','').strip().rsplit(' ', 1)[-1])
                print('could not find "contract type" in job insights.span')
            except:
                print("❌ could not find 'contract type' in job insights.span or job insights")
                contract.append(None)

        # COMPANY SIZE
        try:
            size.append(soup.select('li.jobs-unified-top-card__job-insight')[1].get_text().replace('\n','').strip())
        except:
            print("❌ could not find 'company size'")
            size.append(None)

        # DESCRIPTION
        try:
            desc.append(soup.select('div.jobs-box__html-content.jobs-description-content__text.t-14.t-normal.jobs-description-content__text--stretch')[0].get_text().strip())
        except:
            try:
                desc.append(soup.select('div.jobs-box__html-content.jobs-description-content__text.t-14.t-normal.jobs-description-content__text--stretch.span').get_text().strip())
            except:
                try:
                    desc.append(soup.select('div.jobs-box__html-content.jobs-description-content__text.t-14.t-normal.jobs-description-content__text--stretch').get_text().strip())
                except:
                    try:
                        desc.append(soup.select('div.job-details.jobs-box__html-content.jobs-description-content__text.t-14.t-normal.jobs-description-content__text--stretch')[0].get_text().strip())
                    except:
                        try:
                            print('text rich\n', soup.select('div.description__text.description__text--rich')[0].get_text().strip())
                            desc.append(soup.select('div.description__text.description__text--rich')[0].get_text().strip())
                        except:
                            try:
                                desc.append(soup.select('div.show-more-less-html__markup.show-more-less-html__markup--clamp-after-5')[0])
                                print('clamp after 5\n', soup.select('div.show-more-less-html__markup.show-more-less-html__markup--clamp-after-5')[0])
                            except:
                                desc.append(None)
                                print("⚠️ could not find description.")

        # SALARY
        try:
            salaries.append(soup.select('a.app-aware-link')[6].get_text().replace('\n','').strip())
            if '$' not in salaries[-1]:
                salaries.pop()
                salaries.append(None)
        except:
            try:
                salaries.append(soup.select('p.t-16')[0].get_text().replace('\n','').strip())
                print("could not find 'salary' with #SALARY tag")
            except:
                try:
                    salaries.append(re.find('($.)', desc).replace('\n','').strip())
                    print("could not find 'salary' with #SALARY tag or in p.t-16")
                except:
                    try: 
                        salaries.append(soup.select('li.jobs-unified-top-card__job-insight')[0].get_text()
                            .replace('\n','').strip().rsplit(' ', 1)[0].rstrip(' ·'))
                        print("could not find 'salary' with #SALARY tag, in p.t-16, or in description")
                    except:
                        print("❌ could not find 'salary' with #SALARY tag, in p.t-16, in description, or in full-time")
                        salaries.append(None)

    # DICTIONARY ASSIGNMENT pass data from lists into a dictionary
    dict_from_scrape = {'title':titles, 'company':companies, 'location':locations, 'remote':remote, 
                        'post_date':post_dates, 'num_applicants':num_applicants, 'contract_type':contract, 
                        'company_size':size, 'description':desc, 'salary':salaries}

    # DATAFRAME ASSIGNMENT
    df_from_scrape = pd.DataFrame(dict_from_scrape)

    os.system("say -v Monica ayam don escreipin")
    return df_from_scrape

In [21]:
df = pd.DataFrame(scrape_listing(links))
df


Scraping job 0 of 1085.
could not find 'posted date'
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 1 of 1085.
could not find 'number of applicants' in applicant count
could not find "contract type" in job insights.span

Scraping job 2 of 1085.
remote could not be found in header
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 3 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 4 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping jo

could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 42 of 1085.
remote could not be found in header
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 43 of 1085.
could not find 'number of applicants' in applicant count
could not find "contract type" in job insights.span

Scraping job 44 of 1085.
could not find 'number of applicants' in applicant count
could not find "contract type" in job insights.span

Scraping job 45 of 1085.
could not find 'number of applicants' in applicant count
could not find "contract type" in job insights.span

Scraping job 46 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 47 of 1085.
could not find 'number of applicants' in applicant co

could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 87 of 1085.
remote could not be found in header
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 88 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 89 of 1085.
could not find 'posted date'
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 90 of 1085.
remote could not be found in header
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job i

could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 129 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 130 of 1085.
could not find 'posted date'
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 131 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 132 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 133 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" i

remote could not be found in header
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 171 of 1085.
remote could not be found in header
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 172 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 173 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 174 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 175 of 1085.
could not find 'posted date'
could not find 'number of a

could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 211 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 212 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 213 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 214 of 1085.
remote could not be found in header
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t

remote could not be found in header
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 250 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 251 of 1085.
remote could not be found in header
could not find 'posted date'
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 252 of 1085.
remote could not be found in header
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 253 of 1085.
could not find 'number of applicants' in applicant count or bullet

could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 289 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 290 of 1085.
remote could not be found in header
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 291 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 292 of 1085.
remote could not be found in header
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 293 of 1085.
remote could not be found in header
could not find 'post

could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 331 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 332 of 1085.
remote could not be found in header
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 333 of 1085.
remote could not be found in header
could not find 'posted date'
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 334 of 1085.
remote could not be found in header
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 335 

could not find 'posted date'
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 372 of 1085.
remote could not be found in header
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 373 of 1085.
could not find 'posted date'
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 374 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 375 of 1085.
could not find 'number of applicants' 

remote could not be found in header
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 411 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 412 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 413 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 414 of 1085.
remote could not be found in header
could not find 'posted date'
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in j

could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 450 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 451 of 1085.
remote could not be found in header
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 452 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 453 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 454 of 1085.
remote could not be f

could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 492 of 1085.
remote could not be found in header
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 493 of 1085.
remote could not be found in header
could not find 'posted date'
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 494 of 1085.
could not find 'posted date'
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 495 of 1085.
co

could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 532 of 1085.
remote could not be found in header
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 533 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 534 of 1085.
remote could not be found in header
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 535 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping jo

remote could not be found in header
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 572 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 573 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 574 of 1085.
remote could not be found in header
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 575 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 576 of 1085.
could not find 'posted date'
could not find 'number of a

remote could not be found in header
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 614 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 615 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 616 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 617 of 1085.
could not find 'posted date'
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 618 o

remote could not be found in header
could not find 'posted date'
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 653 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 654 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 655 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 656 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'sal

could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 696 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 697 of 1085.
remote could not be found in header
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 698 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 699 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 700 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in

remote could not be found in header
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 737 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 738 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 739 of 1085.
remote could not be found in header
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 740 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 741 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could no

remote could not be found in header
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 778 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 779 of 1085.
remote could not be found in header
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 780 of 1085.
remote could not be found in header
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 781 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.

could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 817 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 818 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 819 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 820 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 821 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not

could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 858 of 1085.
remote could not be found in header
could not find 'posted date'
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 859 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 860 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 861 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'sal

could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 901 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 902 of 1085.
remote could not be found in header
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 903 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 904 of 1085.
remote could not be found in header
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not fi

could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 940 of 1085.
remote could not be found in header
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 941 of 1085.
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 942 of 1085.
remote could not be found in header
could not find 'posted date'
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 943 of 1085.
could not find 'posted date'
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span

Scraping job 944 of 1085

remote could not be found in header
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 982 of 1085.
remote could not be found in header
could not find 'posted date'
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 983 of 1085.
remote could not be found in header
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in description

Scraping job 984 of 1085.
could not find 'posted date'
could not find 'number of applicants' in applicant count or bullet
could not find "contract type" in job insights.span
could not find 'salary' with #SALARY tag, in p.t-16, or in descr

Unnamed: 0,title,company,location,remote,post_date,num_applicants,contract_type,company_size,description,salary
0,Data Scientist,Nordstrom,"Seattle, WA",On-site,,,level,"10,001+ employees · Retail",About the job\n \n\n \nJob Descriptio...,Full-time · Entry
1,Digital Sales Manager,VP MUSIC GROUP INC,"Queens, NY",On-site,2 days ago,14 applicants,Full-time,11-50 employees,About the job\n \n\n \nThe company ma...,
2,"Data Scientist, Auction Expert",Meta,"Austin, TX",,4 days ago,,Full-time,"10,001+ employees · Technology, Information an...",About the job\n \n\n \n ...,
3,Junior Data Analyst,Allegis Group,"Miami, FL",Hybrid,1 day ago,,Associate,"10,001+ employees · Staffing and Recruiting",About the job\n \n\n \nSkills\n\nONE ...,Full-time
4,Data Analyst REMOTE,"Software Guidance & Assistance, Inc. (SGA, Inc.)","Jacksonville, FL",Remote,1 week ago,,level,201-500 employees · Staffing and Recruiting,About the job\n \n\n \nJob Descriptio...,Contract · Entry
...,...,...,...,...,...,...,...,...,...,...
995,Data Engineer,Pearson,"Durham, NC",,1 week ago,,Full-time,"10,001+ employees · Education",About the job\n \n\n \n ...,Full-time
996,Data Integrity Analyst - Remote,Kforce Inc,"Boston, MA",Remote,1 day ago,,Associate,"1,001-5,000 employees · Staffing and Recruiting",About the job\n \n\n \nResponsibiliti...,Contract
997,DATA SCIENTIST III,Moffitt Cancer Center,"Tampa, FL",remote?,4 weeks ago,,level,"5,001-10,000 employees · Hospitals and Health ...",About the job\n \n\n \n ...,Full-time · Entry
998,Data Analyst II,RTD,"Denver, CO",,3 weeks ago,,level,"1,001-5,000 employees · Truck Transportation",About the job\n \n\n \n ...,Full-time · Entry


In [15]:
df2 = pd.DataFrame(scrape_listing_with_soup(links))
df2


Scraping job 1 of 1085.

https://linkedin.com/jobs/view/3245937555/?eBP=JOB_SEARCH_ORGANIC&recommendedFlavor=ACTIVELY_HIRING_COMPANY&refId=Syt68zSCTGmNiywIn1R4VA%3D%3D&trackingId=sVgDrv1FfLWq0%2F42qsUdDg%3D%3D&trk=flagship3_search_srp_jobs

title could not be found
company could not be found
location could not be found
remote position could not be found in header , description, or title
❌ could not find 'posted date'
❌ could not find 'number of applicants' in applicant count, bullet, or caption
❌ could not find 'contract type' in job insights.span or job insights
❌ could not find 'company size'
text rich
 Job DescriptionThe ideal candidate is a creative and passionate problem-solver who thinks big, acts quickly, and is motivated to develop new approaches to optimizing Nordstrom’s business using quantitative techniques and cutting-edge technology.A day in the life…Engage broadly with the business to frame, structure and prioritize business problems where analytic projects or tools can 

title could not be found
company could not be found
location could not be found
remote position could not be found in header , description, or title
❌ could not find 'posted date'
❌ could not find 'number of applicants' in applicant count, bullet, or caption
❌ could not find 'contract type' in job insights.span or job insights
❌ could not find 'company size'
text rich
 SkillsONE OF TEKSYSTEMS LEADING CLIENTS HAS AN IMMEDIATE NEED FOR A JR DATA ANALYST. THE IDEAL CANDIDATE WILL HAVE THE FOLLOWING SKILLS:Data, Analytical skill, Sql, Data analysis, Analysis, Reporting, Microsoft excel, Sql queriesTop Skills DetailsData,Analytical skill,SqlExperience LevelEntry LevelThis is a GREAT opportunity for anyone looking for their first role within the Data Analytics space.APPLY NOW!!!About TEKsystemsWe're partners in transformation. We help clients activate ideas and solutions to take advantage of a new world of opportunity. We are a team of 80,000 strong, working with over 6,000 clients, includin

title could not be found
company could not be found
location could not be found
remote position could not be found in header , description, or title
❌ could not find 'posted date'
❌ could not find 'number of applicants' in applicant count, bullet, or caption
❌ could not find 'contract type' in job insights.span or job insights
❌ could not find 'company size'
text rich
 Project/Unit DescriptionThe Data Architect in the Electronic Systems Laboratory (ELSYS) Systems Engineering Research Division (SERD) Applied Decision Support Branch (ADSB) supports Department of Defense (DoD) leadership decisions, including defining organizational vision and business requirements; developing and managing roadmaps; engineering strategic acquisition and budgeting processes; and supporting portfolio and program management. The researcher works with a team to define requirements, design decision support processes, and engineer web-based software systems to facilitate those processes.This position is located 

title could not be found
company could not be found
location could not be found
remote position could not be found in header , description, or title
❌ could not find 'posted date'
❌ could not find 'number of applicants' in applicant count, bullet, or caption
❌ could not find 'contract type' in job insights.span or job insights
❌ could not find 'company size'
text rich
 Dice is the leading career destination for tech experts at every stage of their careers. Our client, Vaco Technology, is seeking the following. Apply via Dice today!8+ years total experienceNeeds to be a hands-on Architect. Someone with a prior Architect background, but able to be hands-on in the technical weeds as well. Hands-on experience in designing application architecture that helps in data extraction and transformation for modeling.  Hands-on experience in designing systems that can collect, store and analyze data at scale.  Strong data background Experience in performing root cause analysis and helping the busine

Unnamed: 0,title,company,location,remote,post_date,num_applicants,contract_type,company_size,description,salary
0,,,,,,,,,Job DescriptionThe ideal candidate is a creati...,
1,,,,,,,,,The company manages a reggae music catalog of ...,
2,,,,,,,,,The massive scale and heavy engagement of the ...,
3,,,,,,,,,SkillsONE OF TEKSYSTEMS LEADING CLIENTS HAS AN...,
4,,,,,,,,,"Job DescriptionSoftware Guidance & Assistance,...",
5,,,,,,,,,The primary focus of the Data Visualization Ar...,
6,,,,,,,,,Project/Unit DescriptionThe Data Architect in ...,
7,,,,,,,,,"Job Number: R0151262Data Scientist, JuniorThe ...",
8,,,,,,,,,Dice is the leading career destination for tec...,
9,,,,,,,,,Current EmployeesIf you are a current employee...,


In [None]:
df = pd.DataFrame(scrape_listing(links)) # FIX: maybe add: .reset_index(drop=True)
df.insert(0, 'descriptions', scrape_listing_with_soup(links))
df

## Export Data to CSV for Cleaning

In [22]:
df.to_csv(f'output/linkedin_jobs_uncleaned_full_{datetime.date.today()}.csv', index = False, encoding='utf-8')