# Scraping job listings from Indeed.com

We will be scraping job listings from Indeed.com using BeautifulSoup. Luckily, Indeed.com is a simple text page where we can easily find relevant entries.

First, look at the source of an Indeed.com page: http://www.indeed.com/jobs?q=data+scientist+%2420%2C000&l=New+York&start=10

Notice that each job listing sits inside a div tag with a class name of result. We can use BeautifulSoup to extract those.

Set up a request (using requests) to the URL below. Use BeautifulSoup to parse the page and extract all results (HINT: look for div tags with class name result).
The URL here has several query parameters:

1. q for the job search
2. This is followed by "+%2420%2C000" (the URL-encoded form of " $20,000") to return results with salaries (or expected salaries) above $20,000
3. l for a location
4. start for what result number to start on
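
As a quick check of how these parameters are encoded, the sketch below decodes the query string with the standard library's urllib.parse and then rebuilds it from readable values. The variable names params and rebuilt are illustrative and not part of the original notebook.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

url = "http://www.indeed.com/jobs?q=data+scientist+%2420%2C000&l=New+York&start=10"

# Decode the query string: "+" becomes a space, %24 becomes "$", %2C becomes ","
params = {key: values[0] for key, values in parse_qs(urlsplit(url).query).items()}
print(params)

# Rebuild the same query string from readable values
rebuilt = urlencode({"q": "data scientist $20,000", "l": "New York", "start": 10})
print(rebuilt)
```

Building the query from a dictionary like this avoids hand-encoding characters such as "$" and ",".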

In [6]:
#Importing all the necessary libraries
import urllib
import requests
import bs4
from bs4 import BeautifulSoup
import pandas as pd
import re

In [7]:
url= "http://www.indeed.com/jobs?q=data+scientist+%2420%2C000&l=New+York&start=10"

Although the page source is verbose, each listing follows a clear structure:

The salary is available in a nobr element inside of a td element with class='snip'.
The title of a job is in a link with class set to jobtitle and data-tn-element="jobTitle".
The location is set in a span with class='location'.
The company is set in a span with class='company'.

In [16]:

def parse(url):
    """Scrape one results page and return a DataFrame with one row per listing."""
    html = requests.get(url)
    soup = BeautifulSoup(html.content, 'html.parser', from_encoding="utf-8")
    rows = []
    for each in soup.find_all(class_="result"):
        # find() returns None when a tag is missing, so .text raises AttributeError
        try:
            title = each.find(class_='jobtitle').text.replace('\n', '')
        except AttributeError:
            title = 'None'
        try:
            location = each.find('span', {'class': 'location'}).text.replace('\n', '')
        except AttributeError:
            location = 'None'
        try:
            company = each.find(class_='company').text.replace('\n', '')
        except AttributeError:
            company = 'None'
        try:
            salary = each.find('span', {'class': 'no-wrap'}).text
        except AttributeError:
            salary = 'None'
        try:
            synopsis = each.find('div', {'class': 'summary'}).text.replace('\n', '')
        except AttributeError:
            synopsis = 'None'
        rows.append({'Title': title, 'Location': location, 'Company': company, 'Salary': salary, 'Synopsis': synopsis})
    # Build the frame once at the end; DataFrame.append was removed in pandas 2.0
    return pd.DataFrame(rows, columns=["Title", "Location", "Company", "Salary", "Synopsis"])

In [17]:
parse(url)

Unnamed: 0,Title,Location,Company,Salary,Synopsis
0,Data Scientist (Data Startup),,Averity,"\n $130,000 - $150,000 a year",We are growing quickly and need th...
1,Data Scientist,,Disney Streaming Services,"\n $98,000 - $145,000 a year (I...",3+ years of experience as a Data S...
2,Data Scientist,,Lockheed Martin Corporation,"\n $89,000 - $131,000 a year (I...",The applicant is expected to colla...
3,"Scientist, Systems Engineering",,Harris Corp,"\n $73,000 - $114,000 a year (I...","Job Title – Scientist, Systems Eng..."
4,DATA AND REPORTING SPECIALIST,,New York City HRA/DEPT OF SOCIAL SERVICES,"\n $82,008 - $100,000 a year",The Office of Program Accountabili...
5,Sr Data Scientist,,Aetna,,Leads and participates in the deve...
6,Data Scientist,"New York, NY 10038 (Financial District area)",New York County Defender Services,"\n $88,000 - $129,000 a year (I...",Creating and maintaining a data in...
7,Data Scientist,"New York, NY",Boll & Branch,"\n $98,000 - $145,000 a year (I...",Reporting to the Director of Busin...
8,Senior Data Scientist,"New York, NY 10010 (Gramercy area)",Sub Rosa,,We are looking for a data scientis...
9,Research Data Analyst,"New York, NY",Weill Cornell Medicine,,Provides statistical expertise in ...


Now, to scale up our scraping, we need to accumulate more results. We can do this by examining the URL above.

"http://www.indeed.com/jobs?q=data+scientist+%2420%2C000&l=New+York&start=10"
There are two query parameters here we can alter to collect more results, the l=New+York and the start=10. The first controls the location of the results (so we can try a different city). The second controls where in the results to start and gives 10 results (thus, we can keep incrementing by 10 to go further in the list).
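
To make the pagination concrete, here is a small sketch that generates the page URLs for a couple of cities without sending any requests. The helper name page_urls, its page_size parameter, and the two sample cities are assumptions made for illustration.

```python
url_template = "http://www.indeed.com/jobs?q=data+scientist+%2420%2C000&l={}&start={}"

def page_urls(cities, max_results_per_city, page_size=10):
    """Yield one URL per results page for each city."""
    for city in cities:
        # start takes the values 0, 10, 20, ... below max_results_per_city
        for start in range(0, max_results_per_city, page_size):
            yield url_template.format(city, start)

urls = list(page_urls(["New+York", "Chicago"], max_results_per_city=25))
for u in urls:
    print(u)
```

With max_results_per_city set to 25, each city yields three pages (start=0, 10, and 20), so checking the URL list first is a cheap way to estimate how many requests a crawl will send.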

Complete the following code to collect results from multiple cities and starting points.
Enter your city below to add it to the search.
Remember to convert your salary to U.S. dollars to match the other cities if the currency is different.

In [175]:
YOUR_CITY='Pittsburg'

In [176]:
url_template = "http://www.indeed.com/jobs?q=data+scientist+%2420%2C000&l={}&start={}"
max_results_per_city = 25 # Set this to a higher value (30) to generate more results.
# Crawling more results will also take much longer. First test your code on a small number of results and then expand.
i = 0
results = []
df_more = pd.DataFrame(columns=["Title","Location","Company","Salary", "Synopsis"])
for city in set(['New+York', 'Chicago', 'San+Francisco', 'Austin', 'Seattle', 
    'Los+Angeles', 'Philadelphia', 'Atlanta', 'Dallas', YOUR_CITY, 
    'Portland', 'Phoenix', 'Denver', 'Houston', 'Miami',
    'Charlottesville', 'Richmond', 'Baltimore', 'Harrisonburg', 'San+Antonio', 'San+Diego', 'San+Jose',
    'Austin', 'Jacksonville', 'Indianapolis', 'Columbus', 'Fort+Worth', 'Charlotte', 'Detroit', 'El+Paso', 
    'Memphis', 'Boston', 'Nashville', 'Louisville', 'Milwaukee', 'Las+Vegas', 'Albuquerque', 'Tucson', 
    'Fresno', 'Sacramento', 'Long+Beach', 'Mesa', 'Virginia+Beach', 'Norfolk', 'Atlanta', 'Colorado+Springs',
    'Raleigh', 'Omaha', 'Oakland', 'Tulsa', 'Minneapolis', 'Cleveland', 'Wichita', 'Arlington', 'New+Orleans', 
    'Bakersfield', 'Tampa', 'Honolulu', 'Anaheim', 'Aurora', 'Santa+Ana', 'Riverside', 'Corpus+Christi', 'Pittsburgh', 
    'Lexington', 'Anchorage', 'Cincinnati', 'Baton+Rouge', 'Chesapeake', 'Alexandria', 'Fairfax', 'Herndon',
    'Reston', 'Roanoke']):
    for start in range(0, max_results_per_city, 10):
        # Grab the results from the request (as above)
        url = url_template.format(city, start)
        # Append to the full set of results
        html = requests.get(url)
        soup = BeautifulSoup(html.content, 'html.parser', from_encoding="utf-8")
        for each in soup.find_all(class_= "result" ):
            try: 
                title = each.find(class_='jobtitle').text.replace('\n', '')
            except:
                title = None
            try:
                location = each.find('span', {'class':"location" }).text.replace('\n', '')
            except:
                location = None
            try: 
                company = each.find(class_='company').text.replace('\n', '')
            except:
                company = None
            try:
                salary = each.find('span', {'class':'no-wrap'}).text.replace('\n','')
            except:
                salary = None
            try:
                synopsis = each.find('div', {'class':'summary'}).text.replace('\n', '')
            except:
                synopsis = None
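            # DataFrame.append was removed in pandas 2.0, so concatenate a one-row frame instead
            df_more = pd.concat([df_more, pd.DataFrame([{'Title':title, 'Location':location, 'Company':company, 'Salary':salary, 'Synopsis':synopsis}])], ignore_index=True)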
            i += 1
            if i % 1000 == 0:  # Ram helped me build this counter to see how many. You can visibly see Ram's vernacular in the print statements.
                print('You have ' + str(i) + ' results. ' + str(df_more.dropna().drop_duplicates().shape[0]) + " of these aren't rubbish.")

You have 1000 results. 109 of these aren't rubbish.
You have 2000 results. 234 of these aren't rubbish.
You have 3000 results. 343 of these aren't rubbish.


In [178]:
df_more.head()

Unnamed: 0,Title,Location,Company,Salary,Synopsis
0,Process Scientist,,Ampac Fine Chemicals,,Works under supervision of more se...
1,Sr. Analyst - Data Scientist,,CarMax,,"Data Scientist, Expansion Planning & Analysis,..."
2,Life Insurance Agent (Remote) - Fr...,,ASSURANCE,"$50,000 - $125,000 a year","Our team of engineers, data scient..."
3,Data Scientist,,Indeed Prime,,Indeed Prime is a free service tha...
4,"Data Analyst, Data Engineering","Richmond, VA 23219 (City Center area)",United Network for Organ Sharing,,The Data Analyst must organize com...


Keep only the entries with an annual salary: filter out entries that have no salary or whose salary is not yearly (those quoted per hour, week, or month). Also remove duplicate entries.
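
The filtering rule can be sketched in plain Python before applying it to the DataFrame. This is a minimal illustration rather than the notebook's own method: the helper is_annual is hypothetical, and every sample string except the first is made up for the demonstration.

```python
import re

def is_annual(salary):
    """Return True for a salary string quoted per year, False otherwise."""
    if not salary:
        return False  # missing salary
    text = salary.lower()
    # Reject rates quoted per hour, week, or month
    if re.search(r"\b(hour|week|month)\b", text):
        return False
    return "year" in text

samples = [
    "$50,000 - $125,000 a year",
    "$25 - $40 an hour",
    "$1,200 a week",
    None,
    "$50,000 - $125,000 a year",  # duplicate entry
]

# Keep annual salaries and drop duplicates while preserving order
annual = list(dict.fromkeys(s for s in samples if is_annual(s)))
print(annual)
```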


In [179]:
df_more["Salary"]=df_more["Salary"].str.replace("a year"," ",case = False)

In [180]:
df_more["Salary"]=df_more["Salary"].str.replace("Indeed est."," ",case=False)

In [134]:
#df_more["Salary"]=df_more["Salary"].str.replace("()"," ",case=False)

In [181]:
df_more

Unnamed: 0,Title,Location,Company,Salary,Synopsis
0,Process Scientist,,Ampac Fine Chemicals,,Works under supervision of more se...
1,Sr. Analyst - Data Scientist,,CarMax,,"Data Scientist, Expansion Planning & Analysis,..."
2,Life Insurance Agent (Remote) - Fr...,,ASSURANCE,"$50,000 - $125,000","Our team of engineers, data scient..."
3,Data Scientist,,Indeed Prime,,Indeed Prime is a free service tha...
4,"Data Analyst, Data Engineering","Richmond, VA 23219 (City Center area)",United Network for Organ Sharing,,The Data Analyst must organize com...
5,"Data Analyst, Data Products","Richmond, VA 23219 (City Center area)",United Network for Organ Sharing,,The Data Analyst- Data Products mu...
6,Data Scientist I,"Midlothian, VA 23112",Virginia Credit Union,"$81,000 - $119,000 ( )",Individual must be comfortable ext...
7,Data Scientist - FinTech Company,"Richmond, VA",Lennon Wright Associates,"$70,000 - $120,000",The team is fantastic you will be ...
8,Data Analyst - FinTech Company,"Richmond, VA",Lennon Wright Associates,"$70,000 - $120,000",The team is fantastic you will be ...
9,Entry Level Scientist - Immunochem...,"Richmond, VA 23230",PPD,,Responsible for review and compila...


Dropping all the duplicates and null values

In [182]:
print (df_more.head())
print (df_more.shape)
print (df_more[df_more.Salary != 'None'].shape)
df_more = df_more[df_more.Salary != 'None'].drop_duplicates().dropna()
print (df_more.shape)

                                               Title  \
0                                  Process Scientist   
1                       Sr. Analyst - Data Scientist   
2              Life Insurance Agent (Remote) - Fr...   
3                                     Data Scientist   
4                     Data Analyst, Data Engineering   

                                Location  \
0                                   None   
1                                   None   
2                                   None   
3                                   None   
4  Richmond, VA 23219 (City Center area)   

                                    Company  \
0                      Ampac Fine Chemicals   
1                                    CarMax   
2                                 ASSURANCE   
3                              Indeed Prime   
4          United Network for Organ Sharing   

                                 Salary  \
0                                  None   
1                            

In [None]:
# Removing all rows whose salary is quoted on an hourly, weekly, or monthly basis

In [183]:
df_more = df_more[df_more.Salary.str.contains("hour") == False]
df_more = df_more[df_more.Salary.str.contains("month") == False]
df_more = df_more[df_more.Salary.str.contains("week") == False]
print (df_more.shape)
df_more.head()

(293, 5)


Unnamed: 0,Title,Location,Company,Salary,Synopsis
6,Data Scientist I,"Midlothian, VA 23112",Virginia Credit Union,"$81,000 - $119,000 ( )",Individual must be comfortable ext...
7,Data Scientist - FinTech Company,"Richmond, VA",Lennon Wright Associates,"$70,000 - $120,000",The team is fantastic you will be ...
8,Data Analyst - FinTech Company,"Richmond, VA",Lennon Wright Associates,"$70,000 - $120,000",The team is fantastic you will be ...
10,Data Scientist,"Richmond, VA",VA Commonwealth Univ,"$75,000 - $90,000",Knowledgeable in data science life...
11,Senior Data Scientist,"Chesterfield, VA",Virginia VA Community College Sys,"$65,000 - $73,000",1+ years’ experience using other a...


In [184]:
df_more = df_more.join(df_more['Salary'].str.split('-', n=1, expand=True).rename(columns={0:'Low Salary', 1:'High Salary'}))

In [185]:
df_more = df_more.drop(['Salary'],axis=1)

#Creating 2 more columns by splitting Location into City and State

In [186]:
df_more = df_more.join(df_more['Location'].str.split(',', n=1, expand=True).rename(columns={0:'City', 1:'State'}))

In [187]:
df_more.head()

Unnamed: 0,Title,Location,Company,Synopsis,Low Salary,High Salary,City,State
6,Data Scientist I,"Midlothian, VA 23112",Virginia Credit Union,Individual must be comfortable ext...,"$81,000","$119,000 ( )",Midlothian,VA 23112
7,Data Scientist - FinTech Company,"Richmond, VA",Lennon Wright Associates,The team is fantastic you will be ...,"$70,000","$120,000",Richmond,VA
8,Data Analyst - FinTech Company,"Richmond, VA",Lennon Wright Associates,The team is fantastic you will be ...,"$70,000","$120,000",Richmond,VA
10,Data Scientist,"Richmond, VA",VA Commonwealth Univ,Knowledgeable in data science life...,"$75,000","$90,000",Richmond,VA
11,Senior Data Scientist,"Chesterfield, VA",Virginia VA Community College Sys,1+ years’ experience using other a...,"$65,000","$73,000",Chesterfield,VA


#Creating a new column holding the two-letter state abbreviation

In [188]:
def strip_state(x):
    # State looks like " VA 23112"; keep only the two-letter abbreviation
    if isinstance(x, str):
        return x.strip()[:2]
    return None
df_more['State Initials'] = df_more['State'].apply(strip_state)

In [189]:
df_more= df_more.drop(['State'],axis=1)
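
As an alternative to splitting on the comma and then slicing, the city and state can be pulled out in one step with a regular expression. This is only a sketch: the names LOCATION_RE and split_location are hypothetical, and it assumes locations follow the "City, ST ..." pattern seen in the output above.

```python
import re

# City, then a two-letter state code, then optional ZIP code or neighbourhood text
LOCATION_RE = re.compile(r"^\s*([^,]+),\s*([A-Z]{2})\b")

def split_location(location):
    """Return (city, state) for strings like 'Richmond, VA 23219 (City Center area)'."""
    match = LOCATION_RE.match(location or "")
    if match is None:
        return None, None  # no "City, ST" pattern found
    return match.group(1).strip(), match.group(2)

print(split_location("Midlothian, VA 23112"))
print(split_location("Richmond, VA 23219 (City Center area)"))
print(split_location("Remote"))
```

Returning (None, None) for strings that do not match keeps malformed locations visible as missing values instead of producing a partial state code.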

In [172]:
#df_more.drop(['Salary'],axis=1)

In [190]:
#Auditing
df_more.isnull().sum()

Title              0
Location           0
Company            0
Synopsis           0
Low Salary         0
High Salary       24
City               0
State Initials     0
dtype: int64

In [191]:
df_more

Unnamed: 0,Title,Location,Company,Synopsis,Low Salary,High Salary,City,State Initials
6,Data Scientist I,"Midlothian, VA 23112",Virginia Credit Union,Individual must be comfortable ext...,"$81,000","$119,000 ( )",Midlothian,VA
7,Data Scientist - FinTech Company,"Richmond, VA",Lennon Wright Associates,The team is fantastic you will be ...,"$70,000","$120,000",Richmond,VA
8,Data Analyst - FinTech Company,"Richmond, VA",Lennon Wright Associates,The team is fantastic you will be ...,"$70,000","$120,000",Richmond,VA
10,Data Scientist,"Richmond, VA",VA Commonwealth Univ,Knowledgeable in data science life...,"$75,000","$90,000",Richmond,VA
11,Senior Data Scientist,"Chesterfield, VA",Virginia VA Community College Sys,1+ years’ experience using other a...,"$65,000","$73,000",Chesterfield,VA
12,Senior Data Scientist,"Chesterfield, VA",VA Community College Sys,1+ years' experience using other a...,"$65,000","$73,000",Chesterfield,VA
13,Data Scientist,"Richmond, VA 23284 (The Fan area)",Virginia Commonwealth University,Working Title Data Scientist. Know...,"$53,000","$78,000 ( )",Richmond,VA
20,Senior Data Scientist - Virginia C...,"Richmond, VA",Virginia Community College System,Creating and managing an integrate...,"$65,000","$73,000",Richmond,VA
27,Data Scientist,"Richmond, VA 23219 (City Center area)",Afton Chemical,This data scientist position will ...,"$86,000","$126,000 ( )",Richmond,VA
28,DATA SCIENTIST,"Chester, VA",Knowledge Facilitation Group,We are looking for a Data Scientis...,"$85,000","$126,000 ( )",Chester,VA


In [192]:
df_more['High Salary'] = df_more['High Salary'].str.replace(',','')

In [193]:
df_more['Low Salary']= df_more['Low Salary'].str.replace(',','')

In [194]:
df_more['Low Salary'] = df_more['Low Salary'].replace({r'\$': ''}, regex=True)

In [195]:
df_more['High Salary'] = df_more['High Salary'].replace({r'\$': ''}, regex=True)

In [202]:
#df_more[str('High Salary')] = df_more[str('High Salary')].replace({'(':''},regex = True)
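
The "( )" still visible in High Salary appears to be what remains after "Indeed est." was replaced with a space. One hedged way to finish the cleanup is to keep only the digits and convert to an integer. The helper to_int below is hypothetical, and it assumes whole-dollar amounts (a value with cents would be misread).

```python
import re

def to_int(salary_text):
    """Strip '$', commas, spaces, and leftover '( )' from a salary string."""
    if salary_text is None:
        return None
    digits = re.sub(r"\D", "", salary_text)  # keep digits only
    return int(digits) if digits else None

print(to_int("$81,000"))
print(to_int(" $119,000 ( )"))
print(to_int(" ( )"))
```

In the notebook this could be applied with something like df_more['High Salary'].apply(to_int), which would also make an average of the low and high columns straightforward to compute.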

In [197]:
df_more

Unnamed: 0,Title,Location,Company,Synopsis,Low Salary,High Salary,City,State Initials
6,Data Scientist I,"Midlothian, VA 23112",Virginia Credit Union,Individual must be comfortable ext...,81000,119000 ( ),Midlothian,VA
7,Data Scientist - FinTech Company,"Richmond, VA",Lennon Wright Associates,The team is fantastic you will be ...,70000,120000,Richmond,VA
8,Data Analyst - FinTech Company,"Richmond, VA",Lennon Wright Associates,The team is fantastic you will be ...,70000,120000,Richmond,VA
10,Data Scientist,"Richmond, VA",VA Commonwealth Univ,Knowledgeable in data science life...,75000,90000,Richmond,VA
11,Senior Data Scientist,"Chesterfield, VA",Virginia VA Community College Sys,1+ years’ experience using other a...,65000,73000,Chesterfield,VA
12,Senior Data Scientist,"Chesterfield, VA",VA Community College Sys,1+ years' experience using other a...,65000,73000,Chesterfield,VA
13,Data Scientist,"Richmond, VA 23284 (The Fan area)",Virginia Commonwealth University,Working Title Data Scientist. Know...,53000,78000 ( ),Richmond,VA
20,Senior Data Scientist - Virginia C...,"Richmond, VA",Virginia Community College System,Creating and managing an integrate...,65000,73000,Richmond,VA
27,Data Scientist,"Richmond, VA 23219 (City Center area)",Afton Chemical,This data scientist position will ...,86000,126000 ( ),Richmond,VA
28,DATA SCIENTIST,"Chester, VA",Knowledge Facilitation Group,We are looking for a Data Scientis...,85000,126000 ( ),Chester,VA


In [122]:
#df_more['Average'] = df_more[['Low', 'High']].mean(axis=1)

In [198]:
df_more.reset_index(drop=True,inplace=True)

In [199]:
df_more.index += 1

In [200]:
df_more

Unnamed: 0,Title,Location,Company,Synopsis,Low Salary,High Salary,City,State Initials
1,Data Scientist I,"Midlothian, VA 23112",Virginia Credit Union,Individual must be comfortable ext...,81000,119000 ( ),Midlothian,VA
2,Data Scientist - FinTech Company,"Richmond, VA",Lennon Wright Associates,The team is fantastic you will be ...,70000,120000,Richmond,VA
3,Data Analyst - FinTech Company,"Richmond, VA",Lennon Wright Associates,The team is fantastic you will be ...,70000,120000,Richmond,VA
4,Data Scientist,"Richmond, VA",VA Commonwealth Univ,Knowledgeable in data science life...,75000,90000,Richmond,VA
5,Senior Data Scientist,"Chesterfield, VA",Virginia VA Community College Sys,1+ years’ experience using other a...,65000,73000,Chesterfield,VA
6,Senior Data Scientist,"Chesterfield, VA",VA Community College Sys,1+ years' experience using other a...,65000,73000,Chesterfield,VA
7,Data Scientist,"Richmond, VA 23284 (The Fan area)",Virginia Commonwealth University,Working Title Data Scientist. Know...,53000,78000 ( ),Richmond,VA
8,Senior Data Scientist - Virginia C...,"Richmond, VA",Virginia Community College System,Creating and managing an integrate...,65000,73000,Richmond,VA
9,Data Scientist,"Richmond, VA 23219 (City Center area)",Afton Chemical,This data scientist position will ...,86000,126000 ( ),Richmond,VA
10,DATA SCIENTIST,"Chester, VA",Knowledge Facilitation Group,We are looking for a Data Scientis...,85000,126000 ( ),Chester,VA


In [201]:
df_more.to_csv("tasty.csv")

# 1 Importing libraries

In [18]:
import numpy as np
import pandas as pd
import tweepy
import json

# 2 Authentication

In [19]:
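import os
# Credentials are read from environment variables rather than hard-coded in the notebook.
# The variable names below are illustrative; set them to your own Twitter API keys.
access_token = os.environ["TWITTER_ACCESS_TOKEN"]
access_token_secret = os.environ["TWITTER_ACCESS_TOKEN_SECRET"]
api_key = os.environ["TWITTER_API_KEY"]
api_secret = os.environ["TWITTER_API_SECRET"]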
auth = tweepy.OAuthHandler(api_key, api_secret)
auth.set_access_token(access_token, access_token_secret)

api = tweepy.API(auth)

In [20]:
df_more=pd.read_csv("tasty.csv")

In [21]:
df_more

Unnamed: 0,Job_id,Title,Location,Company,Synopsis,Low Salary,High Salary,City,State Initials
0,1,Data Scientist I,"Midlothian, VA 23112",Virginia Credit Union,Individual must be comfortable ext...,81000,119000.0,Midlothian,VA
1,2,Data Scientist - FinTech Company,"Richmond, VA",Lennon Wright Associates,The team is fantastic you will be ...,70000,120000.0,Richmond,VA
2,3,Data Analyst - FinTech Company,"Richmond, VA",Lennon Wright Associates,The team is fantastic you will be ...,70000,120000.0,Richmond,VA
3,4,Data Scientist,"Richmond, VA",VA Commonwealth Univ,Knowledgeable in data science life...,75000,90000.0,Richmond,VA
4,5,Senior Data Scientist,"Chesterfield, VA",Virginia VA Community College Sys,1+ years’ experience using other a...,65000,73000.0,Chesterfield,VA
5,6,Senior Data Scientist,"Chesterfield, VA",VA Community College Sys,1+ years' experience using other a...,65000,73000.0,Chesterfield,VA
6,7,Data Scientist,"Richmond, VA 23284 (The Fan area)",Virginia Commonwealth University,Working Title Data Scientist. Know...,53000,78000.0,Richmond,VA
7,8,Senior Data Scientist - Virginia C...,"Richmond, VA",Virginia Community College System,Creating and managing an integrate...,65000,73000.0,Richmond,VA
8,9,Data Scientist,"Richmond, VA 23219 (City Center area)",Afton Chemical,This data scientist position will ...,86000,126000.0,Richmond,VA
9,10,DATA SCIENTIST,"Chester, VA",Knowledge Facilitation Group,We are looking for a Data Scientis...,85000,126000.0,Chester,VA


In [22]:
companies = list(df_more['Company'])

In [23]:
social_media_id = []
screen_name = []
url = []
for company in companies:
    print(company)
    # Search Twitter for the company name and keep the top match, if any
    matches = api.search_users(q=company)
    if not matches:
        continue  # no account found for this company
    top = matches[0]._json
    social_media_id.append(top['id'])
    screen_name.append(top['screen_name'])
    url.append(top['url'])
    

        Virginia Credit Union
        Lennon Wright Associates
        Lennon Wright Associates
        VA Commonwealth Univ
        Virginia VA Community College Sys
        VA Community College Sys
        Virginia Commonwealth University
        Virginia Community College System
        Afton Chemical
        Knowledge Facilitation Group
        Amyx, Inc.
        CSS Corporation
        New Virginia Majority
        Virginia Community College System
        Horizon Industries Ltd
        Virginia Department of General Services
        The Prosper Group
        Covance
        Zotec Partners
        Cummins Inc.
        Crowe
        Elements Financial
        OneAmerica
        Crowe
        KSM Consulting
        Cummins Inc.
        Retrace Labs
        US Department of Agriculture
        Child Trends
        Rice University
        ALASKA POWER ASSOCIATION
        Centers for Disease Control and Prevention
        US Department of the Air Force
        US Department of the Air 
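
Because companies with no match are skipped, the result lists do not line up with the company list, and repeated names such as Crowe are searched more than once. The sketch below shows one way to keep each distinct name paired with its top hit. It is an illustration under stated assumptions: lookup_accounts and fake_search are hypothetical, and the sample account data is made up; in the notebook the search callable would wrap the Twitter user search.

```python
def lookup_accounts(companies, search):
    """Map each distinct company name to its top search hit (or None).

    `search` is any callable that takes a name and returns a list of dicts.
    """
    accounts = {}
    for name in dict.fromkeys(companies):  # de-duplicate, keep first-seen order
        hits = search(name)
        accounts[name] = hits[0] if hits else None
    return accounts

# Stand-in search function with made-up data, used only for this demonstration
fake_index = {"Covance": [{"id": 1, "screen_name": "example_covance"}]}

def fake_search(name):
    return fake_index.get(name, [])

result = lookup_accounts(["Covance", "Crowe", "Covance"], fake_search)
print(result)
```

Keeping unmatched companies as None preserves the link back to the job listings, so the lookup results can later be joined to the salary data by company name.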

In [24]:
len(social_media_id)

221

In [25]:
df_final= pd.DataFrame({"user_name":screen_name,"social_media_id":social_media_id,"company_url":url,})

In [26]:
df_final.index += 1

In [27]:
df_final

Unnamed: 0,user_name,social_media_id,company_url
1,VACULIVE,818510649139597312,https://t.co/PN9t3nogN8
2,VCU,156714051,http://t.co/a3In7QjvfX
3,DSLCC,25579173,http://t.co/A7EtLSAS9A
4,FRC422,258023422,https://t.co/5bKlfSDvjN
5,AmyxInc,827209038786068481,https://t.co/1oluRoEPuY
6,csscorp09,85772994,http://t.co/sTEMfJU5WC
7,NewVAMajority,233630180,https://t.co/689Wo71shT
8,DSLCC,25579173,http://t.co/A7EtLSAS9A
9,CCASHMORE_BUYER,156568290,https://t.co/OCxt0dShFT
10,Covance,365005464,https://t.co/apKYTWy1Ny


In [28]:
df_final.to_csv('Twitter urls.csv')

# CITATION

https://github.com/aakashtandel/Web-Scraping-Indeed/blob/master/Code/Scratch%20Notebooks/project-3-aakash-version4.ipynb

https://nycdatascience.com/blog/student-works/project-3-web-scraping-company-data-from-indeed-com-and-dice-com/

Youtube Videos