# **Scraping Glassdoor Company Reviews**

Link: https://bulletbyte.weebly.com/tech/how-to-scrape-a-companys-glassdoor-reviews-using-python


## **Introduction: Libraries and Functions**

In [3]:
#import the libraries
import os
import time

import numpy as np
import pandas as pd
import math

from bs4 import BeautifulSoup
from urllib.request import Request, urlopen

In [4]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [6]:
#create a function to scrape any Glassdoor company review page
#the code still worked when I ran it on 7 Sep 2021, but the HTML of Glassdoor pages changes frequently
#if any of the lists comes back empty, inspect the page and update the tags/classes below

def review_scraper(url):
  #scraping the web page content
  hdr = {"User-Agent":"Mozilla/5.0 Gecko/20100101 Firefox/33.0 GoogleChrome/10.0"}
  req = Request(url,headers=hdr)
  page = urlopen(req)
  soup = BeautifulSoup(page, "html.parser") 

  #define some lists
  Summary=[]
  Date_n_JobTitle=[]
  Date=[]
  JobTitle=[]
  AuthorLocation=[]
  OverallRating=[]
  Pros=[]
  Cons=[]  

  #get the Summary (Hugo: Corrected)
  for x in soup.find_all('h2', {'class':'mb-xxsm mt-0 css-93svrw el6ke055'}):
    Summary.append(x.text)

  #get the Posted Date and Job Title
  for x in soup.find_all('span', {'class':'authorJobTitle middle common__EiReviewDetailsStyle__newGrey'}):
    Date_n_JobTitle.append(x.text)

  #get the Posted Date
  for x in Date_n_JobTitle:
    Date.append(x.split(' -')[0])

  #get Job Title
  for x in Date_n_JobTitle:
    JobTitle.append(x.split(' -')[1])

  #get Author Location
  for x in soup.find_all('span', {'class':'authorLocation'}):
    AuthorLocation.append(x.text)

  #get Overall Rating
  for x in soup.find_all('span', {'class':'ratingNumber mr-xsm'}):
    OverallRating.append(float(x.text))

  #get Pros
  for x in soup.find_all('span', {'data-test':'pros'}):
    Pros.append(x.text)

  #get Cons
  for x in soup.find_all('span', {'data-test':'cons'}):
    Cons.append(x.text)

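  #putting everything together
  #note: zip() stops at the shortest list, so if a field is missing for any review the rows are silently truncated or misaligned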
  Reviews = pd.DataFrame(list(zip(Summary, Date, JobTitle, AuthorLocation, OverallRating, Pros, Cons)), 
                    columns = ['Summary', 'Date', 'JobTitle', 'AuthorLocation', 'OverallRating', 'Pros', 'Cons'])
  
  return Reviews
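
The date/job-title split and the final `zip` are the two steps of `review_scraper` most likely to fail quietly. Below is a minimal sketch of two helpers that could guard them. The names `split_date_and_title` and `check_equal_lengths` are hypothetical, and the example strings assume the span text looks like `"Nov 5, 2021 - Strategy Consultant"`, which is inferred from the `split(' -')` call rather than confirmed against the live page.

```python
def split_date_and_title(text):
    """Split 'Nov 5, 2021 - Strategy Consultant' into (date, job_title).

    str.partition never raises: if ' -' is absent, the title is ''.
    """
    date, _, title = text.partition(" -")
    return date.strip(), title.strip()


def check_equal_lengths(**columns):
    """Raise ValueError if the named lists differ in length."""
    lengths = {name: len(values) for name, values in columns.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"column lengths differ: {lengths}")
    return lengths


print(split_date_and_title("Nov 5, 2021 - Strategy Consultant"))
print(split_date_and_title("Nov 15, 2011 -"))
```

Calling `check_equal_lengths(Summary=Summary, Date=Date, Pros=Pros, Cons=Cons)` just before building the DataFrame would turn a silent misalignment into an explicit error.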

## **Scraping for the 5 companies**

**1. Danske Bank**



In [None]:
#paste/replace the url to the first page of the company's Glassdoor review in between the ""
input_url="https://www.glassdoor.sg/Reviews/Danske-Bank-Reviews-E10384"

#scraping the first page content
hdr = {"User-Agent":"Mozilla/5.0 Gecko/20100101 Firefox/33.0 GoogleChrome/10.0"}
req = Request(input_url+".htm?sort.sortType=RD&sort.ascending=false",headers=hdr)
page = urlopen(req)
soup = BeautifulSoup(page, "html.parser") 

#check the total number of reviews
countReviews = soup.find('div', {'data-test':'pagination-footer-text'}).text
countReviews = float(countReviews.split(' Reviews')[0].split('of ')[1].replace(',',''))

#calculate the max number of pages (assuming 10 reviews a page)
countPages = math.ceil(countReviews/10)
countPages

#range(2, maxPage) below stops before maxPage, so + 1 is needed to reach the last page
#to save time, set a smaller cap instead, e.g. maxPage = 4 scrapes only the first 3 pages
maxPage = countPages + 1

#scraping multiple pages of company glassdoor review
output = review_scraper(input_url+".htm?sort.sortType=RD&sort.ascending=false")
for x in range(2,maxPage):
  url = input_url+"_P"+str(x)+".htm?sort.sortType=RD&sort.ascending=false"
  #DataFrame.append was removed in pandas 2.0; pd.concat is the replacement
  output = pd.concat([output, review_scraper(url)], ignore_index=True)

#display the output
display(output)

Unnamed: 0,Summary,Date,JobTitle,AuthorLocation,OverallRating,Pros,Cons
0,Great workplace but lot of bureaucracy and pol...,"Nov 5, 2021",Strategy Consultant,Copenhagen,4.0,- Smart people\n- Good benefits\n- Interesting...,- Low comp\n- Very political environment
1,Good,"Nov 4, 2021",Associate Director,Warsaw,5.0,Good employer in the Nordics,No direct cons to mention
2,Danske bank,"Oct 31, 2021",Anonymous Employee,"Stockholm, Stockholm, Stockholm",4.0,Good employer cares about people who works there,Bank burocracy takes long time to do simple wo...
3,Moving forward!,"Oct 28, 2021",Anonymous Employee,Vilnius,4.0,Great ambition to change with the demands of s...,Some things move really slow
4,"Good place for non-ambitious, low-qualified pe...","Oct 28, 2021",AML Analyst,"Helsinki, Southern Finland, Southern Finland",1.0,- flexible work environment\n- social benefits...,- poor management without hard-skills;\n- prom...
...,...,...,...,...,...,...,...
321,Slave Labour,"Aug 29, 2013",Servcie Advisor,Copenhagen,1.0,Good banter with workmates etc.,AWFUL compensation.no career progress whatsoev...
322,Conservative and bureaucratic,"Jul 29, 2013",Senior Analyst,Linköping,2.0,- Good job security.\r\n - Good flexibility in...,- Lack of transparent career progression and o...
323,Fair Danish Investment bank but not more than ...,"Jan 5, 2013",Assistant Analyst,Copenhagen,3.0,"I recerived a fair salary for my position, and...",People working there are generally less ambici...
324,good,"Oct 29, 2011",Business Manager,"Dublin, Dublin",4.0,Look after employees well bar basic salary,base salary and everyone at the top is danish
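
The page count above comes from chained `split` calls on the pagination footer. The sketch below shows the same arithmetic with a regular expression, which fails loudly if the footer wording changes. The function names are hypothetical, and the sample footer text `"Viewing 1 - 10 of 1,005 Reviews"` is an assumed format inferred from the `split('of ')` and `split(' Reviews')` calls, not copied from Glassdoor.

```python
import math
import re


def parse_review_count(footer_text):
    """Extract the total review count from text such as '... of 1,005 Reviews'."""
    match = re.search(r"of\s+([\d,]+)\s+Reviews", footer_text)
    if match is None:
        raise ValueError(f"unexpected footer text: {footer_text!r}")
    return int(match.group(1).replace(",", ""))


def count_pages(n_reviews, per_page=10):
    """Number of pages needed, assuming per_page reviews on each page."""
    return math.ceil(n_reviews / per_page)


total = parse_review_count("Viewing 1 - 10 of 1,005 Reviews")
print(total, count_pages(total))
```

With 325 reviews, as in the Danske Bank output, `count_pages(325)` gives 33 pages.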


In [None]:
output['Company'] = "Danske Bank"
output.head()

Unnamed: 0,Summary,Date,JobTitle,AuthorLocation,OverallRating,Pros,Cons,Company
0,Great workplace but lot of bureaucracy and pol...,"Nov 5, 2021",Strategy Consultant,Copenhagen,4.0,- Smart people\n- Good benefits\n- Interesting...,- Low comp\n- Very political environment,Danske Bank
1,Good,"Nov 4, 2021",Associate Director,Warsaw,5.0,Good employer in the Nordics,No direct cons to mention,Danske Bank
2,Danske bank,"Oct 31, 2021",Anonymous Employee,"Stockholm, Stockholm, Stockholm",4.0,Good employer cares about people who works there,Bank burocracy takes long time to do simple wo...,Danske Bank
3,Moving forward!,"Oct 28, 2021",Anonymous Employee,Vilnius,4.0,Great ambition to change with the demands of s...,Some things move really slow,Danske Bank
4,"Good place for non-ambitious, low-qualified pe...","Oct 28, 2021",AML Analyst,"Helsinki, Southern Finland, Southern Finland",1.0,- flexible work environment\n- social benefits...,- poor management without hard-skills;\n- prom...,Danske Bank


In [None]:
output.to_csv('danske_bank_reviews.csv')
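
The same cell is repeated for every company with only `input_url` changed, so the URL construction and the page loop can be factored out. This is a rough sketch under a few assumptions: `build_page_urls` and `scrape_pages` are hypothetical names, the scraper is passed in as an argument so the loop can be tried without network access, and the pause between requests is an addition of this sketch (the notebook imports `time` but never uses it).

```python
import time

# Query string used throughout the notebook to sort reviews by most recent.
SORT_QUERY = ".htm?sort.sortType=RD&sort.ascending=false"


def build_page_urls(input_url, n_pages):
    """Return the URLs for pages 1..n_pages of a company's reviews."""
    urls = [input_url + SORT_QUERY]          # page 1 has no _P suffix
    for page in range(2, n_pages + 1):
        urls.append(f"{input_url}_P{page}{SORT_QUERY}")
    return urls


def scrape_pages(urls, scraper, delay_seconds=1.0):
    """Call scraper(url) for each URL, pausing between requests."""
    results = []
    for i, url in enumerate(urls):
        if i > 0:
            time.sleep(delay_seconds)        # assumed courtesy delay
        results.append(scraper(url))
    return results


for url in build_page_urls("https://www.glassdoor.sg/Reviews/Danske-Bank-Reviews-E10384", 3):
    print(url)
```

In the notebook this could be used as `frames = scrape_pages(build_page_urls(input_url, countPages), review_scraper)` followed by `output = pd.concat(frames, ignore_index=True)`.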

**2. Lego**

In [None]:
#paste/replace the url to the first page of the company's Glassdoor review in between the ""
input_url="https://www.glassdoor.sg/Reviews/the-LEGO-Group-Reviews-E3944"

#scraping the first page content
hdr = {"User-Agent":"Mozilla/5.0 Gecko/20100101 Firefox/33.0 GoogleChrome/10.0"}
req = Request(input_url+".htm?sort.sortType=RD&sort.ascending=false",headers=hdr)
page = urlopen(req)
soup = BeautifulSoup(page, "html.parser") 

#check the total number of reviews
countReviews = soup.find('div', {'data-test':'pagination-footer-text'}).text
countReviews = float(countReviews.split(' Reviews')[0].split('of ')[1].replace(',',''))

#calculate the max number of pages (assuming 10 reviews a page)
countPages = math.ceil(countReviews/10)
countPages

#range(2, maxPage) below stops before maxPage, so + 1 is needed to reach the last page
#to save time, set a smaller cap instead, e.g. maxPage = 4 scrapes only the first 3 pages
maxPage = countPages + 1

#scraping multiple pages of company glassdoor review
output = review_scraper(input_url+".htm?sort.sortType=RD&sort.ascending=false")
for x in range(2,maxPage):
  url = input_url+"_P"+str(x)+".htm?sort.sortType=RD&sort.ascending=false"
  #DataFrame.append was removed in pandas 2.0; pd.concat is the replacement
  output = pd.concat([output, review_scraper(url)], ignore_index=True)

#display the output
display(output)

Unnamed: 0,Summary,Date,JobTitle,AuthorLocation,OverallRating,Pros,Cons
0,Excellent place to work and great people. Feel...,"Nov 5, 2021",Equipment Manager,Singapore,5.0,Culture and work ethic. Exciting projects and ...,None that I can think!
1,Assistant,"Nov 4, 2021",Senior Assistant,Billund,2.0,Company is caring for employees,"Toxic work environment, boring work"
2,Good job,"Nov 3, 2021",Student Worker,"San Antonio, TX",4.0,They do a lot for their employees,"A Big company, so they are bureaucratic"
3,"Fantasic company to work for, they really care...","Nov 3, 2021",Retail Manager,Billund,5.0,Great Pay and benefits even for part time empl...,Working nights and weekends are not ideal but ...
4,Everything is Awesome,"Nov 2, 2021",Human Resources Director,"London, England, England",5.0,"Love the brand, love the people and love our f...","none, its truly an amazing place to work"
...,...,...,...,...,...,...,...
603,District & regional management turn a blind eye,"Dec 4, 2011",Sales Associate,"Friendswood, TX",1.0,1) You get to work with the coolest of toys. ...,District/Regional Management ignores bad store...
604,Definitely not like what you would think...,"Nov 15, 2011",,"Enfield, CT",2.0,My positive experience with this company only ...,"On the other side, the company has many reorga..."
605,"It's ok, but not as fun as you might think it ...","Nov 12, 2011",,"Enfield, CT",2.0,The instant company recognition is nice - it's...,LEGO brands itself externally and interally as...
606,You work very hard at this company but it is v...,"Feb 18, 2011",,"Enfield, CT",3.0,Family oriented company. You enjoy going to t...,Growth and training was very limited. There w...


In [None]:
output['Company'] = "Lego"
output.head()

Unnamed: 0,Summary,Date,JobTitle,AuthorLocation,OverallRating,Pros,Cons,Company
0,Excellent place to work and great people. Feel...,"Nov 5, 2021",Equipment Manager,Singapore,5.0,Culture and work ethic. Exciting projects and ...,None that I can think!,Lego
1,Assistant,"Nov 4, 2021",Senior Assistant,Billund,2.0,Company is caring for employees,"Toxic work environment, boring work",Lego
2,Good job,"Nov 3, 2021",Student Worker,"San Antonio, TX",4.0,They do a lot for their employees,"A Big company, so they are bureaucratic",Lego
3,"Fantasic company to work for, they really care...","Nov 3, 2021",Retail Manager,Billund,5.0,Great Pay and benefits even for part time empl...,Working nights and weekends are not ideal but ...,Lego
4,Everything is Awesome,"Nov 2, 2021",Human Resources Director,"London, England, England",5.0,"Love the brand, love the people and love our f...","none, its truly an amazing place to work",Lego


In [None]:
output.to_csv('lego_reviews.csv')

**3. Maersk**

In [None]:
#paste/replace the url to the first page of the company's Glassdoor review in between the ""
input_url="https://www.glassdoor.sg/Reviews/MAERSK-Reviews-E38791"

#scraping the first page content
hdr = {"User-Agent":"Mozilla/5.0 Gecko/20100101 Firefox/33.0 GoogleChrome/10.0"}
req = Request(input_url+".htm?sort.sortType=RD&sort.ascending=false",headers=hdr)
page = urlopen(req)
soup = BeautifulSoup(page, "html.parser") 

#check the total number of reviews
countReviews = soup.find('div', {'data-test':'pagination-footer-text'}).text
countReviews = float(countReviews.split(' Reviews')[0].split('of ')[1].replace(',',''))

#calculate the max number of pages (assuming 10 reviews a page)
countPages = math.ceil(countReviews/10)
countPages

#range(2, maxPage) below stops before maxPage, so + 1 is needed to reach the last page
#to save time, set a smaller cap instead, e.g. maxPage = 4 scrapes only the first 3 pages
maxPage = countPages + 1

#scraping multiple pages of company glassdoor review
output = review_scraper(input_url+".htm?sort.sortType=RD&sort.ascending=false")
for x in range(2,maxPage):
  url = input_url+"_P"+str(x)+".htm?sort.sortType=RD&sort.ascending=false"
  #DataFrame.append was removed in pandas 2.0; pd.concat is the replacement
  output = pd.concat([output, review_scraper(url)], ignore_index=True)

#display the output
display(output)

Unnamed: 0,Summary,Date,JobTitle,AuthorLocation,OverallRating,Pros,Cons
0,Good company,"Nov 4, 2021",Financial Analyst,Copenhagen,2.0,"Welfare, many clubs, large company","Low salary, less common activity"
1,Best company to work,"Nov 4, 2021",Senior Process Expert,Chennai,5.0,Good place to work with..,All are good to go..
2,Good company,"Nov 4, 2021",Scrum Master%2FAgile Coach,"London, England, England",4.0,Takes care of its people,I can't think of any.
3,Nice company with nice benefits,"Nov 3, 2021",Senior Data Scientist,Copenhagen,5.0,Nice working environment and good space to gro...,A very traditional industrial that technical a...
4,Good,"Nov 3, 2021",Anonymous Employee,"Charlotte, NC",5.0,All good no bad thing,Working on old technologies which is out dated
...,...,...,...,...,...,...,...
1996,Good place to learn your trade but not good to...,"Sep 25, 2008",Client Coordinator,"Madison, NJ",2.0,Good work experience and working there will pu...,The payreviews are very poor and you can often...
1997,Passionate people; changes underway.,"Sep 15, 2008",Analyst,Bangkok,3.0,The company has a very large global reach. Th...,The normal downsides of big companies - slow t...
1998,This ship's in troubled waters...,"Jul 31, 2008",Analyst,Genoa,3.0,Company has a decent Medical Dental Plan and a...,Lack of foresight at senior management level l...
1999,Great place to start off your career in intern...,"Jul 30, 2008",Management Trainee,Copenhagen,4.0,Maersk is proud to be the world's largest ship...,If you didn't join the company as a management...


In [None]:
output['Company'] = "Maersk"
output.head()

Unnamed: 0,Summary,Date,JobTitle,AuthorLocation,OverallRating,Pros,Cons,Company
0,Good company,"Nov 4, 2021",Financial Analyst,Copenhagen,2.0,"Welfare, many clubs, large company","Low salary, less common activity",Maersk
1,Best company to work,"Nov 4, 2021",Senior Process Expert,Chennai,5.0,Good place to work with..,All are good to go..,Maersk
2,Good company,"Nov 4, 2021",Scrum Master%2FAgile Coach,"London, England, England",4.0,Takes care of its people,I can't think of any.,Maersk
3,Nice company with nice benefits,"Nov 3, 2021",Senior Data Scientist,Copenhagen,5.0,Nice working environment and good space to gro...,A very traditional industrial that technical a...,Maersk
4,Good,"Nov 3, 2021",Anonymous Employee,"Charlotte, NC",5.0,All good no bad thing,Working on old technologies which is out dated,Maersk


In [None]:
output.to_csv('maersk_reviews.csv')

**4. Pandora**

In [None]:
#paste/replace the url to the first page of the company's Glassdoor review in between the ""
input_url="https://www.glassdoor.sg/Reviews/Pandora-Jewelry-Reviews-E346695"

#scraping the first page content
hdr = {"User-Agent":"Mozilla/5.0 Gecko/20100101 Firefox/33.0 GoogleChrome/10.0"}
req = Request(input_url+".htm?sort.sortType=RD&sort.ascending=false",headers=hdr)
page = urlopen(req)
soup = BeautifulSoup(page, "html.parser") 

#check the total number of reviews
countReviews = soup.find('div', {'data-test':'pagination-footer-text'}).text
countReviews = float(countReviews.split(' Reviews')[0].split('of ')[1].replace(',',''))

#calculate the max number of pages (assuming 10 reviews a page)
countPages = math.ceil(countReviews/10)
countPages

#range(2, maxPage) below stops before maxPage, so + 1 is needed to reach the last page
#to save time, set a smaller cap instead, e.g. maxPage = 4 scrapes only the first 3 pages
maxPage = countPages + 1

#scraping multiple pages of company glassdoor review
output = review_scraper(input_url+".htm?sort.sortType=RD&sort.ascending=false")
for x in range(2,maxPage):
  url = input_url+"_P"+str(x)+".htm?sort.sortType=RD&sort.ascending=false"
  #DataFrame.append was removed in pandas 2.0; pd.concat is the replacement
  output = pd.concat([output, review_scraper(url)], ignore_index=True)

#display the output
display(output)

Unnamed: 0,Summary,Date,JobTitle,AuthorLocation,OverallRating,Pros,Cons
0,Ok company to learn about selling luxury goods.,"Nov 4, 2021",Anonymous Employee,"Fort Worth, TX",3.0,"2% commission, 5 days a week.",13+hr shifts that would go back to back or in ...
1,Diverse nice job,"Nov 4, 2021",Sales,Hamburg,3.0,"Colleagues are friendly, clients are wealthy",Not much benefits and repetitive job
2,best,"Nov 4, 2021",Sales Associate,"Sutton, London, England, England",5.0,"75% discount on all jewelry, flexible hours",there aren't really any cons
3,Don't go,"Nov 4, 2021",CR Representative,"Roosevelt, NY",1.0,you have fun with some colleagues. I worked at...,"location, food, management, salary, too many p..."
4,amazing job,"Nov 4, 2021",Retail Sales Assistant,"Calgary, AB",5.0,great company to work for. plenty of opportuni...,People may not like set targets however its go...
...,...,...,...,...,...,...,...
963,Pandora Dealers Need to File a Class Action La...,"Jan 26, 2013",Dealer,"Virginia Beach, VA",1.0,There are no pros. A prime example of greed an...,This company started many years ago as an unkn...
964,AMazing experience to work for such a sought a...,"Dec 10, 2012",Key Holder,"Louisville, KY",5.0,Franchised and my store was very much like a f...,Lack of organization at times. The environment...
965,"Great people, fun work environment, excellent ...","Sep 15, 2012",Assistant Manager,"Edmonton, AB",4.0,Fun job. Opportunity to work with cool product...,Not as much corporate support as needed.
966,Ok and should be better but company growing so...,"Aug 15, 2011",Merchandiser,"Houston, TX",4.0,Wonderful people and it isdifferent every day.,Terrible communication and huge territory.


In [None]:
output['Company'] = "Pandora"
output.head()

Unnamed: 0,Summary,Date,JobTitle,AuthorLocation,OverallRating,Pros,Cons,Company
0,Ok company to learn about selling luxury goods.,"Nov 4, 2021",Anonymous Employee,"Fort Worth, TX",3.0,"2% commission, 5 days a week.",13+hr shifts that would go back to back or in ...,Pandora
1,Diverse nice job,"Nov 4, 2021",Sales,Hamburg,3.0,"Colleagues are friendly, clients are wealthy",Not much benefits and repetitive job,Pandora
2,best,"Nov 4, 2021",Sales Associate,"Sutton, London, England, England",5.0,"75% discount on all jewelry, flexible hours",there aren't really any cons,Pandora
3,Don't go,"Nov 4, 2021",CR Representative,"Roosevelt, NY",1.0,you have fun with some colleagues. I worked at...,"location, food, management, salary, too many p...",Pandora
4,amazing job,"Nov 4, 2021",Retail Sales Assistant,"Calgary, AB",5.0,great company to work for. plenty of opportuni...,People may not like set targets however its go...,Pandora


In [None]:
output.to_csv('pandora_reviews.csv')

**5. Novo Nordisk**

In [None]:
#paste/replace the url to the first page of the company's Glassdoor review in between the ""
input_url="https://www.glassdoor.sg/Reviews/Novo-Nordisk-Reviews-E3498"

#scraping the first page content
hdr = {"User-Agent":"Mozilla/5.0 Gecko/20100101 Firefox/33.0 GoogleChrome/10.0"}
req = Request(input_url+".htm?sort.sortType=RD&sort.ascending=false",headers=hdr)
page = urlopen(req)
soup = BeautifulSoup(page, "html.parser") 

#check the total number of reviews
countReviews = soup.find('div', {'data-test':'pagination-footer-text'}).text
countReviews = float(countReviews.split(' Reviews')[0].split('of ')[1].replace(',',''))

#calculate the max number of pages (assuming 10 reviews a page)
countPages = math.ceil(countReviews/10)
countPages

#range(2, maxPage) below stops before maxPage, so + 1 is needed to reach the last page
#to save time, set a smaller cap instead, e.g. maxPage = 4 scrapes only the first 3 pages
maxPage = countPages + 1

#scraping multiple pages of company glassdoor review
output = review_scraper(input_url+".htm?sort.sortType=RD&sort.ascending=false")
for x in range(2,maxPage):
  url = input_url+"_P"+str(x)+".htm?sort.sortType=RD&sort.ascending=false"
  #DataFrame.append was removed in pandas 2.0; pd.concat is the replacement
  output = pd.concat([output, review_scraper(url)], ignore_index=True)

#display the output
display(output)

Unnamed: 0,Summary,Date,JobTitle,AuthorLocation,OverallRating,Pros,Cons
0,Good,"Nov 6, 2021",Medical Advisor,Tokyo,5.0,Pipeline and launched products are excellent,Strong focus on one product which is semaglutide
1,Good company,"Nov 3, 2021",Senior Diabetes Care Specialist,"Wheaton, IL",4.0,Care about employees \r\nWilling to listen and...,As with any pharmaceutical company it is extre...
2,Review from interview to working,"Nov 3, 2021",Finance Executive,Dhaka,4.0,Excellent working environment \nClassy place a...,The only cons I have seen so far is that peopl...
3,Great Place To Work,"Nov 3, 2021",Administrative Specialist,"Plainsboro, NJ",5.0,"Company Culture, Benefits, People, Diversity",If you interview for a position and you are no...
4,good,"Nov 2, 2021",Manufacturing Associate,"Clayton, NC",2.0,great people good benefits for there employees,little pay here but starting company
...,...,...,...,...,...,...,...
1001,Finance,"Jun 25, 2010",,"Phila, PA",5.0,the benefits and the people who work here,there are no downsides to working here
1002,nothing special,"Feb 17, 2009",Medical Director,"Zürich, Zürich",3.0,"evidence based, focused on diabetes, richpipe ...",career path depends on management
1003,"A good company to work for, but its the same a...","Jan 5, 2009",Sales,"Los Angeles, CA",3.0,benefits. benefits. benefits. benefits. benefits.,The management in the US don't support or tru...
1004,great company,"Dec 10, 2008",District Business Manager,"Princeton, NJ",4.0,the products are best in class,misunderstanding of the \r\nAmerican merket


In [None]:
output['Company'] = "Novo Nordisk"
output.head()

Unnamed: 0,Summary,Date,JobTitle,AuthorLocation,OverallRating,Pros,Cons,Company
0,Good,"Nov 6, 2021",Medical Advisor,Tokyo,5.0,Pipeline and launched products are excellent,Strong focus on one product which is semaglutide,Novo Nordisk
1,Good company,"Nov 3, 2021",Senior Diabetes Care Specialist,"Wheaton, IL",4.0,Care about employees \r\nWilling to listen and...,As with any pharmaceutical company it is extre...,Novo Nordisk
2,Review from interview to working,"Nov 3, 2021",Finance Executive,Dhaka,4.0,Excellent working environment \nClassy place a...,The only cons I have seen so far is that peopl...,Novo Nordisk
3,Great Place To Work,"Nov 3, 2021",Administrative Specialist,"Plainsboro, NJ",5.0,"Company Culture, Benefits, People, Diversity",If you interview for a position and you are no...,Novo Nordisk
4,good,"Nov 2, 2021",Manufacturing Associate,"Clayton, NC",2.0,great people good benefits for there employees,little pay here but starting company,Novo Nordisk


In [None]:
output.to_csv('novo_nordisk_reviews.csv')

## **Gather all data in one file**

In [None]:
df1 = pd.read_csv('/content/danske_bank_reviews.csv')
df2 = pd.read_csv('/content/lego_reviews.csv')
df3 = pd.read_csv('/content/maersk_reviews.csv')
df4 = pd.read_csv('/content/novo_nordisk_reviews.csv')
df5 = pd.read_csv('/content/pandora_reviews.csv')

tabs = [df1, df2, df3, df4, df5]

glassdoor_reviews = pd.concat(tabs)
glassdoor_reviews.reset_index(inplace=True)

glassdoor_reviews

Unnamed: 0.1,index,Unnamed: 0,Summary,Date,JobTitle,AuthorLocation,OverallRating,Pros,Cons,Company
0,0,0,Great workplace but lot of bureaucracy and pol...,"Nov 5, 2021",Strategy Consultant,Copenhagen,4.0,- Smart people\n- Good benefits\n- Interesting...,- Low comp\n- Very political environment,Danske Bank
1,1,1,Good,"Nov 4, 2021",Associate Director,Warsaw,5.0,Good employer in the Nordics,No direct cons to mention,Danske Bank
2,2,2,Danske bank,"Oct 31, 2021",Anonymous Employee,"Stockholm, Stockholm, Stockholm",4.0,Good employer cares about people who works there,Bank burocracy takes long time to do simple wo...,Danske Bank
3,3,3,Moving forward!,"Oct 28, 2021",Anonymous Employee,Vilnius,4.0,Great ambition to change with the demands of s...,Some things move really slow,Danske Bank
4,4,4,"Good place for non-ambitious, low-qualified pe...","Oct 28, 2021",AML Analyst,"Helsinki, Southern Finland, Southern Finland",1.0,- flexible work environment\n- social benefits...,- poor management without hard-skills;\n- prom...,Danske Bank
...,...,...,...,...,...,...,...,...,...,...
4904,963,963,Pandora Dealers Need to File a Class Action La...,"Jan 26, 2013",Dealer,"Virginia Beach, VA",1.0,There are no pros. A prime example of greed an...,This company started many years ago as an unkn...,Pandora
4905,964,964,AMazing experience to work for such a sought a...,"Dec 10, 2012",Key Holder,"Louisville, KY",5.0,Franchised and my store was very much like a f...,Lack of organization at times. The environment...,Pandora
4906,965,965,"Great people, fun work environment, excellent ...","Sep 15, 2012",Assistant Manager,"Edmonton, AB",4.0,Fun job. Opportunity to work with cool product...,Not as much corporate support as needed.,Pandora
4907,966,966,Ok and should be better but company growing so...,"Aug 15, 2011",Merchandiser,"Houston, TX",4.0,Wonderful people and it isdifferent every day.,Terrible communication and huge territory.,Pandora


In [None]:
glassdoor_reviews.to_csv('glassdoor_reviews.csv')
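
The `Unnamed: 0` and `index` columns in the combined table are a side effect of how the files are written and re-read: `to_csv` writes the row index by default, `read_csv` reads it back as an unnamed column, and `reset_index` without `drop=True` keeps the old index as a column. The toy example below illustrates this on an in-memory buffer. The helper name `round_trip` and the two-row DataFrame are made up for illustration.

```python
import io

import pandas as pd


def round_trip(df, write_index):
    """Write df to CSV text and read it back, with or without the index."""
    buffer = io.StringIO()
    df.to_csv(buffer, index=write_index)
    buffer.seek(0)
    return pd.read_csv(buffer)


toy = pd.DataFrame({"Summary": ["Good", "best"], "OverallRating": [5.0, 3.0]})

with_index = round_trip(toy, write_index=True)
without_index = round_trip(toy, write_index=False)

print(list(with_index.columns))     # ['Unnamed: 0', 'Summary', 'OverallRating']
print(list(without_index.columns))  # ['Summary', 'OverallRating']

# ignore_index=True renumbers rows 0..n-1 without adding an 'index' column
combined = pd.concat([without_index, without_index], ignore_index=True)
print(list(combined.columns), combined.index.tolist())
```

Writing each company file with `index=False` and concatenating with `ignore_index=True` would therefore leave only the seven scraped columns plus `Company`.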

## **Add Carlsberg**

In [7]:
#paste/replace the url to the first page of the company's Glassdoor review in between the ""
input_url="https://www.glassdoor.sg/Reviews/Carlsberg-Group-Reviews-E3116"

#scraping the first page content
hdr = {"User-Agent":"Mozilla/5.0 Gecko/20100101 Firefox/33.0 GoogleChrome/10.0"}
req = Request(input_url+".htm?sort.sortType=RD&sort.ascending=false",headers=hdr)
page = urlopen(req)
soup = BeautifulSoup(page, "html.parser") 

#check the total number of reviews
countReviews = soup.find('div', {'data-test':'pagination-footer-text'}).text
countReviews = float(countReviews.split(' Reviews')[0].split('of ')[1].replace(',',''))

#calculate the max number of pages (assuming 10 reviews a page)
countPages = math.ceil(countReviews/10)
countPages

#range(2, maxPage) below stops before maxPage, so + 1 is needed to reach the last page
#to save time, set a smaller cap instead, e.g. maxPage = 4 scrapes only the first 3 pages
maxPage = countPages + 1

#scraping multiple pages of company glassdoor review
output = review_scraper(input_url+".htm?sort.sortType=RD&sort.ascending=false")
for x in range(2,maxPage):
  url = input_url+"_P"+str(x)+".htm?sort.sortType=RD&sort.ascending=false"
  #DataFrame.append was removed in pandas 2.0; pd.concat is the replacement
  output = pd.concat([output, review_scraper(url)], ignore_index=True)

#display the output
display(output)

Unnamed: 0,Summary,Date,JobTitle,AuthorLocation,OverallRating,Pros,Cons
0,Strong purpose,"Nov 1, 2021",Human Resources Professional,Copenhagen,4.0,"Great colleagues, strong purpose, strong leade...",Too cost focused which prevents long term plans
1,Feedback,"Oct 29, 2021",AREA SALES MANAGER,Hanoi,4.0,take good care of employees\r\ngood work atmos...,harder than expected for international mobility
2,Good Company but too focused on savings,"Oct 27, 2021",Finance Manager,Copenhagen,4.0,"products, rotation program, people, office",Company is vey focused on savings
3,Probably not the best company or beer in the w...,"Oct 20, 2021",Director,Copenhagen,2.0,Big brewing company - with two big brands,"Very conservative, political and full of burea..."
4,All good if you work hard,"Oct 16, 2021",Senior Procurement Director,"Glarus Nord, Glarus",5.0,Manu opportunities globālu are there,Nothing personal just Business needs
...,...,...,...,...,...,...,...
205,Truly fast-moving company,"Nov 20, 2012",Chief Financial Officer,Copenhagen,5.0,They do not stop to think - they do! So they ...,Well... sometimes there would be some time nee...
206,"Good products, good inovation, good people, di...","Oct 11, 2012",Anonymous Employee,Copenhagen,3.0,"Chalenging and fast learning enviroment, espec...",To many different project that interfere betwe...
207,"Great, inspiring people, but not most structur...","Sep 10, 2012",Brand Manager,Copenhagen,4.0,"Great people, fun, youthful environment. Very ...",Lack of professionalism - complete mess when i...
208,"Great company, with lots of challenges and cha...","Aug 12, 2012",Business Controller,Copenhagen,4.0,"This is a company, which gives you a chance to...","As a market leader, the company has to follow ..."


In [8]:
output['Company'] = "Carlsberg"
output.head()

Unnamed: 0,Summary,Date,JobTitle,AuthorLocation,OverallRating,Pros,Cons,Company
0,Strong purpose,"Nov 1, 2021",Human Resources Professional,Copenhagen,4.0,"Great colleagues, strong purpose, strong leade...",Too cost focused which prevents long term plans,Carlsberg
1,Feedback,"Oct 29, 2021",AREA SALES MANAGER,Hanoi,4.0,take good care of employees\r\ngood work atmos...,harder than expected for international mobility,Carlsberg
2,Good Company but too focused on savings,"Oct 27, 2021",Finance Manager,Copenhagen,4.0,"products, rotation program, people, office",Company is vey focused on savings,Carlsberg
3,Probably not the best company or beer in the w...,"Oct 20, 2021",Director,Copenhagen,2.0,Big brewing company - with two big brands,"Very conservative, political and full of burea...",Carlsberg
4,All good if you work hard,"Oct 16, 2021",Senior Procurement Director,"Glarus Nord, Glarus",5.0,Manu opportunities globālu are there,Nothing personal just Business needs,Carlsberg


In [9]:
output.to_csv('carlsberg_reviews.csv')

In [10]:
#load the combined five-company file (stored on Drive here as 1_glassdoor_reviews.csv)
df = pd.read_csv('/content/drive/MyDrive/1_glassdoor_reviews.csv')

df6 = pd.read_csv('/content/carlsberg_reviews.csv')

tabs = [df, df6]

glassdoor_reviews = pd.concat(tabs)
glassdoor_reviews.reset_index(inplace=True)

glassdoor_reviews

Unnamed: 0.2,level_0,Unnamed: 0,index,Unnamed: 0.1,Summary,Date,JobTitle,AuthorLocation,OverallRating,Pros,Cons,Company
0,0,0,0.0,0.0,Great workplace but lot of bureaucracy and pol...,"Nov 5, 2021",Strategy Consultant,Copenhagen,4.0,- Smart people\n- Good benefits\n- Interesting...,- Low comp\n- Very political environment,Danske Bank
1,1,1,1.0,1.0,Good,"Nov 4, 2021",Associate Director,Warsaw,5.0,Good employer in the Nordics,No direct cons to mention,Danske Bank
2,2,2,2.0,2.0,Danske bank,"Oct 31, 2021",Anonymous Employee,"Stockholm, Stockholm, Stockholm",4.0,Good employer cares about people who works there,Bank burocracy takes long time to do simple wo...,Danske Bank
3,3,3,3.0,3.0,Moving forward!,"Oct 28, 2021",Anonymous Employee,Vilnius,4.0,Great ambition to change with the demands of s...,Some things move really slow,Danske Bank
4,4,4,4.0,4.0,"Good place for non-ambitious, low-qualified pe...","Oct 28, 2021",AML Analyst,"Helsinki, Southern Finland, Southern Finland",1.0,- flexible work environment\n- social benefits...,- poor management without hard-skills;\n- prom...,Danske Bank
...,...,...,...,...,...,...,...,...,...,...,...,...
5114,205,205,,,Truly fast-moving company,"Nov 20, 2012",Chief Financial Officer,Copenhagen,5.0,They do not stop to think - they do! So they ...,Well... sometimes there would be some time nee...,Carlsberg
5115,206,206,,,"Good products, good inovation, good people, di...","Oct 11, 2012",Anonymous Employee,Copenhagen,3.0,"Chalenging and fast learning enviroment, espec...",To many different project that interfere betwe...,Carlsberg
5116,207,207,,,"Great, inspiring people, but not most structur...","Sep 10, 2012",Brand Manager,Copenhagen,4.0,"Great people, fun, youthful environment. Very ...",Lack of professionalism - complete mess when i...,Carlsberg
5117,208,208,,,"Great company, with lots of challenges and cha...","Aug 12, 2012",Business Controller,Copenhagen,4.0,"This is a company, which gives you a chance to...","As a market leader, the company has to follow ...",Carlsberg


In [11]:
glassdoor_reviews.to_csv('2_glassdoor_reviews.csv')
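
After the second merge the table carries several leftover index columns (`level_0`, `index`, and the `Unnamed: ...` ones), and the Carlsberg rows hold NaN in the columns that exist only in the older file. A minimal sketch of one way to strip them from an already-saved file follows. The function name `drop_leftover_index_columns` and the toy DataFrame are hypothetical; the column names match the header shown in the output above.

```python
import pandas as pd

# Column names that reset_index() leaves behind when drop=True is not used.
LEFTOVER_NAMES = {"index", "level_0"}


def drop_leftover_index_columns(df):
    """Drop 'Unnamed: ...', 'index' and 'level_0' columns and renumber rows."""
    keep = [
        col for col in df.columns
        if not str(col).startswith("Unnamed:") and col not in LEFTOVER_NAMES
    ]
    return df[keep].reset_index(drop=True)


toy = pd.DataFrame({
    "level_0": [0, 1],
    "Unnamed: 0": [0, 1],
    "index": [0.0, None],
    "Unnamed: 0.1": [0.0, None],
    "Summary": ["Good", "Strong purpose"],
    "Company": ["Danske Bank", "Carlsberg"],
})

clean = drop_leftover_index_columns(toy)
print(list(clean.columns))  # ['Summary', 'Company']
```

Saving the cleaned frame with `to_csv(..., index=False)` keeps the extra columns from reappearing on the next read.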