## Web Scraping Data from PeoplePerHour

Below are the XPath expressions needed to scrape data from https://www.peopleperhour.com/hire-freelancers. Each expression returns elements rather than strings, so read each matched element's `.text` attribute to extract its value.

### XPath Expressions

1. **Names of Freelancers**
   - XPath: `//h2[contains(@class, 'clearfix')]`

2. **Description of Each Freelancer**
   - XPath: `//p[contains(@class, 'job-title')]`

3. **Nationality of Each Freelancer**
   - XPath: `//div/span[contains(@class, 'small')]`

4. **Price of Each Freelancer**
   - XPath: `//div/span[contains(@class, 'title-nano card')]`

5. **Ratings (needs to be split into two columns: rating and number of reviews)**
   - XPath: `//div[contains(@class, 'ratings')]`

6. **Next Page**
   - XPath: `//li[contains(@class, 'next')]`
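
Every expression above relies on `contains(@class, ...)`, which is a plain substring test on the whole `class` attribute rather than a match on individual class tokens. The sketch below mimics that behavior in Python to show the consequences. The helper `xpath_contains` and the sample class strings are hypothetical and are not taken from the PeoplePerHour markup.

```python
def xpath_contains(attribute_value: str, needle: str) -> bool:
    """Mimic XPath contains(): true when needle is a substring of attribute_value."""
    return needle in attribute_value

# Hypothetical class attribute values, for illustration only.
print(xpath_contains("member-name clearfix", "clearfix"))     # True: the token is present
print(xpath_contains("smaller-text", "small"))                # True: substring, not a whole token
print(xpath_contains("card title-nano", "title-nano card"))   # False: the order differs
```

In practice this means `contains(@class, 'small')` can also pick up elements whose class merely includes the letters `small`, and a multi-word value such as `'title-nano card'` only matches when the classes appear in exactly that order in the attribute.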


In [2]:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options
import time
import pandas as pd

# Initialize WebDriver
options = Options()
options.add_argument("--headless")  # Run in headless mode (optional)
driver = webdriver.Chrome(options=options)  # Pass the options so headless mode takes effect

# Open the website
driver.get("https://www.peopleperhour.com/hire-freelancers")
time.sleep(3)  # Allow time for elements to load

data = []
max_rows = 25
collected_rows = 0

while collected_rows < max_rows:
    freelancers = driver.find_elements(By.XPATH, "//h2[contains(@class, 'clearfix')]")
    descriptions = driver.find_elements(By.XPATH, "//p[contains(@class, 'job-title')]")
    nationalities = driver.find_elements(By.XPATH, "//div/span[contains(@class, 'small')]")
    prices = driver.find_elements(By.XPATH, "//div/span[contains(@class, 'title-nano card')]")
    ratings_data = driver.find_elements(By.XPATH, "//div[contains(@class, 'ratings')]")
    
    for i in range(len(freelancers)):
        if collected_rows >= max_rows:
            break
        
        name = freelancers[i].text  # i always indexes freelancers validly, so no fallback is needed
        description = descriptions[i].text if i < len(descriptions) else "N/A"
        nationality = nationalities[i].text if i < len(nationalities) else "N/A"
        price = prices[i].text if i < len(prices) else "N/A"
        
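        rating_text = ratings_data[i].text if i < len(ratings_data) else "N/A"
        # split(" ") breaks on single spaces only, so any newlines stay inside the parts
        rating_parts = rating_text.split(" ")
        rating = rating_parts[0] if len(rating_parts) > 0 else "N/A"
        # Drop the first and last characters, intended to strip the parentheses around the count
        total_votes = rating_parts[1][1:-1] if len(rating_parts) > 1 else "N/A"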
        
        data.append([name, description, nationality, price, rating, total_votes])
        collected_rows += 1

    # Try to go to the next page
    try:
        next_button = driver.find_element(By.XPATH, "//li[contains(@class, 'next')]")
        next_button.click()
        time.sleep(3)  # Wait for the new page to load
    except Exception:  # A bare except would also swallow KeyboardInterrupt
        print("No more pages or next button not found.")
        break

# Close the driver
driver.quit()

# Save to DataFrame and CSV
columns = ["Name", "Description", "Nationality", "Price", "Rating", "Total Votes"]
df = pd.DataFrame(data, columns=columns)
df.to_csv("freelancers_data.csv", index=False)

print("Scraping completed. Data saved to freelancers_data.csv.")

No more pages or next button not found.
Scraping completed. Data saved to freelancers_data.csv.
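
The script above relies on fixed `time.sleep(3)` pauses, which can be too short on a slow connection and wasted time on a fast one. Selenium provides `WebDriverWait` for waiting on a condition. The sketch below only illustrates the underlying polling idea with a hypothetical `wait_until` helper that has no Selenium dependency, so its name, parameters, and defaults are assumptions rather than part of the original script.

```python
import time

def wait_until(predicate, timeout=10.0, poll_interval=0.5):
    """Call predicate() until it returns a truthy value; raise TimeoutError otherwise."""
    deadline = time.monotonic() + timeout
    while True:
        result = predicate()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)

# Stand-in for a page whose elements only appear on the third check.
calls = {"count": 0}

def fake_find_elements():
    calls["count"] += 1
    return ["card"] if calls["count"] >= 3 else []

found = wait_until(fake_find_elements, timeout=2.0, poll_interval=0.01)
print(found, calls["count"])  # ['card'] 3
```

In the scraper, the predicate could be something like `lambda: driver.find_elements(By.XPATH, "//h2[contains(@class, 'clearfix')]")`, since `find_elements` returns an empty list until matching elements exist.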


In [3]:
df

| | Name | Description | Nationality | Price | Rating | Total Votes |
|---|---|---|---|---|---|---|
| 0 | Tom M. | Google Ads Partner / PPC / AdWords / Google Sh... | United Kingdom | $51/hr | 5.0\n | 128)\nCER |
| 1 | Maria H. | Experienced Team of Graphic Designers, Web Dev... | United Kingdom | $32/hr | 4.9\n | 7123)\nTO |
| 2 | Ann-Marie M. | Freelance Paralegal | Jamaica | $19/hr | | |
| 3 | Maykal P. | Full stack Web / App developer / AI profession... | Bulgaria | $30/hr | 5.0\n | 19)\nCER |
| 4 | White Hat SEO Guru\| Guaranteed Ranking S. | PPH's #1 for SEO & 22 Years of Excellence \|Goo... | India | $10/hr | 4.8\n | 4776)\nTO |
| 5 | Green D. | SEO Expert, 5000+ Reviews, #1in SEO & Marketin... | Ireland | $45/hr | 5.0\n | 6184)\nTO |
| 6 | Sunday E. | Web designer | United Kingdom | $19/hr | 5.0\n | 1)\nCER |
| 7 | GAJURA C. | SEO Expert and Experienced Guest Post Writer o... | Germany | $32/hr | 5.0\n | 2498)\nTO |
| 8 | Out of Box Studios | UK's Leading Digital Design, Development & Mar... | United Kingdom | $38/hr | 4.9\n | 1677)\nTO |
| 9 | Writing Expertise | ⭐ Copywriter \| Proofreader\| SEO Specialist \|Mu... | United Kingdom | $14/hr | 5.0\n | 8)\nTO |
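
The Rating and Total Votes values in the output above still carry stray newlines, a closing parenthesis, and fragments of the text that followed the review count. One way to clean them is to pull out only the numeric part with a regular expression. This is a minimal sketch: the helper names `clean_rating` and `clean_votes` are hypothetical, and the sample values are copied from the first three rows of that output.

```python
import re

def clean_rating(raw: str):
    """Return the first decimal number in raw as a float, or None if there is none."""
    match = re.search(r"\d+(?:\.\d+)?", raw)
    return float(match.group()) if match else None

def clean_votes(raw: str):
    """Return the first run of digits in raw as an int, or None if there is none."""
    match = re.search(r"\d+", raw)
    return int(match.group()) if match else None

# (Rating, Total Votes) pairs as they appear in rows 0, 1, and 2 of the output.
rows = [("5.0\n", "128)\nCER"), ("4.9\n", "7123)\nTO"), ("", "")]
cleaned = [(clean_rating(rating), clean_votes(votes)) for rating, votes in rows]
print(cleaned)  # [(5.0, 128), (4.9, 7123), (None, None)]
```

Returning `None` for freelancers without reviews keeps missing values distinct from a genuine rating of zero, and pandas will represent them as `NaN` if the cleaned values are put back into the DataFrame.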
