<h1 align=center>Code For Remote Jobs Web Scraping</h1>

<h1>Import Required Libraries</h1>

First, import the necessary Python libraries for web scraping and data handling.

In [30]:
import requests
from bs4 import BeautifulSoup
import pandas as pd

<h1>Set the Target URL and Headers</h1>

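Define the target URL and a browser-like User-Agent header. By default, requests identifies itself as python-requests, which some servers reject, so sending a browser User-Agent string reduces the chance of being blocked.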

In [31]:
base_url = "https://remoteok.io/remote-dev-jobs"

headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"}

<h1>Send an HTTP Request</h1>

Make a GET request to the website to fetch the raw HTML content.

In [32]:
# A timeout keeps the cell from hanging if the server never responds
response = requests.get(base_url, headers=headers, timeout=10)

if response.status_code != 200:
    print(f"Failed to fetch the webpage. Status code: {response.status_code}")
else:
    print("Fetching jobs...")


Fetching jobs...
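The status check above can also be wrapped in a small helper so that network errors and non-200 responses are handled in one place. This is a minimal sketch, not the notebook's own code: the `fetch_html` name, its `session` parameter, and the 10-second default timeout are assumptions made for illustration.

```python
import requests


def fetch_html(url, headers=None, timeout=10, session=None):
    """Return the response body as text, or None if the request fails.

    `fetch_html` and its parameters are hypothetical names for this sketch.
    """
    # Use the supplied session if there is one, otherwise the requests module itself
    http = session if session is not None else requests
    try:
        response = http.get(url, headers=headers, timeout=timeout)
        # raise_for_status() raises requests.HTTPError for 4xx and 5xx responses
        response.raise_for_status()
    except requests.RequestException as exc:
        # RequestException is the base class for timeouts, connection errors and HTTP errors
        print(f"Failed to fetch {url}: {exc}")
        return None
    return response.text
```

With this helper, a call such as `fetch_html(base_url, headers=headers)` returns the HTML string on success and `None` on failure, so later cells can check for `None` before parsing.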


<h1>Parse the HTML Content</h1>

Use BeautifulSoup to parse the HTML into a tree that can be searched by tag and class.

In [33]:
soup = BeautifulSoup(response.text, 'html.parser')


<h1>Find Job Containers</h1>

Identify the HTML elements that hold job data. On RemoteOK, each job posting is a `<tr>` element with the class `job`.

In [34]:
job_containers = soup.find_all('tr', class_='job')
print(f"Found {len(job_containers)} job postings.")


Found 20 job postings.
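To see how the class filter behaves, it can help to run it on a small hand-written snippet first. The HTML below is invented for illustration and is not taken from RemoteOK. It shows that `class_='job'` also matches rows that carry additional classes, while rows without the `job` class are skipped.

```python
from bs4 import BeautifulSoup

# Made-up markup for demonstration only; the real page is far larger
sample_html = """
<table>
  <tr class="job"><td><h2>Backend Engineer</h2></td></tr>
  <tr class="ad"><td>Sponsored</td></tr>
  <tr class="job featured"><td><h2>Data Engineer</h2></td></tr>
</table>
"""

sample_soup = BeautifulSoup(sample_html, "html.parser")

# class_ matches any single class on the tag, so "job featured" qualifies too
sample_rows = sample_soup.find_all("tr", class_="job")
sample_titles = [row.find("h2").text.strip() for row in sample_rows]

print(sample_titles)  # ['Backend Engineer', 'Data Engineer']
```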


<h1>Extract Data From Each Job</h1>

Loop through each job container and extract the required fields: Job Title, Company Name, Location, and Apply Link. If a field is missing from a row, the error is printed and that row is skipped.

In [35]:
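jobs = []

for job in job_containers:
    try:
        # Title and company name come from the row's <h2> and <h3> tags
        job_title = job.find('h2', class_='').text.strip()
        company_name = job.find('h3', class_='').text.strip()

        # Look the location up once; fall back to "Remote" when it is absent
        location_tag = job.find('div', class_='location')
        location = location_tag.text.strip() if location_tag else "Remote"

        # The href is site-relative, so prepend the domain
        apply_link = job.find('a', class_='preventLink')['href']
        apply_link = f"https://remoteok.io{apply_link}"

        jobs.append({
            "Job Title": job_title,
            "Company Name": company_name,
            "Location": location,
            "Apply Link": apply_link
        })
    except (AttributeError, TypeError, KeyError) as e:
        # Raised when an expected tag or its href attribute is missing
        print(f"Error extracting job details: {e}")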
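An alternative to wrapping the whole row in `try`/`except` is to handle each missing element explicitly, so one absent field does not discard the entire posting. The sketch below is one way to do that. The helper names `text_or_default` and `parse_job`, the `SITE_ROOT` constant, and the sample row are all hypothetical, and the selectors simply mirror the ones used above.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

SITE_ROOT = "https://remoteok.io"  # assumed base for site-relative links


def text_or_default(parent, name, default="", **attrs):
    """Return the stripped text of the first matching tag, or `default`."""
    tag = parent.find(name, **attrs)
    return tag.get_text(strip=True) if tag is not None else default


def parse_job(row):
    """Build one job record from a row, tolerating missing elements."""
    link_tag = row.find("a", class_="preventLink")
    href = link_tag.get("href") if link_tag is not None else None
    return {
        "Job Title": text_or_default(row, "h2"),
        "Company Name": text_or_default(row, "h3"),
        "Location": text_or_default(row, "div", default="Remote", class_="location"),
        # urljoin resolves a site-relative href against the base URL
        "Apply Link": urljoin(SITE_ROOT, href) if href else None,
    }


# Invented row for demonstration; it has no location element
sample_row = BeautifulSoup(
    '<tr class="job"><td>'
    '<a class="preventLink" href="/remote-jobs/example-1"><h2> Backend Engineer </h2></a>'
    "<h3>Acme</h3>"
    "</td></tr>",
    "html.parser",
).find("tr")

sample_record = parse_job(sample_row)
print(sample_record)
```

Here the missing location falls back to "Remote" and the relative link is expanded to a full URL, while the title and company name are still captured.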


<h1>Convert Data to a Pandas DataFrame</h1>

Once the data is extracted, convert it into a structured format using a Pandas DataFrame.

In [36]:
df = pd.DataFrame(jobs)
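Before saving, it may be worth checking the frame for repeated postings, since the same job can appear more than once in a listing. The following is a small sketch on made-up records; treating the "Apply Link" column as the unique key is an assumption, not something the page guarantees.

```python
import pandas as pd

# Invented records; the second and third share the same link
sample_jobs = [
    {"Job Title": "Backend Engineer", "Company Name": "Acme",
     "Location": "Remote", "Apply Link": "https://remoteok.io/remote-jobs/example-1"},
    {"Job Title": "Data Engineer", "Company Name": "Globex",
     "Location": "Remote", "Apply Link": "https://remoteok.io/remote-jobs/example-2"},
    {"Job Title": "Data Engineer", "Company Name": "Globex",
     "Location": "Remote", "Apply Link": "https://remoteok.io/remote-jobs/example-2"},
]

sample_df = pd.DataFrame(sample_jobs)

# Keep the first row for each link and renumber the index from 0
deduped_df = sample_df.drop_duplicates(subset="Apply Link").reset_index(drop=True)

print(f"{len(sample_df)} rows before, {len(deduped_df)} rows after")
```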

<h1>Save Data to a CSV File</h1>

Save the extracted data to a CSV file for further analysis or sharing. Passing index=False leaves the DataFrame's row index out of the file.

In [37]:
df.to_csv("remote_jobs.csv", index=False)
print("Scraping completed. Dataset saved as 'remote_jobs.csv'.")

Scraping completed. Dataset saved as 'remote_jobs.csv'.
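The effect of `index=False` is easiest to see by writing a frame both ways and reading each result back. This sketch uses an in-memory buffer and invented data, so it does not touch `remote_jobs.csv`.

```python
import io

import pandas as pd

# Invented one-row frame for demonstration
sample_df = pd.DataFrame({"Job Title": ["Backend Engineer"], "Location": ["Remote"]})

# Default behaviour (index=True): the row index is written as an unnamed first column
with_index = io.StringIO()
sample_df.to_csv(with_index)
with_index.seek(0)
columns_with_index = pd.read_csv(with_index).columns.tolist()

# index=False: only the data columns are written
without_index = io.StringIO()
sample_df.to_csv(without_index, index=False)
without_index.seek(0)
columns_without_index = pd.read_csv(without_index).columns.tolist()

print(columns_with_index)     # ['Unnamed: 0', 'Job Title', 'Location']
print(columns_without_index)  # ['Job Title', 'Location']
```

When the index is written, `read_csv` labels the resulting header-less column "Unnamed: 0", which is why dropping it at save time keeps the reloaded file clean.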


<h1>Dataset</h1>

In [38]:
df.head(3)

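<table>
  <tr><th></th><th>Job Title</th><th>Company Name</th><th>Location</th><th>Apply Link</th></tr>
  <tr><td>0</td><td>Senior Backend Golang Engineer</td><td>Hippo Technologies</td><td>🌏 Worldwide</td><td>https://remoteok.io/remote-jobs/remote-senior-...</td></tr>
  <tr><td>1</td><td>Senior Backend Engineer</td><td>Composer Technologies</td><td>🌎 North America</td><td>https://remoteok.io/remote-jobs/remote-senior-...</td></tr>
  <tr><td>2</td><td>Full Stack Engineer</td><td>ResumeDive</td><td>🌏 Worldwide</td><td>https://remoteok.io/remote-jobs/remote-full-st...</td></tr>
</table>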
