## 🧪 Lab Task: Data Collection from RemoteOK

# As a data scientist, the first step in any project is to collect relevant, structured data. In this exercise, you will extract job-related information from the RemoteOK job website: [https://remoteok.com/r](https://remoteok.com/r).

### 🎯 Objectives:
# - Collect the following data fields for each job posting:
#   - **Company Name**
#   - **Job Role**
#   - **Location**
#   - **Features or Tags** (e.g., technologies, benefits, job type)

### 🛠 Instructions:
# - Use Python with libraries such as `requests` and `pandas`, plus `json` or `BeautifulSoup` if needed.
# - Retrieve the job data from the RemoteOK API or web page.
# - Parse the JSON or HTML response to extract the required fields.
# - Store the collected data in a structured format such as **CSV** for future analysis.

### 📦 Output:
# A CSV file (e.g., `remoteok_jobs.csv`) containing all extracted job listings with the specified fields.

# > ✅ This dataset will serve as the foundation for further data analysis and machine learning tasks in upcoming lab exercises.


# To run this script, install the libraries it imports:
# pip install requests pandas

import requests
import pandas as pd

api_url = "https://remoteok.com/api"
all_jobs_data = []
print("Fetching job data from the RemoteOK API...")

try:
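    # A timeout keeps the request from hanging indefinitely if the server stalls.
    response = requests.get(api_url, timeout=30)
    response.raise_for_status()
    # The first element is skipped; it is assumed to be API metadata rather than a job.
    job_listings = response.json()[1:]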
    for job in job_listings:
        # `or` also covers fields that are present but empty or null.
        company = job.get("company") or "N/A"
        role = job.get("position") or "N/A"
        location = job.get("location") or "Worldwide"
        tags_list = job.get("tags") or []
        tags = ", ".join(tags_list)

        all_jobs_data.append({
            "Company Name": company,
            "Job Role": role,
            "Location": location,
            "Tags": tags
        })
    print(f"Successfully collected {len(all_jobs_data)} job listings.")

except requests.exceptions.RequestException as e:
    print(f"An error occurred while fetching data: {e}")

if all_jobs_data:
    df = pd.DataFrame(all_jobs_data)
    df.to_csv('remoteok_jobs.csv', index=False, encoding='utf-8')
    print("Data successfully saved to remoteok_jobs.csv")
else:
    print("No data was collected. CSV file not created.")

print("\n--- Process Complete ---")

Fetching job data from the RemoteOK API...
Successfully collected 95 job listings.
Data successfully saved to remoteok_jobs.csv

--- Process Complete ---
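
The field-extraction step above can also be tried without a network call. The following is a minimal sketch, not the lab's reference solution: it pulls the per-job mapping into a hypothetical helper, `extract_job_fields`, and writes the rows with the standard-library `csv` module instead of pandas so that it runs with no extra installs. The two sample records are invented for illustration and are not real API output. The key names (`company`, `position`, `location`, `tags`) are the ones the script above reads.

```python
import csv
import io

# Column order used for the CSV header (same names as in the script above).
FIELDS = ["Company Name", "Job Role", "Location", "Tags"]


def extract_job_fields(job):
    """Map one raw job record (a dict) to the four CSV columns.

    Hypothetical helper: `or` is used so that missing, empty, and null
    values all fall back to the same default.
    """
    tags = job.get("tags") or []
    return {
        "Company Name": job.get("company") or "N/A",
        "Job Role": job.get("position") or "N/A",
        "Location": job.get("location") or "Worldwide",
        "Tags": ", ".join(str(tag) for tag in tags),
    }


def jobs_to_csv_text(jobs):
    """Return the CSV text for a list of raw job records."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(extract_job_fields(job) for job in jobs)
    return buffer.getvalue()


# Invented sample records (assumption: real records carry these keys).
sample_jobs = [
    {"company": "Acme", "position": "Data Engineer", "location": "", "tags": ["python", "sql"]},
    {"company": "Globex", "position": "ML Engineer", "location": "Europe", "tags": None},
]

csv_text = jobs_to_csv_text(sample_jobs)
print(csv_text)
```

Keeping the mapping in its own function makes the defaults easy to check against hand-written records before pointing the script at the live API.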
