
# In-Class Exercise 3: Dynamic Web Scraping (API-Based) (10 points)

## Scenario
**You are a data analyst. Your company wants a quick snapshot of the remote job market.**

## Target
- Website: https://remoteok.com  
- API endpoint (data source): https://remoteok.com/api


## Your Task (write code + write your answers)

Using Python, collect job data from the API and answer the following. *(Note: results may vary over time; grading is based on correct logic.)*

**Q1)** How many job postings are currently available? **(2 points)**

**Q2)** What are the top 3 most frequent companies? **(2 points)**

**Q3)** How many job titles contain the word **"Data"** (case-insensitive)? **(2 points)**

**Q4)** Create a pandas DataFrame using the **first 10** job postings (**exactly 10 rows**) with these columns: **(4 points)**


* `title`
* `company`
* `location` (if missing → `"Unknown"`)
* `salary` (if missing → `None`)
* `tags` (store as a comma-separated string)

## Requirements

* You may use ChatGPT.
* Do NOT use Selenium.
* Print your results clearly: `df.head(10)` + answers to Q1–Q3.



In [7]:
# Starter cell (optional)
import requests
import pandas as pd

API_URL = "https://remoteok.com/api"


In [8]:
# Write your solution below.
# Tip: the API response includes a metadata object plus job objects.
# Make sure you keep only real job postings.


import requests
import pandas as pd

API_URL = "https://remoteok.com/api"

# RemoteOK works best with a user-agent header
headers = {"User-Agent": "Mozilla/5.0"}

# Fetch API data (the timeout keeps the cell from hanging if the server stalls)
resp = requests.get(API_URL, headers=headers, timeout=30)
resp.raise_for_status()
data = resp.json()

# Keep ONLY real job postings:
# The API includes a metadata dict (usually first item), job posts have an "id" and typically a "position"
jobs = [item for item in data if isinstance(item, dict) and "id" in item and "position" in item]

# --------------------
# Q1) How many job postings?
# --------------------
q1_total_postings = len(jobs)

# --------------------
# Q2) Top 3 most frequent companies
# --------------------
# "or" also covers a company field that is present but empty or None
companies_series = pd.Series([job.get("company") or "Unknown" for job in jobs])
q2_top3_companies = companies_series.value_counts().head(3)

# --------------------
# Q3) How many job titles contain "Data" (case-insensitive)?
# --------------------
q3_data_titles = sum(
    1 for job in jobs
    if isinstance(job.get("position"), str) and "data" in job["position"].lower()
)

# --------------------
# Q4) DataFrame with first 10 job postings (exactly 10 rows)
# --------------------
first_10 = jobs[:10]  # 10 rows, provided the API returned at least 10 postings

rows = []
for job in first_10:
    title = job.get("position")
    company = job.get("company")
    location = job.get("location") or "Unknown"
    salary = job.get("salary") or None
    tags_list = job.get("tags", [])
    tags = ", ".join(tags_list) if isinstance(tags_list, list) else ""

    rows.append({
        "title": title,
        "company": company,
        "location": location,
        "salary": salary,
        "tags": tags
    })

df = pd.DataFrame(rows)

# --------------------
# Print results clearly
# --------------------
print("Q1) Total job postings currently available:", q1_total_postings)

print("\nQ2) Top 3 most frequent companies:")
for company, count in q2_top3_companies.items():
    print(f"   {company}: {count}")

print(f'\nQ3) Number of job titles containing "Data" (case-insensitive): {q3_data_titles}')

print("\nQ4) DataFrame (first 10 job postings):")
print(df.head(10))













Q1) Total job postings currently available: 95

Q2) Top 3 most frequent companies:
   Ubiminds: 3
   Versaterm: 3
   Anduril Industries: 3

Q3) Number of job titles containing "Data" (case-insensitive): 3

Q4) DataFrame (first 10 job postings):
                                             title               company  \
0  Senior Technical Program Manager Infrastructure                Planet   
1                              Web Program Manager              Huntress   
2        Senior Program Manager Workforce Planning                Fluxon   
3        Senior Principal SEO GEO & Search Systems              Workwize   
4       Senior Full Stack Software Engineer Growth             Circle.so   
5                 Enterprise Account Executive KSA               Dataiku   
6                    Strategic Partner Manager GSI        Armis Security   
7                           React Native Developer  Bluelight Consulting   
8                  Customer Service Representative        Wing Assistan
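
The filtering and counting logic above can also be checked offline, without calling the live API. The following is a minimal sketch under stated assumptions: `sample_payload` is made-up data that only mimics the "metadata object plus job objects" shape, the helper names `keep_jobs`, `count_titles_containing`, and `to_row` are hypothetical, and `collections.Counter` stands in for the pandas `value_counts()` call so the block needs only the standard library.

```python
from collections import Counter

# Hypothetical sample payload: one metadata dict followed by job dicts.
# All values are invented for illustration; they are not real API output.
sample_payload = [
    {"legal": "metadata entry, not a job"},
    {"id": 1, "position": "Data Engineer", "company": "Acme",
     "location": "", "tags": ["python", "sql"]},
    {"id": 2, "position": "Senior DATA Analyst", "company": "Acme",
     "location": "Remote", "tags": []},
    {"id": 3, "position": "Backend Developer", "company": "Globex",
     "tags": ["go"]},
]


def keep_jobs(payload):
    """Keep only dicts that look like job postings (have 'id' and 'position')."""
    return [
        item for item in payload
        if isinstance(item, dict) and "id" in item and "position" in item
    ]


def count_titles_containing(jobs, word):
    """Count jobs whose title contains `word`, ignoring case."""
    word = word.lower()
    return sum(
        1 for job in jobs
        if isinstance(job.get("position"), str) and word in job["position"].lower()
    )


def to_row(job):
    """Build one table row with the defaults the exercise asks for."""
    tags = job.get("tags", [])
    return {
        "title": job.get("position"),
        "company": job.get("company"),
        "location": job.get("location") or "Unknown",
        "salary": job.get("salary") or None,
        "tags": ", ".join(tags) if isinstance(tags, list) else "",
    }


sample_jobs = keep_jobs(sample_payload)
top_companies = Counter(
    job.get("company") or "Unknown" for job in sample_jobs
).most_common(3)
data_title_count = count_titles_containing(sample_jobs, "Data")
rows = [to_row(job) for job in sample_jobs[:10]]

print("Jobs kept:", len(sample_jobs))
print("Top companies:", top_companies)
print('Titles containing "Data":', data_title_count)
print("First row:", rows[0])
```

Passing `rows` to `pd.DataFrame(rows)` would give the same table layout as the solution cell. Keeping the logic in small functions makes it possible to test the filtering rules on fixed data, since the live results change over time.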

# In-Class Exercise 2 (10 points)

**In-Class Assignment — Week 3 (Lesson 3)**

Time: 20–30 minutes
Points: 10

**Instructions**

This is an individual in-class assignment.

Complete it during class time.

You may use Week 3 lecture notes / demo notebooks.

Write your code under each question and print the output.

Submit the GitHub link.

Q1 (4 points)
Write a Python program that prompts the user to enter two numbers and performs a division operation. Handle exceptions for both zero division and invalid (non-numeric) input. Display an appropriate error message for each type of exception and ensure the program does not crash due to these errors.

In [6]:


try:
    # float() raises ValueError on non-numeric text
    num1 = float(input("Enter the first number: "))
    num2 = float(input("Enter the second number: "))

    # Dividing by 0.0 raises ZeroDivisionError
    result = num1 / num2
    print("Result:", result)

except ZeroDivisionError:
    print("Error: Division by zero is not allowed.")

except ValueError:
    print("Error: Invalid input. Please enter numeric values only.")

except Exception as e:
    # Catch-all so the program never crashes on an unexpected error
    print("Unexpected error occurred:", e)



Enter the first number: 7
Enter the second number: 5
Result: 1.4
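
The run above only exercises the success path. One way to check both error branches without typing input by hand is to move the logic into a function that takes the raw text. This is a sketch, not part of the required answer; the function name `safe_divide` is hypothetical.

```python
def safe_divide(raw_a, raw_b):
    """Divide two values given as text; return a result or an error message."""
    try:
        result = float(raw_a) / float(raw_b)
    except ZeroDivisionError:
        return "Error: Division by zero is not allowed."
    except ValueError:
        return "Error: Invalid input. Please enter numeric values only."
    return f"Result: {result}"


print(safe_divide("7", "5"))      # success path
print(safe_divide("7", "0"))      # zero-division branch
print(safe_divide("seven", "5"))  # invalid-input branch
```

The interactive version can then call `safe_divide(input(...), input(...))` and print whatever comes back.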


Q2 (4 points)
Define a base class called `Vehicle` with attributes `make` and `model`. Create a derived class `Car` that inherits from `Vehicle` and has an additional attribute `num_doors`. Demonstrate an example of creating an instance of the `Car` class and accessing its attributes.

In [3]:



class Vehicle:
    def __init__(self, make, model):
        self.make = make
        self.model = model



class Car(Vehicle):
    def __init__(self, make, model, num_doors):
        super().__init__(make, model)
        self.num_doors = num_doors



my_car = Car("Honda", "Civic", 4)

print("Make:", my_car.make)
print("Model:", my_car.model)
print("Number of Doors:", my_car.num_doors)



Make: Honda
Model: Civic
Number of Doors: 4
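
To make the inheritance relationship itself visible, the same two classes can be extended with a method that the subclass overrides. This is an illustrative sketch only: the `describe` method is a hypothetical addition that the exercise does not ask for.

```python
class Vehicle:
    def __init__(self, make, model):
        self.make = make
        self.model = model

    def describe(self):
        # Hypothetical helper, not part of the exercise
        return f"{self.make} {self.model}"


class Car(Vehicle):
    def __init__(self, make, model, num_doors):
        super().__init__(make, model)  # reuse the base-class initializer
        self.num_doors = num_doors

    def describe(self):
        # Extend the base-class description instead of rewriting it
        return f"{super().describe()} ({self.num_doors} doors)"


my_car = Car("Honda", "Civic", 4)
print(isinstance(my_car, Vehicle))  # a Car is also a Vehicle
print(my_car.describe())
```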


Q3 (2 points)
Create a program that accepts a list of numbers as input and outputs a new list containing only the even numbers.

In [4]:



numbers = input("Enter numbers separated by spaces: ")

numbers_list = list(map(int, numbers.split()))

even_numbers = [num for num in numbers_list if num % 2 == 0]

print("Even numbers:", even_numbers)



Enter numbers separated by spaces: 6
Even numbers: [6]
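
The cell above raises `ValueError` if any token is not an integer. A possible variant, sketched here with a hypothetical function name `even_numbers_from_text`, skips bad tokens and reports them instead of stopping.

```python
def even_numbers_from_text(text):
    """Return (even integers, tokens that could not be parsed) from a string."""
    evens = []
    skipped = []
    for token in text.split():
        try:
            value = int(token)
        except ValueError:
            skipped.append(token)  # remember non-integer tokens
            continue
        if value % 2 == 0:
            evens.append(value)
    return evens, skipped


evens, skipped = even_numbers_from_text("6 7 -4 x 10")
print("Even numbers:", evens)
print("Skipped tokens:", skipped)
```

Whether to skip bad tokens or reject the whole input is a design choice; the exercise does not specify either.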
