# In-Class Exercise 3: Dynamic Web Scraping (API-Based) (10 points)

## Scenario
**You are a data analyst. Your company wants a quick snapshot of the remote job market.**

## Target
- Website: https://remoteok.com  
- API endpoint (data source): https://remoteok.com/api


## Your Task (write code + write your answers)

Using Python, collect job data from the API and answer the following. *(Note: results may vary over time; grading is based on correct logic.)*

**Q1)** How many job postings are currently available? **(2 points)**

**Q2)** What are the top 3 most frequent companies? **(2 points)**

**Q3)** How many job titles contain the word **"Data"** (case-insensitive)? **(2 points)**

**Q4)** Create a pandas DataFrame using the **first 10** job postings (**exactly 10 rows**) with these columns: **(4 points)**


* `title`
* `company`
* `location` (if missing → `"Unknown"`)
* `salary` (if missing → `None`)
* `tags` (store as a comma-separated string)

## Requirements

* You may use ChatGPT.
* Do NOT use Selenium.
* Print your results clearly: `df.head(10)` + answers to Q1–Q3.



In [None]:
# Starter cell (optional)
import requests
import pandas as pd

API_URL = "https://remoteok.com/api"


In [1]:
import requests
import pandas as pd

# API endpoint
url = "https://remoteok.com/api"

# Request data (RemoteOK requires a user-agent header)
headers = {"User-Agent": "Mozilla/5.0"}
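response = requests.get(url, headers=headers, timeout=30)
response.raise_for_status()  # fail early on an HTTP error instead of parsing an error body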

data = response.json()

# The first element is metadata, actual jobs start from index 1
jobs = data[1:]

# -------------------------
# Q1) Total number of jobs
# -------------------------
total_jobs = len(jobs)
print("Q1) Total job postings:", total_jobs)

# ---------------------------------
# Q2) Top 3 most frequent companies
# ---------------------------------
companies = [job.get("company") for job in jobs if job.get("company")]

top_3_companies = (
    pd.Series(companies)
    .value_counts()
    .head(3)
)

print("\nQ2) Top 3 most frequent companies:")
print(top_3_companies)

# -----------------------------------------
# Q3) Job titles containing "Data"
# -----------------------------------------
data_jobs_count = sum(
    1 for job in jobs
    if job.get("position") and "data" in job.get("position").lower()
)

print("\nQ3) Number of job titles containing 'Data':", data_jobs_count)

# -----------------------------------------
# Q4) DataFrame of first 10 job postings
# -----------------------------------------
first_10 = jobs[:10]

rows = []
for job in first_10:
    row = {
        "title": job.get("position"),
        "company": job.get("company"),
        "location": job.get("location") or "Unknown",   # missing/empty -> "Unknown"
        "salary": job.get("salary") or None,            # missing/empty -> None
        "tags": ", ".join(job.get("tags") or []),       # list -> comma-separated string
    }
    rows.append(row)

df = pd.DataFrame(rows)

print("\nQ4) DataFrame (first 10 rows):")
print(df.head(10))

Q1) Total job postings: 95

Q2) Top 3 most frequent companies:
Ubiminds              3
Versaterm             3
Anduril Industries    3
Name: count, dtype: int64

Q3) Number of job titles containing 'Data': 3

Q4) DataFrame (first 10 rows):
                                             title               company  \
0  Senior Technical Program Manager Infrastructure                Planet   
1                              Web Program Manager              Huntress   
2        Senior Program Manager Workforce Planning                Fluxon   
3        Senior Principal SEO GEO & Search Systems              Workwize   
4       Senior Full Stack Software Engineer Growth             Circle.so   
5                 Enterprise Account Executive KSA               Dataiku   
6                    Strategic Partner Manager GSI        Armis Security   
7                           React Native Developer  Bluelight Consulting   
8                  Customer Service Representative        Wing Assistant   
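
The Q4 field rules (missing location becomes `"Unknown"`, missing salary becomes `None`, tags are joined into one string) can be checked offline without calling the API. The following is a minimal sketch: the helper name `job_to_row` and both sample records are hypothetical and made up for illustration; only the field names `position`, `company`, `location`, `salary`, and `tags` come from the solution cell above.

```python
def job_to_row(job):
    """Map one raw job dict to the five columns required by Q4."""
    return {
        "title": job.get("position"),
        "company": job.get("company"),
        "location": job.get("location") or "Unknown",  # missing/empty -> "Unknown"
        "salary": job.get("salary") or None,           # missing/empty -> None
        "tags": ", ".join(job.get("tags") or []),      # list -> comma-separated string
    }


# Hypothetical sample records (not real postings from the API)
sample_jobs = [
    {"position": "Data Engineer", "company": "ExampleCo",
     "location": "", "tags": ["python", "sql"]},
    {"position": "Designer", "company": "SampleInc",
     "location": "Worldwide", "salary": "$50k", "tags": []},
]

sample_rows = [job_to_row(job) for job in sample_jobs]
for row in sample_rows:
    print(row)
```

Passing `sample_rows` to `pd.DataFrame(...)` would give the same column layout as the Q4 DataFrame, assuming pandas is installed.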


# In-Class Exercise 2 (10 points)

**In-Class Assignment — Week 3 (Lesson 3)**

Time: 20–30 minutes
Points: 10

**Instructions**

This is an individual in-class assignment.

Complete it during class time.

You may use Week 3 lecture notes / demo notebooks.

Write your code under each question and print the output.

Submit the GitHub link.

Q1 (4 points)
Write a Python program that prompts the user to enter two numbers and performs a division operation. Handle exceptions for both zero division and invalid (non-numeric) input. Display an appropriate error message for each type of exception and ensure the program does not crash due to these errors.

In [2]:
# Q1: Division with Exception Handling

try:
    # Prompt user for input
    num1 = float(input("Enter the first number: "))
    num2 = float(input("Enter the second number: "))

    # Perform division
    result = num1 / num2

    # Print result
    print("Result:", result)

except ZeroDivisionError:
    print("Error: Division by zero is not allowed.")

except ValueError:
    print("Error: Invalid input. Please enter numeric values only.")

except Exception as e:
    print("Unexpected error:", e)

print("Program finished without crashing.")

Enter the first number: 5
Enter the second number: 4
Result: 1.25
Program finished without crashing.
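
The run above only exercises the success path. One way to check the two error branches without typing at a prompt is to move the logic into a function that takes the raw text. This is a sketch under that assumption; `safe_divide` is a hypothetical name, not part of the original answer, and it returns the message instead of printing it.

```python
def safe_divide(first_text, second_text):
    """Return the quotient as a float, or an error message string."""
    try:
        return float(first_text) / float(second_text)
    except ZeroDivisionError:
        # Raised by the division when the second number is zero
        return "Error: Division by zero is not allowed."
    except ValueError:
        # Raised by float() when the text is not numeric
        return "Error: Invalid input. Please enter numeric values only."


print(safe_divide("5", "4"))     # success path
print(safe_divide("5", "0"))     # zero-division branch
print(safe_divide("five", "4"))  # invalid-input branch
```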


Q2 (4 points)
Define a base class called 'Vehicle' with attributes make and model. Create a derived class Car that inherits from Vehicle and has an additional attribute 'num_doors'. Demonstrate an example of creating an instance of the 'Car' class and accessing its attributes.

In [3]:
# Q2: Inheritance Example

# Base class
class Vehicle:
    def __init__(self, make, model):
        self.make = make
        self.model = model


# Derived class
class Car(Vehicle):
    def __init__(self, make, model, num_doors):
        # Call parent constructor
        super().__init__(make, model)
        self.num_doors = num_doors


# Create an instance of Car
my_car = Car("Toyota", "Camry", 4)

# Access and print attributes
print("Make:", my_car.make)
print("Model:", my_car.model)
print("Number of doors:", my_car.num_doors)

Make: Toyota
Model: Camry
Number of doors: 4
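
To confirm that `Car` really inherits from `Vehicle`, rather than just holding the same attribute names, one option is an `isinstance` check plus a method defined only on the base class. The sketch below repeats the two classes so it runs on its own; the `describe` method is a hypothetical addition for illustration and is not required by the question.

```python
class Vehicle:
    def __init__(self, make, model):
        self.make = make
        self.model = model

    def describe(self):
        # Hypothetical helper defined only on the base class
        return f"{self.make} {self.model}"


class Car(Vehicle):
    def __init__(self, make, model, num_doors):
        super().__init__(make, model)  # let Vehicle set make and model
        self.num_doors = num_doors


my_car = Car("Toyota", "Camry", 4)
print(isinstance(my_car, Vehicle))  # a Car is also a Vehicle
print(my_car.describe())            # method inherited from Vehicle
```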


Q3 (2 points)
Create a program that accepts a list of numbers as input and outputs a new list containing only the even numbers.

In [7]:
# Q3: Filter Even Numbers from a List

# Accept input from user (numbers separated by spaces)
user_input = input("Enter a list of numbers separated by spaces: ")

# Convert input string to a list of integers
numbers = [int(num) for num in user_input.split()]

# Create a new list with only even numbers
even_numbers = [num for num in numbers if num % 2 == 0]

# Print result
print("Even numbers:", even_numbers)

Enter a list of numbers separated by spaces: 1 4 8 5 2 7 4
Even numbers: [4, 8, 2, 4]
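
The cell above raises `ValueError` if any token is not a whole number. A possible variant, sketched here with the hypothetical helper name `filter_even`, keeps the same list-comprehension logic in a function and catches that error at the call site.

```python
def filter_even(text):
    """Parse space-separated integers and return only the even ones."""
    numbers = [int(token) for token in text.split()]
    return [num for num in numbers if num % 2 == 0]


print("Even numbers:", filter_even("1 4 8 5 2 7 4"))

try:
    filter_even("1 two 3")
except ValueError:
    # int("two") fails, so the whole input is rejected
    print("Error: please enter whole numbers only.")
```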
