
# In-Class Exercise 3: Dynamic Web Scraping (API-Based) (10 points)

## Scenario
**You are a data analyst. Your company wants a quick snapshot of the remote job market.**

## Target
- Website: https://remoteok.com  
- API endpoint (data source): https://remoteok.com/api


## Your Task (write code + write your answers)

Using Python, collect job data from the API and answer the following. *(Note: results may vary over time; grading is based on correct logic.)*

**Q1)** How many job postings are currently available? **(2 points)**

**Q2)** What are the top 3 most frequent companies? **(2 points)**

**Q3)** How many job titles contain the word **"Data"** (case-insensitive)? **(2 points)**

**Q4)** Create a pandas DataFrame using the **first 10** job postings (**exactly 10 rows**) with these columns: **(4 points)**


* `title`
* `company`
* `location` (if missing → `"Unknown"`)
* `salary` (if missing → `None`)
* `tags` (store as a comma-separated string)

## Requirements

* You may use ChatGPT.
* Do NOT use Selenium.
* Print your results clearly: `df.head(10)` + answers to Q1–Q3.



In [2]:
import requests
import pandas as pd
from collections import Counter

API_URL = "https://remoteok.com/api"

# Fetch data (the timeout keeps the request from hanging indefinitely)
response = requests.get(API_URL, timeout=30)
response.raise_for_status()
data = response.json()

# First element is metadata → skip it
jobs = data[1:]

# -----------------------------
# Q1: Number of job postings
# -----------------------------
q1_count = len(jobs)

# -----------------------------
# Q2: Top 3 most frequent companies
# -----------------------------
company_counts = Counter(job.get("company") for job in jobs if job.get("company"))
q2_top3 = company_counts.most_common(3)

# -----------------------------
# Q3: Job titles containing "Data" (case-insensitive)
# -----------------------------
q3_count = sum(
    "data" in (job.get("position") or "").lower()  # tolerate a missing or null title
    for job in jobs
)

# -----------------------------
# Q4: DataFrame of first 10 jobs
# -----------------------------
df = pd.DataFrame([
    {
        "title": job.get("position"),
        "company": job.get("company"),
        "location": job.get("location") or "Unknown",
        "salary": job.get("salary") or None,
        "tags": ", ".join(job.get("tags") or [])  # tolerate missing or null tags
    }
    for job in jobs[:10]
])

# -----------------------------
# Print results
# -----------------------------
print("Q1) Number of job postings:", q1_count)
print("\nQ2) Top 3 most frequent companies:")
for company, count in q2_top3:
    print(f"   {company}: {count}")

print("\nQ3) Number of job titles containing 'Data':", q3_count)

print("\nQ4) First 10 job postings DataFrame:")
print(df.head(10))


Q1) Number of job postings: 95

Q2) Top 3 most frequent companies:
   Anduril Industries: 3
   Versaterm: 3
   Ubiminds: 3

Q3) Number of job titles containing 'Data': 3

Q4) First 10 job postings DataFrame:
                                             title               company  \
0  Senior Technical Program Manager Infrastructure                Planet   
1                              Web Program Manager              Huntress   
2        Senior Program Manager Workforce Planning                Fluxon   
3        Senior Principal SEO GEO & Search Systems              Workwize   
4       Senior Full Stack Software Engineer Growth             Circle.so   
5                 Enterprise Account Executive KSA               Dataiku   
6                    Strategic Partner Manager GSI        Armis Security   
7                           React Native Developer  Bluelight Consulting   
8                  Customer Service Representative        Wing Assistant   
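9            Staff Software Engineer Platform Team               Wellhub

                   location salary  \
0             United States   None
1  United States of America   None
2             United States   None
3                 Amsterdam   None
4                    Remote   None
5              Saudi Arabia   None
6                    London   None
7       Panama City, Panama   None
8       Manila, Philippines   None
9                   Unknown   None

                                                tags
0               manager, technical, software, senior
1  web, manager, saas, security, game, senior, op...
2  manager, management, senior, operational, engi...
3                                  saas, seo, senior
4  software, growth, full-stack, senior, analytic...
5  software, management, analytics, sales, executive
6  infosec, manager, security, growth, management...
7  react, consulting, developer, ios, consultant,...
8  virtual assistant, security, support, software...
9                                 software, engineer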
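
In the Q2 result, all three listed companies have the same count, so other companies may be tied with them. `Counter.most_common(3)` cuts the list at three entries and returns equal counts in first-seen order. The sketch below is one possible way to also report every company tied with the third place. The helper name `top_with_ties` and the sample data are hypothetical and do not come from the API.

```python
from collections import Counter


def top_with_ties(counts, n=3):
    """Return the top-n entries plus any entries tied with the n-th count."""
    ranked = counts.most_common()  # all entries, highest count first
    if len(ranked) <= n:
        return ranked
    cutoff = ranked[n - 1][1]  # count of the n-th entry
    return [(name, count) for name, count in ranked if count >= cutoff]


# Hypothetical sample data for illustration only
sample = Counter(["A", "A", "B", "B", "C", "C", "D", "D", "E"])

print(sample.most_common(3))     # stops at three entries even though "D" is tied
print(top_with_ties(sample, 3))  # also includes "D"
```

Whether ties should be reported is a judgment call; the assignment only asks for the top 3.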

In [None]:
# Write your solution below.
# Tip: the API response includes a metadata object plus job objects.
# Make sure you keep only real job postings.
Q1) Number of job postings: 95

Q2) Top 3 most frequent companies:
   Anduril Industries: 3
   Versaterm: 3
   Ubiminds: 3

Q3) Number of job titles containing 'Data': 3

Q4) First 10 job postings DataFrame:
                                             title               company  \
0  Senior Technical Program Manager Infrastructure                Planet
1                              Web Program Manager              Huntress
2        Senior Program Manager Workforce Planning                Fluxon
3        Senior Principal SEO GEO & Search Systems              Workwize
4       Senior Full Stack Software Engineer Growth             Circle.so
5                 Enterprise Account Executive KSA               Dataiku
6                    Strategic Partner Manager GSI        Armis Security
7                           React Native Developer  Bluelight Consulting
8                  Customer Service Representative        Wing Assistant
9            Staff Software Engineer Platform Team               Wellhub

                   location salary  \
0             United States   None
1  United States of America   None
2             United States   None
3                 Amsterdam   None
4                    Remote   None
5              Saudi Arabia   None
6                    London   None
7       Panama City, Panama   None
8       Manila, Philippines   None
9                   Unknown   None

                                                tags
0               manager, technical, software, senior
1  web, manager, saas, security, game, senior, op...
2  manager, management, senior, operational, engi...
3                                  saas, seo, senior
4  software, growth, full-stack, senior, analytic...
5  software, management, analytics, sales, executive
6  infosec, manager, security, growth, management...
7  react, consulting, developer, ios, consultant,...
8  virtual assistant, security, support, software...
9                                 software, engineer
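
The tip about keeping only real job postings can also be handled without relying on the metadata object being first in the list. The sketch below is a minimal, offline illustration. It assumes a posting is any dict with a non-empty `position` field, and the names `is_job`, `job_to_row`, and the sample `payload` are hypothetical.

```python
def is_job(item):
    """Treat an item as a job posting only if it is a dict with a 'position' value."""
    return isinstance(item, dict) and bool(item.get("position"))


def job_to_row(job):
    """Build one table row, applying the defaults required by Q4."""
    return {
        "title": job.get("position"),
        "company": job.get("company"),
        "location": job.get("location") or "Unknown",  # missing or empty -> "Unknown"
        "salary": job.get("salary") or None,           # missing or empty -> None
        "tags": ", ".join(job.get("tags") or []),      # list -> comma-separated string
    }


# Hypothetical payload shaped like the response described above:
# one metadata object followed by job objects.
payload = [
    {"legal": "hypothetical metadata entry"},
    {"position": "Data Engineer", "company": "Acme", "location": "",
     "tags": ["python", "sql"]},
    {"position": "Designer", "company": "Beta", "location": "Berlin",
     "salary": "50k", "tags": None},
]

jobs = [item for item in payload if is_job(item)]
rows = [job_to_row(job) for job in jobs[:10]]  # at most the first 10 postings
data_titles = sum("data" in job["position"].lower() for job in jobs)

print(len(jobs), data_titles)
print(rows)
```

Passing `rows` to `pd.DataFrame(rows)` would produce the table requested in Q4.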















# In-Class Exercise 2 (10 points)

**In-Class Assignment — Week 3 (Lesson 3)**

Time: 20–30 minutes
Points: 10

**Instructions**

This is an individual in-class assignment.

Complete it during class time.

You may use Week 3 lecture notes / demo notebooks.

Write your code under each question and print the output.

Submit the GitHub link.

Q1 (4 points)
Write a Python program that prompts the user to enter two numbers and performs a division operation. Handle exceptions for both zero division and invalid input (non-numeric input). Display appropriate error messages for each type of exception and ensure the program does not crash due to these errors.

In [4]:
# write your answer here
try:
    num1 = float(input("Enter the first number: "))
    num2 = float(input("Enter the second number: "))
    result = num1 / num2
    print("Result:", result)
except ZeroDivisionError:
    print("Error: Division by zero is not allowed.")
except ValueError:
    print("Error: Please enter numeric values only.")


Enter the first number: 12
Enter the second number: 14
Result: 0.8571428571428571
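
The run above only exercises the success path. As a rough sketch, the same logic can be wrapped in a function that takes the two inputs as strings, so both error branches can be checked without typing at a prompt. The function name `divide_text` is hypothetical.

```python
def divide_text(first, second):
    """Parse two strings as numbers and divide them, returning a message instead of raising."""
    try:
        result = float(first) / float(second)
    except ZeroDivisionError:
        return "Error: Division by zero is not allowed."
    except ValueError:
        return "Error: Please enter numeric values only."
    return f"Result: {result}"


print(divide_text("12", "14"))     # success path
print(divide_text("12", "0"))      # zero-division branch
print(divide_text("twelve", "3"))  # invalid-input branch
```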


Q2 (4 points)
Define a base class called 'Vehicle' with attributes 'make' and 'model'. Create a derived class 'Car' that inherits from 'Vehicle' and has an additional attribute 'num_doors'. Demonstrate an example of creating an instance of the 'Car' class and accessing its attributes.

In [6]:
# write your answer here
class Vehicle:
    def __init__(self, make, model):
        self.make = make
        self.model = model


class Car(Vehicle):
    def __init__(self, make, model, num_doors):
        # Call the parent class constructor
        super().__init__(make, model)
        self.num_doors = num_doors


my_car = Car("Toyota", "Camry", 4)
print("Make:", my_car.make)
print("Model:", my_car.model)
print("Number of Doors:", my_car.num_doors)


Make: Toyota
Model: Camry
Number of Doors: 4
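
Besides reading attributes, the inheritance relationship itself can be checked. The sketch below repeats the two classes so it runs on its own and adds a `describe` method to the base class. That method is a hypothetical addition for illustration and is not part of the assignment.

```python
class Vehicle:
    def __init__(self, make, model):
        self.make = make
        self.model = model

    def describe(self):
        # Hypothetical helper: defined once on the base class, inherited by Car
        return f"{self.make} {self.model}"


class Car(Vehicle):
    def __init__(self, make, model, num_doors):
        super().__init__(make, model)  # let Vehicle set make and model
        self.num_doors = num_doors


my_car = Car("Toyota", "Camry", 4)
print(isinstance(my_car, Vehicle))  # a Car is also a Vehicle
print(issubclass(Car, Vehicle))
print(my_car.describe())            # method inherited from Vehicle
```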


Q3 (2 points)
Create a program that accepts a list of numbers as input and outputs a new list containing only the even numbers.

In [7]:
# write your answer here
# Take input from user
numbers = input("Enter numbers separated by spaces: ")

# Convert input string into a list of integers
numbers_list = list(map(int, numbers.split()))

# Create a new list for even numbers
even_numbers = []

for num in numbers_list:
    if num % 2 == 0:
        even_numbers.append(num)

print("Even numbers:", even_numbers)


Enter numbers separated by spaces: 1 2 3
Even numbers: [2]
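
The same filter can be written more compactly with a list comprehension. This is a minimal sketch with a hypothetical function name, `filter_even`, which takes the space-separated text directly so it can be run without a prompt.

```python
def filter_even(text):
    """Parse space-separated integers and keep only the even ones."""
    numbers = [int(token) for token in text.split()]
    return [n for n in numbers if n % 2 == 0]


print(filter_even("1 2 3"))
print(filter_even("4 -6 7 0"))
```

As in the loop version, non-integer input would raise a `ValueError` here, and the sketch does not handle it.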
