# SQL Practice 11/17/2022

## Q1

You are given a table of job postings. Write a query that returns the number of companies that have posted duplicate job listings. A duplicate job listing is one from the same company that has the same title and description as another listing.

Data Information

`job_listings` **Table**

Column Name | Type
------------|------
job_id | integer
company_id | integer
title | string
job_description | string

My solution is below:

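```SQL
WITH job_listings_rank AS (
  SELECT
    -- Number the rows within each (company, title, description) group
    ROW_NUMBER() OVER (PARTITION BY company_id, title, job_description) AS ranking,
    company_id,
    title,
    job_description
  FROM job_listings
  )

-- ranking = 2 exists only in groups with at least two identical postings;
-- DISTINCT keeps a company with several duplicated postings from being counted twice
SELECT COUNT(DISTINCT company_id) AS companies_w_duplicate_jobs
  FROM job_listings_rank
  WHERE ranking = 2
;
```

This will produce a table like the following:

|companies_w_duplicate_jobs|
|:------------------------:|
|3|

We use the `ROW_NUMBER` window function, partitioned by the three columns that define a duplicate (`company_id`, `title`, `job_description`), to number the rows within each group of identical postings. A row numbered 2 exists only when a company has posted the same title and description at least twice, and each group has exactly one such row. Filtering on `ranking = 2` therefore keeps one row per duplicated posting, and counting the distinct `company_id` values among those rows gives the number of companies, even when a single company has duplicated more than one posting.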
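
To sanity-check the logic on a small case, the sketch below builds a few made-up rows (all values are hypothetical, not from the original exercise) and counts the companies with a `GROUP BY` / `HAVING` formulation, as a cross-check that does not use a window function. It assumes a database that accepts multi-row `INSERT ... VALUES`, such as PostgreSQL or SQLite.

```SQL
-- Hypothetical sample data, made up for illustration only
CREATE TABLE job_listings (
  job_id INTEGER,
  company_id INTEGER,
  title TEXT,
  job_description TEXT
);

INSERT INTO job_listings (job_id, company_id, title, job_description) VALUES
  (1, 10, 'Data Analyst',  'Build dashboards'),
  (2, 10, 'Data Analyst',  'Build dashboards'),   -- duplicate of job 1
  (3, 10, 'Data Engineer', 'Maintain pipelines'),
  (4, 10, 'Data Engineer', 'Maintain pipelines'), -- second duplicated posting, same company
  (5, 20, 'Data Analyst',  'Build dashboards'),   -- same text, different company: not a duplicate
  (6, 30, 'Recruiter',     'Source candidates'),
  (7, 30, 'Recruiter',     'Source candidates'),  -- duplicate of job 6
  (8, 30, 'Recruiter',     'Source candidates');  -- third copy of job 6

-- Find every (company, title, description) group with more than one row,
-- then count how many distinct companies own at least one such group
SELECT COUNT(DISTINCT company_id) AS companies_w_duplicate_jobs
  FROM (
    SELECT company_id
      FROM job_listings
      GROUP BY company_id, title, job_description
      HAVING COUNT(*) > 1
  ) AS duplicated_groups
;
```

On these rows the query returns 2: companies 10 and 30 each have at least one repeated posting, and company 20 has none. Company 10 has two separate duplicated postings, so counting duplicated groups instead of distinct `company_id` values would give 3 here.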

## Q2

Given two tables described below, write a SQL query using a window function to show which candidates scored the highest from each college.

Data Information

`candidateColleges` **Table**

Column Name | Type
------------|-----
college_id | integer
candidate_name | string

`candidateInterviews` **Table**

Column Name | Type
------------|-----
interview_id | integer
candidate_name | string
interview_score | integer

An interview score is an integer between 1 and 5, inclusive.

My solution is below:

```SQL
WITH full_table AS (
    SELECT 
        candidateColleges.college_id,
        candidateColleges.candidate_name,
        candidateInterviews.interview_score,
        -- Rank candidates within each college, highest score first; tied scores share a rank
        RANK() OVER (PARTITION BY candidateColleges.college_id ORDER BY candidateInterviews.interview_score DESC) AS score_rank

        -- Inner join: a candidate with no interview would have a NULL score,
        -- which some databases sort first under DESC
        FROM candidateColleges JOIN candidateInterviews
            ON candidateColleges.candidate_name = candidateInterviews.candidate_name
    )

SELECT college_id, candidate_name
    FROM full_table
    WHERE score_rank = 1
;
```

We first create a new table that joins the two given tables, because we need a column from each: `college_id` from `candidateColleges` and `interview_score` from `candidateInterviews`. In that join we include the `RANK` window function, partitioning by `college_id` and ordering by `interview_score` in descending order, so that the highest interview score in each college gets rank 1. Candidates who tie for the highest score all receive rank 1.

Then we select the `college_id` and `candidate_name` from that joined table, keeping only the rows where the rank is 1.
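
To see how ties behave, here is a small sketch on made-up rows (the candidate names and scores are hypothetical, not from the original exercise). It shows `RANK` next to `ROW_NUMBER` for the same partition and ordering, assuming a database with window-function support such as PostgreSQL or SQLite 3.25+.

```SQL
-- Hypothetical sample data, made up for illustration only
CREATE TABLE candidateColleges (
    college_id INTEGER,
    candidate_name TEXT
);

CREATE TABLE candidateInterviews (
    interview_id INTEGER,
    candidate_name TEXT,
    interview_score INTEGER
);

INSERT INTO candidateColleges (college_id, candidate_name) VALUES
    (1, 'Ana'), (1, 'Ben'), (1, 'Cy'),
    (2, 'Dee'), (2, 'Eli');

INSERT INTO candidateInterviews (interview_id, candidate_name, interview_score) VALUES
    (101, 'Ana', 5),
    (102, 'Ben', 5),  -- ties with Ana for the top score in college 1
    (103, 'Cy',  3),
    (104, 'Dee', 4),
    (105, 'Eli', 2);

-- Compare the two numbering functions side by side
SELECT
    c.college_id,
    i.candidate_name,
    i.interview_score,
    RANK()       OVER (PARTITION BY c.college_id ORDER BY i.interview_score DESC) AS rank_value,
    ROW_NUMBER() OVER (PARTITION BY c.college_id ORDER BY i.interview_score DESC) AS row_number_value
    FROM candidateColleges AS c
    JOIN candidateInterviews AS i
        ON c.candidate_name = i.candidate_name
    ORDER BY c.college_id, i.interview_score DESC, i.candidate_name
;
```

With these rows, `RANK` assigns 1 to both Ana and Ben in college 1, since they tie at 5, so filtering on rank 1 returns both of them; Dee is the only rank-1 candidate in college 2. `ROW_NUMBER` gives 1 to only one of the tied candidates, chosen arbitrarily, which is a reason to prefer `RANK` when every top scorer should be reported.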