# 📌 [Data Science Skills – LinkedIn SQL Interview Question (DataLemur)](https://datalemur.com/questions/matching-skills)

**Question:**
Given a table of candidates and their skills, you're tasked with finding the candidates best suited for an open Data Science job. You want to find candidates who are proficient in **Python**, **Tableau**, and **PostgreSQL**.

Write a query to list the candidates who possess **all of the required skills** for the job. Sort the output by candidate ID in ascending order.

**Input Table: `candidates`**

| candidate\_id | skill      |
| ------------- | ---------- |
| 123           | Python     |
| 123           | Tableau    |
| 123           | PostgreSQL |
| 234           | R          |
| 234           | PowerBI    |
| 234           | SQL Server |
| 345           | Python     |
| 345           | Tableau    |

**Expected Output:**

| candidate\_id |
| ------------- |
| 123           |

---

```sql
-- SQL Query to find candidates with all required skills
SELECT candidate_id
FROM candidates
WHERE skill IN ('Python', 'Tableau', 'PostgreSQL')
GROUP BY candidate_id
HAVING COUNT(DISTINCT skill) = 3
ORDER BY candidate_id;
```
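To check the query against the sample data, here is a minimal sketch using Python's built-in `sqlite3` module. The in-memory SQLite database is an assumption made for illustration (DataLemur runs PostgreSQL); the query itself is unchanged. The `WHERE` clause keeps only the three required skills, so a candidate with exactly three distinct remaining skills must have all of them.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL here (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE candidates (candidate_id INTEGER, skill TEXT)")
conn.executemany(
    "INSERT INTO candidates VALUES (?, ?)",
    [
        (123, "Python"), (123, "Tableau"), (123, "PostgreSQL"),
        (234, "R"), (234, "PowerBI"), (234, "SQL Server"),
        (345, "Python"), (345, "Tableau"),
    ],
)

query = """
SELECT candidate_id
FROM candidates
WHERE skill IN ('Python', 'Tableau', 'PostgreSQL')
GROUP BY candidate_id
HAVING COUNT(DISTINCT skill) = 3
ORDER BY candidate_id;
"""

# Candidate 345 has only two of the three skills, and 234 has none of them.
matching_candidates = [row[0] for row in conn.execute(query)]
print(matching_candidates)  # [123]
```

`COUNT(DISTINCT skill)` rather than `COUNT(*)` guards against a candidate being listed twice with the same skill.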



# 📌 [SQL Page With No Likes – DataLemur SQL Interview Question](https://datalemur.com/questions/sql-page-with-no-likes)

**Question:**
Find the pages that **have never received any likes**.

**Input Tables:**

**`pages`**

| page\_id | page\_name    |
| -------- | ------------- |
| 1        | SQL Solutions |
| 2        | TikTok Ads    |
| 3        | Data Science  |

**`page_likes`**

| user\_id | page\_id |
| -------- | -------- |
| 101      | 1        |
| 102      | 2        |
| 103      | 2        |

**Expected Output:**

| page\_id | page\_name   |
| -------- | ------------ |
| 3        | Data Science |

---

## ✅ Solution 1: `LEFT JOIN ... WHERE IS NULL` (Most Common)

```sql
SELECT p.page_id, p.page_name
FROM pages p
LEFT JOIN page_likes pl ON p.page_id = pl.page_id
WHERE pl.page_id IS NULL;
```

📌 **Best For:**

* **Datasets of any size**, provided the join column is indexed
* Simple and easy to read
* Efficient if `page_likes` has an index on `page_id`
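The pattern is easier to follow when the intermediate join result is visible. The sketch below, which assumes an in-memory SQLite database as a stand-in for PostgreSQL, first prints the raw `LEFT JOIN` output and then applies the `IS NULL` filter.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL here (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pages (page_id INTEGER, page_name TEXT);
CREATE TABLE page_likes (user_id INTEGER, page_id INTEGER);
INSERT INTO pages VALUES (1, 'SQL Solutions'), (2, 'TikTok Ads'), (3, 'Data Science');
INSERT INTO page_likes VALUES (101, 1), (102, 2), (103, 2);
""")

# Step 1: the raw LEFT JOIN keeps every page; a page with no likes
# gets NULL (None in Python) in the columns that come from page_likes.
joined = conn.execute("""
    SELECT p.page_id, p.page_name, pl.user_id
    FROM pages p
    LEFT JOIN page_likes pl ON p.page_id = pl.page_id
    ORDER BY p.page_id, pl.user_id
""").fetchall()
for row in joined:
    print(row)
# (1, 'SQL Solutions', 101)
# (2, 'TikTok Ads', 102)
# (2, 'TikTok Ads', 103)
# (3, 'Data Science', None)

# Step 2: filtering on the NULL side keeps only the unmatched pages.
unliked = conn.execute("""
    SELECT p.page_id, p.page_name
    FROM pages p
    LEFT JOIN page_likes pl ON p.page_id = pl.page_id
    WHERE pl.page_id IS NULL
""").fetchall()
print(unliked)  # [(3, 'Data Science')]
```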

---

## ✅ Solution 2: `NOT IN` Subquery

```sql
SELECT page_id, page_name
FROM pages
WHERE page_id NOT IN (
    -- DISTINCT is unnecessary here: NOT IN only tests membership
    SELECT page_id FROM page_likes
);
```

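⚠️ **Caution:**

* Returns **no rows at all** if `page_likes.page_id` contains even one NULL, because `x NOT IN (..., NULL)` evaluates to NULL (unknown) instead of TRUE
* Often slower on **large datasets**, since these NULL semantics can prevent the planner from rewriting the subquery as an anti-join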

📌 **Best For:**

* **Small datasets only**
* Avoid if `page_likes` is large or has NULLs in `page_id`
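The NULL pitfall is easy to reproduce. The sketch below assumes an in-memory SQLite database as a stand-in for PostgreSQL and adds one hypothetical like row whose `page_id` is NULL (it is not part of the original sample data). It then compares the plain `NOT IN` query with a version that filters NULLs out of the subquery.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL here (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pages (page_id INTEGER, page_name TEXT);
CREATE TABLE page_likes (user_id INTEGER, page_id INTEGER);
INSERT INTO pages VALUES (1, 'SQL Solutions'), (2, 'TikTok Ads'), (3, 'Data Science');
INSERT INTO page_likes VALUES (101, 1), (102, 2), (103, 2);
INSERT INTO page_likes VALUES (104, NULL);  -- hypothetical row, added to trigger the pitfall
""")

# With a NULL in the subquery, "3 NOT IN (1, 2, 2, NULL)" is unknown, not true,
# so even the page without likes is filtered out.
not_in_rows = conn.execute("""
    SELECT page_id, page_name
    FROM pages
    WHERE page_id NOT IN (SELECT page_id FROM page_likes)
""").fetchall()
print(not_in_rows)  # []

# Excluding NULLs inside the subquery restores the expected answer.
guarded_rows = conn.execute("""
    SELECT page_id, page_name
    FROM pages
    WHERE page_id NOT IN (
        SELECT page_id FROM page_likes WHERE page_id IS NOT NULL
    )
""").fetchall()
print(guarded_rows)  # [(3, 'Data Science')]
```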

---

## ✅ Solution 3: `NOT EXISTS` Subquery

```sql
SELECT page_id, page_name
FROM pages p
WHERE NOT EXISTS (
    SELECT 1
    FROM page_likes pl
    WHERE p.page_id = pl.page_id
);
```

📌 **Best For:**

* **Large datasets**
* Usually performs better than `NOT IN`, and on par with or better than `LEFT JOIN ... IS NULL`, depending on the query planner
* Efficient with an index on `page_likes.page_id`
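`NOT EXISTS` stays correct even when the like table contains NULLs, because the correlated comparison `p.page_id = pl.page_id` simply never matches a NULL row. A minimal sketch, again assuming an in-memory SQLite database in place of PostgreSQL and one hypothetical like row with a NULL `page_id`:

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL here (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pages (page_id INTEGER, page_name TEXT);
CREATE TABLE page_likes (user_id INTEGER, page_id INTEGER);
INSERT INTO pages VALUES (1, 'SQL Solutions'), (2, 'TikTok Ads'), (3, 'Data Science');
INSERT INTO page_likes VALUES (101, 1), (102, 2), (103, 2);
INSERT INTO page_likes VALUES (104, NULL);  -- hypothetical row with a NULL page_id
""")

# The NULL row matches no page, so it has no effect on the result.
unliked_pages = conn.execute("""
    SELECT page_id, page_name
    FROM pages p
    WHERE NOT EXISTS (
        SELECT 1
        FROM page_likes pl
        WHERE p.page_id = pl.page_id
    )
""").fetchall()
print(unliked_pages)  # [(3, 'Data Science')]
```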

---

## 🔍 Which Is Best?

| Query Type          | Small Data | Large Data | Handles NULLs | Readability | Performance |
| ------------------- | ---------- | ---------- | ------------- | ----------- | ----------- |
| LEFT JOIN + IS NULL | ✅          | ✅          | ✅             | ✅           | ✅✅          |
| NOT IN              | ✅          | ❌          | ❌             | ✅           | ❌           |
| NOT EXISTS          | ✅          | ✅✅         | ✅             | ✅           | ✅✅✅         |

---

**👉 Recommended for Production:** Use `NOT EXISTS` or `LEFT JOIN ... IS NULL` with proper indexing on `page_likes.page_id`.


# 📌 [Tesla Unfinished Parts – DataLemur SQL Interview Question](https://datalemur.com/questions/tesla-unfinished-parts)

**Question:**
Tesla is analyzing its parts supply. Each part has a `part_id` and may or may not have a `finish_date` depending on whether the part has been completed. Your task is to find the part IDs of **parts that have not been finished yet**.

**Input Table: `parts_assembly`**

| part\_id | finish\_date |
| -------- | ------------ |
| 1001     | 2022-07-01   |
| 1002     | NULL         |
| 1003     | NULL         |
| 1004     | 2022-07-15   |

**Expected Output:**

| part\_id |
| -------- |
| 1002     |
| 1003     |

---

## ✅ Solution 1: `WHERE finish_date IS NULL`

```sql
SELECT part_id
FROM parts_assembly
WHERE finish_date IS NULL;
```

📌 **Best For:**

* Simple, clear, and very efficient
* Optimal for both **small** and **large datasets**
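A common mistake is to write `finish_date = NULL` instead of `finish_date IS NULL`. The sketch below, which assumes an in-memory SQLite database as a stand-in for PostgreSQL, runs both forms on the sample data. Comparing anything to NULL with `=` yields NULL (unknown), so that version matches nothing.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL here (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts_assembly (part_id INTEGER, finish_date TEXT)")
conn.executemany(
    "INSERT INTO parts_assembly VALUES (?, ?)",
    [
        (1001, "2022-07-01"),
        (1002, None),  # None is stored as SQL NULL
        (1003, None),
        (1004, "2022-07-15"),
    ],
)

# Correct: IS NULL tests for the absence of a value.
unfinished = [
    row[0]
    for row in conn.execute(
        "SELECT part_id FROM parts_assembly WHERE finish_date IS NULL ORDER BY part_id"
    )
]
print(unfinished)  # [1002, 1003]

# Incorrect: "= NULL" is never true, so no rows come back.
equals_null = conn.execute(
    "SELECT part_id FROM parts_assembly WHERE finish_date = NULL"
).fetchall()
print(equals_null)  # []
```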

---


| Query Type            | Simple | Fast on Small Data | Fast on Large Data | Readable | Recommended |
| --------------------- | ------ | ------------------ | ------------------ | -------- | ----------- |
| `WHERE IS NULL`       | ✅✅     | ✅✅                 | ✅✅✅                | ✅✅       | ✅✅✅         |



# 📌 [Laptop vs Mobile Viewership – DataLemur SQL Interview Question](https://datalemur.com/questions/laptop-mobile-viewership)

**Question:**
Given a table tracking user viewership by device type (`laptop`, `tablet`, or `phone`), write a query to calculate:

* The total number of views from **laptops** as `laptop_views`
* The total number of views from **mobile devices** (defined as `tablet` + `phone`) as `mobile_views`

**Input Table: `viewership`**

| user\_id | device\_type | view\_time          |
| -------- | ------------ | ------------------- |
| 123      | tablet       | 01/02/2022 00:00:00 |
| 125      | laptop       | 01/07/2022 00:00:00 |
| 128      | laptop       | 02/09/2022 00:00:00 |
| 129      | phone        | 02/09/2022 00:00:00 |
| 145      | tablet       | 02/24/2022 00:00:00 |

**Expected Output:**

| laptop\_views | mobile\_views |
| ------------- | ------------- |
| 2             | 3             |

---
## ✅ SQL Solution:

```sql
SELECT
  COUNT(CASE WHEN device_type = 'laptop' THEN 1 END) AS laptop_views,
  COUNT(CASE WHEN device_type IN ('tablet', 'phone') THEN 1 END) AS mobile_views
FROM viewership;
```

---

* ✅ **Best For:** All dataset sizes; simple, efficient, and readable.
* ✅ Uses `CASE WHEN` inside `COUNT()` for concise conditional aggregation.
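The conditional count works because a `CASE` expression with no `ELSE` branch yields NULL for non-matching rows, and `COUNT()` ignores NULLs. The sketch below assumes an in-memory SQLite database as a stand-in for PostgreSQL. It runs the query on the sample data alongside an equivalent `SUM(CASE ... ELSE 0 END)` form, shown only for comparison.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL here (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE viewership (user_id INTEGER, device_type TEXT, view_time TEXT)"
)
conn.executemany(
    "INSERT INTO viewership VALUES (?, ?, ?)",
    [
        (123, "tablet", "01/02/2022 00:00:00"),
        (125, "laptop", "01/07/2022 00:00:00"),
        (128, "laptop", "02/09/2022 00:00:00"),
        (129, "phone", "02/09/2022 00:00:00"),
        (145, "tablet", "02/24/2022 00:00:00"),
    ],
)

# COUNT skips the NULLs produced by non-matching rows.
count_result = conn.execute("""
    SELECT
      COUNT(CASE WHEN device_type = 'laptop' THEN 1 END) AS laptop_views,
      COUNT(CASE WHEN device_type IN ('tablet', 'phone') THEN 1 END) AS mobile_views
    FROM viewership
""").fetchone()
print(count_result)  # (2, 3)

# Equivalent form: add 1 for matching rows and 0 for the rest.
sum_result = conn.execute("""
    SELECT
      SUM(CASE WHEN device_type = 'laptop' THEN 1 ELSE 0 END) AS laptop_views,
      SUM(CASE WHEN device_type IN ('tablet', 'phone') THEN 1 ELSE 0 END) AS mobile_views
    FROM viewership
""").fetchone()
print(sum_result)  # (2, 3)
```

One difference between the two forms: on an empty table `COUNT` returns 0, while `SUM` returns NULL.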

