Skip to content

vioxcd/sql-interviews-prep

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

20 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Doing SQL interview prep in various platforms

Mostly randomly doing problems here and there, and sometimes following this DE study guide

Note: this README is generated

Table of Contents

Easy

Stratascratch | Salaries Difference

Question: Write a query that calculates the difference between the highest salaries found in the marketing and engineering departments. Output just the absolute difference in salaries.

select
    abs(
        max(case when d.department = 'marketing' then e.salary else 0 end)
		- max(case when d.department = 'engineering' then e.salary else 0 end)
   ) salary_diff
from db_employee e
join db_dept d
	on e.department_id = d.id

Result

Medium

Leetcode | Consecutive Numbers

Question: Write an SQL query to find all numbers that appear at least three times consecutively. Return the result table in any order.

WITH stats AS (
  SELECT num,
         min(num) OVER three_consecutives AS min_num,
         max(num) OVER three_consecutives AS max_num,
         sum(num) OVER three_consecutives AS sum_num
  FROM Logs
  WINDOW three_consecutives AS
    (ORDER BY id ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
)
SELECT DISTINCT num AS ConsecutiveNums
FROM stats
WHERE min_num = max_num AND (num * 3) = sum_num

Result

Leetcode | Department Highest Salary

Question: Write an SQL query to find employees who have the highest salary in each of the departments. Return the result table in any order.

WITH salary_ranked AS (
  SELECT e.id
         , rank() OVER (PARTITION BY e.departmentId ORDER BY e.salary DESC) AS salary_rank
  FROM Employee e
)
SELECT d.name AS Department
       , e.name AS Employee
       , e.salary AS Salary
FROM Employee e
INNER JOIN Department d
  ON e.departmentId = d.id
INNER JOIN salary_ranked sr
  ON e.id = sr.id
WHERE sr.salary_rank = 1
ORDER BY e.name

Result

Leetcode | Exchange Seats

Question: (Broken test case). Write an SQL query to swap the seat id of every two consecutive students. If the number of students is odd, the id of the last student is not swapped. Return the result table ordered by id in ascending order.

with
id_mod_id as (
  select id, student, id + mod(id, 2) as cc
  from seat
)
select
  row_number() over (order by cc, id desc) as id, student
from id_mod_id

Result

Leetcode | Game Play Analysis Iv

Question: Write an SQL query to report the fraction of players that logged in again on the day after the day they first logged in, rounded to 2 decimal places. In other words, you need to count the number of players that logged in for at least two consecutive days starting from their first login date, then divide that number by the total number of players.

with
earliest_login as (
    select
        player_id,
        date(str_to_date(min(event_date), '%Y-%m-%d')) as event_date
    from activity
    group by 1
)

select
    round(count(1) / (select count(1) from earliest_login), 2) as fraction
from earliest_login a1
join activity a2
    on a1.player_id = a2.player_id
        and a1.event_date = a2.event_date - INTERVAL 1 DAY

Result

Leetcode | Nth Highest Salary

Question: Write an SQL query to report the nth highest salary from the Employee table. If there is no nth highest salary, the query should report null.

CREATE FUNCTION getNthHighestSalary(N INT) RETURNS INT
BEGIN
  RETURN (
      select distinct salary
      from (
          select
            salary,
            dense_rank() over (order by salary desc) as row_num
          from
            employee
     ) as t
      where row_num = N
 );
END

Result

Leetcode | Rank Scores

Question: Write an SQL query to rank the scores. The ranking should be calculated according to the following rules: 1) The scores should be ranked from the highest to the lowest. 2) If there is a tie between two scores, both should have the same ranking. 3) After a tie, the next ranking number should be the next consecutive integer value. In other words, there should be no holes between ranks. Return the result table ordered by score in descending order.

SELECT score,
       dense_rank() OVER (ORDER BY score DESC) AS "rank"
FROM Scores

Result

Leetcode | Second Highest Salary

Question: Write an SQL query to report the second highest salary from the Employee table. If there is no second highest salary, the query should report null.

SELECT MAX(salary) AS "SecondHighestSalary"
FROM Employee
WHERE salary NOT IN (
  SELECT MAX(salary)
  FROM Employee
)

Result

Stratascratch | Acceptance Rate By Date

Question: What is the overall friend acceptance rate by date? Your output should have the rate of acceptances by the date the request was sent. Order by the earliest date to latest. Assume that each friend request starts by a user sending (i.e., user_id_sender) a friend request to another user (i.e., user_id_receiver) that's logged in the table with action = 'sent'. If the request is accepted, the table logs action = 'accepted'. If the request is not accepted, no record of action = 'accepted' is logged.

with
count_per_date AS (
    select
        date,
        sum(
            case when action = 'sent' then 1 else 0 end
       ) as fr_sent,
        sum(
            case when action = 'sent' and lead is not null then 1 else 0 end
       ) as fr_accepted
    from (
        select
            *,
            lead(action) over (partition by user_id_sender, user_id_receiver)
        from fb_friend_requests fr
   ) t
    group by date
)
select
    date,
    fr_accepted::float / fr_sent::float as percentage_acceptance
from count_per_date
where fr_accepted > 0
order by date asc

Result

Stratascratch | Customer Revenue In March

Question: Calculate the total revenue from each customer in March 2019. Include only customers who were active in March 2019. Output the revenue along with the customer id and sort the results based on the revenue in descending order.

select
    cust_id
    ,sum(total_order_cost) as revenue
from orders
where extract(month from order_date) = 3
group by 1
order by 2 desc

Result

Stratascratch | Finding User Purchases

Question: Write a query that'll identify returning active users. A returning active user is a user that has made a second purchase within 7 days of any other of their purchases. Output a list of user_ids of these returning active users.

with time_lag as (
    select user_id,
        case when(created_at - LAG(created_at)
            over(partition by user_id order by created_at) <= 7)
            then True
            else False
        end as is_returning
    from amazon_transactions
    order by user_id, created_at
)
select distinct user_id
from time_lag
where is_returning
order by 1

Result

Stratascratch | Highest Cost Orders

Question: Find the customer with the highest daily total order cost between 2019-02-01 to 2019-05-01. If customer had more than one order on a certain day, sum the order costs on daily basis. Output customer's first name, total cost of their items, and the date. For simplicity, you can assume that every first name in the dataset is unique.

with
calculated_order_cost as (
    select
        c.first_name
        ,o.order_date
        ,sum(o.total_order_cost) as total_order_cost
    from customers c
    join orders o
        on c.id = o.cust_id
    where o.order_date between '2019-02-01' and '2019-05-01'
    group by c.first_name, o.order_date
)

select
    first_name
    ,total_order_cost as highest_total_order_cost
    ,order_date
from calculated_order_cost
where total_order_cost = (select max(total_order_cost) from calculated_order_cost)

Result

Stratascratch | Highest Energy Consumption

Question: Find the date with the highest total energy consumption from the Meta/Facebook data centers. Output the date along with the total energy consumption across all data centers.

with
all_region as (
    select
        date,
        sum(consumption) as total_consumption
    from (
        select * from fb_eu_energy
        union all
        select * from fb_asia_energy
        union all
        select * from fb_na_energy
   ) t
    group by date
)
select
    date,
    total_consumption as highest_consumption
from all_region
where total_consumption = (select max(total_consumption) from all_region)

Result

Stratascratch | Most Profitable Companies

Question: Find the 3 most profitable companies in the entire world. Output the result along with the corresponding company name. Sort the result based on profits in descending order.

select company, sum(profits) as profits
from forbes_global_2010_2014
group by 1
order by 2 desc
limit 3

Result

Stratascratch | Number Of Violations

Question: You're given a dataset of health inspections. Count the number of violation in an inspection in 'Roxanne Cafe' for each year. If an inspection resulted in a violation, there will be a value in the 'violation_id' column. Output the number of violations by year in ascending order.

select
    extract(year from inspection_date) as year,
    count(violation_id) as number_of_violations
from sf_restaurant_health_violations
where business_name = 'Roxanne Cafe'
group by 1
order by 1 asc

Result

Stratascratch | Reviews Of Categories

Question: Find the top business categories based on the total number of reviews. Output the category along with the total number of reviews. Order by total reviews in descending order.

;select
    unnest(string_to_array(categories, ';')) as category
    ,sum(review_count) as category_review_count
from yelp_business
group by 1
order by 2 desc

Result

Stratascratch | Users By Average Session Time

Question: Calculate each user's average session time. A session is defined as the time difference between a page_load and page_exit. For simplicity, assume a user has only 1 session per day and if there are multiple of the same events on that day, consider only the latest page_load and earliest page_exit, with an obvious restriction that load time event should happen before exit time event . Output the user_id and their average session time.

WITH user_page_load AS (
    SELECT user_id,
        DATE(timestamp),
        MAX(timestamp) last_page_load
    FROM facebook_web_log
    WHERE action = 'page_load'
    GROUP BY 1, 2
), user_page_exit AS (
    SELECT user_id,
        DATE(timestamp),
        MAX(timestamp) AS last_page_exit
    FROM facebook_web_log
    WHERE action = 'page_exit'
    GROUP BY 1, 2
)
SELECT user_id, AVG(last_page_exit - last_page_load)
FROM user_page_load
JOIN user_page_exit
USING (user_id, date)
GROUP BY user_id

Result

Stratascratch | Workers With The Highest Salaries

Question: You have been asked to find the job titles of the highest-paid employees. Your output should include the highest-paid title or multiple titles with the same salary.

select
	worker_title as best_paid_title
from worker
join title
	on worker.worker_id = title.worker_ref_id
where salary = (
    select max(salary)
    from worker
)

Result

Hackerrank | Binary Search Tree 1

Question: You are given a table, BST, containing two columns: N and P, where N represents the value of a node in Binary Tree, and P is the parent of N. Write a query to find the node type of Binary Tree ordered by the value of the node. Output one of the following for each node: Root: If node is root node. Leaf: If node is leaf node. Inner: If node is neither root nor leaf node.

select distinct
    bst1.n,
    case
        when bst1.p is null then 'Root'
        when bst2.p is null then 'Leaf'
        else 'Inner'
    end as node_type
from bst bst1
left join bst bst2
    on bst1.n = bst2.p
order by bst1.n

Result

Hackerrank | Challenges

Question: Write a query to print the hacker_id, name, and the total number of challenges created by each student. Sort your results by the total number of challenges in descending order. If more than one student created the same number of challenges, then sort the result by hacker_id. If more than one student created the same number of challenges and the count is less than the maximum number of challenges created, then exclude those students from the result.

with
calculated_hacker_challenges as (
    select
        hacker_id, name, count(1) as total_challenges
    from hackers h
    join challenges c
        using (hacker_id)
    group by 1, 2
),

grouped_total_challenges as (
    select total_challenges
    from calculated_hacker_challenges
    where total_challenges < (
        select max(total_challenges) from calculated_hacker_challenges
   )
    group by total_challenges
    having count(1) > 1
)

select *
from calculated_hacker_challenges
where
    total_challenges not in (
        select total_challenges from grouped_total_challenges
   )
order by total_challenges desc, hacker_id asc

Result

Hackerrank | Harry Potter And Wands

Question: Hermione decides the best way to choose is by determining the minimum number of gold galleons needed to buy each non-evil wand of high power and age. Write a query to print the id, age, coins_needed, and power of the wands that Ron's interested in, sorted in order of descending power. If more than one wand has same power, sort the result in order of descending age.

select
    w.id, wp.age, w.coins_needed, w.power
from wands w
left join wands_property wp
    using(code)
where wp.is_evil = 0
    and w.coins_needed = (
        select min(w1.coins_needed)
        from wands w1
        where w1.code = w.code
            and w1.power = w.power
   )
order by w.power desc, wp.age desc

Result

Hackerrank | Occupations

Question: Pivot the Occupation column in OCCUPATIONS so that each Name is sorted alphabetically and displayed underneath its corresponding Occupation. The output column headers should be Doctor, Professor, Singer, and Actor, respectively. Note: Print NULL when there are no more names corresponding to an occupation.

SELECT
  MAX(CASE WHEN Occupation = 'Doctor' THEN Name ELSE NULL END) AS Doctor,
  MAX(CASE WHEN Occupation = 'Professor' THEN Name ELSE NULL END) AS Professor,
  MAX(CASE WHEN Occupation = 'Singer' THEN Name ELSE NULL END) AS Singer,
  MAX(CASE WHEN Occupation = 'Actor' THEN Name ELSE NULL END) AS Actor
FROM
  (SELECT
    Occupation,
    Name,
    ROW_NUMBER() OVER (PARTITION BY Occupation ORDER BY Name) rn
   FROM OCCUPATIONS
 ) o
GROUP BY rn

Result

Hackerrank | Placements

Question: You are given three tables: Students, Friends and Packages. Students contains two columns: ID and Name. Friends contains two columns: ID and Friend_ID (ID of the ONLY best friend). Packages contains two columns: ID and Salary. Write a query to output the names of those students whose best friends got offered a higher salary than them. Names must be ordered by the salary amount offered to the best friends. It is guaranteed that no two students got same salary offer.

select s.name
from students s  /* the student's name */
join friends f  /* their friends' id */
    on s.id = f.id
join packages ps  /* the student's salary */
    on s.id = ps.id
join packages pf  /* the student's friends' salary */
    on f.friend_id = pf.id
where pf.salary > ps.salary
order by pf.salary

Result

Hackerrank | Print Prime Numbers

Question: Write a query to print all prime numbers less than or equal to 1000. Print your result on a single line, and use the ampersand (DESCRIPTION) character as your separator (instead of a space).

/* use Oracle */
with
x as (
    select level + 1 x
    from dual
    connect by level <= 1000
)
select

    listagg(x.x, 'QUERY') within group (order by x.x) as prime_numbers
from x
where not exists (
    select 1 from x y
    where x.x > y.x and remainder( x.x, y.x) = 0
)

Result

Hackerrank | Symmetric Pairs

Question: You are given a table, Functions, containing two columns: X and Y. Two pairs (X1, Y1) and (X2, Y2) are said to be symmetric pairs if X1 = Y2 and X2 = Y1. Write a query to output all such symmetric pairs in ascending order by the value of X. List the rows such that X1 ≤ Y1.

with
with_id as (
    select
        *,
        row_number() over () as id
    from functions
)
select distinct least(f1.X, f2.X) as X, greatest(f2.X, f2.Y) as Y
from with_id f1
join with_id f2
    on f1.X = f2.Y
        and f1.Y = f2.X
        and f1.id != f2.id
order by 1, 2

Result

Hackerrank | The Pads

Question: Generate the following two result sets: 1) Query an alphabetically ordered list of all names in OCCUPATIONS, immediately followed by the first letter of each profession as a parenthetical (i.e.: enclosed in parentheses). For example: AnActorName(A), ADoctorName(D), AProfessorName(P), and ASingerName(S). 2) Query the number of ocurrences of each occupation in OCCUPATIONS. Sort the occurrences in ascending order, and output them in the following format: "There are a total of [occupation_count] [occupation]s." where [occupation_count] is the number of occurrences of an occupation in OCCUPATIONS and [occupation] is the lowercase occupation name. If more than one Occupation has the same [occupation_count], they should be ordered alphabetically.

SELECT CONCAT(Name, '(', LEFT(Occupation, 1), ')') AS t
FROM OCCUPATIONS
UNION ALL
SELECT CONCAT("There are a total of ", COUNT(Occupation), " ", LOWER(Occupation), "s.") AS t
FROM OCCUPATIONS
GROUP BY Occupation
ORDER BY t

Result

Hackerrank | The Report

Question: Given two table Students and Grade, generate a report containing three columns: Name, Grade and Mark. The report must be in descending order by grade -- i.e. higher grades are entered first. If there is more than one student with the same grade (8-10) assigned to them, order those particular students by their name alphabetically. Finally, if the grade is lower than 8, use "NULL" as their name and list them by their grades in descending order. If there is more than one student with the same grade (1-7) assigned to them, order those particular students by their marks in ascending order.

select
    case when g.grade < 8 then 'NULL' else s.name end as name,
    g.grade,
    s.marks
from students s
    join grades g
    on s.marks between g.min_mark and g.max_mark
order by g.grade desc, s.name, s.marks

Result

Hackerrank | Weather Observation 18

Question: Consider P1(a, b) and P2(c, d) to be two points on a 2D plane: a happens to equal the minimum value in Northern Latitude (LAT_N in STATION). b happens to equal the minimum value in Western Longitude (LONG_W in STATION). c happens to equal the maximum value in Northern Latitude (LAT_N in STATION). d happens to equal the maximum value in Western Longitude (LONG_W in STATION). Query the Manhattan Distance between points P1 and P2 and and round it to a scale of 4 decimal places.

SELECT
    ROUND(
        ABS(MAX(LAT_N) - MIN(LAT_N)) + ABS(MAX(LONG_W) - MIN(LONG_W))
    , 4)
FROM station

Result

Hackerrank | Weather Observation 19

Question: Consider P1(a, b) and P2(c, d) to be two points on a 2D plane: a happens to equal the minimum value in Northern Latitude (LAT_N in STATION). b happens to equal the minimum value in Western Longitude (LONG_W in STATION). c happens to equal the maximum value in Northern Latitude (LAT_N in STATION). d happens to equal the maximum value in Western Longitude (LONG_W in STATION). Query the Eucliedean Distance between points P1 and P2 and and round it to a scale of 4 decimal places.

SELECT
    /*
    MIN(LAT_N) AS a,
    MAX(LAT_N) AS b,
    MIN(LONG_W) AS c,
    MAX(LONG_W) AS d
    */
    ROUND(
        SQRT(
            POW(MAX(LAT_N) - MIN(LAT_N), 2)
            +
            POW(MAX(LONG_W) - MIN(LONG_W), 2)
       )
        , 4
   )
FROM station

Result

Hard

Leetcode | Department Top Three Salaries

Question: A company's executives are interested in seeing who earns the most money in each of the company's departments. A high earner in a department is an employee who has a salary in the top three unique salaries for that department. Write an SQL query to find the employees who are high earners in each of the departments. Return the result table in any order.

WITH salary_ranks AS (
  SELECT d.name AS Department,
         e.name AS Employee,
         e.salary AS Salary,
         dense_rank() OVER (PARTITION BY e.departmentId ORDER BY e.salary DESC) ranks
  FROM Employee e
    INNER JOIN Department d
    ON e.departmentId = d.id
)
SELECT Department, Employee, Salary
FROM salary_ranks
WHERE ranks <= 3
ORDER BY Department, Salary

Result

Leetcode | Human Traffic Of Stadium

Question: Write an SQL query to display the records with three or more rows with consecutive id's, and the number of people is greater than or equal to 100 for each. Return the result table ordered by visit_date in ascending order.

WITH
filtered AS (
    SELECT
        *
    FROM Stadium
    WHERE people >= 100
),
ordered_id AS (
    SELECT
        *,
        id - ROW_NUMBER() OVER (ORDER BY id) AS id_ordering
    FROM filtered
)
SELECT id, visit_date, people
FROM ordered_id
WHERE id_ordering IN (
    SELECT id_ordering
    FROM ordered_id
    GROUP BY 1
    HAVING COUNT(1) >= 3
)

Result

Leetcode | Trips And Users

Question: The cancellation rate is computed by dividing the number of canceled (by client or driver) requests with unbanned users by the total number of requests with unbanned users on that day. Write a SQL query to find the cancellation rate of requests with unbanned users (both client and driver must not be banned) each day between "2013-10-01" and "2013-10-03". Round Cancellation Rate to two decimal points. Return the result table in any order.

SELECT
    t.request_at AS Day,
    ROUND(
        SUM(
            CASE
                WHEN status = 'completed' THEN 0
                ELSE 1
            END
       ) / COUNT(*)
    , 2) AS 'Cancellation Rate'
FROM Trips t
    INNER JOIN Users uc
        ON t.client_id = uc.users_id
    INNER JOIN Users ud
        ON t.driver_id = ud.users_id
WHERE
    uc.banned = 'No' AND ud.banned = 'No'
    AND uc.role = 'client' AND ud.role = 'driver'
    AND request_at BETWEEN '2013-10-01' AND '2013-10-03'
GROUP BY
    request_at
ORDER BY
    Day ASC

Result

Stratascratch | Host Popularity Rental Prices

Question: You’re given a table of rental property searches by users. The table consists of search results and outputs host information for searchers. Find the minimum, average, maximum rental prices for each host’s popularity rating. The host’s popularity rating is defined as below: 0 reviews: New, 1 to 5 reviews: Rising, 6 to 15 reviews: Trending Up, 16 to 40 reviews: Popular, more than 40 reviews: Hot. Tip: The id column in the table refers to the search ID. You'll need to create your own host_id by concating price, room_type, host_since, zipcode, and number_of_reviews. Output host popularity rating and their minimum, average and maximum rental prices.

with
host_popularity_details as (
    select
        case
            when number_of_reviews = 0 then 'New'
            when number_of_reviews <= 5 then 'Rising'
            when number_of_reviews <= 15 then 'Trending Up'
            when number_of_reviews <= 40 then 'Popular'
            else 'Hot'
        end as host_popularity
        ,price
    from (
        select distinct
            price,
            room_type,
            host_since,
            zipcode,
            number_of_reviews
        from airbnb_host_searches
   ) t
)
select
    host_popularity
    ,min(price) as min_price
    ,avg(price) as avg_price
    ,max(price) as max_price
from host_popularity_details
group by host_popularity

Result

Stratascratch | Marketing Campaign Success

Question: You have a table of in-app purchases by user. Users that make their first in-app purchase are placed in a marketing campaign where they see call-to-actions for more in-app purchases. Find the number of users that made additional in-app purchases due to the success of the marketing campaign. The marketing campaign doesn't start until one day after the initial in-app purchase so users that only made one or multiple purchases on the first day do not count, nor do we count users that over time purchase only the products they purchased on the first day.

with
known_first_purchase_date as (
    select
        user_id
        ,created_at
        ,product_id
        ,first_value(created_at) over (partition by user_id
                                        order by created_at)
                                        as first_purchase_date
    from marketing_campaign
),

products_bought_in_first_purchase_date as (
    select
        user_id
        ,array_agg(product_id) as bought_products
    from known_first_purchase_date
    where created_at = first_purchase_date
    group by user_id
)

select
    count (distinct user_id)
from known_first_purchase_date fpd
    join products_bought_in_first_purchase_date pfpd
    using (user_id)
where
    fpd.created_at > fpd.first_purchase_date
    and not product_id = any(bought_products)

Result

Stratascratch | Monthly Percentage Difference

Question: Given a table of purchases by date, calculate the month-over-month percentage change in revenue. The output should include the year-month date (YYYY-MM) and percentage change, rounded to the 2nd decimal point, and sorted from the beginning of the year to the end of the year. The percentage change column will be populated from the 2nd month forward and can be calculated as ((this month's revenue - last month's revenue) / by last month's revenue) * 100.

with
augmented_transactions as (
    select
        date_trunc('month', created_at) as year_month
        ,sum(value) as revenue
        ,lag(sum(value)) over (order by date_trunc('month', created_at)) as last_month_revenue
    from sf_transactions
    group by date_trunc('month', created_at)
)

select
    concat_ws('-', extract(year from year_month), lpad(extract(month from year_month)::text, 2, '0')) as year_month_format
    ,round(
        ((revenue - last_month_revenue)::numeric / last_month_revenue) * 100, 2
   ) as revenue_diff_pct
from augmented_transactions
order by year_month

Result

Stratascratch | Popularity Percentage

Question: Find the popularity percentage for each user on Meta/Facebook. The popularity percentage is defined as the total number of friends the user has divided by the total number of users on the platform, then converted into a percentage by multiplying by 100. Output each user along with their popularity percentage. Order records in ascending order by user id. The 'user1' and 'user2' column are pairs of friends.

with
reordered_ff as (
    select user1, user2 from facebook_friends
    union
    select user2 as user1, user1 as user2 from facebook_friends
)

select
    user1,
    (count(1)::numeric /
        (select count(distinct user1) from reordered_ff)) * 100
        as popularity_percentage
from reordered_ff
group by 1
order by 1

Result

Stratascratch | Premium Vs Freemium

Question: Find the total number of downloads for paying and non-paying users by date. Include only records where non-paying customers have more downloads than paying customers. The output should be sorted by earliest date first and contain 3 columns date, non-paying downloads, paying downloads.

with
download_fct as (
    select
        date
        ,sum(case when paying_customer = 'no' then downloads else 0 end) as non_paying
        ,sum(case when paying_customer = 'yes' then downloads else 0 end) as paying
    from ms_user_dimension u
        join ms_acc_dimension a
            on u.acc_id = a.acc_id
        join ms_download_facts d
        on u.user_id = d.user_id
    group by date
    order by date asc
)
select *
from download_fct
where non_paying > paying

Result

Others

CMU Database SQL Assignment

A homework from Carnegie Mellon University Database Course (Fall 2021). I used this to refresh my SQL knowledge and certain advanced SQL features I didn't know before. Also, learned to use SQLite which I found to be really useful later on. There's a video lecture to learn the materials before working on this homework

Caltech DE Exercises

Exercises from Caltech Data Engineering course

It covers several case study involving basic data cleaning and transformation using SQL and NoSQL, investigating query performance, and dealing with nested data with more advanced analysis

Stratascratch Business Analysis

A business analysis case from Stratascratch, Insights from Failed Orders

This one's not an SQL project, although I'm sure you can use SQL to solve this. (I use pyspark for this)

The analysis uses Gett data, previously known as GetTaxi, a technology platform focused on corporate Ground Transportation Management

The analysis focuses on reason for orders' failure, for example, what's the reason? Is there certain hours when failure happens? It also goes into average what's the average cancellation time of orders? What's the estimated time arrival (ETA) by the hours?

With a bonus question about plotting failures in the map. Here's how my map looks like:

A choropleth map built using H3 and Folium library

My analysis can accessed through this colab notebook

About

SQL learning and interview preps, etc.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages