#### ID 2141

```Amazon's information technology department is looking for information on employees' most recent logins. The output should include all information related to each employee's most recent login.```

In [None]:
%%sql
WITH cte
         AS (SELECT DENSE_RANK()
                    OVER (PARTITION BY worker_id ORDER BY login_timestamp DESC) AS rnk,
                    id,
                    worker_id,
                    login_timestamp,
                    ip_address,
                    country,
                    region,
                    city,
                    device_type
             FROM worker_logins)
SELECT id,
       worker_id,
       login_timestamp,
       ip_address,
       country,
       region,
       city,
       device_type
FROM cte
WHERE rnk = 1
ORDER BY id

In [None]:
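# Work on a copy so the helper column does not leak into worker_logins
df = worker_logins.copy()

# Dense ranking keeps tied most-recent logins, matching DENSE_RANK() in the SQL cell
df['rnk'] = df.groupby('worker_id')['login_timestamp'].rank(method='dense', ascending=False)

df.query('rnk == 1').drop(columns=['rnk'])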

#### ID 2142

```You've been asked by Amazon to find the shipment_id and weight of the third heaviest shipment. Output the shipment_id, and total_weight for that shipment_id. In the event of a tie, do not skip ranks.```

In [None]:
%%sql
WITH cte AS (SELECT shipment_id,
                    SUM(weight) AS total_weight,
                    DENSE_RANK() OVER (ORDER BY SUM(weight) DESC) AS rnk
             FROM amazon_shipment
             GROUP BY shipment_id)
SELECT shipment_id, total_weight
FROM cte
WHERE rnk = 3

In [None]:
df = amazon_shipment

grouped_df = df.groupby('shipment_id', as_index=False).agg(total_weight=('weight', 'sum'))

grouped_df['rnk'] = grouped_df['total_weight'].rank(method='dense', ascending=False)

grouped_df.query('rnk == 3').drop(columns=['rnk'])
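
The "do not skip ranks" requirement is what separates `DENSE_RANK()` from `RANK()`. The sketch below illustrates the difference on made-up data. It assumes an in-memory SQLite database (via Python's built-in `sqlite3`, which needs SQLite 3.25+ for window functions) rather than the PostgreSQL engine the `%%sql` cells target, and the rows, the `totals` CTE name, and the `dense_rnk`/`rnk` aliases are hypothetical.

```python
import sqlite3

# Hypothetical toy data: (shipment_id, weight). Shipments 2 and 3 tie on total weight.
rows = [
    (1, 50), (1, 30),  # total 80
    (2, 60),           # total 60
    (3, 20), (3, 40),  # total 60, ties with shipment 2
    (4, 45),           # total 45
    (5, 10),           # total 10
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE amazon_shipment (shipment_id INTEGER, weight INTEGER)")
conn.executemany("INSERT INTO amazon_shipment VALUES (?, ?)", rows)

# Aggregate first, then rank the totals with both ranking functions side by side.
ranked = conn.execute(
    """
    WITH totals AS (SELECT shipment_id, SUM(weight) AS total_weight
                    FROM amazon_shipment
                    GROUP BY shipment_id)
    SELECT shipment_id,
           total_weight,
           DENSE_RANK() OVER (ORDER BY total_weight DESC) AS dense_rnk,
           RANK()       OVER (ORDER BY total_weight DESC) AS rnk
    FROM totals
    ORDER BY total_weight DESC, shipment_id
    """
).fetchall()
conn.close()

# Rows whose rank equals 3 under each ranking function.
third_dense = [(sid, total) for sid, total, dense_rnk, _ in ranked if dense_rnk == 3]
third_plain = [(sid, total) for sid, total, _, rnk in ranked if rnk == 3]

print(ranked)
print("DENSE_RANK = 3:", third_dense)
print("RANK = 3:", third_plain)
```

With the tie at rank 2, `RANK()` jumps straight to 4, so filtering on rank 3 finds nothing, while `DENSE_RANK()` assigns 3 to the next distinct total.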

#### ID 2151

```You have been asked to find the number of employees hired between the months of January and July in the year 2022 inclusive. Your output should contain the number of employees hired in this given time frame.```

In [None]:
%%sql
SELECT COUNT(id) AS hired_emp
FROM employees
WHERE DATE_TRUNC('month', joining_date) BETWEEN '2022-01-01' AND '2022-07-01'

In [None]:
df = employees

# Compare monthly periods against month labels rather than day-level date strings
df[df['joining_date'].dt.to_period('M').between('2022-01', '2022-07')]['id'].count()

#### ID 2156

```You have been tasked with finding the worker IDs of individuals who logged in between the 13th and the 19th inclusive of December 2021. In your output, provide the unique worker IDs for the dates requested.```

In [None]:
%%sql
SELECT DISTINCT worker_id
FROM worker_logins
WHERE login_timestamp >= '2021-12-13'
  AND login_timestamp < '2021-12-20'

In [None]:
df = worker_logins

# Half-open range: includes every login on the 19th, excludes midnight on the 20th
df.query('login_timestamp >= "2021-12-13" & login_timestamp < "2021-12-20"')['worker_id'].drop_duplicates()
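
When `login_timestamp` carries a time of day, a bound written as a bare date is compared as midnight at the start of that day, which matters for an "inclusive" date range. The sketch below shows the effect with plain `datetime` objects. The three timestamps and their labels are hypothetical, and the variable names are for illustration only.

```python
from datetime import datetime

# Hypothetical login timestamps around the edges of the requested window.
logins = {
    "morning of the 13th": datetime(2021, 12, 13, 8, 30),
    "afternoon of the 19th": datetime(2021, 12, 19, 15, 45),
    "midnight starting the 20th": datetime(2021, 12, 20, 0, 0),
}

start = datetime(2021, 12, 13)       # a bare date means 00:00:00 on that day
end_on_19th = datetime(2021, 12, 19)
end_on_20th = datetime(2021, 12, 20)

# Closed range ending on the 19th: drops everything after midnight on the 19th.
closed_19 = {name for name, ts in logins.items() if start <= ts <= end_on_19th}

# Closed range ending on the 20th: also admits a login at exactly midnight on the 20th.
closed_20 = {name for name, ts in logins.items() if start <= ts <= end_on_20th}

# Half-open range [13th, 20th): covers the whole of the 19th and nothing later.
half_open = {name for name, ts in logins.items() if start <= ts < end_on_20th}

print(sorted(closed_19))
print(sorted(closed_20))
print(sorted(half_open))
```

Only the half-open form returns exactly the logins dated the 13th through the 19th, which is why an exclusive upper bound on the following day is the usual choice for timestamp columns.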

#### ID 2159

```You have been asked to get a list of all the sign-up IDs with transaction start dates in either April or May. Since a sign-up ID can be used for multiple transactions, only output the unique IDs. Your output should contain a list of non-duplicated sign-up IDs.```

In [None]:
%%sql
SELECT DISTINCT signup_id
FROM transactions
WHERE DATE_TRUNC('month', transaction_start_date)::DATE IN ('2020-04-01', '2020-05-01')

In [None]:
df = transactions

df.query('transaction_start_date >= "2020-04-01" & transaction_start_date < "2020-06-01"')['signup_id'].unique()

#### ID 2160

```The sales division is investigating their sales for the past month in Oregon. Calculate the total revenue generated from Oregon-based customers for April.```

In [None]:
%%sql
SELECT SUM(cost_in_dollars * units_sold) AS total_sales
FROM online_orders AS oo
         JOIN online_customers AS oc ON oo.customer_id = oc.id
WHERE state ILIKE 'oregon'
  AND DATE_TRUNC('month', date) = '2022-04-01'

In [None]:
import pandas as pd

df = pd.merge(online_orders, online_customers, how='inner', left_on='customer_id', right_on='id')

df['sales'] = df['cost_in_dollars'] * df['units_sold']

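# Case-insensitive state match, mirroring ILIKE in the SQL cell
in_oregon = df['state'].str.lower() == 'oregon'
# Half-open range so orders timestamped later on April 30 are still counted
in_april = (df['date'] >= '2022-04-01') & (df['date'] < '2022-05-01')

df.loc[in_oregon & in_april, 'sales'].sum()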