**ID 10365**

```You are analyzing a social network dataset at Google. Your task is to find mutual friends between two users, Karl and Hans. There is only one user named Karl and one named Hans in the dataset. The output should contain 'user_id' and 'user_name' columns.```

In [None]:
%%sql
WITH mutual AS (SELECT friend_id
                FROM friends
                WHERE user_id IN (SELECT user_id
                                  FROM users
                                  WHERE user_name = 'Karl')
                INTERSECT
                SELECT friend_id
                FROM friends
                WHERE user_id IN (SELECT user_id
                                  FROM users
                                  WHERE user_name = 'Hans'))
SELECT user_id, user_name
FROM users
WHERE user_id IN (SELECT friend_id FROM mutual)
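
The same set intersection can be sketched in pandas. This is a minimal sketch, not a reference solution: the sample rows are made up for illustration, the helper name `mutual_friends` is hypothetical, and the `users` and `friends` tables are assumed to have the columns used in the SQL above.

```python
import pandas as pd

# Made-up sample data; only the column names are taken from the SQL solution.
users = pd.DataFrame({
    'user_id': [1, 2, 3, 4, 5],
    'user_name': ['Karl', 'Hans', 'Emma', 'Mia', 'Liam'],
})
friends = pd.DataFrame({
    'user_id': [1, 1, 1, 2, 2],
    'friend_id': [3, 4, 5, 3, 5],
})


def mutual_friends(users, friends, name_a, name_b):
    """Return user_id and user_name of friends shared by the two named users."""
    def friend_ids(name):
        # Look up the user's id(s), then collect the ids of their friends.
        ids = users.loc[users['user_name'] == name, 'user_id']
        return set(friends.loc[friends['user_id'].isin(ids), 'friend_id'])

    # Set intersection plays the role of SQL's INTERSECT.
    common = friend_ids(name_a) & friend_ids(name_b)
    mask = users['user_id'].isin(common)
    return users.loc[mask, ['user_id', 'user_name']].reset_index(drop=True)


result = mutual_friends(users, friends, 'Karl', 'Hans')
```

With the sample rows above, Karl's friends are users 3, 4 and 5 and Hans's are 3 and 5, so `result` holds the rows for users 3 and 5.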

**ID 10366**

```Capital One's marketing team is working on a project to analyze customer feedback from their feedback surveys. The team sorted the words from the feedback into three different categories: short_comments, mid_length_comments, long_comments. The team wants to find comments that are not short and that come from social media. The output should include 'feedback_id', 'feedback_text', 'source_channel', and a calculated category.```

In [None]:
%%sql
SELECT DISTINCT feedback_id,
                feedback_text,
                source_channel,
                comment_category
FROM customer_feedback
WHERE comment_category != 'short_comments'
  AND source_channel IN ('social_media')

In [None]:
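df = customer_feedback

# Keep non-short comments from social media, limited to the requested columns
cols = ['feedback_id', 'feedback_text', 'source_channel', 'comment_category']
df.query('source_channel == "social_media" and comment_category != "short_comments"')[cols].drop_duplicates()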

**ID 10367**

```You're tasked with analyzing a Spotify-like dataset that captures user listening habits. For each user, calculate the total listening time and the count of unique songs they've listened to. In the database duration values are displayed in seconds. Round the total listening duration to the nearest whole minute. The output should contain three columns: 'user_id', 'total_listen_duration', and 'unique_song_count'.```

In [None]:
%%sql
SELECT user_id,
       -- Divide by 60.0 so integer durations are not truncated before rounding
       ROUND(SUM(listen_duration) / 60.0) AS total_listen_duration,
       COUNT(DISTINCT song_id)            AS unique_song_count
FROM listening_habits
GROUP BY user_id

In [None]:
df = listening_habits
result = df.groupby('user_id', as_index=False).agg(
    total_listen_duration=('listen_duration', 'sum'),
    unique_song_count=('song_id', 'nunique'),
)
# Convert total seconds to whole minutes
result['total_listen_duration'] = (result['total_listen_duration'] / 60).round().astype(int)
result
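
The task asks for the nearest whole minute, which is not the same as truncating. The short check below illustrates the difference. The function names are hypothetical, and it matters mainly where dividing two integers truncates, as it does for integer operands in some SQL dialects.

```python
def to_minutes_truncated(total_seconds: int) -> int:
    # Integer (floor) division drops the fractional minute.
    return total_seconds // 60


def to_minutes_rounded(total_seconds: int) -> int:
    # True division followed by rounding gives the nearest whole minute.
    return round(total_seconds / 60)


total_seconds = 119  # just under two minutes
truncated = to_minutes_truncated(total_seconds)
rounded = to_minutes_rounded(total_seconds)
```

Here `truncated` is 1 while `rounded` is 2, so a user with 119 seconds of listening would be reported differently depending on which division is used.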

**ID 10368**

```You are working on a data analysis project at Deloitte where you need to analyze a dataset containing information about various cities. Your task is to calculate the population density of these cities, rounded to the nearest integer, and identify the cities with the minimum and maximum densities. The population density should be calculated as (Population / Area). The output should contain 'city', 'country', 'density'.```

In [None]:
%%sql
WITH density_cte AS (SELECT city,
                            country,
                            -- * 1.0 avoids integer division; ROUND gives the nearest integer
                            ROUND(population * 1.0 / NULLIF(area, 0)) AS density
                     FROM cities_population
                     WHERE area > 0),
     ranked AS (SELECT city,
                       country,
                       density,
                       RANK() OVER (ORDER BY density)      AS low_rnk,
                       RANK() OVER (ORDER BY density DESC) AS high_rnk
                FROM density_cte)
SELECT city,
       country,
       density
FROM ranked
WHERE low_rnk = 1
   OR high_rnk = 1

In [None]:
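df = cities_population
# .copy() avoids SettingWithCopyWarning when adding columns to the filtered frame
df = df.query('area > 0').copy()

# Density rounded to the nearest integer, as the task requires
df['density'] = (df['population'] / df['area']).round()
df['low_rank'] = df['density'].rank(method='min')
df['high_rank'] = df['density'].rank(method='min', ascending=False)
df.query('low_rank == 1 | high_rank == 1')[['city', 'country', 'density']]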