|
| 1 | +--- |
| 2 | +comments: true |
| 3 | +difficulty: Medium |
| 4 | +edit_url: https://github.com/doocs/leetcode/edit/main/solution/3400-3499/3497.Analyze%20Subscription%20Conversion/README.md |
| 5 | +tags: |
| 6 | + - Database |
| 7 | +--- |
| 8 | + |
| 9 | +<!-- problem:start --> |
| 10 | + |
| 11 | +# [3497. Analyze Subscription Conversion](https://leetcode.cn/problems/analyze-subscription-conversion) |
| 12 | + |
| 13 | +[English Version](/solution/3400-3499/3497.Analyze%20Subscription%20Conversion/README_EN.md) |
| 14 | + |
| 15 | +## Description |
| 16 | + |
| 17 | +<!-- description:start --> |
| 18 | + |
| 19 | +<p>Table: <code>UserActivity</code></p> |
| 20 | + |
| 21 | +<pre> |
| 22 | ++------------------+---------+ |
| 23 | +| Column Name      | Type    | |
| 24 | ++------------------+---------+ |
| 25 | +| user_id          | int     | |
| 26 | +| activity_date    | date    | |
| 27 | +| activity_type    | varchar | |
| 28 | +| activity_duration| int     | |
| 29 | ++------------------+---------+ |
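| 30 | +(user_id, activity_date, activity_type) is the unique primary key of this table. |
| 31 | +activity_type is one of ('free_trial', 'paid', 'cancelled'). |
| 32 | +activity_duration is the number of minutes the user spent on the platform that day. |
| 33 | +Each row represents a user's activity on a specific date. |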
| 34 | +</pre> |
| 35 | + |
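| 36 | +<p>A subscription service wants to analyze user behavior patterns. The company offers a 7-day free trial, after which users can subscribe to a <strong>paid plan</strong> or <strong>cancel</strong>. Write a solution to:</p> |
| 37 | + |
| 38 | +<ol> |
| 39 | + <li>Find the users who converted from a free trial to a paid subscription</li> |
| 40 | + <li>Calculate each user's <strong>average daily activity duration</strong> during the <strong>free trial</strong> period (rounded to <code>2</code> decimal places)</li> |
| 41 | + <li>Calculate each user's <strong>average daily activity duration</strong> during the <strong>paid</strong> subscription period (rounded to <code>2</code> decimal places)</li> |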
| 42 | +</ol> |
| 43 | + |
| 44 | +<p>Return the result table ordered by <code>user_id</code> in <strong>ascending</strong> order.</p> |
| 45 | + |
| 46 | +<p>The result format is shown in the following example.</p> |
| 47 | + |
| 48 | +<p> </p> |
| 49 | + |
| 50 | +<p><strong class="example">Example:</strong></p> |
| 51 | + |
| 52 | +<div class="example-block"> |
| 53 | +<p><strong>Input:</strong></p> |
| 54 | + |
| 55 | +<p>UserActivity table:</p> |
| 56 | + |
| 57 | +<pre class="example-io"> |
| 58 | ++---------+---------------+---------------+-------------------+ |
| 59 | +| user_id | activity_date | activity_type | activity_duration | |
| 60 | ++---------+---------------+---------------+-------------------+ |
| 61 | +| 1       | 2023-01-01    | free_trial    | 45                | |
| 62 | +| 1       | 2023-01-02    | free_trial    | 30                | |
| 63 | +| 1       | 2023-01-05    | free_trial    | 60                | |
| 64 | +| 1       | 2023-01-10    | paid          | 75                | |
| 65 | +| 1       | 2023-01-12    | paid          | 90                | |
| 66 | +| 1       | 2023-01-15    | paid          | 65                | |
| 67 | +| 2       | 2023-02-01    | free_trial    | 55                | |
| 68 | +| 2       | 2023-02-03    | free_trial    | 25                | |
| 69 | +| 2       | 2023-02-07    | free_trial    | 50                | |
| 70 | +| 2       | 2023-02-10    | cancelled     | 0                 | |
| 71 | +| 3       | 2023-03-05    | free_trial    | 70                | |
| 72 | +| 3       | 2023-03-06    | free_trial    | 60                | |
| 73 | +| 3       | 2023-03-08    | free_trial    | 80                | |
| 74 | +| 3       | 2023-03-12    | paid          | 50                | |
| 75 | +| 3       | 2023-03-15    | paid          | 55                | |
| 76 | +| 3       | 2023-03-20    | paid          | 85                | |
| 77 | +| 4       | 2023-04-01    | free_trial    | 40                | |
| 78 | +| 4       | 2023-04-03    | free_trial    | 35                | |
| 79 | +| 4       | 2023-04-05    | paid          | 45                | |
| 80 | +| 4       | 2023-04-07    | cancelled     | 0                 | |
| 81 | ++---------+---------------+---------------+-------------------+ |
| 82 | +</pre> |
| 83 | + |
| 84 | +<p><strong>Output:</strong></p> |
| 85 | + |
| 86 | +<pre class="example-io"> |
| 87 | ++---------+--------------------+-------------------+ |
| 88 | +| user_id | trial_avg_duration | paid_avg_duration | |
| 89 | ++---------+--------------------+-------------------+ |
| 90 | +| 1       | 45.00              | 76.67             | |
| 91 | +| 3       | 70.00              | 63.33             | |
| 92 | +| 4       | 37.50              | 45.00             | |
| 93 | ++---------+--------------------+-------------------+ |
| 94 | +</pre> |
| 95 | + |
| 96 | +<p><strong>Explanation:</strong></p> |
| 97 | + |
| 98 | +<ul> |
| 99 | + <li><strong>User 1:</strong> |
| 100 | + |
| 101 | + <ul> |
| 102 | + <li>Had 3 days of free trial with durations of 45, 30, and 60 minutes.</li> |
| 103 | + <li>Average trial duration: (45 + 30 + 60) / 3 = 45.00 minutes.</li> |
| 104 | + <li>Had 3 days of paid subscription with durations of 75, 90, and 65 minutes.</li> |
| 105 | + <li>Average paid duration: (75 + 90 + 65) / 3 = 76.67 minutes.</li> |
| 106 | + </ul> |
| 107 | + </li> |
| 108 | + <li><strong>User 2:</strong> |
| 109 | + <ul> |
| 110 | + <li>Had 3 days of free trial with durations of 55, 25, and 50 minutes.</li> |
| 111 | + <li>Average trial duration: (55 + 25 + 50) / 3 = 43.33 minutes.</li> |
| 112 | + <li>Did not convert to a paid subscription (only free_trial and cancelled activities).</li> |
| 113 | + <li>Not included in the output because they did not convert to a paid user.</li> |
| 114 | + </ul> |
| 115 | + </li> |
| 116 | + <li><strong>User 3:</strong> |
| 117 | + <ul> |
| 118 | + <li>Had 3 days of free trial with durations of 70, 60, and 80 minutes.</li> |
| 119 | + <li>Average trial duration: (70 + 60 + 80) / 3 = 70.00 minutes.</li> |
| 120 | + <li>Had 3 days of paid subscription with durations of 50, 55, and 85 minutes.</li> |
| 121 | + <li>Average paid duration: (50 + 55 + 85) / 3 = 63.33 minutes.</li> |
| 122 | + </ul> |
| 123 | + </li> |
| 124 | + <li><strong>User 4:</strong> |
| 125 | + <ul> |
| 126 | + <li>Had 2 days of free trial with durations of 40 and 35 minutes.</li> |
| 127 | + <li>Average trial duration: (40 + 35) / 2 = 37.50 minutes.</li> |
| 128 | + <li>Had 1 day of paid subscription before cancelling, with a duration of 45 minutes.</li> |
| 129 | + <li>Average paid duration: 45.00 minutes.</li> |
| 130 | + </ul> |
| 131 | + </li> |
| 132 | + |
| 133 | +</ul> |
| 134 | + |
| 135 | +<p>The result table only includes users who converted from a free trial to a paid subscription (users 1, 3, and 4), ordered by user_id in ascending order.</p> |
| 136 | +</div> |
| 137 | + |
| 138 | +<!-- description:end --> |
| 139 | + |
| 140 | +## Solutions |
| 141 | + |
| 142 | +<!-- solution:start --> |
| 143 | + |
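| 144 | +### Solution 1: Grouping + Conditional Filtering + Equi-Join |
| 145 | + |
| 146 | +First, we filter out every row whose `activity_type` is `cancelled`, group the remaining rows by `user_id` and `activity_type`, and compute each group's average duration rounded to 2 decimal places as `duration`. We store the result in table `T`. |
| 147 | + |
| 148 | +Next, we select the rows of `T` whose `activity_type` is `free_trial` and `paid` into tables `F` and `P` respectively. Finally, we equi-join the two tables on `user_id`, which keeps only users that appear in both (the converted users), select the columns the problem asks for, and sort by `user_id` to obtain the final result. |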
| 149 | + |
| 150 | +<!-- tabs:start --> |
| 151 | + |
| 152 | +#### MySQL |
| 153 | + |
| 154 | +```sql |
| 155 | +# Write your MySQL query statement below |
| 156 | +WITH |
| 157 | +    T AS ( |
| 158 | +        SELECT user_id, activity_type, ROUND(SUM(activity_duration) / COUNT(1), 2) duration |
| 159 | +        FROM UserActivity |
| 160 | +        WHERE activity_type != 'cancelled' |
| 161 | +        GROUP BY user_id, activity_type |
| 162 | +    ), |
| 163 | +    F AS ( |
| 164 | +        SELECT user_id, duration trial_avg_duration |
| 165 | +        FROM T |
| 166 | +        WHERE activity_type = 'free_trial' |
| 167 | +    ), |
| 168 | +    P AS ( |
| 169 | +        SELECT user_id, duration paid_avg_duration |
| 170 | +        FROM T |
| 171 | +        WHERE activity_type = 'paid' |
| 172 | +    ) |
| 173 | +SELECT user_id, trial_avg_duration, paid_avg_duration |
| 174 | +FROM |
| 175 | +    F |
| 176 | +    JOIN P USING (user_id) |
| 177 | +ORDER BY 1; |
| 178 | +``` |
| 179 | + |
| 180 | +#### Pandas |
| 181 | + |
| 182 | +```python |
| 183 | +import pandas as pd |
| 184 | + |
| 185 | + |
| 186 | +def analyze_subscription_conversion(user_activity: pd.DataFrame) -> pd.DataFrame: |
| 187 | +    df = user_activity[user_activity["activity_type"] != "cancelled"] |
| 188 | + |
| 189 | +    df_grouped = ( |
| 190 | +        df.groupby(["user_id", "activity_type"])["activity_duration"] |
| 191 | +        .mean() |
| 192 | +        .add(0.0001)  # small nudge, presumably so ties round up as MySQL's ROUND does |
| 193 | +        .round(2) |
| 194 | +        .reset_index() |
| 195 | +    ) |
| 196 | + |
| 197 | +    df_free_trial = ( |
| 198 | +        df_grouped[df_grouped["activity_type"] == "free_trial"] |
| 199 | +        .rename(columns={"activity_duration": "trial_avg_duration"}) |
| 200 | +        .drop(columns=["activity_type"]) |
| 201 | +    ) |
| 202 | + |
| 203 | +    df_paid = ( |
| 204 | +        df_grouped[df_grouped["activity_type"] == "paid"] |
| 205 | +        .rename(columns={"activity_duration": "paid_avg_duration"}) |
| 206 | +        .drop(columns=["activity_type"]) |
| 207 | +    ) |
| 208 | + |
| 209 | +    result = df_free_trial.merge(df_paid, on="user_id", how="inner").sort_values( |
| 210 | +        "user_id" |
| 211 | +    ) |
| 212 | + |
| 213 | +    return result |
| 214 | +``` |
| 215 | + |
| 216 | +<!-- tabs:end --> |
| 217 | + |
| 218 | +<!-- solution:end --> |
| 219 | + |
| 220 | +<!-- problem:end --> |
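
As a cross-check on the approach described in the README (drop `cancelled` rows, average per user and activity type, then keep users present in both groups), the same steps can be sketched in plain Python against the example rows. This is only an illustrative sketch, not part of the submitted solutions: the names `ROWS` and `analyze` are hypothetical, and half-up rounding via `decimal` is an assumption made to mirror how MySQL's `ROUND` treats exact decimal values.

```python
from collections import defaultdict
from decimal import Decimal, ROUND_HALF_UP

# Example rows from the problem statement:
# (user_id, activity_date, activity_type, activity_duration)
ROWS = [
    (1, "2023-01-01", "free_trial", 45), (1, "2023-01-02", "free_trial", 30),
    (1, "2023-01-05", "free_trial", 60), (1, "2023-01-10", "paid", 75),
    (1, "2023-01-12", "paid", 90), (1, "2023-01-15", "paid", 65),
    (2, "2023-02-01", "free_trial", 55), (2, "2023-02-03", "free_trial", 25),
    (2, "2023-02-07", "free_trial", 50), (2, "2023-02-10", "cancelled", 0),
    (3, "2023-03-05", "free_trial", 70), (3, "2023-03-06", "free_trial", 60),
    (3, "2023-03-08", "free_trial", 80), (3, "2023-03-12", "paid", 50),
    (3, "2023-03-15", "paid", 55), (3, "2023-03-20", "paid", 85),
    (4, "2023-04-01", "free_trial", 40), (4, "2023-04-03", "free_trial", 35),
    (4, "2023-04-05", "paid", 45), (4, "2023-04-07", "cancelled", 0),
]


def analyze(rows):
    """Hypothetical helper mirroring the T / F / P steps of the SQL solution."""
    # Step "T": accumulate [sum, count] per (user_id, activity_type),
    # skipping cancelled rows.
    totals = defaultdict(lambda: [0, 0])
    for user_id, _date, activity_type, duration in rows:
        if activity_type == "cancelled":
            continue
        bucket = totals[(user_id, activity_type)]
        bucket[0] += duration
        bucket[1] += 1

    def avg(user_id, activity_type):
        total, count = totals[(user_id, activity_type)]
        mean = Decimal(total) / Decimal(count)
        # Round half up to 2 decimal places (assumption: matches MySQL ROUND).
        return float(mean.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP))

    # Steps "F" and "P": users with trial rows and users with paid rows.
    trial_users = {u for (u, t) in totals if t == "free_trial"}
    paid_users = {u for (u, t) in totals if t == "paid"}

    # Inner join on user_id, sorted ascending.
    return [
        (u, avg(u, "free_trial"), avg(u, "paid"))
        for u in sorted(trial_users & paid_users)
    ]


for row in analyze(ROWS):
    print(row)
```

The set intersection plays the role of `JOIN P USING (user_id)`: user 2 has no `paid` rows, so it drops out, while user 4 stays because its `cancelled` row is ignored rather than treated as disqualifying.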