## Task 1. Carry out RFM analysis (Practical task)

In [1]:
import pandas as pd
import numpy as np

In [2]:
orders = pd.read_csv('Purchases.csv', sep=';')

In [3]:
orders.head()

| | user_id_1 | Event Time | Event Name | Event Revenue USD |
|---|---|---|---|---|
| 0 | 5308 | 12.12.2018 | af_purchase | 0.762492 |
| 1 | 1314 | 12.12.2018 | af_purchase | 0.265333 |
| 2 | 3109 | 12.12.2018 | af_purchase | 1.332 |
| 3 | 6288 | 12.12.2018 | af_purchase | 0.311221 |
| 4 | 1842 | 12.12.2018 | af_purchase | 1.514547 |


In [4]:
orders.dtypes

user_id_1              int64
Event Time            object
Event Name            object
Event Revenue USD    float64
dtype: object

In [5]:
orders['Event Name'] = orders['Event Name'].astype('str')
orders['Event Time'] = pd.to_datetime(orders['Event Time'])

In [6]:
orders.dtypes

user_id_1                     int64
Event Time           datetime64[ns]
Event Name                   object
Event Revenue USD           float64
dtype: object

In [7]:
last_date = orders["Event Time"].max()
last_date

Timestamp('2019-12-06 00:00:00')
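
The `Event Time` values are dotted dates such as `12.12.2018`, and `pd.to_datetime` is called without a format, so the day/month order is left to inference. The sketch below shows how far the two readings of one string can diverge. It assumes, without confirmation from the rows shown, that the file uses DD.MM.YYYY; the sample string is hypothetical.

```python
import pandas as pd

raw = "06.12.2019"  # hypothetical value in the same layout as `Event Time`

# Explicit day-first format: 6 December 2019.
day_first = pd.to_datetime(raw, format="%d.%m.%Y")

# Month-first reading of the same string: 12 June 2019.
month_first = pd.to_datetime(raw, format="%m.%d.%Y")

print(day_first.date(), month_first.date())
```

If the file is indeed day-first, passing `format='%d.%m.%Y'` to the conversion in cell 5 would remove the ambiguity. Recency is measured from `last_date`, so it depends directly on this choice.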

### Calculating Recency, Frequency, Monetary Value

In [8]:
# Aggregate per user:
#   recency        - number of days since the user's last event
#   frequency      - number of events
#   monetary_value - total revenue over all events
rfm_table = orders.groupby('user_id_1').agg({'Event Time': lambda x: (last_date - x.max()).days,
                                             'Event Name': lambda x: len(x),
                                             'Event Revenue USD': lambda x: x.sum()})

rfm_table['Event Time'] = rfm_table['Event Time'].astype(int)
rfm_table.rename(columns={'Event Time': 'recency',
                          'Event Name': 'frequency',
                          'Event Revenue USD': 'monetary_value'}, inplace=True)
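
To make the three measures easy to check by hand, here is a minimal sketch of the same calculation on a five-row toy table. The data is made up, and the vectorized form (`groupby(...).max()`, `size()`, `sum()`) is an alternative to the lambdas above, not the notebook's own code.

```python
import pandas as pd

# Made-up events for two users, using the notebook's column names.
toy = pd.DataFrame({
    "user_id_1": [1, 1, 2, 2, 2],
    "Event Time": pd.to_datetime(
        ["2019-01-01", "2019-01-05", "2019-01-02", "2019-01-03", "2019-01-10"]
    ),
    "Event Revenue USD": [1.0, 2.0, 0.5, 0.5, 1.5],
})

snapshot = toy["Event Time"].max()  # reference date, same role as last_date
grouped = toy.groupby("user_id_1")

rfm_toy = pd.DataFrame({
    # Days between the snapshot date and each user's latest event.
    "recency": (snapshot - grouped["Event Time"].max()).dt.days,
    # Number of events per user.
    "frequency": grouped.size(),
    # Total revenue per user.
    "monetary_value": grouped["Event Revenue USD"].sum(),
})
print(rfm_toy)
```

User 2 has an event on the snapshot date itself, so their recency is 0.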

In [9]:
rfm_table.head()

| user_id_1 | recency | frequency | monetary_value |
|---|---|---|---|
| 1 | 336 | 18 | 16.572877 |
| 2 | 190 | 12 | 9.621055 |
| 3 | 236 | 26 | 23.467281 |
| 4 | 275 | 26 | 18.621195 |
| 5 | 246 | 21 | 18.836892 |


### Dividing clients into segments (as specified in the task)

In [10]:
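# pd.cut uses right-closed intervals by default, so a value equal to the lowest
# bin edge (recency 0, frequency 1, monetary_value 0) is left unlabeled (NaN).

# Recency, days since last payment: (0, 10], (10, 30], (30, inf)
recency_labels = ['recently_paid', 'paid_a_while_ago', 'paid_long_ago']
recency_bins = [0, 10, 30, np.inf]
rfm_table['RecencySegment'] = pd.cut(rfm_table['recency'], bins=recency_bins, labels=recency_labels)

# Frequency, number of events: (1, 2], (2, 5], (5, 20], (20, inf)
frequency_labels = ['only_1_payment', 'pay_rarely', 'pay_sometimes', 'pay_frequently']
frequency_bins = [1, 2, 5, 20, np.inf]
rfm_table['FrequencySegment'] = pd.cut(rfm_table['frequency'], bins=frequency_bins, labels=frequency_labels)

# Monetary value, total revenue in USD: (0, 20], (20, 100], (100, 500], (500, inf)
monetary_labels = ['minnows', 'dolphins', 'whales', 'grand_whales']
monetary_bins = [0, 20, 100, 500, np.inf]
rfm_table['MonetarySegment'] = pd.cut(rfm_table['monetary_value'], bins=monetary_bins, labels=monetary_labels)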
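
How `pd.cut` treats values that sit exactly on a bin edge can be checked on a small series. The sketch below reuses the frequency bins and labels from the cell above on made-up values and compares the default right-closed intervals with `right=False`. Which convention the task intends is an assumption to verify against the task statement.

```python
import numpy as np
import pandas as pd

freq = pd.Series([1, 2, 5, 20, 21])  # made-up event counts on and around the edges
labels = ["only_1_payment", "pay_rarely", "pay_sometimes", "pay_frequently"]
bins = [1, 2, 5, 20, np.inf]

# Default: intervals (1, 2], (2, 5], (5, 20], (20, inf).
right_closed = pd.cut(freq, bins=bins, labels=labels)

# right=False: intervals [1, 2), [2, 5), [5, 20), [20, inf).
left_closed = pd.cut(freq, bins=bins, labels=labels, right=False)

print(pd.DataFrame({
    "frequency": freq,
    "right=True": right_closed,
    "right=False": left_closed,
}))
```

With the default, a user with exactly one event gets no label and `only_1_payment` is assigned to users with two events. With `right=False`, one event maps to `only_1_payment`.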

In [11]:
rfm_table.head()

| user_id_1 | recency | frequency | monetary_value | RecencySegment | FrequencySegment | MonetarySegment |
|---|---|---|---|---|---|---|
| 1 | 336 | 18 | 16.572877 | paid_long_ago | pay_sometimes | minnows |
| 2 | 190 | 12 | 9.621055 | paid_long_ago | pay_sometimes | minnows |
| 3 | 236 | 26 | 23.467281 | paid_long_ago | pay_frequently | dolphins |
| 4 | 275 | 26 | 18.621195 | paid_long_ago | pay_frequently | minnows |
| 5 | 246 | 21 | 18.836892 | paid_long_ago | pay_frequently | minnows |


### Create a new column with a combination of RFM categories

In [12]:
rfm_table['RFM'] = rfm_table['RecencySegment'].astype('str') + ' / ' + rfm_table['FrequencySegment'].astype('str') \
                                                        + ' / ' + rfm_table['MonetarySegment'].astype('str')

In [13]:
rfm_table.head()

| user_id_1 | recency | frequency | monetary_value | RecencySegment | FrequencySegment | MonetarySegment | RFM |
|---|---|---|---|---|---|---|---|
| 1 | 336 | 18 | 16.572877 | paid_long_ago | pay_sometimes | minnows | paid_long_ago / pay_sometimes / minnows |
| 2 | 190 | 12 | 9.621055 | paid_long_ago | pay_sometimes | minnows | paid_long_ago / pay_sometimes / minnows |
| 3 | 236 | 26 | 23.467281 | paid_long_ago | pay_frequently | dolphins | paid_long_ago / pay_frequently / dolphins |
| 4 | 275 | 26 | 18.621195 | paid_long_ago | pay_frequently | minnows | paid_long_ago / pay_frequently / minnows |
| 5 | 246 | 21 | 18.836892 | paid_long_ago | pay_frequently | minnows | paid_long_ago / pay_frequently / minnows |


In [14]:
rfm_table['RecencySegment'].str.count('paid_long_ago').sum()

84519.0

### Display a table counting the number of users for each combination of RFM categories

In [15]:
rfm_count = pd.DataFrame(rfm_table['RFM'].value_counts())
rfm_count.reset_index(inplace=True)
rfm_count.columns = ['RFM', 'Count']

In [16]:
rfm_count

| | RFM | Count |
|---|---|---|
| 0 | paid_long_ago / pay_rarely / minnows | 37131 |
| 1 | paid_long_ago / pay_sometimes / minnows | 30727 |
| 2 | paid_long_ago / only_1_payment / minnows | 6671 |
| 3 | recently_paid / pay_sometimes / minnows | 6289 |
| 4 | recently_paid / pay_rarely / minnows | 4605 |
| 5 | paid_long_ago / pay_frequently / dolphins | 3530 |
| 6 | paid_long_ago / pay_sometimes / dolphins | 2767 |
| 7 | paid_a_while_ago / pay_sometimes / minnows | 783 |
| 8 | paid_long_ago / pay_frequently / minnows | 759 |
| 9 | paid_a_while_ago / pay_rarely / minnows | 580 |


In [17]:
total_count = rfm_count['Count'].sum()
rfm_count['Percentage'] = (rfm_count['Count'] / total_count) * 100
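
As an alternative sketch, the counts and percentages can be built in one step with `value_counts(normalize=True)`, which returns shares instead of counts. The labels below are made-up placeholders, not the notebook's segments.

```python
import pandas as pd

# Made-up combined labels standing in for rfm_table['RFM'].
rfm = pd.Series(["a / x", "a / x", "a / y", "b / x"], name="RFM")

summary = pd.DataFrame({
    "Count": rfm.value_counts(),
    # normalize=True yields fractions of the total; scale to percent.
    "Percentage": rfm.value_counts(normalize=True) * 100,
}).rename_axis("RFM").reset_index()

print(summary)
```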

In [18]:
rfm_count

| | RFM | Count | Percentage |
|---|---|---|---|
| 0 | paid_long_ago / pay_rarely / minnows | 37131 | 39.143773 |
| 1 | paid_long_ago / pay_sometimes / minnows | 30727 | 32.392629 |
| 2 | paid_long_ago / only_1_payment / minnows | 6671 | 7.032617 |
| 3 | recently_paid / pay_sometimes / minnows | 6289 | 6.62991 |
| 4 | recently_paid / pay_rarely / minnows | 4605 | 4.854625 |
| 5 | paid_long_ago / pay_frequently / dolphins | 3530 | 3.721352 |
| 6 | paid_long_ago / pay_sometimes / dolphins | 2767 | 2.916992 |
| 7 | paid_a_while_ago / pay_sometimes / minnows | 783 | 0.825444 |
| 8 | paid_long_ago / pay_frequently / minnows | 759 | 0.800143 |
| 9 | paid_a_while_ago / pay_rarely / minnows | 580 | 0.61144 |


### Conclusion

Based on the analysis data, the following conclusions and suggestions can be made:

The majority of customers (71.54%) paid long ago, pay only rarely or sometimes, and spend little (`paid_long_ago / pay_rarely / minnows` at 39.14% and `paid_long_ago / pay_sometimes / minnows` at 32.39%). Loyalty and motivation programs are needed to increase the activity of these customers.

4.85% of customers rarely make purchases but have recently made a payment. They need to be retained and encouraged to make repeat purchases.

7.03% of customers have made only one payment and are not active. You need to contact them and find out the reasons for the lack of activity.

Customers who make regular purchases and are active are the most valuable to a business and should be the target audience for retention efforts.

Customers who shop only occasionally may require additional marketing efforts to encourage repeat purchases and retention.

Customers who are recent and active purchasers may require additional efforts to retain and encourage repeat purchases.

A small percentage of customers (0.12%) shop frequently and are high payers, which is the most valuable segment for the business.

A small number of customers (0.02%) shop infrequently and are large payers, which may require additional efforts to stimulate their activity.

Based on these findings, it is proposed to develop an individual strategy for working with each customer segment, which will be aimed at increasing the number of repeat purchases, retaining customers and maximizing profits.

## Task 2 (Methodological)

Since February 5, Retention 1.3d for all users has decreased by 2 percentage points (Retention for other periods also fell).
#### Goal: determine the reasons for the drop in Retention. Develop a set of recommendations to solve the problem.

What does a good solution look like?
1. Formulate valuable hypotheses
2. Test hypotheses
3. Describe in the form of diagrams the process of testing hypotheses, possible test results and subsequent actions based on the results obtained.

### Formulation of hypotheses

1. **Game balance issues**: If players feel that the game is unfair and gives them no chance to win, they may quickly lose interest and leave the app.
2. **Performance issues**: If the app becomes slow or unstable, users may run into problems that prevent them from playing, and Retention drops.
3. **Insufficiently interesting content**: The content in the app may have become less interesting or varied, causing users to stop playing.
4. **Seasonality**: The decrease in Retention may be due to seasonal factors, for example vacation periods or national holidays.
5. **Lack of motivation**: If users do not receive enough incentives to continue playing (such as rewards and achievements), Retention can fall.
6. **Strong competition**: If competing games offer more interesting or varied gameplay, users may switch to them, which reduces Retention.

### Testing hypotheses

1. **Game balance issues**: Analyze game data such as win-loss statistics, difficulty level, and the average time players spend in the game. A/B testing can compare different versions of the game and show which changes affect Retention. Metrics: number of players completing each level, number of attempts per level, average time to complete a level.
2. **Performance issues**: Analyze the logs and collected application metrics to identify errors and performance problems. Compare the app's performance across devices and operating systems to see whether the drop is concentrated in a particular segment.
3. **Insufficiently interesting content**: Metrics: average time users spend in the game, engagement rate (the ratio of users interacting with the content to the total number of users), and retention rate (the percentage of users who remain active after viewing content). Also compare the behavior of users who make in-app purchases with those who do not. If users who make purchases spend more time in the app and use more features, this may indicate that they are enjoying the content.
4. **Seasonality**: Compare Retention with data from previous years to identify whether user behavior is seasonal. If it is, develop measures to retain users during seasonal periods. Metrics to compare across time periods: number of app downloads, number of active users, number of sessions (for example, fewer sessions during the holidays could indicate seasonality), and number of in-app purchases (a significant increase in purchases during a particular period may likewise indicate seasonality).
5. **Lack of motivation**: Metrics: number of sessions, time spent in the application, number of levels completed, amount of money spent on in-app purchases. A user survey can show which app features and functionality provide the most value and how to improve them.
6. **Strong competition**: Analyze the mobile games market and survey users to find out which competing games they prefer and why. A/B testing can evaluate how new features and changes to the game affect Retention. Metrics: number of app downloads, number of active users, time users spend in the app.
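
Several of the checks above come down to comparing Retention before and after February 5 within segments. A minimal sketch of that comparison follows. The table layout (`install_date`, `platform`, `returned_d1`), the year, and all values are hypothetical and only illustrate the shape of the calculation.

```python
import pandas as pd

# Hypothetical per-user table: install date, platform, and whether the user
# came back on day 1. The year 2023 is an arbitrary placeholder.
users = pd.DataFrame({
    "install_date": pd.to_datetime([
        "2023-02-01", "2023-02-02", "2023-02-03", "2023-02-04",
        "2023-02-06", "2023-02-07", "2023-02-08", "2023-02-09",
    ]),
    "platform": ["ios", "android"] * 4,
    "returned_d1": [True, True, True, False, True, False, False, False],
})

cutoff = pd.Timestamp("2023-02-05")
users["period"] = users["install_date"].ge(cutoff).map({False: "before", True: "after"})

# Share of returning users per platform and period.
retention = (
    users.groupby(["platform", "period"])["returned_d1"]
    .mean()
    .unstack("period")
)
# Change in percentage points between the two periods.
retention["change_pp"] = (retention["after"] - retention["before"]) * 100
print(retention)
```

If the drop is similar in every segment, a global cause such as seasonality or competition becomes more plausible. A drop concentrated in one platform or app version points toward performance or release-related hypotheses.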

### Describe in the form of diagrams the process of testing hypotheses

1. Formulate hypotheses based on the supposed reasons for the decline in Retention.

2. Identify metrics that can confirm or refute the hypotheses.

3. Collect data on selected metrics for the periods before and after the drop in Retention.

4. Analyze the data and run statistical tests to determine how significantly the metrics have changed.

5. For each hypothesis, make a decision based on the test results:
     - If the hypothesis is confirmed, take measures to address the underlying problem.
     - If the hypothesis is rejected, investigate further to find other possible reasons for the drop in Retention.
     - If the results are ambiguous, conduct additional research.
    
6. Develop a set of measures to eliminate the identified problems and test these measures.

7. Evaluate the effectiveness of the measures taken and continue to monitor metrics to detect additional problems.
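
Steps 4 and 5 can be sketched for a metric expressed as a share of users, such as Retention itself. The example below uses a two-proportion z-test built on the standard library. The counts, the significance level, and the "grey zone" threshold are hypothetical, and a significant change supports a hypothesis without proving its cause.

```python
import math


def two_proportion_z_test(retained_before, users_before, retained_after, users_after):
    """Return (z, two-sided p-value) for the difference between two retention rates."""
    rate_before = retained_before / users_before
    rate_after = retained_after / users_after
    pooled = (retained_before + retained_after) / (users_before + users_after)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / users_before + 1 / users_after))
    z = (rate_before - rate_after) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal
    return z, p_value


def next_step(p_value, alpha=0.05, grey_zone=0.20):
    """Map a p-value to the three outcomes of step 5 (thresholds are hypothetical)."""
    if p_value < alpha:
        return "confirmed: take measures"
    if p_value < grey_zone:
        return "ambiguous: conduct additional research"
    return "rejected: look for other causes"


# Made-up counts: 40% retention before, 38% after, 10,000 users in each period.
z, p = two_proportion_z_test(4000, 10000, 3800, 10000)
print(round(z, 2), next_step(p))
```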

## Task 3 (Methodological)

Today is February 18th. Since January 1, there has been a decrease in the Revenue indicator for the MTQ product.
#### Goal: determine the reasons for the fall in Revenue. Develop a set of recommendations to solve the problem.

What does a good solution look like?
1. Formulate valuable hypotheses
2. Test hypotheses
3. Describe in the form of diagrams the process of testing hypotheses, possible test results and subsequent actions based on the results obtained.

### Formulation of hypotheses

1. **Decreased Demand**: It is possible that demand for the MTQ product has decreased, resulting in fewer sales and lower overall revenue.
2. **Competition**: It is possible that other companies have offered more attractive alternatives, causing customers to switch to other products.
3. **Price Change**: It is possible that the price of the MTQ product has been changed, resulting in fewer sales and therefore lower revenue.
4. **Pricing Policy Changes**: The company may have reduced prices for the MTQ product, resulting in lower overall revenue even if the number of sales remained the same.
5. **Seasonality**: It is possible that the demand for the MTQ product depends on the time of year, and the decrease in revenue was caused by a seasonal decline.
6. **Advertising Campaigns**: It is possible that advertising campaigns for the MTQ product have been reduced, resulting in decreased interest in the product and therefore decreased revenue.

### Testing hypotheses

1. **Decreased Demand**: Analyze the number of sales of the MTQ product during the period and compare it with past periods. Also analyze the number of searches for the MTQ product on the company's website or other online platforms to determine whether the number of users interested in the product has changed. Metrics: number of sales of the MTQ product, number of visitors to the site, time spent on the site, share of the MTQ product in the company's total sales.
2. **Competition**: Analyze the market share of the MTQ product and its competitors, and examine changes in competitors' sales over the same period. Also analyze customer reviews of the MTQ product and its competitors to understand which products are most popular with customers. Metrics: competitor ratings in customer reviews, average price of the MTQ product compared to competitors, number of new competitors, number of returning customers.
3. **Price Change**: Analyze the price changes of the MTQ product and its competitors over the period. Compare the sales mix and revenue before and after the price change to determine how demand has changed. Metrics: change in the average price of the MTQ product, change in the number of sales after the price change, change in the share of the MTQ product in the company's total sales.
4. **Pricing Policy Changes**: Analyze changes in the company's pricing policy, including changes in the prices of other products, which may affect sales of the MTQ product. Also analyze the sales mix and revenue for the MTQ product and other company products to understand how changes in pricing policy affected revenue. Metrics: change in the average price of the MTQ product, change in the share of the MTQ product in the company's total sales, number of sales of the MTQ product before and after the change in pricing policy.
5. **Seasonality**: Analyze historical MTQ product revenue data and compare it with current-period revenue to determine whether the revenue decline is seasonal. Metrics: number of sales of the MTQ product by month, number of visitors to the site by month, assessment of customer needs in different seasons.
6. **Advertising Campaigns**: Analyze the volume of advertising campaigns for the MTQ product and compare it with previous periods. Also analyze the number of clicks to the company's website and to the MTQ product page to determine how the volume of advertising affects the number of users interested in the product. Metrics: number of clicks on advertisements, conversion of advertisements to MTQ product sales, advertising costs compared to total MTQ product sales.
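
The demand and price hypotheses can be separated by splitting the revenue change into a volume effect and a price effect, treating revenue as units sold times average price. This is one common way to write the split, shown here as a sketch with made-up numbers; it is not a method prescribed by the task.

```python
def decompose_revenue_change(units_before, price_before, units_after, price_after):
    """Split the change in revenue (units * price) into volume and price effects."""
    # Change caused by selling a different number of units at the old price.
    volume_effect = (units_after - units_before) * price_before
    # Change caused by the new price applied to the units actually sold afterwards.
    price_effect = (price_after - price_before) * units_after
    total_change = units_after * price_after - units_before * price_before
    return {
        "volume_effect": volume_effect,
        "price_effect": price_effect,
        "total_change": total_change,
    }


# Made-up example: 1000 units at 10.0 USD before, 900 units at 9.0 USD after.
result = decompose_revenue_change(1000, 10.0, 900, 9.0)
print(result)
```

The two effects always add up to the total change. A dominant volume effect points toward demand, competition, seasonality, or advertising. A dominant price effect points toward the pricing hypotheses.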

### Describe in the form of diagrams the process of testing hypotheses

1. **Data Collection**: It is necessary to collect sales data for the MTQ product, including the number of sales and revenue for the period from January 1 to the current date.
2. **Testing Hypotheses**: After collecting the data, test each of the hypotheses. This can be done using statistical methods such as analysis of variance, regression analysis, or significance tests on the selected metrics. Testing each hypothesis leads to one of three possible results: confirmation, refutation, or uncertainty.
3. **Analysis of results**: Based on the results of testing each hypothesis, conclusions can be drawn about the reasons for the decline in revenue. If the hypothesis is confirmed, determine measures to eliminate its negative impact. If the hypothesis is rejected, continue searching for the reasons for the decrease in revenue. If the test results are uncertain, conduct additional research to determine the reasons for the decrease in revenue.
4. **Development of an action plan**: Based on the analysis of the results, an action plan can be developed aimed at increasing the revenue of the MTQ product. The plan may include measures to improve product quality, increase marketing efforts, or change pricing policies.
5. **Implementing the Action Plan**: Once the action plan is developed, it is necessary to implement it. At the same time, it is necessary to monitor changes in sales of the MTQ product and evaluate the effectiveness of the measures implemented. If sales of the MTQ product begin to increase, then the action plan can be considered successful. If sales of the MTQ product continue to decline, then it is necessary to analyze the results of the implementation of the action plan and make adjustments.