STATUS: Completed
With only the raw data and no accompanying documentation, the project set out to answer the following questions:
- Who is the customer?
- What challenges does the customer experience in their business?
- What solutions could address the customer's business problems?
During the project, it became clear that the data is a sales report from the online marketplace Amazon. The customer's main pain point is retaining users on the service. We hypothesized that delivery delays are the main cause of churn, but the available data did not allow this hypothesis to be tested.
- ARPU (average revenue per user) peaked in January at 29.773664.
- The peak user activity occurred in January 2020, with 589 unique active users.
- There are 3819 unique customers in total: 2229 made a repeat purchase, while 1590 made a single purchase and did not return to the service during the reporting period.
- The service is experiencing significant user churn: about 42% of customers (1590 of 3819) never came back after their first purchase.
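The metrics above could be computed along the following lines. This is a minimal sketch, not the project's actual queries: the table layout (`customer_id`, `order_date`, `amount`) and the sample rows are hypothetical, ARPU is assumed to mean monthly revenue divided by monthly unique buyers, and a repeat customer is assumed to be one with more than one order. pandasql runs SQLite-syntax queries over DataFrames, so the sketch uses Python's built-in `sqlite3` module instead to stay free of external dependencies.

```python
import sqlite3

# Hypothetical sample orders: (customer_id, order_date, amount).
orders = [
    ("c1", "2020-01-05", 20.0),
    ("c1", "2020-02-10", 30.0),
    ("c2", "2020-01-07", 40.0),
    ("c3", "2020-02-15", 10.0),
]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (customer_id TEXT, order_date TEXT, amount REAL)")
con.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)

# Monthly unique active users and ARPU
# (assumed definition: monthly revenue / monthly unique buyers).
monthly = con.execute("""
    SELECT strftime('%Y-%m', order_date)              AS month,
           COUNT(DISTINCT customer_id)                AS active_users,
           SUM(amount) / COUNT(DISTINCT customer_id)  AS arpu
    FROM orders
    GROUP BY month
    ORDER BY month
""").fetchall()

# Total customers and how many of them placed more than one order.
total, repeat = con.execute("""
    SELECT COUNT(*), SUM(n_orders > 1)
    FROM (SELECT customer_id, COUNT(*) AS n_orders
          FROM orders
          GROUP BY customer_id)
""").fetchone()
one_time = total - repeat

for month, active_users, arpu in monthly:
    print(month, active_users, arpu)
print(f"total={total} repeat={repeat} one_time={one_time}")
```

The same SQL should carry over to pandasql with little change, since both use SQLite syntax.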
The results can help the customer make timely decisions to reduce churn on their service.
The data is noisy, with missing values and both explicit and implicit duplicates, so it was preprocessed first. The research questions were then answered by grouping the data along various dimensions and visualizing the results with several chart types.
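A preprocessing pass of this kind might look like the sketch below. The column names, sample rows, and cleaning choices (dropping rows without an amount, filling a missing city with "unknown") are assumptions for illustration; the project's actual rules are not documented here. Explicit duplicates are taken to be fully identical rows, and implicit duplicates to be the same value written with different casing or whitespace.

```python
import pandas as pd

# Hypothetical raw data with the three kinds of noise described above.
raw = pd.DataFrame({
    "order_id": ["A1", "A1", "A2", "A3", "A4"],
    "city": ["Seattle", "Seattle", " seattle ", "Austin", None],
    "amount": [10.0, 10.0, 25.0, None, 5.0],
})

clean = (
    raw
    # Explicit duplicates: drop rows that are identical in every column.
    .drop_duplicates()
    # Implicit duplicates: normalise spelling so "Seattle" and " seattle " match.
    .assign(city=lambda d: d["city"].str.strip().str.lower())
    # Missing values: drop rows without an amount (assumed rule) ...
    .dropna(subset=["amount"])
    # ... and label a missing city instead of dropping the row (assumed rule).
    .fillna({"city": "unknown"})
)

print(clean)
```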
- Python
- pandasql
- matplotlib.pyplot