# INTRODUCTION

A recent regression analysis of appointment trends revealed a significant drop in the rate of MH infusion appointments starting from mid-2023, with the decline becoming more pronounced by early 2024. This observation raised concerns about the potential causes of the trend, particularly whether pricing changes might have contributed to patients discontinuing their services.

The primary goal of this project was to investigate the relationship between payments and patient retention to determine if changes in cost played a role in patient churn. By analyzing patient payment data and appointment records, we aimed to identify patterns that could explain the steep decline in MH infusion rates and provide actionable insights.

**Objectives**

This project sought to address the following specific questions:

- Did the cost of services directly contribute to patient churn?
- What was the rate of patient retention over time, and how did it vary across different patient groups?
- Was the increase in service costs the primary driver of patient churn?

The findings from this analysis will help uncover the underlying factors influencing patient behavior and provide recommendations for addressing any pricing-related issues that may be affecting retention. Additionally, the insights will guide further exploration into broader economic and operational factors impacting service delivery and patient satisfaction.

# EXPLORATORY DATA ANALYSIS

## Data Preparation

Before diving into the analysis, the data underwent some careful cleaning and organization to ensure accuracy. Here’s what was done:

1. **Removed an irrelevant record:** The very first record was inaccurate and unnecessary, so it was excluded from the analysis.
2. **Standardized appointment statuses:** To keep things consistent, all "Made" appointments were categorized as "Seen," focusing only on successful appointments for analysis.
3. **Cleaned up patient IDs:** Any extra spaces in chart numbers were removed, and all IDs were converted to a consistent format for easy comparison between the appointments and patient lifetime datasets.
4. **Formatted dates:** Dates across the datasets were standardized into a recognizable format for smoother processing.

With these steps, the data was ready for the next stage: analyzing patient payments over time. Additional adjustments may be made as needed during the analysis process.
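The four preparation steps can be sketched roughly as follows. This is a minimal illustration and not the code used in the project: the field names `status`, `chart_number`, and `date`, as well as the `%m/%d/%Y` date format, are assumptions made for the example.

```python
from datetime import datetime


def clean_appointments(rows, date_format="%m/%d/%Y"):
    """Apply the four preparation steps to a list of appointment dicts.

    Field names and the date format are assumed for illustration only.
    """
    cleaned = []
    for row in rows[1:]:  # step 1: skip the very first (inaccurate) record
        row = dict(row)  # copy so the caller's data is not mutated
        # step 2: treat "Made" appointments as "Seen"
        if row["status"] == "Made":
            row["status"] = "Seen"
        # step 3: strip stray spaces and store chart numbers as strings
        row["chart_number"] = str(row["chart_number"]).strip()
        # step 4: parse the date text into a date object
        row["date"] = datetime.strptime(row["date"], date_format).date()
        cleaned.append(row)
    return cleaned
```

Storing chart numbers as stripped strings is one way to make them comparable across the appointments and patient lifetime datasets; any consistent type would serve the same purpose.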

## Data Analysis and Visualization

**Ideas for Analysis**

To uncover meaningful insights, the data was segmented into the following patient groups:

- **Patients Before Mid-2023:** Those who concluded their services before mid-2023 (`patients_before_mid2023`).
- **Late 2023 Patients:** Those who began their services after mid-2023 but concluded before 2024 (`patients_late_2023`).
- **All-Time Patients:** Those whose services spanned from before mid-2023 and continued into 2024 (`all_time_patients`).
- **2024 Patients:** Those whose services occurred exclusively in 2024 (`patients_2024`). This group is especially crucial as the MH infusion drop was observed to steepen in 2024.
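One way to express this segmentation in code is sketched below, based on each patient's first and last appointment dates. The exact cutoff for "mid-2023" is not stated in this report, so 1 July 2023 is an assumption, and the function name `assign_group` is hypothetical.

```python
from datetime import date

MID_2023 = date(2023, 7, 1)    # assumed cutoff for "mid-2023"
START_2024 = date(2024, 1, 1)


def assign_group(first_visit, last_visit):
    """Map a patient's first and last visit dates to one of the four groups."""
    if last_visit < MID_2023:
        return "patients_before_mid2023"   # concluded before mid-2023
    if first_visit >= START_2024:
        return "patients_2024"             # services exclusively in 2024
    if first_visit >= MID_2023 and last_visit < START_2024:
        return "patients_late_2023"        # began after mid-2023, ended before 2024
    if first_visit < MID_2023 and last_visit >= START_2024:
        return "all_time_patients"         # spans before mid-2023 into 2024
    return None  # not covered by the four definitions above
```

Under these definitions some patients fall outside all four groups, for example someone who started before mid-2023 and stopped in late 2023. The sketch returns `None` for them so they can be counted and reviewed separately.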

The analysis aims to explore the following areas:

- The role of insurance and its offerings to patients.
- The proportion of expenses covered by insurance versus the amount paid out-of-pocket by patients.
- The total cost incurred by patients for their services.
- Pay-per-visit trends, focusing specifically on successful appointments.

By breaking down the data in this way, the goal is to identify patterns and uncover any potential links between costs and patient behavior.

### Insurance Analysis 

Upon reviewing the insurance data, it was found that approximately 80% of the entries were missing. Due to this significant gap, a meaningful analysis of insurance contributions was not possible.

As a result, the focus has shifted to analyzing the amount paid directly by each patient, which provides a clearer and more complete picture of the financial aspect of the services.

### Payment Analysis

This phase focused on uncovering meaningful insights into the amounts paid by patients, looking for trends, patterns, and any potential impact on customer retention.

The goal was to understand if payment structures or costs may have contributed to the changes observed in MH infusion rates. 

To ensure a clear approach, a few assumptions were made:

- All services, including MH infusion, were charged at the same rate.
- Canceled appointments were not invoiced, so they are excluded from the payment analysis.


An inspection of the total payment variable in the PLP data set showed that only 3% of the patient payment data was missing, which allowed the analysis to proceed.
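Completeness checks such as the ones reported for the insurance data (about 80% missing) and the total payment variable (3% missing) can be computed with a small helper like the one below. This is only a sketch: the helper name `missing_share` and the field name used in the usage note are assumptions, as is treating blank strings as missing.

```python
def missing_share(records, field):
    """Return the fraction of records where `field` is absent, None, or blank."""
    if not records:
        return 0.0
    missing = sum(
        1
        for record in records
        if record.get(field) is None or str(record.get(field)).strip() == ""
    )
    return missing / len(records)
```

Calling `missing_share(patients, "total_payment")` on a list of patient dicts would give the share to compare against whatever threshold is considered acceptable for the analysis.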

**Distribution of payments**

A distribution in data refers to how values are spread or arranged within a dataset. It provides insight into the frequency or likelihood of different outcomes or measurements occurring.

In this analysis, we used a distribution plot (a blend of a histogram and a density plot) to show the distribution of total payment per patient in each category.

![](../images/total_payment_histogram.png)

From the distribution plot above, we can draw the following conclusions:

* While all categories exhibit a similar shape of distribution, all-time patients (those who have been receiving services since before mid-2023 and continued through 2024) recorded the highest total payments for their services.
* The majority of `all-time patients` made total payments ranging between $10,000 and $20,000.
* Patients in all other categories predominantly paid below $5,000 for their services.


The dataset goes back as far as 2018. It can therefore be inferred that `all-time patients` and those in the `before mid-2023` category had longer service periods (up to five years), while patients in the `late 2023` and `2024` categories had much shorter ones, approximately six months to one year. This disparity in service duration could explain why `all-time patients` had significantly higher total payments than the more recent patient categories. It also motivated a closer look at payment patterns through pay per visit, counting successful appointments only.

The following steps will be taken to further investigate the issue of payments and uncover differences between the categories:

* **Preprocess the data** to determine the actual number of visits for each patient.
* **Calculate the pay-per-appointment** for each patient to gain a clearer understanding of individual payment patterns.
* **Analyze the distribution** of pay-per-appointment values across all categories to identify any notable differences.

<span style="color: orange;">**Note:**</span> It is assumed that each successful appointment was billed at a uniform rate.

### Breakdown to pay per visit

Before calculating pay per visit, the following preprocessing steps were performed:

- Selected all successful appointments to ensure only valid data is included.
- Aggregated the total count of successful appointments for each patient.
- Calculated the pay per visit for each patient by dividing the total payment by the number of successful appointments.

This process ensures that the pay per visit is accurately calculated for each patient.
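The calculation described above can be sketched as follows. The field names `chart_number` and `status`, and the shape of `total_payments` (a mapping from chart number to total amount paid), are assumptions for illustration rather than the project's actual schema.

```python
from collections import Counter


def pay_per_visit(appointments, total_payments):
    """Divide each patient's total payment by their count of "Seen" appointments."""
    # Count successful appointments per patient
    seen_counts = Counter(
        appt["chart_number"] for appt in appointments if appt["status"] == "Seen"
    )
    result = {}
    for chart_number, total in total_payments.items():
        visits = seen_counts.get(chart_number, 0)
        # Skip patients with a missing payment or no successful visits,
        # which would otherwise cause an error or a division by zero
        if total is None or visits == 0:
            continue
        result[chart_number] = total / visits
    return result
```

Patients without any successful appointment are left out rather than assigned a value, since pay per visit is undefined for them.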

The chart below shows the distribution of pay per visit for patients in each category.

![](../images/pay_per_visit_histogram.png)

The histograms above illustrate the distribution of pay per visit for all categories. Each category is roughly bell-shaped with a slight left skew. Patients in the "before mid-2023" category paid the least per visit, while patients in the "late 2023" and "2024" categories generally paid more than $600 per visit.

### Summary Statistics for the payments 

To gain a clearer understanding of the payment trends across the categories, both the median and the mean (average) pay per visit were calculated for each group.

![](../images/summary_bar_chart.png)


As evident from the bar chart, `2024 patients` had the highest pay per visit, followed by `late-2023 patients`. `Before mid-2023 patients` paid the least per visit, while `all-time patients` fell in the middle range. This can be attributed to the fact that `all-time patients` experienced both lower initial expenses and the impact of rising prices over time.
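For reference, a per-group mean and median like those in the bar chart can be computed along these lines. This is a sketch under assumptions: `pay_per_visit` is taken to be a mapping from chart number to pay per visit, and `groups` a mapping from chart number to category name; both structures are hypothetical.

```python
from statistics import mean, median


def summarize_by_group(pay_per_visit, groups):
    """Return the patient count, mean, and median pay per visit for each group."""
    values_by_group = {}
    for chart_number, value in pay_per_visit.items():
        group = groups.get(chart_number)
        if group is not None:  # ignore patients without an assigned group
            values_by_group.setdefault(group, []).append(value)
    return {
        group: {"n": len(values), "mean": mean(values), "median": median(values)}
        for group, values in values_by_group.items()
    }
```

Reporting both statistics is useful here because the distributions are skewed: a noticeable gap between the mean and the median within a group indicates that a few unusually high or low payers are pulling the average.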

# CONCLUSION 

This analysis offers a detailed look at the changes in payments charged to different categories of patients over time, revealing important trends and insights. After careful examination of the data, the following findings were made:

* Patients who visited before mid-2023 tended to pay the least per visit, with their payment amounts generally staying within a lower range.
* Patients in the late 2023 and 2024 categories paid significantly higher amounts per visit. This indicates a clear shift in pricing over time.
* A gradual increase in prices was observed starting from mid-2023, signaling potential changes in pricing policies or external factors that affected the cost of services.

Given these findings, here are a few key areas I recommend exploring further:

- Are there any hidden or additional charges that may have been introduced to patients, which might explain the increase in payments?
- Could inflation have played a role in driving up the cost of services for patients in the later periods?
- Were there any internal factors, such as the provision of services becoming more expensive, that contributed to the rise in payments?

I also suggest investigating why 188 patients (`all-time patients`) continued to undergo treatments despite the rising costs. There could be something unique about these patients or their situations that warrants further exploration.

While I have worked diligently to provide a comprehensive analysis, there were some challenges along the way. One key issue was the inability to obtain specific payment data for MH Infusion, which meant I had to assume it was priced similarly to other services in the analysis. This assumption could affect the accuracy of the findings, especially if the payment structure for MH Infusion differs.

Moving forward, I will continue to refine the analysis, particularly around the MH Infusion data, and will keep you updated on any new insights that emerge. I am confident that this analysis, with a bit more data, will provide even deeper insights into the trends shaping the patient payment experience.