## Recency, Frequency, Monetary (RFM) Analysis
  
https://www.techtarget.com/searchdatamanagement/definition/RFM-analysis  
https://www.investopedia.com/terms/r/rfm-recency-frequency-monetary-value.asp  

### Blood Donation Data Set
The data come from a blood donation marketing study by Chung-Hua University, Taiwan.  

https://archive.ics.uci.edu/ml/datasets/Blood+Transfusion+Service+Center

In [None]:
# Some libraries
import numpy as np
import pandas as pd

In [None]:
# We download the data from the UCI ML repository.  The url for the data:
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/blood-transfusion/transfusion.data"

# Download the data
Blood = pd.read_csv(url)

# Replace the original column names with simpler names
Blood.columns = ["Recency", "Frequency", "Monetary", "Time", "Donated"]
Blood

Each row represents a potential blood donor:
- Recency: Months since the donor's last donation
- Frequency: Total number of donations by the donor
- Monetary: Total volume of blood donated (c.c.)
- Time: Months since the donor's first donation
- Donated: Whether the donor gave blood in March 2007: yes (1) or no (0)
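
RFM analysis is commonly carried out by scoring each customer (here, donor) on each of the three dimensions and combining the scores. The sketch below illustrates that idea on a few made-up donors, not on the real data. The `toy` frame, the `score` helper, and the choice of three quantile bins are assumptions made for illustration only.

```python
import pandas as pd

# Toy donors with made-up values (not taken from the data set)
toy = pd.DataFrame({
    "Recency":   [2, 4, 11, 16, 23, 1],              # months since last donation
    "Frequency": [12, 6, 3, 2, 1, 24],               # number of donations
    "Monetary":  [3000, 1500, 750, 500, 250, 6000],  # c.c. of blood donated
})

def score(series, ascending):
    """Hypothetical helper: map a column to scores 1 (worst) to 3 (best)."""
    # Rank first so that tied values cannot produce duplicate bin edges
    ranks = series.rank(method="first", ascending=ascending)
    return pd.qcut(ranks, q=3, labels=[1, 2, 3]).astype(int)

# A recent donation (small Recency) should earn a high score, so rank descending
toy["R_score"] = score(toy["Recency"], ascending=False)
# More donations and more volume should earn a high score, so rank ascending
toy["F_score"] = score(toy["Frequency"], ascending=True)
toy["M_score"] = score(toy["Monetary"], ascending=True)

# One simple way to combine the three scores: add them up
toy["RFM"] = toy["R_score"] + toy["F_score"] + toy["M_score"]
print(toy)
```

Summing the scores is only one possible combination. Concatenating them into a three-digit code or weighting the dimensions differently are equally plausible choices.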

In [None]:
display(Blood.describe().round(2))

### Attempt to predict/classify a donation based on RFM

In [None]:
# Split data randomly into train and test sets
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(
    Blood.drop(columns='Donated'),  # features: Recency, Frequency, Monetary, Time
    Blood['Donated'],               # target: donated (1) or not (0)
    test_size = 0.2)

# Use logistic regression from scikit-learn
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier

# Create the classifier object
# (max_iter is raised from the default of 100 as a precaution against
# convergence warnings on unscaled features)
Model = LogisticRegression(max_iter=1000)
# Model = RandomForestClassifier()

# Train the model on the training data
Model.fit(X_train, y_train)

# Use the trained model to predict on the test inputs
y_pred = Model.predict(X_test)

# Compare predictions with actual test values
CorrectPredictions = np.sum(y_pred == y_test)
AllPredictions = y_test.shape[0]
Accuracy = CorrectPredictions/AllPredictions
print('Accuracy: ', Accuracy)
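
Accuracy alone does not show which kind of mistake the model makes. As a hedged sketch, scikit-learn's `accuracy_score` and `confusion_matrix` can be used for the same comparison. The labels below are made up for illustration; in the notebook, `y_test` and `y_pred` would take the place of the hypothetical `y_true` and `y_hat`.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Made-up labels for illustration (stand-ins for y_test and y_pred)
y_true = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 0])
y_hat = np.array([0, 0, 0, 0, 0, 1, 0, 0, 1, 0])

# Same calculation as the manual one: share of matching predictions
manual_accuracy = np.sum(y_hat == y_true) / y_true.shape[0]
library_accuracy = accuracy_score(y_true, y_hat)

# Rows are actual classes, columns are predicted classes
cm = confusion_matrix(y_true, y_hat)
tn, fp, fn, tp = cm.ravel()

print("Manual accuracy: ", manual_accuracy)
print("Library accuracy:", library_accuracy)
print("TN, FP, FN, TP:  ", tn, fp, fn, tp)
```

In this toy case most of the errors are missed donors (false negatives), which the accuracy figure by itself would hide.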

In [None]:
# Baseline: always predict "no donation" (0), so accuracy is the share of 0s in the test set
print("Baseline accuracy (always predict no donation):", 1 - y_test.mean())
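
The same kind of baseline can be obtained from scikit-learn's `DummyClassifier` with `strategy="most_frequent"`, which always predicts the majority class seen during fitting. The sketch below uses made-up labels (`X_toy`, `y_toy`) as an assumption for illustration. In the notebook, the model would be fitted on `X_train, y_train` and scored on `X_test, y_test`, so the result could differ slightly from the calculation above if the class balance differs between the two sets.

```python
import numpy as np
from sklearn.dummy import DummyClassifier

# Made-up, imbalanced labels: 7 non-donors and 3 donors
X_toy = np.zeros((10, 1))  # feature values are ignored by the dummy model
y_toy = np.array([0, 0, 0, 0, 0, 0, 0, 1, 1, 1])

# Always predict the most frequent class seen during fit (here: 0)
baseline = DummyClassifier(strategy="most_frequent")
baseline.fit(X_toy, y_toy)

baseline_accuracy = baseline.score(X_toy, y_toy)
print("Baseline accuracy:", baseline_accuracy)
```

A classifier is only useful here if its test accuracy is clearly above this baseline.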