**AZ Watch** is a popular video streaming platform specializing in educational content, where creators publish online video tutorials and lessons on any topic, from speaking a new language to cooking to playing a musical instrument.

Their next goal is to use AI-driven solutions to analyze their subscribers, make predictions about them, and improve their marketing strategy for attracting new subscribers and retaining current ones. This project uses machine learning to predict which subscribers are likely to churn and to find customer segments. The results may help AZ Watch find interesting usage patterns for building subscriber personas in future marketing plans!

![Woman working on multiple screens](marketinganalytics.jpg)


The `data/AZWatch_subscribers.csv` **dataset** contains information about subscribers and their status over the last year:

|Column name|Description|
|-----------|-----------|
|`subscriber_id`|The unique identifier of each subscriber user|
|`age_group`|The subscriber's age group|
|`engagement_time`|Average time (in minutes) spent by the subscriber per session|
|`engagement_frequency`|Average weekly number of times the subscriber logged in to the platform (sessions) over a one-year period|
|`subscription_status`|Whether the user remained subscribed to the platform at the end of the one-year period (subscribed), or unsubscribed and terminated their services (churned)|

Carefully observe and analyze the features in the dataset, and ask yourself whether any **categorical attributes** require pre-processing.
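One common way to pre-process a categorical attribute such as `age_group` is one-hot encoding. The sketch below is only an illustration: it uses a small made-up DataFrame (the age-group labels and values are assumptions, not taken from the real dataset) and `pandas.get_dummies`.

```python
import pandas as pd

# Toy data; the age-group labels here are hypothetical, not from the real CSV
toy = pd.DataFrame({
    "age_group": ["18-34", "35 and over", "under 18", "18-34"],
    "engagement_time": [5.5, 3.2, 8.1, 4.0],
})

# One-hot encode the categorical column; numeric columns pass through unchanged
encoded = pd.get_dummies(toy, columns=["age_group"])

print(encoded.columns.tolist())
```

Each distinct category becomes its own indicator column, which is a form most scikit-learn estimators can consume directly.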

The subscribers dataset from the `data/AZWatch_subscribers.csv` file is already loaded and split into training and test sets for you in the first code cell below.

Define a function called `format_date()`, which formats a timestamp into a readable datetime string.

- It must accept two parameters: `timestamp`, the Unix timestamp as an integer, and `datetime_format`, a string specifying the desired date format.
- The function should return the date correctly formatted as a string.
- For example, calling `format_date(1514665153, "%d-%m-%Y")` should return `"30-12-2017"`.
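Note that `datetime.fromtimestamp()` interprets the timestamp in the machine's local time zone unless a `tz` argument is given, so the date produced for a timestamp close to midnight can differ between machines. As a minimal sketch, assuming UTC is the intended zone (the task does not name one), passing `tz=timezone.utc` makes the conversion reproducible:

```python
from datetime import datetime, timezone

ts = 1514665153  # timestamp from the example above

# Pin the conversion to UTC (an assumption; the task does not name a time zone)
utc_dt = datetime.fromtimestamp(ts, tz=timezone.utc)

print(utc_dt.strftime("%d-%m-%Y"))           # 30-12-2017
print(utc_dt.strftime("%Y-%m-%d %H:%M:%S"))  # 2017-12-30 20:19:13
```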

Define a function called `calculate_landing_time()`, which calculates the estimated landing time.

- It must accept two parameters: `rocket_launch_dt`, the rocket launch datetime object, and `travel_duration`, the expected travel time in days as an integer.
- The function should return the estimated Mars landing time as a datetime string in the format DD-MM-YYYY.
- For example, calling `calculate_landing_time(datetime(2023, 2, 15), 20)` should return `"07-03-2023"`.
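Adding a `timedelta` to a `datetime` handles month boundaries automatically, which is what the example above relies on: February 2023 has 28 days, so 20 days after 15 February is 7 March. A quick check using only the standard library:

```python
from datetime import datetime, timedelta

launch = datetime(2023, 2, 15)

# Adding a timedelta rolls over the end of February without extra logic
landing = launch + timedelta(days=20)

print(landing.strftime("%d-%m-%Y"))  # 07-03-2023
```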
  
Define a function named `days_until_delivery()`, which calculates the number of days until a customer's package arrives.

- It must accept two parameters: `expected_delivery_dt`, the estimated delivery date of the package as a datetime object, and `current_dt`, the current date as a datetime object.
- The function should calculate the difference in days between the expected delivery datetime and the current datetime, then return the number of days remaining as an integer.
- For example, calling `days_until_delivery(datetime(2023, 2, 15), datetime(2023, 2, 5))` should return `10`.
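Subtracting one `datetime` from another yields a `timedelta`, whose `.days` attribute holds the whole-day part of the difference. The sketch below illustrates two edge cases that may be worth keeping in mind: a partial day is not counted, and the value turns negative once the expected delivery date has passed.

```python
from datetime import datetime

# Whole-day difference, matching the example above
delta = datetime(2023, 2, 15) - datetime(2023, 2, 5)
print(delta.days)  # 10

# Partial days are dropped: 9 days and 12 hours remaining gives 9
partial = datetime(2023, 2, 15) - datetime(2023, 2, 5, 12, 0)
print(partial.days)  # 9

# A delivery date in the past gives a negative count
overdue = datetime(2023, 2, 5) - datetime(2023, 2, 15)
print(overdue.days)  # -10
```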

In [1]:
# Import the necessary modules
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.cluster import KMeans
import seaborn as sns
from matplotlib import pyplot as plt

# Specify the file path of your CSV file
file_path = "data/AZWatch_subscribers.csv"

# Read the CSV file into a DataFrame
df = pd.read_csv(file_path)

# Separate predictor variables from class label
X = df.drop(['subscriber_id','subscription_status'], axis=1)
y = df.subscription_status

# Split into training and test sets (20% test)
X_train, X_test, y_train, y_test = train_test_split(
                        X, y, test_size=.2, random_state=42)

In [None]:
from datetime import datetime, timedelta

def format_date(timestamp, datetime_format):
    datetime_obj = datetime.fromtimestamp(timestamp)
    datetime_str = datetime_obj.strftime(datetime_format)
    return datetime_str


def calculate_landing_time(rocket_launch_dt, travel_duration):
    landing_date = rocket_launch_dt + timedelta(days=travel_duration)
    landing_date_string = landing_date.strftime("%d-%m-%Y")
    return landing_date_string

def days_until_delivery(expected_delivery_dt, current_dt):
    time_until_delivery = expected_delivery_dt - current_dt
    days_until = time_until_delivery.days
    return days_until

# Solution

In [None]:
from datetime import datetime, timedelta

# Define format_date function accepting timestamp and datetime format args
def format_date(timestamp, datetime_format):
    # Convert timestamp arg to datetime object and save result in new variable
    datetime_obj = datetime.fromtimestamp(timestamp)  
    # Format datetime_obj to string using the datetime_format arg
    datetime_str = datetime_obj.strftime(datetime_format)
    # Return formatted datetime string
    return datetime_str

# Define calculate_landing_time function accepting launch datetime and duration
def calculate_landing_time(rocket_launch_dt, travel_duration):
    # Calculate landing by adding travel_duration to rocket_launch_dt arg and save result in new variable
    landing_date = rocket_launch_dt + timedelta(days=travel_duration)
    # Format landing datetime to string in specified format
    landing_date_string = landing_date.strftime("%d-%m-%Y") 
    # Return landing date time string 
    return landing_date_string

# Define days_until_delivery function accepting expected and current datetimes 
def days_until_delivery(expected_delivery_dt, current_dt):
    # Calculate the time until delivery by subtracting current_dt arg from the expected_delivery_dt arg 
    time_until_delivery = expected_delivery_dt - current_dt
    # Read the number of whole days from the resulting timedelta object
    days_until = time_until_delivery.days
    # Return number of days until delivery
    return days_until