The overall philosophy here is as follows. 

Functions calculating exact values of metrics validate only business constraints. For example, Total Measured Ad Impressions for CTR cannot be lower than Total Measured Clicks because an ad cannot be clicked more often than it was shown. For the same reason these values cannot be negative. Similar checks guard against division by zero. However, these functions assume that their input already has the correct type and do not perform type casting. In addition, the output is never rounded, because precision can sometimes matter. If rounding is required, it can be performed when printing the results.
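
As a small illustration of deferring rounding to presentation time, the sketch below keeps the full-precision value and rounds only inside the format string. The variable names and numbers are made up for the example.

```python
# Keep the exact value; round only when formatting for display.
# The inputs are illustrative and not taken from the assignment.
clicks, impressions = 2, 3
ctr = 100 * clicks / impressions   # full precision is preserved here

formatted = f"{ctr:.2f}"           # rounding happens only at print time
print(formatted)                   # 66.67
print(ctr == float(formatted))     # False: the stored value is still unrounded
```

The same `:.2f` format specifier is what the final results table relies on.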

User input is consumed and type-validated in the last section of the assignment. There, the program checks whether the input can be parsed and casts it to the correct types before passing it to the target metric functions. In the user-interaction section I cast all values that can be either integers or floats to floats: any string that parses as an integer also parses as a float, but not vice versa (a float can be converted to an integer, but the fractional part is lost). The metric functions are nevertheless designed to accept both types.
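
The asymmetry between the two conversions can be checked directly. This is only a sketch with illustrative strings; the variable names are not part of the assignment.

```python
# Every integer literal also parses as a float...
as_float = float("3")              # 3.0

# ...but a decimal literal does not parse as an integer.
try:
    int("3.5")
    parsed_as_int = True
except ValueError:
    parsed_as_int = False

# Converting through float first works, but drops the fractional part.
truncated = int(float("3.5"))      # 3

print(as_float, parsed_as_int, truncated)
```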

Validating formatting and business constraints separately might not seem user-friendly. For example, when a user has to enter multiple values and one of them is invalid, they might have to start all over again. However, I think that single responsibility is more important: validation functions know only how to check the format of the input, and metric functions know only their logical constraints. Consequently, the user is responsible for any inconsistencies in their data.

## User Input Validations

The functions in this section perform basic validation of the user input and check whether the provided input can be parsed into the needed data types.

In [1]:
def is_integer(value: str) -> bool:
    """
    Checks whether a string can be parsed into an integer.
    
    Args:
        value (str) -- the string to validate
    
    Returns:
        bool: True if the string can be parsed into an integer, 
        False otherwise
    """
    try:
        int(value)
        return True
    except ValueError:
        return False

In [2]:
def is_float(value: str) -> bool:
    """
    Checks whether a string can be parsed into a float.
    
    Args:
        value (str) -- the string to validate
    
    Returns:
        bool: True if the string can be parsed into a float, 
        False otherwise
    """
    try:
        float(value)
        return True
    except ValueError:
        return False
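
It may help to see which strings these two validators actually accept, since `int()` and `float()` are more permissive than one might expect. The sketch below repeats both validators so that it runs on its own; `is_finite_float` is a hypothetical stricter variant that is not part of the original notebook.

```python
import math

def is_integer(value: str) -> bool:
    """Copy of the validator above, repeated so the sketch is self-contained."""
    try:
        int(value)
        return True
    except ValueError:
        return False

def is_float(value: str) -> bool:
    """Copy of the validator above, repeated so the sketch is self-contained."""
    try:
        float(value)
        return True
    except ValueError:
        return False

def is_finite_float(value: str) -> bool:
    """Hypothetical stricter check (an assumption): also rejects nan and inf."""
    return is_float(value) and math.isfinite(float(value))

# Columns: input, is_integer, is_float, is_finite_float
for text in ["42", " 42 ", "1_000", "4.0", "1e3", "nan", "inf", ""]:
    print(repr(text), is_integer(text), is_float(text), is_finite_float(text))
```

Surrounding whitespace and digit-group underscores are accepted by both parsers, and `nan` and `inf` are valid floats. Because every comparison with NaN is false, a `nan` value would also pass the `<= 0` and `< 0` checks in the metric functions, which is one reason a stricter check might be worth considering.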

In [3]:
def is_integer_list(values: list[str]) -> bool:
    """
    Checks whether each of the given string values 
    can be parsed into an integer.
    
    Args:
        values (list[str]) -- the list of strings to validate
    
    Returns:
        bool: True if the strings can be parsed into integers, 
        False otherwise
    """
    for value in values:
        if not is_integer(value):
            return False
    return True
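
One edge case is worth noting: an empty list passes this check vacuously. The sketch below uses an equivalent `all()` formulation (a stylistic alternative, not the notebook's own code) to show this with the kind of input that `input().split()` produces.

```python
def is_integer(value: str) -> bool:
    """Copy of the validator above, repeated so the sketch is self-contained."""
    try:
        int(value)
        return True
    except ValueError:
        return False

def is_integer_list(values: list[str]) -> bool:
    # Equivalent to the explicit loop: stops at the first invalid item.
    return all(is_integer(value) for value in values)

print(is_integer_list("1 2 3".split()))    # True
print(is_integer_list("1 two 3".split()))  # False
print(is_integer_list("".split()))         # True: "".split() is [], which passes vacuously
```

This is why the list-based metric functions check for an empty list themselves rather than relying on format validation.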

## Utility

This section declares utility functions to parse user input or calculate intermediate values.

In [4]:
def parse_integer_list(values: list[str]) -> list[int]:
    """
    Parses given string values into a list of integers.
    The function assumes that all values in the given list 
    can be parsed into integers and does not additionally 
    validate the input.
    
    Args:
        values (list[str]) -- the list of strings to parse
    
    Returns:
        list[int]: the parsed integers
    """
    return list(map(int, values))

In [5]:
def average_list_value(values: list[int]) -> float:
    """
    Finds the average of the values in the given list.
    The function assumes that the list is not empty
    and all values in the given list are integers. 
    It does not additionally validate the input.
    
    Args:
        values (list[int]) -- a list of integers
    
    Returns:
        float: the average of the values in the given list
    """
    return sum(values) / len(values)
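
The non-empty assumption matters because the division fails on an empty list. The sketch below repeats the function, compares it with the standard library's `statistics.fmean` on illustrative values, and shows the failure that callers are expected to prevent.

```python
from statistics import fmean

def average_list_value(values: list[int]) -> float:
    """Copy of the helper above, repeated so the sketch is self-contained."""
    return sum(values) / len(values)

values = [23, 56, 6, 6, 8]
print(average_list_value(values))  # 19.8
print(fmean(values))               # same mean from the standard library

# An empty list is the caller's responsibility: the helper itself fails.
try:
    average_list_value([])
    empty_list_failed = False
except ZeroDivisionError:
    empty_list_failed = True
print(empty_list_failed)           # True
```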

## Metrics

### Main

The functions in this section are used to calculate the following metrics:


* **Click-Through Rate (CTR)** is calculated as $$\text{CTR} = \frac{\text{Total Measured Clicks}}{\text{Total Measured Ad Impressions}} \times 100$$ where "total measured clicks" is the total amount of clicks on an ad; "total measured ad impressions" is the number of times an ad was loaded on a page. Click-through rates measure how successful an ad has been in capturing users' attention. The higher the click-through rate, the more successful the ad has been in generating interest.

* **Return on Investment (ROI)** is calculated as $$\text{ROI} = \frac{(\text{Amount Gained} - \text{Amount Spent})}{\text{Amount Spent}} \times 100$$ where "amount gained" is the income generated by an investment; "amount spent" is the total amount spent on the investment. ROI is the amount of money you get back relative to the amount of money you put into something. It differs from profit, which is simply the amount earned minus the amount spent: ROI goes a step further and expresses profit per unit of money spent. This answers the question "How much profit can I earn per pound/dollar/euro etc. spent?"

* **Average Page Time** is calculated as $$\text{Average Page Time} = \frac{\sum{\text{Time Spent on a Page by a User}}}{\text{Number of Users}}$$ where "time spent on a page by a user" is the time measured for each user who visits a webpage; "number of users" is the number of users who visit a webpage. Keep in mind that users who spend less than 5 seconds on a webpage are usually not included in the calculations. The per-user times are passed to the function as a Python sequence (a list).

* **Customer Lifetime Value (CLV)** is calculated as $$\text{CLV} = \text{Average Purchase Value} \times \text{Average Purchase Frequency} \times \text{Average Customer Lifespan}$$ and used to predict how much revenue a customer will drive over time. The formula differs from the assignment. [Source](https://www.netsuite.com/portal/resource/articles/ecommerce/customer-lifetime-value-clv.shtml).

* **Conversion Rate (CR)** is calculated as $$\text{CR} = \frac{\text{Total Attributed Conversion}}{\text{Total Measured Clicks}} \times 100$$ where "total attributed conversion" is the total number of recorded conversions that were caused by clicks; "total measured clicks" is the number of times an ad was clicked on.
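
To make the formulas above concrete, here is the plain arithmetic for a few of them. All input numbers are invented for illustration and do not come from the assignment.

```python
# Illustrative inputs only.
clicks, impressions = 30, 50
gained, spent = 150.0, 100.0
conversions = 6
purchase_value, purchase_frequency, customer_lifespan = 40.0, 3, 5

ctr = 100 * clicks / impressions                                # 60.0
roi = 100 * (gained - spent) / spent                            # 50.0
cr = 100 * conversions / clicks                                 # 20.0
clv = purchase_value * purchase_frequency * customer_lifespan   # 600.0

print(ctr, roi, cr, clv)
```

CTR, ROI and CR are percentages, whereas CLV is expressed in the currency of the purchase value.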

In [6]:
def click_through_rate(total_measured_clicks: int, total_measured_ad_impressions: int) -> float:
    """
    Calculates Click-Through Rate (CTR). Click-through rates measure 
    how successful an ad has been in capturing users' attention. 
    The higher the click-through rate, the more successful the ad has been 
    in generating interest.
    
    CTR = (Total Measured Clicks / Total Measured Ad Impressions) * 100
        
    The function assumes that both of the values are integers 
    and does not additionally validate their type. 
    
    The function checks that Total Measured Ad Impressions is greater than zero,
    Total Measured Clicks is non-negative, and Total Measured Ad Impressions 
    is not less than Total Measured Clicks. If any of these conditions is violated, 
    a message is shown and None value is returned.
    
    Args:
        total_measured_clicks (int) -- Total Measured Clicks
        total_measured_ad_impressions (int) -- Total Measured Ad Impressions
    
    Returns:
        float: the calculated CTR
    """
    if total_measured_ad_impressions <= 0:
        print("The number of total measured ad impressions must be greater than zero.")
        return
    if total_measured_clicks < 0:
        print("The number of total measured clicks must be greater than or equal to zero.")
        return
    if total_measured_ad_impressions < total_measured_clicks:
        print("The number of total measured ad impressions must be greater than or equal to the number of total measured clicks.")
        return
    return 100 * total_measured_clicks / total_measured_ad_impressions

In [7]:
def return_on_investment(amount_gained: int | float, amount_spent: int | float) -> float:
    """
    Calculates Return on Investment (ROI). ROI measures the amount of money 
    gained back relative to the amount of money put into something.
    
    ROI = 100 * (Amount Gained – Amount Spent) / Amount Spent
    
    The function assumes that both of the values are integers or floats 
    and does not additionally validate their type.
    
    The function checks that Amount Spent is greater than zero,
    and Amount Gained is greater than or equal to zero.
    If this is false, then a message is shown and None value is returned.
    
    Args:
        amount_gained (int or float) -- Amount Gained
        amount_spent (int or float) -- Amount Spent
    
    Returns:
        float: the calculated ROI
    """
    if amount_spent <= 0:
        print("Amount spent must be greater than zero.")
        return
    if amount_gained < 0:
        print("Amount gained must be greater than or equal to zero.")
        return
    return 100 * (amount_gained - amount_spent) / amount_spent

In [8]:
def average_page_time(users: list[int]) -> float:
    """
    Calculates Average Page Time. The average page time metric shows 
    how much time is spent on your website page.
    
    Average Page Time = Σ(Time Spent on a Page by a User) / Number of Users
    
    The function assumes that the given list contains only integers 
    and does not additionally validate their type.
    
    The function checks that the list is not empty.
    Users who spent 5 seconds or less on the webpage are excluded from the calculations.
    If the list is empty, or no users remain after filtering, a message is shown and None value is returned.
    
    Args:
        users (list[int]) -- a list of data on how many seconds users spend on a website page
    
    Returns:
        float: the calculated Average Page Time
    """
    if len(users) == 0:
        print("The list of user data cannot be empty.")
        return
    users = [user for user in users if user > 5]
    if len(users) == 0:
        print("There are no users who spent more than 5 seconds on the webpage. Nothing to process.")
        return
    return average_list_value(users)

Please note that the formula of CLV was changed compared to the assignment. [Source](https://www.netsuite.com/portal/resource/articles/ecommerce/customer-lifetime-value-clv.shtml).

In [9]:
def customer_lifetime_value(average_purchase_value: int | float, 
                            average_purchase_frequency: int | float, 
                            average_customer_lifespan: int | float) -> int | float:
    """
    Calculates Customer Lifetime Value (CLV). CLV measures 
    how much a business can plan to earn from the average customer 
    over the course of the relationship.
    
    CLV = Average Purchase Value X Average Purchase Frequency X Average Customer Lifespan
    
    The function assumes that the given values are either integers or floats 
    and does not additionally validate their type.
    
    The function checks that none of the given values is negative. If any of them is,
    a message is shown and None value is returned.
    
    Args:
        average_purchase_value (int or float) -- Average Purchase Value
        average_purchase_frequency (int or float) -- Average Purchase Frequency
        average_customer_lifespan (int or float) -- Average Customer Lifespan
    
    Returns:
        int or float: the calculated CLV
    """
    if average_purchase_value < 0 or average_purchase_frequency < 0 or average_customer_lifespan < 0:
        print("All of the arguments must be non-negative.")
        return
    return average_purchase_value * average_purchase_frequency * average_customer_lifespan

In [10]:
def conversion_rate(total_attributed_conversion: int, total_measured_clicks: int) -> float:
    """
    Calculates Conversion Rate (CR). Conversion rate is the proportion of visitors 
    who convert on a website. A conversion could be a completed web form, 
    content downloads, trial sign-ups, or completed purchases.
    
    CR = 100 * Total Attributed Conversion / Total Measured Clicks
    
    The function assumes that both of the values are integers
    and does not additionally validate their type.
    
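    The function checks that Total Measured Clicks is greater than zero,
    Total Attributed Conversion is non-negative, and Total Measured Clicks
    is not less than Total Attributed Conversion. If any of these conditions 
    is violated, a message is shown and None value is returned.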
    
    Args:
        total_attributed_conversion (int) -- Total Attributed Conversion
        total_measured_clicks (int) -- Total Measured Clicks
    
    Returns:
        float: the calculated CR
    """
    if total_measured_clicks <= 0:
        print("The number of total measured clicks must be greater than zero.")
        return
    if total_attributed_conversion < 0:
        print("The number of total attributed conversion must be greater than or equal to zero.")
        return
    if total_measured_clicks < total_attributed_conversion:
        print("The number of total measured clicks must be greater than or equal to the number of total attributed conversion.")
        return
    return 100 * total_attributed_conversion / total_measured_clicks

### Additional

The functions in this section are used to calculate the following metrics:

* **Pages per Session** is the count of a website’s total page views divided by the total number of sessions that have taken place. It indicates the average number of pages on your website that users access per session. Having a high average Pages per Session means that the average visitor to your website is interested in exploring your website beyond the initial page they land on. $$\text{Pages per Session} = \frac{\sum{\text{Pages Visited in a Session}}}{\text{Number of Sessions}}$$ While your average session duration might be high, ask yourself — how are visitors using this time? Do they stick to one or two pages, or do they explore further? Pages per Session can answer these questions. [Source](https://www.klipfolio.com/metrics/marketing/page-views-per-session/).

* **Average Session Duration** measures the average amount of time spent per session on your website. It is calculated by dividing total time spent across sessions by the total number of sessions. $$\text{Average Session Duration} = \frac{\sum{\text{ Session Duration}}}{\text{Total Number of Sessions}}$$ The average session duration metric calculates how much time a visitor was active on your website. Like the average pageviews per session, this metric is an excellent indicator of your website’s engagement. The more clicks your website gets, the higher this metric will be. [Source](https://www.klipfolio.com/metrics/marketing/average-session-duration/).

* **Bounce Rate** is the percentage of people who land on a page on your website, then leave. They don't click on anything else or visit a second page on the site. A bounce is a single-page session on your site. If the success of your site depends on users viewing more than one page, then a high bounce rate is bad. If you have a single-page site like a blog, or offer other types of content for which single-page sessions are expected, then a high bounce rate is perfectly normal. $$\text{Bounce Rate} = \frac{\text{Number of Single-Page Sessions}}{\text{Total Number of Sessions}}$$ Bounce rate is single-page sessions divided by all sessions, or the percentage of all sessions on your site in which users viewed only a single page. [Source](https://support.google.com/analytics/answer/1009409).

In [11]:
def pages_per_session(sessions: list[int]) -> float:
    """
    Calculates Pages per Session. Pages per Session is the count of a website's 
    total page views divided by the total number of sessions that have taken place.
    
    Pages per Session = Σ(Page Views in a Session) / Number of Sessions
    
    The function assumes that the given list contains only integers 
    and does not additionally validate their type.
    
    The function checks whether the list is not empty.
    If this is false, a message is shown and None value is returned.
    
    Args:
        sessions (list[int]) -- a list of data on how many pages 
        users visit in a single session
    
    Returns:
        float: the calculated Pages per Session
    """
    if len(sessions) == 0:
        print("The list of user data cannot be empty.")
        return
    return average_list_value(sessions)

In [12]:
def average_session_duration(sessions: list[int]) -> float:
    """
    Calculates Average Session Duration. This metric calculates 
    how much time a visitor was active on your website.
    
    Average Session Duration = Σ(Session Duration) / Number of Sessions
    
    The function assumes that the given list contains only integers 
    and does not additionally validate their type.
    
    The function checks whether the list is not empty.
    If this is false, a message is shown and None value is returned.
    
    Args:
        sessions (list[int]) -- a list of data on how much time
        users spend in a single session
    
    Returns:
        float: the calculated Average Session Duration
    """
    if len(sessions) == 0:
        print("The list of user data cannot be empty.")
        return
    return average_list_value(sessions)

In [13]:
def bounce_rate(pages_per_session: list[int]) -> float:
    """
    Calculates Bounce Rate. Bounce Rate is the share of sessions in which 
    a user lands on a page on your website and then leaves. The result is 
    returned as a fraction between 0 and 1, not multiplied by 100.
    
    Bounce Rate = Number of Single-Page Sessions / Number of Sessions
    
    The function assumes that the given list contains only integers 
    and does not additionally validate their type.
    
    The function checks whether the list is not empty.
    If this is false, a message is shown and None value is returned.
    
    Args:
        pages_per_session (list[int]) -- a list of data on how many pages
        users visit in a single session
    
    Returns:
        float: the calculated Bounce Rate
    """
    if len(pages_per_session) == 0:
        print("The list of user data cannot be empty.")
        return
    bounces = 0
    for visit in pages_per_session:
        bounces += visit == 1
    return bounces / len(pages_per_session)
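
As a worked example, the counting loop above can also be written with `sum()` over a generator. The sketch below uses one of the lists from the tests further down and additionally shows the result formatted as a percentage; the variable names are illustrative.

```python
sessions = [1, 2, 7, 1, 4, 3, 1, 1, 1]   # pages viewed in each session

# Count single-page sessions, then divide by the number of sessions.
single_page = sum(1 for viewed in sessions if viewed == 1)
bounce = single_page / len(sessions)

print(single_page)               # 5
print(bounce)                    # 0.5555555555555556
print(f"{100 * bounce:.1f}%")    # 55.6%
```

The function returns the fraction itself; multiplying by 100 is left to the presentation step.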

## Extreme Cases Tests

Check that CTR accepts only non-negative Total Measured Clicks and positive Total Measured Ad Impressions, and that Total Measured Ad Impressions must not be less than Total Measured Clicks.

In [14]:
click_through_rate(32, -4)

The number of total measured ad impressions must be greater than zero.


In [15]:
click_through_rate(-2, 36)

The number of total measured clicks must be greater than or equal to zero.


In [16]:
click_through_rate(32, 26)

The number of total measured ad impressions must be greater than or equal to the number of total measured clicks.


In [17]:
# This is invalid because the number of clicks cannot be a float,
# but this should have been checked when dealing with keyboard input.
# In a real dataset, this would have been an outlier,
# but it's not the responsibility of the metrics to take care of that.
click_through_rate(5.5, 6.5) 

84.61538461538461

Check that ROI validates that Amount Spent is greater than zero, and Amount Gained is greater than or equal to zero.

In [18]:
return_on_investment(0, 0)

Amount spent must be greater than zero.


In [19]:
return_on_investment(-1, 1)

Amount gained must be greater than or equal to zero.


In [20]:
return_on_investment(0, 1)

-100.0

Check that Average Page Time cannot be calculated for an empty list, or with no users who spent more than 5 seconds on a page.

In [21]:
average_page_time([])

The list of user data cannot be empty.


In [22]:
average_page_time([1, 2, 3, 4])

There are no users who spent more than 5 seconds on the webpage. Nothing to process.


In [23]:
average_page_time([23, 56, 4, 6, 6, 8, 3])

19.8

Check that CLV cannot be calculated when any of the values is negative.

In [24]:
customer_lifetime_value(-1, 4, 5)

All of the arguments must be non-negative.


In [25]:
customer_lifetime_value(45, -4, 78)

All of the arguments must be non-negative.


In [26]:
customer_lifetime_value(45, 4, -78)

All of the arguments must be non-negative.


In [27]:
customer_lifetime_value(45, 78, 4)

14040

Check that CR cannot be calculated when there are no measured clicks, or total attributed conversion is negative, or total measured clicks is less than total attributed conversion.

In [28]:
conversion_rate(23, 0)

The number of total measured clicks must be greater than zero.


In [29]:
conversion_rate(-23, 144)

The number of total attributed conversion must be greater than or equal to zero.


In [30]:
conversion_rate(23, 14)

The number of total measured clicks must be greater than or equal to the number of total attributed conversion.


In [31]:
conversion_rate(23, 144)

15.972222222222221

Check that Pages per Session cannot be calculated for an empty list.

In [32]:
pages_per_session([])

The list of user data cannot be empty.


In [33]:
pages_per_session([1, 4, 7, 3, 4, 6, 9, 36, 3])

8.11111111111111

Check that Average Session Duration cannot be calculated for an empty list.

In [34]:
average_session_duration([])

The list of user data cannot be empty.


In [35]:
average_session_duration([15, 42, 7, 35, 4, 63, 9, 36, 3])

23.77777777777778

Check that Bounce Rate cannot be calculated for an empty list.

In [36]:
bounce_rate([])

The list of user data cannot be empty.


In [37]:
bounce_rate([5, 2, 7, 5, 4, 3, 9, 6, 3])

0.0

In [38]:
bounce_rate([1, 2, 7, 1, 4, 3, 1, 1, 1])

0.5555555555555556

## The Runnable Section

Here, I ask the user for keyboard input of the statistics required for all eight metrics. The input is validated and re-queried until it is correct.
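
The validate-and-re-query pattern can be sketched as one reusable helper. This is only an illustration under assumptions: `ask_until_valid` and its `read` parameter are hypothetical names that do not appear in the notebook, and `read` exists solely so that the example can run with simulated keyboard input.

```python
from typing import Callable

def is_float(value: str) -> bool:
    """Copy of the validator defined earlier, repeated so the sketch is self-contained."""
    try:
        float(value)
        return True
    except ValueError:
        return False

def ask_until_valid(prompt: str,
                    is_valid: Callable[[str], bool],
                    cast: Callable[[str], float],
                    read: Callable[[str], str] = input) -> float:
    """Hypothetical helper: keeps asking until `is_valid` accepts the text."""
    while True:
        text = read(prompt)
        if is_valid(text):
            return cast(text)
        print("Please enter a valid number (integer or float).")

# Simulated keyboard input: two invalid answers followed by a valid one.
answers = iter(["abc", "", "43.3"])
amount_gained = ask_until_valid("Amount Gained: ", is_float, float,
                                read=lambda _prompt: next(answers))
print(amount_gained)  # 43.3
```

The helper covers format validation only; business constraints would still be reported by the metric functions, in line with the separation described at the top.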

In [44]:
results = dict()

# Click-Through Rate
print("To calculate Click-Through Rate, please enter Total Measured Clicks and Total Measured Ad Impressions.")
print("The values must be valid integers. Total Measured Clicks must be non-negative. \
Total Measured Ad Impressions must be greater than zero. The values are expected to be entered on separate lines.")
print()

while True:
    total_measured_clicks = input("Total Measured Clicks: ")
    if not is_integer(total_measured_clicks):
        print("Please enter a valid integer.")
        print()
        continue
    
    total_measured_ad_impressions = input("Total Measured Ad Impressions: ")
    if not is_integer(total_measured_ad_impressions):
        print("Please enter a valid integer.")
        print()
        continue
    
    total_measured_clicks, total_measured_ad_impressions = int(total_measured_clicks), int(total_measured_ad_impressions)
    ctr = click_through_rate(total_measured_clicks, total_measured_ad_impressions)
    if ctr is None:
        print()
        continue
    results["CTR"] = ctr
    break
print()


# Return on Investment
print("To calculate Return on Investment, please enter Amount Gained and Amount Spent.")
print("The values must be valid integers or floats. Amount Gained must be non-negative. \
Amount Spent must be greater than zero. The values are expected to be entered on separate lines.")
print()

while True:
    amount_gained = input("Amount Gained: ")
    if not is_float(amount_gained):
        print("Please enter a valid number (integer or float).")
        print()
        continue
    
    amount_spent = input("Amount Spent: ")
    if not is_float(amount_spent):
        print("Please enter a valid number (integer or float).")
        print()
        continue
    
    amount_gained, amount_spent = float(amount_gained), float(amount_spent)
    roi = return_on_investment(amount_gained, amount_spent)
    if roi is None:
        print()
        continue
    results["ROI"] = roi
    break
print()


# Average Page Time
print("To calculate Average Page Time, please enter a sequence of integers \
representing how much time each user spent on a web page. The values must be valid integers separated by spaces.")
print()

while True:
    users = input("Absolute Page Times: ").split()
    if not is_integer_list(users):
        print("Please enter a valid sequence of integers.")
        print()
        continue
    
    users = parse_integer_list(users)
    apt = average_page_time(users)
    if apt is None:
        print()
        continue
    results["Average Page Time"] = apt
    break
print()


# Customer Lifetime Value
print("To calculate Customer Lifetime Value, please enter Average Purchase Value, Average Purchase Frequency \
and Average Customer Lifespan. The values must be valid non-negative integers or floats. \
The values are expected to be entered on separate lines.")
print()

while True:
    average_purchase_value = input("Average Purchase Value: ")
    if not is_float(average_purchase_value):
        print("Please enter a valid number (integer or float).")
        print()
        continue
    
    average_purchase_frequency = input("Average Purchase Frequency: ")
    if not is_float(average_purchase_frequency):
        print("Please enter a valid number (integer or float).")
        print()
        continue
    
    average_customer_lifespan = input("Average Customer Lifespan: ")
    if not is_float(average_customer_lifespan):
        print("Please enter a valid number (integer or float).")
        print()
        continue
    
    average_purchase_value, average_purchase_frequency, average_customer_lifespan = float(average_purchase_value), \
                                                                                    float(average_purchase_frequency), \
                                                                                    float(average_customer_lifespan)
    clv = customer_lifetime_value(average_purchase_value, average_purchase_frequency, average_customer_lifespan)
    if clv is None:
        print()
        continue
    results["CLV"] = clv
    break
print()


# Conversion Rate
print("To calculate Conversion Rate, please enter Total Attributed Conversion.")
print("The value must be a valid integer. Total Attributed Conversion must be non-negative. \
The previously entered value of Total Measured Clicks is used.")
print()

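# Total Measured Clicks was only required to be non-negative for CTR, so it may be zero.
# In that case CR is undefined and re-prompting could never succeed, so the metric is skipped.
if total_measured_clicks <= 0:
    print("Conversion Rate cannot be calculated because Total Measured Clicks is zero.")
else:
    while True:
        total_attributed_conversion = input("Total Attributed Conversion: ")
        if not is_integer(total_attributed_conversion):
            print("Please enter a valid integer.")
            print()
            continue

        total_attributed_conversion = int(total_attributed_conversion)
        cr = conversion_rate(total_attributed_conversion, total_measured_clicks)
        if cr is None:
            print()
            continue
        results["CR"] = cr
        break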
print()


# Average Session Duration
print("To calculate Average Session Duration, please enter a sequence of integers \
representing how much time each user spent on a website. The values must be valid integers separated by spaces.")
print()

while True:
    sessions = input("Session Length: ").split()
    if not is_integer_list(sessions):
        print("Please enter a valid sequence of integers.")
        print()
        continue
    
    sessions = parse_integer_list(sessions)
    asd = average_session_duration(sessions)
    if asd is None:
        print()
        continue
    results["Average Session Duration"] = asd
    break
print()


# Pages per Session and Bounce Rate
# Unified because they need the same data
print("To calculate Pages per Session and Bounce Rate, please enter a sequence of integers \
representing how many pages each user visited on a website. The values must be valid integers separated by spaces.")
print()

while True:
    sessions = input("Visited Pages: ").split()
    if not is_integer_list(sessions):
        print("Please enter a valid sequence of integers.")
        print()
        continue
    
    sessions = parse_integer_list(sessions)
    pps = pages_per_session(sessions)
    br = bounce_rate(sessions)
    if pps is None:
        print()
        continue 
    if br is None:
        print()
        continue
    results["Pages per Session"] = pps
    results["Bounce Rate"] = br
    break
print()


print("-" * 43)
for metric_name, metric_value in results.items():
    print(f"{metric_name:30} | {metric_value:.2f}")

To calculate Click-Through Rate, please enter Total Measured Clicks and Total Measured Ad Impressions.
The values must be valid integers. Total Measured Clicks must be non-negative. Total Measured Ad Impressions must be greater than zero. The values are expected to be entered on separate lines.

Total Measured Clicks: 30
Total Measured Ad Impressions: 50

To calculate Return on Investment, please enter Amount Gained and Amount Spent.
The values must be valid integers or floats. Amount Gained must be non-negative. Amount Spent must be greater than zero. The values are expected to be entered on separate lines.

Amount Gained: 43.3
Amount Spent: 23.2123

To calculate Average Page Time, please enter a sequence of integers representing how much time each user spent on a web page. The values must be valid integers separated by spaces.

Absolute Page Times: 2 4 6 4 3 65 5 3 5 6

To calculate Customer Lifetime Value, please enter Average Purchase Value, Average Purchase Frequency and Average C