# Ticket aggregator

## Introduction

The program classifies an airfare offer according to criteria entered by the user. Specifically, the user inputs the *ticket price*, the *number of transfers*, and the *refund* and *luggage* policies. Each parameter has its own rules: the price must be a non-negative integer or floating-point number, the number of transfers must be a non-negative integer, and refund and luggage are answered with 'y' for yes or 'n' for no. The entered values are validated, and the program asks the user to repeat the input if a value is not of the expected form. Some parameters have default values to ease input and testing.

There are four classification categories based on the info in the brief:

*   "**The best offer**". Price less than \$200, direct or 1 transfer, refund and luggage included
*   "**Good enough**". Price from \$200 to \$250, less than 2 transfers, refund and luggage don't matter
*   "**The worst offer**". Price is more than \$250, 3 or more transfers, refund and luggage don't matter
*   "**Other**". For the cases which cannot be classified as one of the above



## Program code



Dictionaries map the flight categories and the yes/no answers to numbers, which keeps the logic consistent and makes the values easy to reuse.

In [None]:
# categories mapping
categories = {0: "other", 1: "the best offer", 2: "good enough", 3: "the worst offer"}
# yes/no answer mapping
yes_no = {"y": 1, "n": 0}

*Processing user inputs*. The program reads the user inputs in `while` loops, so each question is repeated until the user provides an expected value or stops the program. The ticket price can be an integer or a floating-point number, so a try-except block catches the `ValueError` raised when the input cannot be converted to float. In addition, the price cannot be negative *(let us assume zero stands for a free ticket)*. For the number of transfers, only digits are accepted. For refund and luggage, either 'y' or 'n' can be entered. Once the parameters are received as expected, the program casts the number of transfers to an integer and the 'y' or 'n' answers to the integers 1 or 0, so the last two parameters can be used directly as Boolean values. The last three parameters (transfers, refund and luggage) have default values.

In [None]:
print("\n---Ticket aggregator---")
print("Please input parameters to start the process\n")

# user input
while True:
    try:
        price = input("Enter ticket price: ")

        # casting to float
        # if it fails, control goes to the except block and input() is repeated
        price = float(price)

        if price < 0:
            print("The price cannot be negative, retry please.")
        else:
            # price is good, break from while True loop
            break
    except ValueError:
        print("Price must be a number, please retry")

# provides default value since bool("") == False
transfers = input("Enter the number of transfers (default = 0): ") or "0"

# strip spaces and check that the string consists of digits only
# keep asking while the answer is not a non-negative whole number
while not transfers.strip().isdigit():
    transfers = input("Number of transfers must be a digit, retry: ")

# cast str to int
transfers = int(transfers)

# if the answer is blank, the value is 'n'
# strip here, so that an answer like " y" cannot raise KeyError in the lookup below
refund = input("Is the ticket refundable? (y/n) (default 'n'): ").strip().lower() or "n"

# keep asking while the answer is neither 'y' nor 'n'
while refund not in {"y", "n"}:
    refund = input("Please answer (y/n for yes/no): ").strip().lower()

# str 'y'/'n' to int 1/0 using the yes_no dict
refund = yes_no[refund]

# same here with the default value; strip before the dict lookup
luggage = input("Ticket includes luggage? (y/n) (default 'n'): ").strip().lower() or "n"
while luggage not in {"y", "n"}:
    luggage = input("Please answer (y/n for yes/no): ").strip().lower()

# y/n to 1/0
luggage = yes_no[luggage]


---Ticket aggregator---
Please input parameters to start the process

Enter ticket price: -123
The price cannot be negative, retry please.
Enter ticket price: aaa
Price must be a number, please retry
Enter ticket price: 199.99
Enter the number of transfers (default = 0): a
Number of transfers must be a digit, retry: -2
Number of transfers must be a digit, retry: 1
Is the ticket refundable? (y/n) (default - no): i don't know
Please answer (y/n for yes/no): y
Does the ticket include luggage? (y/n) (default - no): who cares
Please answer (y/n for yes/no): y
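A session like the one above can also be replayed without typing, which may help with testing the validation loop. The following is a minimal sketch, assuming the price loop is wrapped in a hypothetical helper `read_price` (not part of the original program); `unittest.mock.patch` from the standard library feeds the answers to `input()`.

```python
from unittest.mock import patch


def read_price(prompt="Enter ticket price: "):
    """Hypothetical helper: keep asking until a non-negative number is typed."""
    while True:
        try:
            price = float(input(prompt))
        except ValueError:
            print("Price must be a number, please retry")
            continue
        if price < 0:
            print("The price cannot be negative, retry please.")
            continue
        return price


# Replay the price answers from the session above instead of typing them.
with patch("builtins.input", side_effect=["-123", "aaa", "199.99"]):
    price = read_price()

print(price)  # 199.99
```

The first two answers trigger the two retry messages, and the third one is accepted.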


Next, the parameters are echoed back and then passed through an if-elif chain, which sets the category according to the criteria above.

In [None]:
print(f'Ticket price: {price:.2f}')
print(f'Number of transfers: {transfers}')
print(f"Refund is allowed: {str(refund).replace('1','YES').replace('0', 'NO')}")
print(f"Luggage included: {str(luggage).replace('1','YES').replace('0', 'NO')}")

Ticket price: 199.99
Number of transfers: 1
Refund is allowed: YES
Luggage included: YES


In [None]:
flight_category = 0  # base case "other"
if price < 200 and transfers <= 1 and refund and luggage:
    flight_category = 1  # the best
elif 200 <= price <= 250 and transfers < 2:  # "less than 2 transfers", as in the criteria list
    flight_category = 2  # good enough
elif price > 250 and transfers > 2:
    flight_category = 3  # the worst

print(f'\nThe category of this flight is "{categories[flight_category]}"\n')


The category of this flight is "the best offer"



## Analysis

The ticket aggregator considered here provides a simple classification based on strict criteria for a few offer types: three specified categories and one for all other cases. It has its pros and cons, some of which are described below.

Advantages:

*   The solution is easy to use, develop and understand;
*   Clear classification based on straightforward parameters;
*   Provides concise information about the offer;
*   Includes important parameters such as price, luggage and refund policies;
*   The classes have strict, fully specified criteria.

Disadvantages:

*   The aggregator is limited to four categories;
*   The classification criteria may not be sufficient to capture all aspects of the offer;
*   Other important parameters such as flight duration, airline reputation, etc. are not included;
*   The classification rules may not accurately categorize all flight offers, particularly those that do not fit the specified combinations of price range and number of transfers.
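The last point can be made concrete with a small experiment. The sketch below assumes the rules are wrapped in a hypothetical function `classify` that reads the criteria list from the introduction literally ("less than 2 transfers", "3 or more transfers"); the grid values are illustrative and not real offers.

```python
from collections import Counter
from itertools import product


def classify(price, transfers, refund, luggage):
    """Return the category name, reading the criteria list literally."""
    if price < 200 and transfers <= 1 and refund and luggage:
        return "the best offer"
    if 200 <= price <= 250 and transfers < 2:
        return "good enough"
    if price > 250 and transfers >= 3:
        return "the worst offer"
    return "other"


# A cheap direct flight without refund matches none of the three named categories.
print(classify(150, 0, False, True))  # other

# Count the categories over a small grid of assumed values:
# 3 prices x 4 transfer counts x refund yes/no x luggage yes/no.
grid = product([100, 225, 300], range(4), [False, True], [False, True])
counts = Counter(classify(*offer) for offer in grid)
print(counts["other"], "of", sum(counts.values()), "grid points are 'other'")
```

On this grid, 34 of the 48 combinations end up in "other", which suggests how much of the input space the three named categories leave uncovered.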










### Alternative solution 

<p>A ticket aggregator implemented this way can hardly be exact, but it still shows how the simplest classification algorithms could work. In my opinion, there is room for improvement. For example, the price ranges may need to change with inflation and currency exchange rates. Other parameters could also be included in the input (e.g. flight duration). New offer categories could be added, such as "Cheapest" (let's assume price < 150, with luggage and refund policies not mattering), "Fastest", based on an added "flight duration" metric, "Convenient", with no transfers, and "Best value", for a price range with luggage and refund included.</p>

<p>As for the program code itself, functions could be introduced and the code refactored.</p>

<p>In any case, these categories are meaningful only if the inspected offer is compared with a significant number of other offers. So all of this is a theoretical discussion of how the specific solution could be improved.</p>

<p>The best-known professional approach is to add machine learning to the solution. For this purpose, one must first accumulate a significant number of flight offer records, with many more parameters than the user inputs in the previous solution. Those records provide data for subsequent analysis, after which one can decide which parameters are important and which can be dropped. The chosen ML algorithm can then be trained on that data with supervised learning (if the offer category of each record is already known) or with unsupervised learning (if the grouping is left to the algorithm). But this topic is somewhat outside the scope of the current task.</p>
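To give a feel for the supervised case, here is a minimal sketch in pure Python. It uses a 1-nearest-neighbour rule as a stand-in for a real library model. The labelled records, the feature scales and the names `TRAINING`, `SCALES`, `distance` and `predict` are all invented for illustration; a real solution would learn from collected offers.

```python
import math

# Hypothetical labelled offers: (price in $, transfers, refund, luggage) -> category.
# Invented for illustration only; a real data set would have to be collected.
TRAINING = [
    ((180.0, 0, 1, 1), "the best offer"),
    ((195.0, 1, 1, 1), "the best offer"),
    ((210.0, 1, 0, 1), "good enough"),
    ((245.0, 0, 0, 0), "good enough"),
    ((320.0, 3, 0, 0), "the worst offer"),
    ((410.0, 4, 0, 1), "the worst offer"),
]

# Rough per-feature scales (an assumption) so that price does not dominate the distance.
SCALES = (100.0, 1.0, 1.0, 1.0)


def distance(a, b):
    """Scaled Euclidean distance between two offers."""
    return math.sqrt(sum(((x - y) / s) ** 2 for x, y, s in zip(a, b, SCALES)))


def predict(offer):
    """Return the label of the closest training record (1-nearest-neighbour)."""
    _, label = min(TRAINING, key=lambda rec: distance(rec[0], offer))
    return label


print(predict((190.0, 1, 1, 1)))  # the best offer
print(predict((350.0, 3, 0, 0)))  # the worst offer
```

With only six invented records the predictions are not meaningful in themselves; the point is that the categories come from labelled examples rather than from hand-written thresholds.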

An alternative solution could look something like the following. A few changes have been made: functions replace the repeated parts of the code, a *duration* parameter is added, and the flight offers get a completely new set of categories, namely "*the cheapest*", "*the most convenient*", "*the fastest*" and "*the best value*". The criteria below are based on a theoretical situation with the Moscow - New York flights from the task brief.

The criteria for the new categories:

*   "**The cheapest**". Ticket price less than \$150, other parameters don't matter;
*   "**The most convenient**". Price from \$150 to \$250, no transfers, any luggage and refund policies;
*   "**The fastest**". Flight duration is 10 hours at most, other parameters and price don't matter;
*   "**The best value**". No transfers, ticket price is below \$300, luggage and refund options must be provided.

The sanity of the *flight duration* value is left to the user: it cannot be negative, and a value of zero is assumed to mean that there is no data. When the program checks "the fastest" category, a duration of less than 8 hours is not accepted (for a flight from Moscow to New York).



In [1]:
def process_input(offer_params):
    """Fill offer_params in place with validated answers from the user."""
    offer_params["price"] = check_float("ticket price")
    offer_params["duration"] = check_float("duration (hours)")
    offer_params["transfers"] = check_int("transfers")
    offer_params["refund"] = check_yn("refund")
    offer_params["luggage"] = check_yn("luggage")


def check_float(val_name):
    """Ask for val_name until a non-negative number is entered; return it as float."""
    while True:
        try:
            value = input(f"Enter {val_name}: ")
            value = float(value)
            if value < 0:
                print(f"The {val_name} cannot be negative, retry please.")
            else:
                break
        except ValueError:
            print(f"{val_name.title()} must be a number, please retry")
    return value


def check_yn(val_name):
    """Ask a y/n question (a blank answer means 'n'); return 1 for yes, 0 for no."""
    value = input(f"Is {val_name} available? (y/n) (default - no): ").lower() or "n"
    while value.strip() not in {"y", "n"}:
        value = input("Please answer (y/n for yes/no): ").lower()
    value = int(value.replace("y", "1").replace("n", "0"))
    return value


def check_int(val_name):
    """Ask for a whole number (a blank answer means 0); return it as int."""
    value = input(f"Enter the number of {val_name} (default = 0): ") or "0"
    while not value.strip().isdigit():
        value = input(f"Number of {val_name} must be a digit, retry: ")
    value = int(value)
    return value


def classify_offer(offer_params):
    """Print the first category whose rule matches the offer ("other" if none does)."""
    # categories mapping
    categories = {
        0: "other",
        1: "the cheapest",
        2: "the most convenient",
        3: "the fastest",
        4: "the best value",
    }
    flight_category = 0  # base case "other"
    if offer_params["price"] < 150:
        flight_category = 1  # the cheapest
    elif 150 <= offer_params["price"] <= 250 and offer_params["transfers"] == 0:
        flight_category = 2  # the most convenient
    elif 8 <= offer_params["duration"] <= 10:  # not less than 8 hours, 10 hours at most
        flight_category = 3  # the fastest
    elif (
        offer_params["transfers"] == 0
        and offer_params["price"] < 300
        and offer_params["refund"]
        and offer_params["luggage"]
    ):
        flight_category = 4  # the best value

    print(f'\nThe category of this flight is "{categories[flight_category]}"\n')


if __name__ == "__main__":
    params = {"price": 0, "duration": 0, "transfers": 0, "refund": 0, "luggage": 0}
    print("\n---Ticket aggregator---")
    print("Please input parameters to start the process\n")

    process_input(params)
    classify_offer(params)



---Ticket aggregator---
Please input parameters to start the process

Enter ticket price: 251.99
Enter duration (hours): 9
Enter the number of transfers (default = 0): 0
Is refund available? (y/n) (default - no): n
Is luggage available? (y/n) (default - no): n

The category of this flight is "the fastest"
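Because the rules are checked in an if-elif chain, an offer that satisfies several rules is labelled by the first one that matches. The sketch below lists every matching category so that this ordering effect becomes visible. The function `matching_categories` is hypothetical and not part of the program above, and treating the 8 hour lower bound as inclusive is an assumption.

```python
def matching_categories(offer):
    """Return every category whose rule the offer satisfies, in rule order."""
    no_transfers = offer["transfers"] == 0
    rules = [
        ("the cheapest", offer["price"] < 150),
        ("the most convenient", 150 <= offer["price"] <= 250 and no_transfers),
        # assumption: exactly 8 hours counts as a valid duration
        ("the fastest", 8 <= offer["duration"] <= 10),
        (
            "the best value",
            no_transfers
            and offer["price"] < 300
            and bool(offer["refund"])
            and bool(offer["luggage"]),
        ),
    ]
    return [name for name, matched in rules if matched] or ["other"]


# The offer from the session above satisfies only one rule.
session_offer = {"price": 251.99, "duration": 9, "transfers": 0, "refund": 0, "luggage": 0}
print(matching_categories(session_offer))  # ['the fastest']

# An illustrative offer that satisfies three rules at once.
cheap_offer = {"price": 140, "duration": 9, "transfers": 0, "refund": 1, "luggage": 1}
print(matching_categories(cheap_offer))
```

For the second offer the list contains "the cheapest", "the fastest" and "the best value", while the if-elif chain would report only "the cheapest".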



### Conclusion

<p>Another possible solution could use the scikit-learn library and its machine learning algorithms for multi-class classification, or Keras/TensorFlow. However, to use these tools there must be a prepared dataset with the parameters (such as price, duration, policies and transfers). Within the provided task it is not possible to complete a machine learning solution properly. That is why I decided not to include that improvement in my alternative ticket aggregator implementation.</p>
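As a rough idea of what such a prepared dataset might look like, the sketch below writes and reads a few records with the standard `csv` module. The column names and the two rows are assumptions made for illustration, and an in-memory buffer stands in for a file on disk.

```python
import csv
import io

# Assumed column layout: the five input parameters plus the known category.
FIELDS = ["price", "duration", "transfers", "refund", "luggage", "category"]

# Invented example rows; real rows would come from collected offers.
rows = [
    {"price": 251.99, "duration": 9.0, "transfers": 0, "refund": 0, "luggage": 0,
     "category": "the fastest"},
    {"price": 140.0, "duration": 12.5, "transfers": 1, "refund": 0, "luggage": 1,
     "category": "the cheapest"},
]

buffer = io.StringIO()  # stands in for a file on disk
writer = csv.DictWriter(buffer, fieldnames=FIELDS)
writer.writeheader()
writer.writerows(rows)

# Read the records back; csv returns every value as a string.
buffer.seek(0)
loaded = list(csv.DictReader(buffer))
print(loaded[0]["category"])      # the fastest
print(float(loaded[0]["price"]))  # 251.99
```

Numeric columns have to be converted back from strings after loading, as the last line shows.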

<p>Still, the example illustrates the use of functions and if-elif conditions, and provides proper user input checking. It still has some flaws and could be reworked once more. The main problem is that all the criteria, both those in the task and those I proposed, are extremely subjective. They are not based on research or data analysis and seem quite theoretical. Therefore, making a correct ticket aggregator is not an easy task: it takes a lot of effort, including data accumulation or extraction, exploration, processing, and then using the data for proper classification.</p>