# Task 2

---

## Predictive modeling of customer bookings

This Jupyter notebook includes some code to get you started with this predictive modeling task. We will use various packages for data manipulation, feature engineering and machine learning.

### Exploratory data analysis

First, we must explore the data in order to better understand what we have and the statistical properties of the dataset.

In [1]:
import pandas as pd # type: ignore

In [2]:
df = pd.read_csv("data/customer_booking.csv", encoding="ISO-8859-1")
df.head()

,num_passengers,sales_channel,trip_type,purchase_lead,length_of_stay,flight_hour,flight_day,route,booking_origin,wants_extra_baggage,wants_preferred_seat,wants_in_flight_meals,flight_duration,booking_complete
0,2,Internet,RoundTrip,262,19,7,Sat,AKLDEL,New Zealand,1,0,0,5.52,0
1,1,Internet,RoundTrip,112,20,3,Sat,AKLDEL,New Zealand,0,0,0,5.52,0
2,2,Internet,RoundTrip,243,22,17,Wed,AKLDEL,India,1,1,0,5.52,0
3,1,Internet,RoundTrip,96,31,4,Sat,AKLDEL,New Zealand,0,0,1,5.52,0
4,2,Internet,RoundTrip,68,22,15,Wed,AKLDEL,India,1,0,1,5.52,0


The `.head()` method shows the first 5 rows of the dataset, which is useful for a quick visual inspection of the columns.

In [3]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 50000 entries, 0 to 49999
Data columns (total 14 columns):
 #   Column                 Non-Null Count  Dtype  
---  ------                 --------------  -----  
 0   num_passengers         50000 non-null  int64  
 1   sales_channel          50000 non-null  object 
 2   trip_type              50000 non-null  object 
 3   purchase_lead          50000 non-null  int64  
 4   length_of_stay         50000 non-null  int64  
 5   flight_hour            50000 non-null  int64  
 6   flight_day             50000 non-null  object 
 7   route                  50000 non-null  object 
 8   booking_origin         50000 non-null  object 
 9   wants_extra_baggage    50000 non-null  int64  
 10  wants_preferred_seat   50000 non-null  int64  
 11  wants_in_flight_meals  50000 non-null  int64  
 12  flight_duration        50000 non-null  float64
 13  booking_complete       50000 non-null  int64  
dtypes: float64(1), int64(8), object(5)
memory usage: 5.3+ MB

The `.info()` method gives us a description of the data: the column names, their data types and how many non-null values each column holds. Every column has 50,000 non-null entries out of 50,000 rows, so we have no null values. Several columns are stored as `object` (string) and should be converted into different data types, e.g. `flight_day`.
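The two checks described above can be sketched as follows: an explicit null count per column, and a conversion of a string column to a more suitable dtype. The three-row `toy` frame and its values are made up so the snippet runs without the CSV file; on the real data the same calls would be applied to `df`. Choosing the `category` dtype here is an assumption for illustration, not something the notebook prescribes.

```python
import pandas as pd

# Hypothetical stand-in for df, so the sketch runs without the CSV file.
toy = pd.DataFrame(
    {
        "sales_channel": ["Internet", "Mobile", "Internet"],
        "flight_day": ["Sat", "Wed", "Sat"],
        "booking_complete": [0, 1, 0],
    }
)

# Count missing values per column; all zeros means there are no nulls.
null_counts = toy.isnull().sum()
print(null_counts)

# Store a low-cardinality string column as a categorical instead of object.
toy["sales_channel"] = toy["sales_channel"].astype("category")
print(toy.dtypes)
```

A categorical column keeps the original labels readable while making the limited set of values explicit, which can be convenient before encoding features for a model.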

To provide more context, below is a more detailed data description, explaining exactly what each column means:

- `num_passengers` = number of passengers travelling
- `sales_channel` = sales channel booking was made on
- `trip_type` = trip type (Round Trip, One Way, Circle Trip)
- `purchase_lead` = number of days between travel date and booking date
- `length_of_stay` = number of days spent at destination
- `flight_hour` = hour of flight departure
- `flight_day` = day of week of flight departure
- `route` = origin -> destination flight route
- `booking_origin` = country from where booking was made
- `wants_extra_baggage` = if the customer wanted extra baggage in the booking
- `wants_preferred_seat` = if the customer wanted a preferred seat in the booking
- `wants_in_flight_meals` = if the customer wanted in-flight meals in the booking
- `flight_duration` = total duration of flight (in hours)
- `booking_complete` = flag indicating if the customer completed the booking

Before we compute any statistics on the data, let's do any necessary data conversion.

In [4]:
df["flight_day"].unique()

array(['Sat', 'Wed', 'Thu', 'Mon', 'Sun', 'Tue', 'Fri'], dtype=object)

In [5]:
mapping = {
    "Mon": 1,
    "Tue": 2,
    "Wed": 3,
    "Thu": 4,
    "Fri": 5,
    "Sat": 6,
    "Sun": 7,
}

df["flight_day"] = df["flight_day"].map(mapping)

In [6]:
df["flight_day"].unique()

array([6, 3, 4, 1, 7, 2, 5])
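One thing worth checking after a dictionary-based `.map()` is whether any label was missing from the dictionary, because pandas returns `NaN` for those instead of raising an error. The sketch below shows this on a small made-up series; the misspelt label `"Sunday"` is a hypothetical value that does not occur in the dataset.

```python
import pandas as pd

mapping = {"Mon": 1, "Tue": 2, "Wed": 3, "Thu": 4, "Fri": 5, "Sat": 6, "Sun": 7}

# Hypothetical sample; "Sunday" is deliberately not a key in mapping.
days = pd.Series(["Sat", "Wed", "Sunday"])

mapped = days.map(mapping)

# Labels missing from the mapping come back as NaN.
unmapped = days[mapped.isna()].unique().tolist()
print(unmapped)
```

In the notebook itself, the `unique()` output after mapping contains only the integers 1 to 7, so every label was covered.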

In [7]:
df.describe()

,num_passengers,purchase_lead,length_of_stay,flight_hour,flight_day,wants_extra_baggage,wants_preferred_seat,wants_in_flight_meals,flight_duration,booking_complete
count,50000.0,50000.0,50000.0,50000.0,50000.0,50000.0,50000.0,50000.0,50000.0,50000.0
mean,1.59124,84.94048,23.04456,9.06634,3.81442,0.66878,0.29696,0.42714,7.277561,0.14956
std,1.020165,90.451378,33.88767,5.41266,1.992792,0.470657,0.456923,0.494668,1.496863,0.356643
min,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,4.67,0.0
25%,1.0,21.0,5.0,5.0,2.0,0.0,0.0,0.0,5.62,0.0
50%,1.0,51.0,17.0,9.0,4.0,1.0,0.0,0.0,7.57,0.0
75%,2.0,115.0,28.0,13.0,5.0,1.0,1.0,1.0,8.83,0.0
max,9.0,867.0,778.0,23.0,7.0,1.0,1.0,1.0,9.5,1.0


The `.describe()` method gives us descriptive statistics for the dataset. By default it summarises only the numeric columns, which is why `sales_channel`, `trip_type`, `route` and `booking_origin` are absent from the table. It gives a quick overview of the count, mean, standard deviation, min, max and quartiles of each column.
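Two follow-ups to the summary above are often useful. The first is a summary of the non-numeric columns, which `.describe()` skips by default. The second is the class balance of the target: a `booking_complete` mean of 0.14956 means only about 15% of rows are completed bookings. The sketch below shows both on a made-up four-row frame; the `toy` data is hypothetical and stands in for `df`.

```python
import pandas as pd

# Hypothetical stand-in for df, so the sketch runs without the CSV file.
toy = pd.DataFrame(
    {
        "trip_type": ["RoundTrip", "RoundTrip", "OneWay", "RoundTrip"],
        "booking_complete": [0, 0, 1, 0],
    }
)

# Summarise the non-numeric columns: count, unique, top (mode) and freq.
object_summary = toy.describe(exclude=["number"])
print(object_summary)

# Share of each target class; a skewed split signals class imbalance.
class_share = toy["booking_complete"].value_counts(normalize=True)
print(class_share)
```

With a target this imbalanced, plain accuracy can look good for a model that never predicts a completed booking, so it is worth keeping the class shares in mind when choosing evaluation metrics.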

From this point, you should continue exploring the dataset with some visualisations and other metrics that you think may be useful. Then, you should prepare your dataset for predictive modelling. Finally, you should train your machine learning model, evaluate it with performance metrics and output visualisations for the contributing variables. All of this analysis should be summarised in your single slide.
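As a rough sketch of the train, evaluate and explain steps outlined above, the snippet below fits a classifier on synthetic data and prints a metric and per-feature importances. Everything specific here is an assumption made for illustration: the notebook does not name a model, so the random forest, the ROC AUC metric, the three chosen feature columns and the made-up rule generating the target are all placeholders, and the random numbers are not the real bookings.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the prepared feature table (not the real bookings).
rng = np.random.default_rng(0)
n = 500
X = pd.DataFrame(
    {
        "purchase_lead": rng.integers(0, 300, n),
        "length_of_stay": rng.integers(0, 60, n),
        "flight_hour": rng.integers(0, 24, n),
    }
)
# Made-up rule so the target depends on one feature, plus noise.
y = ((X["length_of_stay"] < 10) & (rng.random(n) < 0.8)).astype(int)

# Stratify so both splits keep the same share of completed bookings.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# class_weight="balanced" reweights the rare positive class during training.
model = RandomForestClassifier(
    n_estimators=100, class_weight="balanced", random_state=0
)
model.fit(X_train, y_train)

# Score with predicted probabilities of the positive class.
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"ROC AUC: {auc:.3f}")

# Impurity-based importances, one per feature.
importances = pd.Series(model.feature_importances_, index=X.columns)
print(importances.sort_values(ascending=False))
```

A bar chart of `importances` is one possible way to produce the visualisation of contributing variables asked for in the task.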

In [None]:
{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "provenance": [],
      "authorship_tag": "ABX9TyPe7oeddpcwN3LWCXQGZrjX",
      "include_colab_link": true
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    },
    "language_info": {
      "name": "python"
    }
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "view-in-github",
        "colab_type": "text"
      },
      "source": [
        "<a href=\"https://colab.research.google.com/github/Ruksaana0509/British-Airways-virtual-internship/blob/main/British_Airways_Task2.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "# Task 2 - Predict customer buying behaviour"
      ],
      "metadata": {
        "id": "_prEVJqDc8yH"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "Load the data"
      ],
      "metadata": {
        "id": "dzlj4GKxNH6H"
      }
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "UOXjfMLc04tD"
      },
      "outputs": [],
      "source": [
        "%matplotlib inline\n",
        "\n",
        "import pandas as pd\n",
        "import numpy as np\n",
        "import matplotlib.pyplot as plt\n",
        "import seaborn as sns\n",
        "\n",
        "import warnings\n",
        "warnings.filterwarnings(\"ignore\")"
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "data= '/customer_booking.csv'\n"
      ],
      "metadata": {
        "id": "RzQ7IcNFNeH0"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "import chardet\n",
        "\n",
        "with open('/customer_booking.csv', 'rb') as rawdata:\n",
        "  result = chardet.detect(rawdata.read(100000))\n",
        "\n",
        "print(result)"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "eaOUJcHQOhDe",
        "outputId": "5066c5a9-f03e-410a-ae57-3cc54076189f"
      },
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "{'encoding': 'ISO-8859-1', 'confidence': 0.73, 'language': ''}\n"
          ]
        }
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "df= pd.read_csv(data, encoding='ISO-8859-1')"
      ],
      "metadata": {
        "id": "XPoUFUywNNIV"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "df.head()"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 270
        },
        "id": "cFOVMUO9NfwF",
        "outputId": "b1d2e810-dfba-4ccc-a453-411f95954268"
      },
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "   num_passengers sales_channel  trip_type  purchase_lead  length_of_stay  \\\n",
              "0               2      Internet  RoundTrip            262              19   \n",
              "1               1      Internet  RoundTrip            112              20   \n",
              "2               2      Internet  RoundTrip            243              22   \n",
              "3               1      Internet  RoundTrip             96              31   \n",
              "4               2      Internet  RoundTrip             68              22   \n",
              "\n",
              "   flight_hour flight_day   route booking_origin  wants_extra_baggage  \\\n",
              "0            7        Sat  AKLDEL    New Zealand                    1   \n",
              "1            3        Sat  AKLDEL    New Zealand                    0   \n",
              "2           17        Wed  AKLDEL          India                    1   \n",
              "3            4        Sat  AKLDEL    New Zealand                    0   \n",
              "4           15        Wed  AKLDEL          India                    1   \n",
              "\n",
              "   wants_preferred_seat  wants_in_flight_meals  flight_duration  \\\n",
              "0                     0                      0             5.52   \n",
              "1                     0                      0             5.52   \n",
              "2                     1                      0             5.52   \n",
              "3                     0                      1             5.52   \n",
              "4                     0                      1             5.52   \n",
              "\n",
              "   booking_complete  \n",
              "0                 0  \n",
              "1                 0  \n",
              "2                 0  \n",
              "3                 0  \n",
              "4                 0  "
            ]
          },
          "metadata": {},
          "execution_count": 60
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "# Exploratory Data Analysis"
      ],
      "metadata": {
        "id": "t2gjofW4PSfW"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "#checking for datatypes\n",
        "\n",
        "df.dtypes"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "TM3bYzjJPYgm",
        "outputId": "75d6904c-a91e-40f8-c44a-e8d804c48c16"
      },
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "num_passengers             int64\n",
              "sales_channel             object\n",
              "trip_type                 object\n",
              "purchase_lead              int64\n",
              "length_of_stay             int64\n",
              "flight_hour                int64\n",
              "flight_day                object\n",
              "route                     object\n",
              "booking_origin            object\n",
              "wants_extra_baggage        int64\n",
              "wants_preferred_seat       int64\n",
              "wants_in_flight_meals      int64\n",
              "flight_duration          float64\n",
              "booking_complete           int64\n",
              "dtype: object"
            ]
          },
          "metadata": {},
          "execution_count": 61
        }
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "df.shape"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "TitPzv6sPhcs",
        "outputId": "16aeb6ac-2bdd-4e50-d93e-97fc9992b086"
      },
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "(50000, 14)"
            ]
          },
          "metadata": {},
          "execution_count": 62
        }
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "#null values\n",
        "\n",
        "df.isnull().sum()\n",
        "\n",
        "# there are no null values"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "i6XYDIpNPflG",
        "outputId": "07415245-46fc-4f1c-a7f7-1bb98b5d6b27"
      },
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "num_passengers           0\n",
              "sales_channel            0\n",
              "trip_type                0\n",
              "purchase_lead            0\n",
              "length_of_stay           0\n",
              "flight_hour              0\n",
              "flight_day               0\n",
              "route                    0\n",
              "booking_origin           0\n",
              "wants_extra_baggage      0\n",
              "wants_preferred_seat     0\n",
              "wants_in_flight_meals    0\n",
              "flight_duration          0\n",
              "booking_complete         0\n",
              "dtype: int64"
            ]
          },
          "metadata": {},
          "execution_count": 63
        }
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "df.booking_complete.value_counts()"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "ma007WpIP3U-",
        "outputId": "1670dc55-020c-412b-fa01-6da44ee96b81"
      },
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "0    42522\n",
              "1     7478\n",
              "Name: booking_complete, dtype: int64"
            ]
          },
          "metadata": {},
          "execution_count": 64
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "# Mutual Information"
      ],
      "metadata": {
        "id": "kO58us30PxJk"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "X= df.drop('booking_complete',axis=1)\n",
        "y= df.booking_complete         \n",
        "\n",
        "#changing object dtype to int dtype\n",
        "for colname in X.select_dtypes(\"object\"):\n",
        "    X[colname], _ = X[colname].factorize()"
      ],
      "metadata": {
        "id": "xAVmW976UzgN"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "X.dtypes"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "GLbLIRnYVFMe",
        "outputId": "c934b465-9dc6-4141-c6c4-a68c9f7f77fa"
      },
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "num_passengers             int64\n",
              "sales_channel              int64\n",
              "trip_type                  int64\n",
              "purchase_lead              int64\n",
              "length_of_stay             int64\n",
              "flight_hour                int64\n",
              "flight_day                 int64\n",
              "route                      int64\n",
              "booking_origin             int64\n",
              "wants_extra_baggage        int64\n",
              "wants_preferred_seat       int64\n",
              "wants_in_flight_meals      int64\n",
              "flight_duration          float64\n",
              "dtype: object"
            ]
          },
          "metadata": {},
          "execution_count": 66
        }
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "from sklearn.feature_selection import mutual_info_classif\n",
        "\n",
        "mi_scores = mutual_info_classif(X, y)\n",
        "mi_scores = pd.Series(mi_scores, name=\"MI Scores\", index=X.columns)\n",
        "mi_scores = mi_scores.sort_values(ascending=False)\n",
        "\n",
        "mi_scores  # show every feature with its MI score, highest first"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "C_sax3VTP0qN",
        "outputId": "c165558f-1691-4f03-eb01-d481499740db"
      },
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "route                    0.051057\n",
              "booking_origin           0.046168\n",
              "flight_duration          0.018031\n",
              "length_of_stay           0.008297\n",
              "wants_extra_baggage      0.006588\n",
              "num_passengers           0.003603\n",
              "purchase_lead            0.003480\n",
              "flight_hour              0.002446\n",
              "wants_in_flight_meals    0.001987\n",
              "flight_day               0.000921\n",
              "sales_channel            0.000000\n",
              "trip_type                0.000000\n",
              "wants_preferred_seat     0.000000\n",
              "Name: MI Scores, dtype: float64"
            ]
          },
          "metadata": {},
          "execution_count": 67
        }
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "def plot_mi_scores(scores):\n",
        "    scores = scores.sort_values(ascending=True)\n",
        "    width = np.arange(len(scores))\n",
        "    ticks = list(scores.index)\n",
        "    plt.barh(width, scores)\n",
        "    plt.yticks(width, ticks)\n",
        "    plt.title(\"Mutual Information Scores\")\n",
        "\n",
        "\n",
        "plt.figure(dpi=100, figsize=(8, 5))\n",
        "plot_mi_scores(mi_scores)"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 460
        },
        "id": "52zW510SVhxM",
        "outputId": "67558676-5307-48a1-e701-5a0fefd0e123"
      },
      "execution_count": null,
      "outputs": [
        {
          "output_type": "display_data",
          "data": {
            "text/plain": [
              "<Figure size 800x500 with 1 Axes>"
            ],
            "image/png": "iVBORw0KGgoAAAANSUhEUgAAAyIAAAG7CAYAAAA/uyH8AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAAPYQAAD2EBqD+naQAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAgAElEQVR4nOzdebxd093H8c83QVriXqoPTStB1VhDalZUPOahqiPlQbSq1ccQpVqKpIZHtKmiqqZWaA2NtrQUKSVaU00hSkQarnmO3Js5xO/5Y63DtnOHc29uzp2+79frvHLO2muv9dv7Xi/nd9ewFRGYmZmZmZnVUr+uDsDMzMzMzPoeJyJmZmZmZlZzTkTMzMzMzKzmnIiYmZmZmVnNORExMzMzM7OacyJiZmZmZmY150TEzMzMzMxqzomImZmZmZnVnBMRMzMzMzOrOSciZmbWrUkaLikkrb6E+1lF0h8kvZn7G7Ek+6slSWMlNXR1HGZmRU5EzMx6mcIX95C0bTPHJen5fPzGDvaxh6RRix1sJ5I0Kl/TRzvYxM+BXYEzgQOBWzotuBqQ9PF8D4Z2dSxFkv5L0rmSnpQ0V9Jrku6XdJakgV0dn5l1naW6OgAzM1ti5gH7A3eVyrcHVgXmL0bbewD/C4xajDa6m/8G/hwRY7o6kA76ODASaAAeKR37Fl3wx0dJHwEeBOqA3wBPAisBGwGHA78CZtU6LjPrHpyImJn1XjcBX5V0VES8UyjfH3gI6OjIQW+1MjCjsxqT9CFgQUS821ltdlREvN1FXX8TGAJsExH3FA9IqgMW1CoQSctFxOxa9WdmbfPULDOz3utq0l+fd64USFoG+ApwVbmypGF5atOwUvnquXx4/jyWNBpCYQpYtKeNXLZRXrvwtKR5kl6R9BtJK3XGxec+Jkj6t6T1Jd0haY6kFyUdX6gzPMcv4H+L15OPf1LStZKm5/Pvk7RnqZ/Kde8n6XRJLwJzgLp8jbMkDZF0Y37/oqTKPdxQ0u2SZkt6VtL+pbY/ImmMpMfyuU2Sbpa0cbF/4IH88bLCz2V4Pr7IGhFJy0n6WZ6mN1/SFEnHSVKpXkg6X9I++V7Ol/S4pN2q+BGsCSwE7isfiIimiJhX6mtLSTdJeivfj0mSji7V+W9J/8zHZ0j6s6T1SnUq0/TWl3SVpLcojAxK+h9JD+WpYtMlXSNpcKmNtST9Mf9ezpP0Qq5XX8V1m1kVPCJiZtZ7NQD3Al8Hbs5luwP1wDXAUR1s9yLSNKCdSWspOmpn4JPAZcArwKeBw4BPS9oqIqK1k9thRdJ6jz8B40iJ2FmSHouIm4F/kK7jt8CtwBWVEyWtAtwDLAucB7wJHAz8RdJXIuK6Ul8nk/7KPwYYwPt/8e9P+hn8AzgeOAA4X9Js4Azgyhzfd4ArJN0bEc/kcz8J7ANcCzwDrAJ8G7hT0voR8RIwGTgFOBW4GPhnPvcDoxCF6xLwF2AH4NekqVy7Aj8FPgEcUzplW+BLwAXATNLvzh8lDYmIN5vrI3s2X/uBwOWt1EPSzsCNwMvAuaTfifWAvfJnJO1Euo9Pk6YFfhg4Erhb0iYR0VBq9lpgKnAiKdFE0o+A00i/C5cC/5Xb+Iekz0TEDKWEfTzpZ/iLHMsnciwrAI2tXYuZVSki/PLLL7/86kUvYDgQwGakkYsm4MP52Djg9vy+AbixcN6wfN6wUnur5/LhhbLz0/9CFum7PW18uJnz98v1tmvmelZv47pH5XofLZRNyGUHFsqWIX3Z/UPp/ADOL5X9PJdvWygbSPoi/AzQr3Td08rXBYzNx04olK1AGjF5F9i3UL5OrjuqUDag0k/pfs4DTi6UbVa+x6UYGgqfv5Dr/qhU79oc05ql+zK/VLZRLj+ijZ/JKsBrue5k0pqQrwP1pXr98z1tAFYoHVPh/UTgVeAjpVgWApc387twVamt1YB3gBNL5RsAb1
fKgaH5/K901X/HfvnVF16emmVm1ruNI/3VeC9Jy5P+orvItKyuEBFzK+8lfUhpt6vKFJ5NOrGrWcDvCv0uAO4njTS0ZQ/g/oh4b1pPRMwijTqsDqxfqn958bpKLi20MQOYAswm/Ywq5VNI61Q+WSibH3mdiaT+eerarHx+R+/THqQv7+eVyn9GGjnYvVR+W0RMK8Q0iZTgtnoPI+JVYGPgQtLI1HdIv3+vSTq5MA3sM8AawDn53hTbqEz7G0RKEMZGxPRSLLfmayq7sPT5S6Rp6eMkfbTyIo14TCWNEMH7Ix67Slq2tWs0s45zImJm1otFxOvAbaQF6l8i/eX5D10aVJbXPpwr6VVgLvA6aZQB0vSxzvJC5ctswVukL8ZtWY30hb9scuF40TPlitm8/LMoamwhtsZibJL6STpG0lTSyMQbpHu1ER2/T6sBL0XEzFJ5S9f1XDNtVHUPI+LliDgcGEQa8TmKFP+ppMXskNaSAPy7jZih5Z/HRyUtVyov/zzWIiVaU3MMxdd6pA0LiDQt7mzgUOANSeMl/a/Xh5h1Lq8RMTPr/a4CLgE+Btxc/otzQUtrMvq3o6/2tDEO+CxpXcIjpL/y9yOt5+jMP5QtbKFcLZQvjpZGQ1qKoZrYTiStafgNaQ3KdNL0qXOo3R8UF/se5oTrKeApSX8lJQMHUBgpWgLKP49+pN/R3Wn+mt7bSjgijlXamOELwC6k0aMT8vqlF5ZMuGZ9ixMRM7Pe7zrSAvOtgH1bqfdW/neFUnn5r+PQcsJRVRuSVgR2BEZGxKmF8rVaia8rPEv6K37ZuoXjS9pXgDsi4pvFQkkrkEZHKtqzuP9ZYCdJy5dGRWpyXRHxdN7JalAuqkz72oA0gtecSkwt/TzeiLa3551GSp6eiYinqojzMeAx4HRJnwXuJk0vO6mtc82sbZ6aZWbWy+U1DYeTFvDe0ErVZ0l/Jf5cqfy7zdSdDe99Ge5IG5W/Rpf/oj6ilfi6wk3AFpK2rhTk6T+HkRZWP1GDGBZSuk+Svkraxamo8iW8/DNpzk2kUaojSuXHkBKamxc5owPydrzl6VJI2oK0tXRlmtXDpGlUI8q/U5V1JBHxMmnk7OBiHUkbkEYsbqoipD+R7ufIZrYpVl5/g6Q6SeU/1j5GGokaUEU/ZlYFj4iYmfUBEdHq1qm5TqOka4EjlZ6jMY20uH3lZqo/lP89T9J4YGFEXFNtGxHRJOkfwPGSlgZeJH2ZXKODl7ikjCZvfyzpPNK0qINJcX45avOwwhuBUyRdRtqOd0PSlKanS/WmkRa6f0fSTFJi8q94fxvgohuAO4AzJK0OPEq6/18gLRif1sw5HXEgcICk60i/MwtIazG+Qdr16/8AIuJdSYfnuB7J1/oyaaTj06SthQG+T0qS7pX0a97fvreRlGi3KiKmSToJOBNYXdL1pO2I1wC+SNqEYAzw36Ttla8lTSdbKl/LQuCPi3E/zKzAiYiZmRUdCSxNmn4yn7SO4/ssuoj4T6TnK+wH/A/pL/bXtLON/XMb/5vP/xtp7v5LnXlBiyMiXs1Tcs4iXdeHgEnA5yPirzUK4/+A5Uj3a1/S6MGepCSpGOvbkg4mfcm+kPT/+ENoZgF9/uK/N2nB+L65XgPp5/SzToz9ItI2xTuSkpw60sLwvwFnRsTEQkzjJe0AjASOJc3amEZa31Spc1t+kOKPc+xvA3cCP2gh4VpERIyW9BRp9GdkLn4+x/SX/PlR0nNEPk8aeZqTy3aPiEUezmhmHaNFN+swMzMzMzNbsrxGxMzMzMzMas6JiJmZmZmZ1ZwTETMzMzMzqzknImZmZmZmVnNORMzMzMzMrOaciJiZmZmZWc35OSLWo+Qn4X6c9AAqMzMzM+uelgdeilaeFeJExHqajwMvdHUQZmZmZtamVYEXWzroRMR6mpkAzz//PHV1dV0di5mZmZmVNDU1MXjwYGhjBosTEeuR6urqnIiYmZmZ9WBerG5mZmZmZjXnRMTMzM
zMzGrOiYiZmZmZmdWcExEzMzMzM6s5JyJmZmZmZlZzTkTMzMzMzKzmnIiYmZmZmVnNORExMzMzM7OacyJiZmZmZmY150TEzMzMzMxqzomImZmZmZnVnBMRMzMzMzOrOSciZmZmZmZWc05EzMzMzMys5pbq6gDMOmKDkePpN2DZrg7DzMzMrFtrGL1nV4fQIo+ImJmZmZlZzTkRMTMzMzOzmnMiYmZmZmZmNedExMzMzMzMas6JiJmZmZmZ1ZwTETMzMzMzqzknIrZESApJ+3R1HGZmZmbWPTkRsQ+QtExXx2BmZmZmvZ8TkT5O0gRJ50s6R9IbwHhJ20u6X9J8SS9LGi1pqcI5DZJGlNp5RNKoyvFcfF0eGWko1PuCpIclzZP0tKSRxbbNzMzMrG/wF0ADOBj4FbAN8DHgJmAscBCwLnAJMA8YVWV7mwOvAYcAtwALASRtB1wBHAX8E1gTuDif8+PmGpI0ABhQKFq+yhjMzMzMrBvziIgBTI2I4yNiCrAL8DxwREQ8GRHXAyOBYyVV9fsSEa/ntzMi4pXC55HA6Ii4PCKejohbgZOBb7fS3AlAY+H1QruvzszMzMy6HSciBvBQ4f16wL0REYWyu4GBwKqL2c/GwCmSZlVepNGWQZKWbeGcM4H6wmtxYzAzMzOzbsBTswxgdjvrvwuoVLZ0FecNJI2K/KmZY/OaOyEi5gPzK5+lcrdmZmZm1hM5EbGyycCXJakwKrINMJP3p0W9DgyqnCCpDlij1M7bQP9S2cPAOhHxn06P2szMzMx6FE/NsrILgMHALyStK+kLpIXkZ0fEu7nO7cCBkraTtCFwOXlBekEDsKOkj0laMZedChyUd8r6tKT1JO0n6fQlflVmZmZm1q04EbEPiIgXgT2ALYBHgQuBXwPFZOFM4E7gRuCvwPXAtFJTxwI7kxa+T8xtjwf2Ii2IfwC4DzgGeHbJXI2ZmZmZdVf64Jpks+4tTwNrHDxiHP0GtLS+3czMzMwAGkbvWfM+m5qaqK+vB6iPiKaW6nlExMzMzMzMas6JiJmZmZmZ1ZwTETMzMzMzqzmvEbEepbJGpLGxkbq6uq4Ox8zMzMxKvEbEzMzMzMy6LSciZmZmZmZWc05EzMzMzMys5pyImJmZmZlZzTkRMTMzMzOzmluqqwMw64gNRo73k9XNzMxssXXFk8ct8YiImZmZmZnVnBMRMzMzMzOrOSciZmZmZmZWc05EzMzMzMys5pyImJmZmZlZzfXZRETSBEnnLOE+GiSN6MoYOoukkLRPO+oPy+essCTjMjMzM7Oeydv3dq0vAW93dRBVGgS81Y769+RzGpdMOGZmZmbWkzkR6UIRMb2rY2iLpGUiYkFEvNKe8yJiAdCuc8zMzMys7+izU7OypSSdL6lR0huSTpMkAEkrSrpC0luS5ki6WdJaxZMlfVnS45Lm52lYx7bWmaRDJc2QtGP+/IGpWbmNEyX9RtJMSc9JOqzUxmclPSJpnqQHJe2Tp0ANreaCJW0v6f4c88uSRktaqnB8Qr4n50h6Axifyz8wNautOMpTsyQNz9e+q6TJkmZJukXSoDbiHSCprvIClq/mOs3MzMyse+vricjBwDvAFsDRwPeAQ/OxscBmwN7A1oCAmyQtDSBpU2AccA2wITAKOE3S8OY6knQ8MBrYJSL+3kpMxwIPAp8BLgB+JWmd3EYdcAPwGLAJcDJwVrUXK+kTwE3AA8DGwOHAN4GTSlUPBhYA2wDfaaadjsaxLHAccCDwOWAIMKaNc04gTe+qvF6ooh8zMzMz6+b6+tSs54FjIiKAKZI2BI6RNIGUgGwTEfcASDog198HuJaUtPw9Ik7LbT0laX3g+6Qk5j2SziJ9+d4+Ih5vI6abIuKCwnnHADsAU4D9gQC+FRHzgCdycnFJldf73XwNR+RrflLSx4GzJJ0aEe/melMj4vhW2uloHEsD34mIafn6zgdOaeOcM4GzC5+Xx8mImZmZWY/X10dE7stfyCvuBd
YC1ieNlPyrciAi3iQlA+vlovWAu0vt3Q2sJal/oexY4FvAtlUkIQCTCn0GaZ3FyrloHWBS/vJfcX8VbVasB9xbuua7gYHAqoWyh9pop6NxzKkkIdnLvH9tzYqI+RHRVHkBM6vox8zMzMy6ub6eiNTCP4H+wNeqrF/eRSuo/c9p9hJqt7lr0xLqy8zMzMy6sb6eiGxZ+rwVMBV4gjRt7b3jklYijQQ8kYsmk9ZQFG0DPBURCwtl9wO7AydKOm4x450CbChpQKFs83acPxnYurIgP9uGNMrQnulOixuHmZmZmfVxfT0RGSLpbEnrSPo6cCRwbkRMBf4MXCJpW0kbA78DXszlAD8DdpR0sqS1JR0MHEEzi6/zOpM9gJGtPeCwCleRfmYXS1pP0q6kxd+QRhfacgEwGPiFpHUlfQH4MXB2YX1ILeIwMzMzsz6uryciVwAfJo1a/BI4F7g4HzuEtFbiRtLaEQF7RMTbABHxMGm61X7Av4FTgVMiYmxzHUXEXcCewOmSjuxIsHmNxOeBocAjwBm5X4B5LZ1XOP9FUkK0BfAocCHwa+D0WsZhZmZmZqYPrlu2nibv5nUZUB8Rc3t7HHnr4MbBI8bRb8CyS6obMzMz6yMaRu/Z1SH0Ok1NTdTX10P6XtjUUr2+vn1vjyPpIOBp0jSxjUnP7xhX6ySku8RhZmZmZj2TE5Ge52OkaVAfI21/ey3wIwBJFwL/08J5v4uIRR5OuCTiMDMzMzNri6dm9SKSVgbqWjjcFBGv1TKeJcFTs8zMzKwzeWpW56t2apYTEetRKolIY2MjdXUt5VxmZmZm1lWqTUT6+q5ZZmZmZmbWBZyImJmZmZlZzTkRMTMzMzOzmnMiYmZmZmZmNedExMzMzMzMas7PEbEeaYOR4719r1kHeJtKMzPrLjwiYmZmZmZmNedExMzMzMzMas6JiJmZmZmZ1ZwTETMzMzMzq7k+nYgouVjSdEkhaYakcwrHGySNaEd7q+d2hi6ZiD/QV7tiWwL9D5c0o6v6NzMzM7OerU8nIsBuwHBgL2AQ8O/S8c2Bizuzw574Bb6FpOf3wNpdEY+ZmZmZ9Xx9ffveNYGXI+IeAEnvFA9GxOtdElUNSBLQPyLeabNyMyJiLjC3c6MyMzMzs76iz46ISBoL/AIYkqdTNTRT5wMjAZLWlXSXpHmSnpC0Uz53n9Kpn5R0h6Q5kh6VtHU+fxhwGVCfzwtJo6qIdWVJN0iaK+kZSQeUji8yJUzSCrlsWKXv/Hl3SQ8B84FtJa0p6c+SXpU0S9IDknYqtDMBWA34eSXmXL7IyI6kwyVNk7RA0hRJB5aOh6RDJV2X781USXu3df1mZmZm1vv02UQEOBo4BXiBNC1r89YqS+oPXA/MAbYEDgPOaKH6GcAYYCjwFHC1pKWAe4ARQFPuc1Cu15axwGBgB+ArwHeBlas4rzmjgR8C6wGTgIHATcCOwGeAW4AbJA3J9b9EukenFGJehKQvAucCPwM2AC4CLpO0Q6nqSGAcsFHu90pJH2kpWEkDJNVVXsDy7b5iMzMzM+t2+uzUrIholDQTWBgRrwCk2Uot2pk0lWtYof6PgFubqTsmIv6a64wEHgc+FRFPSmpM3ac22iJpbWB3YIuIeCCXfROYXM35zTglIooxTwceLXw+OScVewPnR8R0SQuBmW3EfBwwNiIuyJ/PlrRVLr+jUG9sRFydr+NE4ChgC1IC1JwTSMmLmZmZmfUifXlEpL3WAZ4vfRm/v4W6kwrvX87/dnQEYz3gHeChSkFEPAl0dMH7g8UPkgZKGiNpct41bFbuc0jzp7ca592lsrtzedF79yYiZpNGh1q7N2cC9YXXqu2My8zMzMy6oT47IrKEvV14H/nfJZn0vZv/LQ7pLN1C3dmlz2NIoz3HAf8hLUD/A7BMZwZY8Hbpc9DKvYmI+aT1LECbo1ZmZmZm1kN4RKR6U4DBklYplLW6rqQFC4D+7aj/JClh3LRSIGkdYIVCncruXs
X1G9U+y2Qb0nSp6yLiMeAVYPUOxDw5t1Vu+4kq4zAzMzOzPsQjItW7FZgGXC7peNKi6dPzsWjxrEU1AAMl7UhamzEnIua0VDkipki6BbhI0uGkaVrnUNg6NyLmSroP+KGkZ0hTnU5vtsFFTQW+JOmGfB2nsWiC2gB8TtI1wPyIeKOZdn4KjJM0EbgN+DxpoftOzdQ1MzMzsz7OIyJVioiFwD6kXaYeAC7l/V2z5rWjnXuAC0kPBHwdOL6K0w4BXgLuBP5Eesjia6U63yAllg+REpWTqgzpe8BbpB29bgDGAw+X6pxCGiWZxvujLx8QEdeTdiI7jrQ4/9vAIRExoco4zMzMzKwPUUR7/phvRZK2Ae4i7Yg1ravj6QvyFr6Ng0eMo9+AZbs6HLMep2H0nl0dgpmZ9XJNTU3U19cD1EdEU0v1PDWrHfK2trNI05k+RXpuxt1OQszMzMzM2seJSPssD5xF2tr2DdJaiGMXp0FJ2wE3t3Q8IgYuTvtmZmZmZt2RE5F2iIgrgCs6udkHqX6HKzMzMzOzXsGJSBeLiLmk53eYmZmZmfUZXqxuPUplsXpjYyN1dXVdHY6ZmZmZlVS7WN3b95qZmZmZWc05ETEzMzMzs5pzImJmZmZmZjXnRMTMzMzMzGrOu2ZZj7TByPF+snoX8ZO5zczMrDN4RMTMzMzMzGrOiYiZmZmZmdWcExEzMzMzM6s5JyJmZmZmZlZzTkTMzMzMzKzm+nQiImmCpHO6QRzDJIWkFZZgHx+TdKuk2ZJmLKl+zMzMzMyq0acTka7QhcnPMcAgYCiw9uI0JKlB0ohOicrMzMzM+iQ/R6TvWBN4KCKmdnUgZmZmZmYeEckkDZA0RtKLefrSvyQNKxwfLmmGpF0lTZY0S9ItkgYV6iwl6bxc701JZ0m6XNL1+fhYYHvg6DwVKyStXghjU0kPSpoj6R5J67Qj/sMlTZO0QNIUSQcWjjUAXwYOyn2ObaMtSRol6TlJ8yW9JOm8fGwCsBrw88o15PKVJF2d798cSY9J+nqhzYPyPRlQ6ut6Sb+t9jrNzMzMrHdwIvK+84Gtgf2AjYBrgVskrVWosyxwHHAg8DlgCDCmcPwHwAHAIcA2QB2wT+H40cC9wCWkaVKDgOcLx88AjgU2A94BflNN4JK+CJwL/AzYALgIuEzSDrnK5sAtwLjc59FtNPll0lSubwNr5Wt4LB/7EvACcErhGgA+BDwE7JljuBj4raQt8vFrgf7A3oW4V871W7zOnCDWVV7A8m3EbmZmZmY9gKdmAZKGkJKHIRHxUi4eI2m3XH5iLlsa+E5ETMvnnU/6Ql5xJHBmRFyXjx8B7FE5GBGNkhYAcyLilUL/lbc/iog7c9lo4K+SPhQR89q4hOOAsRFxQf58tqStcvkdEfG6pPnA3GK/rRgCvALcFhFvA88B9+drmC5pITCz2FZEvMgHk7JfSNoV+Bpwf0TMlXQV6X5em+v8T257QiuxnACMrCJmMzMzM+tBPCKSbEj6a/1TecrVLEmzSNOo1izUm1NJQrKXgZUBJNUDq5C/sANExELSKEG1JpXaptJ+G9YD7i6V3Z3LO+Ja4MPA05IukfRFSa0mrZL6Szo5T8manu/frqSkpuISYBdJn8ifh5MSqGil6TOB+sJr1Y5dkpmZmZl1Jx4RSQYCC4FN879Fswrv3y4dC0B0nmL7lS/nNU8WI+L5vD5lJ2Bn4ALg+5K2zyMkzfk+acrXCNI0rtnAOcAyhXYnSnqUtFblb8CnSVOzWotlPjC/8rkwemRmZmZmPZhHRJKJpBGRlSPiP6VXNVOZiIhG4FXSegwgjRIAm5SqLsh9dabJpDUpRdsAT3S0wYiYGxE3RMRRwDDS+pkN8+HmrmEb4M8R8buIeBR4mua3Cb6UNBJyCGnq1/PN1DEzMzOzXs4jIkBEPCXpSuAKSceSEpP/AnYEJkXEX6ts6hfACZL+AzxJWjOyIu+PbgA0AFvm3bJmAdM74RJ+CoyTNB
G4Dfg8aVH5Th1pTNJwUqLxL2AOaS3HXODZXKUB+Jyka4D5EfEGMBX4iqTPAm8B3yNNVSsnQ1eR1pJ8CzioI/GZmZmZWc/nEZH3HQJcQdp5agpwPWl047l2tHEWcHVu515SojEeKC42H0Oa/vUE8DofXEPRIRFxPWla1HHA46Tdrg6JiAkdbHIGKVG4m7RuZSfg8xHxZj5+CrA6MI10DQCnAw+TrncCabH79c3E2gj8kXRvFjluZmZmZn2DWl8nbItDUj/StKlxEXFyV8fTXUj6O/B4nvbV3nPrgMbBI8bRb8CynR+ctalhdKvLeszMzKyPa2pqor6+HqA+IppaquepWZ1I0mrALsCdwADgCGAN0nSkPk/SiqT1JsOA73ZpMGZmZmbWpTw1q3O9S1qI/QBpWtOGwE4RMXlxGpX0eHFb4dLrgA60d0Ar7T2+OLG2YSIwFvhBRExZgv2YmZmZWTfnEZFOlHeAKu9e1Rn2ID1MsTmvdqC9v5AWojenpe15F1tErL6k2jYzMzOznsWJSA8QEc+2Xatd7c0EZnZmm2ZmZmZm7eHF6tajVBarNzY2UldX19XhmJmZmVlJtYvVvUbEzMzMzMxqzomImZmZmZnVnBMRMzMzMzOrOSciZmZmZmZWc941y3qkDUaO95PVq+QnoZuZmVl35BERMzMzMzOrOSciZmZmZmZWc05EzMzMzMys5pyImJmZmZlZzTkRMTMzMzOzmnMi0o1JCkn7dHUcZmZmZmadrdsnIt3ty7ikBkkjujoOMzMzM7OerNsnIj2RpP6SfG/NzMzMzFrQri/LkvaSNENS//x5aB6xGF2oc6mk30laSdLVkl6UNEfSY5K+XmpvgqTzJP1E0nRJr0gaVTjekN9el/tpyOUbS7pD0kxJTZIekrRZldewraR/Spor6fnc/3L52EGSZklaq1D/AklPSlpW0gRgNeDnOZ7IdYbn+7K3pCeA+cAQSZtLulXSG5IaJd0paZP23HNgkKSbc7xPS/pK6XrOkvRUvsdPSzpN0tKlOidJei3fr0sljZb0SOH4Uvk+zJD0Zm7zcknXF+rsJumuQp0bJa1Z6uezkh6RNE/Sg5L2yfdpaKHOBvl6Zkl6VdJvJX20nffEzMzMzHq49v7V/p/A8jmFSgMAACAASURBVMBn8uftgTeAYYU62wMTgA8BDwF7AhsAFwO/lbRFqc2DgdnAlsDxwCmSds7HNs//HgIMKny+Enghf94UGA283Vbw+YvzLcAfgY2AfYFtgfMBIuIK4CbgyvzlfE/gUOCAiJgDfCn3e0qOZ1Ch+WWBH+T6nwZey/fq8tzHVsBU4CZJy7cVa8FpOd6N83VfI2m9wvGZwHBgfeBo4FvAMYVrPgD4UY5tU+A54PBSHz8ADiDd522AOqA8HW454GxgM2BH4F1Sgtgv91MH3AA8BmwCnAycVWxA0grA7cDE3M5uwCrAuJYuXtIASXWVF+mempmZmVkPp4ho3wnSQ8DVETFG0nXAA8BIYCWgnvRFfe2ImNrMuTcCT0bEcfnzBKB/RGxXqHM/cHtE/DB/DuCLEVH863wTcGREXN7O2C8FFkbEtwtl2wJ3AstFxDxJKwKTSF+qvwScFxH/V6jfAJwTEecUyoYDlwFDI+LRVvrvB8wA9o+IG6uIN4ALI+LwQtl9wMMR8d0WzjkO2C8iNivUfzAijijUuQsYGBFD8+dXgDERMSZ/7g88DUyMiGbX5+RRjNeBDSPi35K+A5wOrBoR83KdQ4FLgM9ExCOSTgK2i4hdC+2sCjwPrBMRTzXTzyjS79cHDB4xjn4Dlm32vtkHNYzes6tDMDMzsz6kqamJ+vp6gPqIaGqpXkfWMdwJDJMkYDvgT8Bk0l/9twdeioipSuskTlaakjVd0ixgV2BIqb1Jpc8vAyu3EcPZwKWSbpP0w/IUoVZsDAzP04Jm5ZjGk+7DGgAR8RbwTdKowTTSaEs1FlC6FkmrSLpE0lRJjUATMJBF70Fr7m3m83
sjIpL2lXS30rS2WaRkoNj+OsD9pTbuL5xfTxqVeK8sIhaSRrOK17KW0lS7p3Mi2JAPVfpaB5hUSULK/WQbAzuU7v+T+VhLP8MzSQlu5bVqC/XMzMzMrAdZqgPnTAC+QfpS+XZEPJlHNoYBK5ISFYDvk6YKjSBN15kNnAMsU2qvPKUqaCNBiohRkq4iTfvaHfixpP0i4ro2Yh8IXASc18yx5wrvPwcsJE29Wo40/aktc2PR4aXLSSNFRwPPktaO3Mui96BDJG1Nmq41kpRQNQL7Acd2RvslN5Cu4VvAS6Sf0b9p37UMzO38oJljLzd3QkTMJ903AFL+a2ZmZmY9XUdGRCrrRI7h/aRjAikRGZbfQ1pr8OeI+F2ervQ0sHYH+nsb6F8ujIinIuLnEbELaVTmkCraehhYPyL+08xrAaQF16Qvyp8HZpHXjxQsaC6eFmxDmtp1U0Q8TvpC3d6F2Vs183lyfv9Z4NmIOCMiHszT4VYr1Z/C+2trKt77HBGNwKvFsjw1a5PC55VIIx6nR8TfI2IyKeks97OhpAHN9ZM9TFo/09DM/Z/d3MWbmZmZWe/U7kQkT12aRFrcPCEX/4P0xXVt3k9OpgI7552U1iONRKzSgRgbgB0lfUzSipI+LOl8ScMkrSZpG9IX3smtNwOkxdOfzecPzdONviDpfIC8iPy3pOTh5nyN++qDO1U1AJ+T9IkqdnuaChwoaT1JW5JGL+a249oBvirpG5LWlvRjYAveT46mknbn2k/SmpKOAr5YOv8XwDclHZyv9yTSQv0o1Tkh34t1gHNJiUalzlvAm8Bhkj4l6b9J0+OKriL9Pl2cr3dX4Lh8rNLOL4GPAFcr7Si2pqRdJV2Wkx8zMzMz6yM6+qyLO0mjAhMAImI68ATwSkRMyXVOJ/0FfHyu9wpwfbmhKhwL7Exa0DyRNGVqJeAK4CnSjks308yC5rKImERax7I2aWRnInAqaaoRpC/gs4ETc/3H8vuLJH0i1zkFWJ20fuT1Nrr8JukL/cPkBIe0m1Z7jCRNt5oEHAR8PSKeyPH9Bfg5KTF5hDRCclrpmq8krbMYk+NYAxgLFNdynAVcTbqn95JGgsZX6kTEuzmGTUnTsX5OmnpX7KeJNIo0NMdyBuneUmjnJdIoUX/gb6Qpe+eQFvC/2877YmZmZmY9WLt3zbKeT9KtpKTxwBaO9yONMI2LiJMXo58DSLuJ1UdEe0eCWmqzDmj0rlnV865ZZmZmVkvV7prVkcXq1oNIWhb4DmmEYyHwdWAn0ihTpc5qwC6kka4BwBGkkZOr2tnXQaS1QC+SNjM4i5TMdEoSYmZmZma9R0enZnVLev+J3c29Tuzq+IokHdBKrI93YlcB7EFax/MQafrUlyPitkKdd0kPRXwAuBvYENgpL0pvj48BvyONpvwcuBY4bHGCNzMzM7PeqbeNiBwKfLiFY9NrGUgV/gL8q4VjbT4lvlp5NGKnNuo8T1q7sbh9/QT4yeK2Y2ZmZma9X69KRCLixa6OoVoRMZPqnk9iZmZmZtbreLG69SiVxeqNjY3U1dV1dThmZmZmVlLtYvVetUbEzMzMzMx6BiciZmZmZmZWc05EzMzMzMys5pyImJmZmZlZzfWqXbOs79hg5Phu/WR1P83czMzMrHUeETEzMzMzs5pzImJmZmZmZjXnRMTMzMzMzGrOiYiZmZmZmdWcExEzMzMzM6s5JyJmZmZmZlZzTkTMzMzMzKzmnIhYTUjqL8m/b2ZmZmYG9PBERNIESedJ+omk6ZJekTQqH1tdUkgaWqi/Qi4blj8Py593lTRR0lxJt0taWdLukiZLapJ0laSqnp6XYzo/vxolvSHpNEkq1DlQ0oOSZuaYr5K0cuH4ipKulPR6jmmqpEPysWVy2y9LmifpWUknlK7x0nxuU76ejQvHR0l6JMfQkGO8RtLyhTrL5/5n536Oydd1TqHOAEljJL2Y6/2rcl/z8eGSZkjaW9ITwHxgSL7n9+dzZki6W9
JqVf3AzczMzKzX6NGJSHYwMBvYEjgeOEXSzu1sYxRwBPBZYDAwDhgB7A/sCewCHNnOmN4BtgCOBr4HHFo4vjRwMrAxsA+wOjC2cPw0YH1gd2A94HDgjXzsKGBv4GvAOsABQEPh3GuBlfO5mwIPA3+X9JFCnTVzv3vl1/bADwvHzwa2yf3sDGwHbFK6xvOBrYH9gI1yv7dIWqtQZ1ngB/naPw1MB64H7sznbA1cDAQtyAlPXeUFLN9SXTMzMzPrOZbq6gA6waSI+HF+P1XSEcCOwNR2tHFSRNwNIOnXwJnAmhHxdC77A7ADcFaV7T0PHBMRAUyRtCFwDHAJQET8plD3aUlHAQ9IGhgRs4AhwMSIeDDXaSjUH5Kv7a7c/rOVA5K2JSU/K0fE/Fx8nKR9gK+QvvRDSkCHR8TMfN5vSffsR3lk5GBg/4j4ez5+CPBSoZ8hwCHAkIiolI+RtFsuPzGXLQ18NyIezed9BKgHboyIabnO5Dbu5QnAyDbqmJmZmVkP0xtGRCaVPr9MGhHoaBuvAnMqSUihrD1t3peThIp7gbUk9QeQtKmkGyQ9J2kmaYQAUpIB8CtgvzyF6ieSPltoaywwlJTgnCdpl8KxjYGBwJuSZlVewBqkUZCKhkoSkhXv2SdJCcT9lYMR0QhMKdTfEOgPPFXqZ/tSPwso3NuImJ7jH5+v/2hJg8o3r+RMUvJSea3aRn0zMzMz6wF6w4jI26XPQUqw3s2fVTi2dBVtRCttLjZJywHj8+sA4HVSAjIeWAYgIm7O6yb2IE2N+rukX0bEcRHxsKQ1SFOvdgLGSbotIr5CSkJeBoY10/WMwvvFvb6BwELS1K+FpWOzCu/nlhIyIuIQSecBuwH7AqdL2jki7muuozyyUxndobDUxszMzMx6sN6QiLTk9fzvIGBifj+0hbqdbcvS562AqRGxUNK6wErADyPieQBJm5UbiIjXgcuByyX9E/gpcFw+1gT8Hvh9njZ2S5729DDwMeCdiGjoYOxPkxKVzYHncnz1wNrAP3KdiaQRkZUj4p/t7SAiJuY2zpR0L2ktTrOJiJmZmZn1Tr02EYmIuZLuA34o6RnS1KPTa9T9EElnAxeRFnkfCRybjz1HmrJ0pKQLgQ1IC9ffI+lU4CHgcWAAaUH55Hzse6RRj4mkUZ+vAq+QRjxuI00Du17S8cBTwMdJC+6vK6w5aVFEzJR0OfBTSdOB14Af574i13lK0pXAFZKOzbH8F2mdyaSI+GtzbeeRnMOAv5DWnKwDrAVc0VZcZmZmZta79IY1Iq35BinZegg4BzipRv1eAXyYtM7il8C55IXieaRjOCmBeIK0W9VxpfMXkNZGTCKNQiwk7U4FMJO0O9iDwAOkHbf2iIh38zSoPfI5l5ESkWuA1UjrXKr1PVJCcyMpubmblAjNK9Q5JF/nz0jrR66nMIrSgjnAusAfc2wXk+7PRe2IzczMzMx6AZWm8NtikjQBeCQiRnR1LJ0lr2t5ETg2In7dxbHUAY2DR4yj34CqHu3SJRpG79nVIZiZmZl1iaamJurr6wHq85KCZvXaqVnWcZI+Qxq5uJ+0U9Up+dCfuywoMzMzM+tVnIi0Q35+xhOtVFm/VrHUwHGkNRwLSFPbtouIN1o/xczMzMysOk5E2uclWt9566WIGFajWJaYvKvVpl0dh5mZmZn1Xl4jYj1KZY1IY2MjdXV1XR2OmZmZmZVUu0akt++aZWZmZmZm3ZATETMzMzMzqzknImZmZmZmVnNORMzMzMzMrOaciJiZmZmZWc15+17rkTYYOb7bPlndT1U3MzMza5tHRMzMzMzMrOaciJiZmZmZWc05ETEzMzMzs5pzImJmZmZmZjXnRMTMzMzMzGrOiUgXkRSS9unqOFpSq/i6+30wMzMzsyXDiYiZmZmZmdWcE5ElQNLSXR2DmZmZmVl31ucTEUkTJJ2fX42S3pB0miTl44tMHZI0Q9Lw/H71XGdfSXdKmgcckI99Q9LjkuZLel
nS+aXuPyrpOklzJE2VtHehj/6Sfi3pGUlzJU2RdHQpjmGS7pc0O8d0t6TVCse/IOlhSfMkPS1ppKQOPcRS0mBJ43I/0yX9WdLqheObS7o137/GfC82KbWxlqR/5HiekLRzR2IxMzMzs56vzyci2cHAO8AWwNHA94BD29nGaOBcYD1gvKTDgV8CFwMbAnsD/ymdMxIYB2wE3ARcKekj+Vg/4AXgq8D6wKnA/0n6GkBOKK4H7sznb537inx8O+CKHNP6wLeB4cCP2nldlRGe8cBMYDtgG2AWcIukZXK15YHLgW2BrYCpwE2Sls9t9AP+BCwAtgS+A5xVRd8DJNVVXrkfMzMzM+vhOvTX8V7oeeCYiAhgiqQNgWOAS9rRxjkR8afKB0knAT+LiHMLdR4onTM2Iq7O9U8EjiIlQ7dExNukRKXiGUlbA18jJS91QD1wY0RMy3UmF+qPBEZHxOX589OSTgZ+Avy4HdcFsC8pMTo03yMkHQLMAIYBf4uI24snSDosH98euBHYCVgX2DUiXipc881t9H0CH7wPZmZmZtYLeEQkua/yBTu7F1hLUv92tPFg5Y2klYGPA39v45xJlTcRMRtoAlYutPO/kh6S9LqkWcBhwJBcfzowljT6coOkoyUNKrS9MXCKpFmVFymxGiRp2XZcV6WtTwEzC21NBz4ErJljXUXSJXmKWWO+loGVeEkjRc9XkpDs3ir6PpOUcFVeq7YzdjMzMzPrhjwi0rYAVCprbjH67ML7uVW2/XYzffUDkLQfMAY4lvSFfSbwfdK0plQ54hBJ5wG7kUYtTpe0c0TcR0oCRpKmQ5XNqzK+ioHAQ+S1LyWv538vB1YiTW17Fpif416mmXOqFhHzc1sA5KU7ZmZmZtbDORFJtix93gqYGhELJb0OvDfSIGktoNURhYiYKakB2BG4o4MxbQPcExEXFPpes5m+JgITgTMl3QvsD9wHPAysExHldSkd8TAp0XktIppaife7EXFTjnUw8NHC8cnAYEmDIuLlXLZVJ8RmZmZmZj2Qp2YlQySdLWkdSV8HjiQt8ga4HThC0mckbQZcyKIjGc0ZBRwr6ai8W9Qmko5sR0xTgc0k7SppbUmnAZtXDkpaQ9KZkraWtJqkXYC1eH+dyKnAQXmnrE9LWk/SfpJOb0cMFVcCbwB/lrRd7nuYpPMkVaZKTQUOzP1smc8pjgzdBjwFXC5p47yY/owOxGJmZmZmvYATkeQK4MPA/aSdrs4l7UAFaWrU88A/gatI06XmtNVgXiQ+Avgu8DhpwfZa7YjpItK0qt8D/yJNe7qgcHwOafH3H0lf8C/OsV+U+x8P7AXsQlokfx9pAf6z7Yihci1zgM8Bz+WYJgO/Jq0RqYyQfBNYkTR68lvgPOC1QhvvAl/k/ft8KR3YwcvMzMzMegd9cI123yNpAvBIRIzo6lisbXkL38bBI8bRb0B719zXRsPoPbs6BDMzM7Mu09TURH19PUB9K9P6PSJiZmZmZma150SkD5J0QHFb39Lr8a6Oz8zMzMx6vz6/a1ZEDOvqGLrAX0jrTppTzUJ8MzMzM7PF0ufXiFjPUlkj0tjYSF1dXVeHY2ZmZmYlXiNiZmZmZmbdlhMRMzMzMzOrOSciZmZmZmZWc05EzMzMzMys5pyImJmZmZlZzfX57XutZ9pg5PiaPVndT0o3MzMz63weETEzMzMzs5pzImJmZmZmZjXnRMTMzMzMzGrOiYiZmZmZmdWcExEzMzMzM6s5JyI1ouRiSdMlhaQZks4pHG+QNKId7a2e2xnayXG2Kw4zMzMzs45wIlI7uwHDgb2AQcC/S8c3By7uzA4lDZc0ozPbNDMzMzPrDH6OSO2sCbwcEfcASHqneDAiXu+SqLoJSctExIKujsPMzMzMasMjIjUgaSzwC2BInk7V0EydD0yJkrSupLskzZP0hKSd8rn7lE79pKQ7JM2R9KikrfP5w4DLgPp8XkgaVWXIy0r6jaSZkp6TdFgp1g0l3S
5prqQ385SzgYXjE4rTznLZ9fk+FK/3ZElXSGqihdEgSQMk1VVewPJVXoOZmZmZdWNORGrjaOAU4AXStKzNW6ssqT9wPTAH2BI4DDijhepnAGOAocBTwNWSlgLuAUYATbnPQbleNY4FHgQ+A1wA/ErSOjm25YDxwFv5Or4K7AScX2XbRccBj+Z+TmuhzglAY+H1Qgf6MTMzM7NuxolIDUREIzATWBgRr1QxDWtn0lSugyLi0Yi4C/hRC3XHRMRfI+IpYCSwGvCpPM2pMXUfr+TXrCpDvikiLoiI/wBnAW8AO+Rj+wMfyrH9OyJuB44ADpS0SpXtV9weET+LiGkRMa2FOmcC9YXXqu3sw8zMzMy6ISci3dM6wPMR8Uqh7P4W6k4qvH85/7vyYvb/XpsREcArhTbXAx6NiNmF+neTfpfWaWc/D7ZVISLmR0RT5UVK6MzMzMysh3Mi0vO9XXgf+d/F/bm+Xfoc7WzzXUClsqWbqTe7mTIzMzMz6wOciHRPU4DBpalOra4racECoH/nhPSeycDGea1IxTak5GNK/vw6aU0K8N6alw06OQ4zMzMz68GciHRPtwLTgMslbSRpG+D0fCxaPm0RDcBASTtK+qikZTshtiuBeTm2DSTtQNoR7LcR8Wquczuwp6Q9Ja0L/ApYoRP6NjMzM7NewolINxQRC4F9gIHAA8ClvL9r1rx2tHMPcCHwe9IoxfGdENscYFfgIzm2PwB/Jy1Yr/gNcDlwBXAn8DRwx+L2bWZmZma9h9JaZOvu8qjIXaQdsVraYarXy88SaRw8Yhz9BnTGAE/bGkbvWZN+zMzMzHqDpqYm6uvrAerzZkPN8pPVuylJXwRmAVOBTwHnAnf35STEzMzMzHoPJyLd1/KkZ3gMIT3H4zbSgwY7TNJ2wM0tHY+IgS0dMzMzMzPrTE5EuqmIuIK0xqIzPUh6AruZmZmZWZfyGhHrUSprRBobG6mrq+vqcMzMzMyspNo1It41y8zMzMzMas6JiJmZmZmZ1ZwTETMzMzMzqzknImZmZmZmVnPeNct6pA1Gjl8iDzT0wwvNzMzMasMjImZmZmZmVnNORMzMzMzMrOaciJiZmZmZWc05ETEzMzMzs5pzImJmZmZmZjXXpxIRScMlzejE9iTpYknTJYWkoZImSDqnUKdB0oh2tLl6pa3OirMrSRqWr2eFro7FzMzMzLqPbp+I5C+x+3RSc78H1u6ktgB2A4YDewGDgH83U2dz4OJO7LPTEyozMzMzs1rrU88RiYi5wNxObHJN4OWIuKdSIKnc5+ud2J+ZmZmZWa/QrhERSXtJmiGpf/48NI9YjC7UuVTS7yStJOlqSS9KmiPpMUlfL7U3QdJ5kn6Spze9ImlU4XhDfntd7qchl28s6Q5JMyU1SXpI0mZVxP+BkQRJoyQ98v/t3XmYJVV9//H3h10dZlSUxbAKgriBoqwikyAuaBSNBndBjb9oIuASxZVBIbiQqIgRCSqCBMSoYERFwAwqi6wCERAQB0EYRZHuGXbw+/ujquV67X267+3ueb+ep57uqjp16nvumYb7vXXOuUle2w6hGkhyUpK1x1HXscBngI07Yxum3J8NzUry+CQ/TnJ3kiuTPHuEpz6Pbdt4Z5LLkuzUXr8Q+BKwoL2uOl+zUeJdkuQDSY5LsjzJDUlelOTRSU5tj13e/TomeWaSHyW5K8mNbX89rOP8a5Nc1PbF0iT/lWTdUeLYJMn/JPlDkjuS/CzJnmPFL0mSpLllokOzfgSsDTy13d8N+B2wsKPMbsBiYC3gYuAFwJNohicdn2T7rjpfD9wB7AC8G/hQkj3ac89of+5LM/RpaP8E4KZ2fzvgo8B9E2zLkM2BvWiGV72wjf/AcVy3P/ChNo7O2EbUJnCnAHfStPfNwKEjFD8UOBzYFrgGODHJasC5wAHAYHvfDdpy4/F24Bya/jsNOB44DvgK8DTgF8BxaR/rJNkc+B7wdeApwN7AM4EjO+pcHfggsA3N67
gpcOwoMXwWWBN4FvBk4D3A8pEKJ1kzyfyhjebfnyRJkma5CQ3NqqqBJD+lSTwuan9+EjgoyTxgAbAFcHZV/Zo/f4P8mSTPBf4euKDj+OVVdXD7+7VJ/hnYHTijqm5t3xPfXlVLO67ZGPhEVV09dN1E2tFlFWCfqloGkOT49v7vH+2i9rVYBjzQFdto9qBJfBYOXZPk/cAZw5Q9vKpOa8scBPwM2KKqrk4y0IQw7vsO+U5Vfb6t88PAW4ALq+pr7bGPAecB6wFLgfcCJ1TV0OT7a5PsB5yd5C1VdXdVfbGj/uvb8xcmmVdVwyUYGwNfr6orhq4ZI+b3AgdNsJ2SJEma4SYzWf1sYGH7qfmuwDeAq2g+Kd8NuLmqrk2yapIPtkOybkuyHHguzRvRTpd37d8CjDi0p/XvwDFJzkxyYPvJ/WQtGUpCJnD/ydoKuLErgbhghLKdr8st7c8Vjauzzt+0P68Y5tjQfbYB9mmHbS1v+/B0mn83mwEk2a4davWrNjE7u722u5+HHAF8IMk5SQ5O8pQxYj6MJsEd2jYco7wkSZJmgckkIotpko5tgPvapxKLaZ6O7MaDb0T/hWb40seAv6YZYnQ6sEZXfd1DqmqsuKpqEfBEmuFFfwNcmeQlk2jLpO7fI51xVftzReP6U51VVd3HhrnPPODzNH03tG0DPA74RTtX5HSaYWKvphmeNtQP3f08dN9jgMfSDAt7MnBRkreNFHBV3VNVg0MbsGykspIkSZo9JvPGdmieyNt5MOlYTJOILGx/B9gFOLWqvlJVl9EMwZnM0rn3Aat2H6yqa6rqk1X1HJqnMvtOou5e+zmwUZL1Oo6NObdkGPcyzGsyDS4BnlBV1w2z3Qs8HlgHOLCqftQmpWM+tamqG6vqqKp6KfBvwD9MayskSZI040w4EamqP9AM8Xk1DyYdP6SZ7LwlDyYn1wJ7JNk5ydY0n6yvx8QtAXZPsn6SRyR5SJIj03xR3iZJdqF5M3/VJOrutTNoJoR/OclT2tgPac/VyJf9hSXAvCS7J3lUkodOcZxDPgbs3L7e2yZ5XJIXJxmarP4rmqTobUkem+RFNBPXR5TkU0mem2SzJE+jeVo2G/pOkiRJU2iyQ33OpvlEfjFAVd0GXAksraqft2UOoflE/fS23FKaFaMm6p00k7xvBC4FHqD5FP44mtWkTga+yyyY0FxVD9CsLDUPuBA4hgdXzbp7AvWcCxxF8wWNt9KsNjblqupymuF2W9I8CbsU+DBwc3v+VpovdHw5Tf8fCLxrjGpXpVk56yqaFbmuAd469dFLkiRpJsuDUwXUD+1TkR/TrIj1i37HM9O1S/gObHTAyayy5tQ/CFry0RdMeZ2SJEkrk8HBQRYsWACwoJ3jO6yV6pvVZ4J2Uv1ymqFrWwCfBs4xCZEkSdLKZCasDjVlkny3c6nZru19E6xr41HqWp5kpOVpx7I2zdCkq2m++O9C4MWTrGso1l1Hi3VF6pYkSZKmw1x7IvIm4CEjnLttgnXdTLNc7WjnJ6yqjqOZ3zKVLmL0WCVJkqQZZU4lIu23uU9VXfcD101VfdOpqu5ilsQqSZIkgZPVNcsMTVYfGBhg/vz5/Q5HkiRJXcY7WX1OzRGRJEmSNDuYiEiSJEnqORMRSZIkST1nIiJJkiSp5+bUqllaeTzpoNPH/GZ1vyVdkiRp5vKJiCRJkqSeMxGRJEmS1HMmIpIkSZJ6zkREkiRJUs+ZiEiSJEnqORORPkvj6CS3Jakktyf5VMf5JUkOmEB9m7b1bDs9Ef/ZvSYUmyRJkjTERKT/ngfsA7wQ2AD4v67zzwCOnsobJtknye1TWackSZI0EX6PSP9tDtxSVecCJLm/82RV3dqXqCRJkqRp5BORPkpyLPAZYON2ONWSYcr82fCnJI9P8uMkdye5Msmz22v36rr0sUn+N8mdSS5LslN7/ULgS8CC9rpKsmgcsa6b5H+S3JXkl0lePUyZdyS5IskdSW5M8h9J5rXnHp
ZkMMnLuq7Zqy2/9lgxSJIkae4wEemv/YEPATfRDMt6xmiFk6wKnALcCewAvBk4dITihwKHA9sC1wAnJlkNYt8w9AAAGXNJREFUOBc4ABhs77lBW24sxwIbAX8NvAx4K7BuV5k/AvsBTwReD/wN8HGAqroDOAnYt+uafYH/rqplw900yZpJ5g9tgAmLJEnSHODQrD6qqoEky4AHqmopQJLRLtmDZijXwo7y7wfOGKbs4VV1WlvmIOBnwBZVdXWSgeb2TR1jSbIl8Hxg+6q6sD32RuCqrvZ8qmN3SZIPAEfRJC0AxwDnJtmgqm5Jsi6wJ/DsUW7/XuCg8cQpSZKk2cMnIrPLVsCNXQnEBSOUvbzj91van91PMMZra+B+4OKhA1V1NfBnE97bYWJnJfl1m2AdD6yT5KHtNRfQJESvby95DXAD8MNR7n0YsKBj23CSbZAkSdIMYiIyd93X8Xu1P6etv5NsCnybJgH6O2A74J/a02t0FD2GZpUwaIZlfamqihFU1T1VNTi0AcMO4ZIkSdLsYiIyu/wc2CjJeh3HRp1XMoJ7gVUnUP5qmmF82w0dSLIV8PCOMtvR/Ht6Z1WdX1XXAI8Zpq6vAJsk2Q94AvDlCcYuSZKkOcBEZHY5A/gF8OUkT0myC3BIe27EpwrDWALMS7J7kkcNDZ0aSVX9HPge8PkkOyTZjubJxl0dxa4DVgfeluSxSV4L/OMwdf0B+AbwCeD7VXXTBOKWJEnSHGEiMotU1QPAXsA84EKaZGBo1ay7J1DPuTSTyL8K3Aq8exyX7QvcDJxNk0gcDfy2o87LgHcA76H5UsZX00w0H84XaIZrfXG8MUuSJGluySjD8zULtE9FfkyzItYv+h3PeLRPSz4JPKaq7p3gtfOBgY0OOJlV1hz1QQ5LPvqCyQcpSZKkSRkcHGTBggUAC9o5vsNy+d5ZJslLgOXAtcAWwKeBc2ZDEtIOAdsAOBD4/ESTEEmSJM0dDs2afdYGPkszgfxYmiFaL16RCpPsmmT5SNuKh/wn76aJeynNsrySJElaSflEZJapquOA46a42otovoF9WlXVImDRdN9HkiRJM5+JiKiqu2hWvZIkSZJ6wsnqmlWGJqsPDAwwf/78focjSZKkLuOdrO4cEUmSJEk9ZyIiSZIkqedMRCRJkiT1nImIJEmSpJ4zEZEkSZLUcyYikiRJknrORESSJElSz5mISJIkSeo5ExFJkiRJPWciIkmSJKnnTESmSZI3J7kxyR+THNDveACSLEry037HIUmSJM2qRCRJJdmr33GMJcl84EjgY8BfAUf3N6KZJ8k+SW7vdxySJEnqj1mViPRbkjXGWXRjYHXgtKq6parunOT9Vl+BGCRJkqQZa9KJSJIXJrk9yart/rbtE4uPdpQ5JslXkqyT5MQkv05yZ5Irkryyq77FSY5I8vEktyVZmmRRx/kl7a/fbO+zpD2+TZL/TbIsyWCSi5M8fRzx79PGv1eSa5PcneT0JBt1lFmU5KdJ3pTkl8Dd7fGHt227tb3nD5JsM1QvcEVbxfVtrJu2516c5JL2XtcnOSjJah33qyRvSfKtJHcA759MDB31HZjkN+1r8wVgrbFel45rFya5IMkd7et0TpJNOs6P1ZZ3tP18RztE7T+SzBuqG/gSsKBtc3X2tSRJkua+FXki8iNgbeCp7f5uwO+AhR1ldgMW07wBvhh4AfAkmqFKxyfZvqvO1wN3ADsA7wY+lGSP9twz2p/7Aht07J8A3NTubwd8FLhvnG14KPB+4HXALsDDgZO6ymwB/B3wUmDb9tjXgHWB57f3vAQ4K8kjga8Cz27Lbd/GemOSXYHjgE8DTwD+H7BPe/9Oi4BvAk8GvjjJGEjy921d7wOeDtwCvHU8L0qbUJwCnA08BdiJps+qPT+etvwR2A94Ik2//g3w8fbcucABwGD7+mwAHD5CLGsmmT+00fybkyRJ0iyXqpr8xcnFwIlVdXiSbwIXAgcB6wALaBKELavq2m
Gu/TZwdVW9q91fDKxaVbt2lLkA+EFVHdjuF/CSqjqlo8wg8Laq+vIEY9+H5lP5HavqJ+2xxwNXATtU1QXtp/TvA/6qqm5tyzwTOA1Yt6ru6ajvOuDjVXV0km2BS4HNqmpJe/5M4KyqOqzjmte01zymo32fqqq3d5SZbAznApdW1T91nD8fWKuqhpKZkV6bRwK/BxZW1dnDnB+zLcNc8zLgqKp6VLu/T9vWh48RyyKaf1N/ZmBggPnz5492qSRJkvpgcHCQBQsWACyoqsGRyq3oHJGzgYVJAuwKfIPmjfwzaZ6G3FxV1yZZNckH26E6tyVZDjyXZi5Fp8u79m+h+dR/NP8OHJPkzHYo0uYTiP9+muQJgKq6Grgd2LqjzA1DCUBrG2Ae8Psky4c2YDNgtHtvQ/OEp/Oa/wQ2SPLQjnIXDXPtZGLYGvhJVz3njRLfn1TVbcCxwOlJ/ifJ/kk2mEhbkjw7yVlphuMtA44H1ulq63gcRpPUDm0bTvB6SZIkzUCrjV1kVIuBN9C8Mb2vqq5un2wsBB5Bk6gA/AuwP81wnCtohl99CuieeN09pKoYI1mqqkVJ/otm2NfzgYOTvKKqvjm5Jv2FO7r259EkSAuHKTvaKlDzaD7Z/8Yw5+4e5X5TGcO4VdW+SY4AngfsDRySZI+qOp8x2tLOifk28Dma4Vq30SSnX6Dp83FP3m+f+HQ+9ZlMcyRJkjTDrGgiMjRP5O08mHQsBg6kSUT+rT22C3BqVX0FIMkqwJbAlRO8333Aqt0Hq+oa4Brgk0lOpJlHMp5EZDWa+RMXtHFtRTNP5KpRrrkEWB+4f2jY1ThdAmxVVddN4JoVieEqmrk2x3Uc23EiN6mqS2mGmB2W5DzgVcD5jNGWJNvRJJDvrKo/tsf+vqvYvQzTl5IkSVo5rNDQrKr6A81wqlfTJCAAPwSeRpNoDCUn1wJ7JNk5ydbA54H1JnHLJcDuSdZP8ogkD0lyZLvC0yZJdqGZtD5aItHpPuAzSXZo3zwfC5xfVReMcs2ZNEOcTknynCSbtu06NKOv1vVh4HXt6lJPTLJ1klckOWScsU40hk8Db0iyb5ItkxxMM3F8TEk2S3JYkp3a1/U5wON48HUdqy3X0Sxf/LYkj03yWuAfu26zBJiXZPckj5rEkC1JkiTNYlPxPSJn03yyvRj+NL/gSmBpVf28LXMIzafop7flltKsyjRR7wT2AG6k+aT+AZqJ8cfRPBE5Gfguw0xuHsGdNF86+F/AOcBymmFII6pmdv+eNAnXl9r7ngRsAvxmlOtOB14IPIdmXsr5NE+SbhhnrBOKoaq+CnyEZqWqi9tznxvnLe4EHg98va37aOCzNAnkmG2pqsuAdwDvAf6PJlF9b1cbzgWOolll7FaaVdIkSZK0klihVbNms/Gu2qSZpV3Cd8BVsyRJkmamXq2aJUmSJEkTtqKT1WesJN+lWVJ4OP8K3NzDcGacdsndkTy/qn7Us2AkSZK00pmziQjwJuAhI5y7reO7MlZWo32p4a97FoUkSZJWSnM2Eakq30yPYoqWEZYkSZImxTkikiRJknrORESSJElSz5mISJIkSeo5ExFJkiRJPWciIkmSJKnnTEQkSZIk9ZyJiCRJkqSeMxGRJEmS1HMmIpIkSZJ6zkREkiRJUs+ZiEiSJEnqORORWS7JoiQ/7XcckiRJ0kSYiMxQSRYn+dQ4ih4O7N6H+0qSJEmTtlq/A9DkJAmwalUtB5b3Ox5JkiRpInwiMgMlORbYDdg/SbXbPu3P5ye5GLgHeGb30KwkxyY5JclBSW5NMpjkqCRrTPK+myW5Lsm7uspu257fot2vJG9J8t0kdyW5PsnLuq7ZKMnJSW5PcluSU5NsumKvliRJkmYjE5GZaX/gPOA/gQ3a7cb23EeBA4GtgctHuH739vxC4JXAS4GDJnnfXwFfBPbtKrsv8MOquq7j2EeArwPbACcAJyXZGiDJ6sDpwDJgV2AXmic53xstSUqyZpL5Qx
uw9jjaIUmSpBnORGQGqqoB4F7gzqpaWlVLgQfa0x+qqjOq6hdVddsIVdwLvKGqflZVpwEfAvZLMmp/D3ffqnoAOBbYKsn28Kek4lU0CUqnr1XVMVV1TVV9ELgIeFt7bm+af29vqqorquoqmmRmY5qEaSTvBQY6tptGa4MkSZJmBxOR2eeicZS5rKru7Ng/D5gHbDSZG1bVzcBpwBvaQ38LrAl8ravoecPsb93+vg2wBbAsyfIky4HbgLWAzUe5/WHAgo5tw8m0QZIkSTOLk9Vnnzv6dN9jgOOTvJ3mScZXu5KdscwDLgZePcy5W0e6qKruoZkPA0AzR1+SJEmznYnIzHUvsOokr90myUOq6q52f0ea+Rg3jnLNWPf9Dk0S9BbgecCzhimzI3Bc1/6l7e+X0AzP+m1VDY4jDkmSJM1hDs2auZYAOyTZNMmjmFhfrQF8IckTkuwJHAwcWVV/nOh9h+aVdMwVOQy4tqq6h2EBvDzJG5JsmeRgYHvgyPbcCcDvgFOT7NquxrUwyRFJHG4lSZK0kjERmbkOp5mgfiXN0KWNJ3DtWcC1wA+BrwLfAhZNwX2/QJPkfGmEaw8CXkGzmtfrgFdW1ZUA7TCuZ9GswvUN4Kq2vrUAn5BIkiStZFJV/Y5BU6j9LpCHV9Ve01D3rjRJzkZV9ZuucwW8pKpOmer7dt1nPjAwMDDA/Pnzp/NWkiRJmoTBwUEWLFgAsGC0IfnOEdGYkqwJPJrmqcrXupMQSZIkaaIcmrUSSbLx0NK5I2wjDf96JXAD8HDg3b2LWJIkSXOVQ7NWIklWAzYdpciSqrq/R+FMikOzJEmSZjaHZukvtEnGdf2OQ5IkSXJoliRJkqSeMxGRJEmS1HMmIpIkSZJ6zkREkiRJUs+ZiEiSJEnqORMRSZIkST1nIiJJkiSp50xEJEmSJPWciYgkSZKknjMRkSRJktRzJiKSJEmSes5EZAolqSR7zYA4liQ5oN9xjEeSRUl+2u84JEmS1FsmIpIkSZJ6zkREkiRJUs+ZiHRJ8rIkVyS5K8nvk5yZ5GFJnpHkjCS/SzKQ5OwkTxujro2SnJzk9iS3JTk1yaYd5xcmuSDJHW2Zc5JsMs44/zbJhUnubmP6ZleRhyb5YpJlSX6V5M1d138syTVJ7kxyfZKPJFm94/yiJD9N8tp2qNdAkpOSrN1RZnGSI5J8vG3f0iSLuu7z8CTHJLk1yWCSHyTZZjxtbK9fM8n8oQ1Ye8yLJEmSNOOZiHRIsgFwIvBFYGtgIfANIDRvgL8MPBPYEbgW+E7nG/OuulYHTgeWAbsCuwDLge8lWSPJasApwNnAU4CdgKOBGkecLwC+CXwHeCqwO3BBV7F3Ahe15/8D+FySrTrOLwP2AZ4A7A/8A/D2rjo2B/YCXthuuwEHdpV5PXAHsAPwbuBDSfboOP81YF3g+cB2wCXAWUkeOVY7W+8FBjq2m8Z5nSRJkmawVI35vnel0T7huBjYtKpuGKPsKsDtwKuq6tvtsQJeUlWnJHkN8AFg62pf5CRrtNfsRZMk/B5YWFVnTzDOc4Hrq+o1I5xfAvyoql7b7gdYChxUVUeNcM27gFdU1dPb/UXAvwDrV9Wy9tjHgWdV1Y7t/mJg1arataOeC4AfVNWBSZ4JnAasW1X3dJS5Dvh4VR3d3mevqtp2hLjWBNbsOLQ2cNPAwADz588f5VWSJElSPwwODrJgwQKABVU1OFK51XoX0qxwGXAWcEWS04HvA/9dVX9Ish5wCM1TknWBVYGHAhuPUNc2wBbAsiYP+JO1gM2r6vtJjgVOT3IGcCZwclXdMo44twX+c4wylw/9UlWVZGkbNwBJ9gb2o3nqMY/m30L3P5QlQ0lI65bOOrrvM0yZbdq6f9/1Gjykve+Y2gSmM4kZz2WSJEma4UxEOlTVA+2wop2B5wBvAw5NsgPwOWAdmmFMN9C8OT4PWGOE6ubRPF159TDnbm3vt2+SI4DnAXsDhyTZo6rOHyPUu8bRnP
u69ot2KF6SnYATgINoho8NAK+gGc41rjrGWWYeTWKycJj4bh81ekmSJM1pJiJd2mFU5wDnJPkwTdLxEpo5Hm+tqu9AMxEdeNQoVV1Ck1z8drRHUlV1KXApcFiS84BXAWMlIpfTzAv50rga9Zd2Bm6oqkOHDox3kvwEXQKsD9xfVUumoX5JkiTNUk5W75BkhyTvS/L0JBsDLwUeDVxFMzn9tUm2bp+QnMDoTyZOAH4HnJpk1ySbtatkHZFkw3b/sCQ7JdkkyXOAx7X3GsvBwCuTHNzG8+Qk75lAU68FNk7yiiSbJ9mPJtmaamfSPDU6JclzkmyaZOckhyZ5+jTcT5IkSbOEicifGwSeRbMa1TU0c0LeWVXfBd4IPILmU/7jgSOA345UUVXd2db1K5qVt64CvkAzR2QQuBN4PPD19l5HA58FPj9WkFW1GHg58CLgp8APgO3H28iq+hbwSeDI9vqdgY+M9/oJ3KeAPYEf0jy9uQY4CdgE+M1U30+SJEmzh6tmaVZpv0tkwFWzJEmSZqbxrprlExFJkiRJPWciMgMl+VmS5SNsw63CJUmSJM0qrpo1M+0JrD7COedWSJIkadYzEZmBxvpWd0mSJGm2c2iWJEmSpJ4zEZEkSZLUcyYikiRJknrORESSJElSz5mISJIkSeo5ExFJkiRJPWciIkmSJKnnTEQkSZIk9ZyJiCRJkqSeMxGRJEmS1HMmIpIkSZJ6zkREkiRJUs+ZiEiSJEnqORMRSZIkST1nIiJJkiSp51brdwDSZAwODvY7BEmSJA1jvO/TUlXTHIo0dZJsCvyyz2FIkiRpbBtW1a9HOukTEc02t7U/NwSW9TMQrbC1gZuwL+cK+3NusT/nDvtybplN/bk2cPNoBUxENFstqyrHZ81iSYZ+tS/nAPtzbrE/5w77cm6ZZf05ZnxOVpckSZLUcyYikiRJknrORESzzT3Awe1PzW725dxif84t9ufcYV/OLXOqP101S5IkSVLP+UREkiRJUs+ZiEiSJEnqORMRSZIkST1nIiJJkiSp50xEJEmSJPWciYj6Ksk/JVmS5O4kP0my/RjlX57k6rb8FUn27DqfJB9OckuSu5KcmeRx09sKDZmG/nxpku8n+X2SSrLt9LZAnaayP5OsnuRj7fE7ktyc5Lgkj5n+lmga/jYXtefvSPKH9r+1O0xvKzRkqvuzq+xR7X9vD5j6yDWcafj7PLbtw87te9PbiskxEVHfJNkb+Hea9bCfBlwGnJ5k3RHK7wycCHwBeCpwCnBKkid1FHs3sB/wj8AOwB1tnWtNVzvUmKb+fBjwY+A90xi6hjEN/fnQtp6PtD9fCmwFfGsamyGm7W/zGuCfgScDzwSWAN9P8uhpaoZa09SfQ2VfAuwI3Dw90avbNPbn94ANOrZXTksDVlRVubn1ZQN+AhzZsb8K8GvgwBHKfxX4dtex84Gj2t8D3AK8q+P8AuBu4BX9bu9c36a6P7uObwoUsG2/27mybNPZnx3nn9H268b9bu9c3nrUl/Pbvty93+2d69t09SfwV8BNwBNpEssD+t3WlWGbjv4EjgVO6XfbxrP5RER9kWQNYDvgzKFjVfXHdn+nES7bqbN86/SO8psB63fVOUDzRz5SnZoC09Sf6pMe9ucCmjevt086WI2qF33Z3uPNwADNp7maJtPVn0lWAY4HPlFVP5vKmDWyaf77XJjkt0l+nuRzSdaZorCnlImI+uVRwKrAb7qO/4YmmRjO+mOUX7/j2Hjr1NSYjv5U/0x7f7bDJT8GnFhVg5MPVWOYtr5M8sIky2meOr8d2KOqfrfCEWs009Wf7wHuB46Yghg1ftPVn98DXgfsTtO3uwHfTbLqigY81VbrdwCSpJVLktWBk2mGU76lz+Fo8v4X2JbmzdQ/ACcn2aGqftvfsDQRSbYD9geeVu24Hs1uVXVSx+4VSS4HfgEsBM7qS1Aj8ImI+uV3wAPAel3H1wOWjnDN0jHKL+04Nt46NTWmoz/VP9
PWnx1JyCY0n6D7NGR6TVtfVtUdVXVdVZ1fVW+k+UT9jSseskYxHf25K7Au8Ksk9ye5n+bv89+SLJmKoDWinvy/s6qub++1xeTCnD4mIuqLqroXuJjmsSHwpzGquwPnjXDZeZ3lW3t0lP8lzR9iZ53zaVbPGqlOTYFp6k/1yXT1Z0cS8jjg2VX1+ykMW8Po8d/mKsCak4tU4zFN/Xk88BSap1tD283AJ4DnTlXs+ku9+vtMsiGwDs2CPjNLv2fLu628G7A3zdji1wNbA58H/gCs154/Djiso/zOwH3AO4HHA4uAe4EndZR5T1vHi2iWlTwFuB5Yq9/tnevbNPXnI2n+p7gnzaTmvdv99fvd3rm+TXV/AqsDpwI3AtvQjGce2tbod3vn8jYNffkw4F9plnndhGay7Rfbezyx3+2d69t0/Ld2mHsswVWzZmV/AvNoksgdaVac3J0m2bkGWLPf7f2L9vc7ALeVe6NZh/4G4B6a1a126Di3GDi2q/zLgZ+35f8P2LPrfIAP0zwZuZtmZYkt+93OlWWbhv7chyYB6d4W9butK8M2lf3Jg0swD7ct7Hdb5/o2xX25FvANmiVG76H59PxU4Bn9bufKsk31f2uHqX8JJiKzsj+Bh9CsovVbmgRlCXA0bWIz07a0QUuSJElSzzhHRJIkSVLPmYhIkiRJ6jkTEUmSJEk9ZyIiSZIkqedMRCRJkiT1nImIJEmSpJ4zEZEkSZLUcyYikiRJknrORESSJElSz5mISJIkSeo5ExFJkiRJPff/Af2qaego5kGwAAAAAElFTkSuQmCC\n"
          },
          "metadata": {
            "needs_background": "light"
          }
        }
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "sns.jointplot(data=df,x='route',y='')"
      ],
      "metadata": {
        "id": "a7yl_yQ-uQ2I"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "We can see that the top five features associated with `booking_complete` are:\n",
        "\n",
        "1. `route`\n",
        "2. `booking_origin`\n",
        "3. `flight_duration`\n",
        "4. `wants_extra_baggage`\n",
        "5. `length_of_stay`"
      ],
      "metadata": {
        "id": "GoizpwewVoju"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Train/validation split\n",
        "\n",
        "from sklearn.model_selection import train_test_split\n",
        "\n",
        "# Helper that splits the dataset into training and validation sets\n",
        "def dataset(X, y):\n",
        "    # Hold out 20% of the rows as the validation set\n",
        "    train_full_X, val_X, train_full_y, val_y = train_test_split(X, y, test_size=0.2, random_state=0)\n",
        "\n",
        "    # Split the remaining 80% again (75/25). The test portion is not returned,\n",
        "    # so models are trained on 60% of the data and validated on 20%.\n",
        "    train_X, test_X, train_y, test_y = train_test_split(train_full_X, train_full_y, test_size=0.25, random_state=0)\n",
        "    return (train_X, val_X, train_y, val_y)"
      ],
      "metadata": {
        "id": "Losgj--AWuoc"
      },
      "execution_count": null,
      "outputs": []
    },
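    {
      "cell_type": "code",
      "source": [
        "from sklearn.preprocessing import MinMaxScaler\n",
        "\n",
        "def scale(X):\n",
        "    # Rescale every column to [0, 1], keeping the column names and index\n",
        "    scaler = MinMaxScaler()\n",
        "    X_scaled = scaler.fit_transform(X)\n",
        "    return pd.DataFrame(X_scaled, columns=X.columns, index=X.index)"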
      ],
      "metadata": {
        "id": "q7nI9Tr9W1IU"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "# Model 1 : Random forest classifier with top 6 features"
      ],
      "metadata": {
        "id": "8d7Oq8N6W8Lc"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "from sklearn.metrics import roc_auc_score\n",
        "from sklearn.metrics import accuracy_score\n",
        "from sklearn.ensemble import RandomForestClassifier\n",
        "\n",
        "# Features used by the reduced models\n",
        "features = ['route', 'booking_origin', 'flight_duration', 'wants_extra_baggage', 'length_of_stay', 'num_passengers']\n",
        "X = df[features]\n",
        "# One-hot encode every selected column (the numeric ones are treated as categories too)\n",
        "X = pd.get_dummies(X, columns=features)\n",
        "X = scale(X)\n",
        "y = df.booking_complete\n",
        "\n",
        "X_train, X_val, y_train, y_val = dataset(X, y)\n",
        "\n",
        "forest_model = RandomForestClassifier(random_state=1)\n",
        "forest_model.fit(X_train, y_train)\n",
        "preds = forest_model.predict(X_val)\n",
        "\n",
        "# Note: the AUC below is computed from hard class predictions, not predicted probabilities\n",
        "print('ACCURACY: ', accuracy_score(y_val, preds)*100)\n",
        "print('AUC score: ', roc_auc_score(y_val, preds))"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "qu8qczXoXUj7",
        "outputId": "2d640c57-5c28-42f1-859e-52c7679226d0"
      },
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "ACCURACY:  83.36\n",
            "AUC score:  0.5657818407546988\n"
          ]
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "# Model 2 : Random forest classifier with all features"
      ],
      "metadata": {
        "id": "LefYDSU_XGZ8"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "X = df.drop('booking_complete', axis=1)\n",
        "# One-hot encode the categorical columns\n",
        "X = pd.get_dummies(X)\n",
        "X = scale(X)\n",
        "y = df.booking_complete\n",
        "\n",
        "X_train, X_val, y_train, y_val = dataset(X, y)\n",
        "\n",
        "forest_model = RandomForestClassifier(random_state=1)\n",
        "forest_model.fit(X_train, y_train)\n",
        "preds = forest_model.predict(X_val)\n",
        "\n",
        "print('ACCURACY: ', accuracy_score(y_val, preds)*100)\n",
        "print('AUC score: ', roc_auc_score(y_val, preds))"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "FRQuEttcYuNM",
        "outputId": "7971b91b-f5e9-429c-e3d2-ef20ebc41daa"
      },
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "ACCURACY:  84.76\n",
            "AUC score:  0.5479604084813514\n"
          ]
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "# Model 3 : XGB classifier with top 6 features"
      ],
      "metadata": {
        "id": "xbatS3WXXKFm"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "from xgboost import XGBClassifier\n",
        "\n",
        "# Restrict the inputs to the columns listed in `features`\n",
        "X = df[features]\n",
        "\n",
        "# One-hot encode; columns=features encodes every listed column, regardless of dtype\n",
        "X = pd.get_dummies(X, columns=features)\n",
        "X = scale(X)\n",
        "\n",
        "y = df.booking_complete\n",
        "\n",
        "X_train, X_val, y_train, y_val = dataset(X, y)\n",
        "xgb_model = XGBClassifier()\n",
        "\n",
        "xgb_model.fit(X_train, y_train)\n",
        "prediction_xgb = xgb_model.predict(X_val)\n",
        "# Note: the AUC below is computed from hard class labels, not predicted probabilities\n",
        "print('ACCURACY: ', accuracy_score(y_val, prediction_xgb) * 100)\n",
        "print('AUC score: ', roc_auc_score(y_val, prediction_xgb))"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "Mpfm2eO3ZSvb",
        "outputId": "5914107b-6b07-4064-9720-8fa6c7a2c652"
      },
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "ACCURACY:  84.87\n",
            "AUC score:  0.5005431112674873\n"
          ]
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "# Model 4 : XGB classifier with all features"
      ],
      "metadata": {
        "id": "n5b68yEWXO4b"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Use every column except the target as a feature\n",
        "X = df.drop('booking_complete', axis=1)\n",
        "# One-hot encode the categorical columns\n",
        "X = pd.get_dummies(X)\n",
        "X = scale(X)\n",
        "y = df.booking_complete\n",
        "\n",
        "X_train, X_val, y_train, y_val = dataset(X, y)\n",
        "\n",
        "xgb_model = XGBClassifier()\n",
        "xgb_model.fit(X_train, y_train)\n",
        "prediction_xgb = xgb_model.predict(X_val)\n",
        "# Note: the AUC below is computed from hard class labels, not predicted probabilities\n",
        "print('ACCURACY: ', accuracy_score(y_val, prediction_xgb) * 100)\n",
        "print('AUC score: ', roc_auc_score(y_val, prediction_xgb))"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "UsVzf8iAZR0c",
        "outputId": "d7442c87-b782-4c03-c588-b8f2ac97b5ed"
      },
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "ACCURACY:  84.83000000000001\n",
            "AUC score:  0.5065532363131326\n"
          ]
        }
      ]
    },
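    {
      "cell_type": "markdown",
      "source": [
        "A note on the AUC values above: each model cell passes hard 0/1 predictions to `roc_auc_score`, so the score reflects a single decision threshold rather than the model's full ranking of customers. The sketch below shows the difference on synthetic data. It is only an illustration: the data set, the assumed class balance of roughly 15% positives, and the names `X_demo`, `y_demo` and `demo_model` are hypothetical and not part of the analysis above.\n",
        "\n",
        "```python\n",
        "from sklearn.datasets import make_classification\n",
        "from sklearn.ensemble import RandomForestClassifier\n",
        "from sklearn.metrics import roc_auc_score\n",
        "from sklearn.model_selection import train_test_split\n",
        "\n",
        "# Synthetic, imbalanced stand-in for the booking data (assumption: ~15% positives)\n",
        "X_demo, y_demo = make_classification(\n",
        "    n_samples=2000, n_features=10, weights=[0.85], random_state=0\n",
        ")\n",
        "X_tr, X_te, y_tr, y_te = train_test_split(\n",
        "    X_demo, y_demo, test_size=0.2, stratify=y_demo, random_state=0\n",
        ")\n",
        "\n",
        "demo_model = RandomForestClassifier(random_state=1).fit(X_tr, y_tr)\n",
        "\n",
        "# AUC from hard 0/1 labels: a single operating point on the ROC curve\n",
        "auc_labels = roc_auc_score(y_te, demo_model.predict(X_te))\n",
        "# AUC from the predicted probability of the positive class: uses the full ranking\n",
        "auc_proba = roc_auc_score(y_te, demo_model.predict_proba(X_te)[:, 1])\n",
        "\n",
        "print('AUC from labels: ', auc_labels)\n",
        "print('AUC from probabilities: ', auc_proba)\n",
        "```\n",
        "\n",
        "Applied to the models in this notebook, the equivalent change would be to pass `predict_proba(X_val)[:, 1]` as the second argument of `roc_auc_score`. The resulting scores are not shown here because those cells have not been re-run."
      ],
      "metadata": {}
    },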
    {
      "cell_type": "markdown",
      "source": [
        "We select the random forest classifier with all features as the final model, as it has good accuracy (84.76%) and a higher AUC score (0.548) than both XGBoost models (0.501 and 0.507).\n",
        "\n",
        "Next, we validate it on the test data set."
      ],
      "metadata": {
        "id": "TtPr9DwFqpwo"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Use every column except the target as a feature\n",
        "X = df.drop('booking_complete', axis=1)\n",
        "# One-hot encode the categorical columns\n",
        "X = pd.get_dummies(X)\n",
        "X = scale(X)\n",
        "y = df.booking_complete\n",
        "\n",
        "# Hold out 20% of the rows as the test set\n",
        "train_full_X, test_X, train_full_y, test_y = train_test_split(X, y, test_size=0.2, random_state=0)\n",
        "\n",
        "forest_model = RandomForestClassifier(random_state=1)\n",
        "forest_model.fit(train_full_X, train_full_y)\n",
        "preds = forest_model.predict(test_X)\n",
        "\n",
        "# Note: the AUC below is computed from hard class labels, not predicted probabilities\n",
        "print('ACCURACY: ', accuracy_score(test_y, preds) * 100)\n",
        "print('AUC score: ', roc_auc_score(test_y, preds))"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "LECxTwU1ql6o",
        "outputId": "5b8a01ec-cff1-4b55-c865-4ffb7ed6ea8e"
      },
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "ACCURACY:  85.09\n",
            "AUC score:  0.5577796717361984\n"
          ]
        }
      ]
    }
  ]
}