<a href="https://colab.research.google.com/github/aturant/ActuarialDataScience/blob/master/submission_week1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<div style="text-align: center">
  <img alt="AIcrowd" src="https://gitlab.aicrowd.com/jyotish/pricing-game-notebook-scripts/raw/master/pricing-game-banner.png">
</div>

# How to use this notebook 📝

1. **Copy the notebook**. This is a shared template and any edits you make here will not be saved. _You should copy it into your own drive folder._ For this, click the "File" menu (top-left), then "Save a Copy in Drive". You can edit your copy however you like.
2. **Link it to your AICrowd account**. In order to submit your code to AICrowd, you need to provide your account's API key (see [_"Configure static variables"_](#static-var) for details).
3. **Stick to the function definitions**. The submission to AICrowd will look for the pre-defined function names:
  - `fit_model`
  - `save_model`
  - `load_model`
  - `predict_expected_claim`
  - `predict_premium`

    Anything else you write outside of these functions will not be part of the final submission (including constants and utility functions), so make sure everything is defined within them, except for:
4. **Define your preprocessing**. In addition to the functions above, anything in the cell labelled [_"Define your data preprocessing"_](#data-preprocessing) will also be imported into your final submission. 

# Your pricing model 🕵️

In this notebook, you can explore the data, then define and train your pricing model. You can then submit it directly to AIcrowd with some magic code at the end.

# Setup the notebook 🛠

In [1]:
!bash <(curl -sL https://gitlab.aicrowd.com/jyotish/pricing-game-notebook-scripts/raw/master/python/setup.sh)
from aicrowd_helpers import *

⚙️ Installing AIcrowd utilities...
  Running command git clone -q https://gitlab.aicrowd.com/yoogottamk/aicrowd-cli /tmp/pip-req-build-d_2vx725
✅ Installed AIcrowd utilities


# Configure static variables 📎
<a name="static-var"></a>

In order to submit using this notebook, you must visit this URL https://aicrowd.com/participants/me and copy your API key. 

Then set `AICROWD_API_KEY` to that value.
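Hard-coding the key makes it easy to leak when the notebook is shared. A minimal sketch of an alternative, assuming the key has been exported as an environment variable; the helper `read_api_key` and the variable name are hypothetical and not part of the AIcrowd tooling:

```python
import os

def read_api_key(env_var="AICROWD_API_KEY", default=""):
    """Return the API key stored in an environment variable (hypothetical helper)."""
    key = os.environ.get(env_var, default).strip()
    if not key:
        raise ValueError(f"Set the {env_var} environment variable to your AIcrowd API key.")
    return key
```

With this in place, `AICROWD_API_KEY = read_api_key()` inside `Config` could replace the literal string, provided the variable is set before the cell runs.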

In [2]:
import sklearn

class Config:
  TRAINING_DATA_PATH = 'training.csv'
  MODEL_OUTPUT_PATH = 'model.pkl'
  #MODEL_OUTPUT_PATH = 'model'
  AICROWD_API_KEY = '0b70146b9714e6c7f3393b86c15e9c4d'  # You can get the key from https://aicrowd.com/participants/me
  ADDITIONAL_PACKAGES = [
    'numpy',  # you can define versions as well, e.g. numpy==1.19.2
    'pandas',
    'scikit-learn==' + sklearn.__version__,
    'h2o',
    'xgboost'
  ]

In [3]:
!pip install h2o

Collecting h2o
  Downloading https://files.pythonhosted.org/packages/26/c5/d63a8bfdbeb4ebfb709c010af3e061d89a363204c437cb5527431f6de3d2/h2o-3.32.0.2.tar.gz (164.6MB)
Building wheels for collected packages: h2o
  Building wheel for h2o (setup.py) ... done
  Created wheel for h2o: filename=h2o-3.32.0.2-py2.py3-none-any.whl size=164620456 sha256=4035483d082e1584bd1b7b855472de17d5cd94fb5b735d2f759007cd0dcf917a
  Stored in directory: /root/.cache/pip/wheels/42/bd/ea/218fd15724eddf6fa7fc8fab802b6fa592e623d87199679721
Successfully built h2o
Installing collected packages: h2o
Successfully installed h2o-3.32.0.2


# Download dataset files 💾

In [4]:
%download_aicrowd_dataset

💾 Downloading dataset...
Verifying API Key...
API Key valid
Saved API Key successfully!
✅ Downloaded dataset


# Packages 🗃

<a name="packages"></a>

Import here all the packages you need to define your model. **You will need to include all of these packages in `Config.ADDITIONAL_PACKAGES` for your code to run properly once submitted.**

In [None]:
%%track_imports

import numpy as np
import pandas as pd
import pickle
from sklearn.linear_model import LinearRegression 
#import h2o
#h2o.init()
import xgboost as xgb
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.ensemble import RandomForestRegressor
from sklearn.ensemble import AdaBoostRegressor
from sklearn.experimental import enable_hist_gradient_boosting
from sklearn.ensemble import HistGradientBoostingRegressor

In [None]:
import importlib
import global_imports
importlib.reload(global_imports)
from global_imports import *  # do not change this

# Loading the data 📲

In [None]:
df = pd.read_csv(Config.TRAINING_DATA_PATH)
X_train = df.drop(columns=['claim_amount'])
y_train = df['claim_amount']

## What does the data look like? 🔍

In [None]:
X_train.sample(n=4)

Unnamed: 0,id_policy,year,pol_no_claims_discount,pol_coverage,pol_duration,pol_sit_duration,pol_pay_freq,pol_payd,pol_usage,drv_sex1,drv_age1,drv_age_lic1,drv_drv2,drv_sex2,drv_age2,drv_age_lic2,vh_make_model,vh_age,vh_fuel,vh_type,vh_speed,vh_value,vh_weight,population,town_surface_area
163257,PL085089,3.0,0.0,Max,5,3,Yearly,Yes,Retired,F,90.0,64.0,No,0,,,nilvygybpajtnxnr,5.0,Gasoline,Tourism,170.0,11405.0,1197.0,220.0,124.5
14559,PL024390,1.0,0.0,Max,14,3,Biannual,No,Retired,M,68.0,49.0,Yes,F,66.0,33.0,rwtwnvhjqabvovnz,2.0,Diesel,Tourism,172.0,29455.0,1910.0,540.0,42.4
122520,PL052513,3.0,0.0,Max,16,5,Yearly,No,Retired,M,68.0,48.0,Yes,F,60.0,41.0,gsooyxmnwsucrksh,19.0,Gasoline,Tourism,227.0,66150.0,1484.0,300.0,226.2
20126,PL033685,1.0,0.0,Max,4,2,Yearly,No,WorkPrivate,M,63.0,41.0,Yes,F,61.0,40.0,iwhqpdfuhrsxyqxe,10.0,Gasoline,Tourism,150.0,14159.0,1193.0,790.0,79.4


In [None]:
y_train.sample(n=4)

186290    0.0
211811    0.0
124036    0.0
221512    0.0
Name: claim_amount, dtype: float64

# Training the model 🚀

Start by defining the first function, `fit_model`. It takes the training data as arguments and returns a "model" object, which you can define as you wish. For instance, this could be an array of parameter values.
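To make the idea of a "model" object concrete, here is a minimal sketch of the simplest case, where the model is a single number (the mean claim). The function names `fit_mean_model` and `predict_mean_model` are hypothetical, and the claim values are illustrative only:

```python
import numpy as np

def fit_mean_model(y_raw):
    """Baseline 'model': a single number, the mean claim over the training set."""
    return float(np.mean(y_raw))

def predict_mean_model(model, n_contracts):
    """Predict the same expected claim for every contract."""
    return np.full((n_contracts,), model)

y_example = np.array([0.0, 0.0, 300.0, 0.0])  # illustrative values, not competition data
mean_model = fit_mean_model(y_example)
predictions = predict_mean_model(mean_model, 3)
```

Anything richer (a fitted estimator, or a list of estimators) works the same way, as long as the prediction functions know how to use what `fit_model` returned.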

## Define your data preprocessing

<a name="data-preprocessing"></a>

You can add any class or function for preprocessing in the cell below (the one starting with `%%aicrowd_include`). Just make sure that you use the functions defined there in the `fit_model`, `predict_expected_claim` and `predict_premium` functions if necessary.
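One thing shared preprocessing has to guarantee is that training and prediction produce the same feature columns: `pd.get_dummies` only creates columns for the categories present in the data it is given. A minimal sketch, assuming a hypothetical helper `make_features` (not part of the template) and made-up category values:

```python
import pandas as pd

def make_features(X_raw, columns=None):
    """Hypothetical helper: one-hot encode a copy of X_raw and align its columns.

    columns: feature names seen at training time. When given, dummy columns
    missing from X_raw are added as zeros and unseen ones are dropped.
    """
    X = pd.get_dummies(X_raw.copy())
    if columns is not None:
        X = X.reindex(columns=columns, fill_value=0)
    return X

# Made-up rows: the second frame lacks the "Other" coverage category.
train = pd.DataFrame({"pol_coverage": ["Max", "Other", "Max"], "vh_age": [5.0, 2.0, 10.0]})
new = pd.DataFrame({"pol_coverage": ["Max", "Max"], "vh_age": [3.0, 7.0]})

X_fit = make_features(train)
X_new = make_features(new, columns=list(X_fit.columns))
```

To use this pattern, the training column list would have to be stored alongside the fitted estimators so that the prediction functions can pass it back in.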

In [None]:
%%aicrowd_include
# This magical command saves all code in this cell to a utils module.

# include your preprocessing functions and classes here.
import xgboost as xgb

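def vh_value_to_weight(x):
  # Ratio of vehicle value to weight; a zero weight gets a large sentinel value.
  if x['vh_weight'] == 0:
    return 10**6  # note: `10^6` is bitwise XOR in Python (it evaluates to 12)
  else:
    return x['vh_value']/x['vh_weight']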

In [None]:
import importlib
import utils
importlib.reload(utils)
from utils import *  # do not change this

## Define the training logic

In [None]:
def fit_model(X_raw, y_raw):
    """Model training function: given training data (X_raw, y_raw), train this pricing model.

    Parameters
    ----------
    X_raw : Pandas dataframe, with the columns described in the data dictionary.
        Each row is a different contract. This data has not been processed.
    y_raw : a Numpy array, with the value of the claims, in the same order as contracts in X_raw.
        A one dimensional array, with values either 0 (most entries) or >0.

    Returns
    -------
    self: this instance of the fitted model. This can be anything, as long as it is compatible
        with your prediction methods.

    """

    # TODO: train your model here.
    # Don't forget any preprocessing of the raw data here
    
    #linear regression
    from sklearn.linear_model import LinearRegression 

    #X = pd.get_dummies(X_raw[['pol_coverage','drv_age1']])
    X_raw = X_raw.copy()  # work on a copy so the caller's dataframe is not modified
    vh_cols = ['vh_age','vh_speed','vh_value','vh_weight']
    X_raw[vh_cols] = X_raw[vh_cols].fillna(X_raw[vh_cols].mean())  # mean-impute missing vehicle features
    X = pd.get_dummies(X_raw.drop(columns=['id_policy','vh_make_model','drv_age1','drv_age_lic1','drv_age2','drv_age_lic2']))
    X['drv_age_min']=X_raw[['drv_age1','drv_age2']].min(axis=1)
    X['drv_age_lic_min']=X_raw[['drv_age_lic1','drv_age_lic2']].min(axis=1)
    #X['drv_age_max']=X_raw[['drv_age1','drv_age2']].max(axis=1)
    #X['drv_age_lic_max']=X_raw[['drv_age_lic1','drv_age_lic2']].max(axis=1)
    #X['drv_age_diff']=X['drv_age_max'] - X['drv_age_min']
    X['drv_age_diff']=X_raw[['drv_age1','drv_age2']].max(axis=1) - X['drv_age_min']
    #X['drv_age_lic_diff']=X['drv_age_lic_max'] - X['drv_age_lic_min']
    X['one_driver'] = (X_raw['drv_age2'].isnull())*1.0
    #X['vh_value_to_weight'] = X_raw['vh_value']/X_raw['vh_weight']
    
    #function to compute vh_value_to_weight taking into account 0 weight
    #X['vh_value_to_weight'] = X_raw.apply(lambda x: vh_value_to_weight(x), axis = 1)


    reg = LinearRegression().fit(X, y_raw)

    #h2o gbm
    #import h2o
    #from h2o.estimators import H2OGradientBoostingEstimator
    
    #cat_columns = ['pol_coverage','pol_pay_freq','pol_payd','pol_usage','drv_sex1','drv_sex2','vh_fuel','vh_type']
    #col_types = dict(zip(cat_columns, ['enum']*len(cat_columns)))

    #train_data_h2o = h2o.H2OFrame(X_raw.merge(y_raw.to_frame(), left_index=True, right_index=True),column_types = col_types)

    #model_h2o = H2OGradientBoostingEstimator(ntrees = 500, max_depth=4, stopping_metric = 'rmse')
    #predictors = list(X_raw.drop(['id_policy', 'year', 'vh_make_model'], axis=1).columns)

    #model_h2o.train(x=predictors,y='claim_amount',training_frame = train_data_h2o)
    
    import xgboost as xgb
    #X = pd.get_dummies(X_raw.drop(columns=['id_policy','vh_make_model']))

    #model_xgb = xgb.XGBRegressor(
    #    n_estimators=100,
    #    reg_lambda=1,
    #    gamma=0,
    #    max_depth=2
    #    )
    
    #model_xgb.fit(X, y_raw)
    
    #model_gbm = GradientBoostingRegressor(random_state=0, max_depth = 2)
    #model_gbm = GradientBoostingRegressor(random_state=0, n_estimators = 500, subsample = 0.9, max_depth = 4)
    #model_gbm.fit(X, y_raw)

    #model_rf = RandomForestRegressor(random_state=0)
    #model_rf.fit(X, y_raw)

    #model_ada = AdaBoostRegressor(random_state = 0)
    #model_ada.fit(X, y_raw)

    #model_hist = HistGradientBoostingRegressor(random_state = 0)
    #model_hist.fit(X, y_raw)

    both_data = X.merge(y_raw.to_frame(), left_index=True, right_index=True)
    both_data_sev = both_data[both_data['claim_amount']>0]

    both_data_max = both_data[both_data['pol_coverage_Max'] == 1]
    both_data_sev_max = both_data_sev[both_data_sev['pol_coverage_Max'] == 1]

    both_data_notmax = both_data[both_data['pol_coverage_Max'] == 0]
    both_data_sev_notmax = both_data_sev[both_data_sev['pol_coverage_Max'] == 0]

    model_gbm_sev = GradientBoostingRegressor(random_state=0, max_depth = 2, n_estimators = 30)
    #model_gbm = GradientBoostingRegressor(random_state=0, n_estimators = 500, subsample = 0.9, max_depth = 4)
    model_gbm_sev.fit(both_data_sev.drop(['claim_amount'],axis=1), both_data_sev['claim_amount'])

    model_gbm_sev_max = GradientBoostingRegressor(random_state=0, max_depth = 2, n_estimators = 25)
    #model_gbm = GradientBoostingRegressor(random_state=0, n_estimators = 500, subsample = 0.9, max_depth = 4)
    model_gbm_sev_max.fit(both_data_sev_max.drop(['claim_amount'],axis=1), both_data_sev_max['claim_amount'])
    
    model_gbm_sev_notmax = GradientBoostingRegressor(random_state=0, max_depth = 2, n_estimators = 15)
    #model_gbm = GradientBoostingRegressor(random_state=0, n_estimators = 500, subsample = 0.9, max_depth = 4)
    model_gbm_sev_notmax.fit(both_data_sev_notmax.drop(['claim_amount'],axis=1), both_data_sev_notmax['claim_amount'])

    model_gbm_freq = GradientBoostingClassifier(random_state=0, max_depth = 3, n_estimators = 120)
    #model_gbm = GradientBoostingRegressor(random_state=0, n_estimators = 500, subsample = 0.9, max_depth = 4)
    model_gbm_freq.fit(both_data.drop(['claim_amount'],axis=1), (both_data['claim_amount']>0)*1)

    model_gbm_freq_max = GradientBoostingClassifier(random_state=0, max_depth = 3, n_estimators = 100)
    #model_gbm = GradientBoostingRegressor(random_state=0, n_estimators = 500, subsample = 0.9, max_depth = 4)
    model_gbm_freq_max.fit(both_data_max.drop(['claim_amount'],axis=1), (both_data_max['claim_amount']>0)*1)

    model_gbm_freq_notmax = GradientBoostingClassifier(random_state=0, max_depth = 3, n_estimators = 50)
    #model_gbm = GradientBoostingRegressor(random_state=0, n_estimators = 500, subsample = 0.9, max_depth = 4)
    model_gbm_freq_notmax.fit(both_data_notmax.drop(['claim_amount'],axis=1), (both_data_notmax['claim_amount']>0)*1)

    # Frequency/severity pairs, in order: all contracts, Max coverage only, all other coverages.
    return [model_gbm_freq, model_gbm_sev, model_gbm_freq_max, model_gbm_sev_max, model_gbm_freq_notmax, model_gbm_sev_notmax]

## Train your model

In [None]:
trained_model = fit_model(X_train, y_train)

In [None]:
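# Debugging cell that appears to date from a run in which fit_model returned the merged
# dataframe `both_data` at index 2; with the current return value, trained_model[2] is a classifier.
(trained_model[2]).columns
((trained_model[2]['claim_amount']>0)*1.0).sum()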

23292.0

**Important note**: your training code should be able to run in under 10 minutes (since this notebook is re-run entirely on the server side). 

If you run into an issue here we recommend using the *zip file submission* (see the [challenge page](https://www.aicrowd.com/challenges/insurance-pricing-game/#how-to%20submit)). In short, you can simply do this by copy-pasting your `fit_model`, `predict_expected_claim` and `predict_premium` functions to the `model.py` file.

Note that if you want to perform extensive cross-validation/hyper-parameter selection, it is better to do them offline, in a separate notebook.
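A quick way to see how close training is to the 10-minute limit is to time the call. This is a minimal sketch using a hypothetical `timed` helper; the `sum` call only stands in for `fit_model(X_train, y_train)`, and timings on your machine may differ from those on the server:

```python
import time

def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed seconds). Hypothetical helper."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

TIME_LIMIT_SECONDS = 10 * 60  # the 10-minute limit mentioned above

result, elapsed = timed(sum, range(1000))  # stand-in for fit_model(X_train, y_train)
within_limit = elapsed < TIME_LIMIT_SECONDS
```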

## Saving your model

You can save your model to a file here, so you don't need to retrain it every time.

In [None]:
def save_model(model_path):
  with open(model_path, 'wb') as target_file:
      pickle.dump(trained_model, target_file)

#def save_model(model_path):
#  model_path_2 = h2o.save_model(model=trained_model, path=model_path, force=True)
#  return model_path_2

In [None]:
save_model(Config.MODEL_OUTPUT_PATH)
#print(Config.MODEL_OUTPUT_PATH)
#model_path_2 = h2o.save_model(model=trained_model, path=Config.MODEL_OUTPUT_PATH, force=True)
#s = save_model(Config.MODEL_OUTPUT_PATH)
#print(s)

In [None]:
#class Config:
#  TRAINING_DATA_PATH = 'training.csv'
#  #MODEL_OUTPUT_PATH = 'model.pkl'
#  MODEL_OUTPUT_PATH = s
#  AICROWD_API_KEY = '0b70146b9714e6c7f3393b86c15e9c4d'  # You can get the key from https://aicrowd.com/participants/me
#  ADDITIONAL_PACKAGES = [
#    'numpy',  # you can define versions as well, numpy==0.19.2
#    'pandas',
#    'scikit-learn==' + sklearn.__version__,
#    'h2o'
#  ]

If you need to load it from file, you can use this code:

In [None]:
def load_model(model_path):
  with open(model_path, 'rb') as target:
      return pickle.load(target)

#def load_model(model_path):
#  saved_model = h2o.load_model(model_path)
#  return saved_model

In [None]:
trained_model = load_model(Config.MODEL_OUTPUT_PATH)
#trained_model = load_model(s)


In [None]:
#print(s)

/content/model/GBM_model_python_1608744081390_4


# Predicting the claims 💵

The second function, `predict_expected_claim`, takes your trained model and a dataframe of contracts, and outputs a prediction for the (expected) claim incurred by each contract. This expected claim can be seen as the probability of an accident multiplied by the cost of that accident.
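As a worked example of that product, with illustrative numbers that are not taken from the competition data:

```python
import numpy as np

# Illustrative numbers only.
claim_probability = np.array([0.10, 0.05, 0.20])    # P(contract has a claim)
claim_severity = np.array([1000.0, 2000.0, 500.0])  # expected cost given a claim

expected_claim = claim_probability * claim_severity  # element-wise product, one value per contract
```

All three contracts end up with an expected claim of 100, even though their risk profiles differ: a rare but expensive claim and a frequent but cheap one can cost the same on average.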

This is the function used to compute the _RMSE_ leaderboard, where the model best able to predict claims wins.
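For local checks, the standard RMSE formula can be sketched as follows. The helper name `rmse` is hypothetical, and the leaderboard score itself is computed on the server, so treat this only as a rough guide:

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted claims."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Illustrative values: two contracts without a claim, one with a claim of 300.
score = rmse([0.0, 0.0, 300.0], [100.0, 100.0, 100.0])
```

Because most contracts have no claim, a single large claim dominates the squared error, which is why RMSE values on this kind of data tend to be much larger than the average premium.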

In [None]:
def predict_expected_claim(model, X_raw):
    """Model prediction function: predicts the expected claim based on the pricing model.

    This function estimates the expected claim made by a contract (typically, as the product
    of the probability of having a claim multiplied by the expected cost of a claim if it occurs),
    for each contract in the dataset X_raw.

    This is the function used in the RMSE leaderboard, and hence the output should be as close
    as possible to the expected cost of a contract.

    Parameters
    ----------
    model: a Python object that describes your model. This can be anything, as long
        as it is consistent with what `fit_model` outputs.
    X_raw : Pandas dataframe, with the columns described in the data dictionary.
        Each row is a different contract. This data has not been processed.

    Returns
    -------
    avg_claims: a one-dimensional Numpy array of the same length as X_raw, with one
        expected claim per contract (in same order). These expected claims must be POSITIVE (>0).
    """

    # TODO: estimate the expected claim of every contract.
    # Don't forget any preprocessing of the raw data here
    
    #linear regression
    #X = pd.get_dummies(X_raw[['pol_coverage','drv_age1']])
    X_raw = X_raw.copy()  # work on a copy so the caller's dataframe is not modified
    vh_cols = ['vh_age','vh_speed','vh_value','vh_weight']
    X_raw[vh_cols] = X_raw[vh_cols].fillna(X_raw[vh_cols].mean())  # mean-impute missing vehicle features
    X = pd.get_dummies(X_raw.drop(columns=['id_policy','vh_make_model','drv_age1','drv_age_lic1','drv_age2','drv_age_lic2']))
    X['drv_age_min']=X_raw[['drv_age1','drv_age2']].min(axis=1)
    X['drv_age_lic_min']=X_raw[['drv_age_lic1','drv_age_lic2']].min(axis=1)
    #X['drv_age_max']=X_raw[['drv_age1','drv_age2']].max(axis=1)
    #X['drv_age_lic_max']=X_raw[['drv_age_lic1','drv_age_lic2']].max(axis=1)
    #X['drv_age_diff']=X['drv_age_max'] - X['drv_age_min']
    X['drv_age_diff']=X_raw[['drv_age1','drv_age2']].max(axis=1) - X['drv_age_min']
    #X['drv_age_lic_diff']=X['drv_age_lic_max'] - X['drv_age_lic_min']
    X['one_driver'] = (X_raw['drv_age2'].isnull())*1.0
    #X['vh_value_to_weight'] = X_raw['vh_value']/X_raw['vh_weight']
    #X['vh_value_to_weight'] = X_raw.apply(lambda x: vh_value_to_weight(x), axis = 1)

    #h2o
    #cat_columns = ['pol_coverage','pol_pay_freq','pol_payd','pol_usage','drv_sex1','drv_sex2','vh_fuel','vh_type']
    #col_types = dict(zip(cat_columns, ['enum']*len(cat_columns)))

    #X = h2o.H2OFrame(X_raw,column_types = col_types)
    
    #xgb
    import xgboost as xgb
    #X = pd.get_dummies(X_raw.drop(columns=['id_policy','vh_make_model']))




    #return ((model[0]).predict(X) + (model[1]).predict(X) + (model[2]).predict(X))/3 #0.5*((model[0]).predict(X) + (model[1]).predict(X)) #model.predict(X) #np.full( (len(X_raw.index),), model )  # Estimate that each contract will cost 114 (this is the naive mean model). You should change this!
    #return 0.5*((model[0]).predict(X) + (model[1]).predict(X))
    # Expected claim = P(claim) from the frequency classifier * predicted severity.
    return model[0].predict_proba(X)[:, 1] * model[1].predict(X)
    #return X
    #if (X['pol_coverage_Max'] == 1):
    #  return [c[1] for c in (model[2]).predict_proba(X)] * (model[3]).predict(X)
    #else:
    #  return [c[1] for c in (model[4]).predict_proba(X)] * (model[5]).predict(X)
    
    #return (X['pol_coverage_Max'] == 1) * [c[1] for c in (model[2]).predict_proba(X)] * (model[3]).predict(X) + (X['pol_coverage_Max'] == 0)*[c[1] for c in (model[4]).predict_proba(X)] * (model[5]).predict(X)

To test your function, run it on your training data:

In [None]:
X_train[['vh_age','vh_speed','vh_value','vh_weight']] = X_train[['vh_age','vh_speed','vh_value','vh_weight']].fillna(X_train[['vh_age','vh_speed','vh_value','vh_weight']].mean())
X = pd.get_dummies(X_train.drop(columns=['id_policy','vh_make_model','drv_age1','drv_age_lic1','drv_age2','drv_age_lic2']))
X['drv_age_min']=X_train[['drv_age1','drv_age2']].min(axis=1)
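X['drv_age_lic_min']=X_train[['drv_age_lic1','drv_age_lic2']].min(axis=1)
X['drv_age_diff']=X_train[['drv_age1','drv_age2']].max(axis=1) - X['drv_age_min']  # keep the same features, in the same order, as in fit_model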
X['one_driver'] = (X_train['drv_age2'].isnull())*1.0
print(trained_model[0].predict(X).sum())

0.0


In [None]:
predict_expected_claim(trained_model, X_train)

array([126.65886694,  84.65301511, 105.20172993, ..., 117.7890563 ,
        62.90082219, 144.34193765])

# Pricing contracts 💰💰

The third and final function, `predict_premium`, takes your trained model and a dataframe of contracts, and outputs a _price_ for each of these contracts. **You are free to set these prices however you want!** These prices will then be used in competition with other models: each contract chooses the model offering the lowest price, and that model has to pay the cost if an accident occurs.

This is the function used to compute the _profit_ leaderboard: your model will participate in many markets of size 10, populated by other participants' models, and we compute the average profit of your model over all the markets it participated in.
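The lowest-price-wins mechanism can be illustrated with a small simulation. This is a simplified sketch under stated assumptions: two insurers instead of ten, made-up prices and claims, ties going to the first insurer, and a hypothetical helper `market_profits`. The actual competition rules may differ in their details.

```python
import numpy as np

def market_profits(prices, claims):
    """Profit per insurer when each contract goes to the cheapest offer.

    prices: array of shape (n_insurers, n_contracts)
    claims: array of shape (n_contracts,), realised claim amounts
    Ties go to the first insurer (an arbitrary choice for this sketch).
    """
    prices = np.asarray(prices, dtype=float)
    claims = np.asarray(claims, dtype=float)
    winners = prices.argmin(axis=0)               # cheapest insurer for each contract
    contracts = np.arange(prices.shape[1])
    margin = prices[winners, contracts] - claims  # premium received minus claim paid
    return np.bincount(winners, weights=margin, minlength=prices.shape[0])

prices = [[100.0, 120.0, 90.0],   # insurer 0
          [110.0, 100.0, 95.0]]   # insurer 1
claims = [0.0, 500.0, 0.0]
profits = market_profits(prices, claims)
```

Insurer 0 wins the two claim-free contracts and earns 190, while insurer 1 undercuts on the one contract that produces a claim and loses 400. Underpricing a risky contract is therefore worse than not winning it at all.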

In [None]:
def predict_premium(model, X_raw):
    """Model prediction function: predicts premiums based on the pricing model.

    This function outputs the prices that will be offered to the contracts in X_raw.
    The premium will typically depend on the expected claim predicted in
    predict_expected_claim, and will add some pricing strategy on top.

    This is the function used in the average profit leaderboard. Prices output here will
    be used in competition with other models, so feel free to use a pricing strategy.

    Parameters
    ----------
    model: a Python object that describes your model. This can be anything, as long
        as it is consistent with what `fit_model` outputs.
    X_raw : Pandas dataframe, with the columns described in the data dictionary.
        Each row is a different contract. This data has not been processed.

    Returns
    -------
    prices: a one-dimensional Numpy array of the same length as X_raw, with one
        price per contract (in same order). These prices must be POSITIVE (>0).
    """

    # TODO: return a price for everyone.
    # Don't forget any preprocessing of the raw data here

    return predict_expected_claim(model, X_raw)*1.05  # Pure premium plus a flat 5% loading.

To test your function, run it on your training data.

In [None]:
prices = predict_premium(trained_model, X_train)

#### Profit on training data

In order for your model to be considered in the profit competition, it needs to make nonnegative profit over its training set. You can check that your model satisfies this condition below:

In [None]:
print('Premium offered:', prices.mean())
print('Premium offered min:', prices.min())
print('Premium offered max:', prices.max())
print('Income:', prices.sum())
print('Losses:', y_train.sum())

if prices.sum() < y_train.sum():
    print('Your model loses money on the training data! It does not satisfy market rule 1: Non-negative training profit.')
    print('This model will be disqualified from the weekly profit leaderboard, but can be submitted for educational purposes to the RMSE leaderboard.')
else:
    print('Your model passes the non-negative training profit test!')

Premium offered: 119.87324173993659
Premium offered min: 6.168047882032584
Premium offered max: 920.2660005203054
Income: 27356991.73692137
Losses: 26057988.080000006
Your model passes the non-negative training profit test!


# Ready? Submit to AIcrowd 🚀

If you are satisfied with your code, run the code below to send your code to the AICrowd servers for evaluation! This requires the variable `trained_model` to be defined by your previous code.

**Make sure you have included all packages needed to run your code in the [_"Packages"_](#packages) section.**
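One way to double-check the package list before submitting is to build pinned requirement strings from what is installed locally. This is a minimal sketch, assuming Python 3.8 or later (for `importlib.metadata`); the helper `pinned_requirements` is hypothetical and not part of the AIcrowd tooling:

```python
from importlib import metadata

def pinned_requirements(package_names):
    """Return 'name==version' for installed packages, plain 'name' otherwise."""
    requirements = []
    for name in package_names:
        try:
            requirements.append(f"{name}=={metadata.version(name)}")
        except metadata.PackageNotFoundError:
            requirements.append(name)  # not installed here: leave unpinned
    return requirements
```

For example, `pinned_requirements(['numpy', 'pandas', 'scikit-learn'])` would return entries in the same `name==version` form used in `Config.ADDITIONAL_PACKAGES`.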

In [None]:
%aicrowd_submit

🚀 Preparing to submit...
⚙️ Collecting the submission code...
💾 Preparing the submission zip file...
adding: requirements.txt (stored 0%)
adding: config.json (deflated 8%)
adding: predict.py (deflated 52%)
adding: predict_premium.py (deflated 53%)
adding: utils.py (deflated 33%)
adding: model.pkl (deflated 65%)
adding: fit_model.py (deflated 75%)
adding: global_imports.py (deflated 60%)
adding: predict_expected_claim.py (deflated 63%)
adding: load_model.py (deflated 29%)
adding: save_model.py (deflated 33%)
Verifying API Key...
API Key valid
Saved API Key successfully!
╭──────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│                                           Successfully submitted!                                            │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
Important links
┌──────────────────┬─────────────────────────────────────────────────