**Automated machine learning** (**AutoML**) automates the end-to-end process of applying machine learning to real-world problems. AutoML makes machine learning genuinely accessible, even to people with no major expertise in the field.

# Advantages

The advantages of AutoML can be summed up in three major points:

-   **Increases productivity** by automating repetitive tasks. This lets a data scientist focus more on the problem than on the models.
-   Automating the ML pipeline also helps to **avoid errors** that might creep in when the steps are carried out by hand.
-   Ultimately, AutoML is a step towards **democratizing machine learning** by making the power of ML accessible to everybody.

# [H2O AutoML](http://docs.h2o.ai/h2o/latest-stable/h2o-docs/automl.html)
H2O’s AutoML automates the machine learning workflow, including the automatic training and tuning of many models within a user-specified time limit. Two [Stacked Ensembles](http://docs.h2o.ai/h2o/latest-stable/h2o-docs/data-science/stacked-ensembles.html), one based on all previously trained models and another on the best model of each family, are automatically trained on collections of individual models to produce highly predictive ensemble models which, in most cases, are the top-performing models in the AutoML Leaderboard.

## Properties of H2O AutoML

* Basic data pre-processing (as in all H2O algos).

* Trains a random grid of GBMs, DNNs, GLMs, etc. using a carefully chosen hyper-parameter space.

* Individual models are tuned using cross-validation.

* Two Stacked Ensembles are trained (“All Models” ensemble & a lightweight “Best of Family” ensemble).

* Returns a sorted “Leaderboard” of all models. All models can be easily exported to production.
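
The idea behind a stacked ensemble can be sketched in a few lines. The example below is not H2O's implementation: it assumes just two base models whose cross-validated predictions are already available (the numbers are made up), and it uses a plain least-squares fit without an intercept as a stand-in for the metalearner. The function name `fit_metalearner` is hypothetical.

```python
import math


def fit_metalearner(preds_a, preds_b, target):
    """Least-squares weights (no intercept) for combining two base models.

    Solves the 2x2 normal equations for w_a and w_b in
    target ~ w_a * preds_a + w_b * preds_b.
    """
    saa = sum(a * a for a in preds_a)
    sbb = sum(b * b for b in preds_b)
    sab = sum(a * b for a, b in zip(preds_a, preds_b))
    say = sum(a * y for a, y in zip(preds_a, target))
    sby = sum(b * y for b, y in zip(preds_b, target))
    det = saa * sbb - sab * sab
    w_a = (say * sbb - sab * sby) / det
    w_b = (saa * sby - sab * say) / det
    return w_a, w_b


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


# Made-up cross-validated predictions of two hypothetical base models
target = [10.0, 20.0, 30.0, 40.0]
preds_a = [11.0, 19.0, 31.0, 39.0]
preds_b = [9.0, 21.0, 29.0, 41.0]

w_a, w_b = fit_metalearner(preds_a, preds_b, target)
ensemble = [w_a * a + w_b * b for a, b in zip(preds_a, preds_b)]

print("weights:", w_a, w_b)
print("RMSE a / b / ensemble:",
      rmse(target, preds_a), rmse(target, preds_b), rmse(target, ensemble))
```

Because the two toy models err in opposite directions, the fitted weights simply average them and the combined error vanishes. Real base models are rarely this complementary, but the principle is the same: the metalearner learns how much to trust each base model.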


# Objective

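Our job is to predict the sale price of each house from the features that describe it. This is a classical regression problem, and submissions to the Kaggle House Prices competition are scored by the RMSE between the logarithm of the predicted price and the logarithm of the observed price.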
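
For a regression problem, prediction quality is usually summarized by metrics such as RMSE, RMSLE and R², all of which appear in the H2O model output further down. As a rough sketch of what these numbers mean, here they are computed by hand on made-up values. The RMSLE here uses log(1 + value), which is the usual definition but should be treated as an assumption about any particular library.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error: penalizes absolute deviations."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def rmsle(actual, predicted):
    """RMSE on log(1 + value): penalizes relative deviations."""
    return rmse([math.log1p(a) for a in actual],
                [math.log1p(p) for p in predicted])


def r2(actual, predicted):
    """Share of the target's variance explained by the predictions."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Toy values for illustration only, not taken from the dataset
actual = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 300.0]

print("RMSE :", rmse(actual, predicted))
print("RMSLE:", rmsle(actual, predicted))
print("R^2  :", r2(actual, predicted))
```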


In [45]:
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

pal = sns.color_palette()

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

/kaggle/input/house-prices-advanced-regression-techniques/sample_submission.csv
/kaggle/input/house-prices-advanced-regression-techniques/data_description.txt
/kaggle/input/house-prices-advanced-regression-techniques/train.csv
/kaggle/input/house-prices-advanced-regression-techniques/test.csv


# Start H2O
Import the h2o Python module and the H2OAutoML class, then initialize a local H2O cluster.

In [46]:
import h2o
print(h2o.__version__)
from h2o.automl import H2OAutoML

h2o.init(max_mem_size='16G')

3.36.1.3
Checking whether there is an H2O instance running at http://localhost:54321 . connected.


0,1
H2O_cluster_uptime:,1 hour 20 mins
H2O_cluster_timezone:,Etc/UTC
H2O_data_parsing_timezone:,UTC
H2O_cluster_version:,3.36.1.3
H2O_cluster_version_age:,1 month and 11 days
H2O_cluster_name:,H2O_from_python_unknownUser_f3mkfz
H2O_cluster_total_nodes:,1
H2O_cluster_free_memory:,15.80 Gb
H2O_cluster_total_cores:,4
H2O_cluster_allowed_cores:,4


# Load data into H2O

In [47]:
%%time
train = h2o.import_file("../input/house-prices-advanced-regression-techniques/train.csv")
test = h2o.import_file("../input/house-prices-advanced-regression-techniques/test.csv")

Parse progress: |████████████████████████████████████████████████████████████████| (done) 100%
Parse progress: |████████████████████████████████████████████████████████████████| (done) 100%
CPU times: user 51.3 ms, sys: 7.08 ms, total: 58.4 ms
Wall time: 563 ms


Let's take a look at the data.

In [48]:
train.head(5)

Id,MSSubClass,MSZoning,LotFrontage,LotArea,Street,Alley,LotShape,LandContour,Utilities,LotConfig,LandSlope,Neighborhood,Condition1,Condition2,BldgType,HouseStyle,OverallQual,OverallCond,YearBuilt,YearRemodAdd,RoofStyle,RoofMatl,Exterior1st,Exterior2nd,MasVnrType,MasVnrArea,ExterQual,ExterCond,Foundation,BsmtQual,BsmtCond,BsmtExposure,BsmtFinType1,BsmtFinSF1,BsmtFinType2,BsmtFinSF2,BsmtUnfSF,TotalBsmtSF,Heating,HeatingQC,CentralAir,Electrical,1stFlrSF,2ndFlrSF,LowQualFinSF,GrLivArea,BsmtFullBath,BsmtHalfBath,FullBath,HalfBath,BedroomAbvGr,KitchenAbvGr,KitchenQual,TotRmsAbvGrd,Functional,Fireplaces,FireplaceQu,GarageType,GarageYrBlt,GarageFinish,GarageCars,GarageArea,GarageQual,GarageCond,PavedDrive,WoodDeckSF,OpenPorchSF,EnclosedPorch,3SsnPorch,ScreenPorch,PoolArea,PoolQC,Fence,MiscFeature,MiscVal,MoSold,YrSold,SaleType,SaleCondition,SalePrice
1,60,RL,65,8450,Pave,,Reg,Lvl,AllPub,Inside,Gtl,CollgCr,Norm,Norm,1Fam,2Story,7,5,2003,2003,Gable,CompShg,VinylSd,VinylSd,BrkFace,196,Gd,TA,PConc,Gd,TA,No,GLQ,706,Unf,0,150,856,GasA,Ex,Y,SBrkr,856,854,0,1710,1,0,2,1,3,1,Gd,8,Typ,0,,Attchd,2003,RFn,2,548,TA,TA,Y,0,61,0,0,0,0,,,,0,2,2008,WD,Normal,208500
2,20,RL,80,9600,Pave,,Reg,Lvl,AllPub,FR2,Gtl,Veenker,Feedr,Norm,1Fam,1Story,6,8,1976,1976,Gable,CompShg,MetalSd,MetalSd,,0,TA,TA,CBlock,Gd,TA,Gd,ALQ,978,Unf,0,284,1262,GasA,Ex,Y,SBrkr,1262,0,0,1262,0,1,2,0,3,1,TA,6,Typ,1,TA,Attchd,1976,RFn,2,460,TA,TA,Y,298,0,0,0,0,0,,,,0,5,2007,WD,Normal,181500
3,60,RL,68,11250,Pave,,IR1,Lvl,AllPub,Inside,Gtl,CollgCr,Norm,Norm,1Fam,2Story,7,5,2001,2002,Gable,CompShg,VinylSd,VinylSd,BrkFace,162,Gd,TA,PConc,Gd,TA,Mn,GLQ,486,Unf,0,434,920,GasA,Ex,Y,SBrkr,920,866,0,1786,1,0,2,1,3,1,Gd,6,Typ,1,TA,Attchd,2001,RFn,2,608,TA,TA,Y,0,42,0,0,0,0,,,,0,9,2008,WD,Normal,223500
4,70,RL,60,9550,Pave,,IR1,Lvl,AllPub,Corner,Gtl,Crawfor,Norm,Norm,1Fam,2Story,7,5,1915,1970,Gable,CompShg,Wd Sdng,Wd Shng,,0,TA,TA,BrkTil,TA,Gd,No,ALQ,216,Unf,0,540,756,GasA,Gd,Y,SBrkr,961,756,0,1717,1,0,1,0,3,1,Gd,7,Typ,1,Gd,Detchd,1998,Unf,3,642,TA,TA,Y,0,35,272,0,0,0,,,,0,2,2006,WD,Abnorml,140000
5,60,RL,84,14260,Pave,,IR1,Lvl,AllPub,FR2,Gtl,NoRidge,Norm,Norm,1Fam,2Story,8,5,2000,2000,Gable,CompShg,VinylSd,VinylSd,BrkFace,350,Gd,TA,PConc,Gd,TA,Av,GLQ,655,Unf,0,490,1145,GasA,Ex,Y,SBrkr,1145,1053,0,2198,1,0,2,1,4,1,Gd,9,Typ,1,TA,Attchd,2000,RFn,3,836,TA,TA,Y,192,84,0,0,0,0,,,,0,12,2008,WD,Normal,250000




In [49]:
test

Id,MSSubClass,MSZoning,LotFrontage,LotArea,Street,Alley,LotShape,LandContour,Utilities,LotConfig,LandSlope,Neighborhood,Condition1,Condition2,BldgType,HouseStyle,OverallQual,OverallCond,YearBuilt,YearRemodAdd,RoofStyle,RoofMatl,Exterior1st,Exterior2nd,MasVnrType,MasVnrArea,ExterQual,ExterCond,Foundation,BsmtQual,BsmtCond,BsmtExposure,BsmtFinType1,BsmtFinSF1,BsmtFinType2,BsmtFinSF2,BsmtUnfSF,TotalBsmtSF,Heating,HeatingQC,CentralAir,Electrical,1stFlrSF,2ndFlrSF,LowQualFinSF,GrLivArea,BsmtFullBath,BsmtHalfBath,FullBath,HalfBath,BedroomAbvGr,KitchenAbvGr,KitchenQual,TotRmsAbvGrd,Functional,Fireplaces,FireplaceQu,GarageType,GarageYrBlt,GarageFinish,GarageCars,GarageArea,GarageQual,GarageCond,PavedDrive,WoodDeckSF,OpenPorchSF,EnclosedPorch,3SsnPorch,ScreenPorch,PoolArea,PoolQC,Fence,MiscFeature,MiscVal,MoSold,YrSold,SaleType,SaleCondition
1461,20,RH,80.0,11622,Pave,,Reg,Lvl,AllPub,Inside,Gtl,NAmes,Feedr,Norm,1Fam,1Story,5,6,1961,1961,Gable,CompShg,VinylSd,VinylSd,,0,TA,TA,CBlock,TA,TA,No,Rec,468,LwQ,144,270,882,GasA,TA,Y,SBrkr,896,0,0,896,0,0,1,0,2,1,TA,5,Typ,0,,Attchd,1961,Unf,1,730,TA,TA,Y,140,0,0,0,120,0,,MnPrv,,0,6,2010,WD,Normal
1462,20,RL,81.0,14267,Pave,,IR1,Lvl,AllPub,Corner,Gtl,NAmes,Norm,Norm,1Fam,1Story,6,6,1958,1958,Hip,CompShg,Wd Sdng,Wd Sdng,BrkFace,108,TA,TA,CBlock,TA,TA,No,ALQ,923,Unf,0,406,1329,GasA,TA,Y,SBrkr,1329,0,0,1329,0,0,1,1,3,1,Gd,6,Typ,0,,Attchd,1958,Unf,1,312,TA,TA,Y,393,36,0,0,0,0,,,Gar2,12500,6,2010,WD,Normal
1463,60,RL,74.0,13830,Pave,,IR1,Lvl,AllPub,Inside,Gtl,Gilbert,Norm,Norm,1Fam,2Story,5,5,1997,1998,Gable,CompShg,VinylSd,VinylSd,,0,TA,TA,PConc,Gd,TA,No,GLQ,791,Unf,0,137,928,GasA,Gd,Y,SBrkr,928,701,0,1629,0,0,2,1,3,1,TA,6,Typ,1,TA,Attchd,1997,Fin,2,482,TA,TA,Y,212,34,0,0,0,0,,MnPrv,,0,3,2010,WD,Normal
1464,60,RL,78.0,9978,Pave,,IR1,Lvl,AllPub,Inside,Gtl,Gilbert,Norm,Norm,1Fam,2Story,6,6,1998,1998,Gable,CompShg,VinylSd,VinylSd,BrkFace,20,TA,TA,PConc,TA,TA,No,GLQ,602,Unf,0,324,926,GasA,Ex,Y,SBrkr,926,678,0,1604,0,0,2,1,3,1,Gd,7,Typ,1,Gd,Attchd,1998,Fin,2,470,TA,TA,Y,360,36,0,0,0,0,,,,0,6,2010,WD,Normal
1465,120,RL,43.0,5005,Pave,,IR1,HLS,AllPub,Inside,Gtl,StoneBr,Norm,Norm,TwnhsE,1Story,8,5,1992,1992,Gable,CompShg,HdBoard,HdBoard,,0,Gd,TA,PConc,Gd,TA,No,ALQ,263,Unf,0,1017,1280,GasA,Ex,Y,SBrkr,1280,0,0,1280,0,0,2,0,2,1,Gd,5,Typ,0,,Attchd,1992,RFn,2,506,TA,TA,Y,0,82,0,0,144,0,,,,0,1,2010,WD,Normal
1466,60,RL,75.0,10000,Pave,,IR1,Lvl,AllPub,Corner,Gtl,Gilbert,Norm,Norm,1Fam,2Story,6,5,1993,1994,Gable,CompShg,HdBoard,HdBoard,,0,TA,TA,PConc,Gd,TA,No,Unf,0,Unf,0,763,763,GasA,Gd,Y,SBrkr,763,892,0,1655,0,0,2,1,3,1,TA,7,Typ,1,TA,Attchd,1993,Fin,2,440,TA,TA,Y,157,84,0,0,0,0,,,,0,4,2010,WD,Normal
1467,20,RL,,7980,Pave,,IR1,Lvl,AllPub,Inside,Gtl,Gilbert,Norm,Norm,1Fam,1Story,6,7,1992,2007,Gable,CompShg,HdBoard,HdBoard,,0,TA,Gd,PConc,Gd,TA,No,ALQ,935,Unf,0,233,1168,GasA,Ex,Y,SBrkr,1187,0,0,1187,1,0,2,0,3,1,TA,6,Typ,0,,Attchd,1992,Fin,2,420,TA,TA,Y,483,21,0,0,0,0,,GdPrv,Shed,500,3,2010,WD,Normal
1468,60,RL,63.0,8402,Pave,,IR1,Lvl,AllPub,Inside,Gtl,Gilbert,Norm,Norm,1Fam,2Story,6,5,1998,1998,Gable,CompShg,VinylSd,VinylSd,,0,TA,TA,PConc,Gd,TA,No,Unf,0,Unf,0,789,789,GasA,Gd,Y,SBrkr,789,676,0,1465,0,0,2,1,3,1,TA,7,Typ,1,Gd,Attchd,1998,Fin,2,393,TA,TA,Y,0,75,0,0,0,0,,,,0,5,2010,WD,Normal
1469,20,RL,85.0,10176,Pave,,Reg,Lvl,AllPub,Inside,Gtl,Gilbert,Norm,Norm,1Fam,1Story,7,5,1990,1990,Gable,CompShg,HdBoard,HdBoard,,0,TA,TA,PConc,Gd,TA,Gd,GLQ,637,Unf,0,663,1300,GasA,Gd,Y,SBrkr,1341,0,0,1341,1,0,1,1,2,1,Gd,5,Typ,1,Po,Attchd,1990,Unf,2,506,TA,TA,Y,192,0,0,0,0,0,,,,0,2,2010,WD,Normal
1470,20,RL,70.0,8400,Pave,,Reg,Lvl,AllPub,Corner,Gtl,NAmes,Norm,Norm,1Fam,1Story,4,5,1970,1970,Gable,CompShg,Plywood,Plywood,,0,TA,TA,CBlock,TA,TA,No,ALQ,804,Rec,78,0,882,GasA,TA,Y,SBrkr,882,0,0,882,1,0,1,0,2,1,TA,4,Typ,0,,Attchd,1970,Fin,2,525,TA,TA,Y,240,0,0,0,0,0,,MnPrv,,0,4,2010,WD,Normal




In [50]:
print(f'Size of training set: {train.shape[0]} rows and {train.shape[1]} columns')

Size of training set: 1460 rows and 81 columns


Next, let's identify the response column and save the column name as y. In this dataset, we will use all columns except the response as predictors.

In [51]:
test.columns[0]

'Id'

In [52]:
train=train.drop(train.columns[0],axis=1)
test=test.drop(test.columns[0],axis=1)

In [53]:
x = train.columns
y = 'SalePrice'
x.remove(y)

# Run AutoML

Run AutoML, stopping after around 1 hour. The max_runtime_secs argument limits the AutoML run by time. It is not set explicitly below; when neither max_runtime_secs nor max_models is given, H2O AutoML falls back to a limit of one hour. With a time-limited stopping criterion, the number of models trained will vary between runs: on different hardware, or on the same machine with different compute resources available, AutoML may be able to train more models in one run than in another.
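
Why a time limit makes the model count depend on the machine can be seen in a small simulation. This is only a sketch of the general idea, not H2O's scheduler: `train_within_budget`, `FakeClock` and `simulate` are hypothetical names, and training is faked by advancing a simulated clock.

```python
def train_within_budget(candidates, budget_secs, clock):
    """Train candidates in order until the time budget is used up.

    `candidates` is a list of (name, train_fn) pairs and `clock` is a
    callable returning the current time in seconds.
    """
    start = clock()
    trained = []
    for name, train_fn in candidates:
        if clock() - start >= budget_secs:
            break  # budget exhausted, skip the remaining candidates
        train_fn()
        trained.append(name)
    return trained


class FakeClock:
    """Simulated clock so the example runs instantly and deterministically."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now


def simulate(secs_per_model, budget_secs=60, n_candidates=10):
    """Pretend every model takes `secs_per_model` seconds to train."""
    clock = FakeClock()

    def train():
        clock.now += secs_per_model

    candidates = [(f"model_{i}", train) for i in range(n_candidates)]
    return train_within_budget(candidates, budget_secs, clock)


fast_machine = simulate(secs_per_model=10)
slow_machine = simulate(secs_per_model=25)
print(len(fast_machine), "models on the fast machine")
print(len(slow_machine), "models on the slow machine")
```

With the same 60-second budget, the simulated fast machine finishes twice as many models as the slow one, which is why leaderboards from time-limited runs are not directly comparable across machines.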


In [54]:
aml = H2OAutoML(seed = 1, project_name = "j")
aml.train(x = x, y = y, training_frame = train)

AutoML progress: |███████████████████████████████████████████████████████████████| (done) 100%
Model Details
H2OStackedEnsembleEstimator :  Stacked Ensemble
Model Key:  StackedEnsemble_BestOfFamily_6_AutoML_2_20220820_131558

No model summary for this model

ModelMetricsRegressionGLM: stackedensemble
** Reported on train data. **

MSE: 244616949.36983633
RMSE: 15640.234952513863
MAE: 8544.007833099842
RMSLE: 0.07614469021835994
R^2: 0.9612137070936415
Mean Residual Deviance: 244616949.36983633
Null degrees of freedom: 1459
Residual degrees of freedom: 1456
Null deviance: 9207911334609.975
Residual deviance: 357140746079.96106
AIC: 32353.49846361273

ModelMetricsRegressionGLM: stackedensemble
** Reported on cross-validation data. **

MSE: 692839392.1613026
RMSE: 26321.842491765325
MAE: 15040.756750637172
RMSLE: 0.12464460252492175
R^2: 0.8901438691363823
Mean Residual Deviance: 692839392.1613026
Null degrees of freedom: 1459
Residual degrees of freedom: 1455
Null deviance: 9232987065719

Unnamed: 0,Unnamed: 1,mean,sd,cv_1_valid,cv_2_valid,cv_3_valid,cv_4_valid,cv_5_valid
0,mae,14843.71,1781.685,16093.39,13882.24,15189.85,12302.75,16750.32
1,mean_residual_deviance,686472800.0,295079000.0,955450800.0,455252600.0,716425400.0,321494700.0,983740800.0
2,mse,686472800.0,295079000.0,955450800.0,455252600.0,716425400.0,321494700.0,983740800.0
3,null_deviance,1846597000000.0,402049000000.0,2065166000000.0,1654839000000.0,1875663000000.0,1289687000000.0,2347632000000.0
4,r2,0.8946285,0.02734242,0.8638838,0.9210449,0.8852311,0.9256249,0.8773579
5,residual_deviance,200804400000.0,86716450000.0,280902500000.0,130657500000.0,214927600000.0,93233460000.0,284301100000.0
6,rmse,25661.61,5911.278,30910.37,21336.65,26766.12,17930.27,31364.64
7,rmsle,0.1227901,0.02097214,0.1542808,0.1103239,0.1266902,0.09839969,0.1242559




# Leaderboard
Next, we will view the AutoML Leaderboard. Since we did not pass a leaderboard_frame to the H2OAutoML.train() method, the AutoML leaderboard ranks the models by their cross-validation metrics.

A default performance metric for each machine learning task (binary classification, multiclass classification, regression) is specified internally and the leaderboard will be sorted by that metric. In the case of regression, the default ranking metric is mean residual deviance. A different ranking metric can be chosen with the sort_metric argument of H2OAutoML.

In [55]:

lb = aml.leaderboard
lb.head()  

model_id,rmse,mse,mae,rmsle,mean_residual_deviance
StackedEnsemble_BestOfFamily_6_AutoML_2_20220820_131558,26321.8,692839000.0,15040.8,0.124645,692839000.0
StackedEnsemble_BestOfFamily_4_AutoML_2_20220820_131558,26620.9,708674000.0,15329.7,0.12602,708674000.0
GBM_grid_1_AutoML_2_20220820_131558_model_3,26717.0,713801000.0,15893.1,0.130101,713801000.0
GBM_grid_1_AutoML_2_20220820_131558_model_28,26877.0,722372000.0,15702.2,0.130225,722372000.0
GBM_grid_1_AutoML_2_20220820_131558_model_20,26949.5,726274000.0,16131.6,0.133155,726274000.0
GBM_grid_1_AutoML_2_20220820_131558_model_37,26977.5,727788000.0,15954.6,0.131947,727788000.0
StackedEnsemble_AllModels_5_AutoML_2_20220820_131558,27020.3,730094000.0,15215.6,0.125061,730094000.0
GBM_grid_1_AutoML_2_20220820_131558_model_31,27030.1,730624000.0,15487.1,0.130766,730624000.0
GBM_grid_1_AutoML_2_20220820_131558_model_39,27097.7,734285000.0,15828.5,0.130384,734285000.0
GBM_grid_1_AutoML_2_20220820_131558_model_49,27142.2,736697000.0,15832.6,0.131661,736697000.0
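
The choice of ranking metric matters. As a small illustration in plain Python, the snippet below re-ranks a subset of the leaderboard rows shown above (model ids shortened for readability) by different metrics. The helper `rank_by` is a hypothetical name, not part of H2O.

```python
# A subset of the leaderboard rows above, with shortened model ids
leaderboard = [
    {"model_id": "StackedEnsemble_BestOfFamily_6", "rmse": 26321.8, "mae": 15040.8, "rmsle": 0.124645},
    {"model_id": "StackedEnsemble_BestOfFamily_4", "rmse": 26620.9, "mae": 15329.7, "rmsle": 0.12602},
    {"model_id": "GBM_grid_1_model_3", "rmse": 26717.0, "mae": 15893.1, "rmsle": 0.130101},
    {"model_id": "GBM_grid_1_model_28", "rmse": 26877.0, "mae": 15702.2, "rmsle": 0.130225},
    {"model_id": "StackedEnsemble_AllModels_5", "rmse": 27020.3, "mae": 15215.6, "rmsle": 0.125061},
]


def rank_by(rows, metric):
    """Return model ids ordered from best (lowest) to worst on `metric`."""
    return [row["model_id"] for row in sorted(rows, key=lambda row: row[metric])]


by_rmse = rank_by(leaderboard, "rmse")
by_mae = rank_by(leaderboard, "mae")

print("by RMSE:", by_rmse)
print("by MAE :", by_mae)
```

The leader stays the same under both metrics, but the "All Models" ensemble moves from last place by RMSE to second place by MAE among these five rows.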




In [56]:
# The leader model is stored here
aml.leader

Model Details
H2OStackedEnsembleEstimator :  Stacked Ensemble
Model Key:  StackedEnsemble_BestOfFamily_6_AutoML_2_20220820_131558

No model summary for this model

ModelMetricsRegressionGLM: stackedensemble
** Reported on train data. **

MSE: 244616949.36983633
RMSE: 15640.234952513863
MAE: 8544.007833099842
RMSLE: 0.07614469021835994
R^2: 0.9612137070936415
Mean Residual Deviance: 244616949.36983633
Null degrees of freedom: 1459
Residual degrees of freedom: 1456
Null deviance: 9207911334609.975
Residual deviance: 357140746079.96106
AIC: 32353.49846361273

ModelMetricsRegressionGLM: stackedensemble
** Reported on cross-validation data. **

MSE: 692839392.1613026
RMSE: 26321.842491765325
MAE: 15040.756750637172
RMSLE: 0.12464460252492175
R^2: 0.8901438691363823
Mean Residual Deviance: 692839392.1613026
Null degrees of freedom: 1459
Residual degrees of freedom: 1455
Null deviance: 9232987065719.154
Residual deviance: 1011545512555.5017
AIC: 33875.51132546654

Cross-Validation Metrics Summary:

Unnamed: 0,Unnamed: 1,mean,sd,cv_1_valid,cv_2_valid,cv_3_valid,cv_4_valid,cv_5_valid
0,mae,14843.71,1781.685,16093.39,13882.24,15189.85,12302.75,16750.32
1,mean_residual_deviance,686472800.0,295079000.0,955450800.0,455252600.0,716425400.0,321494700.0,983740800.0
2,mse,686472800.0,295079000.0,955450800.0,455252600.0,716425400.0,321494700.0,983740800.0
3,null_deviance,1846597000000.0,402049000000.0,2065166000000.0,1654839000000.0,1875663000000.0,1289687000000.0,2347632000000.0
4,r2,0.8946285,0.02734242,0.8638838,0.9210449,0.8852311,0.9256249,0.8773579
5,residual_deviance,200804400000.0,86716450000.0,280902500000.0,130657500000.0,214927600000.0,93233460000.0,284301100000.0
6,rmse,25661.61,5911.278,30910.37,21336.65,26766.12,17930.27,31364.64
7,rmsle,0.1227901,0.02097214,0.1542808,0.1103239,0.1266902,0.09839969,0.1242559




## Ensemble Exploration
To understand how the ensemble works, let's take a peek inside the Stacked Ensemble "All Models" model. The "All Models" ensemble is an ensemble of all of the individual models in the AutoML run. It is often the top performing model on the leaderboard, although in this run a "Best of Family" ensemble leads.

In [57]:

# Get model ids for all models in the AutoML Leaderboard
model_ids = list(aml.leaderboard['model_id'].as_data_frame().iloc[:,0])
# Get the "All Models" Stacked Ensemble model
se = h2o.get_model([mid for mid in model_ids if "StackedEnsemble_AllModels" in mid][0])
# Get the Stacked Ensemble metalearner model
metalearner = h2o.get_model(se.metalearner()['name'])



Examine the variable importance of the metalearner (combiner) algorithm in the ensemble. This shows how much each base learner contributes to the ensemble, and these contributions can then be plotted.
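
One way to read off those contributions is sketched below. It assumes the metalearner is a GLM that exposes `coef_norm()`, returning a dict of standardized coefficients keyed by term name, as H2O GLM models do. The helper `base_learner_contributions` and the `StubMetalearner` class (with its made-up coefficients) are hypothetical and exist only so the sketch runs without an H2O cluster.

```python
def base_learner_contributions(metalearner):
    """Return (base_model_id, standardized_coefficient) pairs, largest first.

    Assumes `metalearner.coef_norm()` returns a dict mapping term names to
    standardized coefficients; the intercept is not a base learner and is
    dropped.
    """
    coefs = metalearner.coef_norm()
    pairs = [(name, value) for name, value in coefs.items() if name != "Intercept"]
    return sorted(pairs, key=lambda pair: abs(pair[1]), reverse=True)


class StubMetalearner:
    """Hypothetical stand-in for the real metalearner; values are made up."""

    def coef_norm(self):
        return {"Intercept": 180000.0, "GBM_a": 0.6, "GLM_b": 0.0, "DRF_c": 0.3}


contributions = base_learner_contributions(StubMetalearner())
for model_id, coef in contributions:
    print(f"{model_id}: {coef}")
```

In the notebook itself, the same helper could be called on the `metalearner` object obtained in the cell above. H2O GLM models also provide a `std_coef_plot()` method that draws the standardized coefficients as a bar chart.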

# Predicting Using Leader Model

In [58]:
pred = aml.predict(test)
pred.head()

stackedensemble prediction progress: |███████████████████████████████████████████| (done) 100%




predict
121290
155534
185434
193949
188887
175736
167828
169398
186831
122862




In [59]:
pred.shape

(1459, 1)

## Save Leader Model

You can also save and download your model to deploy it to production.

In [60]:
h2o.save_model(aml.leader, path = "./product_backorders_model_bin")

'/kaggle/working/product_backorders_model_bin/StackedEnsemble_BestOfFamily_6_AutoML_2_20220820_131558'

## Submissions

In [61]:
sample_submission = pd.read_csv('../input/house-prices-advanced-regression-techniques/sample_submission.csv')
sample_submission.shape

(1459, 2)

In [62]:
submission=pd.DataFrame({'Id':sample_submission['Id']})
submission

Unnamed: 0,Id
0,1461
1,1462
2,1463
3,1464
4,1465
...,...
1454,2915
1455,2916
1456,2917
1457,2918


In [63]:
submission['SalePrice'] = pred.as_data_frame().values

In [64]:
submission

Unnamed: 0,Id,SalePrice
0,1461,121290.139620
1,1462,155533.528188
2,1463,185434.148054
3,1464,193949.056452
4,1465,188887.021429
...,...,...
1454,2915,82308.358131
1455,2916,78989.646483
1456,2917,161879.520113
1457,2918,115770.121888


In [65]:
submission.to_csv('sub.csv', index=False)