# Phase 3: Submitting to Kaggle

Kaggle holds the true sale prices for the test set, so the only way to score our test predictions is to upload them to Kaggle.

## Setting up Kaggle

If you haven't set up authentication with Kaggle yet (you can test this by running the cell below), follow these steps:

1. Go to the Account tab of your [Kaggle profile](https://www.kaggle.com/settings/account)
2. Select 'Create New Token' (which will download a file `kaggle.json`)
3. Move `kaggle.json` to the location the Kaggle client reads it from:
    - Unix-like OS (Linux, macOS): `~/.kaggle/kaggle.json`
    - Windows: `C:\Users\<Windows-username>\.kaggle\kaggle.json`
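
Before authenticating, it can help to confirm that the token landed in the right place. The following is a minimal sketch using only the standard library; `find_kaggle_credentials` and `is_private` are hypothetical helpers written for this notebook, not part of the `kaggle` package. It only checks the default location described above, and the permission check assumes a Unix-like system, where the Kaggle client warns if the token is readable by other users.

```python
import os
import stat
from pathlib import Path


def find_kaggle_credentials(home=None):
    """Return the path to kaggle.json in the default location, or None if it is missing.

    Hypothetical helper for this notebook, not part of the kaggle package.
    """
    home = Path(home) if home is not None else Path.home()
    path = home / ".kaggle" / "kaggle.json"
    return path if path.is_file() else None


def is_private(path):
    """True if no group/other permission bits are set (meaningful on Unix-like systems only)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0


credentials = find_kaggle_credentials()
if credentials is None:
    print("kaggle.json not found in the default location; follow the steps above.")
elif os.name == "posix" and not is_private(credentials):
    print(f"{credentials} is accessible to other users; consider running: chmod 600 {credentials}")
else:
    print(f"Found credentials at {credentials}")
```

If the file is reported as missing even though it was downloaded, the most likely cause is that it is still in the browser's download folder.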

In [2]:
from kaggle.api.kaggle_api_extended import KaggleApi

api = KaggleApi()
api.authenticate()

competition = "house-prices-advanced-regression-techniques"



In [3]:
import os
import random

submission_dir = "../submissions"
submission_files = [f for f in os.listdir(submission_dir) if f.endswith('.csv')]
if len(submission_files) == 0:
    raise FileNotFoundError("No submissions exist. Run phase1 and 2 notebooks first.")

# Set this to test a specific file; if left as None, a random one is selected.
submission_file = None

if submission_file is None:
    submission_file = random.choice(submission_files)
submission_filepath = os.path.join(submission_dir, submission_file)
submission_filepath

'../submissions/xgboost_submission.csv'
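
Since a malformed file still uses up one of the day's submissions, a quick sanity check before uploading may be worthwhile. The sketch below is an illustration, not part of the original notebook: `check_submission` is a hypothetical helper, and the expected header `Id,SalePrice` is an assumption based on the competition's sample submission file.

```python
import csv
import math


def check_submission(path, expected_columns=("Id", "SalePrice")):
    """Return a list of problems found in a submission CSV (an empty list means none found).

    Hypothetical helper; the expected column names are an assumption taken from
    the competition's sample submission.
    """
    problems = []
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader, None)
        if header is None:
            return ["file is empty"]
        if tuple(header) != tuple(expected_columns):
            return [f"unexpected header: {header}"]
        # Data rows start on line 2 of the file, after the header.
        for line_number, row in enumerate(reader, start=2):
            if len(row) != len(expected_columns):
                problems.append(f"line {line_number}: expected {len(expected_columns)} fields")
                continue
            try:
                price = float(row[1])
            except ValueError:
                problems.append(f"line {line_number}: prediction is not a number")
                continue
            if not math.isfinite(price) or price <= 0:
                problems.append(f"line {line_number}: prediction must be a positive, finite number")
    return problems
```

Calling `check_submission(submission_filepath)` and stopping if the returned list is non-empty would catch missing or non-numeric predictions before they reach Kaggle.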

In [4]:
from datetime import datetime

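# Use explicit directives; "%D %T" is not supported by strftime on every platform (e.g. Windows).
now = datetime.now().strftime("%Y-%m-%d %H:%M:%S")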
message = f"submission {now}"

response = api.competition_submit(submission_filepath, message, competition)
response

100%|██████████| 21.2k/21.2k [00:00<00:00, 48.4kB/s]


{"message": "Successfully submitted to House Prices - Advanced Regression Techniques", "ref": 47788234}

In [6]:
# Give Kaggle a moment to register the submission before we query for it.
from time import sleep
sleep(3)
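
A fixed `sleep(3)` can be too short when Kaggle is slow and wastefully long when it is fast. One alternative is to poll until the result is available. The following is a minimal sketch, assuming a generic `wait_for` helper (hypothetical, not part of the `kaggle` package) that repeatedly calls a function until it returns something other than `None`.

```python
import time


def wait_for(fetch, timeout=60.0, interval=3.0, sleep=time.sleep, clock=time.monotonic):
    """Call `fetch()` until it returns something other than None, or raise TimeoutError.

    Hypothetical helper. `sleep` and `clock` are parameters so the function can be
    exercised without real waiting.
    """
    deadline = clock() + timeout
    while True:
        result = fetch()
        if result is not None:
            return result
        if clock() >= deadline:
            raise TimeoutError(f"no result after {timeout} seconds")
        sleep(interval)


# In this notebook, `fetch` could look up the new submission by its ref and
# return it only once a score is present (names taken from the cells around this one):
#
# def fetch_submission():
#     matches = [s for s in api.competition_submissions(competition) if s.ref == response.ref]
#     return matches[0] if matches and matches[0].public_score else None
#
# submission = wait_for(fetch_submission)
```

The timeout keeps the cell from hanging indefinitely if the submission never appears.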

In [None]:
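# competition_submissions returns our own submissions, not the public leaderboard.
submissions = api.competition_submissions(competition)

# Find the submission we just made; fail with a clear message if it is not listed yet.
submission = next((s for s in submissions if s.ref == response.ref), None)
if submission is None:
    raise LookupError(f"Submission {response.ref} not found yet; wait a few seconds and rerun this cell.")
other_submissions = [s for s in submissions if s.ref != response.ref]
other_submissions.sort(key=lambda s: s.date, reverse=True)

# The score may still be empty while Kaggle is scoring the submission.
if not submission.public_score:
    raise RuntimeError(f"Submission {response.ref} has no score yet; rerun this cell shortly.")
score = float(submission.public_score)
print(f"submission returned score of {score}")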

print("\nLast 5 submissions:")
for s in other_submissions[:5]:
    print(f"\tSCORE: {s.public_score}")
    print(f"\tref: {s.ref}")
    print(f"\tdate: {s.date}")
    print(f"\tfile name: {s.file_name}")
    print(f"\tsubmitted by {s.submitted_by}\n")

submission returned score of 0.12899

Last 5 submissions:
	SCORE: 0.13113
	ref: 47787518
	date: 2025-10-31 00:14:02
	file name: xgboost_submission.csv
	submitted by nicbolton

	SCORE: 0.12877
	ref: 47784912
	date: 2025-10-30 20:59:10.877000
	file name: xgboost_submission.csv
	submitted by nicbolton

	SCORE: 0.12877
	ref: 47784910
	date: 2025-10-30 20:59:05
	file name: xgboost_submission.csv
	submitted by nicbolton

	SCORE: 0.12877
	ref: 47784906
	date: 2025-10-30 20:58:55
	file name: xgboost_submission.csv
	submitted by nicbolton

	SCORE: 0.12877
	ref: 47784887
	date: 2025-10-30 20:57:35.773000
	file name: xgboost_submission.csv
	submitted by nicbolton
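
For context on the scores above: the competition's evaluation page describes the metric as the root mean squared error between the logarithm of the predicted price and the logarithm of the observed price, so lower is better. The sketch below illustrates that formula with made-up prices. `rmse_of_logs` is a hypothetical helper, and Kaggle's exact implementation is not shown in this notebook.

```python
import math


def rmse_of_logs(actual, predicted):
    """Root mean squared error between log(predicted) and log(actual).

    Hypothetical helper illustrating the metric; all prices must be positive.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [
        (math.log(p) - math.log(a)) ** 2 for a, p in zip(actual, predicted)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up prices, for illustration only.
actual_prices = [100_000, 200_000]
predicted_prices = [110_000, 190_000]
print(rmse_of_logs(actual_prices, predicted_prices))
```

Because the error is taken on logarithms, being off by 10% on a cheap house costs the same as being off by 10% on an expensive one.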

