Machine learning applied to soccer
Sample IPython notebook with soccer predictions
We’ve had a great time giving you our predictions for the World Cup (check out our post before the quarter-finals and the one before the semi-finals). So far, we’ve gotten 13 of 14 games correct. But this shouldn’t be about what we did - it’s about what you can do with Google Cloud Platform. Now, we are open-sourcing our prediction model and packaging it up so you can do your own analysis and predictions.
We have ingested raw touch-by-touch gameplay data from Opta for thousands of soccer matches using Google Cloud Dataflow and polished the raw data into predictive statistics using Google BigQuery. You can see BigQuery engineer Jordan Tigani (+JordanTigani) and developer advocate Felipe Hoffa (@felipehoffa) talk about how we did it in this video from Google I/O.
Project setup, installation, and configuration
How to set up the deployment environment
Pre-work: Get started with the Google Cloud Platform and create a project:
Sign up at https://console.developers.google.com/, create a project, and remember to turn on the Google BigQuery API. Install the Google Cloud SDK following the instructions at https://developers.google.com/cloud/sdk/.
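Once the SDK is installed, it typically needs to be authenticated and pointed at your project before the commands below will work. The following is a minimal sketch of that step, not part of the original instructions: `my-project-id` is a placeholder (an assumption; substitute your own project ID), and the `run` helper and `DRY_RUN` variable are hypothetical names introduced here so that the commands are only printed by default and can be reviewed first.

```shell
#!/bin/sh
# PROJECT_ID is a placeholder (assumption); replace it with your own project ID.
PROJECT_ID="${PROJECT_ID:-my-project-id}"
# DRY_RUN=1 (default) only prints the commands; DRY_RUN=0 executes them.
DRY_RUN="${DRY_RUN:-1}"

# Print a command, and execute it only when DRY_RUN is 0.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

# Authenticate the SDK and select the project to work in.
setup_sdk() {
  run gcloud auth login
  run gcloud config set project "$PROJECT_ID"
}

setup_sdk
```

Running the script with `DRY_RUN=0 PROJECT_ID=<your project>` executes the two `gcloud` commands instead of just printing them.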
How to deploy
Start your instance:
gcloud compute instances create ipy-predict \
  --image https://www.googleapis.com/compute/v1/projects/google-containers/global/images/container-vm-v20140522 \
  --zone=us-central1-a --machine-type n1-standard-1 --scopes storage-ro bigquery
SSH into your new machine:
gcutil ssh --ssh_arg "-L 8888:127.0.0.1:8888" --zone=us-central1-a ipy-predict
Download and run the Docker image we prepared:
sudo docker run -p 8888:8888 fhoffa/ipython-predictions:v1
Wait until Docker downloads and runs the container, then navigate to the notebook:
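Because the SSH command above forwards local port 8888 to port 8888 on the instance, and the container publishes that same port, the notebook is presumably reachable on your own machine at `http://127.0.0.1:8888/`. That address is an assumption derived from those two commands. The sketch below, run on your local machine, prints the assumed URL and defines a small check for whether anything answers there; `notebook_url`, `check_notebook`, and the two variables are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Assumption: the notebook listens on the port forwarded by the SSH tunnel.
NOTEBOOK_HOST="${NOTEBOOK_HOST:-127.0.0.1}"
NOTEBOOK_PORT="${NOTEBOOK_PORT:-8888}"

# Print the local URL implied by the tunnel settings.
notebook_url() {
  printf 'http://%s:%s/\n' "$NOTEBOOK_HOST" "$NOTEBOOK_PORT"
}

# Succeed if an HTTP server answers at that URL within 5 seconds.
check_notebook() {
  curl --silent --fail --max-time 5 --output /dev/null "$(notebook_url)"
}

echo "Notebook URL (assumed): $(notebook_url)"
```

After sourcing the script, `check_notebook && echo reachable` reports whether the tunnel and the container are both up. If it fails, the SSH session with the `-L` option may have closed, or Docker may still be downloading the image.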
- See CONTRIB.md
- See LICENSE