# Asteroid Hunter - Classify Images

 
This code classifies images using an already trained model hosted in Google Cloud AutoML Vision (Asteroid Hunter Project).

**This code was written to be run in Google Colab.**


We start with a pip cell. This package is missing from the default Google Colab package set (remember to click on "restart runtime" in the cell's output).

In [None]:
!pip install --upgrade google-cloud-automl

**To set `os.environ` a few cells below, you need a JSON file provided by the project team.** This file must be placed in the same folder as the code. If you are using Google Colab, just click on "upload" and import it; it will be stored directly in the root folder ('/content').
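
Before going further, it can help to confirm that the uploaded key file is where the notebook expects it and that it looks like a service-account key. The following is a minimal sketch, not part of the original notebook: the helper name `check_key_file` is hypothetical, and the list of required fields is an assumption based on the usual layout of Google Cloud service-account keys.

```python
import json
import os

# Assumption: fields normally present in a Google Cloud service-account key.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_key_file(path):
    """Return a list of problems found with the key file (empty if none)."""
    if not os.path.isfile(path):
        return [f"file not found: {path}"]
    try:
        with open(path, "r", encoding="utf-8") as handle:
            data = json.load(handle)
    except json.JSONDecodeError as error:
        return [f"not valid JSON: {error}"]
    if not isinstance(data, dict):
        return ["top-level JSON value is not an object"]
    # Report every expected field that is absent.
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in data]
    # Only check the value of 'type' when the field exists.
    if "type" in data and data["type"] != "service_account":
        problems.append("field 'type' is not 'service_account'")
    return problems
```

Calling `check_key_file` with the path of the uploaded file returns an empty list when nothing looks wrong; otherwise each entry describes one problem.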

In [None]:
from google.cloud import automl
import time
import pandas as pd
import numpy as np
import csv
import json
import os
import subprocess

os.makedirs('/content/images_to_classify/', exist_ok = True)

Now, let's log in to Google Cloud using your Gmail account. The bucket where the images to classify will be stored is public, so any Gmail account should work. Steps:

1) Log in to Gmail in another tab of the same web browser. If you are using Google Colab this step can be skipped, since you are already logged in with a Gmail account.

2) Launch the cell below and click on the link. Accept the Google Cloud conditions.

3) Once you have accepted, a login code will be provided. Copy this code and paste it into the box provided by the cell.

Ignore the warning that appears in the output of the second cell below.

In [None]:
from google.colab import auth
auth.authenticate_user()

In [None]:
project_id = 'hst-asteroid-detection'
!gcloud config set project {project_id}

Now just upload the images you want to classify to the "images_to_classify" folder. If you are using Google Colab and you cannot see this folder yet, click on the "folder refresh" icon in the files menu on the left side of the screen.

To upload the files, click on the "3 dots" icon to the right of the folder name and choose "upload".
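
To confirm that the upload worked before copying anything to Google Cloud, the folder contents can be listed from Python. This is a small sketch under stated assumptions: the helper `list_local_images` is hypothetical, and the set of accepted extensions is an assumption, so adjust it to the image formats you actually use.

```python
from pathlib import Path

# Assumption: file extensions treated as images; adjust as needed.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def list_local_images(folder):
    """Return the sorted names of image files found directly inside folder."""
    root = Path(folder)
    if not root.is_dir():
        return []
    return sorted(
        entry.name
        for entry in root.iterdir()
        if entry.is_file() and entry.suffix.lower() in IMAGE_EXTENSIONS
    )
```

In the notebook, `list_local_images('/content/images_to_classify/')` would return the names of the uploaded images; an empty list means nothing was found there.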

In [None]:
# Copy the files to classify to Google Cloud
copy1 = subprocess.Popen('gsutil -m cp -r /content/images_to_classify/ gs://hst-satellites-public/', shell=True, stdout=subprocess.PIPE)
copy1.wait()

# Create the CSV needed by AutoML (one image URI per line)
images_list = !gsutil ls gs://hst-satellites-public/images_to_classify/
the_list = pd.DataFrame(images_list)
the_list.to_csv('images.csv', header=False, index=False)

# Upload the CSV next to the images
copy2 = subprocess.Popen('gsutil cp images.csv gs://hst-satellites-public/', shell=True, stdout=subprocess.PIPE)
copy2.wait()
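
The CSV built above holds one `gs://` URI per line. As an illustration only, the same file content can be produced in plain Python while skipping entries that are not images (for example a folder placeholder in the listing). The helper `build_batch_csv` and its default extension list are hypothetical and not part of the original notebook.

```python
def build_batch_csv(listing, extensions=(".jpg", ".jpeg", ".png")):
    """Build CSV text with one gs:// image URI per line.

    listing: iterable of strings, e.g. the output lines of `gsutil ls`.
    extensions: tuple of accepted suffixes (an assumption; adjust as needed).
    """
    lines = []
    for entry in listing:
        uri = entry.strip()
        # Keep only Cloud Storage URIs that end with an accepted extension.
        if uri.startswith("gs://") and uri.lower().endswith(extensions):
            lines.append(uri)
    # End with a newline only when there is at least one row.
    return "\n".join(lines) + ("\n" if lines else "")
```

The returned text could be written to `images.csv` in place of the pandas step, for instance with `open('images.csv', 'w').write(build_batch_csv(images_list))`.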


# Batch Classification

Now we launch the image classification.

It will take between 45 minutes (the built-in minimum for AutoML Batch Classification) and several hours, depending on the number of images uploaded.

In [None]:
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "/content/hst-asteroid-detection-ae7516235f0b.json"
os.environ["GOOGLE_CLOUD_PROJECT"] = "hst-asteroid-detection"

# Variables
project_id = "129924192384"
model_id = "IOD5322289388342738944"
input_uri = "gs://hst-satellites-public/images.csv"
output_uri = "gs://hst-satellites-public/output/"

prediction_client = automl.PredictionServiceClient()

starttime = time.time()

# Get the full path of the model.
model_full_id = f"projects/{project_id}/locations/us-central1/models/{model_id}"

# Input: the CSV listing the images. Output: the folder that receives the results.
gcs_source = automl.GcsSource(input_uris=[input_uri])
input_config = automl.BatchPredictInputConfig(gcs_source=gcs_source)
gcs_destination = automl.GcsDestination(output_uri_prefix=output_uri)
output_config = automl.BatchPredictOutputConfig(gcs_destination=gcs_destination)

# Only keep predictions with a score of at least 0.5.
params = {"score_threshold": "0.5"}

response = prediction_client.batch_predict(
    name=model_full_id,
    input_config=input_config,
    output_config=output_config,
    params=params,
)

print("Waiting for operation to complete...")
# response.result() blocks until the batch job has finished.
print(f"Batch Prediction results saved to Cloud Storage bucket. {response.result()}")

print('That took {} seconds'.format(time.time() - starttime))


Once the classification is finished, we import the results from Google Cloud. You will find them in several JSON files inside the "output" folder, visible in the file management tab.

Inside the JSON files you will find (among other data) the label, the score and the corners of the bounding box for each object detected in the uploaded images.

In [None]:
# Download the results from Google Cloud
copy1 = subprocess.Popen('gsutil -m cp -r gs://hst-satellites-public/output/ . ', shell=True, stdout=subprocess.PIPE)
copy1.wait()

# Clean up the bucket, waiting for each removal to finish
for target in ('images.csv', 'output/', 'images_to_classify/'):
    cleanup = subprocess.Popen(f'gsutil rm -r gs://hst-satellites-public/{target}', shell=True, stdout=subprocess.PIPE)
    cleanup.wait()
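
Once the result files are downloaded, the label, score and bounding-box corners mentioned above can be pulled out of each line. The sketch below is illustrative only and is not part of the original notebook. It assumes each line is a JSON object with an `ID` key and an `annotations` list whose entries carry `display_name` and an `image_object_detection` object (with `score` and `bounding_box.normalized_vertices`). These field names are assumptions, so check them against your own output files before relying on them. The helper name `parse_prediction_line` is hypothetical.

```python
import json


def parse_prediction_line(line):
    """Extract label, score and box corners from one line of a result file.

    Assumption: the key names used here ('ID', 'annotations', 'display_name',
    'image_object_detection', 'bounding_box', 'normalized_vertices', 'score')
    match the downloaded files; verify them against your own output.
    """
    record = json.loads(line)
    rows = []
    for annotation in record.get("annotations", []):
        detection = annotation.get("image_object_detection", {})
        vertices = detection.get("bounding_box", {}).get("normalized_vertices", [])
        # A missing coordinate is read as 0.0 (assumes zero values may be omitted).
        corners = [(v.get("x", 0.0), v.get("y", 0.0)) for v in vertices]
        rows.append(
            {
                "image": record.get("ID"),
                "label": annotation.get("display_name"),
                "score": detection.get("score"),
                "corners": corners,
            }
        )
    return rows
```

Applying this function to every line of every file in the "output" folder and passing the combined rows to `pd.DataFrame` would give one table row per detected object.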