# MaskRCNN for Nuclei Detection on Google Colab (LB=0.382)

This kernel implements the following:

1.   MaskRCNN implementation at https://github.com/killthekitten/kaggle-ds-bowl-2018-baseline#kaggle-ds-bowl-2018 in Jupyter Notebook form.
> - No change in code, only changed modules train.py and inference.py into procedures so they can be run within this notebook and take parameters
> - Using this code made 6 submissions after every 5 epochs going upto 35 epochs (5, 10, 15, 20, 25, 30, 35). Scores increased on LB until 20 epochs, then went down (probably overfitting). Max LB score at 20 epochs = 0.382

2.   Be able to run this notebook in Google Collaboratory (https://colab.research.google.com/) and utilize the free GPU support. (Takes an average 10 minutes to run 1 epoch with GPU)
> - Loads all the competition data from Kaggle into Google Colab (the virtual environment allocated to the user)  
> - Loads the model weights as required by the item 1 above into Google Colab
> - Loads the code from github for item 1 implementation
> - Can make competition submissions to Kaggle site directly from Google colab
> - After every 5 epochs during RCNN training uploads h5 weights file to your Google drive

#### Note: The ability to periodically save the trainined weights (e.g. after every 5 epochs) in Google Drive (or locally) is desirable because: a) Google Colab session expires after 12 hrs and resets(deletes) all imported files and data. The runtime (i.e. program and datastore) is also reset, and b)if your internet connection is interupted the runtime at Google Colab is reset - at least thats what I have experienced.

## Step 1 of 6: Get access to Kaggle site
In order to access Kaggle competition data from Google Colab, Kaggle json authentication is needed which requires the user specific 'kaggle.json' file from Kaggle site to be downloaded to your machine. 

- Go to https://www.kaggle.com/{your_kaggle_user_id}/account under 'API', choose 'Create New API Token'. This will download the 'kaggle.json' file to your machine. Note down the folder in which the file is downloaded (usually the 'Download' folder in Windows). 

- After the 'kaggle.json' file is downloaded to your computer, run the cell below.
- A 'Choose Files' button will appear below the cell. Click it and it will open a files browser window
- Select the 'kaggle.json' file to upload it to Google Colab.

In [0]:
from google.colab import files
import io, os

# The following line 'upload()' method will open a folder browser window for you to 
# select and choose the 'kaggle.json' file on your computer
uploaded = files.upload()

for fn in uploaded.keys():
  print('User uploaded file "{name}" with length {length} bytes'.format(
      name=fn, length=len(uploaded[fn])))

# Make the folder '.kaggle' on the Google Colab VM
filename = "/content/.kaggle/kaggle.json"
os.makedirs(os.path.dirname(filename), exist_ok=True)

# Move the 'kaggle.json' file to the '.kaggle' folder
!mv kaggle.json /content/.kaggle/

# change the permissions of the 'kaggle.json' file to make it secure
!chmod 600 /content/.kaggle/kaggle.json
#!ls -la /content/.kaggle/

!mkdir ~/.kaggle
!cp /content/.kaggle/kaggle.json ~/.kaggle/kaggle.json

# Check if Kaggle site can be accessed
# install the 'kaggle' module. The '-q' is for quite mode.
!pip install -q kaggle

# List all the competitions currently on the Kaggle site.
# Note: All Kaggle API commands can be found at: https://github.com/Kaggle/kaggle-api
!kaggle competitions list

Saving kaggle.json to kaggle.json
User uploaded file "kaggle.json" with length 65 bytes
ref                                                deadline             category            reward  teamCount  userHasEntered  
-------------------------------------------------  -------------------  ---------------  ---------  ---------  --------------  
digit-recognizer                                   2030-01-01 00:00:00  Getting Started  Knowledge       2910            True  
titanic                                            2030-01-01 00:00:00  Getting Started  Knowledge      11217            True  
house-prices-advanced-regression-techniques        2030-01-01 00:00:00  Getting Started  Knowledge       4331            True  
imagenet-object-localization-challenge             2029-12-31 07:00:00  Research         Knowledge         40           False  
competitive-data-science-predict-future-sales      2019-12-31 23:59:00  Playground           Kudos       2851           False  
two-sigma-financ

## Step 2 of 6: Get all the code and data 

#### (the cell below takes about 25 seconds on Google Colab)
The steps to get code and data are taken from here:

https://github.com/killthekitten/kaggle-ds-bowl-2018-baseline#kaggle-ds-bowl-2018

1.   First, you have to download the train masks. Thanks @lopuhin for bringing all the fixes to one place. You might want to do it outside of this repo to be able to pull changes later and use symlinks:
> git clone https://github.com/lopuhin/kaggle-dsbowl-2018-dataset-fixes ../kaggle-dsbowl-2018-dataset-fixes
> ln -s ../kaggle-dsbowl-2018-dataset-fixes/stage1_train stage1_train

2.   Download the rest of the official dataset and unzip it to the repo:
> unzip ~/Downloads/stage1_test.zip -d stage1_test
> unzip ~/Downloads/stage1_train_labels.csv.zip -d .
> unzip ~/Downloads/stage1_sample_submission.csv.zip -d .

3. Install pycocotools and COCO pretrained weights (mask_rcnn_coco.h5). General idea is described here. Keep in mind, to install pycocotools properly, it's better to run make install instead of make.
> https://github.com/matterport/Mask_RCNN/releases/download/v2.0/mask_rcnn_coco.h5

4. Download the repo:
> https://github.com/killthekitten/kaggle-ds-bowl-2018-baseline.git

In [0]:
#### Get all DS2018 data AND RCNN code ####
!git clone https://github.com/lopuhin/kaggle-dsbowl-2018-dataset-fixes ../kaggle-dsbowl-2018-dataset-fixes
!ln -s ../kaggle-dsbowl-2018-dataset-fixes/stage1_train stage1_train
#!ls -l stage1_train

!mkdir Downloads
# kaggle competitions download [-h] [-c COMPETITION] [-f FILE] [-p PATH]
#!kaggle competitions files -c data-science-bowl-2018
!kaggle competitions download -c data-science-bowl-2018 -f stage1_sample_submission.csv.zip -p ~/Downloads/
!kaggle competitions download -c data-science-bowl-2018 -f stage1_test.zip -p ~/Downloads/
!kaggle competitions download -c data-science-bowl-2018 -f stage1_train_labels.csv.zip -p ~/Downloads/

## Unzip the data files
!unzip ~/Downloads/stage1_sample_submission.csv.zip -d .
!unzip ~/Downloads/stage1_test.zip -d stage1_test
!unzip ~/Downloads/stage1_train_labels.csv.zip -d .

### Get COCO weights
!wget https://github.com/matterport/Mask_RCNN/releases/download/v2.0/mask_rcnn_coco.h5

### Get RCNN code and copy all the '.py' files to the working directory
!git clone https://github.com/dhritiwasan14/kaggle-ds-bowl-2018-baseline.git
!cp kaggle-ds-bowl-2018-baseline/*.py .

# Install pycocotools
!pip install git+https://github.com/waleedka/cocoapi.git#egg=pycocotools&subdirectory=PythonAPI
  

Cloning into '../kaggle-dsbowl-2018-dataset-fixes'...
remote: Enumerating objects: 33416, done.[K
remote: Total 33416 (delta 0), reused 0 (delta 0), pack-reused 33416[K
Receiving objects: 100% (33416/33416), 70.88 MiB | 52.82 MiB/s, done.
Resolving deltas: 100% (11703/11703), done.
Checking out files: 100% (29652/29652), done.
Downloading stage1_sample_submission.csv.zip to /root/Downloads
  0% 0.00/2.62k [00:00<?, ?B/s]
100% 2.62k/2.62k [00:00<00:00, 2.24MB/s]
Downloading stage1_test.zip to /root/Downloads
 55% 5.00M/9.10M [00:00<00:00, 45.4MB/s]
100% 9.10M/9.10M [00:00<00:00, 58.2MB/s]
Downloading stage1_train_labels.csv.zip to /root/Downloads
  0% 0.00/2.67M [00:00<?, ?B/s]
100% 2.67M/2.67M [00:00<00:00, 88.6MB/s]
Archive:  /root/Downloads/stage1_sample_submission.csv.zip
  inflating: ./stage1_sample_submission.csv  
Archive:  /root/Downloads/stage1_test.zip
   creating: stage1_test/0114f484a16c152baa2d82fdd43740880a762c93f436c8988ac461c5c9dbe7d5/
   creating: stage1_test/0999dab0

### Step 3 of 6: Train and Inference procs
Implement 'train.py' and 'inference.py' as procedures so we can run them from within the notebook and pass some parameters

In [0]:
!pip install -q tqdm

import os
import sys
import random
import math
import time
import model as modellib
import pandas as pd
import cv2
import numpy as np

from bowl_config import bowl_config
from bowl_dataset import BowlDataset
import utils
import model as modellib
from model import log
from glob import glob
from tqdm import tqdm
from inference_config import inference_config
from bowl_dataset import BowlDataset
from utils import rle_encode, rle_decode, rle_to_string
import functions as f

#######################################################################################
# Module train.py copied here and made into a procedure with two parameters, 
# 'init' and 'ep'. See values of 'init' below. 'ep' is for how many epochs to train with.
########################################################################################
def train(init,ep):

  # Root directory of the project
  ROOT_DIR = os.getcwd()

  # Directory to save logs and trained model
  MODEL_DIR = os.path.join(ROOT_DIR, "logs")

  # Local path to trained weights file
  COCO_MODEL_PATH = os.path.join(ROOT_DIR, "mask_rcnn_coco.h5")
  # Download COCO trained weights from Releases if needed
  if not os.path.exists(COCO_MODEL_PATH):
      utils.download_trained_weights(COCO_MODEL_PATH)

  model = modellib.MaskRCNN(mode="training", config=bowl_config,
                            model_dir=MODEL_DIR)

  # Which weights to start with?
  init_with = init  # imagenet, coco, or last

  if init_with == "imagenet":
      model.load_weights(model.get_imagenet_weights(), by_name=True)
  elif init_with == "coco":
      # Load weights trained on MS COCO, but skip layers that
      # are different due to the different number of classes
      # See README for instructions to download the COCO weights
      model.load_weights(COCO_MODEL_PATH, by_name=True,
                         exclude=["mrcnn_class_logits", "mrcnn_bbox_fc", 
                                  "mrcnn_bbox", "mrcnn_mask"])
  elif init_with == "last":
      # Load the last model you trained and continue training
#       model.load_weights(model.find_last()[1], by_name=True)
        model.load_weights(root_path+'deepretina_final.h5', by_name=True)

  # Training dataset
  dataset_train = BowlDataset()
  dataset_train.load_bowl('stage1_train')
  dataset_train.prepare()

  # # Validation dataset
  dataset_val = BowlDataset()
  dataset_val.load_bowl('stage1_train')
  dataset_val.prepare()

  # Train the head branches
  # Passing layers="heads" freezes all layers except the head
  # layers. You can also pass a regular expression to select
  # which layers to train by name pattern.
  #model.train(dataset_train, dataset_val, 
  #            learning_rate=bowl_config.LEARNING_RATE, 
  #            epochs=1, 
  #            layers='heads')
  model.keras_model.summary()
  model.train(dataset_train, dataset_val, 
              learning_rate=bowl_config.LEARNING_RATE / 10,
              epochs=ep, 
              layers="all")
    
#######################################################################################
# Module inference.py copied here and made into a procedure with 'fn' parameter. 'fn' is 
# the filename of the csv file to write the predictions for submission.
########################################################################################y
def infer(fn):

  ROOT_DIR = os.getcwd()
  MODEL_DIR = os.path.join(ROOT_DIR, "logs")

  #   # Recreate the model in inference mode
  model = modellib.MaskRCNN(mode="inference", 
                              config=inference_config,
                              model_dir=MODEL_DIR)

  #   # Get path to saved weights
  #   # Either set a specific path or find last trained weights
  model_path = os.path.join(root_path, "deepretina_final.h5")
  #   model_path = model.find_last()[1]

  # Load trained weights (fill in path to trained weights here)
  #   assert model_path != "",/ "Provide path to trained weights"
  print("Loading weights from ", model_path)
  model.load_weights(model_path, by_name=True)

  dataset_test = BowlDataset()
  dataset_test.load_bowl('stage2_test_final')
  dataset_test.prepare()

  output = []
  sample_submission = pd.read_csv('stage2_sample_submission_.csv')
  ImageId = []
  EncodedPixels = []
  for image_id in tqdm(sample_submission.ImageId):
      image_path = os.path.join('stage2_test_final', image_id, 'images', image_id + '.png')

      original_image = cv2.imread(image_path)
      results = model.detect([original_image], verbose=0)
      r = results[0]

      masks = r['masks']
      ImageId_batch, EncodedPixels_batch = f.numpy2encoding_no_overlap2(masks, image_id, r['scores'])
      if not ImageId_batch:
        print(ImageId_batch)
      ImageId += ImageId_batch
      EncodedPixels += EncodedPixels_batch

  f.write2csv(fn, ImageId, EncodedPixels)

Using TensorFlow backend.



Configurations:
BACKBONE_SHAPES                [[128 128]
 [ 64  64]
 [ 32  32]
 [ 16  16]
 [  8   8]]
BACKBONE_STRIDES               [4, 8, 16, 32, 64]
BATCH_SIZE                     2
BBOX_STD_DEV                   [0.1 0.1 0.2 0.2]
DETECTION_MAX_INSTANCES        512
DETECTION_MIN_CONFIDENCE       0.7
DETECTION_NMS_THRESHOLD        0.3
GPU_COUNT                      1
IMAGES_PER_GPU                 2
IMAGE_MAX_DIM                  512
IMAGE_MIN_DIM                  512
IMAGE_PADDING                  True
IMAGE_SHAPE                    [512 512   3]
LEARNING_MOMENTUM              0.9
LEARNING_RATE                  0.001
MASK_POOL_SIZE                 14
MASK_SHAPE                     [28, 28]
MAX_GT_INSTANCES               256
MEAN_PIXEL                     [123.7 116.8 103.9]
MINI_MASK_SHAPE                (56, 56)
NAME                           bowl
NUM_CLASSES                    2
POOL_SIZE                      7
POST_NMS_ROIS_INFERENCE        1000
POST_NMS_ROIS_TRAINING         2

In [0]:
ROOT_DIR = os.getcwd()

# Directory to save logs and trained model
MODEL_DIR = os.path.join(ROOT_DIR, "logs")

# Local path to trained weights file
COCO_MODEL_PATH = os.path.join(ROOT_DIR, "mask_rcnn_coco.h5")
# Download COCO trained weights from Releases if needed
if not os.path.exists(COCO_MODEL_PATH):
    utils.download_trained_weights(COCO_MODEL_PATH)

model = modellib.MaskRCNN(mode="training", config=bowl_config,
                          model_dir=MODEL_DIR)

# Which weights to start with?
init_with = init  # imagenet, coco, or last
model.load_weights('kaggle_bowl.h5', by_name=True)

# Training dataset
dataset_train = BowlDataset()
dataset_train.load_bowl('stage1_train')
dataset_train.prepare()

# # Validation dataset
dataset_val = BowlDataset()
dataset_val.load_bowl('stage1_train')
dataset_val.prepare()

# Train the head branches
# Passing layers="heads" freezes all layers except the head
# layers. You can also pass a regular expression to select
# which layers to train by name pattern.
#model.train(dataset_train, dataset_val, 
#            learning_rate=bowl_config.LEARNING_RATE, 
#            epochs=1, 
#            layers='heads')
model.keras_model.summary()
model.train(dataset_train, dataset_val, 
            learning_rate=bowl_config.LEARNING_RATE / 10,
            epochs=10, 
            layers="all")

Instructions for updating:
Colocations handled automatically by placer.


ERROR:root:Internal Python error in the inspect module.
Below is the traceback from this internal error.



Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py", line 2882, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-5-ac47b79cb876>", line 16, in <module>
    init_with = init  # imagenet, coco, or last
NameError: name 'init' is not defined

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py", line 1823, in showtraceback
    stb = value._render_traceback_()
AttributeError: 'NameError' object has no attribute '_render_traceback_'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/IPython/core/ultratb.py", line 1132, in get_records
    return _fixed_getinnerframes(etb, number_of_lines_of_context, tb_offset)
  File "/usr/local/lib/python3.6/dist-packages/IPyth

NameError: ignored

### Step 4 of 6:Procedures to save and retrive weight files to Google Drive.

##### Note: When the following cell is run (specifically 'auth.authenticate_user()') first time for a Google Colab session, it will open a blank box below the cell along with a web link (URL). Click on the URL to allow permission and get a long authentication key. Copy the authentication key and paste it into the blank box and press enter. Only required once for a session.

In [0]:
!pip install -U -q PyDrive
from pydrive.files import GoogleDriveFile
from pydrive.auth import GoogleAuth
from google.colab import auth
from oauth2client.client import GoogleCredentials
import glob

auth.authenticate_user()

# Function to save/load file 'fn' to Google Drive.
def ToGDrive(fn):  
  # 1. Authenticate and create the PyDrive client.
  auth.authenticate_user()
  gauth = GoogleAuth()
  gauth.credentials = GoogleCredentials.get_application_default()
  gdfile=GoogleDriveFile(gauth)

  #load file TO Google Drive
  gdfile.SetContentFile(fn)
  gdfile.Upload()
  if gdfile.uploaded == True:
    f"File {gdfile['title']} uploaded. File ID:{gdfile['id']}"
  return gdfile

# Function to save/load the trained weights file generated by the RCNN to Google Drive
def dl_wts():
  
  directory = 'logs/'
  dn=max([os.path.join(directory,d) for d in os.listdir(directory)], key=os.path.getmtime)
  dn=os.path.join(dn,'mask_rcnn_bowl_*')
  list_of_files=glob.glob(dn)
  latest_file = max(list_of_files, key=os.path.getctime)
  print(latest_file)
  ToGDrive(latest_file)

# Function to save file 'fn' to your computer
def local_download(fn):
  from google.colab import files
  files.download(fn)


[?25l[K    1% |▎                               | 10kB 30.8MB/s eta 0:00:01[K    2% |▋                               | 20kB 2.1MB/s eta 0:00:01[K    3% |█                               | 30kB 3.1MB/s eta 0:00:01[K    4% |█▎                              | 40kB 2.0MB/s eta 0:00:01[K    5% |█▋                              | 51kB 2.5MB/s eta 0:00:01[K    6% |██                              | 61kB 3.0MB/s eta 0:00:01[K    7% |██▎                             | 71kB 3.4MB/s eta 0:00:01[K    8% |██▋                             | 81kB 3.8MB/s eta 0:00:01[K    9% |███                             | 92kB 4.3MB/s eta 0:00:01[K    10% |███▎                            | 102kB 3.3MB/s eta 0:00:01[K    11% |███▋                            | 112kB 3.3MB/s eta 0:00:01[K    12% |████                            | 122kB 4.7MB/s eta 0:00:01[K    13% |████▎                           | 133kB 4.7MB/s eta 0:00:01[K    14% |████▋                           | 143kB 8.9MB/s eta 0:00:01[

### Step 5 of 6: Run the model and make predictions

Train the RCNN using the 'coco' model with upto 20 epochs in a loop:
- Train the model for 5 epoch with init = 'coco'
- Save the weights file on Google Drive after 5 epochs
- Train the model for the next 5 epochs
- Again the save the weights and so on

In [0]:
init='last'  # a parameter for the 'train' procedure to load the model type:'imagenet', 'coco', or 'last,
             # where 'last' means to use the last run model.
total_ep=2  # total number of epochs to run  
step=5       # after every 'step' number of epochs, save the weights to Google Drive
start=1      # start with training up to 5 epochs 

fn='submission.csv'  # name of the submission file to be produced
for x in range(start,total_ep,step):
  train(init,x)               # Train the model for 'step' number of epochs
  init='last'                 # Set this to 'last' so training will start from last epoch
  #fn='submepoch'+str(x)+'.csv'
  dl_wts()                    # Save the training weights file to Google Drive
  infer(fn)                   # make predictions and store in csv file for submission
  #local_download(fn)         # uncomment this if you want the csv file to be downloaded to your computer as well
!kaggle competitions submit -c data-science-bowl-2018 -f submission.csv -m "With 20 epochs COCO weights"

__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_image (InputLayer)        (None, 512, 512, 3)  0                                            
__________________________________________________________________________________________________
zero_padding2d_3 (ZeroPadding2D (None, 518, 518, 3)  0           input_image[0][0]                
__________________________________________________________________________________________________
conv1 (Conv2D)                  (None, 256, 256, 64) 9472        zero_padding2d_3[0][0]           
__________________________________________________________________________________________________
bn_conv1 (BatchNorm)            (None, 256, 256, 64) 256         conv1[0][0]                      
__________________________________________________________________________________________________
activation

  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "


Epoch 1/1




 26/332 [=>............................] - ETA: 16:30 - loss: 1.0145 - rpn_class_loss: 0.0402 - rpn_bbox_loss: 0.3052 - mrcnn_class_loss: 0.1965 - mrcnn_bbox_loss: 0.2002 - mrcnn_mask_loss: 0.2723



 38/332 [==>...........................] - ETA: 12:58 - loss: 0.9547 - rpn_class_loss: 0.0345 - rpn_bbox_loss: 0.2960 - mrcnn_class_loss: 0.1768 - mrcnn_bbox_loss: 0.1868 - mrcnn_mask_loss: 0.2606

In [0]:
from google.colab import drive
drive.mount('/content/gdrive')
root_path = 'gdrive/My Drive/Dhriti\'s Stuff/'

Go to this URL in a browser: https://accounts.google.com/o/oauth2/auth?client_id=947318989803-6bn6qk8qdgf4n4g3pfee6491hc0brc4i.apps.googleusercontent.com&redirect_uri=urn%3Aietf%3Awg%3Aoauth%3A2.0%3Aoob&scope=email%20https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdocs.test%20https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdrive%20https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdrive.photos.readonly%20https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fpeopleapi.readonly&response_type=code

Enter your authorization code:
··········
Mounted at /content/gdrive


In [0]:
import functions as f
fn = 'trial.csv'
from tqdm import tqdm
dataset_test = BowlDataset()
dataset_test.load_bowl('stage2_test_final')
dataset_test.prepare()

output = []
sample_submission = pd.read_csv('stage2_sample_submission_final.csv')
ImageId = []
EncodedPixels = []
tqdm(sample_submission.ImageId)
for image_id in tqdm(sample_submission.ImageId):
    image_path = os.path.join('stage2_test_final', image_id, 'images', image_id + '.png')

    original_image = cv2.imread(image_path)
    results = model.detect([original_image], verbose=0)
    r = results[0]

    masks = r['masks']
    ImageId_batch, EncodedPixels_batch = f.numpy2encoding_no_overlap2(masks, image_id, r['scores'])
    if not ImageId_batch:
      ImageId += [image_id]
      EncodedPixels += ['']
    else:
      ImageId += ImageId_batch
      EncodedPixels += EncodedPixels_batch

f.write2csv(fn, ImageId, EncodedPixels)

  0%|          | 0/3019 [00:00<?, ?it/s]
100%|██████████| 3019/3019 [12:50<00:00,  4.62it/s]


In [0]:
ROOT_DIR = os.getcwd()
MODEL_DIR = os.path.join(ROOT_DIR, "logs")
model = modellib.MaskRCNN(mode="inference", 
                            config=inference_config,
                            model_dir=MODEL_DIR)

# Get path to saved weights
# Either set a specific path or find last trained weights
model_path = os.path.join(ROOT_DIR, "gdrive/My Drive/Dhriti\'s Stuff/deepretina_final.h5")
model.load_weights(model_path, by_name=True)

Instructions for updating:
Colocations handled automatically by placer.
Instructions for updating:
Use tf.cast instead.
Instructions for updating:
Use tf.cast instead.


### Step 6 of 6:Submit predictions to Kaggle directly 

In [0]:
fn = 'submission2.csv'
infer(fn)
!kaggle competitions submit -c data-science-bowl-2018 -f trial.csv -m "With 20 epochs COCO weights"

FileNotFoundError: ignored

In [0]:
!kaggle competitions submit -c data-science-bowl-2018 -f trial.csv -m "With 20 epochs COCO weights"

100% 19.5M/19.5M [00:04<00:00, 4.68MB/s]
Successfully submitted to 2018 Data Science Bowl 

_### Retrieving a trained weights file from Google Drive
After after 5 epochs the weights file will be stored on your Google Drive. For example, after 25 epochs, a file named 'mask_rcnn_bowl_0025.h5' (approx 178MB) will be stored on the Google Drive.

To retrieve this file:
- go to Google Drive and right click on the file
- choose 'Get shareable link'. 
- A link will be provided which looks like: 'https://drive.google.com/open?id=1-LL7F3NWcIiDvxFdjncRFj_z5B7i0Aln'
- Copy the id part e.g '1-LL7F3NWcIiDvxFdjncRFj_z5B7i0Aln'
- Paste in the cell below for 'fileid='
- Run the cell
- It will upload the file in the destination folder. The destination folder is where the RCNN looks for the last weight file.



In [0]:
!kaggle competitions download -c data-science-bowl-2018 -f stage2_test_final.zip -p ~/Downloads/

Downloading stage2_test_final.zip to /root/Downloads
 96% 265M/276M [00:04<00:00, 51.3MB/s]
100% 276M/276M [00:04<00:00, 59.8MB/s]


In [0]:
!kaggle competitions download -c data-science-bowl-2018 -f stage2_sample_submission_final.csv -p ~/Downloads/

Downloading stage2_sample_submission_final.csv.zip to /root/Downloads
  0% 0.00/112k [00:00<?, ?B/s]
100% 112k/112k [00:00<00:00, 35.3MB/s]


In [0]:
!unzip ~/Downloads/stage2_sample_submission_final.csv.zip -d .

Archive:  /root/Downloads/stage2_sample_submission_final.csv.zip
  inflating: ./stage2_sample_submission_final.csv  


In [0]:
!ls
!unzip ~/Downloads/stage2_test_final.zip -d stage2_test_final

bowl_config.py		      rebuild_mosaics.py
bowl_dataset.py		      sample_data
config.py		      stage1_sample_submission.csv
Downloads		      stage1_test
functions.py		      stage1_train
inference_config.py	      stage1_train_labels.csv
inference.py		      stage2_sample_submission_final.csv
kaggle-ds-bowl-2018-baseline  train.py
mask_rcnn_coco.h5	      utils.py
model.py		      visualize.py
parallel_model.py
Archive:  /root/Downloads/stage2_test_final.zip
   creating: stage2_test_final/0019c086029dd3be01f72131edb74e21ee995574e6d5c136ea868630b0d73523/
   creating: stage2_test_final/004a078bb44ee55ee7d6f1c19f96b3a0d3b5037746a3a75197dbb0be06da05cf/
   creating: stage2_test_final/005463e6d4a0a0b21161f1d97392f22556fbddba970d9440ae774229308a91ed/
   creating: stage2_test_final/005af293e8e53218ae96746ecf9bb88b511154d4a0b35e4ec6296b4623e15836/
   creating: stage2_test_final/005d47447abac7f7fa0ac56ba82f13edbf485105baf0672504d84b58d562f38b/
   creating: stage2_test_final/00ac87390253a22f6eb67c5771a7

In [0]:
!pip install googledrivedownloader

## Get weight file
from google_drive_downloader import GoogleDriveDownloader as gdd

fileid='1-LL7F3NWcIiDvxFdjncRFj_z5B7i0Aln'  # example file id. To get the file id see instructions above
destpath='./logs/bowl20180403T0311/'        # this should work irrespective of the date

os.makedirs(os.path.dirname(destpath), exist_ok=True)

gdd.download_file_from_google_drive(file_id=fileid,
                                    dest_path=destpath,
                                    unzip=True)


In [0]:
import os
import numpy as np
import cv2
import pandas as pd
import matplotlib.pyplot as plt
import scipy.misc
import random

df = pd.read_csv('0.csv')

data = []
labels = []

yes = 0
no = 0

for idx, row in df.loc[:30].iterrows():
    o = row['image_id']
    if o[0] == '.':
        continue
        
    d = 'stage1_train/' + o
    
    img = cv2.imread(d + '/images/' + o + '.png')
    #img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    
    #print(img.shape)
    padding = (PATCH_SIZE // 2, PATCH_SIZE // 2)
    if CHANNELS == 3:
        img_pad = np.pad(img, (padding, padding, (0, 0)), 'constant', constant_values=(0, 0))
    else:
        img_pad = np.pad(img, (padding, padding), 'constant', constant_values=(0, 0))
    #print(img_pad.shape)
    mask = cv2.imread(d + '/sparse_mask.png', 0)
    mask = cv2.threshold(mask, 1, 255, cv2.THRESH_BINARY)[1]

In [0]:
import keras


In [0]:
!wget https://drive.google.com/file/d/19kVton20JL9u0CpwGssD7EbBvsWcq1ty

--2019-04-24 05:59:24--  https://drive.google.com/file/d/19kVton20JL9u0CpwGssD7EbBvsWcq1ty/download
Resolving drive.google.com (drive.google.com)... 108.177.126.101, 108.177.126.139, 108.177.126.138, ...
Connecting to drive.google.com (drive.google.com)|108.177.126.101|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2019-04-24 05:59:24 ERROR 404: Not Found.

