# What's Wrong with Submissions?

Soon after the competition started, I created an inference kernel to submit predictions. However, all of my attempts failed. I know there are public inference kernels that work, but I don't want to simply copy-paste someone's pipeline and would prefer to use my own. That's why I created this "smoke test" kernel: I want to understand why my submissions fail on the private test set.

In this kernel, I try to debug the submission issues and find out what is wrong with my submission format.

So far, I have collected the following findings, which I describe in more detail in the next section.

1. ✅ Submission works with all-zeros masks, which encode to empty RLE strings
2. ✅ Submission works with a hard-coded RLE string like "1 10 20 5"
3. ✅ Submission works if only the central part of the mask is filled (the middle half of each dimension)
4. ❌ Submission FAILS if an all-ones mask is submitted
5. ❌ Submission FAILS if an all-ones mask with a thin zero margin is submitted

## Details
From the fourth case, we can guess that for some samples, an all-ones mask leads to failure. The all-ones mask is easy to encode in RLE format: it is a single run that starts at pixel 1 and covers all `h * w` pixels.
```python
img = read_image(filename)
h, w = img.shape
rle_mask = f"1 {h * w}"
```
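As a cross-check, a generic encoder should produce the same string for an all-ones mask. The sketch below is a plain NumPy stand-in, not the `rle_numba_encode` helper used in this kernel. It assumes row-major flattening and 1-based start positions, and the name `rle_encode_sketch` is made up for illustration.

```python
import numpy as np


def rle_encode_sketch(mask: np.ndarray) -> str:
    """Encode a binary mask as 'start length' pairs.

    Assumptions (not confirmed by the competition docs here):
    row-major flattening and 1-based start positions.
    """
    flat = mask.flatten()
    # Pad with zeros so that runs touching either end still produce a transition.
    padded = np.concatenate([[0], flat, [0]])
    # Indices where the value changes; in padded coordinates these are
    # already the 1-based run starts and the 1-based positions just past each run.
    changes = np.where(padded[1:] != padded[:-1])[0] + 1
    starts, ends = changes[::2], changes[1::2]
    return " ".join(f"{s} {e - s}" for s, e in zip(starts, ends))


h, w = 4, 5  # toy shape, for illustration only
print(repr(rle_encode_sketch(np.ones((h, w), dtype=np.uint8))))
print(repr(rle_encode_sketch(np.zeros((h, w), dtype=np.uint8))))
```

For an all-ones mask, the flattening order does not matter, which is why the hand-written string can be trusted regardless of the convention.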
Because the all-ones encoding is known in closed form, no encoding function is needed to produce it, so we can assume the encoded mask is valid. However, the submission fails. If such a simple string fails, the scoring algorithm probably expects a mask of a different shape for some (many?) of the test samples. How could that happen? One possible explanation is that some file names encode the wrong image size. However, the same failure happens if I derive the mask's shape directly from the image, as in the following snippet.
```python
w, h = PIL.Image.open(filename).size  # PIL's .size is (width, height)
```
If so, the scoring system seems to treat (?) some images as smaller than they really are. The predicted mask then doesn't fit: it is too big, and the run of ones in the RLE string extends beyond the expected image limits.
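To make the "doesn't fit" idea concrete, here is a small sketch that checks whether an RLE string stays inside a given shape. The helper names and the two shapes are made up for illustration, and how the real scorer validates masks is unknown to me.

```python
def rle_last_pixel(rle: str) -> int:
    """Return the largest 1-based pixel index the RLE string touches (0 if empty)."""
    numbers = [int(token) for token in rle.split()]
    starts, lengths = numbers[0::2], numbers[1::2]
    return max((s + n - 1 for s, n in zip(starts, lengths)), default=0)


def rle_fits(rle: str, h: int, w: int) -> bool:
    """True if every run ends at or before the last pixel of an h x w mask."""
    return rle_last_pixel(rle) <= h * w


# Hypothetical shapes: the mask is built for (310, 360),
# while the checker is assumed to expect (266, 266).
h, w = 310, 360
all_ones = f"1 {h * w}"
print(rle_fits(all_ones, 310, 360))
print(rle_fits(all_ones, 266, 266))
print(rle_fits("1 10 20 5", 266, 266))
```

Under this assumption, the all-ones string passes for its own shape and overflows the smaller one, while a short string like "1 10 20 5" fits almost any shape.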

To work around this issue, I decided to try different submission formats. One of them starts from an all-zeros mask and fills its _central area_ with ones. This way, the RLE-encoded mask should stay within the expected limits. And this worked! (Please see the code in the kernel.)
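The sketch below rebuilds the central mask the same way the kernel does and reports how far into the flattened image its last run reaches. It assumes row-major flattening, and the 266 x 266 shape and helper names are only for illustration.

```python
import numpy as np


def center_mask(h: int, w: int) -> np.ndarray:
    """All-zeros mask with the middle half of each dimension set to one."""
    mask = np.zeros((h, w), dtype=np.uint8)
    h_center, w_center = h // 2, w // 2
    h_margin, w_margin = h // 4, w // 4
    mask[h_center - h_margin:h_center + h_margin,
         w_center - w_margin:w_center + w_margin] = 1
    return mask


def last_pixel(mask: np.ndarray) -> int:
    """1-based index of the last nonzero pixel in row-major order (0 if none)."""
    indices = np.flatnonzero(mask)
    return int(indices[-1]) + 1 if indices.size else 0


h, w = 266, 266  # illustrative shape
mask = center_mask(h, w)
print(last_pixel(mask), h * w, round(last_pixel(mask) / (h * w), 3))
```

For this shape, the last run ends at about three quarters of the flattened length, so the string leaves some headroom but would still overflow a much smaller expected mask.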

Therefore, my hypothesis is that the submission checking code fails because of some obscure problem with the expected shapes. Maybe I'm wrong, and my own code is faulty. However, I cannot understand why a dummy all-ones submission fails while a partially filled mask works.

Please let me know your thoughts! It is quite upsetting to spend so much time figuring out the submission format instead of doing some real modelling. This is not the first competition with a somewhat challenging submission format, but it keeps happening, so it would be great to make it easier. Thank you!

In [None]:
import logging
from dataclasses import asdict, dataclass, field
from pathlib import Path

import numpy as np
import pandas as pd
import PIL.Image
from fastai.vision.all import *  # get_image_files()
from fast_ai_utils import rle_numba_encode

logging.captureWarnings(True)

In [None]:
@dataclass
class Metadata:
    sample_id: str
    full_path: str
    h: int
    w: int
    
    @classmethod
    def extract(cls, path: Path) -> "Metadata":
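        # The grandparent folder name holds the case and day.
        case_and_day = path.parents[1].stem
        # NOTE: the two size fields are read as (h, w) here. If the file names
        # store width first, they must be swapped for non-square slices.
        _, slice_no, h, w, *_ = path.stem.split("_")
        sample_id = f"{case_and_day}_slice_{int(slice_no):04d}"
        return cls(sample_id, str(path), int(h), int(w))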

In [None]:
DATA_DIR = Path("/kaggle/input/uw-madison-gi-tract-image-segmentation/")

DEBUG = !cat {DATA_DIR}/sample_submission.csv | test $(wc -l) -eq 1 && echo 1

TEST_IDS = pd.read_csv(DATA_DIR/("train.csv" if DEBUG else "sample_submission.csv"))["id"].drop_duplicates().tolist()

TEST_FILES = get_image_files(DATA_DIR/("train" if DEBUG else "test"))

METADATA = {m.sample_id: m for m in TEST_FILES.map(Metadata.extract)}

In [None]:
df_metadata = pd.DataFrame([asdict(m) for m in METADATA.values()])

In [None]:
pd.crosstab(df_metadata["h"], df_metadata["w"])

In [None]:
df_metadata.apply(lambda row: row.h * row.w, axis=1).value_counts().plot.bar();

In [None]:
pd.read_csv(DATA_DIR/("train.csv" if DEBUG else "sample_submission.csv"))["id"]

In [None]:
from enum import IntEnum

class TestMethod(IntEnum):
    ALL_ZEROS = 0
    ALL_ONES = 1
    FIXED = 2
    ALL_ONES_FROM_IMAGE = 3
    ALL_ONES_MARGIN_05 = 4
    ALL_ONES_MARGIN_10 = 5
    CENTER = 6
    
SELECTED_METHOD = TestMethod.CENTER

In [None]:
from fastprogress import progress_bar

preds = []

for test_id in progress_bar(TEST_IDS):

    case = METADATA[test_id]

    if SELECTED_METHOD == TestMethod.ALL_ZEROS:
        # ✅ works!
        mask = np.zeros((case.h, case.w), dtype=np.uint8)
        rle_string = rle_numba_encode(mask)

    elif SELECTED_METHOD == TestMethod.ALL_ONES:
        # ❌ failed
        mask = np.ones((case.h, case.w), dtype=np.uint8)
        rle_string = rle_numba_encode(mask)

    elif SELECTED_METHOD == TestMethod.ALL_ONES_FROM_IMAGE:
        # ❌ failed
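        img = PIL.Image.open(case.full_path)
        # fastai patches PIL images with .shape, which returns (h, w);
        # plain PIL only offers .size, which is (w, h).
        h, w = img.shape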
        rle_string = f"1 {h * w}"

    elif SELECTED_METHOD == TestMethod.ALL_ONES_MARGIN_05:
        # ❌ failed
        mask = np.zeros((case.h, case.w), dtype=np.uint8)
        mask[5:-5, 5:-5] = 1
        rle_string = rle_numba_encode(mask)

    elif SELECTED_METHOD == TestMethod.ALL_ONES_MARGIN_10:
        # ❌ failed
        mask = np.zeros((case.h, case.w), dtype=np.uint8)
        mask[10:-10, 10:-10] = 1
        rle_string = rle_numba_encode(mask)

    elif SELECTED_METHOD == TestMethod.CENTER:
        # ✅ works!
        h, w = case.h, case.w
        h_center, w_center = h // 2, w // 2
        h_margin, w_margin = h // 4, w // 4
        mask = np.zeros((h, w), dtype=np.uint8)
        mask[h_center - h_margin:h_center + h_margin, w_center - w_margin:w_center + w_margin] = 1
        rle_string = rle_numba_encode(mask)

    else:
        # TestMethod.FIXED: hard-coded RLE string
        # ✅ works!
        rle_string = "1 10 20 5"

    for name in ("large_bowel", "small_bowel", "stomach"):
        preds.append({
            "id": test_id,
            "class": name,
            "predicted": rle_string
        })

In [None]:
df_preds = pd.DataFrame(preds)
df_preds.head(10)

In [None]:
df_submit = pd.read_csv(DATA_DIR/"sample_submission.csv")
df_submit = df_submit.drop(columns="predicted").merge(df_preds, on=["id", "class"], how="left")
df_submit.to_csv("submission.csv", index=False)
pd.read_csv("submission.csv").head(10)
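Before submitting, it may also help to rule out problems introduced by the left merge itself, such as rows with no matching prediction. The sketch below is a hypothetical sanity check on toy data; `summarize_submission` and the sample ids are invented for illustration.

```python
import pandas as pd


def summarize_submission(df: pd.DataFrame) -> dict:
    """Count rows, duplicated (id, class) keys, and missing or empty predictions."""
    return {
        "rows": len(df),
        "duplicate_keys": int(df.duplicated(subset=["id", "class"]).sum()),
        # NaN appears when the left merge finds no matching prediction.
        "missing_predictions": int(df["predicted"].isna().sum()),
        # Empty strings are legitimate "no mask" predictions.
        "empty_predictions": int((df["predicted"] == "").sum()),
    }


# Toy template and predictions (made-up ids), merged as in the cell above.
template = pd.DataFrame({
    "id": ["a", "a", "b"],
    "class": ["stomach", "large_bowel", "stomach"],
    "predicted": ["", "", ""],
})
toy_preds = pd.DataFrame({
    "id": ["a", "a"],
    "class": ["stomach", "large_bowel"],
    "predicted": ["1 10 20 5", ""],
})
merged = template.drop(columns="predicted").merge(toy_preds, on=["id", "class"], how="left")
print(summarize_submission(merged))
```

Note that empty strings written by `to_csv` are read back as NaN by `pd.read_csv`, so the check is more informative on the in-memory frame than on the reloaded file.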