
Migrating v1.5 competition to v2


In this example I will be migrating this Iris competition to v2: https://github.com/madclam/m2aic2019

Baseline

I first made sure the bundle I was working with worked on v1.5 by uploading the bundle produced by make_bundle, then making a submission using the sample submission included with the Iris example. It worked.

Then I downloaded the Pisano period v2 competition here: https://github.com/codalab/competition-examples/tree/master/v2/pisano_period

I made sure that one worked in a similar way, using the task + solution provided.

Convert the YAML

Overall

I went through the old YAML and found that many fields are not yet implemented in v2, for example:

  • force_submission_to_leaderboard
  • disallow_leaderboard_modifying
  • etc.

I commented these out for now; we can actually use this competition to test them later!

HTML

The HTML section has a new format. Old:

html:
  data: data.html
  evaluation: evaluation.html
  overview: overview.html
  terms: rules.html
  #notebook: README.html

New:

pages:
  - title: Data
    file: data.html
  - title: Evaluation
    file: evaluation.html
  - title: Overview
    file: overview.html
  - title: Rules
    file: rules.html

Phases

In v2 we leverage "tasks and solutions" instead of putting data directly on phases. Phases keep their main properties like start_date and end_date, although some are named more simply, e.g. start and end.

Old:

phases:
  1:
    phasenumber: 1
    label: Development Phase
    description: 'Development phase: tune your models and submit prediction results, trained model, or untrained model.'
    start_date: 2018-11-15
    is_scoring_only: False    
    execution_time_limit: 500
    max_submissions_per_day: 5   
    force_best_submission_to_leaderboard: True      # Participants will see their best submission on the leaderboard
    starting_kit: starting_kit.zip                  # The starting kit you prepared
    ingestion_program: ingestion_program.zip        # The ingestion program (the same for both phases)
    public_data: input_data.zip                     # Same as input data (available for download by the participants)
    input_data: input_data.zip                      # The data used by the ingestion program (and the code of the participants) in both phases
    scoring_program: scoring_program.zip            # The scoring program (the same for both phases)
    reference_data: reference_data_1.zip            # The truth values (solution) for phase 1 used by the scoring program
    color: green   

New:

tasks:
  - index: 0
    name: Iris Development Phase Task
    input_data: input_data.zip
    scoring_program: scoring_program.zip
    reference_data: reference_data_1.zip

# No solutions are included in this example, but it's possible to add one like so...
# solutions:
#   - index: 0
#     tasks:
#       - 0
#     path: solution.zip

phases:
  - name: Development Phase
    description: 'Development phase: tune your models and submit prediction results, trained model, or untrained model.'
    start: 2018-11-15
    tasks:
      - 0
    # if we had solutions..
    # solutions:
    #   - 0
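
For reference, the v1.5 comments above note the ingestion program is "the same for both phases", so the source competition has a second phase. A sketch of how that second phase could get its own task; reference_data_2.zip, the Final Phase description, and its start date are hypothetical, not taken from the original bundle:

tasks:
  - index: 0
    name: Iris Development Phase Task
    input_data: input_data.zip
    scoring_program: scoring_program.zip
    reference_data: reference_data_1.zip
  - index: 1
    name: Iris Final Phase Task
    input_data: input_data.zip
    scoring_program: scoring_program.zip
    reference_data: reference_data_2.zip   # hypothetical second reference set

phases:
  - name: Development Phase
    description: 'Development phase: tune your models and submit prediction results, trained model, or untrained model.'
    start: 2018-11-15
    tasks:
      - 0
  - name: Final Phase
    description: 'Final phase: submissions are evaluated against the final reference data.'
    start: 2019-02-15   # hypothetical date
    tasks:
      - 1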

Leaderboard

Old:

leaderboard:
    leaderboards:
        Results: &RESULTS
            label: RESULTS
            rank: 1
    columns:
        set1_score:
            leaderboard: *RESULTS
            label: Prediction score
            numeric_format: 4
            rank: 1
        Duration:
            leaderboard: *RESULTS
            label: Duration 
            numeric_format: 2
            rank: 2
            

New:

leaderboards:
  - title: Results
    key: main
    columns:
      - title: Prediction score
        key: set1_score
        index: 0
        sorting: desc
      - title: Duration
        key: Duration
        index: 1
        sorting: desc

Convert the programs

Codalab competitions can be run in 3 styles:

  • Code submission (e.g. Python, R, C#)
  • Result submission (e.g. JSON)
  • Hybrid (submit either variation)

The worker executes the programs in a docker container that the user or organizer can specify, depending on the competition configuration.

How a v1.5 program runs on compute worker

A v1.5 program gives Codalab some information on how to run it: it has a metadata file with a command, in which placeholder strings are replaced with actual file locations, like so:

command: python $program/program.py $input $output

How a v2 program runs on compute worker

v2 programs tell Codalab how they run slightly differently: they use a metadata.yaml with a command (no strings are replaced):

command: python program.py

The largest difference here is that files are put in known places: you have /app/input and /app/output instead of a randomly named temporary folder.

Generating predictions

For your predictions, write to the ./output folder.
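
A minimal sketch of a v2 prediction program following that convention (the output file name is hypothetical; a real program would read the input data and run a model):

# predict.py -- a minimal sketch, not the actual Iris ingestion program.
import os
import sys

# Paths are passed on the command line (as in the metadata.yaml examples),
# falling back to the known v2 locations.
input_dir = sys.argv[1] if len(sys.argv) > 1 else "/app/input"
output_dir = sys.argv[2] if len(sys.argv) > 2 else "/app/output"

os.makedirs(output_dir, exist_ok=True)

# A real program would load data from input_dir and run a model here;
# this just writes a dummy prediction file (the file name is hypothetical).
with open(os.path.join(output_dir, "predictions.txt"), "w") as f:
    f.write("0\n")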

Submitting scores

For your scoring programs, write a scores.json to the ./output folder; its keys map to your leaderboard column keys.
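
For the leaderboard defined earlier, a scores.json keyed by the column keys might look like this (the values are made-up examples):

{
    "set1_score": 0.9667,
    "Duration": 12.34
}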

Iris scoring program conversion

Old:

# metadata
command: python3 $program/score.py $input $output
description: Compute scores for the competition

New, with default paths inserted:

(note: the new style runs with the working directory set to the program folder)

# metadata.yaml  *** NOTE: added .yaml file ending! ***
command: python3 score.py /app/input/ /app/output/

I tweaked the actual score.py program to store scores in a dictionary and dump that. A minimal sketch of the result (the folder layout, file names, and scoring logic are illustrative, not the exact Iris code):
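
# score.py -- a minimal sketch, not the exact Iris scoring code.
import json
import os
import sys
import time

input_dir = sys.argv[1] if len(sys.argv) > 1 else "/app/input"
output_dir = sys.argv[2] if len(sys.argv) > 2 else "/app/output"

start = time.time()

# The res/ref layout and file names here are hypothetical; a real scoring
# program compares the submission's predictions against the reference data.
with open(os.path.join(input_dir, "res", "predictions.txt")) as f:
    predictions = f.read().split()
with open(os.path.join(input_dir, "ref", "truth.txt")) as f:
    truth = f.read().split()

correct = sum(p == t for p, t in zip(predictions, truth))

# Store scores in a dictionary keyed by the leaderboard column keys, then dump it
scores = {
    "set1_score": correct / len(truth),
    "Duration": time.time() - start,
}

os.makedirs(output_dir, exist_ok=True)
with open(os.path.join(output_dir, "scores.json"), "w") as f:
    json.dump(scores, f)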

Known problems!!!

Fix this:

  • Metadata for scoring programs in CODE competitions should pass in elapsedTime among other missing properties. These should be in the input folder along with the submission's prediction results.
  • Allow specifying the Docker image to use for prediction/scoring on the competition itself. It is already passed along to the worker.