

Completion Status

It's 10,000 PNGs of real driving captured from the comma fleet, semantically labeled by the public. It's MIT licensed, with no academic-only restrictions or anything.

Learn more from the blog post, or on the discord in the #comma-pencil channel.



to see them with the mask overlay.


 imgs/   -- The PNG image files
 masks/  -- PNG segmentation masks (update these!)
 imgs2/  -- New PNG image files paired with fisheye PNGs
 masks2/ -- PNG segmentation masks (update these!)
 imgsd/  -- Driver camera PNG image files from Comma3
 masksd/ -- PNG segmentation masks (update these!)
 segs/   -- The outputs in probability from our internal segnet (unreleased, too big)

Categories of internal segnet

 1 - #402020 - road (all parts, anywhere nobody would look at you funny for driving)
 2 - #ff0000 - lane markings (don't include non lane markings like turn arrows and crosswalks)
 3 - #808060 - undrivable
 4 - #00ff66 - movable (vehicles and people/animals)
 5 - #cc00ff - my car (and anything inside it, including wires, mounts, etc. No reflections)
 6 - #00ccff - movable in my car (people inside the car, imgsd only)
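
For anyone processing the masks in code, the category list above can be written as a lookup from exact RGB color to category number. This is a minimal sketch rather than code from this repository: the names `CATEGORIES`, `hex_to_rgb`, `RGB_TO_CLASS`, and `classify_pixels` are made up for illustration, and returning 0 for unknown colors is an assumption of the sketch.

```python
# Category table copied from the list above: number -> (hex color, short name).
CATEGORIES = {
    1: ("#402020", "road"),
    2: ("#ff0000", "lane markings"),
    3: ("#808060", "undrivable"),
    4: ("#00ff66", "movable"),
    5: ("#cc00ff", "my car"),
    6: ("#00ccff", "movable in my car"),
}


def hex_to_rgb(hex_color):
    """Convert '#rrggbb' to an (r, g, b) tuple of ints in 0..255."""
    h = hex_color.lstrip("#")
    return tuple(int(h[i:i + 2], 16) for i in (0, 2, 4))


# Reverse lookup: exact RGB triple -> category number.
RGB_TO_CLASS = {hex_to_rgb(color): num for num, (color, _) in CATEGORIES.items()}


def classify_pixels(pixels):
    """Map an iterable of (r, g, b) tuples to category numbers.

    Colors that are not in the table map to 0 (a convention chosen
    for this sketch, not something the project defines).
    """
    return [RGB_TO_CLASS.get(tuple(p), 0) for p in pixels]
```

For example, `classify_pixels([(64, 32, 32), (204, 0, 255)])` gives `[1, 5]`, since `#402020` is road and `#cc00ff` is my car.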

How can I help?

  1. Visit the Google Spreadsheet (request access to edit it), put your Discord username in the "labeler" column for the mask(s) you'll be working on, and change the status to "In Progress." If you're new, please start with just one mask so we can leave you feedback; this keeps you from having to redo several masks because of the same mistake.

    UPDATE: The original imgs set is complete, but a new imgs2 set was added and is still unfinished. There are "e" and "f" versions for the same image number. Check the "imgs2 series" tab in the spreadsheet to see what's available.

    UPDATE 2: Interior images have been added to the imgsd folder. These are the current priority.

  2. Spend some time studying already merged masks to see how things are labeled. You could use the comma10kviewer web tool to easily do this.

  3. Watch the Beginner Tutorial YouTube video below.

  4. Start labelling! Useful label tools:

    • img-labeler: Only compatible with Chrome and Edge. Other browsers such as Brave, Firefox, and Opera don't work properly, even the Chromium-based ones. It must be used with browser zoom and monitor scaling disabled; otherwise it will save masks at the wrong resolution. Hardware acceleration has also been identified as a possible cause of img-labeler incorrectly saving masks with anti-aliasing. It can be disabled at chrome://settings/system.
      UPDATE: img-labeler has been updated to support the new imgs2 set. For example, to work on image 00074_e, type 74e in the image number box; type 74f for image 00074_f.

    • An external image manipulation tool such as GIMP or Krita (free) or Adobe Photoshop (paid). If you choose to use an external tool, please ensure your color mode is set to 8-bit and that antialiasing doesn't change the colors on the edges of your mask.

  5. Fork this repository to your account using the "Fork" button in the top right.

  6. Create a new branch from the master branch, and use your labelling tool of choice to label some images.

  7. Open a pull request from your new branch to the master branch in the official repository to submit your changes!

  8. Visit the #comma-pencil channel on the Discord for the latest news and chat about the project.
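
The tool notes in step 4 warn about two ways a mask can go wrong: anti-aliasing that introduces colors outside the category list, and saving at the wrong resolution. Both could be checked before opening a pull request with something like the sketch below. It is not a script from this repository; `PALETTE`, `off_palette_colors`, and `same_resolution` are hypothetical names, and the Pillow usage in the trailing comment is one possible way to load the data, with placeholder paths.

```python
from collections import Counter

# Exact RGB values of the six category colors listed in this README.
PALETTE = {
    (0x40, 0x20, 0x20),  # road
    (0xFF, 0x00, 0x00),  # lane markings
    (0x80, 0x80, 0x60),  # undrivable
    (0x00, 0xFF, 0x66),  # movable
    (0xCC, 0x00, 0xFF),  # my car
    (0x00, 0xCC, 0xFF),  # movable in my car (imgsd only)
}


def off_palette_colors(pixels):
    """Count pixels whose color is not an exact palette color.

    `pixels` is any iterable of (r, g, b) tuples. An empty result means
    the mask uses only the category colors; blended edge colors from
    anti-aliasing show up here with their pixel counts.
    """
    counts = Counter(tuple(p) for p in pixels)
    return {color: n for color, n in counts.items() if color not in PALETTE}


def same_resolution(image_size, mask_size):
    """True when the mask has the same (width, height) as its image."""
    return tuple(image_size) == tuple(mask_size)


# With Pillow installed, the inputs could be obtained like this
# (paths are placeholders, not real file names):
#   from PIL import Image
#   mask = Image.open("masks/<name>.png").convert("RGB")
#   bad = off_palette_colors(mask.getdata())
#   ok = same_resolution(Image.open("imgs/<name>.png").size, mask.size)
```

A mask that passes would give an empty dictionary from `off_palette_colors` and `True` from `same_resolution`.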

Image Viewing Tools


  1. comma10kviewer (not available at the moment)
  2. comma10kreviewer

Beginner Tutorial

The Goal



comma10k is still a work in progress. For now, just cite the GitHub link. Once we reach 10k images, we'll release a paper, a train/test split, and a benchmark model.

For now, we are validating on images ending with "9.png" and are seeing a categorical cross-entropy loss of 0.051. Can you beat this?

It has since been beaten by YassineYousfi's "comma10k-baseline", with a CCE loss of 0.045!

Can you beat that?
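
To compare against these numbers, the validation convention above (file names ending in "9.png") and the loss can be sketched as follows. This is only an illustration: the function names are made up, and the exact reduction behind the reported figures is not stated here, so the sketch assumes the natural log averaged over all pixels.

```python
import math


def is_validation(filename):
    """Validation images are those whose file name ends with '9.png'."""
    return filename.endswith("9.png")


def split(filenames):
    """Split file names into (train, validation) lists, keeping order."""
    train = [f for f in filenames if not is_validation(f)]
    val = [f for f in filenames if is_validation(f)]
    return train, val


def categorical_cross_entropy(probs, labels, eps=1e-12):
    """Mean of -log(p[true class]) over all pixels.

    `probs` is a list of per-pixel probability lists; `labels` holds the
    0-based index of the true class for each pixel. `eps` guards
    against log(0). Averaging over pixels is an assumption of this sketch.
    """
    total = 0.0
    for p, y in zip(probs, labels):
        total += -math.log(max(p[y], eps))
    return total / len(labels)
```

A model that assigns probability 1.0 to the correct class of every pixel scores a loss of 0, so lower is better.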


10k crowdsourced images for training segnets






