[Datumaro] Image control in converters #1799

Merged
nmanovic merged 8 commits into develop from zm/dm-image-control-in-converters on Jul 13, 2020

@zhiltsov-max zhiltsov-max commented Jun 24, 2020

Motivation and context

  • Added an option to specify the image extension when exporting datasets.
  • Added image copying when exporting datasets, where possible.
  • Updated the Converter class interface: converters should now be invoked with Converter.convert(...). The Converter definition is simplified.
  • Annotation-less files are no longer generated in COCO format unless the target tasks are explicitly requested.
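
The interface change and the copy-when-possible behaviour listed above can be pictured with a minimal sketch. `SketchConverter`, its constructor parameters, the `(item_id, path)` item shape, and the returned status strings are all hypothetical and are not Datumaro's actual API; only the `Converter.convert(...)` entry-point style and the image-extension and save-images options come from this PR.

```python
import os
import shutil


class SketchConverter:
    """Hypothetical converter used for illustration only."""

    def __init__(self, items, save_dir, save_images=False, image_ext=None):
        self._items = items            # iterable of (item_id, source_image_path)
        self._save_dir = save_dir
        self._save_images = save_images
        self._image_ext = image_ext    # e.g. '.png'; None keeps the source extension

    @classmethod
    def convert(cls, items, save_dir, **options):
        # Single entry point: construct the converter and run it in one call.
        return cls(items, save_dir, **options).apply()

    def apply(self):
        os.makedirs(self._save_dir, exist_ok=True)
        results = []
        if not self._save_images:
            return results
        for item_id, src_path in self._items:
            src_ext = os.path.splitext(src_path)[1].lower()
            dst_ext = (self._image_ext or src_ext).lower()
            dst_path = os.path.join(self._save_dir, item_id + dst_ext)
            if src_ext == dst_ext and os.path.isfile(src_path):
                # Same format and the source file is available: copy the bytes as-is.
                shutil.copyfile(src_path, dst_path)
                results.append((dst_path, 'copied'))
            else:
                # A real converter would decode and re-encode here (omitted).
                results.append((dst_path, 'needs re-encoding'))
        return results
```

Under these assumptions, `SketchConverter.convert(items, 'out', save_images=True)` copies every image whose extension already matches the requested one, and only reports the remaining ones as needing re-encoding.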

How has this been tested?

Unit tests.

How to test:

datum project export -f <any format> -- [--image-ext '.png'] --save-images

Checklist

License

  • I submit my code changes under the same MIT License that covers the project.
    Feel free to contact the maintainers if that's a concern.
  • I have updated the license header for each file (see an example below)
# Copyright (C) 2020 Intel Corporation
#
# SPDX-License-Identifier: MIT

@zhiltsov-max zhiltsov-max added this to In progress in Dataset framework (Datumaro) via automation Jun 24, 2020
@zhiltsov-max

@nmanovic, what do you think: should we add an option to create symlinks instead of copying?
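
To make the question concrete, here is a rough sketch of what such an option might look like. The helper name `place_image` and its `use_symlink` flag are hypothetical and do not exist in Datumaro; the sketch falls back to copying when a symlink cannot be created (for example, without the required privileges on Windows).

```python
import os
import shutil


def place_image(src_path, dst_path, use_symlink=False):
    """Hypothetical helper: link or copy an image, returning the method used."""
    os.makedirs(os.path.dirname(dst_path) or '.', exist_ok=True)
    if os.path.lexists(dst_path):
        os.remove(dst_path)  # replace a stale file or dangling link
    if use_symlink:
        try:
            # Use an absolute target so the link stays valid from any working directory.
            os.symlink(os.path.abspath(src_path), dst_path)
            return 'symlink'
        except (OSError, NotImplementedError):
            pass  # symlinks unavailable here: fall through to a plain copy
    shutil.copyfile(src_path, dst_path)
    return 'copy'
```

One trade-off of this design is that a symlinked export is only valid while the source files stay in place, whereas a copied export is self-contained.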

@coveralls

coveralls commented Jun 24, 2020

Pull Request Test Coverage Report for Build 6199

  • 196 of 233 (84.12%) changed or added relevant lines in 23 files are covered.
  • 6 unchanged lines in 4 files lost coverage.
  • Overall coverage increased (+0.02%) to 65.124%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| cvat/apps/dataset_manager/formats/mask.py | 3 | 4 | 75.0% |
| datumaro/datumaro/components/project.py | 1 | 2 | 50.0% |
| datumaro/datumaro/plugins/labelme_format.py | 10 | 11 | 90.91% |
| datumaro/datumaro/plugins/mot_format.py | 7 | 8 | 87.5% |
| datumaro/datumaro/plugins/tf_detection_api_format/converter.py | 21 | 22 | 95.45% |
| datumaro/datumaro/plugins/image_dir.py | 12 | 14 | 85.71% |
| datumaro/datumaro/cli/contexts/project/init.py | 0 | 3 | 0.0% |
| datumaro/datumaro/components/converter.py | 40 | 45 | 88.89% |
| datumaro/datumaro/plugins/coco_format/converter.py | 33 | 40 | 82.5% |
| datumaro/datumaro/plugins/voc_format/converter.py | 33 | 48 | 68.75% |

| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| datumaro/datumaro/components/project.py | 1 | 77.69% |
| datumaro/datumaro/plugins/coco_format/converter.py | 1 | 89.03% |
| datumaro/datumaro/plugins/image_dir.py | 1 | 87.72% |
| datumaro/datumaro/plugins/coco_format/importer.py | 3 | 79.49% |

| Totals | Coverage Status |
|---|---|
| Change from base Build 6185 | 0.02% |
| Covered Lines | 10979 |
| Relevant Lines | 16448 |

💛 - Coveralls

@azhavoro

@zhiltsov-max, could you fix the conflicts?

@zhiltsov-max

zhiltsov-max commented Jun 26, 2020

@azhavoro, done. Is there a way to obtain the truly original images, that is, a path to the raw image file? The problem is that once a JPEG image has been read, it can't be saved back at its original quality, even when saving at 100% quality. PNG images are likely to lose their original compression. Since we can now simply copy images on export, it makes sense to support this in CVAT too. It should also potentially give a significant performance boost for export.

For example, an original (raw) PNG was 43 KB; after export it became 60 KB.
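
The PNG size change can be illustrated without any image library, since PNG stores its pixel data with DEFLATE (the algorithm behind `zlib`). The sketch below is only an analogy under that assumption: the synthetic `pixels` buffer stands in for decoded image data, the compression levels are arbitrary, and PNG's per-row filtering is ignored. It shows that re-compressing decoded data is lossless but need not reproduce the original file size, while a byte-for-byte copy always does.

```python
import zlib

# Stand-in for decoded pixel data: a structured, compressible byte pattern.
pixels = bytes((x * y) % 251 for y in range(256) for x in range(256))

original = zlib.compress(pixels, 9)   # "raw file", written with strong compression
decoded = zlib.decompress(original)   # what an image reader hands back
resaved = zlib.compress(decoded, 1)   # "exported file", re-encoded with a faster setting
copied = bytes(original)              # "exported file", copied byte for byte

# The content survives both routes, but only the copy is guaranteed
# to keep the original size.
print('original:', len(original))
print('re-saved:', len(resaved))
print('copied:  ', len(copied))
```

For JPEG the situation is worse than a size difference, because re-encoding is lossy, which is why a path to the raw file matters.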

azhavoro previously approved these changes Jun 26, 2020
nmanovic previously approved these changes Jul 13, 2020
Dataset framework (Datumaro) automation moved this from In progress to Reviewer approved Jul 13, 2020
@nmanovic

@zhiltsov-max, could you please resolve the conflicts?

@zhiltsov-max zhiltsov-max dismissed stale reviews from nmanovic and azhavoro via 691ffdf July 13, 2020 12:53
Dataset framework (Datumaro) automation moved this from Reviewer approved to Review in progress Jul 13, 2020

Codacy Here is an overview of what got changed by this pull request:

Complexity increasing per file
==============================
- datumaro/datumaro/components/converter.py  2
         

Complexity decreasing per file
==============================
+ datumaro/datumaro/plugins/tf_detection_api_format/extractor.py  -1
         

See the complete overview on Codacy

@nmanovic nmanovic merged commit f807714 into develop Jul 13, 2020
Dataset framework (Datumaro) automation moved this from Review in progress to Done Jul 13, 2020
@nmanovic nmanovic deleted the zm/dm-image-control-in-converters branch July 13, 2020 14:56