Matching CVAT task/job with dumped data #3337

Open
2 tasks done
nstolyarov opened this issue Jun 16, 2021 · 13 comments

@nstolyarov

My actions before raising this issue

Is there any way to match dumped data (e.g. an ImageNet or Semantic Segmentation dump) with the same data in CVAT by task -> job -> id?

Where can I find a record of the image names in a specific task/job? Perhaps in the DB or some table?

Context

After some automated operations on annotations (e.g. computing metrics for 1,000+ images), I want to know where in CVAT I should fix a wrong annotation.

Your Environment

  • Git hash commit (git log -1):
commit be9e00fa7aeba432690901d54509760eb9ebfba4 (HEAD -> develop, origin/develop, origin/HEAD)
Author: Dmitry Kruchinin <33020454+dvkruchinin@users.noreply.github.com>
Date:   Wed Apr 21 14:47:58 2021 +0300

    Update cypress test. Canvas 3D functionality. Basic actions. (#3112)
    
    * Rename, add css class
    
    * Update cypress test.

  • Docker version docker version (e.g. Docker 17.0.05): 20.10.05
  • Are you using Docker Swarm or Kubernetes? Nope
  • Operating System and version (e.g. Linux, Windows, MacOS): GNU/Linux Ubuntu 4.15.0-140-generic
@zhiltsov-max
Contributor

In most formats, the dumped files have the same names as they had in the source CVAT task. Which formats did you export the task to? You can find image names in the annotation window of the CVAT task.

CVAT only allows navigating to a specific frame index. If you want to get the image index in CVAT, you can do one of the following:
a) get it from the dumped data, if the output format includes such information
b) sort image paths / names lexicographically and get the index
c) export in CVAT for images / CVAT for video / Datumaro and find "frame index" there
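
Option (b) can be sketched in a few lines of Python. This is only an illustration: it assumes the task's frames follow the lexicographic order of the image paths, which may not hold for every task, and the file names are made up.

```python
def frame_index_by_name(image_paths):
    """Map each image path to its position in lexicographic order."""
    return {path: index for index, path in enumerate(sorted(image_paths))}


# Hypothetical file names, deliberately listed out of order.
paths = ["b/img_10.jpg", "a/img_2.jpg", "a/img_10.jpg"]
index = frame_index_by_name(paths)

# Lexicographic order compares characters, so "img_10" sorts before "img_2".
for path in sorted(paths):
    print(index[path], path)
```

Note that plain string sorting differs from "natural" numeric sorting, so it is worth checking a few frames in the UI before relying on the computed indices.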

Please describe what you want to do more precisely, so I can help you better.

@bsekachev, adding navigation by image names could be useful.

@bsekachev
Member

You can see the current image name in CVAT near frame navigation elements.

Screenshot from 2021-06-17 22-42-45

@bsekachev
Member

bsekachev commented Jun 17, 2021

Speaking of the database, you can see the mapping there if you have access to it.
But be very careful when working directly with the database.

docker exec -it cvat_db /bin/bash
/usr/local/bin/createuser -s postgres # if you get error that postgres role does not exist.
psql cvat --user postgres

For the task with ID 9:

SELECT engine_image.frame, engine_image.path from engine_task INNER JOIN engine_data on engine_task.data_id=engine_data.id INNER JOIN engine_image on engine_image.data_id = engine_data.id where engine_task.id=9;

Screenshot from 2021-06-17 23-03-55
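
To try the shape of this join without touching a live instance, here is a small Python sketch that runs the same query against an in-memory SQLite stand-in. The tables contain only the columns the query uses and every row is invented; a real CVAT database is PostgreSQL and has many more columns. The helper name frames_for_task is hypothetical.

```python
import sqlite3

# In-memory stand-in for the CVAT database: only the columns the query
# touches are modelled, and all rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE engine_data  (id INTEGER PRIMARY KEY);
    CREATE TABLE engine_task  (id INTEGER PRIMARY KEY, data_id INTEGER);
    CREATE TABLE engine_image (id INTEGER PRIMARY KEY, data_id INTEGER,
                               frame INTEGER, path TEXT);
    INSERT INTO engine_data  VALUES (4), (5);
    INSERT INTO engine_task  VALUES (9, 4), (10, 5);
    INSERT INTO engine_image VALUES
        (1, 4, 0, 'cats/001.jpg'),
        (2, 4, 1, 'cats/002.jpg'),
        (3, 5, 0, 'dogs/001.jpg');
""")


def frames_for_task(connection, task_id):
    """Return (frame, path) pairs for one task, ordered by frame number."""
    query = """
        SELECT engine_image.frame, engine_image.path
        FROM engine_task
        INNER JOIN engine_data  ON engine_task.data_id = engine_data.id
        INNER JOIN engine_image ON engine_image.data_id = engine_data.id
        WHERE engine_task.id = ?
        ORDER BY engine_image.frame
    """
    return connection.execute(query, (task_id,)).fetchall()


rows = frames_for_task(conn, 9)
print(rows)
```

In the made-up data, task 9 points at data row 4, which shows why the join goes through engine_task.data_id rather than comparing the task ID with engine_image.data_id directly.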

@bsekachev
Member

From the UI point of view, I would suggest adding a feature to search a frame number by its name. Would it be a convenient solution for users in your opinion?

@bsekachev bsekachev added enhancement New feature or request customer High priority feature request labels Jun 17, 2021
@nstolyarov
Author

Hi @zhiltsov-max and @bsekachev. Thank you for your answers.

I will try to give a clear example.

Suppose I have the full path to a file, like "FULL/PATH/image.jpg". I may even know the name or ID of the task it belongs to (but maybe not). How can I find the job ID and frame ID for this image in CVAT?

It would be useful if I had info like the following:

TASK_ID  JOB_ID  FRAME_ID  IMG_PATH
111      26      874       full/path/to/image.jpg
103      13      234       full/path/to/another_image.jpg

Is there a possibility to get it from CVAT?

I need this when I run some operations on the annotations (e.g. using Semantic mask 1.1) and then have to fix the annotation of a specific image.

UPDATE

I've tried the following command in cvat_db:

SELECT * from engine_task INNER JOIN engine_data on engine_task.data_id=engine_data.id INNER JOIN engine_image on engine_image.data_id = engine_data.id;

Am I right that

  • stop frame is the last frame in the task?
  • frame is the frame id for this task?
  • id is image id for the whole CVAT?

@bsekachev
Member

@nstolyarov

stop frame is the last frame in the task?

Not exactly. stop_frame is the last frame in a job. The number of frames in a task is engine_task.size.

frame is the frame id for this task?

I would say it is a frame number for this task.

id is image id for the whole CVAT?

I am not sure I understand you. engine_image.id is a primary key in the database, so, it is unique for the CVAT instance.

Generally speaking, a frame can be included in two jobs (if overlap is enabled).
You can see a range of frames for a specific job on the task page:
Screenshot from 2021-06-18 11-56-05

@nstolyarov
Author

@bsekachev

Not exactly. stop_frame is the last frame in a job. The number of frames in a task is engine_task.size.

It is strange because in the task with 35 jobs I have segment_size=20, stop_frame=680 and size=681 for every data in the table.

Nevertheless, it seems that this is really what I need.

Thank you very much for your help.

@bsekachev
Member

It is strange because in the task with 35 jobs I have segment_size=20, stop_frame=680 and size=681 for every data in the table.

Sounds really strange. This is a piece of the table engine_segment:
Screenshot from 2021-06-18 12-49-16

You can see here that the start_frame and stop_frame fields differ between rows with the same task_id.
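
As a rough illustration of how per-job frame ranges relate to the task size, the Python sketch below splits a task into inclusive (start_frame, stop_frame) ranges. It is a simplified assumption about the splitting rule, not CVAT's actual code, and the function name segment_ranges is hypothetical. The overlap value for the task discussed above is not stated in the thread, so 0 is assumed.

```python
def segment_ranges(size, segment_size, overlap=0):
    """Split frames 0..size-1 into inclusive (start_frame, stop_frame) ranges.

    Simplified assumption: consecutive ranges start every
    (segment_size - overlap) frames and the last one is cut at size - 1.
    """
    if size <= 0 or segment_size <= 0:
        raise ValueError("size and segment_size must be positive")
    if not 0 <= overlap < segment_size:
        raise ValueError("overlap must be >= 0 and smaller than segment_size")
    step = segment_size - overlap
    ranges = []
    for start in range(0, size, step):
        stop = min(start + segment_size - 1, size - 1)
        ranges.append((start, stop))
    return ranges


# Numbers reported earlier in the thread: size=681, segment_size=20.
reported = segment_ranges(681, 20)
print(len(reported), reported[0], reported[-1])

# With overlap enabled, a boundary frame belongs to two ranges.
print(segment_ranges(10, 5, overlap=1))
```

Under these assumptions the reported numbers give 35 ranges, which is consistent with the 35 jobs mentioned above, and only the last range ends at frame 680.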

@mikeyEcology

mikeyEcology commented Jun 24, 2021

I would find it useful if, when exporting annotations, I could get a list of each image name (the file path) together with its image number ((2) in the example image I tried to upload). Then, if someone annotating images finds some with issues, she can record the number of each problem image and I can exclude it from my dataset.
So for this example, I'd have a table with a row that has:
bristlecone2.PNG, 2
Is this available? It sounds like this is what @nstolyarov is asking, but I'm not sure.

image

@MattWittbrodt

@mikeyEcology this is in the engine_image table.

select path, frame from engine_image will give you that. Just keep in mind that the frame number is repeated across tasks. For example, Task A will have a frame 2 and Task B will also have a frame 2.

@avengersassemble

Try this:

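create view task_job_frame
as
-- One row per (job, frame); a frame can appear in two jobs when overlap is enabled.
SELECT distinct ep.id as project_id, s.task_id as task_id, j.id AS job_id, s.frame_id, ei.path
FROM engine_job j
INNER JOIN (
SELECT id, task_id, generate_series(start_frame, stop_frame) AS frame_id
FROM engine_segment ) s ON j.segment_id = s.id
inner join engine_task et
on s.task_id = et.id
-- Images are linked to a task through engine_task.data_id, not through the task id.
inner join engine_image ei
on ei.frame = s.frame_id
and ei.data_id = et.data_id
-- Inner join: tasks that do not belong to a project are left out of the view.
inner join engine_project ep
on et.project_id = ep.id
;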

@Jain-Archit

@avengersassemble @MattWittbrodt Is there any way to get the above information using cvat-cli or the API? I have a single task which is divided into multiple non-overlapping jobs. Annotators have not done certain jobs, and in the jobs they have finished there are some corrupt images (no annotations). I want to distinguish the images that are corrupt from the ones that have not been annotated yet. I do not have access to cvat-db and am looking for a solution using the API/CLI. Can anyone help?

@zhiltsov-max
Contributor

Hi, please check if the get_meta() and get_frames_info() methods of Task and Job in the high-level SDK are useful.
Example 1, example 2, more complex example 3 with the lower-level API.
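
Once the frame names (in frame order) and each job's inclusive frame range are available, for instance from the methods mentioned above, the task/job/frame/path table requested earlier in the thread can be assembled with plain Python. The sketch below does not call the SDK at all: how the names and ranges are read from SDK objects is left out and should be checked against the SDK documentation. The names FrameRow, build_frame_table and find_image are hypothetical, and all IDs and file names are invented.

```python
from typing import NamedTuple


class FrameRow(NamedTuple):
    task_id: int
    job_id: int
    frame_id: int
    img_path: str


def build_frame_table(task_id, frame_names, job_ranges):
    """Expand job ranges into one row per (job, frame).

    frame_names: image names indexed by frame number.
    job_ranges: iterable of (job_id, start_frame, stop_frame), inclusive.
    """
    rows = []
    for job_id, start, stop in job_ranges:
        for frame_id in range(start, stop + 1):
            rows.append(FrameRow(task_id, job_id, frame_id, frame_names[frame_id]))
    return rows


def find_image(rows, img_path):
    """Return every row for a given image (several if jobs overlap)."""
    return [row for row in rows if row.img_path == img_path]


# Made-up data: one task with four frames split into two jobs.
names = ["a.jpg", "b.jpg", "c.jpg", "d.jpg"]
jobs = [(26, 0, 1), (27, 2, 3)]
rows = build_frame_table(111, names, jobs)
print(find_image(rows, "c.jpg"))
```

Filtering such a table by job makes it possible to separate images in jobs that were never started from images in finished jobs that simply have no annotations.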
