Transition CLI to 1-based Frame Numbers #265

Closed
4 tasks done
Breakthrough opened this issue Mar 13, 2022 · 3 comments

@Breakthrough
Owner

Breakthrough commented Mar 13, 2022

Problem:

In PySceneDetect v0.5, frame numbers started from frame 0 in the CLI output. This led to some inconsistencies when interfacing with other programs (e.g. ffmpeg) which assume 1-based indices.

Proposal:

The reported end time/frame number should include the presentation duration of the scene's last frame. Thus a single-frame scene consisting of the first frame of the video should have start/end frames of 1/1 and, assuming 10 FPS, start/end times of 0.0/0.1 seconds.

The first frame of a video should be called frame 1 with a presentation time of 0.0 seconds (and presentation duration of 0.1s). If we have a video at 10 frames/sec, the second frame (frame 2) should have a presentation time of 0.1s and be displayed until 0.2s. See the frames extracted from counter.mp4 for an example of such a video.
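The mapping between a 1-based frame number and its presentation interval can be sketched as follows. This is only an illustration of the arithmetic described above; `presentation_interval` is a hypothetical helper, not part of the scenedetect API, and a constant frame rate is assumed.

```python
from fractions import Fraction


def presentation_interval(frame_number, fps):
    """Return (start, end) presentation times in seconds for a 1-based frame.

    Hypothetical helper for illustration; assumes a constant frame rate.
    """
    if frame_number < 1:
        raise ValueError("frame numbers are 1-based")
    rate = Fraction(fps)
    start = (frame_number - 1) / rate  # frame 1 is presented at 0.0s
    end = frame_number / rate          # and stays on screen until 1/fps
    return start, end


# Frames 1 and 2 of a 10 FPS video.
for n in (1, 2):
    start, end = presentation_interval(n, 10)
    print(n, float(start), float(end))
```

`Fraction` is used so that times such as 0.1 s and 0.3 s are exact; converting to `float` only for display avoids accumulating rounding error.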

Example:

Assume we have a video at 10 frames/second, which is 0.7 seconds long in total (including the display time of the last frame), and has 4 scenes of length 1, 2, 1, and 3 frames, respectively. The following invariants should hold for the timecodes/frame numbers that are output for each scene via the CLI / list-scenes:

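| Scene Number | Start (sec) | Start (frame) | End (sec) | End (frame) | Duration (sec) | Duration (frames) |
|---|---|---|---|---|---|---|
| 1 | 0.0 | 1 | 0.1 | 1 | 0.1 | 1 |
| 2 | 0.1 | 2 | 0.3 | 3 | 0.2 | 2 |
| 3 | 0.3 | 4 | 0.4 | 4 | 0.1 | 1 |
| 4 | 0.4 | 5 | 0.7 | 7 | 0.3 | 3 |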
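As a sanity check, the rows above can be reproduced from the scene lengths alone. The sketch below is an assumption about how the CLI values relate to each other, not the actual list-scenes implementation; `cli_scene_rows` and its dictionary keys are made-up names.

```python
from fractions import Fraction


def cli_scene_rows(scene_lengths, fps):
    """Build 1-based, inclusive scene rows from scene lengths in frames.

    Hypothetical helper for illustration; assumes a constant frame rate.
    """
    rate = Fraction(fps)
    rows = []
    next_start = 1  # the first frame of the video is frame 1
    for number, length in enumerate(scene_lengths, start=1):
        start_frame = next_start
        end_frame = start_frame + length - 1  # inclusive end frame
        rows.append({
            "scene": number,
            "start_sec": float((start_frame - 1) / rate),
            "start_frame": start_frame,
            "end_sec": float(end_frame / rate),  # includes last frame's display time
            "end_frame": end_frame,
            "duration_sec": float(length / rate),
            "duration_frames": length,
        })
        next_start = end_frame + 1
    return rows


for row in cli_scene_rows([1, 2, 1, 3], fps=10):
    print(row)
```

Note that the duration in frames is `end_frame - start_frame + 1`, while the duration in seconds is simply `end_sec - start_sec`.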

See original discussion in #264 for more details.

Internally, the scenedetect API uses 0-based frame indexing. Thus the above scene list when returned from calling detect_scenes on a SceneManager will yield the following FrameTimecode results (there is no change between 0.5 and 0.6):

| Scene | Start (timecode) | Start (frame) | End (timecode) | End (frame) |
|---|---|---|---|---|
| 0 | 0.0 | 0 | 0.1 | 1 |
| 1 | 0.1 | 1 | 0.3 | 3 |
| 2 | 0.3 | 3 | 0.4 | 4 |
| 3 | 0.4 | 4 | 0.7 | 7 |

In the API, the first frame (PTS 0.0s) has a frame number of 0, and the end timecode object is reflective of the entire duration of the scene (thus end - start = duration).
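The relationship between the two tables can be sketched as a small conversion. This assumes scenes are available as plain `(start, end)` frame-number pairs in the API convention (0-based start, end one past the last frame); `api_to_cli_frames` is a hypothetical name and does not use real `FrameTimecode` objects.

```python
def api_to_cli_frames(start, end):
    """Convert an API-style (0-based start, exclusive end) frame pair
    into CLI-style (1-based start, inclusive end) frame numbers.

    Hypothetical helper for illustration only.
    """
    if not 0 <= start < end:
        raise ValueError("expected 0 <= start < end")
    # Shifting to 1-based adds one to both bounds; making the end
    # inclusive subtracts one again, so the end value is unchanged.
    return start + 1, end


api_scenes = [(0, 1), (1, 3), (3, 4), (4, 7)]
cli_scenes = [api_to_cli_frames(s, e) for s, e in api_scenes]
print(cli_scenes)
```

In the API form the duration is `end - start`; in the CLI form the same duration is `end - start + 1`.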

Tasks:

  • Ensure that the end time is at least frame 1, or > 0.0s if a timecode
  • Ensure that the duration is at least 1 frame, or > 0.0s if a timecode
  • Change frame numbers in statsfile to also start from 1
  • Validate output against counter.mp4 and its frames
@eksperimental

Hi. Sorry, the end frame of one scene cannot be the same as the start frame of the next scene.

The example above should be:

scene 1 | start frame: 1 | end frame: 1 | duration frames: 1
scene 2 | start frame: 2 | end frame: 4 | duration frames: 3

@eksperimental

In short, a frame cannot belong to two different scenes.
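One way to check this property on a scene list with inclusive start/end frame numbers is sketched below. The function name is hypothetical, and the check assumes scenes are sorted and should cover the video without gaps.

```python
def scenes_are_contiguous(scenes):
    """Return True if each scene starts exactly one frame after the
    previous scene ends, so no frame is shared or skipped.

    `scenes` is a list of (start_frame, end_frame) pairs with inclusive ends.
    Hypothetical helper for illustration only.
    """
    return all(
        next_start == prev_end + 1
        for (_, prev_end), (next_start, _) in zip(scenes, scenes[1:])
    )


# Inclusive frame ranges: no frame is shared between scenes.
print(scenes_are_contiguous([(1, 1), (2, 3), (4, 4), (5, 7)]))
# End frame of one scene equals the start frame of the next: frame 1 is shared.
print(scenes_are_contiguous([(0, 1), (1, 3)]))
```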

@Breakthrough
Owner Author

Breakthrough commented Mar 14, 2022

@eksperimental Good point, thanks for the suggestion - that certainly helps the math work out easier as well. Just need to make sure that it's documented correctly. I had originally written that table in the form the internal API would see it, not the scene list output.

Edit: I've clarified by showing both tables.

Breakthrough modified the milestones: v0.7, v0.6 Mar 14, 2022
Breakthrough changed the title from "Transition to 1-based Frame Numbers" to "Transition CLI to 1-based Frame Numbers" Apr 24, 2022