Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add frame extractor for wildfire #48

Merged
merged 20 commits into from
May 12, 2020

Conversation

x0s
Copy link
Contributor

@x0s x0s commented Mar 23, 2020

This PR aims at importing the frame extraction into the repository used for training models.

Until now, Frame Extraction was mingled with annotation process. This PR improves encapsulation by modularizing the frame extraction. We will be able to choose different frame extraction strategy for training (not only annotating). This is important feature because we will build/train our models on top of what we show to them (the frames)

This is an ongoing PR. It is important to no have duplicate code, so I consider importing this new class into the other repository.

I provide some graphs illustrating the current workflow, the new one I suggest (big picture and isolated).

This PR does not primarily aim to change behaviour but when covering with tests, I discover an issue that should be fixed:

  • Number of frames created can be lower than number registered in labels.csv See Commit

Isolated, it enables to change extraction strategy independently of the annotation process:
graph_frame_extractor_only_

Here is an example how to use this new object:

    frame_extractor = FrameExtractor(path_to_wildfire,
                                     'jean_lulu_with_seq_01.states.csv',
                                     strategy='random',
                                     n_frames=2)

    labels = (frame_extractor.run(path_to_frames)
                             .get_frame_labels())

So far we use this naming convention:

  • states.csv: a state is a subsequence of a video composed of frames with constant targets (fire presence, position, confidence and exploitability) : dim=(n_states)
  • labels.csv: contains extracted frames with the corresponding targets: dim=(n_states x n_frames/state)

Todo list for this PR

  • Isolate FrameExtractor (from states to labels+frames)
  • Cover with test

Further work for VideoSplitter (to be postponed because the team is currently abundantly annotating)

  • Refactor AnnotationParser
  • Make AnnotationChecker output checkedStates in addition of checkedLabels

Any feedback is welcome,
Cheers,


Current workflow:
graph_original

Suggested workflow:
graph_new_

MateoLostanlen
MateoLostanlen previously approved these changes Mar 23, 2020
@x0s x0s changed the title Add first version of frame extractor for wildfire Add first version of frame extractor for wildfire (Work-In-Progress) Apr 1, 2020
@x0s x0s force-pushed the add-wildfire-frame-extractor branch from 08f3320 to ed4d425 Compare April 1, 2020 13:59
x0s added 9 commits April 2, 2020 12:24
…omments

Before, one may randomly select frame indexes with replacement so it resulted in a frame labeled multiple times in ~labels.csv but only one frame image found in folder. So it creates inconsistent count of frames registered and frames effectively available as images. Now the user get the count of frames he asked for, I think this is an expected behaviour. It raises exception when inconsistent number of frames is found. In the future, on may relax this by maybe skipping the state that has not enough frames to cover the user needs, but still provide expected number of frames per state
…ult behaviour)

it complies with previous version of the extractor, so it now raises warnings when allowed
@x0s x0s added type: enhancement New feature or request module: datasets Related to datasets labels Apr 8, 2020
@x0s x0s added this to In progress in PyroNear kanban via automation Apr 8, 2020
@codecov
Copy link

codecov bot commented Apr 8, 2020

Codecov Report

Merging #48 into master will increase coverage by 1.36%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #48      +/-   ##
==========================================
+ Coverage   83.80%   85.17%   +1.36%     
==========================================
  Files          18       19       +1     
  Lines         704      769      +65     
==========================================
+ Hits          590      655      +65     
  Misses        114      114              
Impacted Files Coverage Δ
pyronear/datasets/wildfire/__init__.py 100.00% <100.00%> (ø)
pyronear/datasets/wildfire/frame_extractor.py 100.00% <100.00%> (ø)

@x0s x0s changed the title Add first version of frame extractor for wildfire (Work-In-Progress) Add frame extractor for wildfire Apr 8, 2020
@x0s x0s assigned x0s and frgfm Apr 9, 2020
Because you are using "range" and not "arange" at 
# Trying to set a valid frame range
frames_range = range(state.stateStart, state.stateEnd + 1)

you need stateStart and stateEnd to be int and not float as it is initially in the states dataframe
I added an option to resize the frame before writing in order to speed up the code. You can use it giving:

resizeFrame=(width, height) to the run method
MateoLostanlen
MateoLostanlen previously approved these changes Apr 19, 2020
PyroNear kanban automation moved this from In progress to Reviewer approved Apr 19, 2020
PyroNear kanban automation moved this from Reviewer approved to Review in progress Apr 19, 2020
Copy link
Member

@frgfm frgfm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for the PR! Very useful feature, only imports in the test to move to the header, some package requirements to make more flexible, and a fixture file that could be put in a gist.

Thanks again!

requirements.txt Outdated Show resolved Hide resolved
test/fixtures/wildfire_states.csv Show resolved Hide resolved
requirements.txt Outdated Show resolved Hide resolved
requirements.txt Outdated Show resolved Hide resolved
requirements.txt Outdated Show resolved Hide resolved
test/test_datasets_wildfire_frame_extractor.py Outdated Show resolved Hide resolved
Copy link
Contributor Author

@x0s x0s left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding a new feature, but I think that's beyond the scope of this PR. That would be better to create a new PR for it. You may be adding it because we are very slow in reviewing then merging it onto master. Anyway, I think we are close to merging it.

I provide you with feedback here:

  • Please follow the PEP8 convention for variables
  • Please follow the Type ints convention this PR is introducing (to increase stability of the code/ documentation: to answer for instance: what types this function is to expect ?)
  • None is neither False nor True so if you want it to test if a variable contain None, it is better practice to test it explicitly: if variable is not None: / if variable is None:
  • Please provide documentation in the dosctrings, this is lacking in run() and _write_frames()
  • If you add a new feature, it is important to cover it with test cases. As you can see, adding this feature decreased covering of this PR One may expected to test for possibles values (None , differents sizes and some pathological cases: one dimension provided when 2 are expected for instance)
  • Please provide a more detailed example illustrating when we need to speed up the code and why resizing would be a good solution.

Answering these could be done in another PRs i guess

pyronear/datasets/wildfire/frame_extractor.py Outdated Show resolved Hide resolved
@x0s x0s marked this pull request as draft April 29, 2020 08:28
Copy link
Member

@frgfm frgfm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MateoLostanlen could you take a look at @x0s comment on the frame_extractor please?

pyronear/datasets/wildfire/frame_extractor.py Outdated Show resolved Hide resolved
@frgfm frgfm modified the milestones: 0.2.0, 0.1.1 Apr 30, 2020
@MateoLostanlen MateoLostanlen marked this pull request as ready for review May 1, 2020 15:06
MateoLostanlen
MateoLostanlen previously approved these changes May 1, 2020
PyroNear kanban automation moved this from Review in progress to Reviewer approved May 12, 2020
Copy link
Member

@frgfm frgfm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All good!

@frgfm frgfm merged commit 61e6410 into pyronear:master May 12, 2020
PyroNear kanban automation moved this from Reviewer approved to Done May 12, 2020
@x0s x0s deleted the add-wildfire-frame-extractor branch June 3, 2020 13:59
Akilditu pushed a commit to Akilditu/seb_pyro-vision that referenced this pull request Jun 3, 2020
* Add first version of frame extractor for wildfire

* comment version import, temporary solution for pyronear#49

* Add type annotations to cover FrameExtractor

* Fix frame extraction strategy not effectively passed to method

* FrameExtractor now only pick unique frames + Cover with tests + Add comments

Before, one may randomly select frame indexes with replacement so it resulted in a frame labeled multiple times in ~labels.csv but only one frame image found in folder. So it creates inconsistent count of frames registered and frames effectively available as images. Now the user get the count of frames he asked for, I think this is an expected behaviour. It raises exception when inconsistent number of frames is found. In the future, on may relax this by maybe skipping the state that has not enough frames to cover the user needs, but still provide expected number of frames per state

* Add Integration tests for FrameExtractor

* Test for multiple frames

* Raise Exception when Video frame unreadable + cover with test

* Print in Exception from which video they is inconsistent number of frames

* Allow frames duplicates when too many asked to be extracted (not default behaviour)

it complies with previous version of the extractor, so it now raises warnings when allowed

* Add automatic download of video fixtures

* Add new requirements

* add yaml dependency

* Convert type for states start and End

Because you are using "range" and not "arange" at 
# Trying to set a valid frame range
frames_range = range(state.stateStart, state.stateEnd + 1)

you need stateStart and stateEnd to be int and not float as it is initially in the states dataframe

* correct typing error in my last commit

self and not this

* Add the posibility to resize the frame

I added an option to resize the frame before writing in order to speed up the code. You can use it giving:

resizeFrame=(width, height) to the run method

* Fix pkg-style

* lower bound packages version to ensure backward compatibility

* Strictly comply with PEP8 concerning imports permitting download that may break in the future

* Revert reviewer commits in favor of another PR

Co-authored-by: lostanlen <mateo.lostanlen@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
module: datasets Related to datasets type: enhancement New feature or request
Projects
Development

Successfully merging this pull request may close these issues.

None yet

3 participants