Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

open_firev2 #121

Closed
wants to merge 45 commits into from
Closed

Conversation

MateoLostanlen
Copy link
Member

Open_fire v1 is dying due to too many dead links as discuss in issue #118. I want to propose here a Open_fire V2. I have a dataset of 9000 images from which a good part comes from openfire. I propose that this dataset becomes the new Open_fire. It is downloadable from my drive so no problem with dead link here.

I guess this PR should close #9 too.

PS: For testing purposes I have created a sample version that can be downloaded quickly.
PPS: The real version of the datatset is not yet on my drive I have to put it in the next coming days but I have wifi problems

After this PR I will use the sample template for Wilfire dataset

@MateoLostanlen MateoLostanlen added the module: datasets Related to datasets label Apr 15, 2021
@MateoLostanlen MateoLostanlen requested review from frgfm and a team April 15, 2021 15:38
@MateoLostanlen MateoLostanlen self-assigned this Apr 15, 2021
@MateoLostanlen MateoLostanlen added this to In progress in PyroNear kanban via automation Apr 15, 2021
@MateoLostanlen MateoLostanlen linked an issue Apr 15, 2021 that may be closed by this pull request
@codecov
Copy link

codecov bot commented Apr 15, 2021

Codecov Report

Merging #121 (2a71cc6) into master (40ddf2f) will decrease coverage by 1.12%.
The diff coverage is 85.00%.

❗ Current head 2a71cc6 differs from pull request most recent head eddfbeb. Consider uploading reports for the commit eddfbeb to get more accurate results

@@            Coverage Diff             @@
##           master     #121      +/-   ##
==========================================
- Coverage   92.33%   91.20%   -1.13%     
==========================================
  Files          18       18              
  Lines         652      637      -15     
==========================================
- Hits          602      581      -21     
- Misses         50       56       +6     
Flag Coverage Δ
unittests 91.20% <85.00%> (-1.13%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pyrovision/datasets/utils.py 84.61% <ø> (-6.16%) ⬇️
pyrovision/datasets/openfire.py 85.71% <85.00%> (-5.32%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ce4db67...eddfbeb. Read the comment docs.

@frgfm
Copy link
Member

frgfm commented Apr 19, 2021

Thanks for the PR and the dataset!
Though, GDrive is not exactly the best solution for dataset storage. How big is it?
I would rather consider putting it in a release attachment and add integrity check 👌

@MateoLostanlen
Copy link
Member Author

MateoLostanlen commented Apr 19, 2021

@frgfm
Copy link
Member

frgfm commented Apr 19, 2021

@MateoLostanlen 500Mb is quite alright :)

  • first, the limitation you are mentioning is about attachment to a comment/PR/issues. I'm positive the limitation for releases are higher since I have model checkpoints of my own bigger than 150Mb
  • worst case scenario if it's a bit too big, we can easily split it in 2/3 parts and put each part in the release attachments !

@MateoLostanlen
Copy link
Member Author

Hi @frgfm, what do you thinks of these last changes ?

Copy link
Member

@frgfm frgfm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! I have many questions as I'm not familiar with some classes that you used in here 😅

pyrovision/datasets/__init__.py Show resolved Hide resolved
test/test_datasets_utils.py Outdated Show resolved Hide resolved
pyrovision/datasets/utils.py Outdated Show resolved Hide resolved
pyrovision/datasets/utils.py Outdated Show resolved Hide resolved
pyrovision/datasets/openfire.py Outdated Show resolved Hide resolved
test/test_datasets.py Outdated Show resolved Hide resolved
pyrovision/datasets/openfire.py Outdated Show resolved Hide resolved
pyrovision/datasets/openfire.py Outdated Show resolved Hide resolved
pyrovision/datasets/openfire.py Outdated Show resolved Hide resolved
pyrovision/datasets/openfire.py Outdated Show resolved Hide resolved
@frgfm
Copy link
Member

frgfm commented Feb 13, 2022

@MateoLostanlen sorry I thought that you saw, but I didn't review because there is still a failing test which is related to this PR. Would you mind fixing it before moving on to the review? 🙏

@MateoLostanlen
Copy link
Member Author

@frgfm oh sorry this pr is open for too long I had forgotten :) It's ok the error is fixed. There is still a problem but it has nothing to do with pyrovision as explained here

Copy link
Member

@frgfm frgfm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! My bad, I had to jump back into this

test/test_datasets.py Outdated Show resolved Hide resolved
pyrovision/datasets/openfire.py Outdated Show resolved Hide resolved
Copy link
Member

@frgfm frgfm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The unittests looks good thanks!

For now, about the legal part of sharing the URL of the dataset, I can suggest a viable option : we don't share the URL, we only specify about the folder should be structured. People make their own dataset and they'll have to pass the path to the extracted archive.

(while we figure out how/if we can share this)

@MateoLostanlen
Copy link
Member Author

I thinks it will be complicated in a legal point of view . I think our current version it's not legal either. So let's delete it from the repo. However if you do not download any images, this dataset became exactly the ImageFolder dataset from torchvision. It is still usefull ?

@frgfm
Copy link
Member

frgfm commented Mar 13, 2022

I thinks it will be complicated in a legal point of view . I think our current version it's not legal either. So let's delete it from the repo. However if you do not download any images, this dataset became exactly the ImageFolder dataset from torchvision. It is still usefull ?

Fair point

For now, perhaps we only use ImageFolder for training and it will simplify things until we can share a public dataset

Copy link
Member

@frgfm frgfm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, there are a few things missing:

  • removing openfire from datasets/init.py
  • updating the reference script to use ImageFolder

PyroNear kanban automation moved this from In progress to Review in progress Mar 16, 2022
@MateoLostanlen
Copy link
Member Author

Done :)

Copy link
Member

@frgfm frgfm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot 🙂
Perhaps we should document the structure of the folder but it's nitpicking, no problem for me to move forward with this 👍

### Wildfire

This dataset is private, for now only Pyronear members can access it
Train the model on your own dataset
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we should specify the folder expected structure / hierarchy?

PyroNear kanban automation moved this from Review in progress to Reviewer approved Apr 6, 2022
@frgfm
Copy link
Member

frgfm commented Jul 18, 2022

Overruled by #153 :)
OpenFire is back 👍

@frgfm frgfm closed this Jul 18, 2022
PyroNear kanban automation moved this from Reviewer approved to Done Jul 18, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
module: datasets Related to datasets
Projects
Development

Successfully merging this pull request may close these issues.

Issue downloading OpenFire [ci] Optimize cache for datasets downloading
2 participants