
Update to yolo #232

Closed · wants to merge 6 commits

Conversation

daavoo (Contributor) commented Aug 3, 2023

Deployed:

https://github.com/daavoo/example-get-started-experiments
https://studio.iterative.ai/team/Iterative/projects/example-get-started-experiments-bhm5eg8le6


I made the decision because:

  • It was the framework I was already using on a similar project 😅
  • It feels like it is more popular these days compared to fast.ai.
    Not only in general: in our particular case, more than one customer/prospect has explicitly reported using YOLO, but (afaik) we have not heard that about fast.ai.
  • The framework logic for training felt simpler.
    See train.py vs the previous fast.ai version (a rough sketch is below).
    There is no need for a separate evaluate.py stage because YOLO loads the best model and evaluates it at the end of training.
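
For reference, this is roughly what the training step boils down to with ultralytics (the checkpoint name and hyperparameter values here are illustrative, not necessarily what this PR uses):

from ultralytics import YOLO

# Load a pretrained segmentation checkpoint and fine-tune it on our dataset.
model = YOLO("yolov8n-seg.pt")
# train() also validates the best checkpoint at the end,
# which is why no separate evaluate stage is needed.
model.train(data="datasets/yolo_dataset.yaml", epochs=20, imgsz=640)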

TODO:

@daavoo daavoo added the A: example-get-started-experiments DVC Experiment, DVCLive examples label Aug 3, 2023
@daavoo daavoo self-assigned this Aug 3, 2023
@daavoo daavoo marked this pull request as draft August 3, 2023 10:55

git rm TrainSegModel.ipynb
daavoo (Author) commented Aug 3, 2023

We have a git tag for the notebook state.
It feels more natural to drop the notebook once it has been converted to a DVC pipeline.

@daavoo daavoo marked this pull request as ready for review August 3, 2023 13:41
@@ -107,47 +103,3 @@ This tag also contains a GitHub Actions workflow that reruns the pipeline if any
changes are introduced to the pipeline-related files.
[CML](https://cml.dev/) is used in this workflow to provision a cloud-based GPU
machine as well as report model performance results in Pull Requests.

## Deploying the model
daavoo (Author):

Removing this section for now.
Will properly close #230 in a subsequent PR adding the SageMaker code.

Try to improve look of images painted by YOLO
dberenbaum (Collaborator) commented Aug 3, 2023

Thoughts on yolo:

  • I like that it's a more popular, high-level framework.
  • Am I reading it wrong, or are the predictions pretty bad?
  • Not sure if the simplicity is a plus here. It is easy to read, but it's maybe too trivial? Do I really need to modularize my training code if it's 2 lines?
  • How is dvclive even invoked here? Any idea why https://docs.ultralytics.com/reference/yolo/utils/callbacks/dvc/ is giving a 404?
  • In general, I worry there's so much magic happening in both dvc and yolo here that it's hard for the user to learn anything from it or generalize to their use case.

A couple small Studio UI issues:

  • Did you manually hide all older commits in Studio? I only see the latest commit and its experiments.
  • Similarly, it looks like lots of metrics are hidden. If we want to use a more realistic example in yolo, I think it would be better to show more metrics.

fire
dberenbaum (Collaborator):

Minor, but I vote to drop fire and use argparse. The scripts will still be plenty short and I think they will be more transparent.
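
For illustration, a minimal argparse version could look something like this (the flag names and defaults here are hypothetical, not the PR's exact interface):

import argparse

from ultralytics import YOLO


def main():
    # Explicit CLI instead of fire's auto-generated one.
    parser = argparse.ArgumentParser(description="Train the YOLO segmentation model")
    parser.add_argument("--data", default="datasets/yolo_dataset.yaml")
    parser.add_argument("--epochs", type=int, default=20)
    parser.add_argument("--imgsz", type=int, default=640)
    args = parser.parse_args()

    model = YOLO("yolov8n-seg.pt")
    model.train(data=args.data, epochs=args.epochs, imgsz=args.imgsz)


if __name__ == "__main__":
    main()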

daavoo (Author):

will do

desc="This is a Computer Vision (CV) model that's segmenting out swimming pools from satellite images.",
labels=["cv", "segmentation", "satellite-images", params.train.arch],
)
yolo.train(data="datasets/yolo_dataset.yaml", epochs=epochs, imgsz=imgsz, **kwargs)
dberenbaum (Collaborator):

Why not track datasets/yolo_dataset.yaml in dvc.yaml?

daavoo (Author):

No reason. Can totally do it.
It is just a description of where the paths are; YOLO wants it in a file, so it felt not really relevant.

yolo settings datasets_dir="/workspaces/example-repos-dev/example-get-started-experiments/build/example-get-started-experiments/datasets/"
yolo settings runs_dir="/workspaces/example-repos-dev/example-get-started-experiments/build/example-get-started-experiments/runs/"
yolo settings weights_dir="/workspaces/example-repos-dev/example-get-started-experiments/build/example-get-started-experiments/weights/"
jupyter nbconvert --execute 'TrainSegModel.ipynb' --inplace
git add .
tick
git commit -m "Run notebook and apply best experiment"
dberenbaum (Collaborator):

Is it still applying the best experiment?

daavoo (Author) commented Aug 4, 2023

No, do you think we should keep that logic? I am not even sure people realize what is going on without checking this script.

dberenbaum (Collaborator):

No, I just meant we should change the commit message

Comment on lines +109 to +110
-d src/train.py -d datasets/yolo_dataset/train -d datasets/yolo_dataset/val \
"python src/train.py \${train}"
dberenbaum (Collaborator):

It feels odd to me not to have the model as an output of the stage here.

Also, why do you have the train and val datasets as separate deps/outs instead of tracking the whole dataset as one?

daavoo (Author):

It feels odd to me not to have the model as an output of the stage here.

I can make the changes for that. I know it's odd, but I couldn't find a reason why it matters (maybe I should add an additional stage).

Also, why do you have the train and val datasets as separate deps/outs instead of tracking the whole dataset as one?

So that in Studio you can use Data columns to show the number of files separately.

dberenbaum (Collaborator):

I can make the changes for that. I know it's odd, but I couldn't find a reason why it matters (maybe I should add an additional stage).

Does iterative/dvclive#631 break anything here since it skips caching if _inside_dvc_exp=True?

daavoo (Author):

Does iterative/dvclive#631 break anything here since it skips caching if _inside_dvc_exp=True?

Because the .dvc file is created at the notebook stage, I believe there is no difference: DVC will (dvc) commit the changes at the end.

dberenbaum (Collaborator):

Overall, I think it's great and shows off how powerful and simple the tools can be, but it feels more like a showcase of functionality than a way to teach about dvc experiments. I would like to at least take some steps on iterative/dvclive#603 if we make this our default example project. Although it's hard to maintain them all as auto-generated repos, I think we need many examples of different frameworks, so I would urge that we try to keep both projects in some form instead of replacing one with the other.

daavoo (Author) commented Aug 4, 2023

  • Am I reading it wrong, or are the predictions pretty bad?

Not sure how/what you are reading. I could talk for a while about metrics for these tasks 😅

The currently selected metric is an instance segmentation one (mAP 0.5-0.95) and it doesn't hold the same meaning as accuracy or similar (where I would expect ~0.9 to consider it good).
It can't even be compared across datasets, but here is the SOTA on COCO (still no one has passed 0.6):
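
(For reference, the COCO-style metric averages AP over ten IoU thresholds, which is one reason the numbers sit well below what "accuracy" intuition would suggest:

\mathrm{mAP}_{0.5:0.95} = \frac{1}{10} \sum_{t \in \{0.50,\,0.55,\,\dots,\,0.95\}} \mathrm{AP}_{t}

)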

[image: COCO instance segmentation SOTA]

  • Not sure if the simplicity is a plus here. It is easy to read, but it's maybe too trivial? Do I really need to modularize my training code if it's 2 lines?

The point is not (only?) to modularize but to use exp run and its associated features, right?

The dvclive callback is automatically added if dvclive is installed:

https://github.com/ultralytics/ultralytics/blob/main/ultralytics/utils/callbacks/dvc.py

This is the way YOLO currently handles the integrations.
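
Roughly, the pattern is install-gated callback registration. A simplified, illustrative sketch (not the actual ultralytics source; the trainer object and callback name here are assumptions):

try:
    from dvclive import Live
except ImportError:
    Live = None

live = Live() if Live is not None else None


def on_fit_epoch_end(trainer):
    # `trainer.metrics` is assumed to be a dict of the current epoch's metrics.
    for name, value in trainer.metrics.items():
        live.log_metric(name, value)
    live.next_step()


# The callback is only wired up when dvclive could be imported.
callbacks = {"on_fit_epoch_end": [on_fit_epoch_end]} if live is not None else {}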

  • In general, I worry there's so much magic happening in both dvc and yolo here that it's hard for the user to learn anything from it or generalize to their use case.

So, I am assuming you did not have that worry with the previous fast.ai code because the callback was passed explicitly?
I remember the feedback was that it was too complicated because it used the context manager, etc.

A couple small Studio UI issues:

  • Did you manually hide all older commits in Studio? I only see the latest commit and its experiments.

Yes, feel free to add/show whatever.

  • Similarly, it looks like lots of metrics are hidden. If we want to use a more realistic example in yolo, I think it would be better to show more metrics.

Same as above; it's just the view I added and it doesn't have to be what we show. It is what I prefer, but I don't know if it is the best way to present it as an example.
I do worry about the fallacy of making things look "complex" for the sake of appearance.

daavoo (Author) commented Aug 4, 2023

but feels more like a showcase of functionality than a way to teach about dvc experiments

Sorry, maybe it was already implicit in the previous comments, but could you summarize what makes that the case for this example vs the current one?

What I got as main differences:

  • no evaluate stage
  • train code too simple

Is there something else?

dberenbaum (Collaborator):

The currently selected metric is an instance segmentation one (mAP 0.5-0.95) and it doesn't hold the same meaning as accuracy or similar (where I would expect ~0.9 to consider it good).

Sorry, I was reading it wrong 😄 . I have used mAP but here I was looking at the image masks. I think it is just harder to see whether the red bounding boxes overlay an actual pool. Anyway, disregard that comment.

Sorry, maybe it was already implicit in the previous comments, but could you summarize what makes that the case for this example vs the current one?

I don't have that strong an opinion on which one should be the "default." I think we need to keep both (and in that case, I'm fine with keeping the deployment in this one) because they showcase different features and frameworks and we need more examples.

Reasons why I wouldn't push to make this one the "default":

  1. Docs need to be reworked and I don't see that it would be worth the time.
  2. If I'm not using yolo, it doesn't show me anything about how to use dvclive. By default, I would rather have an example that invokes Live() or an explicit DVCLiveCallback().
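
A minimal sketch of that explicit pattern (dummy values, just to show the API shape):

from dvclive import Live

# Explicitly create the logger instead of relying on a framework auto-callback.
with Live() as live:
    live.log_param("epochs", 3)
    for epoch in range(3):
        loss = 1.0 / (epoch + 1)  # placeholder for a real training step
        live.log_metric("loss", loss)
        live.next_step()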

daavoo (Author) commented Aug 4, 2023

Sorry, I was reading it wrong 😄 . I have used mAP but here I was looking at the image masks. I think it is just harder to see whether the red bounding boxes overlay an actual pool. Anyway, disregard that comment.

Yes, the default built-in visualization is ... not great.

I don't have that strong an opinion on which one should be the "default." I think we need to keep both (and in that case, I'm fine with keeping the deployment in this one) because they showcase different features and frameworks and we need more examples.

Reasons why I wouldn't push to make this one the "default":

  1. Docs need to be reworked and I don't see that it would be worth the time.
  2. If I'm not using yolo, it doesn't show me anything about how to use dvclive. By default, I would rather have an example that invokes Live() or an explicit DVCLiveCallback().

Makes sense.
For the record, about 2: HuggingFace also appears to default to automatically including the callback if the library is installed, so assuming we start moving the callbacks to their repos, more examples might start to look like this.

@daavoo daavoo marked this pull request as draft August 4, 2023 15:28
daavoo (Author) commented Aug 4, 2023

I will make this another repo; converting to draft for now.

@daavoo daavoo closed this Aug 7, 2023