Error in model evaluation #338

Open
Monica9577 opened this issue Feb 24, 2024 · 5 comments

Comments

@Monica9577

Describe the bug
When trying to create the evaluation video, the main console returns an error saying there is a mismatch in the number of features in the input file.

To Reproduce
After successfully labeling the classifier, in the "run machine model" window:

  • select the video file and classifier model file
  • run model
  • create interactive probability plot
  • select the best probability for the given classifier and the minimum bout ms
  • click create validation video
[Screenshot attached: "Captura de pantalla 2024-02-24 191231"]

Expected behavior
Once complete, you should see a video file representing the analyzed file inside the project_folder/frames/output/validation directory.

Desktop (please complete the following information):

  • OS: Windows 11
  • Python version: 3.6.8
  • Using Anaconda: Yes
  • SimBA version: 1.86.4

Additional context
This is my first time creating a new model, so maybe I'm doing something wrong, but I followed the tutorial on this GitHub page.
Anyway, I apologize in advance if it was my mistake and not the software's.
Thanks in advance for your help.

@sronilsson
Collaborator

Hi @Monica9577! SimBA takes all the files inside the project_folder/csv/targets_inserted directory and builds a model from them. In each of these files, if you remove the body-part data columns (at the beginning) and your annotations (at the end of the file), you are left with 221 columns of features.

Next, you want to use this model .sav file, which was built using 221 columns of features, on new data inside your project_folder/csv/features_extracted directory.
SimBA opens the first file inside the project_folder/csv/features_extracted directory and tries to analyze it. However, it finds 245 columns. The model was trained with 221 columns and now sees 245, so it has no way of knowing what to do with the 24 extra columns, and it gives you the error.

One possible cause of a mismatch in the number of columns is, for example, that you added ROI features to your new data inside the project_folder/csv/features_extracted directory but did not add them to the files you used to train the model. Could this be possible?
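A quick way to see whether the files in a directory agree on their number of columns is to read only the header row of each CSV and group the files by column count. This is a minimal sketch using only the Python standard library; it is not part of SimBA. The function names `header_of` and `group_by_column_count` are hypothetical, and it assumes the files are plain CSVs with a single header row.

```python
import csv
from collections import defaultdict
from pathlib import Path


def header_of(csv_path):
    """Return the column names from the first row of a CSV file."""
    with open(csv_path, newline="") as handle:
        return next(csv.reader(handle), [])


def group_by_column_count(directory):
    """Map each column count to the names of the CSV files that have it."""
    groups = defaultdict(list)
    for csv_path in sorted(Path(directory).glob("*.csv")):
        groups[len(header_of(csv_path))].append(csv_path.name)
    return dict(groups)


# Example usage (directory names taken from this thread; adjust to your project):
# print(group_by_column_count("project_folder/csv/targets_inserted"))
# print(group_by_column_count("project_folder/csv/features_extracted"))
```

The raw counts will not be 221 or 245, because they still include the body-part columns, any index column, and (in targets_inserted) the annotation columns. What matters is whether all files within one directory share the same count, and whether the gap between the two directories is larger than the number of annotation columns.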

@Monica9577
Author

No, I haven't added any ROIs to any of the files...
But I'm going to check whether there's any difference between the features CSV of the video I used to train the model and the one I'm using to evaluate it.
Thanks!

@sronilsson
Collaborator

Thanks!

One more potential reason for this error I have seen before: Within a single SimBA project, be sure you are working with the same Animal names and the same body-part names in all files.

The file project_folder/logs/measures/pose_configs/bp_names/project_bp_names.csv stores the names of the body-parts in your SimBA project. Before training models, SimBA drops the data for your body-parts, because we don't want to use the locations of the body-parts in any model. However, if the body-part names or animal names change across data files for any reason, the affected columns will not be recognized as body-parts and SimBA will fail to drop them in some files, causing the column-count mismatch error you are seeing.
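To check this by hand, you can compare the names in project_bp_names.csv against the header of a data file. The sketch below is a rough standard-library check, not SimBA's own code. It assumes that project_bp_names.csv holds one body-part name per row in its first column, and that each body-part appears in the data files as three columns ending in `_x`, `_y` and `_p`; both are assumptions you should confirm against your own files. The function names are hypothetical.

```python
import csv


def read_first_column(csv_path):
    """Return the non-empty values found in the first column of a CSV file."""
    with open(csv_path, newline="") as handle:
        return [row[0].strip() for row in csv.reader(handle) if row and row[0].strip()]


def read_header(csv_path):
    """Return the column names from the first row of a CSV file."""
    with open(csv_path, newline="") as handle:
        return next(csv.reader(handle), [])


def missing_body_part_columns(bp_names, header, suffixes=("_x", "_y", "_p")):
    """List the expected body-part columns that are absent from a header.

    Assumption: every body-part is stored as <name>_x, <name>_y, <name>_p.
    """
    present = set(header)
    return [bp + suffix for bp in bp_names for suffix in suffixes
            if bp + suffix not in present]


# Example usage (the data file name is a placeholder):
# bp_names = read_first_column(
#     "project_folder/logs/measures/pose_configs/bp_names/project_bp_names.csv")
# header = read_header("project_folder/csv/features_extracted/my_video.csv")
# print(missing_body_part_columns(bp_names, header))
```

An empty list means every expected body-part column was found under its project name. Any entry in the list points to a body-part whose columns are named differently in that file (for example, a change in capitalization), which would stop them from being dropped.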

@Monica9577
Author

Hi,
Yes, I'm using the same body-part and animal names in every video,
but it is still giving problems.

@sronilsson
Collaborator

@Monica9577 - if you look at a file inside the project_folder/csv/targets_inserted directory and compare it against a file inside the project_folder/csv/features_extracted directory, what differences do you see in the column names and the number of columns? You could also zip up a file from each directory and share them with me here or through a gdrive link, and I can take a look.
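If it helps, the comparison can be done with a few lines of Python instead of by eye. This is a minimal sketch using only the standard library, assuming both files are plain CSVs with one header row. The function names and the two file names in the usage comment are placeholders, not anything SimBA provides.

```python
import csv


def read_header(csv_path):
    """Return the column names from the first row of a CSV file."""
    with open(csv_path, newline="") as handle:
        return next(csv.reader(handle), [])


def compare_headers(train_header, new_header):
    """Report the columns that appear in one header but not in the other."""
    train_set, new_set = set(train_header), set(new_header)
    return {
        "only_in_training": [c for c in train_header if c not in new_set],
        "only_in_new": [c for c in new_header if c not in train_set],
    }


# Example usage (file names are placeholders):
# train = read_header("project_folder/csv/targets_inserted/training_video.csv")
# new = read_header("project_folder/csv/features_extracted/new_video.csv")
# diff = compare_headers(train, new)
# print(len(train), len(new))
# print(diff["only_in_training"])
# print(diff["only_in_new"])
```

The annotation (classifier) columns are expected to show up under `only_in_training`, since they only exist in targets_inserted. Anything listed under `only_in_new` is a column the model never saw during training and is the most likely source of the mismatch.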
