Feature/improve speed and limit memory (#11) #100
base: main
Commits on Apr 12, 2023
Feature/improve speed and limit memory (#11)
Improve speed and limit memory consumption
- stream input files for inference
- add feature: skip deduplication
- add feature: ensemble model
- add feature: rescale input before inference with pre-trained models
Commit 2f879e5
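To illustrate what "stream input files for inference" can mean in practice, here is a minimal sketch that reads a tab-separated file in fixed-size chunks so that only one chunk is held in memory at a time. It uses only the standard library; the helper name `iter_chunks`, the column names, and the sample rows are assumptions made for the example and are not taken from this pull request, whose actual implementation may differ.

```python
import csv
import io
from itertools import islice


def iter_chunks(handle, chunk_size):
    """Yield lists of row dicts, at most chunk_size rows at a time."""
    reader = csv.DictReader(handle, delimiter="\t")
    while True:
        chunk = list(islice(reader, chunk_size))
        if not chunk:
            return
        yield chunk


# Tiny in-memory stand-in for a tab-separated PSM file (made-up columns).
pin_text = "SpecId\tLabel\tscore\na\t1\t0.9\nb\t-1\t0.1\nc\t1\t0.7\n"

n_rows = 0
n_targets = 0
for chunk in iter_chunks(io.StringIO(pin_text), chunk_size=2):
    # Only the current chunk is in memory; aggregate and move on.
    n_rows += len(chunk)
    n_targets += sum(row["Label"] == "1" for row in chunk)

print(n_rows, n_targets)
```

Because each chunk is discarded after it is processed, peak memory depends on `chunk_size` rather than on the size of the input file.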
Commits on Apr 17, 2023
Commit bf0c2ce
Commits on Apr 20, 2023
- fix bug: member variables not assigned when model is not trained
- allow throwing when the input file is malformed: remove skipping of bad lines from the pandas read function
Commit 3ad792c
Commits on May 5, 2023
Commit 29f5549
Commits on May 9, 2023
- Create new object of OnDiskPsmDataset to use for brew tests
- Update brew function outputs and assert statements
Commit ae7f880
Commits on May 11, 2023
- remove assign confidence tests because datasets don't have assign confidence methods anymore
- add eval_fdr value to the _update_labels function
Commit 4e7235b
* Fix test confidence:
  - fix bugs for grouped confidence
  - fix test_one_group: create file using psm_df_1000 to create OnDiskPsmDataset
  - remove test_pickle because confidence does not return dataframe results anymore
  - add test_multi_groups to test that different group results are saved correctly
* Fix bugs:
  - overwrite default fdr for update_labels function
  - return dataframe for psm_df_1000 to use with LinearPsmDataset
Commit 3e7dda9
Commits on May 15, 2023
- Remove test_cli_pepxml because xml files don't work with streaming
- Replace old output file names
- Add random generator 'rng' variable to confidence since it is required for proteins
- Remove subset_max_train from PluginModel
- Fix bug: convert targets column after reading in chunks
- Fix peptide column name for confidence
- Fix test cli plugins: replace DecisionTreeClassifier with LinearSVC because DecisionTreeClassifier returns scores as 0 or 1
Commit c5d158a
Commits on May 16, 2023
- Refactor test structure: separate brew and confidence functions, read results from output files
- Fix bugs: fix output columns for proteins, sort proteins data by score
Commit 1d2fdf0
Commits on May 17, 2023
- Add label value to initial direction because it must have a numerical value
- Read pin does not return a dataframe anymore
- Compare output of read_pin function to example dataframe
Commit 6e08b70
Commits on May 22, 2023
- Add skip_deduplication flag test
- Add ensemble flag test
- Add rescale flag test
- Fix bug: remove target_column variable from read file for read_data_for_rescale
Commit d16cedc
- Remove writer tests with confidence object because LinearPsmDataset does not have an assign_confidence method anymore and results are streamed to output files while computing confidence
Commit 40f8394
Commits on Aug 4, 2023
fix error no psms found during training: if no psms passed the fdr value then raise error that model performed worse (#33)
Commit 531d4ae
Introduce new executable and bug fixes
* Create new executable to aggregate psms to peptides.
* Fix bugs:
  - fix error no psms found during training: if no psms passed the fdr value then raise error that model performed worse
  - raise error when pep values are all equal to 1
  - prefixes paths to dest_dir to not pollute the workdir
  - catch error to prevent traces logged: catch all errors to not break structured logging by error traces
  - fixes parallelism in parse_in_chunks to max_workers
  - fix indeterminism
  - fixed small column chunk bug
  - fix bug when using multiple input files
* Fix and add tests:
  - remove writer tests with confidence object because LinearPsmDataset does not have an assign_confidence method anymore and results are streamed to output files while computing confidence
  - add test for the new function "get_unique_peptides_from_psms"
  - add cli test for aggregatePsmsToPeptides
Commit 84c427b
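The commit message does not say how PSMs are aggregated to peptides. As a rough illustration only, the sketch below assumes one common convention: keep the best-scoring PSM for each peptide sequence. The function name `best_psm_per_peptide`, the dictionary keys, and the sample records are hypothetical and are not the project's `get_unique_peptides_from_psms`.

```python
def best_psm_per_peptide(psms):
    """Keep the highest-scoring PSM for each peptide sequence."""
    best = {}
    for psm in psms:
        current = best.get(psm["peptide"])
        # Assumption: a higher score means a better match.
        if current is None or psm["score"] > current["score"]:
            best[psm["peptide"]] = psm
    # Return peptides ordered from best to worst score.
    return sorted(best.values(), key=lambda p: p["score"], reverse=True)


# Made-up example records.
psms = [
    {"psm_id": 1, "peptide": "PEPTIDEK", "score": 0.4},
    {"psm_id": 2, "peptide": "PEPTIDEK", "score": 0.9},
    {"psm_id": 3, "peptide": "ANOTHERK", "score": 0.6},
]
peptides = best_psm_per_peptide(psms)
print([p["psm_id"] for p in peptides])
```

A single pass with a dictionary keeps memory proportional to the number of distinct peptides, which fits the streaming goal stated in the pull request title.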
Commits on Feb 16, 2024
Commit b85d176
Commits on Feb 22, 2024
Merge branch 'develop' into 'main'
dev to main
See merge request msaid/inferys/mokapot!36
Siegfried Gessulat committed Feb 22, 2024
Commit 58e8481
Commit 74f91f1
Commit 2985a7f
- adds line break in dataset.py
- updates call of ruff in CI
- updates pyproject.toml according to new ruff api
Commit 12ebe26
- adds line break in dataset.py
- updates call of ruff in CI
- updates pyproject.toml according to new ruff api
Commit 49608e1
Commit 6ccc88e
Commit 0b4fdc5
Commits on Feb 27, 2024
Feature/improve speed and limit memory (#11)
Improve speed and limit memory consumption
- stream input files for inference
- add feature: skip deduplication
- add feature: ensemble model
- add feature: rescale input before inference with pre-trained models
Commit f595804
Commit 46fbf6b
- fix bug: member variables not assigned when model is not trained
- allow throwing when the input file is malformed: remove skipping of bad lines from the pandas read function
Commit ee95fbd
Commit f3d50c8
- Create new object of OnDiskPsmDataset to use for brew tests
- Update brew function outputs and assert statements
Commit 4293410
- remove assign confidence tests because datasets don't have assign confidence methods anymore
- add eval_fdr value to the _update_labels function
Commit 623b7d8
* Fix test confidence:
  - fix bugs for grouped confidence
  - fix test_one_group: create file using psm_df_1000 to create OnDiskPsmDataset
  - remove test_pickle because confidence does not return dataframe results anymore
  - add test_multi_groups to test that different group results are saved correctly
* Fix bugs:
  - overwrite default fdr for update_labels function
  - return dataframe for psm_df_1000 to use with LinearPsmDataset
Commit 8f417dd
- Remove test_cli_pepxml because xml files don't work with streaming
- Replace old output file names
- Add random generator 'rng' variable to confidence since it is required for proteins
- Remove subset_max_train from PluginModel
- Fix bug: convert targets column after reading in chunks
- Fix peptide column name for confidence
- Fix test cli plugins: replace DecisionTreeClassifier with LinearSVC because DecisionTreeClassifier returns scores as 0 or 1
Commit 2e1723e
- Refactor test structure: separate brew and confidence functions, read results from output files
- Fix bugs: fix output columns for proteins, sort proteins data by score
Commit 6355834
- Add label value to initial direction because it must have a numerical value
- Read pin does not return a dataframe anymore
- Compare output of read_pin function to example dataframe
Commit 296fb73
- Add skip_deduplication flag test
- Add ensemble flag test
- Add rescale flag test
- Fix bug: remove target_column variable from read file for read_data_for_rescale
Commit 096b07f
- Remove writer tests with confidence object because LinearPsmDataset does not have an assign_confidence method anymore and results are streamed to output files while computing confidence
Commit d497fcc
fix error no psms found during training: if no psms passed the fdr value then raise error that model performed worse (#33)
Commit d241adb
Introduce new executable and bug fixes
* Create new executable to aggregate psms to peptides.
* Fix bugs:
  - fix error no psms found during training: if no psms passed the fdr value then raise error that model performed worse
  - raise error when pep values are all equal to 1
  - prefixes paths to dest_dir to not pollute the workdir
  - catch error to prevent traces logged: catch all errors to not break structured logging by error traces
  - fixes parallelism in parse_in_chunks to max_workers
  - fix indeterminism
  - fixed small column chunk bug
  - fix bug when using multiple input files
* Fix and add tests:
  - remove writer tests with confidence object because LinearPsmDataset does not have an assign_confidence method anymore and results are streamed to output files while computing confidence
  - add test for the new function "get_unique_peptides_from_psms"
  - add cli test for aggregatePsmsToPeptides
Commit 41ed445
Commit ac43547
Commit 4a9872f
Commit 346a0c0
- adds line break in dataset.py
- updates call of ruff in CI
- updates pyproject.toml according to new ruff api
Commit f543166
- adds line break in dataset.py
- updates call of ruff in CI
- updates pyproject.toml according to new ruff api
Commit 0742dc2
Commit a2602df
Commit f12a43d
Merge branch 'main' into 'feature/sync'
# Conflicts:
#   tests/conftest.py
#   tests/system_tests/test_system.py
#   tests/unit_tests/test_brew.py
#   tests/unit_tests/test_writer_flashlfq.py
#   tests/unit_tests/test_writer_txt.py
Siegfried Gessulat committed Feb 27, 2024
Commit 0fd515b
Merge branch 'feature/sync' into 'main'
rebase main
See merge request msaid/inferys/mokapot!37
Commit 6726dea