Collect the IDs of the predicted methods in the test set #3

Open
mauricioaniche opened this issue Apr 2, 2020 · 0 comments
Right now, we only collect performance metrics (e.g., precision, recall, accuracy).

We need to collect some examples for future qualitative analysis. In other words, for each model we build, a collection of [method_id, expected_prediction, model_prediction] tuples.

This way we can later look at code examples of false positives, false negatives, etc.

I suppose the changes will be:

  • _single_run_model should receive X_train, X_test, y_train, and y_test (to be implemented in Train, validation, and test predicting-refactoring-ml#36); in addition, we should pass X_test_id.
  • _single_run_model then returns, besides the performance metrics, a dataframe as suggested above.
  • This should be printed to the logs in a way that is easy to parse later. Suggestion: "PRED,refactoring,model,id_element,expected_prediction,predicted_value". "PRED" is just a prefix that is easy to find with grep (see the sketch after this list).
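A minimal sketch of what this could look like. Only `_single_run_model`, `X_test_id`, and the "PRED" log format come from this issue; the scikit-learn metric calls, the logger setup, and the `refactoring_name`/`model_name` parameters are assumptions for illustration, not the project's actual signatures.

```python
import logging

import pandas as pd
from sklearn.metrics import accuracy_score, precision_score, recall_score

log = logging.getLogger(__name__)


def _single_run_model(model, refactoring_name, model_name,
                      X_train, X_test, y_train, y_test, X_test_id):
    # Train and predict as before (hypothetical flow).
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)

    # Performance metrics, as already collected today.
    metrics = {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred),
    }

    # New: one row per test instance, for later qualitative analysis.
    predictions = pd.DataFrame(
        list(zip(X_test_id, y_test, y_pred)),
        columns=["id_element", "expected_prediction", "predicted_value"],
    )

    # Log every prediction with the grep-friendly "PRED" prefix.
    for row in predictions.itertuples(index=False):
        log.info("PRED,%s,%s,%s,%s,%s", refactoring_name, model_name,
                 row.id_element, row.expected_prediction, row.predicted_value)

    return metrics, predictions
```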

I'm using methods as an example, but it could also be a class, a variable, or a field, i.e., everything we predict.
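For the "easy to parse later" part, a possible downstream step, assuming the CSV-style format above, binary 0/1 labels, and a placeholder log file name `run.log`:

```python
import pandas as pd

# First extract only the prediction lines from the log, e.g.:
#   grep -o 'PRED,.*' run.log > predictions.csv
cols = ["prefix", "refactoring", "model", "id_element",
        "expected_prediction", "predicted_value"]
pred = pd.read_csv("predictions.csv", names=cols)

# False positives: predicted as refactored, but not refactored in reality
# (assumes 0/1 labels; adjust if the project encodes classes differently).
false_positives = pred[(pred.predicted_value == 1) & (pred.expected_prediction == 0)]
false_negatives = pred[(pred.predicted_value == 0) & (pred.expected_prediction == 1)]

print(false_positives[["refactoring", "model", "id_element"]].head())
```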

@jan-gerling jan-gerling transferred this issue from refactoring-ai/predicting-refactoring-ml Aug 4, 2020