Reformat csv reading and plot the number of instances being used throughout the hypothesis validation process #84

Closed
dpuenteramirez opened this issue Jan 26, 2022 · 7 comments · Fixed by #89 or #90
Comments

@dpuenteramirez
Owner

The basic idea is to add a new metric, the number of instances in use, on a second axis on the right of the current plot. A hand-drawn mock-up of the desired result follows.
Captura de pantalla 2022-01-26 a las 10.45.43.png
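
A minimal sketch of how such a plot could be produced, assuming a CSV with one row per iteration. The column names (`iteration`, `f1`, `n_instances`), the sample values, and both function names are invented for illustration and are not taken from the project's actual output; the plotting step assumes matplotlib is installed and uses `twinx()` to get the right-hand axis.

```python
import csv
import io

# Hypothetical CSV layout and made-up values, for illustration only.
SAMPLE_CSV = """iteration,f1,n_instances
0,0.71,30
1,0.74,42
2,0.78,55
"""


def read_series(text):
    """Parse CSV text into three parallel lists (iterations, scores, counts)."""
    rows = list(csv.DictReader(io.StringIO(text)))
    iterations = [int(r["iteration"]) for r in rows]
    scores = [float(r["f1"]) for r in rows]
    counts = [int(r["n_instances"]) for r in rows]
    return iterations, scores, counts


def plot_with_instance_count(iterations, scores, counts, path="plot.png"):
    """Draw the score on the left axis and the instance count on the right."""
    # Imported lazily so the parsing code above works without matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    fig, ax_score = plt.subplots()
    ax_score.plot(iterations, scores, color="tab:blue")
    ax_score.set_xlabel("iteration")
    ax_score.set_ylabel("f1")

    # Second y-axis that shares the x-axis with the score plot.
    ax_count = ax_score.twinx()
    ax_count.plot(iterations, counts, color="tab:orange", linestyle="--")
    ax_count.set_ylabel("number of instances")

    fig.savefig(path)
    plt.close(fig)


iterations, scores, counts = read_series(SAMPLE_CSV)
```

Calling `plot_with_instance_count(iterations, scores, counts)` would then write the figure to `plot.png`.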

@dpuenteramirez dpuenteramirez added enhancement New feature or request hypothesis tryouts Validate a hypothesis labels Jan 26, 2022
@dpuenteramirez dpuenteramirez added this to the Sprint 7 milestone Jan 26, 2022
@dpuenteramirez dpuenteramirez self-assigned this Jan 26, 2022
dpuenteramirez added a commit that referenced this issue Jan 26, 2022
Due to some modifications to the csv output #84 is being renamed
@dpuenteramirez dpuenteramirez changed the title Plot the number of instances being used throughout the hypothesis validation process Reformat csv reading and plot the number of instances being used throughout the hypothesis validation process Jan 26, 2022
@dpuenteramirez
Owner Author

The desired result has been achieved:
Dataset_iris_accuracy score_with_deletion

@alvarag
Collaborator

alvarag commented Jan 27, 2022

I am not sure about the names... I think we should have:

  • Original dataset
  • After self-training
  • After ENN

@dpuenteramirez
Owner Author

You mean in the right-hand legend, is that correct?

@dpuenteramirez
Owner Author

Something like this, @alvarag?
Dataset_14_banana_norm_f1_with_deletion

@alvarag
Collaborator

alvarag commented Jan 27, 2022

Yes, that is what I wanted to see. The problem is that ENN is filtering out instances of the original dataset, so it is worsening the performance.

@dpuenteramirez
Owner Author

Yeah, I know. The following picture shows the same experiment without deleting the original dataset instances; the performance improves quite a bit.
Dataset_14_banana_norm_f1_without_deletion
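
The difference between the two runs can be sketched as follows. This is a plain-Python version of the textbook Edited Nearest Neighbours rule (drop an instance when its label disagrees with the majority of its k nearest neighbours), not the project's implementation. The `protected` parameter, the function name, and the toy data are assumptions made for illustration: protected indices stand in for the original dataset instances that must never be deleted.

```python
from collections import Counter
import math


def enn_filter(points, labels, k=3, protected=frozenset()):
    """Return the indices kept by Edited Nearest Neighbours.

    An instance is dropped when its label disagrees with the majority
    label of its k nearest neighbours, unless its index is protected.
    """
    kept = []
    for i, (p, y) in enumerate(zip(points, labels)):
        if i in protected:
            kept.append(i)  # original instances are never deleted
            continue
        # Indices of the k closest other instances (Euclidean distance).
        neighbours = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )[:k]
        majority = Counter(labels[j] for j in neighbours).most_common(1)[0][0]
        if majority == y:
            kept.append(i)
    return kept


# Toy data: index 3 plays the role of an original labelled instance
# that sits inside the other class's cluster.
points = [(0, 0), (0, 1), (1, 0), (0.4, 0.4), (5, 5), (5, 6), (6, 5)]
labels = [0, 0, 0, 1, 1, 1, 1]

with_deletion = enn_filter(points, labels)
without_deletion = enn_filter(points, labels, protected={3})
```

Here `with_deletion` loses index 3, while `without_deletion` keeps all seven instances, which mirrors the "with deletion" and "without deletion" variants compared above.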

@dpuenteramirez
Owner Author

Nevertheless, both the "Banana" and the "Iris" datasets have been used as tests to check that all the code works correctly. I am currently running the experiments on all the datasets we have to get a wider range of results.
