Inconsistency in meal price outlier classifier
#489
Comments
Arguably this example does not show what you claim it does.
1. Average value for this venue is R$ 13
Using Jarbas's
In [1]: import statistics
In [2]: values = tuple(r.total_net_value for r in Reimbursement.objects.filter(cnpj_cpf='05467695000130'))
In [3]: sum(values) / len(values)
Out[3]: Decimal('13.22262773722627737226277372')
2. Standard deviation is R$ 6
Also, the standard deviation is around R$ 6.66 (hey devil 😈):
In [4]: statistics.stdev(values)
Out[4]: Decimal('6.655636628195527505758211117')
3. Thus R$ 34 happens to be above the threshold
Thus the threshold is R$ 33.19, below the example value of R$ 34.05:
In [5]: (sum(values) / len(values)) + (3 * statistics.stdev(values))
Out[5]: Decimal('33.18953762181285988953740707')
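For readers without access to the Jarbas database, the same check can be reproduced from the two statistics reported in the session above. This is only a sketch: the helper names `outlier_threshold` and `is_outlier` are hypothetical, and the `k=3` multiplier simply mirrors the expression in `In [5]`, not necessarily Rosie's actual implementation.

```python
from decimal import Decimal

# Statistics copied from the session above (venue 05467695000130).
MEAN = Decimal("13.22262773722627737226277372")
STDEV = Decimal("6.655636628195527505758211117")


def outlier_threshold(mean, stdev, k=3):
    """Return mean + k * stdev (hypothetical helper, k=3 as in In [5])."""
    return mean + k * stdev


def is_outlier(value, mean, stdev, k=3):
    """True when the value lies strictly above the threshold."""
    return value > outlier_threshold(mean, stdev, k)


threshold = outlier_threshold(MEAN, STDEV)
print(round(threshold, 2))                         # threshold rounded to cents
print(is_outlier(Decimal("34.05"), MEAN, STDEV))   # the value cited in this comment
```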
Also we can check the (arguably) low values in Jarbas: https://jarbas.serenata.ai/dashboard/chamber_of_deputies/reimbursement/?q=05467695000130
What I mean is… Rosie has good accuracy, but 100% is impossible. This example seems more like a false positive than a bug. Sure, we can learn from this example and improve the classifier, let's say, by asking it to only consider venues with averages greater than a certain minimum limit ; )
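The idea of only considering venues whose average is above some minimum could be sketched roughly as follows. This is not Rosie's actual classifier code: the function name `is_meal_price_outlier`, the `min_mean` cutoff of R$ 20 and the sample values are all hypothetical, chosen only to illustrate the guard.

```python
import statistics


def is_meal_price_outlier(value, venue_values, min_mean=20.0, k=3):
    """Flag `value` only when the venue's average is at least `min_mean`
    (hypothetical cutoff) and `value` exceeds mean + k * stdev."""
    if len(venue_values) < 2:
        return False  # statistics.stdev needs at least two data points
    mean = statistics.mean(venue_values)
    if mean < min_mean:
        return False  # cheap venue: skip it instead of flagging small amounts
    return value > mean + k * statistics.stdev(venue_values)


# Made-up sample data, not taken from Jarbas.
cheap_venue = [10.0, 12.0, 11.0, 13.0, 9.0, 14.0]
pricey_venue = [80.0, 90.0, 85.0, 95.0, 100.0, 90.0]

print(is_meal_price_outlier(34.05, cheap_venue))               # guarded by min_mean
print(is_meal_price_outlier(34.05, cheap_venue, min_mean=0))   # guard disabled
print(is_meal_price_outlier(150.0, pricey_venue))
```

With the guard disabled (`min_mean=0`), the cheap venue's R$ 34.05 expense is flagged again, which is the behaviour discussed in this issue.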
Does the suggestion made in the issue's discussion, about only considering values greater than a certain minimum, look like this tiny change? I'm afraid I'm missing the big picture.
What is the problem?
In some tweets posted by Rosie's Twitter account, the value flagged as suspicious appears to be within the standard deviation range established by the classifier.
As can be verified in the following suspicion:
Suspicions Tweet
Jarbas Document
In this case the value is only 34.50 BRL.
How can this be addressed?
I think it is necessary to adjust the classifier rules or improve the training set.