Feature/local outlier factor #164

pablomm · 2019-09-12T12:43:42Z

Created LocalOutlierFactor, which wraps the scikit-learn multivariate version.
Added a gallery example of outlier detection.
Added a new real dataset used in the example (fetch_octane).
Added tests and doctests.

codecov · 2019-09-12T12:51:48Z

Codecov Report

Merging #164 into develop will increase coverage by 0.32%.
The diff coverage is 100%.

@@             Coverage Diff             @@
##           develop     #164      +/-   ##
===========================================
+ Coverage    72.74%   73.06%   +0.32%     
===========================================
  Files           40       41       +1     
  Lines         3900     3947      +47     
===========================================
+ Hits          2837     2884      +47     
  Misses        1063     1063

| Impacted Files | Coverage Δ | |
|---|---|---|
| skfda/_neighbors/base.py | `100% <ø> (ø)` | ⬆️ |
| skfda/_neighbors/regression.py | `100% <ø> (ø)` | ⬆️ |
| skfda/_neighbors/classification.py | `100% <ø> (ø)` | ⬆️ |
| skfda/_neighbors/unsupervised.py | `100% <ø> (ø)` | ⬆️ |
| skfda/_neighbors/outlier.py | `100% <100%> (ø)` | |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e81577b...1640735. Read the comment docs.

vnmabus

We have discussed this method and we are not sure whether it works at the population level in a functional data context (it will probably work, but no theory has been published). Thus, we will keep this PR "on hold" until we have a better understanding of the theory. If you have more information about the use of this method in FDA, please tell us.

vnmabus · 2019-09-16T15:37:19Z

docs/modules/exploratory/outliers.rst

+Local Outlier Factor
+--------------------
+


Brief explanation missing.

pablomm · 2019-09-19T21:38:39Z

> We have discussed this method and we are not sure whether it works at the population level in a functional data context (it will probably work, but no theory has been published). Thus, we will keep this PR "on hold" until we have a better understanding of the theory. If you have more information about the use of this method in FDA, please tell us.

I didn't do much research. I read the original paper and then saw others in which the method was applied to time series. Since it is based on proximity, I extended it in the same way as the rest of the k-NN estimators, which worked for the FDA case.

Apart from that, and the tests I ran on different datasets with outliers, comparing the results with the other methods we have implemented, I did not investigate further.
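
For readers following the thread, the proximity-based extension described above can be sketched roughly as follows. This is not the code from this PR: it is a minimal pure-Python illustration that assumes the curves are sampled on a common grid and compared with an L2 distance approximated by the trapezoidal rule. The names `l2_distance` and `local_outlier_factor` are hypothetical, and ties at the k-th neighbour are resolved by simply keeping the first `k` curves, which is a simplification of the original definition.

```python
import math


def l2_distance(f, g, grid):
    """Approximate the L2 distance between two sampled curves (trapezoidal rule)."""
    sq = [(a - b) ** 2 for a, b in zip(f, g)]
    integral = sum(
        0.5 * (sq[i] + sq[i + 1]) * (grid[i + 1] - grid[i])
        for i in range(len(grid) - 1)
    )
    return math.sqrt(integral)


def local_outlier_factor(curves, grid, k):
    """Return one LOF score per curve; scores well above 1 suggest outliers.

    Assumes there are more than k curves and no duplicated curves
    (duplicates could make a reachability sum equal to zero).
    """
    n = len(curves)
    dist = [[l2_distance(curves[i], curves[j], grid) for j in range(n)]
            for i in range(n)]

    # k nearest neighbours of each curve (itself excluded) and its k-distance.
    neighbors, k_dist = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        neighbors.append(order[:k])
        k_dist.append(dist[i][order[k - 1]])

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], dist[i][j]) for j in neighbors[i]]
        lrd.append(len(reach) / sum(reach))

    # LOF: mean ratio between the neighbours' densities and the curve's own.
    return [sum(lrd[j] for j in neighbors[i]) / (k * lrd[i]) for i in range(n)]


# Toy data (made up for illustration): five similar curves and one shifted curve.
grid = [i / 20 for i in range(21)]
offsets = [-0.2, -0.1, 0.0, 0.1, 0.2, 3.0]
curves = [[math.sin(2 * math.pi * t) + c for t in grid] for c in offsets]

scores = local_outlier_factor(curves, grid, k=2)
print(scores)  # the last (shifted) curve receives the largest score
```

Only pairwise distances enter the computation, which is why the same construction carries over from vectors to curves at the implementation level. Whether the resulting scores keep their usual interpretation for functional data is the open question discussed in this thread.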

vnmabus · 2019-09-19T22:22:33Z

The thing is that one theoretical motivation for nearest neighbors methods is the estimation of probability densities, which do not exist in functional data. I have been told that the other nearest neighbors methods (and probably this one) can be given a more rigorous foundation in FDA based on Radon-Nikodym derivatives, which sometimes do exist in FDA and can be seen as the equivalent of a quotient of densities. But the fact is that no one has tried to extend the local outlier factor to FDA so far, so I prefer to err on the side of caution.
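
For context on why densities come up, the multivariate local outlier factor can be written as below. This follows the standard definition from the original LOF paper; the notation is chosen here for illustration, with $d$ a distance, $N_k(x)$ the set of $k$ nearest neighbours of $x$, and $k\text{-dist}(y)$ the distance from $y$ to its $k$-th nearest neighbour.

```latex
\begin{align*}
\operatorname{reach-dist}_k(x, y) &= \max\{\, k\text{-dist}(y),\; d(x, y) \,\}, \\
\operatorname{lrd}_k(x) &= \left( \frac{1}{|N_k(x)|} \sum_{y \in N_k(x)} \operatorname{reach-dist}_k(x, y) \right)^{-1}, \\
\operatorname{LOF}_k(x) &= \frac{1}{|N_k(x)|} \sum_{y \in N_k(x)} \frac{\operatorname{lrd}_k(y)}{\operatorname{lrd}_k(x)}.
\end{align*}
```

The quantity $\operatorname{lrd}_k$ (local reachability density) is computed from distances alone, but in finite dimensions it is read as a local density estimate, and $\operatorname{LOF}_k$ as a ratio of such estimates. That reading is the part that has no published counterpart for functional data, which is the concern raised here.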

pablomm · 2019-09-21T11:32:09Z

Okay, I understand your point of view. Keep me informed of progress in this regard.

I understand that this branch will be blocked for quite some time. I want to add some efficiency improvements to the k-NN estimators to finish the work on this module. To keep everything easy to update, I thought I would add the changes without adding this estimator to the documentation, leaving it only in the private neighbors module. What do you think?

vnmabus · 2019-09-21T12:42:52Z

> Okay, I understand your point of view. Keep me informed of progress in this regard.
>
> I understand that this branch will be blocked for quite some time. I want to add some efficiency improvements to the k-NN estimators to finish the work on this module. To keep everything easy to update, I thought I would add the changes without adding this estimator to the documentation, leaving it only in the private neighbors module. What do you think?

Ok, as long as the LocalOutlierFactor object is private. I would also create an issue so that we will not forget about it.

pablomm · 2019-09-21T14:40:43Z

> Ok, as long as the LocalOutlierFactor object is private. I would also create an issue so that we will not forget about it.

Done.


pablomm added 9 commits September 9, 2019 11:12

Local Outlier Factor

e0fdb33

Documentation of Local outlier factor

1997776

Example of LOF and new dataset

399013a

Space in doctest

d80a112

Space in doctest

c0533cb

Format in doctest

8c03cab

Coverage of LocalOutlierFactor

2a6fed5

Format code with autopep8

33bce9a

Change in doctest due to bug fixed

d4def93

pablomm added the enhancement label Sep 12, 2019

pablomm requested a review from vnmabus September 12, 2019 12:43

Merge branch 'develop' into feature/local-outlier-factor

e58e9f0

vnmabus added the "pending theory" label (The theoretical properties of the methods implemented are not yet understood in FDA.) Sep 18, 2019

vnmabus reviewed Sep 18, 2019


pablomm added 2 commits September 21, 2019 15:23

Merge branch 'develop' into feature/local-outlier-factor

f383169

Remove public references to LocalOutlierFactor

1640735

pablomm mentioned this pull request Sep 21, 2019

Local Outlier Factor for functional data #168

Open

vnmabus approved these changes Sep 23, 2019


pablomm merged commit d293e5a into develop Sep 23, 2019

pablomm deleted the feature/local-outlier-factor branch September 23, 2019 15:30

DavidGarciaFer pushed a commit that referenced this pull request Jun 21, 2020

Merge pull request #164 from GAA-UAM/feature/local-outlier-factor

cd5bae4

Feature/local outlier factor
