Add GeometricMeanRarity #7

jerome-qn · 2022-06-16T17:42:00Z

TODO: override aggregate calculations with vectorized methods

src/openrarity/scoring/base.py

src/openrarity/scoring/geometric_mean.py

theelderbeever · 2022-06-16T20:22:06Z

The poetry.lock file shouldn't be ignored. It pins the same library versions for everyone and saves the dependency resolver from having to run.

pyproject.toml

src/openrarity/scoring/base.py

impreso · 2022-06-16T23:14:10Z

src/openrarity/scoring/base.py

+@dataclass
+class BaseRarityFormula:
+
+    formula_name: str


Two suggestions: 1) create an enum to provide explicit formula names; 2) why do we need a formula id if we have a name?

we don't know what the names will be, a priori

gp
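One way to reconcile the two positions above, sketched under assumptions: known names live in an enum, while the dataclass field stays a plain `str` so that names not known in advance are still accepted. `FormulaName` and its members are hypothetical and not part of this PR.

```python
from dataclasses import dataclass
from enum import Enum


class FormulaName(str, Enum):
    """Hypothetical registry of known formula names (illustrative only)."""

    GEOMETRIC_MEAN = "geometric_mean"
    HARMONIC_MEAN = "harmonic_mean"


@dataclass
class BaseRarityFormula:
    # A plain str keeps the field open to names that are not known a priori.
    formula_name: str


# Known formulas take their name from the enum; ad hoc ones pass any string.
formula = BaseRarityFormula(formula_name=FormulaName.GEOMETRIC_MEAN.value)
custom = BaseRarityFormula(formula_name="my_experimental_formula")
print(formula.formula_name)
```

The trade-off is that the enum only documents the built-in names; it does not stop a caller from passing an arbitrary string.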

impreso · 2022-06-16T23:15:35Z

src/openrarity/scoring/base.py

+    def score_token(token: Token) -> float:
+        raise NotImplementedError
+
+    # base aggregate scorers: can override more efficient methods


As discussed offline, can we create an efficient computation abstraction (a type alias should be enough) to perform batch computations on NumPy arrays?

gp, we will create another class to batch computations in a future PR
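A minimal sketch of what such a batch abstraction might look like, assuming each token is represented by a row of trait frequencies and that rarity is taken as `1 / frequency`. The alias `FrequencyMatrix` and the function `geometric_mean_scores` are hypothetical names, not the API planned for the future PR.

```python
import numpy as np
from numpy.typing import NDArray

# Hypothetical alias: rows are tokens, columns are per-attribute trait frequencies.
FrequencyMatrix = NDArray[np.float64]


def geometric_mean_scores(frequencies: FrequencyMatrix) -> NDArray[np.float64]:
    """Return the geometric mean of 1 / frequency for every row at once."""
    # Averaging in log space avoids underflow when many small values are multiplied.
    log_rarity = -np.log(frequencies)
    return np.exp(log_rarity.mean(axis=1))


# Two tokens with two attributes each.
batch = np.array([[0.5, 0.5], [0.25, 1.0]])
scores = geometric_mean_scores(batch)
```

Scoring the whole collection in one array operation replaces a Python-level loop over tokens, which is the main reason to override the per-token aggregate methods.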

src/openrarity/scoring/geometric_mean.py

impreso · 2022-06-16T23:18:49Z

src/openrarity/scoring/geometric_mean.py

+    -- equivalent to the nth power of "statistical rarity"
+    '''
+
+    def score_token(self, token: Token) -> float:


As a rule of thumb, we should strive for 90% test coverage in the library; quality is very important for keeping the computation consistent.

will write tests in future PR

ok, makes sense
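To illustrate the kind of unit test the thread above asks for, here is a sketch that checks a geometric-mean scorer against a hand-computed value. The geometric mean of n values is the nth root of their product. The `Token` class below is a hypothetical stand-in with a single `trait_frequencies` field, not the library's actual `Token`, and treating rarity as `1 / frequency` is an assumption.

```python
import math
from dataclasses import dataclass


@dataclass
class Token:
    # Hypothetical stand-in: share of the collection that has each of the token's traits.
    trait_frequencies: list[float]


def score_token(token: Token) -> float:
    """Geometric mean of 1 / frequency across the token's traits."""
    rarities = [1.0 / f for f in token.trait_frequencies]
    return math.prod(rarities) ** (1.0 / len(rarities))


def test_score_token_matches_hand_computation() -> None:
    # Rarities are 2 and 8; their product is 16, and its square root is 4.
    assert math.isclose(score_token(Token([0.5, 0.125])), 4.0)


test_score_token_matches_hand_computation()
```

Small fixed inputs like these make it easy to verify the expected score by hand, which is what keeps the computation consistent across refactors.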

Update geometric_mean.py

src/openrarity/scoring/geometric_mean.py

src/openrarity/scoring/harmonic_mean.py

src/openrarity/scoring/distance_metric.py

impreso · 2022-06-18T18:31:53Z

poetry.lock

@@ -0,0 +1,296 @@
+[[package]]


Why do we need poetry.lock again?

impreso

Nice progress! Some refactoring is needed, then this is good to go (assuming we will write more unit tests for the calculation logic).

impreso

LGTM for now; we need to refactor the class-level documentation and add unit tests to make sure the computation is correct.

dadashi

lgtm, good progress.

jerome-qn added 3 commits June 16, 2022 10:40

Add GeometricMeanRarity

7952ac2

add poetry.lock to gitnore

6f20d33

read string_attributes dict

c3a43fc

theelderbeever reviewed Jun 16, 2022

src/openrarity/scoring/base.py

theelderbeever reviewed Jun 16, 2022

src/openrarity/scoring/geometric_mean.py