Add features #32
Please run |
0bbcf20 to 15331ab
        m_std = np.std(m, ddof=1)
        m_new = np.cumsum(m - m_mean)
        result = m_new / (len(m) * m_std)
        return max(result) - min(result)
Replace with np.ptp
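A minimal sketch of this suggestion, assuming `result` is a 1-D NumPy array as in the hunk above. The sample magnitudes are made up for illustration only.

```python
import numpy as np

# Hypothetical magnitudes, used only to illustrate the suggestion.
m = np.array([1.0, 2.5, 0.5, 3.0, 2.0])

m_mean = np.mean(m)
m_std = np.std(m, ddof=1)
result = np.cumsum(m - m_mean) / (len(m) * m_std)

# Built-in max/min iterate over the array element by element in Python.
span_builtin = max(result) - min(result)

# np.ptp ("peak to peak") returns max - min in one vectorized call.
span_ptp = np.ptp(result)

assert np.isclose(span_builtin, span_ptp)
```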
class Eta(BaseFeature):
    def __call__(self, t, m, sigma=None, sorted=None, fill_value=None):
        n = len(m)
        m_std = np.std(m, ddof=1) ** 2
np.var
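A short check that the two spellings agree, assuming the intent is the sample variance with `ddof=1`. The input array is hypothetical.

```python
import numpy as np

# Hypothetical magnitudes, used only for the comparison.
m = np.array([1.0, 2.5, 0.5, 3.0, 2.0])

# Current spelling: square the sample standard deviation.
var_from_std = np.std(m, ddof=1) ** 2

# Suggested spelling: ask for the sample variance directly.
var_direct = np.var(m, ddof=1)

assert np.isclose(var_from_std, var_direct)
```

Where the fourth power of the standard deviation is needed, `np.var(m, ddof=1) ** 2` would be the equivalent expression.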
    def __call__(self, t, m, sigma=None, sorted=None, fill_value=None):
        n = len(m)
        m_std = np.std(m, ddof=1) ** 2
        m_sum = sum([(m[i + 1] - m[i]) ** 2 for i in range(n - 1)])
np.sum((m[1:] - m[:-1]) ** 2)
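For reference, a sketch comparing the list comprehension with the sliced form, assuming `m` is a 1-D NumPy array. `np.diff` is shown as one more equivalent spelling; the data are hypothetical.

```python
import numpy as np

# Hypothetical magnitudes; successive differences are 1.5, -2.0, 2.5, -1.0.
m = np.array([1.0, 2.5, 0.5, 3.0, 2.0])
n = len(m)

# Original form: Python-level loop over indices.
loop_sum = sum([(m[i + 1] - m[i]) ** 2 for i in range(n - 1)])

# Suggested form: subtract shifted slices, then reduce with np.sum.
sliced_sum = np.sum((m[1:] - m[:-1]) ** 2)

# np.diff computes the same successive differences.
diff_sum = np.sum(np.diff(m) ** 2)

assert np.isclose(loop_sum, sliced_sum)
assert np.isclose(sliced_sum, diff_sum)
```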
    def __call__(self, t, m, sigma=None, sorted=None, fill_value=None):
        m_mean = np.mean(m)
        d_mean = np.mean(np.power(sigma, 2))
        m_std = np.std(m, ddof=1)
np.var
    def __call__(self, t, m, sigma=None, sorted=None, fill_value=None):
        n = len(m)
        m_mean = np.mean(m)
        m_st = np.std(m, ddof=1) ** 4
np.var
from ._base import BaseFeature


class Kurtosis(BaseFeature):
scipy.stats?
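A sketch of what this could look like. The full formula used by `Kurtosis` is not visible in this hunk, so the manual expression below is an assumption: the bias-corrected sample excess kurtosis (often written G2). Under that assumption, `scipy.stats.kurtosis` with `fisher=True, bias=False` gives the same number.

```python
import numpy as np
from scipy import stats

# Hypothetical magnitudes; at least 4 points are needed for this estimator.
m = np.array([1.0, 2.5, 0.5, 3.0, 2.0, 4.5])
n = len(m)

# Assumed manual formula: bias-corrected sample excess kurtosis (G2).
d = m - np.mean(m)
var = np.var(m, ddof=1)
g2_manual = (
    n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * np.sum(d ** 4) / var ** 2
    - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
)

# SciPy equivalent: excess (Fisher) kurtosis with bias correction.
g2_scipy = stats.kurtosis(m, fisher=True, bias=False)

assert np.isclose(g2_manual, g2_scipy)
```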
        m_span = [m[i + 1] - m[i] for i in range(len(m) - 1)]
        t_span = [t[i + 1] - t[i] for i in range(len(t) - 1)]
        div = [abs(i / j) for i, j in zip(m_span, t_span)]
vectorize
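One possible vectorized form, assuming `t` and `m` are 1-D NumPy arrays of equal length with strictly increasing `t`. The sample values are hypothetical.

```python
import numpy as np

# Hypothetical observation times and magnitudes.
t = np.array([0.0, 1.0, 3.0, 4.0, 6.0])
m = np.array([1.0, 2.5, 0.5, 3.0, 2.0])

# Original form: three Python-level list comprehensions.
m_span = [m[i + 1] - m[i] for i in range(len(m) - 1)]
t_span = [t[i + 1] - t[i] for i in range(len(t) - 1)]
div_loop = [abs(i / j) for i, j in zip(m_span, t_span)]

# Vectorized form: absolute slope between consecutive observations.
div_vec = np.abs(np.diff(m) / np.diff(t))

assert np.allclose(div_loop, div_vec)
```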
from ._base import BaseFeature


class Skew(BaseFeature):
?scipy.stats
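A sketch of the SciPy route. The body of `Skew` is not shown in this hunk, so the manual expression is an assumption: the bias-corrected sample skewness (often written G1). Under that assumption, `scipy.stats.skew` with `bias=False` returns the same value.

```python
import numpy as np
from scipy import stats

# Hypothetical magnitudes; at least 3 points are needed for this estimator.
m = np.array([1.0, 2.5, 0.5, 3.0, 2.0, 4.5])
n = len(m)

# Assumed manual formula: bias-corrected sample skewness (G1).
d = m - np.mean(m)
s = np.std(m, ddof=1)
g1_manual = n / ((n - 1) * (n - 2)) * np.sum(d ** 3) / s ** 3

# SciPy equivalent with bias correction enabled.
g1_scipy = stats.skew(m, bias=False)

assert np.isclose(g1_manual, g1_scipy)
```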
class WeightedMean(BaseFeature):
    def __call__(self, t, m, sigma=None, sorted=None, fill_value=None):
        return np.average(m, weights=np.power(sigma, 2))
The exponent should be -2: np.power(sigma, -2)
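A small illustration of why the sign of the exponent matters, assuming `sigma` holds per-point standard deviations stored as floats. The two-point data set is hypothetical.

```python
import numpy as np

# Hypothetical data: the first point is three times more precise than the second.
m = np.array([1.0, 3.0])
sigma = np.array([1.0, 3.0])

# Weights sigma**2 favour the noisiest point: (1*1 + 3*9) / 10 = 2.8.
mean_sigma_squared = np.average(m, weights=np.power(sigma, 2))

# Inverse-variance weights favour the most precise point:
# (1*1 + 3/9) / (1 + 1/9) = 1.2.
mean_inverse_variance = np.average(m, weights=np.power(sigma, -2))

assert np.isclose(mean_sigma_squared, 2.8)
assert np.isclose(mean_inverse_variance, 1.2)
```

Note that NumPy refuses negative integer powers of integer arrays, so an integer `sigma` would need a cast to float first.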
Check style test failed, can you, please, |
Some randomized, statistics-motivated tests could probably be added. For example, we can test AndersonDarlingNormal using a sample drawn from a normal distribution, and test ReducedChi2 with a random vector drawn from a multivariate normal distribution with mu = [x, x, ..., x] and Sigma = diag(sigma_1^2, sigma_2^2, ..., sigma_n^2).
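A sketch of such a test for the reduced chi-squared case. The function `reduced_chi2` is a hypothetical stand-in written for this example (it assumes inverse-variance weights for the mean); it is not the project's `ReducedChi2` class. With independent normal draws around a common mean, the statistic should be close to 1, with a standard deviation of about sqrt(2 / (n - 1)).

```python
import numpy as np


def reduced_chi2(m, sigma):
    """Hypothetical stand-in for the feature under test, not the project's code."""
    m_wmean = np.average(m, weights=sigma ** -2.0)
    return np.sum(((m - m_wmean) / sigma) ** 2) / (len(m) - 1)


# Fixed seed so the test is reproducible.
rng = np.random.default_rng(0)

n = 10_000
x = 5.0                                 # common mean: mu = [x, x, ..., x]
sigma = rng.uniform(0.5, 2.0, size=n)   # non-constant per-point errors
m = rng.normal(loc=x, scale=sigma)      # independent draws: diagonal covariance

chi2 = reduced_chi2(m, sigma)

# sqrt(2 / (n - 1)) is about 0.014 here, so 0.1 is a generous tolerance.
assert abs(chi2 - 1.0) < 0.1
```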
        n = len(m)
        m_wmean = np.average(m, weights=sigma)
        s = ((m - m_wmean) / sigma) ** 2
        return sum(s) / (n - 1)
np.sum is faster, proof:
In [1]: a = np.arange(1024)
In [2]: %timeit sum(a)
275 µs ± 22.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [3]: %timeit np.sum(a)
6.21 µs ± 548 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
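The same comparison can be reproduced outside IPython with the standard `timeit` module. This is only a sketch: absolute timings depend on the machine and NumPy version, so the script prints them rather than asserting on them.

```python
import timeit

import numpy as np

a = np.arange(1024)

# Both calls return the same total; only the speed differs.
assert sum(a) == np.sum(a)

# Total time in seconds for 1000 calls of each variant.
t_builtin = timeit.timeit(lambda: sum(a), number=1000)
t_numpy = timeit.timeit(lambda: np.sum(a), number=1000)

print(f"built-in sum: {t_builtin * 1e3:.3f} ms per 1000 calls")
print(f"np.sum:       {t_numpy * 1e3:.3f} ms per 1000 calls")
```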
    feature = ReducedChi2()
    desired = feature(m, m, sigma)
    actual = 10.666667
    assert_allclose(actual, desired)
Please add an additional test with non-constant sigma
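A sketch of what such a test might look like, with an expected value that can be checked by hand. The function `reduced_chi2` is a hypothetical stand-in that assumes inverse-variance weights for the mean; the real test would call the `ReducedChi2` feature instead.

```python
import numpy as np
from numpy.testing import assert_allclose


def reduced_chi2(m, sigma):
    """Hypothetical stand-in for the feature, assuming inverse-variance weights."""
    m_wmean = np.average(m, weights=sigma ** -2.0)
    return np.sum(((m - m_wmean) / sigma) ** 2) / (len(m) - 1)


def test_reduced_chi2_non_constant_sigma():
    m = np.array([1.0, 3.0, 5.0])
    sigma = np.array([1.0, 2.0, 2.0])
    # Weights are 1, 0.25, 0.25, so the weighted mean is 3 / 1.5 = 2.
    # Squared scaled residuals: 1 + 0.25 + 2.25 = 3.5; divided by n - 1 = 2.
    desired = 1.75
    actual = reduced_chi2(m, sigma)
    assert_allclose(actual, desired)


test_reduced_chi2_non_constant_sigma()
```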
class ReducedChi2(BaseFeature):
    def __call__(self, t, m, sigma=None, sorted=None, fill_value=None):
        n = len(m)
        m_wmean = np.average(m, weights=sigma)
weights=sigma ** -2?
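One way to see why inverse-variance weights fit here: the mean weighted by `sigma ** -2` is the value that minimizes the chi-squared sum, so any other choice, including `weights=sigma`, gives a larger statistic. A small sketch with hypothetical data; `chi2_about` is a helper defined only for this example.

```python
import numpy as np

# Hypothetical magnitudes and non-constant errors.
m = np.array([1.0, 3.0, 5.0])
sigma = np.array([1.0, 2.0, 2.0])


def chi2_about(mu):
    """Chi-squared sum of the data about a trial mean mu (example helper)."""
    return np.sum(((m - mu) / sigma) ** 2)


# Current code: weights proportional to sigma, (1 + 6 + 10) / 5 = 3.4.
mean_sigma = np.average(m, weights=sigma)

# Suggested: inverse-variance weights, (1 + 0.75 + 1.25) / 1.5 = 2.0.
mean_inv_var = np.average(m, weights=sigma ** -2)

# The inverse-variance mean gives the smaller chi-squared sum (3.5 vs 6.44).
assert chi2_about(mean_inv_var) < chi2_about(mean_sigma)
```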
add feature_tests.ipynb
New features + tests