Let tsfresh choose the value column if possible and increase test coverage #722

nils-braun · 2020-06-24T16:21:29Z

Sorry for coming late (and even more sorry that we took so long to review your PR). I realized that there was a small API change: before your very nice changes, tsfresh could determine the column_value on its own if it was the only remaining column. I added this back.
While doing this, I tried to increase the test coverage a bit.

Feel free to drop a comment if you want!

github-actions · 2020-06-24T16:22:23Z

You have style errors. See them below.

./tsfresh/feature_extraction/data.py:149:1: E302 expected 2 blank lines, found 1

coveralls · 2020-06-24T16:27:33Z

Coverage increased (+0.1%) to 98.093% when pulling 122adaa on feature/some-small-refactoring into e867d74 on master.

hoesler · 2020-06-24T17:29:19Z

Hi @nils-braun.

Apparently I missed that. Thanks!

But why did you change the value_columns arg of WideTsData from list[str] to str? If I just look at the class in isolation, it doesn't feel right to force users to select either all columns or only one. The current API is reflected in to_tsdata, so I would move the more restrictive selection logic to that function. Users can then circumvent that logic by passing a WideTsData directly instead of data frames.

nils-braun · 2020-06-25T08:54:28Z

I thought that it would make the logic for a None column_value easier, but you are right, in the end both are fine.
I will change it back to your original proposal.

nils-braun · 2020-06-25T09:24:54Z

@hoesler Are you fine with merging this?

nils-braun · 2020-07-15T20:21:10Z

I assume you are :-) If not, just comment and we can change it back!

github-actions · 2020-07-15T20:27:55Z

Result of Benchmark Tests

| Benchmark | Min | Max | Mean | Mean on Repo `HEAD` |
| --- | --- | --- | --- | --- |
| tests/benchmark.py::test_benchmark_small_data | 15.65 | 15.81 | 15.73 +- 0.07 | 12.46 +- 0.05 |
| tests/benchmark.py::test_benchmark_large_data | 6.27 | 6.50 | 6.40 +- 0.10 | 6.00 +- 0.02 |
| tests/benchmark.py::test_benchmark_with_selection | 8.69 | 8.87 | 8.75 +- 0.08 | 8.13 +- 0.05 |

hoesler · 2020-06-25T10:18:26Z

tsfresh/feature_extraction/data.py

@@ -164,17 +172,16 @@ def __init__(self, df, column_id, column_sort=None, value_columns=None):
        :type column_sort: str|None

        :param value_columns: list of column names to treat as time series values.
-            If `None`, all columns except `column_id` and `column_sort` will be used.
+            If `None` or empty, all columns except `column_id` and `column_sort` will be used.


I think you don't actually handle the empty case, do you?

nils-braun · 2020-07-18

Nice catch! Thanks
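
For illustration, a minimal sketch of what handling both `None` and an empty list could look like. The helper `resolve_value_columns` and its `all_columns` argument are hypothetical and operate on plain column names; this is not the code from the PR.

```python
def resolve_value_columns(all_columns, column_id, column_sort=None, value_columns=None):
    """Hypothetical helper: pick the columns to treat as time series values."""
    # `not value_columns` is true for both None and an empty list,
    # so the docstring's "None or empty" case is covered by one check.
    if not value_columns:
        reserved = {column_id, column_sort}
        return [column for column in all_columns if column not in reserved]
    return list(value_columns)
```

With this check, passing `value_columns=[]` behaves the same as passing `None`, which is what the changed docstring line promises.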

hoesler · 2020-06-25T10:25:20Z

tsfresh/feature_extraction/data.py

+        if column_kind is None:
+            raise ValueError("A value for column_kind needs to be supplied")
+
+        if column_value is None:


I think this should be added to the docs and the column_value arg should be optional.

hoesler · 2020-06-25T10:29:05Z

tsfresh/feature_extraction/data.py

+        if column_value is None:
+            possible_value_columns = _get_value_columns(df, column_id, column_sort, column_kind)
+            if len(possible_value_columns) != 1:
+                raise ValueError("Could not guess the value column! Please hand it to the function as an argument.")


Maybe the message could be more specific and also include possible_value_columns.
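
One way the suggestion could look, as a sketch only: the function `guess_value_column` and its `all_columns` argument are made up for this example and work on plain column names rather than a data frame. The wording of the message is an assumption as well.

```python
def guess_value_column(all_columns, column_id, column_sort=None, column_kind=None):
    """Hypothetical helper: return the single remaining column or fail loudly."""
    reserved = {column_id, column_sort, column_kind}
    candidates = [column for column in all_columns if column not in reserved]
    if len(candidates) != 1:
        # Listing the candidates tells the user why the guess failed.
        raise ValueError(
            "Could not guess the value column: expected exactly one candidate "
            "column but found {}. Please pass column_value explicitly.".format(candidates)
        )
    return candidates[0]
```

The error then shows whether there were no candidates at all or too many of them.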

hoesler · 2020-06-25T10:30:12Z

tsfresh/feature_extraction/data.py

@@ -291,7 +307,7 @@ def __len__(self):
        return sum(grouped_df.ngroups for grouped_df in self.grouped_dict.values())


-def to_tsdata(df, column_id=None, column_kind=None, column_value=None, column_sort=None):


I made column_id optional, because you can pass a TsData object
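
A rough sketch of that reasoning, with stand-in names (`TsDataSketch`, `to_tsdata_sketch`) that are not part of tsfresh: an already prepared object is returned unchanged, so `column_id` only needs to be checked for raw input.

```python
class TsDataSketch:
    """Stand-in for an already prepared time series container (hypothetical)."""

    def __init__(self, rows, column_id):
        self.rows = rows
        self.column_id = column_id


def to_tsdata_sketch(data, column_id=None):
    """Hypothetical dispatcher: pass prepared containers through untouched."""
    if isinstance(data, TsDataSketch):
        # Nothing to select here, so column_id can stay None.
        return data
    if column_id is None:
        raise ValueError("column_id is required unless a prepared container is passed")
    return TsDataSketch(data, column_id)
```

Keeping `column_id=None` in the signature and validating it only on the raw-input path lets both call styles share one entry point.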

hoesler · 2020-07-16T07:14:38Z

Damn. I started a review a long while ago, but forgot to actually submit it. Sorry! Here are some small comments I had.

nils-braun · 2020-07-16T16:55:44Z

Ok, thanks! I will work on the issues and open a new PR!

nils-braun added 2 commits June 24, 2020 15:04

Let tsfresh sort out the column_value if possible

3237f2f

Increase test coverage in ts_data

88069a9

pep8ify

39ae3a2

Revert the change from value_columns to column_value

7a8fdfe

nils-braun mentioned this pull request Jun 25, 2020

travis to GitHub actions #723

Merged

Fix for older python version

122adaa

nils-braun changed the base branch from master to main July 4, 2020 16:14

nils-braun added 2 commits July 15, 2020 22:22

Merge remote-tracking branch 'origin/main' into feature/some-small-refactoring

4f8ca31

Changelog

bb371e8

nils-braun merged commit 2f8f0fa into main Jul 15, 2020

nils-braun mentioned this pull request Jul 15, 2020

Add RollingWideTsFrameAdapter along with helper functions #731

Closed

hoesler reviewed Jul 16, 2020

nils-braun deleted the feature/some-small-refactoring branch October 17, 2020 15:35
