Python 3 support #8

Closed
jneuff opened this issue Oct 27, 2016 · 14 comments

Comments

@jneuff
Collaborator

jneuff commented Oct 27, 2016

Currently we only support Python 2. In future releases we want to support both Python 2 and 3. This howto outlines the main steps towards Python 3 support.

@MaxBenChrist
Collaborator

see #26

@grantbey

Bummer. Thanks for the reply.

earthgecko added a commit to earthgecko/tsfresh that referenced this issue Oct 31, 2016
@MaxBenChrist
Collaborator

sooner or later we will have that python3 support :)

until then, you could extract the features with a local python2.7 interpreter, pickle the dataframe and then load them into your python3.5 project
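
A minimal sketch of that workaround, assuming both interpreters can read the same file. A plain dict stands in for the DataFrame so the snippet only needs the standard library; with pandas, `DataFrame.to_pickle` and `pandas.read_pickle` play the same roles. The file name and feature values are made up. Protocol 2 is the highest pickle protocol Python 2.7 can write, and Python 3 can read it.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for the extracted feature table.
features = {"id": [1, 2], "a__length": [15, 15]}

path = os.path.join(tempfile.mkdtemp(), "features.pkl")

# Step 1 (would run under Python 2.7): write with protocol 2,
# the highest protocol that Python 2.7 supports.
with open(path, "wb") as fh:
    pickle.dump(features, fh, protocol=2)

# Step 2 (would run under Python 3.5): load the same file.
with open(path, "rb") as fh:
    loaded = pickle.load(fh)
```

Pickles of pandas objects are not guaranteed to load across different pandas versions, so a CSV export may be a safer interchange format if the two environments differ a lot.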

@MaxBenChrist MaxBenChrist self-assigned this Nov 1, 2016
@MaxBenChrist
Collaborator

I will look into this later

@MaxBenChrist MaxBenChrist removed their assignment Nov 1, 2016
@MaxBenChrist
Collaborator

MaxBenChrist commented Nov 1, 2016

I just uploaded the branch "i8_add_python3_support"

On it, I started to make tsfresh runnable under Python 3. Now, all unit tests are passing on Python 2.7. On Python 3.5.1, 14 unit tests are failing.

Maybe I will have time during the next few days to finish the job. Otherwise it would be nice if somebody else could check the changes and get those unit tests to pass.

@earthgecko
Contributor

earthgecko commented Nov 1, 2016

I will take a look. I have to do it for Skyline at some point and I really want to deep dive into what you are up to here, so it may be an effective method for me to start a Python 3 path in my own sphere and get a handle on how you do not run into some of the clustering issues relating to timeseries as with k-means et al.

@jneuff I have read the paper now and dug a bit deeper and I now understand a little more :) I should say hey TPOT -> tsFRESH :)

@MaxBenChrist anybody interested in having a go at porting any bits and pieces to Python 3 can use Python 3.5.2 (latest), unless there is a reason that Python 3.5.1 is required. Silence on the matter shall be read as py352_ok = True; I am sure you are busy.

Nice of blue-yonder and you all to release it, timeseries and ml not being easy and all, this looks like a step :)

@MaxBenChrist
Collaborator

hi @earthgecko

we are happy about anybody that wants to contribute. You could take my "i8_add_python3_support" branch as a starting point.

Where does one find this py352_ok = True flag? I am not familiar with it.

By the way, what are you referring to with TPOT? :)

Max

earthgecko added a commit to earthgecko/tsfresh that referenced this issue Nov 2, 2016
@earthgecko
Contributor

Hi @MaxBenChrist

I have your i8_add_python3_support branch and I am working on that. I will send any changes
as small incremental pull requests against that branch for you.

A question about how to handle Python 3 builtins in a backwards
compatible manner: the use of builtins in tsfresh/feature_selection/feature_selector.py
in the i8_add_python3_support branch is not backwards compatible with 2.7.x as
it stands now, because there is no builtins module in 2.7, and this has ramifications through
other modules.
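
One common way to handle this, sketched here under the assumption that only the module name differs: Python 2 ships the module as `__builtin__`, while Python 3 calls it `builtins` (the `future` package also provides a `builtins` module on Python 2). Whether this fits feature_selector.py as it stands on the branch is not verified here.

```python
try:
    import builtins  # Python 3, or Python 2 with the `future` package installed
except ImportError:
    import __builtin__ as builtins  # plain Python 2.7

# The same name now works on both interpreters.
largest = builtins.max([3, 1, 2])
```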

I shall add some additional detailed info on #30 for consideration.

There is no flag, it was a question :) You are OK with using 3.5.2, there is no specific reason you are using 3.5.1?

TPOT - https://github.com/rhiever/tpot - I initially thought that tsfresh was doing a subset of what TPOT does, but no, TPOT could probably add a FRESH dimension :)

MaxBenChrist pushed a commit that referenced this issue Nov 2, 2016
earthgecko added a commit to earthgecko/tsfresh that referenced this issue Nov 2, 2016
@earthgecko
Contributor

Now down to 5 failing unit tests from 14

The outstanding ones are mostly related to objects having no attribute 'assertItemsEqual' in a number of contexts, and there is a pandas warning:

pandas/computation/expressions.py:182: UserWarning: evaluating in Python space because the '*' operator is not supported by numexpr for the bool dtype, use '&' instead

It shows up in tests/transformers/test_full_pipeline.py along with an AssertionError; they may be related

>       self.assertTrue(some_expected_features.issubset(set(extracted_features.columns)))
E       AssertionError: False is not true

@jphme

jphme commented Nov 2, 2016

Some info on blocking points (was playing with the Python3 branch but unfortunately have no time to go into depth or create a fix myself right now):

The first Quickstart example
extracted_features = extract_features(timeseries, column_id="id", column_sort="time")

yields:

TypeError                                Traceback (most recent call last)
/opt/conda/lib/python3.5/site-packages/tsfresh/utilities/dataframe_functions.py in normalize_input_to_internal_representation(df_or_dict, column_id, column_sort, column_kind, column_value)
    239                 id_and_sort_column = [_f for _f in [column_id, column_sort] if _f]
    240                 kind_to_df_map = {key: df_or_dict[[key] + id_and_sort_column].copy().rename(columns={key: "_value"})
--> 241                                   for key in df_or_dict.columns if key not in id_and_sort_column}
    242 
    243                 #todo: is this the right check?

TypeError: can only concatenate list (not "filter") to list
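
For context on this TypeError, a small sketch with illustrative values: on Python 3, `filter()` returns a lazy iterator instead of a list, so concatenating it to a list fails. A list comprehension behaves the same on both interpreters.

```python
column_id, column_sort = "id", None  # illustrative values

# On Python 3, filter() returns a lazy iterator, so `[key] + filter(...)`
# raises "can only concatenate list (not "filter") to list".
lazy = filter(None, [column_id, column_sort])

# A list comprehension yields a real list on both Python 2 and 3.
id_and_sort_column = [c for c in [column_id, column_sort] if c]

columns = ["a"] + id_and_sort_column
```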


When calling it with column_value="a" you can get around this error, but then we get some numexpr and scipy warnings:

/opt/conda/lib/python3.5/site-packages/pandas/computation/expressions.py:181: UserWarning: evaluating in Python space because the '*' operator is not supported by numexpr for the bool dtype, use '&' instead
  unsupported[op_str]))
/opt/conda/lib/python3.5/site-packages/scipy/signal/spectral.py:772: UserWarning: nperseg = 256, is greater than input length = 15, using nperseg = 15
  'using nperseg = {1:d}'.format(nperseg, x.shape[-1]))

@earthgecko
Contributor

The current py2 py3 tests state in a gist - https://gist.github.com/earthgecko/118d168f88ebb37661154e3cb898c1fb

@jneuff
Collaborator Author

jneuff commented Nov 2, 2016

The method assertItemsEqual has been removed from unittest.TestCase somewhere along the way to Python 3.5; we'll need to find a replacement with the same semantics.
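
For reference, Python 3.2 renamed `assertItemsEqual` to `assertCountEqual`; both check that two sequences hold the same items regardless of order. A minimal stdlib-only sketch of a version-independent alias follows. The class names and test data are hypothetical, not taken from the tsfresh test suite.

```python
import unittest


class CompatTestCase(unittest.TestCase):
    # Hypothetical base class: expose one method name on both interpreters.
    if not hasattr(unittest.TestCase, "assertCountEqual"):
        def assertCountEqual(self, first, second, msg=None):
            # Python 2.7 only knows the old name.
            return self.assertItemsEqual(first, second, msg)


class ColumnTest(CompatTestCase):
    def test_same_items_any_order(self):
        self.assertCountEqual(["b", "a"], ["a", "b"])


suite = unittest.defaultTestLoader.loadTestsFromTestCase(ColumnTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```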

earthgecko added a commit to earthgecko/tsfresh that referenced this issue Nov 3, 2016
Due to a change in unittest, as identified by @jneuff in
blue-yonder#8 (comment)

Semantically they appear to be the same and this fixes the related failing
tests on Python 3.5 as described in the gist in
blue-yonder#8 (comment)

Adds a basic method to determine python version for now, only committing so that
the new deeper unittest.assertEqual issue that now presents itself can be
addressed.
@earthgecko
Contributor

@jneuff yes! Semantically they appear to be the same, and the related failing tests pass \o/

However, fixing that just lets the next unittest.assertEqual issue raise its head. It seems that assertEqual has changed in py3 as well, and that may go a bit deeper :( One step at a time :)

assertEqual change

Current debug

        # Preserve old features
>       self.assertEqual(list(X_transformed.columns), ["feature_1", "a__length", "b__length"])
E       AssertionError: Lists differ: ['feature_1', 'b__length', 'a__length'] != ['feature_1', 'a__length', 'b__length']
E
E       First differing element 1:
E       'b__length'
E       'a__length'
E
E       - ['feature_1', 'b__length', 'a__length']
E       + ['feature_1', 'a__length', 'b__length']

tests/transformers/test_feature_augmenter.py:50: AssertionError

assertEqual is used in quite a few places - https://github.com/blue-yonder/tsfresh/search?q=assertEqual&type=Code - and it must be kept in mind that a test with only 2 elements could pass some of the time if the elements are returned in a different order on each run.

E       First differing element 0:
E       'b'
E       'a'
E
E       - ['b', 'a']
E       + ['a', 'b']
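
A likely cause (an assumption, not confirmed in this thread) is that the column order comes from iterating over a dict or set, whose order is not guaranteed on Python 3.5 and can vary between runs. If the order is not part of what the test should check, an order-insensitive comparison avoids the flakiness. A small sketch using the lists from the failing test above:

```python
expected = ["feature_1", "a__length", "b__length"]
# Order observed on Python 3.5 in the failing test above.
observed = ["feature_1", "b__length", "a__length"]

# An ordered comparison fails even though the same columns are present.
same_order = observed == expected

# Sorting both sides first makes the check independent of column order.
same_items = sorted(observed) == sorted(expected)
```

If the order does matter, the code producing the columns would need to sort them explicitly instead.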

MaxBenChrist pushed a commit that referenced this issue Nov 3, 2016
* Changed xrange to range for #8

* Modified csort due to range changing from list to type for #8

* Use assertItemsEqual py2 and assertCountEqual py3
Due to a change in unittest, as identified by @jneuff in
#8 (comment)

Semantically they appear to be the same and this fixes the related failing
tests on Python 3.5 as described in the gist in
#8 (comment)

Adds a basic method to determine python version for now, only committing so that
the new deeper unittest.assertEqual issue that now presents itself can be
addressed.
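
Background for the two `range` changes in this commit, as a small sketch: on Python 3, `range` returns a lazy sequence object rather than a list, and `xrange` no longer exists. Code that needs list behaviour has to wrap the result in `list(...)`. The variable names are illustrative and not taken from tsfresh.

```python
# range() is a list on Python 2 but a lazy range object on Python 3.
indices = range(5)

# Wrapping in list() gives the same result on both interpreters.
index_list = list(indices)
index_list.reverse()  # reverse() is a list-only method

# Plain range() can replace xrange() for iteration on both versions.
total = sum(i for i in range(5))
```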
MaxBenChrist added a commit that referenced this issue Nov 4, 2016
@MaxBenChrist
Collaborator

MaxBenChrist commented Nov 4, 2016

I rewrote those unit tests with the six library.

Some of the unit tests still failed; the reason for that was the bug in #29. I fixed that. Now you should be able to enjoy your fresh features under Python 3 :)

MaxBenChrist added a commit that referenced this issue Nov 4, 2016
* use python3 compatible print function

* use absolute import for py3 support

* in py3 one has to load reduce

* check for basestring in column name

* import range for py3 support

* call list of range iterator

* fixed standard library import

* added builtin support for str, basestring, zip, object

* class FeatureExtractionSettings inherits from baseobject

* call list of iterator objects

* change iter() to iteritems()

* replaced filter by list comprehension

* added future to requirements.txt

* Changed xrange to range for #8 (#30)

* csort range py3 compatible (#32)

* Changed xrange to range for #8

* Modified csort due to range changing from list to type for #8

* Use assertItemsEqual py2 and assertCountEqual py3
Due to a change in unittest, as identified by @jneuff in
#8 (comment)

Semantically they appear to be the same and this fixes the related failing
tests on Python 3.5 as described in the gist in
#8 (comment)

Adds a basic method to determine python version for now, only committing so that
the new deeper unittest.assertEqual issue that now presents itself can be
addressed.

* pinned version of future package

* added six to test-requirements.txt

* use six for assertCountEqual unit testing

* deleted some comments

* use six for assertCountEqual to test feature_augmentor

__dict__ will give iterator for properties

* use six.assertCountEqual

* use assertGreaterEqual to test subset

* parameters have to be sorted in feature name

fixes bug #29, features with multiple, but same parameters could end up
with different names

* added python 3.5.2 to .travis.yml

closes #8
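
To illustrate the fix for #29 described in this commit message: if a feature name is built by iterating over a parameter dict, the same parameters can produce different names depending on iteration order. Sorting the keys first makes the name deterministic. The helper name, the calculator name and the name layout below are hypothetical, not tsfresh's actual implementation.

```python
def feature_name(kind, calculator, params):
    """Build a deterministic feature name (hypothetical layout)."""
    parts = [kind, calculator]
    # Sort by parameter name so dict iteration order cannot affect the result.
    for key in sorted(params):
        parts.append("{}_{}".format(key, params[key]))
    return "__".join(parts)


# Same parameters, given in a different order.
name_1 = feature_name("a", "some_feature", {"w": 2, "coeff": 0})
name_2 = feature_name("a", "some_feature", {"coeff": 0, "w": 2})
```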