Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor parse_time #2408

Merged
merged 4 commits into from May 1, 2018

Conversation

Projects
None yet
5 participants
@vermashresth
Copy link
Contributor

commented Jan 16, 2018

parse_time function is refactored to use singledispatch module
issue #2383

@pep8speaks

This comment has been minimized.

Copy link

commented Jan 16, 2018

Hello @vermashresth! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on May 01, 2018 at 17:09 Hours UTC
@nabobalis

This comment has been minimized.

Copy link
Contributor

commented Jan 16, 2018

@Cadair, you'll need to explain to me the point of this again. We gained alot of code lines to remove some if, elif, else statements.

@Cadair

This comment has been minimized.

Copy link
Member

commented Jan 17, 2018

Thanks for the PR @vermashresth, it appears you have walked into something that could trigger a lot of discussion.

This makes me realise what a mess parse_time is from an API perspective.

I think a good approach might be to not make parse_time the generic function, but to write a convert_time generic function that is called from within parse_time this would allow you to do things like filter out format and 'now' before executing the singledispatch function, which would probably make things a lot cleaner.

@vermashresth

This comment has been minimized.

Copy link
Contributor Author

commented Jan 17, 2018

@Cadair Sure! I will try that. Can you please tell how to skip build testing for python 2.x because it will always fail?

@Cadair

This comment has been minimized.

Copy link
Member

commented Jan 17, 2018

@vermashresth don't worry about it for the moment, the CI is pretty broken due to the last numpy release anyay. Depending on how this PR develops there is a Python 2 backport of singledispatch which is installable through pip, so we could use that.

@vermashresth

This comment has been minimized.

Copy link
Contributor Author

commented Jan 17, 2018

@Cadair If I first filter out by time_format and then use single dispatch like this

def parse_time(time_string, time_format='', **kwargs):
    if time_format == 'datetime':
        convert_time.dispatch(datetime)(time_string, time_format, **kwargs)
    elif time_format == 'utime':
        convert_time.dispatch(float)(time_string, time_format, **kwargs)
    elif time_string is 'now':
        return datetime.utcnow()
    else:
        convert_time(time_string, time_format, **kwargs)

wouldnt it give wrong output for cases like for example if user has provided a pandas.Timestamp, but also given time_format as datetime, this will return the output as just time_string while it should have given a time_string.to_pydatetime() because it's giving more priority to filtering by time_string.

@Cadair

This comment has been minimized.

Copy link
Member

commented Jan 17, 2018

@vermashresth having just gone on a git blame time travel excursion through the history of time_format='datetime' I am completely none the wiser. I propose you remove the ability to set time_format to 'datetime' as I can se no utility in it at all.

It looks to me that if time_format=utime then you just need to convert time_string to a float and fire it off through the dispatch.

So I think that gets us to:

    if time_format == 'utime':
        convert_time(float(time_string))
    elif time_string is 'now':
        return datetime.utcnow()
    else:
        convert_time(time_string)

I don't think you need to worry about passing anything other than time_string through the dispatch, unless I have missed something in the current implementation of parse_time.

@vermashresth

This comment has been minimized.

Copy link
Contributor Author

commented Jan 17, 2018

@Cadair Yes! just noticed that conversation too. (Trying out IRC for the first time)
Regarding the parameters, the last else case when no conditions match, it is using time_format (for regex parsing) and kwargs, so I guess we will need them.
here this part

else:
        # remove trailing zeros and the final dot to allow any
        # number of zeros. This solves issue #289
        if '.' in time_string:
            time_string = time_string.rstrip("0").rstrip(".")
        for time_format in TIME_FORMAT_LIST:
            try:
                try:
                    ts, time_delta = _regex_parse_time(time_string,
                                                       time_format)
                except TypeError:
                    break
                if ts is None:
                    continue
                return datetime.strptime(ts, time_format) + time_delta
            except ValueError:
                pass

        time_string_parse_format = kwargs.pop('_time_string_parse_format', None)
        if time_string_parse_format is not None:
            ts, time_delta = _regex_parse_time(time_string,
                                               time_string_parse_format)
            if ts and time_delta:
                return datetime.strptime(ts, time_string_parse_format) + time_delta
            else:
                return datetime.strptime(time_string, time_string_parse_format)
        raise ValueError("'{tstr!s}' is not a valid time string!".format(tstr=time_string))

This will come in the default function definition of convert_time when no type matches.

@Cadair

This comment has been minimized.

Copy link
Member

commented Jan 17, 2018

Oh! That's horrible, because it's not actually the time_format argument it's the string templates from for time_format in TIME_FORMAT_LIST:

Also there is an implicit assumption in that else section of the code, in that the type of the argument is str.

@vermashresth

This comment has been minimized.

Copy link
Contributor Author

commented Jan 17, 2018

Hmm, this is very confusing. At least kwargs are needed then to get the time_string_parse_format, right? And should we let this assumption part be then or rewrite it?

@Cadair

This comment has been minimized.

Copy link
Member

commented Jan 18, 2018

Yeah, I had forgotten about that, I guess you will have to pass kwargs through. I think it would make sense to have that section of the code be in a str dispatch function.

@vermashresth

This comment has been minimized.

Copy link
Contributor Author

commented Jan 18, 2018

Hey! @Cadair just made a PR with the changes, can you please take a look?

elif time_format == 'utime':
parse_time.dispatch(float)(time_string, time_format='', **kwargs)
if time_format == 'utime':
return convert_time.dispatch(float)(time_string, **kwargs)

This comment has been minimized.

Copy link
@Cadair

Cadair Jan 18, 2018

Member

instead of doing this it should be convert_time(float(time_string), **kwargs) because if it isn't convertable to a float it's going to error anyway.

This comment has been minimized.

Copy link
@vermashresth

vermashresth Jan 18, 2018

Author Contributor

Okay, will do this!

@Cadair

This comment has been minimized.

Copy link
Member

commented Jan 18, 2018

I think from a styleistic point of view I would rather have all the type specific implementations have descriptive names. convert_time_pandas or convert_time_astropy for instance.

@nabobalis nabobalis added this to the 1.0 milestone Jan 23, 2018


@convert_time.register(float)
@convert_time.register(int)
def convert_time_float(time_string, **kwargs):

This comment has been minimized.

Copy link
@wafels

wafels Jan 24, 2018

Member

What about quantities, like 10 * u.second ?

This comment has been minimized.

Copy link
@Cadair

Cadair Jan 27, 2018

Member

I am semi-relucant to make parse_time assume even more things are utime. Given that this PR is only phase 1 in a large redactor (and will not be in 0.9) I propose we leave this as-is for now.

@Cadair

This comment has been minimized.

Copy link
Member

commented Jan 27, 2018

I am happy with this PR as it stands. I don't think it's worth the effort of adding the python 2 single dispatch module as a dependency, so we will be leaving this open until we do the migration to Python 3 only.

This PR is also a great base for anyone who ends up doing the Time GSOC project this summer, or whatever lucky person gets to do that refactor!

@nabobalis nabobalis force-pushed the vermashresth:issue-2383 branch from c66d6e3 to 92dbce2 May 1, 2018

@nabobalis
Copy link
Contributor

left a comment

WIP of parse_time rewrite, needs to be merged for GSoC.

@Cadair

Cadair approved these changes May 1, 2018

@Cadair Cadair merged commit b6830e1 into sunpy:master May 1, 2018

9 checks passed

Giles Click details to preview the documentation build
Details
ci/circleci: egg-info-27 Your tests passed on CircleCI!
Details
ci/circleci: egg-info-35 Your tests passed on CircleCI!
Details
ci/circleci: egg-info-36 Your tests passed on CircleCI!
Details
ci/circleci: html-docs Your tests passed on CircleCI!
Details
codecov/patch 90.38% of diff hit (target 82.48%)
Details
codecov/project Absolute coverage decreased by -0.49% but relative coverage increased by +7.9% compared to ecfa9a8
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.