Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add UnivariateTimeTypeToContinuous transformer to builtins #245

Closed
wants to merge 15 commits into from

Conversation

venuur
Copy link
Contributor

@venuur venuur commented Apr 30, 2020

Implements univariate case of TimeType transformer discussed in #234

I would appreciate feedback, especially recommendations on how to add tests.

"""
UnivariateTimeTypeToContinuous(zero_time=nothing, step=Hour(24))

Convert `Date`, `DateTime`, and `Time` vectors to `Float64` by assuming `0.0` corresponds
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: I suggest "Convert a Date, DateTime, and Time vector to prevent confusion with multi-variate case.

@venuur
Copy link
Contributor Author

venuur commented May 4, 2020

Good feedback. I’ll update the PR soon.

Copy link
Member

@ablaom ablaom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay this is looking pretty solid. Only some minor comments and suggestions for checks to add (bells and whistles!)

Regarding the check in clean! I suggest above: Best to issue a @warn and reset the bad input to something reasonable than to throw an error. Specify in the warning what change is made. See the example at the docs

Re tests, I am sure you can cook up some reasonable tests 😄 Would be good to check:

  • user specified time_step works as well as automatic case

  • check your code tweak for time_step = 24hours in Time setting works ( 5am + 24hrs = 5am mod 24hrs)

  • use @test_throws to check incompatible types at time of fit or transform

  • use @test_logs to check any warnings you implement in clean!

@venuur
Copy link
Contributor Author

venuur commented May 5, 2020

I added the checks and I pushed the new version to the dev branch of venuur/MLJModels.jl. According to github the changes should show up here (though I don't see them yet). I'll be adding tests later in the week.

@ablaom
Copy link
Member

ablaom commented May 5, 2020

According to github the changes should show up here (though I don't see them yet).

That's weird.

This latest commit here is 8881832, which matches the latest commit on your dev branch just now:

Screen Shot 2020-05-06 at 11 25 04 AM

Are you sure they are not there?

@venuur
Copy link
Contributor Author

venuur commented May 6, 2020

When I look at the commits tab up top, I see the commits in the attached screenshot.
3BE41D16-B29F-4C75-9551-38F5D34B2A21

The date doesn’t match what you are showing.

@ablaom
Copy link
Member

ablaom commented May 6, 2020

Your latest commit appears there, it seems to me. It's at the bottom. Perhaps what's confusing you is that this tab lists the commits in the opposite order to that in the conversations tab?

Sorry I can't be more helpful.

@venuur
Copy link
Contributor Author

venuur commented May 6, 2020

My latest commit on my fork is venuur@3abaa09

That is not the one on this pull request, right? Or am I just reading it all wrong, which is totally possible.

@ablaom
Copy link
Member

ablaom commented May 7, 2020

No sorry, I must have been looking in the wrong place. This is quite weird. Going to close and re-open - see if that helps

@ablaom ablaom closed this May 7, 2020
@ablaom ablaom reopened this May 7, 2020
@ablaom
Copy link
Member

ablaom commented May 7, 2020

Nah. Can you add # testing push at the top line of src/MLJBase.jl, commit and push, and let's see if that appears here.

@venuur
Copy link
Contributor Author

venuur commented May 9, 2020

Looks like it's updated finally.

@ablaom
Copy link
Member

ablaom commented May 10, 2020

Looking great! Thanks adding such detailed checks.

I think all that's needed now are some tests.

@ablaom
Copy link
Member

ablaom commented May 26, 2020

@venuur Be great to have some tests when you get a chance.

@venuur
Copy link
Contributor Author

venuur commented May 27, 2020

Thanks for pinging me. I haven’t forgotten about this, it’s just my workload has been heavier than usual so it may be another week or two before I can wrap this up.

I’ll definitely get to it, the timeline is just a bit pushed compared to my original plan.

@ablaom
Copy link
Member

ablaom commented May 27, 2020

No worries at all!!

Carl Morris added 4 commits June 8, 2020 12:04
Fix default constructor case of Hour(24) when applied to a Date vector.
Hour cannot be added to Date objects, so the step must be converted to
Day, but because this is a change to the step hyper parameter it must be
adjusted. Unfortunately, this case cannot be distinguished from a user
supplied date, so a warning must be shown, which is non-ideal for a
common default case.

Alternatively we can add an extra internal variable to track whether a
user supplied step is given and only warn on that case, but that is
extra complexity I am avoiding for this commit.
@venuur
Copy link
Contributor Author

venuur commented Jun 9, 2020

Added tests and fixed a couple corner cases while writing the log tests. There's an annoying issue that the default constructor uses a step of Hour(24), but when applied to a Date, we need to convert to Day(1), which is fine, but since the constructor doesn't distinguish a user supplied step versus the default, a warning message is printed. I would rather add some extra checks to avoid the message if it's the default value, but I don't have the time right now to figure out the right logic.

Excepting the somewhat unnecessary log message, the current version should be complete. Please let me know if I'm missing anything.

@venuur
Copy link
Contributor Author

venuur commented Jun 9, 2020

Should be all good for your next review when you get a chance @ablaom

Copy link
Member

@ablaom ablaom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is terrific, many thanks. Two small details remain. Great if you could put these in:

  • add "summary" doc-string UNIVARIATE_TIME_TYPE_TO_CONTINUOUS like the others appearing at the top of transformers.jl.

  • add a metadata entry, which can go at the very end of transformers.jl. It should look like this:

metadata_model(UnivariateTimeTypeToContinuous,
    input   = AbstractVector{<:ScientificTypes.ScientificTimeType},
    output  = AbstractVector{Continuous},
    weights = false,
    descr   = UNIVARIATE_TIME_TYPE_TO_CONTINUOUS,
    path    = "MLJModels.UnivariateTimeTypeToContinuous")

@ablaom
Copy link
Member

ablaom commented Jun 10, 2020

And, if you haven't noticed, there's a failure for julia 1.0. Here's an excerpt:

┌ Warning: Cannot add `TimePeriod` `step` to `Date` `zero_time`. Converting `step` to `Day`.
453└ @ MLJModels ~/build/alan-turing-institute/MLJModels.jl/src/builtins/Transformers.jl:408
454TimeTypeToContinuous: Test Failed at /home/travis/build/alan-turing-institute/MLJModels.jl/test/builtins/Transformers.jl:125
455  Expression: all(dt_continuous .== ex)
456Stacktrace:
457 [1] macro expansion at /home/travis/build/alan-turing-institute/MLJModels.jl/test/builtins/Transformers.jl:125 [inlined]
458 [2] macro expansion at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.0/Test/src/Test.jl:1083 [inlined]
459 [3] top-level scope at /home/travis/build/alan-turing-institute/MLJModels.jl/test/builtins/Transformers.jl:96
460TimeTypeToContinuous: Test Failed at /home/travis/build/alan-turing-institute/MLJModels.jl/test/builtins/Transformers.jl:134
461  Expression: all(dt_continuous .== ex)
462Stacktrace:
463 [1] macro expansion at /home/travis/build/alan-turing-institute/MLJModels.jl/test/builtins/Transformers.jl:134 [inlined]
464 [2] macro expansion at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.0/Test/src/Test.jl:1083 [inlined]
465 [3] top-level scope at /home/travis/build/alan-turing-institute/MLJModels.jl/test/builtins/Transformers.jl:96
466┌

@venuur
Copy link
Contributor Author

venuur commented Jun 10, 2020

Sounds good. I’ll make the changes you suggested and look into the Julia 1.0 error.

@ablaom
Copy link
Member

ablaom commented Aug 14, 2020

Okay, working on the conflicts and adding the extras referred to above ...

@venuur
Copy link
Contributor Author

venuur commented Aug 14, 2020

Thanks for following up. And sorry for disappearing. As you may guess, I haven’t had time and won’t have time for a while to work on this. I’m still happy to comment here where I can.

@ablaom
Copy link
Member

ablaom commented Aug 14, 2020

No worries. You put a lot a fine work into this.

Note the multivariate (table) case will be handled by #288 .

Closing in favour of #295.

@ablaom ablaom closed this Aug 14, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants