feature request: RBFRepeater #20

koaning · 2019-03-01T20:05:05Z

feature generation that can be used for timeseries. trick from the london talk.

MaxHalford · 2019-03-15T06:58:34Z

I stumbled upon your talk a few days ago and really enjoyed many of your talking points. I was curious about the RBF kernel trick so I decided to implement it in an online learning library some friends and I are working on. From what I understand, the idea is simply to compute the distance between, say, a month and all 12 months of the year using an RBF. This way September is closer to August than it is to March, which isn't taken into account if one simply one-hot encodes the month. Is this correct? If you're interested I coded it at the end of this notebook.
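For readers following along, here is a minimal sketch of the idea described in this comment, not the code from the linked notebook. The function name `month_rbf_features`, the `width` parameter, and the choice to wrap the distance around the year (so December and January count as neighbours) are assumptions made for illustration.

```python
import math


def month_rbf_features(month, n_months=12, width=1.0):
    """Return one RBF activation per month centre for a month in 1..n_months.

    Hypothetical helper: the Gaussian form and the circular distance are
    assumptions, not something the thread specifies.
    """
    features = []
    for centre in range(1, n_months + 1):
        # Circular distance, so month 12 is one step away from month 1.
        diff = abs(month - centre)
        dist = min(diff, n_months - diff)
        # Gaussian radial basis function centred on `centre`.
        features.append(math.exp(-(dist ** 2) / (2 * width ** 2)))
    return features


september = month_rbf_features(9)
# Index 7 holds the August centre, index 2 the March centre.
print(september[7] > september[2])
```

Unlike a one-hot encoding, where every pair of distinct months is equally far apart, neighbouring months here get similar feature vectors.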

koaning · 2019-03-15T20:47:44Z

That creme stuff sounds cool beans. I'll give it a spin. Also: PyData Amsterdam has a CFP open at the moment. I'm still on the committee and that creme library sounds like something we'd love to host.
The goal here is to make an sklearn-compatible transformer that is general. Your example is good but we want to be able to, for instance, supply a date column and the number of RBFs you'd like per year, or specify a column that denotes the time window. There's going to be a sprint this Wednesday so I'll keep this thread up to date.
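To make the goal concrete, a rough sketch of what such a transformer could look like follows. It is not the implementation that was eventually merged: the class name `RepeatingRBFSketch` and its parameters `n_periods`, `period`, and `width` are hypothetical, and it assumes the input is a single numeric column such as day of year rather than a raw date column.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin


class RepeatingRBFSketch(BaseEstimator, TransformerMixin):
    """Hypothetical transformer: n_periods RBFs spread evenly over one period."""

    def __init__(self, n_periods=12, period=365.0, width=None):
        self.n_periods = n_periods
        self.period = period
        self.width = width

    def fit(self, X, y=None):
        # Centres are evenly spaced over a single period.
        self.centres_ = np.arange(self.n_periods) * self.period / self.n_periods
        # Assumed default: width equal to the spacing between centres.
        self.width_ = (
            self.width if self.width is not None else self.period / self.n_periods
        )
        return self

    def transform(self, X):
        # Fold the input onto one period so the features repeat every period.
        x = np.asarray(X, dtype=float).reshape(-1, 1) % self.period
        diff = np.abs(x - self.centres_.reshape(1, -1))
        # Circular distance, so the end of a period is close to its start.
        dist = np.minimum(diff, self.period - diff)
        return np.exp(-((dist / self.width_) ** 2))


days = np.array([[1.0], [200.0], [364.0]])
features = RepeatingRBFSketch(n_periods=12).fit_transform(days)
print(features.shape)
```

Because it follows the `fit`/`transform` convention and keeps its constructor arguments as plain attributes, a class like this can be dropped into an sklearn `Pipeline` ahead of a linear model.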

MaxHalford · 2019-03-15T20:55:18Z

Sounds great! I've started making some slides (written in English) for an upcoming edition of the data science Meetup back here in Toulouse, so maybe I can reuse them.
Okay, good to know; I just wanted to make sure the maths were right. Indeed, I think that having a transformer to extract date features would be nice because it could then pipeline into an RBFTransformer.

Good stuff!

Edit: if you're going to try creme I suggest you install the latest version from GitHub using pip install git+https://github.com/creme-ml/creme as there is a lot of stuff that isn't on PyPI yet.

koaning · 2019-03-15T21:02:55Z

Question about creme: for most of the learning that occurs, is that just a small SGD step per datapoint or is there something more happening? sklearn has some passive-aggressive things in its API, but creme is not doing that atm?

I like the idea of doing a rolling mean on an intercept by the way.

MaxHalford · 2019-03-15T21:07:39Z

I'm not 100% sure what you mean but here goes: you can provide an optimizer to LinearRegression and LogisticRegression. The default optimizer for both is called VanillaSGD and simply performs textbook online gradient descent. There are many optimisers you can use, such as PassiveAggressiveI, PassiveAggressiveII, Adam, etc. sklearn's SGDClassifier and SGDRegressor can only use plain gradient descent because they use a special trick for the intercept that isn't generic. Because we use a running statistic to compute the intercept we're "allowed" to use any optimizer we wish.
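The split described here, with a running statistic for the intercept and a separate update rule for the weights, can be sketched in plain Python as below. This is an illustration under assumptions, not creme's code: the class name `OnlineLinearRegressionSketch`, the squared loss, and the fixed learning rate are all made up for the example, and the weight update is plain online gradient descent where any other optimizer could be substituted.

```python
class OnlineLinearRegressionSketch:
    """Hypothetical online regressor: SGD on weights, running mean as intercept."""

    def __init__(self, lr=0.1):
        self.lr = lr
        self.weights = {}
        self.intercept = 0.0
        self.n_seen = 0

    def predict_one(self, x):
        return self.intercept + sum(
            self.weights.get(name, 0.0) * value for name, value in x.items()
        )

    def fit_one(self, x, y):
        # Gradient of the squared loss 0.5 * (prediction - y) ** 2
        # with respect to the prediction.
        error = self.predict_one(x) - y
        # Textbook online gradient descent step on each weight.
        for name, value in x.items():
            self.weights[name] = self.weights.get(name, 0.0) - self.lr * error * value
        # The intercept is never touched by the optimizer: it is simply
        # the running mean of the targets seen so far.
        self.n_seen += 1
        self.intercept += (y - self.intercept) / self.n_seen
        return self


model = OnlineLinearRegressionSketch(lr=0.1)
for _ in range(200):
    for value in (-1.0, 0.0, 1.0):
        # Synthetic stream following y = 2 * x + 5.
        model.fit_one({"x": value}, 2 * value + 5)
print(model.predict_one({"x": 3.0}))
```

Since the intercept has its own update, the weight update inside `fit_one` is the only part that would change if a different optimizer were used.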

I hope I'm clear! I'm going to write an explanatory notebook when I get some time!

koaning · 2019-03-15T21:18:38Z

Yep. This is all I wanted to know. Thanks!

Do consider sending that cfp tho: https://pydata.org/amsterdam2019/cfp/

MaxHalford · 2019-03-15T23:22:06Z

I just did :)

MaxHalford · 2019-04-11T14:46:49Z

@koaning when are the speakers for PyData Amsterdam announced? I have to book a plane ticket early if I come.

MBrouns · 2019-04-11T15:08:35Z

@MaxHalford tomorrow, but you're in! We're looking forward to seeing your talk!

MaxHalford · 2019-04-11T15:10:35Z

Cheers @MBrouns, I'm really excited! I'll book my ticket ASAP :)

koaning · 2019-08-24T18:30:17Z

This feature has now been implemented. Documentation will follow.

koaning changed the title ~~feature request: RBF features~~ feature request: RBFRepeater Mar 5, 2019

koaning added the sprint-material label ("This is something that can be done in a single day sprint.") Mar 23, 2019

RensDimmendaal mentioned this issue May 31, 2019

[wip] add repeating and spanning basis functions #147

Closed

RensDimmendaal mentioned this issue Jul 30, 2019

Add Repeating Basis Functions #171

Merged

koaning closed this as completed Aug 24, 2019
