Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use categoricalarrays 6.0 #44

Closed
tlienart opened this issue Sep 20, 2019 · 6 comments
Closed

Use categoricalarrays 6.0 #44

tlienart opened this issue Sep 20, 2019 · 6 comments

Comments

@tlienart
Copy link
Collaborator

tlienart commented Sep 20, 2019

They released it 2 weeks ago with a number of fixes including the annoying `T` is deprecated, use `nonmissingtype` instead.; should we start using it?

it seems the issues you had are:

@ablaom
Copy link
Member

ablaom commented Sep 22, 2019

Thanks for that.

Yes the first two are resolved but unfortunately the last is not and it is a pretty serious issue (I would say a bug) without a straightforward workaround. @nalimilan has formulated a plan to resolve the issue, but as this looks fairly low-level, I don't think it would be efficient for us to make our own PR.

I agree this is annoying. Apart from the warning, having a cap on the version causes people not familiar with the package manager some pain in updating MLJ.

@nalimilan
Copy link

Sorry, I haven't fixed that yet since an efficient solution requires a bit of work. But maybe we can go with the inefficient solution for now. Can you confirm that this branch fixes the problems you encounter?

@ablaom
Copy link
Member

ablaom commented Sep 24, 2019

@nalimilan Thank you so much for looking into this!

Some things are now working. However, I'm having some issues with valindex behaviour. Probably I shouldn't be using it (ie should regard it as private) but it is convenient to do so. Still, I'm reporting this, in case it really is a bug.

Under CategoricalArrays 0.5.2 the following code throws no error. However, an error is thrown on nl/setindex! branch, unless you do relabelling a <-> b and A <-> B:

using CategoricalArrays

y = categorical(["B", "A"])
b = y[1]
a = y[2]

z = [b, a]

@assert y[1].pool.valindex == z[1].pool.valindex "something unexpected"

@nalimilan
Copy link

OK, good to know. I'll try to add tests and turn this into a proper PR then.

Regarding the valindex issue, indeed there's absolutely no guaranty that the index and valindex fields will match (at least currently, I might drop the levels/index distinction if we no longer need it for e.g. CSV.jl since it confuses many people). You always need to go through order(pool) as noted here. The fact that it worked in 0.5.2 is just due to [b, a] returning an Array in that version.

@nalimilan
Copy link

@ablaom
Copy link
Member

ablaom commented Oct 16, 2019

Resolved in MLJBase 0.7.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants