Skip to content

Conversation

@bkamins
Copy link
Contributor

@bkamins bkamins commented Apr 22, 2021

it would be great to have DataFrames.jl 1.0 allowed and then release made. Thank you!

@cstjean
Copy link
Owner

cstjean commented Apr 22, 2021

I'll be happy to merge a PR when the tests pass, but unfortunately, ScikitLearn hasn't been updated in a while, and I'm in no position to really work on it. Help wanted!

@bkamins
Copy link
Contributor Author

bkamins commented Apr 23, 2021

CI throws a massive number of warnings and an error on Python side. I do not know ScikitLearn well enough to diagnose unfortunately.

@cstjean
Copy link
Owner

cstjean commented Apr 24, 2021

Thank you for the work!

CI throws a massive number of warnings and an error on Python side.

The trouble is that scikit-learn on the Python side keeps evolving... https://discourse.julialang.org/t/conda-jl-specify-library-version/59915

@azev77
Copy link

azev77 commented Apr 25, 2021

@cstjean, when @timholy moved on from Interpolations.jl he posted "Interpolations.jl needs a new maintainer" on Discourse.
Perhaps it would be worth doing the same for this package?

@ablaom heads up, this will likely cause some issues for MLJ users

@ablaom
Copy link

ablaom commented Apr 25, 2021

cc @OkonSamuel

@ablaom
Copy link

ablaom commented Apr 25, 2021

@cstjean Curious as to why DataFrames is a dependency. Most models just work on matrices, right?

@cstjean
Copy link
Owner

cstjean commented Apr 25, 2021

Curious as to why DataFrames is a dependency. Most models just work on matrices, right?

Yeah, DataFrames could be made an optional @require dependency, especially now that it's on 1.0.

when timholy moved on from Interpolations.jl he posted "Interpolations.jl needs a new maintainer" on Discourse.
Perhaps it would be worth doing the same for this package?

Has there been other successful transitions through such an announcement? Seems a bit risky, especially for a package which is probably most attractive for recent Python to Julia converts...

I'd love to pass on the torch, though. Right now, we're three maintainers: myself, @OkonSamuel and @alexmorley. I'll gladly give commit rights to anyone who shows up with a good PR and an interest.

@azev77
Copy link

azev77 commented Apr 25, 2021

Perhaps the folks at PyCall.jl have some suggestions? @stevengj

@ablaom
Copy link

ablaom commented Apr 25, 2021

Yeah, DataFrames could be made an optional @require dependency, especially now that it's on 1.0.

Based on our experience, I don't recommend using Requires.jl, if you can help it. Perhaps if you need to materialise a table, you just use a julia native table (eg, named tuple of vectors or dictionary of vectors keyed on Symbol). If a user wants a DataFrame, they can just do using DataFrames; DataFrame(table) where table is any of these (or any Tables.jl-compatible table). Just a thought.

On the other hand, if you are actually using DataFrames-specific interface (in particular, random access for rows), and a transition is painful, then leaving the DataFrames dep makes sense for now.

@PyDataBlog
Copy link

Any updates on this?

@cstjean
Copy link
Owner

cstjean commented Apr 27, 2021

@PyDataBlog Updates will be posted here, anyone is very welcome to make another PR to complete this. In the meanwhile, you can use this branch to run ScikitLearn + DataFrames 1.0. It certainly won't be any worse than the current release.

@cstjean
Copy link
Owner

cstjean commented Apr 30, 2021

Albert Zevelev pointed out that for his use case, using older versions of Conda/PyCall

Pkg.add([
    Pkg.PackageSpec(name="Conda", version="1.5.0"),
    Pkg.PackageSpec(name="PyCall", version="1.92.2"),
    Pkg.PackageSpec(name="MLJ", version="0.16.0"),
    Pkg.PackageSpec(name="MLJScikitLearnInterface", version="0.1.8"), 
])

resolves his problem. That's another track to follow through on...

@ablaom
Copy link

ablaom commented Apr 30, 2021

These versions of Conda and Pycall are just one patch earlier than the latest releases. And there is no change I could see in the Project files. So it is curious that this should make a difference.

@codecov-commenter
Copy link

codecov-commenter commented May 11, 2021

Welcome to Codecov 🎉

Once you merge this PR into your default branch, you're all set! Codecov will compare coverage reports and display results in all future pull requests.

Thanks for integrating Codecov - We've got you covered ☂️

@giordano
Copy link
Collaborator

Tests on Ubuntu are passing, but notebook examples fail on macOS and Windows at the moment, which is the same situation we currently have on master. Good to merge?

@bkamins
Copy link
Contributor Author

bkamins commented May 11, 2021

@giordano - thank you for finishing this.


try
return convert(Array, return_vector ? X[:, cols[1]] : X[:, cols])
return Array(return_vector ? X[:, cols[1]] : X[:, cols])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the changes in this file compatible with previous versions of DataFrames.jl?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Constructors always worked. Just convert is a legacy think from pre-Julia 1.0 that we kept for some time to avoid breaking changes (and we have these breakages when we removed them 😄).

@cstjean cstjean merged commit d0550a0 into cstjean:master May 11, 2021
@bkamins bkamins deleted the patch-1 branch May 11, 2021 18:07
@bkamins
Copy link
Contributor Author

bkamins commented May 20, 2021

Should the package get a new release?

@cstjean
Copy link
Owner

cstjean commented May 23, 2021

I wish, but AFAIK we can't release until all the tests pass, no? We're close, but macos+windows still does not work.

@bkamins
Copy link
Contributor Author

bkamins commented May 23, 2021

Ah - OK (I pinged, because there were some users asking about the release)

@giordano
Copy link
Collaborator

We're close, but macos+windows still does not work.

Tests do pass for me locally on macOS in a clean environment. I suspect there are some issues with Python/Conda setup for macOS/Windows in GitHub Actions, but hunting them down could be a long task. Is that worth holding this back?

@bkamins
Copy link
Contributor Author

bkamins commented May 27, 2021

In this case I would make a release and wait for user's feedback if anything is broken and then try to fix it in patch release (if needed)

@cstjean
Copy link
Owner

cstjean commented May 29, 2021

In this case I would make a release and wait for user's feedback if anything is broken and then try to fix it in patch release (if needed)

I'm out of touch, but once upon a time the registry auto-rejected packages whose CI failed. Has that changed?

@ablaom
Copy link

ablaom commented May 30, 2021

I don't think you even need CI implemented to register a package. And I agree with @bkamins suggestion.

cc @DilumAluthge

@DilumAluthge
Copy link

Yeah for AutoMerge, we only check that import PackageName works.

@cstjean
Copy link
Owner

cstjean commented May 31, 2021

Then let's release!

@giordano giordano mentioned this pull request Jun 8, 2021
@ablaom
Copy link

ablaom commented Jun 8, 2021

@cstjean Can you release a new version please, after merging #100, or delegate (maybe @OkonSamuel has authority now also?)?

@cstjean
Copy link
Owner

cstjean commented Jun 9, 2021

@OkonSamuel would you have time?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants