Can I fit a sklearn classifier using lazy? #390

Answered by jpivarski
peguerosdc asked this question in Q&A
uproot.lazy defers reading until a slice of an array is requested, but Scikit-Learn's fit function asks for the entire array as a single slice, so everything is read at once. For Scikit-Learn to pull only one chunk of the array at a time, it would have to know about chunking: the fitting algorithm would need to handle batches, and the interface would have to recognize that slicing the input and training on one slice at a time is a benefit. In other words, Scikit-Learn would have to be "in on it [the batching of data]." There might be some interface between Scikit-Learn and Dask, but Scikit-Learn doesn't know about Awkward Arrays. (Which is why we want to add interfaces between Awkward and Dask, so that third p…
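To illustrate what being "in on it" could look like, here is a minimal sketch, not taken from the answer itself, that uses one of the Scikit-Learn estimators that does accept batches through `partial_fit` (`SGDClassifier`). The `iter_chunks` generator is a hypothetical stand-in that yields synthetic NumPy batches; with real ROOT files, something like `uproot.iterate(..., library="np")` might play that role, with the branches stacked into a 2D feature array by hand. Whether this suits a given model is an assumption, since many Scikit-Learn estimators have no `partial_fit` at all.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier


def iter_chunks(n_chunks=20, chunk_size=500, seed=0):
    """Hypothetical chunked reader: yields one (X, y) batch at a time.

    Synthetic data is used so the sketch runs without any ROOT file.
    """
    rng = np.random.default_rng(seed)
    for _ in range(n_chunks):
        X = rng.normal(size=(chunk_size, 2))
        y = (X[:, 0] + X[:, 1] > 0).astype(int)  # linearly separable labels
        yield X, y


clf = SGDClassifier(random_state=0)
classes = np.array([0, 1])  # partial_fit must be told every label on its first call

# Only one chunk is held in memory per iteration; the model is updated in place.
for X_chunk, y_chunk in iter_chunks():
    clf.partial_fit(X_chunk, y_chunk, classes=classes)

# Evaluate on a separately generated batch.
X_test, y_test = next(iter_chunks(n_chunks=1, chunk_size=1000, seed=1))
accuracy = clf.score(X_test, y_test)
```

The explicit loop is the point: the caller does the chunking, and the estimator only ever sees one batch, which is the cooperation that a plain `fit` call cannot provide.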

Answer selected by peguerosdc