Add new error message content when columns are in a different order #5927
quasiben merged 2 commits into dask:master from
|
cc @bnaul
…On Wed, Feb 19, 2020 at 1:06 PM Tom Augspurger ***@***.***> wrote:
***@***.**** approved this pull request.
|
|
Thanks @jsignell |
|
Can I ask about the intent for this? I've been running into an issue on It seems that the ordering of the data is not guaranteed so dask throws an error. I was hoping to propose a PR that meant that |
|
Just to be clear: Dask is checking the ordering of the columns here, not the rows. We call this in many places, and it's an existing check (this PR improved the error message). Do you have an example of this failing? |
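To illustrate the distinction the check draws, here is a minimal sketch in plain Python. It is not Dask's implementation: `describe_column_mismatch` is a hypothetical helper, and the message wording is an assumption. It only shows how "same columns, different order" can be told apart from "different columns" when comparing column labels.

```python
def describe_column_mismatch(expected, actual):
    """Return None when the column labels match, else a short description.

    Hypothetical helper for illustration only; not part of Dask.
    """
    expected, actual = list(expected), list(actual)
    if expected == actual:
        return None
    # Same labels but in a different position: an ordering problem only.
    if len(expected) == len(actual) and set(expected) == set(actual):
        return (
            "The columns are in a different order.\n"
            f"  expected: {expected}\n"
            f"  actual:   {actual}"
        )
    # Otherwise the sets of labels themselves disagree.
    missing = [c for c in expected if c not in actual]
    extra = [c for c in actual if c not in expected]
    return f"The columns differ. Missing: {missing}, extra: {extra}"


print(describe_column_mismatch(["id", "name"], ["name", "id"]))
```

Row order plays no part in this comparison; only the sequence of column labels is inspected.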
|
The ordering of the which will become |
|
Do you have an example? I'm not familiar with |
|
I think that's unnecessary detail - it's just how we shortcut obtaining column metadata on behalf of Dask to save it doing the Essentially, we create a Dask The query object appears to generate This is an issue for SQLAlchemy to fix the ordering of the columns in the generated SQL, and/or an issue for Dask to handle the mis-ordering of columns here (as far as I can tell, the column ordering does not matter). |
|
Sorry, I'm having trouble understanding the issue :/ DataFrames have a well-defined ordering of the columns. Perhaps you can ensure that your meta object has the correct order prior to the read_sql. |
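One way to line up column order is to select columns by label, sketched below with pandas. The frames and column names (`meta`, `result`, `id`, `name`) are made up for illustration and do not come from the thread. The same selection works in the other direction if it is the meta object that should follow the order of the returned data.

```python
import pandas as pd

# Hypothetical meta: an empty frame describing the expected columns and dtypes.
meta = pd.DataFrame(
    {"id": pd.Series(dtype="int64"), "name": pd.Series(dtype="object")}
)

# Hypothetical query result whose columns came back in a different order.
result = pd.DataFrame({"name": ["a", "b"], "id": [1, 2]})

# Selecting by label reorders the columns; the values are unchanged.
aligned = result[list(meta.columns)]

print(list(aligned.columns))
```

This only helps when both objects carry the same set of labels; a label missing from `result` would raise a `KeyError` rather than being silently skipped.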
|
A SQLAlchemy query object is the main thing passed to Because of this, I am unable to predict how to order the columns in my I was hoping that Dask didn't worry too much about column ordering (as it references columns within the underlying Pandas dataframes by label), so we could remove this part of the check. Thanks for stepping through this with me, sorry for not being clear enough at the start! |
Closes: #5886
black dask / flake8 dask