dd.concat returns empty tuple when axis=1 #11074

Closed
mscanlon-exos opened this issue Apr 25, 2024 · 3 comments
Labels
needs triage Needs a response from a contributor

Comments

@mscanlon-exos

mscanlon-exos commented Apr 25, 2024

Describe the issue:

When calling dd.concat with axis=1, the line dasks = [df for df in dfs if isinstance(df, _Frame)] filters out every input, so the result is empty. The dataframes passed are instances of dask_expr._collection.DataFrame, which inherits from dask_expr._collection.FrameBase rather than from _Frame, so the isinstance check never matches.
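To illustrate the filtering behaviour described above, here is a minimal sketch that uses stand-in classes only. LegacyFrame, FrameBase, DataFrame, and filter_legacy are hypothetical names that mimic two unrelated class hierarchies; they are not the real dask or dask_expr classes.

```python
# Stand-in classes: they only mimic the two unrelated hierarchies described
# in the report and are NOT the real dask / dask_expr classes.
class LegacyFrame:
    """Plays the role of the legacy _Frame base class."""


class FrameBase:
    """Plays the role of dask_expr._collection.FrameBase."""


class DataFrame(FrameBase):
    """Plays the role of dask_expr._collection.DataFrame."""


def filter_legacy(dfs):
    # Same shape as the list comprehension quoted in the report:
    # keep only objects that derive from the legacy base class.
    return [df for df in dfs if isinstance(df, LegacyFrame)]


dfs = [DataFrame(), DataFrame()]
dasks = filter_legacy(dfs)
print(len(dasks))  # nothing survives the filter
```

Because DataFrame does not derive from LegacyFrame in this sketch, every input is dropped and downstream code is left with nothing to concatenate.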

Minimal Complete Verifiable Example:

def test_mcve():
    import dask.dataframe as dd
    import pandas as pd

    # Two single-row frames that share the column name 'foo'
    ddf = dd.from_pandas(pd.DataFrame(data={'foo': [1]}))
    ddf_1 = dd.from_pandas(pd.DataFrame(data={'foo': [2]}))

    concat_ddf = dd.multi.concat([ddf, ddf_1], axis=1)
    # This assertion passes: the computed result is a tuple, not a DataFrame
    assert isinstance(concat_ddf.compute(), tuple)

Anything else we need to know?:

Environment:

  • Dask version: 2024.4.2
  • Python version: 3.10
  • Operating System: macOS
  • Install method (conda, pip, source): pip
@github-actions github-actions bot added the needs triage Needs a response from a contributor label Apr 25, 2024
@phofl
Collaborator

phofl commented Apr 25, 2024

Why are you using concat from multi? This won’t work with the query optimizer. You should use dd.concat instead.

@mscanlon-exos
Author

I thought I did, but I guess I didn't. Sorry about that. Thanks for the response.

@phofl
Collaborator

phofl commented Apr 25, 2024

Closing this. It's currently not ideal if folks import from internal modules, but this will fix itself when we remove the legacy implementation.

Please let me know if there is something you need that you can't access with dd.foo.

@phofl phofl closed this as completed Apr 25, 2024