Propagate index metadata if not specified #4509

Merged
merged 1 commit into from Feb 19, 2019

@jcrist jcrist commented Feb 19, 2019

Several `dask.dataframe` functions take a `meta` kwarg. If it is not
provided, these functions try to infer the metadata from the computation.
If it is provided, this inference is skipped. `meta` can be either a pandas
object or a spec (tuple, list of tuples, or dict).

A spec carries no index metadata. Previously we fell back to the default
pandas index; now we copy the index metadata from the first pandas object
passed to these functions. For `map_partitions`, `map_overlap`, `aca`,
etc., this is usually what the user wants.

Fixes #4454.

@jcrist jcrist changed the title Propogate index metadata if not specified Propagate index metadata if not specified Feb 19, 2019

@mrocklin mrocklin commented Feb 19, 2019

Looks good. Thanks @jcrist. Merging.

@mrocklin mrocklin merged commit 5744197 into dask:master Feb 19, 2019
2 checks passed
@jcrist jcrist deleted the propogate-index-in-metadata branch Feb 19, 2019
jorge-pessoa pushed a commit to jorge-pessoa/dask that referenced this issue May 14, 2019