Add support for concat #40

jcrist · 2017-08-02T19:23:03Z

- Add a top-level `concat` function for concatenating dataframes, series, or indices together
- Remove old `concat` method, as it didn't match the pandas API
- Fix bugs in `to_pandas` implementation to properly handle indices


sklam · 2017-08-02T19:35:52Z

pygdf/categorical.py

+                    self._ordered == other._ordered and
+                    self._codes_impl == other._codes_impl)
+        except Exception:
+            return False


Why hide the Exception?

Because you can't compare pandas custom dtypes with numpy dtypes; hiding the exception lets you compare categorical and numeric implementations. In this case, though, we can skip the try block because we already check whether `other` is a `CategoricalSeriesImpl`. The try/except for `NumericalSeriesImpl` is still needed.

The except will just hide accidental coding errors. If `other` is a `NumericalSeriesImpl`, the method will just return None, which will be interpreted as False. I think the try-except should be removed.

Agreed, except for the "return None" bit - I prefer comparison methods to always return booleans directly. Fixed.
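To illustrate the outcome of this thread, here is a minimal sketch of a comparison method that checks the type of `other` up front and always returns a boolean, with no blanket `try`/`except`. The class names `NumericalImpl` and `CategoricalImpl` and their attributes are hypothetical stand-ins, not the actual pygdf classes, and dtypes are modeled as plain strings so that comparing them cannot raise.

```python
class NumericalImpl:
    """Hypothetical stand-in for a numeric series implementation."""

    def __init__(self, dtype):
        self.dtype = dtype  # assumption: a plain string such as "int8"

    def __eq__(self, other):
        # The isinstance guard replaces the try/except: unrelated types
        # compare as False instead of raising or returning None.
        return isinstance(other, NumericalImpl) and self.dtype == other.dtype


class CategoricalImpl:
    """Hypothetical stand-in for a categorical series implementation."""

    def __init__(self, categories, ordered, codes_impl):
        self.categories = tuple(categories)
        self.ordered = ordered
        self.codes_impl = codes_impl  # implementation of the integer codes

    def __eq__(self, other):
        if not isinstance(other, CategoricalImpl):
            return False  # explicit boolean, never an implicit None
        return (self.categories == other.categories and
                self.ordered == other.ordered and
                self.codes_impl == other.codes_impl)
```

Because nothing is swallowed, a genuine bug inside `__eq__` (for example a misspelled attribute) surfaces as an exception rather than silently comparing as unequal.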

sklam · 2017-08-02T19:43:43Z

pygdf/series.py

+            if o._impl != head._impl:
+                raise ValueError("All series must be of same type")
+
+        data = head._impl.concat(objs)


I don't think we need to delegate to `_impl` to handle concat. The GPU dataframe will always require plain-old-data types. (We can only handle data that can be copied between CPU and GPU memory.) It is always safe to just copy the data for concat, just like the implementation in numerical.

For categorical data we need to allocate a buffer of series.cat.codes.dtype, while for numeric we need to allocate a buffer of series.dtype. I figured it made more sense to special case inside the implementations than check if the series was categorical and special case in the method.

`self._data.dtype` (`Buffer.dtype`) will always be the physical dtype. The SeriesImpl carries the logical meaning of things. Here, we only need the physical info.

Fair point. I didn't realize `_data` also had a dtype attribute; I was relying on `self.dtype` (which could be category) when I implemented it before. Fixed.
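The distinction between physical and logical dtype discussed above can be sketched as follows. This is only an illustration under stated assumptions: `Buffer`, `Series`, and `concat_data` are hypothetical stand-ins rather than the pygdf implementation, and the standard-library `array` module plays the role of device memory, with its typecode acting as the physical dtype.

```python
from array import array


class Buffer:
    """Hypothetical stand-in for a device buffer of plain-old-data."""

    def __init__(self, mem):
        self.mem = mem  # an array.array

    @property
    def dtype(self):
        # Physical dtype: how the bytes are actually stored.
        return self.mem.typecode


class Series:
    """Hypothetical series: a buffer plus a logical dtype label."""

    def __init__(self, data, logical_dtype):
        self._data = data
        self.dtype = logical_dtype  # e.g. "category" or "int64"


def concat_data(objs):
    """Concatenate the underlying buffers using only physical dtypes."""
    physical = objs[0]._data.dtype
    for o in objs[1:]:
        if o._data.dtype != physical:
            raise ValueError("All series must be of same type")
    out = array(physical)  # allocate with the physical dtype
    for o in objs:
        out.extend(o._data.mem)  # plain copy, no per-type special casing
    return Buffer(out)
```

For a categorical series the buffer holds the integer codes, so the same copy path serves both cases. The sketch checks physical types only; verifying that two categorical series share the same categories would still be a separate step.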

sklam · 2017-08-02T23:27:07Z

Thanks!

Add support for concat

7f79939


sklam reviewed Aug 2, 2017


jcrist mentioned this pull request Aug 2, 2017

Rethink class structure #38

Closed

jcrist added 3 commits August 2, 2017 14:53

Remove unnecessary try-catch block

c0b59de

Move concat implementation to series only

d03b9bb

Missed adding the test file

488c9fb

sklam merged commit ec36487 into rapidsai:master Aug 2, 2017

shwina pushed a commit that referenced this pull request Oct 16, 2019

Merge pull request #40 from rapidsai/branch-0.11

0188e17

Update

tgravescs mentioned this pull request Nov 23, 2021

[BUG] Regex \n at end of string change in issue 9620 broken string to timestamp behavior #9764

Closed

raydouglass pushed a commit that referenced this pull request Nov 7, 2023

Merge pull request #40 from vyasr/chore/move_xdf_to_cudf

f3d7f52

Make xdf a subpackage of cudf instead of standalone
