Commit

Fix Classify still returns Categorical in some cases (#82)
caspervdw committed Dec 11, 2020
1 parent ec02b17 commit 1d42e18
Showing 1 changed file with 9 additions and 4 deletions.
13 changes: 9 additions & 4 deletions dask_geomodeling/geometry/field_operations.py
@@ -61,7 +61,8 @@ class Classify(BaseSingleSeries):
 labels (list): The classification returned if a value falls in a specific
   bin (e.g. ``["A", "B", "C"]``). The length of this list is either one
   larger or one less than the length of the ``bins`` argument. Labels
-  should be unique.
+  should be unique. If labels are numeric, they are always
+  converted to float to be able to deal with NaN values.
 right (boolean, optional): Determines what side of the intervals are
   closed. Defaults to True (the right side of the bin is closed so a
   value is assigned to the bin on the left if it is exactly on a bin edge).
@@ -112,8 +113,11 @@ def process(series, bins, labels, right):
 if series.dtype == object:
     series = series.fillna(value=np.nan)
 result = pd.cut(series, bins, right, labels)
-# transform from categorical to whatever suits the "labels"
-result = pd.Series(result, dtype=pd.Series(labels).dtype)
+
+# Transform from categorical to whatever suits the "labels". The
+# dtype has to be able to accommodate NaN as well.
+result = result.astype(pd.Series(labels + [np.nan]).dtype)
+
 if open_bounds:
     # patch the result, we actually want to classify np.inf
     if right:
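
The dtype choice in the added line can be tried in isolation. The following is a minimal sketch using plain pandas, not the `Classify` block itself; the sample values, `bins`, and `labels` are made up for illustration. It shows that appending `np.nan` to numeric labels yields a float dtype, which can then hold the NaN that `pd.cut` produces for out-of-range values.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: 99.0 falls outside every bin.
series = pd.Series([0.5, 1.5, 2.5, 99.0])
bins = [0, 1, 2, 3]
labels = [10, 20, 30]

# pd.cut returns a categorical Series; the out-of-range value becomes NaN.
binned = pd.cut(series, bins, right=True, labels=labels)

# The labels alone infer an integer dtype, which cannot represent NaN.
labels_only_dtype = pd.Series(labels).dtype

# Appending NaN before inferring the dtype gives float64 for numeric labels.
target_dtype = pd.Series(labels + [np.nan]).dtype

# Cast away from categorical, mirroring the line added in this commit.
result = binned.astype(target_dtype)

print(labels_only_dtype, target_dtype, result.tolist())
```

This matches the docstring note in the same commit that numeric labels are always converted to float.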
@@ -142,7 +146,8 @@ class ClassifyFromColumns(SeriesBlock):
 labels (list): The classification returned if a value falls in a specific
   bin (e.g. ``["A", "B", "C"]``). The length of this list is either one
   larger or one less than the length of the ``bins`` argument. Labels
-  should be unique.
+  should be unique. If labels are numeric, they are always
+  converted to float to be able to deal with NaN values.
 right (boolean, optional): Determines what side of the intervals are
   closed. Defaults to True (the right side of the bin is closed so a
   value is assigned to the bin on the left if it is exactly on a bin edge).
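
The `right` argument described in the docstrings is passed straight to `pd.cut` in the `process` hunk of this diff, so its effect on values that sit exactly on a bin edge can be sketched with pandas alone. The values, bins, and labels below are illustrative assumptions, not taken from the repository.

```python
import pandas as pd

# Hypothetical values that lie exactly on bin edges.
values = pd.Series([1.0, 2.0])
bins = [0, 1, 2, 3]
labels = ["A", "B", "C"]

# right=True: intervals are (0, 1], (1, 2], (2, 3], so an edge value
# goes to the bin on its left.
right_closed = pd.cut(values, bins, right=True, labels=labels)

# right=False: intervals are [0, 1), [1, 2), [2, 3), so an edge value
# goes to the bin on its right.
left_closed = pd.cut(values, bins, right=False, labels=labels)

print(right_closed.tolist(), left_closed.tolist())
```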