[FIX] Table: Ensure correct dtype in _compute_distributions #1969

Merged

ales-erjavec
Contributor

Issue
import Orange.data
from Orange.statistics import distribution

domain = Orange.data.Domain(
    [], [],
    [Orange.data.StringVariable("S"),
     Orange.data.DiscreteVariable("D", ["a", "b"])])

data = Orange.data.Table(domain, [["", 1], ["", float("nan")]])
distribution.get_distribution(data, domain.metas[1])

results in

Traceback (most recent call last):
  File "<stdin>", line 10, in <module>
  File "/Users/aleserjavec/workspace/orange3/Orange/statistics/distribution.py", line 295, in get_distribution
    return Discrete(dat, variable, unknowns)
  File "/Users/aleserjavec/workspace/orange3/Orange/statistics/distribution.py", line 39, in __new__
    return cls.from_data(dat, variable)
  File "/Users/aleserjavec/workspace/orange3/Orange/statistics/distribution.py", line 63, in from_data
    dist, unknowns = data._compute_distributions([variable])[0]
  File "/Users/aleserjavec/workspace/orange3/Orange/data/table.py", line 1317, in _compute_distributions
    dist, unknowns = bincount(m, len(var.values) - 1, W)
  File "/Users/aleserjavec/workspace/orange3/Orange/statistics/util.py", line 50, in bincount
    return (np.bincount(X.astype(np.int32, copy=False),
ValueError: cannot convert float NaN to integer

Description of changes

Ensure correct dtype in _compute_distributions when the column data comes from the metas array.
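
To illustrate the failure mode outside of Orange, here is a minimal NumPy-only sketch. It assumes that the metas column arrives as an `object` array (because the metas also hold a string variable) and that casting it to float before counting is enough to let the NaN handling work. The helper name `count_values` and its signature are hypothetical and are not part of Orange; the actual change in `_compute_distributions` may differ in detail.

```python
import numpy as np


def count_values(column, max_val, weights=None):
    """Hypothetical helper: count integer-coded values, treating NaN as unknown.

    Returns (counts, unknowns), where counts has max_val + 1 entries.
    """
    column = np.asarray(column)
    # Object arrays (e.g. metas mixing strings and numbers) must be cast to
    # float first; otherwise NaN cannot be detected with np.isnan and the
    # later integer cast fails.
    if column.dtype.kind != "f":
        column = column.astype(float)

    nan_mask = np.isnan(column)
    known = column[~nan_mask].astype(np.int32)

    if weights is None:
        unknowns = int(nan_mask.sum())
        known_weights = None
    else:
        weights = np.asarray(weights, dtype=float)
        unknowns = float(weights[nan_mask].sum())
        known_weights = weights[~nan_mask]

    counts = np.bincount(known, weights=known_weights, minlength=max_val + 1)
    return counts, unknowns


# A discrete meta column with one known value and one missing value,
# stored with dtype=object as in the report above.
metas_column = np.array([1.0, float("nan")], dtype=object)

# Casting the object array straight to an integer type is what fails.
try:
    metas_column.astype(np.int32)
except ValueError as err:
    print("direct cast failed:", err)

# After the float cast, the NaN is counted as an unknown instead.
counts, unknowns = count_values(metas_column, max_val=1)
print(counts, unknowns)
```

The key point of the sketch is the dtype check: a float array lets the NaN entries be masked out before the integer cast, while an object array does not.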

Includes
  • Code changes
  • Tests
  • Documentation

Fix a 'ValueError: cannot convert float NaN to integer' in bincount
when the column data comes from an object array and contains NaN values.
@ales-erjavec ales-erjavec force-pushed the fixes/distribution-meta-object-array branch from b81907b to e5c1dc3 Compare January 27, 2017 15:21
@codecov-io
codecov-io commented Jan 27, 2017

Current coverage is 89.53% (diff: 100%)

Merging #1969 into master will increase coverage by <.01%

@@             master      #1969   diff @@
==========================================
  Files            90         90          
  Lines          9180       9182     +2   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits           8219       8221     +2   
  Misses          961        961          
  Partials          0          0          

Powered by Codecov. Last update f3ebda7...e5c1dc3

@astaric astaric modified the milestone: 3.3.11 Jan 30, 2017
@janezd janezd merged commit e3e3650 into biolab:master Feb 3, 2017
astaric pushed a commit that referenced this pull request Feb 3, 2017
…ject-array

[FIX] Table: Ensure correct dtype in `_compute_distributions`
(cherry picked from commit e3e3650)
@ales-erjavec ales-erjavec deleted the fixes/distribution-meta-object-array branch May 12, 2017 17:12