Issues with UnivariateProbabilitySimulator and missing values/scale y #46

Closed
jason-bentley opened this issue Sep 7, 2020 · 0 comments · Fixed by #47
Labels
bug Something isn't working

Comments

@jason-bentley
Contributor

Describe the bug
The current UnivariateProbabilitySimulator() appears to have two issues:

  1. The simulator fails to create partitions when the feature being simulated contains missing values: ValueError: cannot convert float NaN to integer
  2. If the feature does not have missing values, an error is thrown when trying to work out the scale for y: TypeError: '>' not supported between instances of 'float' and 'method'

To Reproduce
To reproduce these errors, go to the branch: https://github.com/BCG-Gamma/facet/tree/docs/notebook_updates
and, within the sphinx > source > tutorial folder, run the notebook: https://github.com/BCG-Gamma/facet/blob/docs/notebook_updates/sphinx/source/tutorial/Prediabetes_classification_with_Facet.ipynb

Expected behavior
In both cases, I expect a complete figure showing the simulated trend and confidence intervals for the feature, displayed correctly.

Screenshots

First error:


---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-60-ea5a7119572e> in <module>
      2 simulator = UnivariateProbabilitySimulator(crossfit=ranker.best_model_crossfit, n_jobs=-1)
      3 partitioner = ContinuousRangePartitioner()
----> 4 univariate_simulation = simulator.simulate_feature(name=sim_feature, partitioner=partitioner)

C:\Projects\facet\facet\src\facet\simulation\_simulation.py in simulate_feature(self, name, partitioner)
    182             raise NotImplementedError("multi-output simulations are not supported")
    183 
--> 184         simulation_values = partitioner.fit(sample.features.loc[:, name]).partitions()
    185         simulation_results = self._aggregate_simulation_results(
    186             results_per_split=self._simulate_feature_with_values(

C:\Projects\facet\facet\src\facet\simulation\partition\_partition.py in fit(self, values, lower_bound, upper_bound, **fit_params)
    202         # calculate the step count based on the maximum number of partitions,
    203         # rounded to the next-largest rounded value ending in 1, 2, or 5
--> 204         self._step = step = self._step_size(lower_bound, upper_bound)
    205 
    206         # calculate centre values of the first and last partition;

C:\Projects\facet\facet\src\facet\simulation\partition\_partition.py in _step_size(self, lower_bound, upper_bound)
    334     def _step_size(self, lower_bound: float, upper_bound: float) -> float:
    335         return RangePartitioner._ceil_step(
--> 336             (upper_bound - lower_bound) / (self.max_partitions - 1)
    337         )
    338 

C:\Projects\facet\facet\src\facet\simulation\partition\_partition.py in _ceil_step(step)
    294             raise ValueError("arg step must be positive")
    295 
--> 296         return min(10 ** math.ceil(math.log10(step * m)) / m for m in [1, 2, 5])
    297 
    298     @staticmethod

C:\Projects\facet\facet\src\facet\simulation\partition\_partition.py in <genexpr>(.0)
    294             raise ValueError("arg step must be positive")
    295 
--> 296         return min(10 ** math.ceil(math.log10(step * m)) / m for m in [1, 2, 5])
    297 
    298     @staticmethod

ValueError: cannot convert float NaN to integer
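
The final frame can be reproduced without facet. The following is a minimal sketch, assuming the bounds passed to `_step_size` end up as NaN when the feature contains missing values (the issue does not show how the bounds are computed). `ceil_step` is a stand-in rebuilt from the two lines visible in the traceback, with the `step <= 0` condition assumed; `safe_bounds` is a hypothetical helper, not part of the library.

```python
import math


def ceil_step(step: float) -> float:
    # Stand-in for RangePartitioner._ceil_step, rebuilt from the traceback.
    # Assumption: the guard is `step <= 0`. NaN <= 0 is False, so NaN passes.
    if step <= 0:
        raise ValueError("arg step must be positive")
    return min(10 ** math.ceil(math.log10(step * m)) / m for m in [1, 2, 5])


def safe_bounds(values):
    # Hypothetical helper: drop missing values before deriving the bounds.
    finite = [v for v in values if not math.isnan(v)]
    return min(finite), max(finite)


try:
    ceil_step(float("nan"))
    nan_error = None
except ValueError as exc:
    # math.ceil cannot turn NaN into an integer, which is the reported error.
    nan_error = str(exc)

lower, upper = safe_bounds([1.0, float("nan"), 9.0])
step = ceil_step((upper - lower) / (5 - 1))  # 5 is an arbitrary partition count
```

If this assumption holds, either dropping missing values before fitting the partitioner or rejecting NaN bounds with an explicit error would avoid the failure inside `math.ceil`.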

Second error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-58-22d13aac8dd5> in <module>
----> 1 SimulationDrawer().draw(data=univariate_simulation, title=sim_feature)

C:\Projects\facet\facet\src\facet\simulation\viz\_draw.py in draw(self, data, title)
     73         if title is None:
     74             title = f"Simulation: {data.feature}"
---> 75         super().draw(data=data, title=title)
     76 
     77     @classmethod

C:\Projects\facet\pytools\src\pytools\viz\_viz.py in draw(self, data, title)
    104             # noinspection PyProtectedMember
    105             style._drawing_start(title)
--> 106             self._draw(data)
    107             # noinspection PyProtectedMember
    108             style._drawing_finalize()

C:\Projects\facet\facet\src\facet\simulation\viz\_draw.py in _draw(self, data)
     96             partitions=simulation_series.partitions,
     97             frequencies=simulation_series.frequencies,
---> 98             is_categorical_feature=data.partitioner.is_categorical,
     99         )
    100 

C:\Projects\facet\facet\src\facet\simulation\viz\_style.py in draw_uplift(self, feature, target, values_label, values_median, values_min, values_max, values_baseline, percentile_lower, percentile_upper, partitions, frequencies, is_categorical_feature)
    178 
    179         # add a horizontal line at y=0
--> 180         ax.axhline(y=values_baseline, linewidth=0.5)
    181 
    182         # remove the top and right spines

C:\Anaconda3\envs\facet-develop\lib\site-packages\matplotlib\axes\_axes.py in axhline(self, y, xmin, xmax, **kwargs)
    860         self._process_unit_info(ydata=y, kwargs=kwargs)
    861         yy = self.convert_yunits(y)
--> 862         scaley = (yy < ymin) or (yy > ymax)
    863 
    864         trans = self.get_yaxis_transform(which='grid')

TypeError: '>' not supported between instances of 'float' and 'method'
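
The message in the last frame is what Python produces when a float is compared with a bound method. The following is a minimal sketch, under the assumption (not confirmed by the issue) that a method object, rather than its return value, reaches the comparison inside `axhline`. `Baseline` and `value` are hypothetical names used only for illustration.

```python
class Baseline:
    # Hypothetical class for illustration only; not part of facet or matplotlib.
    def value(self) -> float:
        return 0.5


baseline = Baseline()

try:
    # Missing call parentheses: a float is compared with a bound method.
    _ = 1.0 > baseline.value
    type_error = None
except TypeError as exc:
    type_error = str(exc)

# Calling the method yields a float, so the comparison succeeds.
is_above = 1.0 > baseline.value()
```

Under that assumption, checking whether `values_baseline` (or the value it is compared against) is a method that was passed without being called would be a reasonable first step.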

Desktop:

  • OS: Windows
  • Browser: Chrome