Cannot Convert float NaN to Int #1784

Closed
epi-cvu opened this issue Feb 8, 2024 · 1 comment
Labels
bug Something isn't working resolution:duplicate This issue or pull request already exists

Comments


epi-cvu commented Feb 8, 2024

Environment Details

Please indicate the following details about the environment in which you found the bug:

  • SDV version: Latest
  • Python version: 3.10.11
  • Operating System: Windows 11

Error Description

I was following the HMASynthesizer tutorial from the documentation. After the fitting process, I tried sampling with scale 1 and with scale 0.1. Both attempts failed with the error "Cannot convert float NaN to integer".

Steps to reproduce

To reproduce this, download the CSV files from the MIMIC-IV (PhysioNet) dataset, specifically "patient" and "admissions", and put them in the same folder. I followed all the steps of the multi-table synthetic data tutorial with HMASynthesizer and waited for the fit to finish. After the fit step, I tried sampling with a scale of 1, and it gave me "Cannot convert float NaN to integer". I also tried changing the computer representation of the columns anchor_year and anchor_age to Int32, but I still get the same message.

Paste the command(s) you ran and the output.
If there was a crash, please include the traceback here.
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[82], line 1
----> 1 synthesizer.sample(scale=0.1)

File c:\Users\Charles VU\Documents\GitHub\Stage\.venv\lib\site-packages\sdv\multi_table\base.py:393, in BaseMultiTableSynthesizer.sample(self, scale)
    389     raise SynthesizerInputError(
    390         f"Invalid parameter for 'scale' ({scale}). Please provide a number that is >0.0.")
    392 with self._set_temp_numpy_seed():
--> 393     sampled_data = self._sample(scale=scale)
    395 return sampled_data

File c:\Users\Charles VU\Documents\GitHub\Stage\.venv\lib\site-packages\sdv\sampling\hierarchical_sampler.py:222, in BaseHierarchicalSampler._sample(self, scale)
    220     LOGGER.info(f'Sampling {num_rows} rows from table {table}')
    221     sampled_data[table] = self._sample_rows(synthesizer, num_rows)
--> 222     self._sample_children(table_name=table, sampled_data=sampled_data)
    224 added_relationships = set()
    225 for relationship in self.metadata.relationships:

File c:\Users\Charles VU\Documents\GitHub\Stage\.venv\lib\site-packages\sdv\sampling\hierarchical_sampler.py:142, in BaseHierarchicalSampler._sample_children(self, table_name, sampled_data)
    140 if child_name not in sampled_data:  # Sample based on only 1 parent
    141     for _, row in sampled_data[table_name].iterrows():
--> 142         self._add_child_rows(
    143             child_name=child_name,
    144             parent_name=table_name,
    145             parent_row=row,
    146             sampled_data=sampled_data
    147         )
    149     if child_name not in sampled_data:  # No child rows sampled, force row creation
    150         foreign_key = self.metadata._get_foreign_keys(table_name, child_name)[0]

File c:\Users\Charles VU\Documents\GitHub\Stage\.venv\lib\site-packages\sdv\sampling\hierarchical_sampler.py:108, in BaseHierarchicalSampler._add_child_rows(self, child_name, parent_name, parent_row, sampled_data, num_rows)
    105     num_rows = self._get_num_rows_from_parent(parent_row, child_name, foreign_key)
    106 child_synthesizer = self._recreate_child_synthesizer(child_name, parent_name, parent_row)
--> 108 sampled_rows = self._sample_rows(child_synthesizer, num_rows)
    110 if len(sampled_rows):
    111     parent_key = self.metadata.tables[parent_name].primary_key

File c:\Users\Charles VU\Documents\GitHub\Stage\.venv\lib\site-packages\sdv\sampling\hierarchical_sampler.py:71, in BaseHierarchicalSampler._sample_rows(self, synthesizer, num_rows)
     69 if num_rows is None:
     70     num_rows = synthesizer._num_rows
---> 71 return synthesizer._sample_batch(int(num_rows), keep_extra_columns=True)

File c:\Users\Charles VU\Documents\GitHub\Stage\.venv\lib\site-packages\sdv\single_table\base.py:602, in BaseSingleTableSynthesizer._sample_batch(self, batch_size, max_tries, conditions, transformed_conditions, float_rtol, progress_bar, output_file_path, keep_extra_columns)
    600 while num_valid < batch_size and counter < max_tries:
    601     prev_num_valid = num_valid
--> 602     sampled, num_valid = self._sample_rows(
    603         num_rows_to_sample,
    604         conditions,
    605         transformed_conditions,
    606         float_rtol,
    607         sampled,
    608         keep_extra_columns
    609     )
    611     num_new_valid_rows = num_valid - prev_num_valid
    612     num_increase = min(num_new_valid_rows, remaining)

File c:\Users\Charles VU\Documents\GitHub\Stage\.venv\lib\site-packages\sdv\single_table\base.py:519, in BaseSingleTableSynthesizer._sample_rows(self, num_rows, conditions, transformed_conditions, float_rtol, previous_rows, keep_extra_columns)
    516     except NotImplementedError:
    517         raw_sampled = self._sample(num_rows)
--> 519 sampled = self._data_processor.reverse_transform(raw_sampled)
    520 if keep_extra_columns:
    521     input_columns = self._data_processor._hyper_transformer._input_columns

File c:\Users\Charles VU\Documents\GitHub\Stage\.venv\lib\site-packages\sdv\data_processing\data_processor.py:851, in DataProcessor.reverse_transform(self, data, reset_keys)
    849 try:
    850     if not data.empty:
--> 851         reversed_data = self._hyper_transformer.reverse_transform_subset(
    852             data[reversible_columns]
    853         )
    854 except rdt.errors.NotFittedError:
    855     LOGGER.info(f'HyperTransformer has not been fitted for table {self.table_name}')

File c:\Users\Charles VU\Documents\GitHub\Stage\.venv\lib\site-packages\rdt\hyper_transformer.py:887, in HyperTransformer.reverse_transform_subset(self, data)
    876 def reverse_transform_subset(self, data):
    877     """Revert the transformations for a subset of the fitted columns.
    878
    879     Args:
   (...)
    885             Reversed subset.
    886     """
--> 887     return self._reverse_transform(data, prevent_subset=False)

File c:\Users\Charles VU\Documents\GitHub\Stage\.venv\lib\site-packages\rdt\hyper_transformer.py:870, in HyperTransformer._reverse_transform(self, data, prevent_subset)
    868         output_columns = transformer.get_output_columns()
    869         if output_columns and set(output_columns).issubset(data.columns):
--> 870             data = transformer.reverse_transform(data)
    872 reversed_columns = self._subset(self._input_columns, data.columns)
    874 return data.reindex(columns=reversed_columns)

File c:\Users\Charles VU\Documents\GitHub\Stage\.venv\lib\site-packages\rdt\transformers\base.py:55, in random_state.<locals>.wrapper(self, *args, **kwargs)
     53 method_name = function.__name__
     54 with set_random_states(self.random_states, method_name, self.set_random_state):
---> 55     return function(self, *args, **kwargs)

File c:\Users\Charles VU\Documents\GitHub\Stage\.venv\lib\site-packages\rdt\transformers\base.py:477, in BaseTransformer.reverse_transform(self, data)
    475 data = data.copy()
    476 columns_data = self._get_columns_data(data, self.output_columns)
--> 477 reversed_data = self._reverse_transform(columns_data)
    478 data = data.drop(self.output_columns, axis=1)
    479 data = self._add_columns_to_data(data, reversed_data, self.columns)

File c:\Users\Charles VU\Documents\GitHub\Stage\.venv\lib\site-packages\rdt\transformers\categorical.py:195, in UniformEncoder._reverse_transform(self, data)
    192         labels.append(key)
    194 result = pd.cut(data, bins=bins, labels=labels, include_lowest=True)
--> 195 return result.replace(nan_name, np.nan).astype(self.dtype)

File c:\Users\Charles VU\Documents\GitHub\Stage\.venv\lib\site-packages\pandas\core\generic.py:6534, in NDFrame.astype(self, dtype, copy, errors)
   6530     results = [ser.astype(dtype, copy=copy) for _, ser in self.items()]
   6532 else:
   6533     # else, only a single dtype is given
-> 6534     new_data = self._mgr.astype(dtype=dtype, copy=copy, errors=errors)
   6535     res = self._constructor_from_mgr(new_data, axes=new_data.axes)
   6536     return res.__finalize__(self, method="astype")

File c:\Users\Charles VU\Documents\GitHub\Stage\.venv\lib\site-packages\pandas\core\internals\managers.py:414, in BaseBlockManager.astype(self, dtype, copy, errors)
    411 elif using_copy_on_write():
    412     copy = False
--> 414 return self.apply(
    415     "astype",
    416     dtype=dtype,
    417     copy=copy,
    418     errors=errors,
    419     using_cow=using_copy_on_write(),
    420 )

File c:\Users\Charles VU\Documents\GitHub\Stage\.venv\lib\site-packages\pandas\core\internals\managers.py:354, in BaseBlockManager.apply(self, f, align_keys, **kwargs)
    352         applied = b.apply(f, **kwargs)
    353     else:
--> 354         applied = getattr(b, f)(**kwargs)
    355     result_blocks = extend_blocks(applied, result_blocks)
    357 out = type(self).from_blocks(result_blocks, self.axes)

File c:\Users\Charles VU\Documents\GitHub\Stage\.venv\lib\site-packages\pandas\core\internals\blocks.py:616, in Block.astype(self, dtype, copy, errors, using_cow)
    596 """
    597 Coerce to the new dtype.
    598
   (...)
    612 Block
    613 """
    614 values = self.values
--> 616 new_values = astype_array_safe(values, dtype, copy=copy, errors=errors)
    618 new_values = maybe_coerce_values(new_values)
    620 refs = None

File c:\Users\Charles VU\Documents\GitHub\Stage\.venv\lib\site-packages\pandas\core\dtypes\astype.py:238, in astype_array_safe(values, dtype, copy, errors)
    235     dtype = dtype.numpy_dtype
    237 try:
--> 238     new_values = astype_array(values, dtype, copy=copy)
    239 except (ValueError, TypeError):
    240     # e.g. _astype_nansafe can fail on object-dtype of strings
    241     #  trying to convert to float
    242     if errors == "ignore":

File c:\Users\Charles VU\Documents\GitHub\Stage\.venv\lib\site-packages\pandas\core\dtypes\astype.py:180, in astype_array(values, dtype, copy)
    176     return values
    178 if not isinstance(values, np.ndarray):
    179     # i.e. ExtensionArray
--> 180     values = values.astype(dtype, copy=copy)
    182 else:
    183     values = _astype_nansafe(values, dtype, copy=copy)

File c:\Users\Charles VU\Documents\GitHub\Stage\.venv\lib\site-packages\pandas\core\arrays\categorical.py:550, in Categorical.astype(self, dtype, copy)
    547     return super().astype(dtype, copy=copy)
    549 elif dtype.kind in "iu" and self.isna().any():
--> 550     raise ValueError("Cannot convert float NaN to integer")
    552 elif len(self.codes) == 0 or len(self.categories) == 0:
    553     result = np.array(
    554         self,
    555         dtype=dtype,
    556         copy=copy,
    557     )

ValueError: Cannot convert float NaN to integer
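
For context, the last frames of the traceback can be reproduced in isolation with pandas alone. This is only a sketch of the failing step, not SDV's or RDT's actual code: the bin edges and labels are made-up values, and the NaN input simply stands in for any sampled value that pd.cut cannot place in a bin.

```python
import numpy as np
import pandas as pd

# Hypothetical values in the transformed [0, 1] space. The NaN is an
# assumption standing in for a value that cannot be mapped to a category.
data = pd.Series([0.1, 0.6, np.nan])
bins = [0.0, 0.5, 1.0]   # made-up bin edges
labels = [2110, 2180]    # made-up integer categories

# Same call shape as the UniformEncoder frame in the traceback.
result = pd.cut(data, bins=bins, labels=labels, include_lowest=True)
missing = int(result.isna().sum())  # one entry has no category

message = None
try:
    # Casting a categorical that contains a missing value to an integer
    # dtype is what raises in Categorical.astype.
    result.astype('int64')
except ValueError as error:
    message = str(error)

print(missing, message)
```

If the error appears here, the integer dtype is only where the failure surfaces. The missing value was produced earlier, when the sampled value was mapped back to a category.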

epi-cvu added the bug (Something isn't working) and new (Automatic label applied to new issues) labels on Feb 8, 2024
epi-cvu closed this as completed on Feb 8, 2024
Contributor

npatki commented Feb 9, 2024

Seems like this issue was resolved by modifying the default distribution to 'norm'. The root cause is likely #1691. The good news is that this issue has already been resolved and the fix should be available in the upcoming SDV release (0.10.0). After this release, you can leave the default distribution as-is and it should work without failing.
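
A minimal sketch of the workaround mentioned above, for anyone on a release without the fix. It assumes the SDV 1.x HMASynthesizer method set_table_parameters(table_name, table_parameters) and that it is called before fit. The helper name use_norm_distribution and the table names in the usage comment are hypothetical and should be replaced with the names from your own metadata.

```python
def use_norm_distribution(synthesizer, table_names):
    """Set default_distribution='norm' for each listed table.

    `synthesizer` is assumed to be an unfitted sdv.multi_table.HMASynthesizer
    (or any object exposing the same set_table_parameters method).
    """
    for table_name in table_names:
        synthesizer.set_table_parameters(
            table_name=table_name,
            table_parameters={'default_distribution': 'norm'},
        )
    return synthesizer


# Usage sketch (not run here; requires SDV plus your own metadata and data):
# from sdv.multi_table import HMASynthesizer
# synthesizer = HMASynthesizer(metadata)
# use_norm_distribution(synthesizer, ['patients', 'admissions'])
# synthesizer.fit(data)
# synthetic_data = synthesizer.sample(scale=0.1)
```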

npatki added the resolution:duplicate label (This issue or pull request already exists) and removed the new label (Automatic label applied to new issues) on Feb 9, 2024