Check schema with Decimal not working in Pandera #926

Closed
tpcgold opened this issue Aug 29, 2022 · 4 comments · Fixed by #956
Labels
bug Something isn't working

Comments

@tpcgold

tpcgold commented Aug 29, 2022

It seems that a pandera schema cannot check a column against a parameterized Decimal type:

import pandera as pa

schema = pa.DataFrameSchema({
    'sample': pa.Column(
        pa.Decimal(precision=13, scale=2))
})

Traceback (most recent call last):
  File "G:\My Drive\PycharmProjects\Development\venv\lib\site-packages\pandera\engines\pandas_engine.py", line 153, in dtype
    return engine.Engine.dtype(cls, data_type)
  File "G:\My Drive\PycharmProjects\Development\venv\lib\site-packages\pandera\engines\engine.py", line 210, in dtype
    raise TypeError(
TypeError: Data type 'Decimal(13, 2)' not understood by Engine.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\stefa\AppData\Local\Programs\Python\Python39\lib\code.py", line 90, in runcode
    exec(code, self.locals)
  File "<input>", line 108, in <module>
  File "G:\My Drive\PycharmProjects\Development\venv\lib\site-packages\pandera\schema_components.py", line 81, in __init__
    super().__init__(
  File "G:\My Drive\PycharmProjects\Development\venv\lib\site-packages\pandera\schemas.py", line 1713, in __init__
    self.dtype = dtype  # type: ignore
  File "G:\My Drive\PycharmProjects\Development\venv\lib\site-packages\pandera\schemas.py", line 1820, in dtype
    self._dtype = pandas_engine.Engine.dtype(value) if value else None
  File "G:\My Drive\PycharmProjects\Development\venv\lib\site-packages\pandera\engines\pandas_engine.py", line 171, in dtype
    np_or_pd_dtype = pd.api.types.pandas_dtype(data_type)
  File "G:\My Drive\PycharmProjects\Development\venv\lib\site-packages\pandas\core\dtypes\common.py", line 1777, in pandas_dtype
    npdtype = np.dtype(dtype)
TypeError: Cannot interpret 'DataType(Decimal(13, 2))' as a data type
@tpcgold tpcgold added the bug Something isn't working label Aug 29, 2022
@tpcgold tpcgold changed the title Check schema with Double not working in Pandera Check schema with Decimal not working in Pandera Aug 29, 2022
@tpcgold
Author

tpcgold commented Aug 29, 2022

Although this might work for some developers:

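# Assumption: Decimal here is Python's decimal.Decimal; the original snippet did not show the import.
from decimal import Decimal

import pandera as pa

schema = pa.DataFrameSchema({
    'sample': pa.Column(
        Decimal)
})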

However, Decimal(28, 0) and, for example, Decimal(13, 1) are not the same type, so the schema check should be able to distinguish them.
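
To illustrate why precision and scale matter, here is a minimal sketch using only Python's standard `decimal` module. It is not pandera's implementation; the helper name `fits` is hypothetical, and it assumes the usual fixed-point meaning of the two parameters (precision is the total number of digits, scale is the number of digits after the decimal point).

```python
from decimal import Decimal


def fits(value: Decimal, precision: int, scale: int) -> bool:
    """Hypothetical helper: does `value` fit a Decimal(precision, scale) type?

    Assumes precision = total digits and scale = digits after the point.
    Trailing zeros are counted as written, so Decimal("1.50") needs scale >= 2.
    """
    if not value.is_finite():
        # NaN and infinities have no digit layout to compare.
        return False
    _sign, digits, exponent = value.as_tuple()
    # Digits after the decimal point; a non-negative exponent means none.
    frac_digits = max(0, -exponent)
    # Digits before the decimal point.
    int_digits = max(0, len(digits) + exponent)
    return frac_digits <= scale and int_digits <= precision - scale


value = Decimal("123.45")
print(fits(value, 13, 2))  # True: two fractional digits are allowed
print(fits(value, 13, 1))  # False: only one fractional digit is allowed
print(fits(value, 28, 0))  # False: no fractional digits are allowed
```

The same value passes one parameterization and fails the others, which is why a schema that accepts only the bare `Decimal` type cannot express the intended check.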

@NathanEmb

I encountered this issue as well, both when specifying a custom scale and when using the default values.

@cosmicBboy
Collaborator

This is definitely a bug!

Looking into a fix...

In the meantime, you can use the pandas_engine data type documented here: https://pandera.readthedocs.io/en/stable/reference/generated/pandera.engines.pandas_engine.Decimal.html#pandera.engines.pandas_engine.Decimal

import pandera as pa
from pandera.engines.pandas_engine import Decimal

schema = pa.DataFrameSchema({
    'sample': pa.Column(Decimal(13, 1))
})
print(schema)

@cosmicBboy cosmicBboy mentioned this issue Oct 7, 2022
@cosmicBboy
Collaborator

@tpcgold @NathanEmb #956 should address this issue
