Fix to check if values are integer or float and convert accordingly. (#21277)

This change prevents data loss by checking whether the non-null values in a column are integers or floats and converting accordingly. Integer-valued columns are still cast to pd.Int64Dtype(), while columns that contain floating-point values are cast to pd.Float64Dtype() instead. Because the array may hold floating-point values, this avoids the exception raised when the array cannot be safely cast to the integer dtype.
Fixes the error raised when using mysql_to_s3 (TypeError: cannot safely cast non-equivalent object to int64), #16919.
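
The failure described above can be reproduced outside Airflow. The following is a minimal sketch, assuming a pandas version that provides the nullable `Float64Dtype` (1.2 or later); the sample values and variable names are illustrative and do not come from the original issue.

```python
import numpy as np
import pandas as pd

# A float column with one missing value and one genuinely fractional value.
values = pd.Series([1.0, 2.5, np.nan])

# Mirror the NaN -> None replacement used in sql_to_s3.py, which yields
# an object array.
as_object = pd.Series(np.where(values.isnull(), None, values))

# Casting to the nullable integer dtype fails because 2.5 has no exact
# integer equivalent.
try:
    as_object.astype(pd.Int64Dtype())
    int_cast_error = None
except TypeError as exc:
    int_cast_error = str(exc)

# The nullable float dtype accepts the same values and keeps the missing entry.
as_float = as_object.astype(pd.Float64Dtype())

print(int_cast_error)
print(as_float.dtype)
```

The exact wording of the error message may vary between pandas versions, so the sketch only relies on the exception type.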
SasanAhmadi committed Feb 6, 2022
1 parent 0bcca55 commit 0a6ea57
Showing 1 changed file with 5 additions and 1 deletion.
6 changes: 5 additions & 1 deletion airflow/providers/amazon/aws/transfers/sql_to_s3.py
@@ -127,10 +127,14 @@ def _fix_int_dtypes(df: pd.DataFrame) -> None:
             if "float" in df[col].dtype.name and df[col].hasnans:
                 # inspect values to determine if dtype of non-null values is int or float
                 notna_series = df[col].dropna().values
-                if np.isclose(notna_series, notna_series.astype(int)).all():
+                if np.equal(notna_series, notna_series.astype(int)).all():
                     # set to dtype that retains integers and supports NaNs
                     df[col] = np.where(df[col].isnull(), None, df[col])
                     df[col] = df[col].astype(pd.Int64Dtype())
+                elif np.isclose(notna_series, notna_series.astype(int)).all():
+                    # set to float dtype that retains floats and supports NaNs
+                    df[col] = np.where(df[col].isnull(), None, df[col])
+                    df[col] = df[col].astype(pd.Float64Dtype())

     def execute(self, context: 'Context') -> None:
         sql_hook = self._get_hook()
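
To see how the two branches in this hunk behave, the patched logic can be copied into a standalone function and run on a small DataFrame. This is a sketch for experimentation only: the function name `fix_int_dtypes`, the column names, and the sample values are hypothetical, and the real method lives on the operator class in `sql_to_s3.py`.

```python
import numpy as np
import pandas as pd


def fix_int_dtypes(df: pd.DataFrame) -> None:
    """Standalone copy of the patched branch logic (hypothetical helper)."""
    for col in df:
        if "float" in df[col].dtype.name and df[col].hasnans:
            notna_series = df[col].dropna().values
            if np.equal(notna_series, notna_series.astype(int)).all():
                # Every non-null value is exactly a whole number.
                df[col] = np.where(df[col].isnull(), None, df[col])
                df[col] = df[col].astype(pd.Int64Dtype())
            elif np.isclose(notna_series, notna_series.astype(int)).all():
                # Values are within np.isclose tolerance of whole numbers
                # but not exactly equal to them.
                df[col] = np.where(df[col].isnull(), None, df[col])
                df[col] = df[col].astype(pd.Float64Dtype())


df = pd.DataFrame(
    {
        "whole": [1.0, 2.0, np.nan],
        "near_whole": [1.0000000001, 2.0, np.nan],
        "fractional": [1.5, 2.25, np.nan],
    }
)
fix_int_dtypes(df)
dtypes = {col: str(df[col].dtype) for col in df}
print(dtypes)
```

In this sketch the `whole` column becomes `Int64` and the `near_whole` column becomes `Float64`. The `fractional` column matches neither condition, so it is left as plain `float64` with its NaN intact, which also avoids the unsafe integer cast.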
