I'm using gspread_dataframe to import data from Google Sheets. I'd like to force all imported data to strings independently of the cell type used in the worksheet.
The documentation mentions that I can use all options supported by the Pandas text parsing readers. In theory, dtype=str or dtype=object should preserve all values without interpreting them. In practice the values are still interpreted; I'm not sure whether it's a bug or I'm doing something wrong.
In the scenario below, the imported dataframe has its decimals dropped because all Amounts are in number format in the worksheet. If I change the column format in the worksheet to 'string', I get the desired outcome, but I'm trying to avoid tweaking the file before importing the data.
import gspread
import gspread_dataframe as gsframe

gsframe.get_as_dataframe(
    worksheet=sheet,
    header=0,
    dtype=str,
    usecols=cols,
    skiprows=row_offset,
    skip_blank_lines=True,
).dropna(axis=0, how='all').fillna('')
Worksheet            Imported Dataframe
string  numbers      string  string
Name    Amount       Name    Amount
A       -25.00       A       -25
B       -63.00       B       -63
C        20.00       C        20
D       -10.00       D       -10
                             ▲ dropped decimals
Expected outcome
Worksheet            Imported Dataframe
string  numbers      string  string
Name    Amount       Name    Amount
A       -25.00       A       -25.00
B       -63.00       B       -63.00
C        20.00       C        20.00
D       -10.00       D       -10.00
MrBeardedGuy changed the title from "Data gets interpreted anyways even when using dtype=str" to "Data gets interpreted even when using dtype=str" on Dec 18, 2021
Closing this one; I've found the issue. The interpretation happens when the data is written back to the sheet. New rows in Google Sheets are created with the 'Automatic' format by default, and that's what causes the decimals inconsistency.
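For anyone landing here with the same symptom, one possible workaround is to ask the Sheets API not to parse the values on write. This is only a sketch: the write path is not shown in this issue, so it assumes the rows are appended with gspread's `Worksheet.append_rows`, and the helper name `append_rows_as_text` is made up for illustration. With `value_input_option="RAW"` the values are stored as sent instead of being parsed as if typed into the UI, so a string such as "-25.00" should stay text.

```python
def append_rows_as_text(worksheet, rows):
    """Append rows to a worksheet, sending every cell as an unparsed string.

    `worksheet` is assumed to be a gspread Worksheet (or anything exposing
    a compatible `append_rows` method). This helper is hypothetical and is
    not part of gspread or gspread_dataframe.
    """
    # Convert every cell to str so numbers are sent as text, not as numbers.
    text_rows = [[str(cell) for cell in row] for row in rows]

    # "RAW" tells the Sheets API to store values as-is; "USER_ENTERED" would
    # parse them as if typed into the UI, turning "-25.00" into a number.
    worksheet.append_rows(text_rows, value_input_option="RAW")
    return text_rows
```

Usage would look like `append_rows_as_text(sheet, [["E", "-5.00"]])`, where `sheet` is the same worksheet object passed to `get_as_dataframe`. Setting the column format to plain text in the sheet itself remains an alternative if changing the write call is not an option.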