Skip to content

BUG: using dtype=str in pd.read_sql_query casts nans to strings instead of nan #45091

@Husseinjd

Description

@Husseinjd

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the master branch of pandas.

Reproducible Example

# Table contains columns with nulls 
import sqlite3
import pandas as pd

con = sqlite3.connect('example.db')
cur = con.cursor()
cur.execute('''CREATE TABLE IF NOT EXISTS Sample
               (col1,col2)''')
cur.execute("INSERT INTO Sample VALUES ('val1',NULL)")
con.commit()

df = pd.read_sql_query("SELECT * from Sample", con, dtype=str)
con.close()

print(df.info()) 
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2 entries, 0 to 1
Data columns (total 2 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   col1    2 non-null      object
 1   col2    2 non-null      object
dtypes: object(2)
memory usage: 160.0+ bytes
print(df.to_markdown())

|    | col1   | col2   |
|---:|:-------|:-------|
|  0 | val1   | None   |
|  1 | val1   | None   |

Issue Description

Parsing dtypes to str on all columns results on ignoring nans and parsing nulls to strings instead of keeping them as nans.

Expected Behavior

Similar to the pd.read_csv(...,dtype=str) , where nans are taken into account.

Installed Versions

pandas=1.3.5 Python=3.8 pyodbc=4.0.3

Metadata

Metadata

Assignees

No one assigned

    Labels

    BugNeeds TriageIssue that has not been reviewed by a pandas team member

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions