BUG: pandas.read_sql missing table schema input #43585

Open
3 tasks done
vinodMS opened this issue Sep 15, 2021 · 0 comments
Labels
Bug · IO SQL (to_sql, read_sql, read_sql_query) · Needs Discussion (requires discussion from core team before further action)

Comments


vinodMS commented Sep 15, 2021

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the master branch of pandas.

Reproducible Example

import pandas as pd

pd.read_sql('test_data', 'postgres:///db_name')

Issue Description

read_sql is a wrapper around read_sql_query and read_sql_table. It checks whether the sql parameter contains a valid table name; if it does not, read_sql assumes that sql contains a SQL query and routes the request to read_sql_query.

I'm using a SQL database, so read_sql uses the SQLDatabase.has_table method to determine whether the sql argument is a valid table name.

The problem is that the following line in the has_table method needs a schema name whenever the table is not in the default schema:
return insp.has_table(name, schema or self.meta.schema)
However, no schema name is passed when the SQLDatabase class is instantiated, and none is given when read_sql calls has_table.
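The effect of a lookup that only searches the default schema can be reproduced with the standard library alone. The sketch below is an analogy, not pandas or SQLAlchemy code: it assumes SQLite's attached databases as a stand-in for schemas, and the `has_table` helper and the schema name `custom` are hypothetical.

```python
import sqlite3

# SQLite exposes each attached database as a schema, so "custom" plays the
# role of the non-default schema here (an assumption made for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS custom")
conn.execute("CREATE TABLE custom.test_data (id INTEGER)")


def has_table(conn, name, schema=None):
    """Hypothetical helper: look for `name` in one schema only (default: main)."""
    catalog = f"{schema or 'main'}.sqlite_master"
    row = conn.execute(
        f"SELECT 1 FROM {catalog} WHERE type = 'table' AND name = ?", (name,)
    ).fetchone()
    return row is not None


# Without a schema, the lookup searches the default schema and misses the table.
found_without_schema = has_table(conn, "test_data")
# With the schema supplied, the same table is found.
found_with_schema = has_table(conn, "test_data", schema="custom")
print(found_without_schema, found_with_schema)
```

When the lookup misses, read_sql would treat the bare table name as a query string, which is the misrouting described above.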

Expected Behavior

df = pd.read_sql('test_data', 'postgres:///db_name') should be able to find the 'test_data' table even when it lives in a custom (non-default) schema.

To fix the bug, the following could be done:

  1. Add a schema parameter to the read_sql method (similar to the one on read_sql_table).
  2. Update the line in read_sql that calls has_table to: _is_table_name = pandas_sql.has_table(sql, schema)
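The two steps above might look roughly like the following toy model. This is a sketch of the routing idea only, not the pandas implementation: `ToyDatabase` is a hypothetical stand-in for SQLDatabase, the schema names are made up, and `read_sql` here returns the name of the reader that would run rather than a DataFrame.

```python
class ToyDatabase:
    """Hypothetical stand-in for pandas' SQLDatabase; not the real class."""

    def __init__(self, tables, default_schema="public"):
        # tables: set of (schema, table_name) pairs that exist in the database
        self.tables = tables
        self.default_schema = default_schema

    def has_table(self, name, schema=None):
        # Same shape as the line quoted in the issue: fall back to the default schema
        return (schema or self.default_schema, name) in self.tables


def read_sql(sql, db, schema=None):
    """Sketch of the proposed routing with a new, optional `schema` argument."""
    if db.has_table(sql, schema):  # proposed change: forward `schema`
        return "read_sql_table"
    return "read_sql_query"


db = ToyDatabase({("analytics", "test_data")})
without_schema = read_sql("test_data", db)  # table missed, treated as a query
with_schema = read_sql("test_data", db, schema="analytics")  # table found
print(without_schema, with_schema)
```

Keeping `schema` optional with a default of None would leave existing calls to read_sql unchanged under this design.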

Happy to create a PR if agreed.

Installed Versions

INSTALLED VERSIONS

commit : None
python : 3.8.3.final.0
python-bits : 64
OS : Darwin
OS-release : 20.5.0
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.UTF-8

pandas : 1.0.5
numpy : 1.18.5
pytz : 2020.1
dateutil : 2.8.1
pip : 20.1.1
setuptools : 49.2.0.post20200714
Cython : 0.29.21
pytest : 5.4.3
hypothesis : None
sphinx : 3.1.2
blosc : None
feather : None
xlsxwriter : 1.2.9
lxml.etree : 4.5.2
html5lib : 1.1
pymysql : 1.0.2
psycopg2 : None
jinja2 : 2.11.2
IPython : 7.16.1
pandas_datareader: None
bs4 : 4.9.1
bottleneck : 1.3.2
fastparquet : None
gcsfs : None
lxml.etree : 4.5.2
matplotlib : 3.2.2
numexpr : 2.7.1
odfpy : None
openpyxl : 3.0.4
pandas_gbq : None
pyarrow : None
pytables : None
pytest : 5.4.3
pyxlsb : None
s3fs : None
scipy : 1.5.0
sqlalchemy : 1.3.18
tables : 3.6.1
tabulate : None
xarray : None
xlrd : 1.2.0
xlwt : 1.3.0
xlsxwriter : 1.2.9
numba : 0.50.1

vinodMS added the Bug and Needs Triage labels on Sep 15, 2021
mroeschke added the IO SQL and Needs Discussion labels and removed the Needs Triage label on Oct 1, 2021

2 participants