-
Notifications
You must be signed in to change notification settings - Fork 16.5k
Description
Currently (as far as I understand reading the code base, the docs in pandas etc and making some tests in my code) you can only have a string (or list of them) as query in the methods related to queries in DBApi: get_pandas_df, run, get_records and get_first.
It would be nice being able to read the queries from a file instead of a string in the script. The code for getting the query could be something like the following:
with open(f'{filename}') as f:
raw_query = f.read().rstrip("\n")
sql_query = eval(f'f"""{raw_query}"""')
And after that, you just have to pass sql_query to any method.
In a regular python script, considering a file called 'test_query.sql' with "SELECT * FROM {table}", the following will work:
table = 'orders'
with open(f'{filename}') as f:
raw_query = f.read().rstrip("\n")
sql_query = eval(f'f"""{raw_query}"""')
And sql_query will be: "SELECT * FROM orders"
However, I am having troubles finding out a way to have this working correctly in the methods of DBApi hook, mainly for passing the parameters. I don't come up with any solution that doesn't involve to pass every parameter that you have in the query, which can be very painful if you have a very long query with a lot of parameters.
So, ideally (considering the same file as above), the following will work:
table = 'orders'
redshift = PostgresHook('redshift_conn')
df = redshift.get_pandas_df('test_query.sql')
Do you have any ideas? I will open the PR and work on it if a good idea arises.