FEAT: Implement read_csv for omniscidb backend #2062
is there a way to add a test on CI for this?
@jreback I can create a new omnisci container with the needed test datasets inside. After that, it only remains to write a test.
@anmyachev thanks for the work on that! Maybe you can add a test that creates a dataframe, uses it to create a csv file (to_csv), and loads this csv using the backend.
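A minimal sketch of the round-trip test suggested above, assuming pandas is available. The helper name `csv_round_trip` is hypothetical, and `pd.read_csv` only stands in for the backend's loader, which is not shown here.

```python
import os
import tempfile

import pandas as pd


def csv_round_trip(df, load_csv):
    """Write df to a temporary CSV file and load it back with load_csv.

    load_csv is any callable that takes a file path; in a real test it
    would wrap the backend's CSV loading call (an assumption here).
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, 'data.csv')
        df.to_csv(path, index=False)  # do not write the pandas index column
        return load_csv(path)


expected = pd.DataFrame({'year': [2019, 2019, 2020], 'month_': [1, 2, 3]})

# pd.read_csv is a stand-in for the backend loader in this sketch.
result = csv_round_trip(expected, pd.read_csv)
pd.testing.assert_frame_equal(result, expected)
```

Passing the loader as a callable keeps the comparison logic independent of any one backend.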
hey @anmyachev it seems there are some errors on CI related to black formatting. To avoid that you can install git pre-commit hooks using
@anmyachev in general this PR looks very good.
I liked your approach to create the csv file inside the OmniSciDB container!
I just have some questions/comments :
- related to the clickhouse-driver, why was it pinned to `>=0.1.3`? doesn't `>=0.1.2` work?
- maybe you could change from `quoted` to `quotechar`, so it could be more generic.
- not sure, maybe @jreback could give us a direction, but maybe you can also add `read_csv` to `ir.TableExpr` just raising `NotImplementedError`, so another backend could implement the same method with the same parameters. Also you could move your tests from omniscidb/tests to tests/all ... so it would be easier when another backend implements this function. @jreback what are your thoughts about this?
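The third suggestion could look roughly like the sketch below. `BaseTable` and `CsvCapableTable` are hypothetical stand-ins (not `ir.TableExpr` or any real ibis class), and the parameter list is an assumption chosen only to illustrate a shared signature.

```python
class BaseTable:
    """Hypothetical shared base class that declares the common signature."""

    def read_csv(self, path, header=True, quotechar='"', delimiter=','):
        # Backends that cannot load CSV files keep this default behaviour.
        raise NotImplementedError(
            f'{type(self).__name__} does not implement read_csv'
        )


class CsvCapableTable(BaseTable):
    """Hypothetical backend table that overrides read_csv."""

    def __init__(self):
        self.loaded = []

    def read_csv(self, path, header=True, quotechar='"', delimiter=','):
        # A real backend would issue its bulk-load statement here;
        # this sketch only records the request.
        self.loaded.append((path, header, quotechar, delimiter))


table = CsvCapableTable()
table.read_csv('data.csv', quotechar="'")

try:
    BaseTable().read_csv('data.csv')
    raised = False
except NotImplementedError:
    raised = True
```

Keeping the same parameters on the base method means a shared test can call `read_csv` the same way for every backend and skip the ones that raise.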
IIRC this just efficiently puts a file into the db from a csv, so it is a data-loading operation? I know postgresql supports this as well, not sure if it's implemented. This is an alternative way of constructing a table (e.g. we normally introspect an existing one, or use the Client to create one). Have to think about whether we actually should add any api for this. For now I would not, but certainly raise an issue.
…niscidb_load_data
@xmnlab PR is ready for review. While there is no exact decision whether all backends need to have this function, I suggest leaving it only for Omnisci and adding it already in release 1.3. This way we can get feedback from users.
@jreback any feedback about this?
…niscidb_load_data
Green CI depends on #2104
@anmyachev could you check CI for py36 and py37? py38 is green because currently it skips tests for
These two tests were fixed in the last commit.
@jreback PR is ready for review.
ibis/omniscidb/tests/test_client.py
        ('month_', 'int32'),
    ]
)
con.create_table(t_name, schema=schema)
use a fixture to handle resource creation & destruction rather than using a try/finally (you can do that in the fixture itself)
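A minimal sketch of what this could look like with pytest, assuming a yield-based fixture. `FakeClient`, `temporary_table`, the fixture names, and the table name are all hypothetical; the real test would use the backend's own client.

```python
import contextlib

import pytest


class FakeClient:
    """Hypothetical stand-in for a backend client; not the ibis API."""

    def __init__(self):
        self.tables = {}

    def create_table(self, name, schema=None):
        self.tables[name] = schema

    def drop_table(self, name):
        del self.tables[name]


@contextlib.contextmanager
def temporary_table(con, name, schema):
    """Create a table and always drop it again, even if the body fails."""
    con.create_table(name, schema=schema)
    try:
        yield name
    finally:
        con.drop_table(name)


@pytest.fixture
def con():
    return FakeClient()


@pytest.fixture
def temp_table(con):
    # Setup runs before the yield, teardown after the test finishes.
    schema = [('month_', 'int32')]
    with temporary_table(con, 'tmp_read_csv', schema) as name:
        yield name


def test_table_exists(con, temp_table):
    # The test body contains no cleanup code of its own.
    assert temp_table in con.tables
```

With this layout the creation and cleanup live in one place, and each test that needs the table only has to request `temp_table`.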
thanks @anmyachev
continuation of #1977
The advantage of this approach is the avoidance of additional memory costs.
Problem: