
insert_dataframe() requires the DataFrame's column order to exactly match the ClickHouse table's column order #245

Closed
zsmeijin opened this issue Aug 25, 2021 · 1 comment

Comments

@zsmeijin

zsmeijin commented Aug 25, 2021

Describe the bug

In the BlockOutputStream class, I found the following code in the write method:

for i, (col_name, col_type) in enumerate(block.columns_with_types):
    write_binary_str(col_name, self.fout)
    write_binary_str(col_type, self.fout)

    if n_columns:
        try:
            items = block.get_column_by_index(i)
        except IndexError:
            raise ValueError('Different rows length')
        write_column(self.context, col_name, col_type, items,
                     self.fout, types_check=block.types_check)

It seems the column data in the block is fetched by index rather than by column name. So if the DataFrame's column order differs from the ClickHouse table's column order, data can be written under the wrong column and with the wrong data type.
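
To make the mismatch concrete, here is a minimal pure-Python sketch. It does not use clickhouse-driver at all: `table_columns`, `frame_columns`, `pair_by_index`, and `pair_by_name` are hypothetical names that only mimic the positional lookup in the loop above and contrast it with a lookup by name.

```python
# Hypothetical stand-ins: the table schema in table order, and the
# DataFrame's columns held in a different order.
table_columns = [("id", "UInt32"), ("name", "String")]
frame_columns = {"name": ["a", "b"], "id": [1, 2]}


def pair_by_index(columns_with_types, data_by_name):
    """Pair each table column with the data found at the same position."""
    data = list(data_by_name.values())  # positional view of the frame
    return {
        col_name: (col_type, data[i])
        for i, (col_name, col_type) in enumerate(columns_with_types)
    }


def pair_by_name(columns_with_types, data_by_name):
    """Pair each table column with the data that carries the same name."""
    return {
        col_name: (col_type, data_by_name[col_name])
        for col_name, col_type in columns_with_types
    }


by_index = pair_by_index(table_columns, frame_columns)
by_name = pair_by_name(table_columns, frame_columns)
print(by_index["id"])  # ('UInt32', ['a', 'b'])  strings sent as UInt32
print(by_name["id"])   # ('UInt32', [1, 2])
```

With the positional pairing, the `id` column receives the strings that belong to `name`, which is the kind of type mix-up described above.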

For now, reordering the DataFrame's columns to match the table before inserting is enough to work around the problem. I'm not sure whether this is a deliberate design choice to improve insert performance, but at the very least this additional requirement should be documented.
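
A rough sketch of that workaround, under some assumptions: `order_like_table` is a hypothetical helper, not part of clickhouse-driver, and the table's column order is assumed to be known to the caller (for example, copied from the table definition). It works on plain lists of column names so that it stays independent of pandas.

```python
def order_like_table(frame_columns, table_columns):
    """Return the frame's column names arranged in the table's order.

    Raises ValueError if the two sets of names differ, so a missing or
    extra column fails loudly instead of being silently misaligned.
    """
    missing = [c for c in table_columns if c not in frame_columns]
    extra = [c for c in frame_columns if c not in table_columns]
    if missing or extra:
        raise ValueError(f"column mismatch: missing={missing}, extra={extra}")
    return list(table_columns)


# Hypothetical example: the frame has (name, id), the table has (id, name).
ordered = order_like_table(["name", "id"], ["id", "name"])
print(ordered)  # ['id', 'name']
```

With pandas, the resulting list could be applied as `df = df[ordered]` before calling `insert_dataframe()`, so that the positional lookup lines up with the table.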

To Reproduce

Use the insert_dataframe() method to insert a DataFrame whose column order differs from the ClickHouse table's column order.

Expected behavior

The DataFrame's columns and the ClickHouse table's columns should be matched by name rather than by index.

Versions

Version of package with the problem: 0.2.1
Python version: 3.8.9

@xzkostyan
Member

A fix that supports arbitrary column order was merged into the master branch.
