Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix column names using df in upload #964

Merged
merged 5 commits into from
Sep 12, 2019
Merged

Conversation

oleurud
Copy link
Contributor

@oleurud oleurud commented Sep 10, 2019

Fix: #947 & #914

Copy link
Contributor

@esloho esloho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

@@ -89,10 +86,27 @@ def _copyfrom(self, normalized_column_names, with_lnglat):
columns_origin,
with_lnglat,
geom_col,
enc_type)
enc_type,
len(columns_normalized))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this length needed now? Oh, is it because with lnglat columns_origin would have a column less than normalized columns, which has the_geom?

Copy link
Contributor Author

@oleurud oleurud Sep 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now, how we get the columns is a bit weird, basically, because we have the responsibility spread in several methods. With that, I want to ensure the same number of columns in the COPY query and in the _rows method. And yes, it smells.

Case 1: _rows is receiving:

  • cols: the column names, but without the geometry one
  • geom_col: the geom one or None
  • with_lnglat: to generate geom column by latitude and longitude columns

We could create 2 geom columns (if geom_col and with_lnglat are set). The number of column allow us to avoid that.

Case 2: filtering column names:
We are filtering column names, getting the geom in different places, getting columns in different places (table creation, copy query, _rows), ... As a result, we could get a different number of columns in different places. This is the real source of the bad smell

Copy link
Member

@jgoizueta jgoizueta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart from the fix I think it's an improvement to have moved some stuff to _copyfrom_column_names (and it allowed for the simple tests). I find it still confusing that origin/normalized columns business with columns in different orders etc. (it took me a while to understand and accept it was correct). I would be nice to refactor that but I don't think that's the scope of this PR. Thanks for fixing this!

@oleurud
Copy link
Contributor Author

oleurud commented Sep 12, 2019

I find it still confusing that origin/normalized columns business with columns in different orders etc. (it took me a while to understand and accept it was correct). I would be nice to refactor that

Yes, I agree. With this one and the next one, we will be close to it, but we will need more steps until having a good solution.

@oleurud oleurud merged commit 13b3c26 into develop Sep 12, 2019
@Jesus89 Jesus89 deleted the fix/947-df-no-geom-upload branch September 30, 2019 16:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Error uploading dataframes
4 participants