SQL: add index_label keyword to to_sql #6642

jorisvandenbossche · 2014-03-14T23:01:01Z

Further work on #6292. While looking at possible multi-index support, I thought of first adding this:

added the ability to specify the column name used for the index column in to_sql (analogous to to_csv). Good idea?
I only did it for the new sqlalchemy function, not the legacy one. The only problem is that both start from the same function call, so all keyword arguments also have to be added to the legacy to_sql (https://github.com/jorisvandenbossche/pandas/compare/sql-multiindex?expand=1#diff-b41f9fd042c423682f8e4c4d808dbe64R891) without being used there. Is there a better approach? Should I warn that the keyword is ignored if the user specifies it?
added tests for it

to do:

should also change this in generic.py
check for name conflicts (warn and suggest to use index_label?)
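
For illustration, a minimal sketch of how the keyword reads from the user's side. It assumes a current pandas release and uses an in-memory SQLite connection; the table name `demo` and the label `row_id` are made up for the example and do not come from this PR.

```python
import sqlite3

import pandas as pd

# A frame with the default, unnamed index.
df = pd.DataFrame({"a": [1, 2], "b": [0.5, 1.5]})

con = sqlite3.connect(":memory:")

# index=True writes the index as a column; index_label picks its name
# instead of falling back to a default.
df.to_sql("demo", con, index=True, index_label="row_id")

# Inspect the column names of the created table (field 1 of each row).
columns = [row[1] for row in con.execute("PRAGMA table_info(demo)")]
print(columns)
```

The index ends up as the first column of the table, under the label that was passed in.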

jreback · 2014-03-14T23:10:08Z

why don't you just have index accept a string / list / True

True means use the index name (raise if it is None)
a string means use it as the index label
a list is used for multiple columns (and raise if not a MultiIndex)

False / None mean don't store the index

instead of adding another kw
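
A rough sketch of the rules proposed above, in plain Python. The function name `resolve_index_labels` and its signature are hypothetical, not pandas API; raising when a single string is given for a multi-level index, or when the list length does not match, is an extra assumption the comment does not spell out.

```python
def resolve_index_labels(index, index_names):
    """Map an overloaded `index` argument to a list of column labels.

    `index_names` holds one entry per index level (None if unnamed).
    Returns None when the index should not be stored.
    """
    if index is None or index is False:
        return None  # don't store the index
    if index is True:
        # Use the existing name(s); refuse to guess for unnamed levels.
        if any(name is None for name in index_names):
            raise ValueError("index has no name; pass an explicit label")
        return list(index_names)
    if isinstance(index, str):
        # Assumption: a single string only makes sense for one level.
        if len(index_names) != 1:
            raise ValueError("a single label requires a single-level index")
        return [index]
    if isinstance(index, (list, tuple)):
        if len(index_names) < 2:
            raise ValueError("a list of labels requires a MultiIndex")
        if len(index) != len(index_names):
            raise ValueError("need exactly one label per index level")
        return list(index)
    raise TypeError("index must be a bool, None, a string or a list")
```

For example, `resolve_index_labels("id", [None])` gives `["id"]`, while `resolve_index_labels(True, [None])` raises because there is no name to fall back on.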

jorisvandenbossche · 2014-03-14T23:23:14Z

I did that for consistency with to_csv (and to_excel)

jreback · 2014-03-14T23:28:17Z

ok then!
didn't realize that

jreback · 2014-03-14T23:33:49Z

should think about what to do if you have an unnamed column (I know you are defaulting it), but maybe should warn/raise? (I mean an unnamed index and no label is specified)

jorisvandenbossche · 2014-03-14T23:41:13Z

What do you mean with an unnamed column?

Also, now 'pandas_index' is used when the index has no name (and not just 'index'). But maybe we should also think about that. Is there any precedent somewhere in pandas? df.reset_index() just uses 'index'.

jreback · 2014-03-14T23:51:21Z

yep maybe need to do exactly like reset index
have to worry about name conflicts though (what if there is a column named index)

jorisvandenbossche · 2014-03-15T00:05:52Z

Name conflicts are already a problem now (if you have a column named pandas_index). It doesn't give an error, but the column is overwritten by the index if the names are the same. So maybe just raise a warning for this?

jreback · 2014-03-15T00:09:51Z

yep

and I would name it index
and level_0, etc. for a MultiIndex (in fact you can just do a reset_index directly)
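
The naming and conflict check discussed above could look roughly like this. It is a plain-Python sketch: `default_index_labels` is a hypothetical helper, not the code that was merged, and warning on a clash (rather than raising or renaming) is just one of the options raised in the thread.

```python
import warnings


def default_index_labels(index_names, columns):
    """Pick labels for the index levels in the style of reset_index.

    Named levels keep their name; a single unnamed index becomes
    'index'; unnamed levels of a multi-level index become 'level_<i>'.
    A warning is emitted when a label clashes with an existing column.
    """
    nlevels = len(index_names)
    labels = []
    for i, name in enumerate(index_names):
        if name is not None:
            labels.append(name)
        elif nlevels == 1:
            labels.append("index")
        else:
            labels.append("level_{}".format(i))

    clashes = [label for label in labels if label in columns]
    if clashes:
        warnings.warn(
            "index label(s) {} clash with existing columns; "
            "consider passing index_label".format(clashes),
            UserWarning,
        )
    return labels
```

With this sketch, `default_index_labels([None], ["a"])` returns `["index"]`, and a frame that already has an `index` column triggers the warning instead of being silently overwritten.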

mangecoeur · 2014-03-15T21:22:38Z

My concern with using the name "index" is that it is not allowed as a column name in certain SQL flavors (this is a problem for other reserved keywords too). See for example http://dev.mysql.com/doc/refman/5.6/en/reserved-words.html

Note - I don't know whether they distinguish between lower-case and upper-case versions.

jorisvandenbossche · 2014-03-16T13:05:03Z

@mangecoeur Good concern (and sql mostly makes no distinction between lower and upper case, with some exceptions).
But after thinking about it some more: there are a lot of reserved keywords, so this could be a more general problem. The way this is dealt with is by quoting those column names. For example, a dataframe that I wrote to postgresql using to_sql gets this description in postgresql:

CREATE TABLE test_column_keyword
(
  index integer,
  "select" integer,
  "Col2" double precision
)

So the select column is quoted because it is a reserved keyword; the Col2 column is also quoted because it contains uppercase characters (postgresql otherwise folds everything to lowercase, so the original name would not be preserved); and the index column is not quoted, as it is not a reserved keyword in postgresql (but it is in MySQL).

So for postgresql this is working, and I suppose other database systems will have a similar mechanism to deal with this kind of column name?
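
To make the quoting mechanism concrete, here is a small sketch using SQLite (chosen only because it ships with Python; it stands in for postgresql here). The helper `quote_identifier` is hypothetical and uses standard SQL double-quote quoting; MySQL uses backticks by default, so this is not a portable recipe.

```python
import sqlite3


def quote_identifier(name):
    # Standard SQL quoting: wrap in double quotes, double any embedded quote.
    return '"' + name.replace('"', '""') + '"'


columns = [("index", "integer"), ("select", "integer"), ("Col2", "real")]
column_sql = ", ".join(
    "{} {}".format(quote_identifier(name), sql_type) for name, sql_type in columns
)
create_sql = "CREATE TABLE test_column_keyword ({})".format(column_sql)

con = sqlite3.connect(":memory:")
con.execute(create_sql)
con.execute("INSERT INTO test_column_keyword VALUES (0, 1, 2.5)")

# The quoted names come back exactly as written, including case.
cursor = con.execute("SELECT * FROM test_column_keyword")
stored_names = [description[0] for description in cursor.description]

# Without quoting, a reserved keyword is rejected as a column name.
try:
    con.execute("CREATE TABLE broken (select integer)")
    unquoted_ok = True
except sqlite3.OperationalError:
    unquoted_ok = False
```

Quoting every identifier sidesteps the question of which words each flavor reserves, at the cost of making names case-sensitive on databases that fold unquoted names.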

jorisvandenbossche · 2014-03-28T20:18:31Z

OK, to move forward, I am going to merge this. In any case, I will have to touch it again (the naming issue) when adding multi-index support.

jorisvandenbossche closed this Mar 15, 2014

jorisvandenbossche reopened this Mar 15, 2014

jreback added this to the 0.14.0 milestone Mar 22, 2014

jreback added the SQL label Mar 22, 2014

SQL: add index_label keyword to to_sql

578ae49

jorisvandenbossche added a commit that referenced this pull request Mar 28, 2014

Merge pull request #6642 from jorisvandenbossche/sql-multiindex

de167f7

jorisvandenbossche merged commit de167f7 into pandas-dev:master Mar 28, 2014

jorisvandenbossche mentioned this pull request Mar 30, 2014

ENH: SQL multiindex support #6735

Merged
