Skip to content

Conversation

@jorisvandenbossche
Copy link
Member

Further work on #6292. While looking at possible multi-index support, I thought of first adding this:

to do:

  • should also change this in generic.py
  • check for name conflicts (warn and suggest to use index_label?)

@mangecoeur

@jreback
Copy link
Contributor

jreback commented Mar 14, 2014

why don't you just have index accept a string / list / True

True means use the index name (raise if it is None)
string use this as the cindex label
list use if multiple cols (and raise if not a mi)

false / none are don't store index

instead of adding another kw

@jorisvandenbossche
Copy link
Member Author

I did that for consistency with to_csv (and to_excel)

@jreback
Copy link
Contributor

jreback commented Mar 14, 2014

ok then!
didn't realize that

@jreback
Copy link
Contributor

jreback commented Mar 14, 2014

should think about what to do if u have an unnamed column (I know you are defaulting it) but maybe should warn/raise? (I mean an unnamed index an no label is specified)

@jorisvandenbossche
Copy link
Member Author

What do you mean with an unnamed column?

Also, now 'pandas_index' is used when the index has no name (and not just 'index'). But maybe should also think about that. Is there any precedence somewhere in pandas? df.reset_index() just uses 'index'.

@jreback
Copy link
Contributor

jreback commented Mar 14, 2014

yep maybe need to do exactly like reset index
have to worry about name conflicts though (what if their is a column named index )

@jorisvandenbossche
Copy link
Member Author

The name conflicts is now also a problem (if you would have a column named pandas_index). It doesn't give an error, but the column is overwritten by the index if the names are the same. So maybe just raise a warning for this?

@jreback
Copy link
Contributor

jreback commented Mar 15, 2014

yep

and I would name index
and level_0, etc for mi ( in fact u can just do a reset_index directly)

@mangecoeur
Copy link
Contributor

My concern with using the name "index" is that it is not allowed as a column name in certain SQL flavors (this is a problem for other reserved keywords too). See for example http://dev.mysql.com/doc/refman/5.6/en/reserved-words.html

Note - i don't know if they make the distinction between lower case and upper case versions

@jorisvandenbossche
Copy link
Member Author

@mangecoeur Good concern (and sql mostly makes no distinction between lower and upper case, with some exceptions).
But after some thinking about it, there are a lot of reserved keywords, so this could be a problem more in general. But the way this is dealt with, is by quoting those column names. For example a dataframe that I wrote to postgresql using to_sql, gets this description in postgresql:

CREATE TABLE test_column_keyword
(
  index integer,
  "select" integer,
  "Col2" double precision
)

So the select column is quoted because this is a reserved keyword, the Col2 column is also quoted because it is mixed with uppercase (and because postgresql otherwise converts everything to lowercase, the column name will not be recognised/reserved), and the index column is not quoted as this is not a reserved keyword in postgresql (but it is in MySQL).

So for postgresql this is working, and I suppose other database systems will have a similar mechanism to deal with this kind of column names?

@jreback jreback added this to the 0.14.0 milestone Mar 22, 2014
@jreback jreback added the SQL label Mar 22, 2014
@jorisvandenbossche
Copy link
Member Author

OK, to move forward, I am going to merge this. In any case will also have to touch it again (the naming issue) when adding multi-index support.

jorisvandenbossche added a commit that referenced this pull request Mar 28, 2014
SQL: add index_label keyword to to_sql
@jorisvandenbossche jorisvandenbossche merged commit de167f7 into pandas-dev:master Mar 28, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

IO SQL to_sql, read_sql, read_sql_query

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants