## Reindexing DataFrames

The **index** of a DataFrame is the set of labels for its rows. Just as columns have names (labels), rows have index labels. By default, these are just integers starting from 0, but you can change the index to anything you want, such as months of the year or participant ID codes. You can also use the index to sort the DataFrame or to access specific rows.
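As a rough illustration of those last two points, here is a minimal sketch using a tiny made-up DataFrame. The name `df_demo`, the participant IDs, and the RT values are all hypothetical and are not part of the lesson's data set.

```python
import pandas as pd

# Hypothetical reaction-time data; the participant IDs are made up for illustration
df_demo = pd.DataFrame(
    {'RT': [0.51, 0.39, 0.40]},
    index=['p03', 'p01', 'p02'],  # custom row labels instead of the default 0, 1, 2
)

# Access a specific row (and column) by its index label
rt_p01 = df_demo.loc['p01', 'RT']

# Sort the rows by their index labels
df_sorted = df_demo.sort_index()

print(rt_p01)
print(df_sorted)
```

Here `.loc[]` looks rows up by label rather than by position, which is why it works with string labels like `'p01'`.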

If you're familiar with Excel or other spreadsheet software, you've already seen indices. They are the grey row numbers that appear on the left side of the spreadsheet:

<img src='images/spreadsheet_indices.png' width=150>

In pandas, indexes have some "superpowers" relative to Excel. Firstly, although they are numbers by default, you can change them to whatever you want. Secondly, a DataFrame can have more than one index (multi-indexing, which we'll cover later).

The DataCamp lesson introduces the (convenient, but not universal) convention of calling the individual row labels within one DataFrame its *indices*, while *indexes* refers to the more general class of row labeling. So if we're talking about the sets of row labels for two different DataFrames, we'd call them "indexes", whereas the row labels of a single DataFrame are its "indices". If you find this distinction confusing, don't worry; it's not the most important thing to keep straight in your head. The important thing to know is that DataFrame rows have indices, just like columns have column labels.

Carrying on with our sample data from above, we could use `trial` as the index for our data, rather than having it as a column within the data. If we've already loaded the data, we can do this with the `.set_index()` method:

In [26]:
df.set_index('trial')

| trial | s1 | s2 |
|---|---|---|
| 1 | 0.508971 | 0.433094 |
| 2 | 0.389858 | 0.392526 |
| 3 | 0.404175 | 0.396831 |
| 4 | 0.26952 | 0.417988 |
| 5 | 0.437765 | 0.37181 |
| 6 | 0.368142 | 0.659228 |
| 7 | 0.400544 | 0.411051 |
| 8 | 0.335198 | 0.40958 |
| 9 | 0.341722 | 0.486828 |
| 10 | 0.439583 | 0.468912 |
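One thing worth keeping in mind is that `.set_index()` returns a new DataFrame by default rather than changing the original, so the result needs to be assigned to a variable if you want to keep it. The sketch below shows this with a small stand-in DataFrame built from the first three rows of the output above; the names `df_small` and `df_indexed` are hypothetical.

```python
import pandas as pd

# Stand-in for the lesson's df: the first three rows of the output above
df_small = pd.DataFrame({
    'trial': [1, 2, 3],
    's1': [0.508971, 0.389858, 0.404175],
    's2': [0.433094, 0.392526, 0.396831],
})

# Calling set_index() without assigning the result leaves df_small unchanged
df_small.set_index('trial')
print(list(df_small.columns))    # 'trial' is still an ordinary column

# Assign the result to keep the re-indexed DataFrame
df_indexed = df_small.set_index('trial')
print(df_indexed.index.name)     # the index is now named 'trial'
print(list(df_indexed.columns))  # only the data columns remain
```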


However, if you already know there's a column in your data that you want to use as the index, you can specify that when you first import the data, using the `index_col=` argument to `pd.read_csv()`:

In [27]:
df_list = []
for filename in filenames:
    df_list.append(pd.read_csv(filename, index_col='trial'))
    
df_list[0]

| trial | RT |
|---|---|
| 1 | 0.508971 |
| 2 | 0.389858 |
| 3 | 0.404175 |
| 4 | 0.26952 |
| 5 | 0.437765 |
| 6 | 0.368142 |
| 7 | 0.400544 |
| 8 | 0.335198 |
| 9 | 0.341722 |
| 10 | 0.439583 |


Some important things to note here:
- Compared to the earlier output, which had three columns, the output here has only two. In fact, the first "column" shown for a DataFrame is the index, not an actual column. Since we used `trial` as the index here, we no longer have a separate, unlabeled "column" of index values starting from 0, as we did earlier. I'm putting "column" in quotes when referring to the index because pandas doesn't treat it as a column; it's the index.
- We put `'trial'` in quotation marks because it's a string.
- In the above code, the call to `pd.read_csv()` is nested inside the call to the `.append()` method. The `index_col=` argument needs to go inside the parentheses for `pd.read_csv()`, and not the outer parentheses that enclose the input to `df_list.append()`.
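To make the first point concrete, the sketch below reads the same small CSV twice, once with the default index and once with `index_col='trial'`, and compares the shapes. It assumes an in-memory string (`csv_text`, read through `io.StringIO`) as a stand-in for one of the lesson's files, so it runs without any files on disk; the variable names are hypothetical.

```python
import io
import pandas as pd

# In-memory stand-in for one data file (first three rows of the output above)
csv_text = "trial,RT\n1,0.508971\n2,0.389858\n3,0.404175\n"

# Default: pandas creates an integer index starting from 0; trial is a regular column
df_default = pd.read_csv(io.StringIO(csv_text))

# With index_col, the trial column becomes the index instead
df_trial = pd.read_csv(io.StringIO(csv_text), index_col='trial')

print(df_default.shape)       # (3, 2): trial and RT are both columns
print(df_trial.shape)         # (3, 1): only RT counts as a column
print(df_trial.loc[2, 'RT'])  # look up the RT for trial 2 by its index label
```

The `.shape` attribute counts only real columns, which is one quick way to confirm that the index is not being treated as a column.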