# Renaming columns and indexes

Renaming fields is awkward in most desktop GIS software. Fortunately, it is easy to do in GeoPandas.

First, let's load some data.

In [1]:
%matplotlib inline
import geopandas as gpd

raptor = gpd.read_file("data/Raptor_Nests.shp")
raptor

Unnamed: 0,postgis_fi,lat_y_dd,long_x_dd,lastsurvey,recentspec,recentstat,Nest_ID,geometry
0,361.0,40.267502,-104.870872,2012-03-16,Swainsons Hawk,INACTIVE NEST,361,POINT (-104.79595 40.29891)
1,362.0,40.264321,-104.860255,2012-03-16,Swainsons Hawk,INACTIVE NEST,362,POINT (-104.78897 40.22089)
2,1.0,38.650081,-105.494251,2014-07-28,Swainsons Hawk,INACTIVE NEST,1,POINT (-105.50223 38.68694)
3,2.0,40.309574,-104.932604,2011-01-06,Swainsons Hawk,INACTIVE NEST,2,POINT (-104.84889 40.35215)
4,3.0,40.219343,-104.729246,2014-07-03,Swainsons Hawk,ACTIVE NEST,3,POINT (-104.74466 40.18571)
...,...,...,...,...,...,...,...,...
874,911.0,40.006950,-104.894370,2015-08-18,Red-tail Hawk,INACTIVE NEST,911,POINT (-104.98394 40.00297)
875,912.0,39.998876,-104.900128,2015-09-01,Red-tail Hawk,INACTIVE NEST,912,POINT (-104.84766 39.96975)
876,,,,2020-05-08,Northern Harrier,INACTIVE NEST,9991,POINT (-104.95039 40.24432)
877,,,,2020-05-05,SWHA,INACTIVE NEST,1001,POINT (-104.94502 40.24443)


Because GeoPandas does everything in memory, renaming columns does not change the original data on disk. The new names persist only for as long as you are working with the GeoDataFrame.

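Let's change the lat_y_dd and long_x_dd fields to latitude and longitude. There are several ways to do this. One is to assign a list of new names to the columns attribute of the raptor GeoDataFrame. The example below also renames postgis_fi to gid and Nest_ID to nest_id. **NOTE:** with this method the list needs one element for every column, in the existing column order, including the columns whose names you are not changing.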

In [2]:
raptor.columns=["gid", "latitude", "longitude", "lastsurvey", "recentspec", "recentstat", "nest_id", "geometry"]

In [3]:
raptor.info()

<class 'geopandas.geodataframe.GeoDataFrame'>
RangeIndex: 879 entries, 0 to 878
Data columns (total 8 columns):
 #   Column      Non-Null Count  Dtype   
---  ------      --------------  -----   
 0   gid         876 non-null    float64 
 1   latitude    877 non-null    float64 
 2   longitude   877 non-null    float64 
 3   lastsurvey  878 non-null    object  
 4   recentspec  879 non-null    object  
 5   recentstat  879 non-null    object  
 6   nest_id     879 non-null    int64   
 7   geometry    879 non-null    geometry
dtypes: float64(3), geometry(1), int64(1), object(3)
memory usage: 55.1+ KB
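
The length requirement in the note above can be checked with a small sketch. It uses a plain pandas DataFrame with made-up rows (the `df` name and its three columns are illustrative, not the real raptor file); a GeoDataFrame is a subclass of the pandas DataFrame, so assigning to `columns` should behave the same way.

```python
import pandas as pd

# Hypothetical three-column table standing in for the raptor data.
df = pd.DataFrame({
    "postgis_fi": [361.0, 362.0],
    "lat_y_dd": [40.267502, 40.264321],
    "long_x_dd": [-104.870872, -104.860255],
})

# One new name per existing column, in the same order.
df.columns = ["gid", "latitude", "longitude"]

# A list of the wrong length is rejected instead of being partially applied.
length_error = None
try:
    df.columns = ["gid", "latitude"]
except ValueError as err:
    length_error = err

print(list(df.columns))
print(type(length_error).__name__)
```

Because the names are matched purely by position, a list in the wrong order will mislabel columns without raising any error, which is one reason the dictionary-based approach is often safer.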


A more convenient way, especially with datasets that have a lot of columns, is to use the GeoDataFrame's rename method and pass a dictionary containing only the columns that you want to change. Each key in the dictionary is a current column name and its value is the new name. Let's reload the raptor dataset first.

In [4]:
raptor = gpd.read_file("data/Raptor_Nests.shp")
raptor

Unnamed: 0,postgis_fi,lat_y_dd,long_x_dd,lastsurvey,recentspec,recentstat,Nest_ID,geometry
0,361.0,40.267502,-104.870872,2012-03-16,Swainsons Hawk,INACTIVE NEST,361,POINT (-104.79595 40.29891)
1,362.0,40.264321,-104.860255,2012-03-16,Swainsons Hawk,INACTIVE NEST,362,POINT (-104.78897 40.22089)
2,1.0,38.650081,-105.494251,2014-07-28,Swainsons Hawk,INACTIVE NEST,1,POINT (-105.50223 38.68694)
3,2.0,40.309574,-104.932604,2011-01-06,Swainsons Hawk,INACTIVE NEST,2,POINT (-104.84889 40.35215)
4,3.0,40.219343,-104.729246,2014-07-03,Swainsons Hawk,ACTIVE NEST,3,POINT (-104.74466 40.18571)
...,...,...,...,...,...,...,...,...
874,911.0,40.006950,-104.894370,2015-08-18,Red-tail Hawk,INACTIVE NEST,911,POINT (-104.98394 40.00297)
875,912.0,39.998876,-104.900128,2015-09-01,Red-tail Hawk,INACTIVE NEST,912,POINT (-104.84766 39.96975)
876,,,,2020-05-08,Northern Harrier,INACTIVE NEST,9991,POINT (-104.95039 40.24432)
877,,,,2020-05-05,SWHA,INACTIVE NEST,1001,POINT (-104.94502 40.24443)


In [5]:
raptor.rename(columns={"postgis_fi":"gid", "lat_y_dd":"latitude", "long_x_dd":"longitude"})
raptor.info()

<class 'geopandas.geodataframe.GeoDataFrame'>
RangeIndex: 879 entries, 0 to 878
Data columns (total 8 columns):
 #   Column      Non-Null Count  Dtype   
---  ------      --------------  -----   
 0   postgis_fi  876 non-null    float64 
 1   lat_y_dd    877 non-null    float64 
 2   long_x_dd   877 non-null    float64 
 3   lastsurvey  878 non-null    object  
 4   recentspec  879 non-null    object  
 5   recentstat  879 non-null    object  
 6   Nest_ID     879 non-null    int64   
 7   geometry    879 non-null    geometry
dtypes: float64(3), geometry(1), int64(1), object(3)
memory usage: 55.1+ KB


Nothing seemed to happen. Why? By default, rename returns a new GeoDataFrame and leaves the original unchanged, and we did not store the result. To persist the change, either set the inplace parameter to True or assign the returned GeoDataFrame back to the raptor variable.

In [6]:
raptor.rename(inplace=True, columns={"postgis_fi":"gid", "lat_y_dd":"latitude", "long_x_dd":"longitude"})
raptor.info()

<class 'geopandas.geodataframe.GeoDataFrame'>
RangeIndex: 879 entries, 0 to 878
Data columns (total 8 columns):
 #   Column      Non-Null Count  Dtype   
---  ------      --------------  -----   
 0   gid         876 non-null    float64 
 1   latitude    877 non-null    float64 
 2   longitude   877 non-null    float64 
 3   lastsurvey  878 non-null    object  
 4   recentspec  879 non-null    object  
 5   recentstat  879 non-null    object  
 6   Nest_ID     879 non-null    int64   
 7   geometry    879 non-null    geometry
dtypes: float64(3), geometry(1), int64(1), object(3)
memory usage: 55.1+ KB
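
As an alternative to inplace=True, the result of rename can be assigned to a variable. The sketch below illustrates this on a small pandas DataFrame with made-up values (the `df`, `renamed` and `new_names` names are illustrative). It also shows the `errors="raise"` option of pandas' rename, which may help catch misspelled column names that would otherwise be ignored silently.

```python
import pandas as pd

# Hypothetical one-row table standing in for the raptor data.
df = pd.DataFrame({"postgis_fi": [361.0], "lat_y_dd": [40.267502]})
new_names = {"postgis_fi": "gid", "lat_y_dd": "latitude"}

# Without inplace=True, rename returns a new object and df is untouched.
renamed = df.rename(columns=new_names)

# Keys that match no column are ignored by default...
unchanged = df.rename(columns={"no_such_column": "x"})

# ...but errors="raise" turns a missing key into a KeyError.
missing_error = None
try:
    df.rename(columns={"no_such_column": "x"}, errors="raise")
except KeyError as err:
    missing_error = err

print(list(df.columns))
print(list(renamed.columns))
```

Assigning the result keeps the original names available under the old variable, which can be handy while experimenting.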


Other methods that change column names are add_suffix and add_prefix. These add a suffix or prefix to every column name. This can be useful, for example, before joining with another GeoDataFrame that has similar column names, to make it clear which GeoDataFrame each column refers to. Like rename, they return a new GeoDataFrame rather than modifying the original. Note in the output below that every column is affected, including geometry.

In [7]:
raptor.add_suffix("_old")

Unnamed: 0,gid_old,latitude_old,longitude_old,lastsurvey_old,recentspec_old,recentstat_old,Nest_ID_old,geometry_old
0,361.0,40.267502,-104.870872,2012-03-16,Swainsons Hawk,INACTIVE NEST,361,POINT (-104.79595 40.29891)
1,362.0,40.264321,-104.860255,2012-03-16,Swainsons Hawk,INACTIVE NEST,362,POINT (-104.78897 40.22089)
2,1.0,38.650081,-105.494251,2014-07-28,Swainsons Hawk,INACTIVE NEST,1,POINT (-105.50223 38.68694)
3,2.0,40.309574,-104.932604,2011-01-06,Swainsons Hawk,INACTIVE NEST,2,POINT (-104.84889 40.35215)
4,3.0,40.219343,-104.729246,2014-07-03,Swainsons Hawk,ACTIVE NEST,3,POINT (-104.74466 40.18571)
...,...,...,...,...,...,...,...,...
874,911.0,40.006950,-104.894370,2015-08-18,Red-tail Hawk,INACTIVE NEST,911,POINT (-104.98394 40.00297)
875,912.0,39.998876,-104.900128,2015-09-01,Red-tail Hawk,INACTIVE NEST,912,POINT (-104.84766 39.96975)
876,,,,2020-05-08,Northern Harrier,INACTIVE NEST,9991,POINT (-104.95039 40.24432)
877,,,,2020-05-05,SWHA,INACTIVE NEST,1001,POINT (-104.94502 40.24443)
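
To illustrate the join use case, here is a minimal sketch using two small pandas DataFrames with made-up survey values (`nests_2019` and `nests_2020` are hypothetical names, not part of the raptor dataset). Because add_suffix touches every column, the join key is renamed too, so the merge has to name the key on each side.

```python
import pandas as pd

# Two hypothetical survey tables that share the same column names.
nests_2019 = pd.DataFrame({
    "nest_id": [1, 2],
    "recentstat": ["ACTIVE NEST", "INACTIVE NEST"],
})
nests_2020 = pd.DataFrame({
    "nest_id": [1, 2],
    "recentstat": ["INACTIVE NEST", "INACTIVE NEST"],
})

# add_suffix returns a new frame; every column, including the key, is renamed.
old = nests_2019.add_suffix("_old")

# Join on the key, which now has a different name on each side.
joined = nests_2020.merge(old, left_on="nest_id", right_on="nest_id_old")

print(list(joined.columns))
```

After the merge it is clear from the names alone which survey each status column came from.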


# Renaming index labels

Although much less common, it is also possible to update index labels using similar methods. The GeoDataFrame has an index attribute, and assigning a list of labels to it replaces the existing ones. As with columns, the list needs one label for every row.

The rename method can also take a dictionary as its index parameter if you only want to rename a few specific index labels.
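
Both approaches can be sketched on a small pandas DataFrame with made-up rows (the labels "a", "b", "c" and "first" are purely illustrative):

```python
import pandas as pd

# Hypothetical table with the default 0..2 index.
df = pd.DataFrame({"nest_id": [361, 362, 1]})

# Replace every label at once: one new label per row.
df.index = ["a", "b", "c"]

# Rename only selected labels; rename returns a new object, so assign it back.
df = df.rename(index={"a": "first"})

print(list(df.index))
```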

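By default, add_prefix and add_suffix act on column names only. In recent pandas versions (2.0 and later) they accept an axis argument, so passing axis=0 applies the prefix or suffix to the index labels instead. In older versions there is no direct equivalent for index labels.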
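
One way to get a similar effect on index labels in any pandas version is to pass a function to rename instead of a dictionary. This is only a sketch on made-up data, and the "nest_" prefix is an arbitrary choice:

```python
import pandas as pd

# Hypothetical table with the default 0..1 index.
df = pd.DataFrame({"nest_id": [361, 362]})

# rename accepts a callable; it is applied to each index label in turn.
prefixed = df.rename(index=lambda label: f"nest_{label}")

print(list(prefixed.index))
```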

There is also the set_index method, which replaces the existing index with the values of an existing column.
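
A minimal sketch of set_index, again on a made-up pandas DataFrame rather than the real raptor file:

```python
import pandas as pd

# Hypothetical table; nest_id is assumed to be unique per row.
df = pd.DataFrame({
    "nest_id": [361, 362],
    "recentspec": ["Swainsons Hawk", "Red-tail Hawk"],
})

# Use the nest_id values as row labels. By default the column is moved
# into the index; pass drop=False to keep it as a column as well.
by_nest = df.set_index("nest_id")

# Rows can now be looked up by nest ID.
species = by_nest.loc[362, "recentspec"]
print(species)
```

Like rename, set_index returns a new object unless the result is assigned back or inplace=True is used.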

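You can also look at the documentation for the pandas reindex method. Rather than relabelling rows, reindex conforms the data to a new set of index labels: rows whose labels are kept stay as they are, and rows for labels that did not exist before are created and filled with missing values or with a fill value that you choose.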
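
The difference between reindex and renaming is easiest to see on a tiny example. The sketch below uses a made-up pandas DataFrame; the `visits` column and the labels are illustrative only.

```python
import pandas as pd

# Hypothetical table indexed by nest ID.
df = pd.DataFrame({"visits": [3, 5]}, index=[1, 2])

# Label 3 does not exist yet, so its row is created with a missing value.
conformed = df.reindex([1, 2, 3])

# fill_value supplies a value for newly created rows instead of NaN.
filled = df.reindex([1, 2, 3], fill_value=0)

print(conformed)
print(filled)
```

Existing rows keep their data under their existing labels; only the set of labels changes.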
