In [2]:
import pandas as pd

[Tenger Data Technologies Ltd.](http://www.tengerdata.com/)

*Joe T. Boka*

*joe.tb (at) tengerdata (dot) com*

*This question was posted on Quora and I was asked to answer it. Here is the link to my [Quora answer](https://www.quora.com/How-do-I-convert-in-Pandas-DataFrame-a-column-with-discrete-valued-names-to-discrete-valued-number-and-vice-versa)*

**The Problem**

How to convert a column of discrete-valued names to discrete-valued numbers, and vice versa, in a pandas DataFrame.

**Info**

X1, X2… XN are column names.

We want to map X2 to a set of *categorical* discrete numbers and vice versa.

Let’s say the unique values in X2 are *Volvo, Skoda, Mazda, Ford*.
We need to convert the X2 values to the numbers 1, 2, 3, 4.

**Solution**

We will use the pandas *categorical* data type to solve this problem.

Let’s say this is our DataFrame:

In [3]:
df = pd.DataFrame({
    "X1": ["a", "b", "c", "d", "e", "f", "g", "h", "i"],
    "X2": ["Volvo", "Skoda", "Mazda", "Ford", "Mazda", "Ford", "Volvo", "Skoda", "Volvo"],
    "X3": [110, 120, 130, 140, 150, 160, 170, 180, 190],
})
df

Unnamed: 0,X1,X2,X3
0,a,Volvo,110
1,b,Skoda,120
2,c,Mazda,130
3,d,Ford,140
4,e,Mazda,150
5,f,Ford,160
6,g,Volvo,170
7,h,Skoda,180
8,i,Volvo,190


In [4]:
df.dtypes

X1    object
X2    object
X3     int64
dtype: object

The **X2** column type is *object*. We need to convert it to the *categorical* data type.

First, we get the unique values in column **X2**, using *pandas.Series.unique()*.

In [5]:
df['X2'].unique()

array(['Volvo', 'Skoda', 'Mazda', 'Ford'], dtype=object)

When we convert the **X2** column to the *categorical* data type, we can specify the order of the categories by passing the unique values returned by the *pandas.Series.unique()* method, which lists them in order of first appearance.

In [6]:
df["X2"] = pd.Categorical(df['X2'], df['X2'].unique())
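Why pass the unique values at all? The following is a minimal sketch, using a small hypothetical list `values` rather than the DataFrame above, of how the category order differs when the categories are left for pandas to infer.

```python
import pandas as pd

# Hypothetical sample values, for illustration only.
values = ["Volvo", "Skoda", "Mazda", "Ford", "Mazda"]

# Without explicit categories, pandas infers them and sorts them.
default_cat = pd.Categorical(values)
print(list(default_cat.categories))

# Passing the unique values keeps the order of first appearance.
by_appearance = pd.Categorical(values, categories=pd.Series(values).unique())
print(list(by_appearance.categories))
```

The order matters here because a list passed to *cat.rename_categories()* is matched to the categories by position.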

In [7]:
df

Unnamed: 0,X1,X2,X3
0,a,Volvo,110
1,b,Skoda,120
2,c,Mazda,130
3,d,Ford,140
4,e,Mazda,150
5,f,Ford,160
6,g,Volvo,170
7,h,Skoda,180
8,i,Volvo,190


In [8]:
df.dtypes

X1      object
X2    category
X3       int64
dtype: object

Although the DataFrame doesn't look any different, we can see that the **X2** column data type was converted from *object* to *category*.
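To see what changed under the hood, we can inspect the categories and the integer codes pandas stores for each row. This is a sketch on a small hypothetical Series `cars`, not the DataFrame above; the `+ 1` shift is just one possible way to get numbers that start at 1.

```python
import pandas as pd

# Hypothetical sample column, for illustration only.
cars = pd.Series(["Volvo", "Skoda", "Mazda", "Ford", "Mazda"])
cars = pd.Series(pd.Categorical(cars, categories=cars.unique()))

# The categories, in the order we specified.
print(list(cars.cat.categories))

# Each row is stored as a zero-based integer code into the categories.
codes = cars.cat.codes.tolist()
print(codes)

# Shifting the codes by one gives numbers starting at 1.
numbers = (cars.cat.codes + 1).tolist()
print(numbers)
```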

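Now we are ready to use *cat.rename_categories()* to convert the **X2** values to numbers. The new names are assigned to the categories by position, so 1 replaces Volvo, 2 replaces Skoda, 3 replaces Mazda, and 4 replaces Ford.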

In [9]:
df["X2"] = df["X2"].cat.rename_categories([1, 2, 3, 4])
df

Unnamed: 0,X1,X2,X3
0,a,1,110
1,b,2,120
2,c,3,130
3,d,4,140
4,e,3,150
5,f,4,160
6,g,1,170
7,h,2,180
8,i,1,190
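The list passed to *cat.rename_categories()* is matched by position, so it depends on the category order. As an alternative sketch, *cat.rename_categories()* also accepts a dict-like mapping, which spells out each name-to-number pair. The Series `cars` and the dict `name_to_number` are hypothetical names used only for this example.

```python
import pandas as pd

# Hypothetical sample column; dtype="category" lets pandas pick the category order.
cars = pd.Series(["Volvo", "Skoda", "Mazda", "Ford", "Mazda"], dtype="category")

# Explicit mapping, independent of the order of the categories.
name_to_number = {"Volvo": 1, "Skoda": 2, "Mazda": 3, "Ford": 4}

numbered = cars.cat.rename_categories(name_to_number)
print(numbered.tolist())
```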


To convert the values back to the original names, we call *cat.rename_categories()* again, this time passing the names in the same order as the numbers they replace.

In [10]:
df["X2"] = df["X2"].cat.rename_categories(['Volvo', 'Skoda', 'Mazda', 'Ford'])
df

Unnamed: 0,X1,X2,X3
0,a,Volvo,110
1,b,Skoda,120
2,c,Mazda,130
3,d,Ford,140
4,e,Mazda,150
5,f,Ford,160
6,g,Volvo,170
7,h,Skoda,180
8,i,Volvo,190
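If the mapping is kept as a dict, the reverse conversion can be derived from it instead of retyping the names. The sketch below assumes the mapping is one-to-one; `original`, `name_to_number` and `number_to_name` are hypothetical names for this example.

```python
import pandas as pd

# Hypothetical sample column, for illustration only.
original = pd.Series(["Volvo", "Skoda", "Mazda", "Ford", "Mazda"], dtype="category")

name_to_number = {"Volvo": 1, "Skoda": 2, "Mazda": 3, "Ford": 4}
# Invert the mapping; this only works when every number is used once.
number_to_name = {number: name for name, number in name_to_number.items()}

numbered = original.cat.rename_categories(name_to_number)
restored = numbered.cat.rename_categories(number_to_name)

# The round trip gives back the original names.
print(restored.tolist() == original.tolist())
```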
