# How do I change the data type of a pandas Series?

In [1]:
import pandas as pd

In [4]:
drinks = pd.read_csv('https://bit.ly/drinksbycountry')

In [5]:
drinks.head()

       country  beer_servings  spirit_servings  wine_servings  total_litres_of_pure_alcohol  continent
0  Afghanistan              0                0              0                           0.0       Asia
1      Albania             89              132             54                           4.9     Europe
2      Algeria             25                0             14                           0.7     Africa
3      Andorra            245              138            312                          12.4     Europe
4       Angola            217               57             45                           5.9     Africa


In [6]:
drinks.dtypes

country                          object
beer_servings                     int64
spirit_servings                   int64
wine_servings                     int64
total_litres_of_pure_alcohol    float64
continent                        object
dtype: object

`object` usually means that the column holds strings.

## How to convert?

In [7]:
drinks.beer_servings = drinks.beer_servings.astype(float)

In [8]:
drinks.dtypes

country                          object
beer_servings                   float64
spirit_servings                   int64
wine_servings                     int64
total_litres_of_pure_alcohol    float64
continent                        object
dtype: object
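`astype` also accepts a dictionary that maps column names to types, which is handy when several columns need converting at once. The sketch below uses a small stand-in DataFrame (`drinks_sample` is a hypothetical name, with values copied from the first rows shown above) rather than the full dataset.

```python
import pandas as pd

# Hypothetical stand-in for the drinks data, so the example runs offline
drinks_sample = pd.DataFrame({
    'country': ['Afghanistan', 'Albania', 'Algeria'],
    'beer_servings': [0, 89, 25],
    'spirit_servings': [0, 132, 0],
})

# Passing a dict converts only the listed columns and returns a new DataFrame
converted = drinks_sample.astype({'beer_servings': float,
                                  'spirit_servings': float})
print(converted.dtypes)
```

Note that `astype` returns a new object, so the original `drinks_sample` keeps its integer columns unless you assign the result back.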

## Why convert?

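Usually because you want to do mathematics on the data and the current type gets in the way. Numbers stored as strings have to be converted to a numeric type before any arithmetic, and an integer column may need to become float if, for example, it has to hold fractional values.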
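As a rough illustration of the point above, here is a minimal sketch with made-up values (`prices_as_text` is a hypothetical Series, not part of the datasets used here). It checks whether pandas considers the data numeric before and after the conversion.

```python
import pandas as pd

# Hypothetical numbers that arrived as text
prices_as_text = pd.Series(['1.5', '2.5', '5.0'])

# pandas does not treat a Series of strings as numeric
print(pd.api.types.is_numeric_dtype(prices_as_text))

# After converting to float, numeric methods such as mean() are available
prices = prices_as_text.astype(float)
print(pd.api.types.is_numeric_dtype(prices))
print(prices.mean())
```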

## Change data type while reading files

In [9]:
drinks = pd.read_csv('https://bit.ly/drinksbycountry', dtype={'beer_servings':float})

In [10]:
drinks.dtypes

country                          object
beer_servings                   float64
spirit_servings                   int64
wine_servings                     int64
total_litres_of_pure_alcohol    float64
continent                        object
dtype: object
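The same `dtype` argument can be tried without a network connection by reading from an in-memory string. This is only a sketch: `csv_text` is a hypothetical two-row excerpt, not the real file.

```python
import io
import pandas as pd

# Hypothetical excerpt of the CSV, kept in memory so no download is needed
csv_text = """country,beer_servings,spirit_servings
Afghanistan,0,0
Albania,89,132
"""

# Only beer_servings is forced to float; other columns are inferred as usual
sample = pd.read_csv(io.StringIO(csv_text), dtype={'beer_servings': float})
print(sample.dtypes)
```

Columns that are not listed in the `dtype` dictionary keep whatever type pandas infers for them.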

In [11]:
orders = pd.read_table('https://bit.ly/chiporders')

In [12]:
orders.item_price

0        $2.39 
1        $3.39 
2        $3.39 
3        $2.39 
4       $16.98 
5       $10.98 
6        $1.69 
7       $11.75 
8        $9.25 
9        $9.25 
10       $4.45 
11       $8.75 
12       $8.75 
13      $11.25 
14       $4.45 
15       $2.39 
16       $8.49 
17       $8.49 
18       $2.18 
19       $8.75 
20       $4.45 
21       $8.99 
22       $3.39 
23      $10.98 
24       $3.39 
25       $2.39 
26       $8.49 
27       $8.99 
28       $1.09 
29       $8.49 
         ...   
4592    $11.75 
4593    $11.75 
4594    $11.75 
4595     $8.75 
4596     $4.45 
4597     $1.25 
4598     $1.50 
4599     $8.75 
4600     $4.45 
4601     $1.25 
4602     $9.25 
4603     $9.25 
4604     $8.75 
4605     $4.45 
4606     $1.25 
4607    $11.75 
4608    $11.25 
4609     $1.25 
4610    $11.75 
4611    $11.25 
4612     $9.25 
4613     $2.15 
4614     $1.50 
4615     $8.75 
4616     $4.45 
4617    $11.75 
4618    $11.75 
4619    $11.25 
4620     $8.75 
4621     $8.75 
Name: item_price, Length

We can see a `$` sign in front of every price, so the column is stored as strings. To do math on it, we need to remove the `$` and convert the strings to float.

In [13]:
orders.item_price.str.replace('$', '', regex=False).astype(float).mean()

7.464335785374397
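One detail worth knowing: depending on the pandas version, `str.replace` may interpret its pattern as a regular expression, and in a regular expression `$` is the end-of-string anchor rather than a literal character. Passing `regex=False` makes the literal intent explicit. A minimal sketch on a few made-up prices (`item_price` here is a hypothetical stand-in Series):

```python
import pandas as pd

# Hypothetical prices in the same format as the orders data
item_price = pd.Series(['$2.39', '$3.39', '$16.98'])

# regex=False treats '$' as a plain character, not a regex anchor
cleaned = item_price.str.replace('$', '', regex=False).astype(float)
print(cleaned)
```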

## Bonus

Sometimes we need to convert a boolean Series or DataFrame into 0/1 values, which is useful in many machine learning tasks. `astype(int)` does this directly.

In [16]:
orders.item_name.str.contains('Chicken').astype(int).head()

0    0
1    0
2    0
3    0
4    1
Name: item_name, dtype: int64
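The example above converts a single boolean Series. The same call should work on a whole boolean DataFrame; the sketch below assumes a small made-up frame (`flags` and its column names are hypothetical).

```python
import pandas as pd

# Hypothetical boolean indicator columns
flags = pd.DataFrame({
    'has_chicken': [False, True, True],
    'has_steak': [True, False, False],
})

# True becomes 1 and False becomes 0 in every column
as_ints = flags.astype(int)
print(as_ints)
```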