![](../../src/logo.svg)

**© Jesús López**

Follow him on **[LinkedIn](https://linkedin.com/in/jsulopz)** or **[Twitter](https://twitter.com/jsulopz)**

<br>
<a rel="license" href="http://creativecommons.org/licenses/by-nc-nd/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-nc-nd/4.0/88x31.png" /></a><br />This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-nc-nd/4.0/">Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License</a>.

# #05 | DataFrame.dtypes

In [1]:
import seaborn as sns #! starting object
import pandas as pd

pd.set_option('display.min_rows', 4)
pd.set_option('display.max_rows', 10)

df_tips = sns.load_dataset('tips')
df_tips.total_bill = df_tips.total_bill.astype(str)
df_tips

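| | total_bill | tip | sex | smoker | day | time | size |
|---|---|---|---|---|---|---|---|
| 0 | 16.99 | 1.01 | Female | No | Sun | Dinner | 2 |
| 1 | 10.34 | 1.66 | Male | No | Sun | Dinner | 3 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 242 | 17.82 | 1.75 | Male | No | Sat | Dinner | 2 |
| 243 | 18.78 | 3.00 | Female | No | Thur | Dinner | 2 |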


A `DataFrame` is composed of a set of `Series`, one for each column.

Each Series of the DataFrame contains values of a certain type:

In [2]:
df_tips.dtypes

total_bill      object
tip            float64
sex           category
smoker        category
day           category
time          category
size             int64
dtype: object
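To make the relationship between columns and types concrete, here is a minimal sketch. It builds a small hypothetical `DataFrame` by hand (three made-up rows, so that it runs without downloading the `tips` dataset) and checks that each column is a `Series` with its own type, and that `dtypes` is itself a `Series` indexed by column name.

```python
import pandas as pd

# Hypothetical miniature of the tips table (values are made up for illustration).
df = pd.DataFrame({
    "tip": [1.01, 1.66, 3.50],
    "sex": pd.Categorical(["Female", "Male", "Male"]),
    "size": [2, 3, 2],
})

# Each column of the DataFrame is a Series...
tip_column = df["tip"]
print(type(tip_column))

# ...and each Series carries a single dtype.
print(tip_column.dtype)

# DataFrame.dtypes collects those dtypes in a Series indexed by column name,
# so the type of a single column can be looked up by its label.
column_types = df.dtypes
print(column_types["sex"])
```

Looking up `df.dtypes["sex"]` and `df["sex"].dtype` gives the same information; the first is handy when you want an overview of every column at once.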

Why is it important to know the type of object?

- It is important because some operations are only defined for certain types of objects.

For example, we cannot sum the values of a `category` column:

In [3]:
df_tips.sex

0      Female
1        Male
        ...  
242      Male
243    Female
Name: sex, Length: 244, dtype: category
Categories (2, object): ['Male', 'Female']

In [4]:
df_tips.sex.sum()

TypeError: 'Categorical' with dtype category does not support reduction 'sum'

But we can sum the values of a numerical column, such as the `float64` column `tip`:

In [5]:
df_tips.tip.sum()

731.5799999999999
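The contrast between the two columns can be reproduced in one place. The sketch below uses a small hypothetical `DataFrame` (made-up rows, not the real `tips` data) and a helper named `try_sum`, which is introduced here only for illustration. It also shows `value_counts()`, one of the operations that does make sense for a `category` column.

```python
import pandas as pd

# Hypothetical miniature of the tips table (values are made up for illustration).
df = pd.DataFrame({
    "sex": pd.Categorical(["Female", "Male", "Male"]),
    "tip": [1.01, 1.66, 3.50],
})

def try_sum(series):
    """Return the sum of a Series, or the error message if its dtype forbids it."""
    try:
        return series.sum()
    except TypeError as err:
        return f"TypeError: {err}"

# Summing floats works; summing categories raises a TypeError.
tip_total = try_sum(df["tip"])
sex_total = try_sum(df["sex"])
print(tip_total)
print(sex_total)

# Counting how often each category appears is supported.
sex_counts = df["sex"].value_counts()
print(sex_counts)
```

The exact wording of the error message may differ between pandas versions, which is why the helper catches the exception type rather than matching the text.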

Sometimes a numerical variable ends up stored as text, for example when pandas infers the wrong type while reading the data. Look at `total_bill`:

In [7]:
df_tips.total_bill

0      16.99
1      10.34
       ...  
242    17.82
243    18.78
Name: total_bill, Length: 244, dtype: object
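The `object` dtype only says that the column holds generic Python objects, so it can help to check what the elements actually are. A minimal sketch, assuming a small hypothetical Series named `bills` that stands in for `total_bill`:

```python
import pandas as pd

# Hypothetical stand-in for a numeric column that was stored as text.
bills = pd.Series(["16.99", "10.34", "17.82"], dtype=object, name="total_bill")

# The Python type of each element: here every value is a str.
element_types = bills.map(type).unique()
print(element_types)

# pandas can also infer what kind of values an object column holds.
inferred = pd.api.types.infer_dtype(bills)
print(inferred)

# And it can tell us directly whether the column counts as numeric.
is_numeric = pd.api.types.is_numeric_dtype(bills)
print(is_numeric)
```

The values look like numbers when printed, but every element is a Python `str`, so pandas does not treat the column as numeric.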

If we try to sum these text values, the strings are concatenated rather than added up:

In [8]:
df_tips.total_bill.sum()

'16.9910.3421.0123.6824.5925.298.7726.8815.0414.7810.2735.2615.4218.4314.8321.5810.3316.2916.9720.6517.9220.2915.7739.4219.8217.8113.3712.6921.719.659.5518.3515.0620.6917.7824.0616.3116.9318.6931.2716.0417.4613.949.6830.418.2922.2332.428.5518.0412.5410.2934.819.9425.5619.4938.0126.4111.2448.2720.2913.8111.0218.2917.5920.0816.453.0720.2315.0112.0217.0726.8625.2814.7310.5117.9227.222.7617.2919.4416.6610.0732.6815.9834.8313.0318.2824.7121.1628.9722.495.7516.3222.7540.1727.2812.0321.0112.4611.3515.3844.322.4220.9215.3620.4925.2118.2414.3114.07.2538.0723.9525.7117.3129.9310.6512.4324.0811.6913.4214.2615.9512.4829.88.5214.5211.3822.8219.0820.2711.1712.2618.268.5110.3314.1516.013.1617.4734.341.1927.0516.438.3518.6411.879.787.5114.0713.1317.2624.5519.7729.8548.1725.013.3916.4921.512.6616.2113.8117.5124.5220.7631.7110.5910.6350.8115.817.2531.8516.8232.917.8914.489.634.6334.6523.3345.3523.1740.5520.6920.930.4618.1523.115.6919.8128.4415.4816.587.5610.3443.1113.013.5118.7112.7413.016.420.5316.4726