## Transforming column types

In [1]:
import pandas as pd

In [2]:
videos = pd.read_csv('videolist_search500_2021_09_16-08_37_30.tab', sep='\t')

In [3]:
videos.head()

| | position | channelId | channelTitle | videoId | publishedAt | publishedAtSQL | videoTitle | videoDescription | tags | videoCategoryId | ... | dimension | definition | caption | thumbnail_maxres | licensedContent | viewCount | likeCount | dislikeCount | favoriteCount | commentCount |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | UC18vz5hUUqxbGvym9ghtX_w | Full Frontal with Samantha Bee | xkgt1Avnvw0 | 2021-09-16T05:59:45Z | 2021-09-16 05:59:45 | Vivian Howard: The Perfect Recipe for Combatin... | Apocalyptic climate crisis got you down? Allow... | Full Frontal with Samantha Bee,Full Frontal,Sa... | 24 | ... | 2d | hd | False | https://i.ytimg.com/vi/xkgt1Avnvw0/maxresdefau... | 1.0 | 796.0 | 88.0 | 2.0 | 0 | 9.0 |
| 1 | 2 | UCJ6DCjlsOB8dwCrDbOLia6g | Info Viral | Sf-klthGrqA | 2021-09-16T06:00:06Z | 2021-09-16 06:00:06 | Global climate change!Cameroon under the snow,... | #GlobalClimateChange #CameroonUnderTheSnow #Sn... | snow,snow fall,snow falling,snow fall in Camer... | 26 | ... | 2d | hd | False | | 1.0 | 2.0 | 0.0 | 0.0 | 0 | 0.0 |
| 2 | 3 | UC18vz5hUUqxbGvym9ghtX_w | Full Frontal with Samantha Bee | 4Z0O1lJBU7g | 2021-09-16T05:59:46Z | 2021-09-16 05:59:46 | Food Waste and Climate Change: How Your Leftov... | The recent devastating climate report left man... | Full Frontal with Samantha Bee,Full Frontal,Sa... | 24 | ... | 2d | hd | False | https://i.ytimg.com/vi/4Z0O1lJBU7g/maxresdefau... | 1.0 | 1735.0 | 152.0 | 5.0 | 0 | 22.0 |
| 3 | 4 | UCEfvFsy9qbzeKyASsDs0V-w | KJ Singh | KWoI9jTHHlk | 2021-09-16T05:38:04Z | 2021-09-16 05:38:04 | Climate Change in Australia and Brexit #downun... | | | 22 | ... | 2d | hd | False | https://i.ytimg.com/vi/KWoI9jTHHlk/maxresdefau... | | 0.0 | 0.0 | 0.0 | 0 | 0.0 |
| 4 | 5 | UC7pluR6rB5KZIbN2IxamzxQ | BBC News Marathi | 32y2dG2tLOg | 2021-09-16T04:58:32Z | 2021-09-16 04:58:32 | Climate Change : 50c सेल्शिअस तापमानाच्या ठिका... | #ClimateChange #Temperature #Heat नायजेरिया दे... | Global Warming,Warming,Hottest place in world,... | 25 | ... | 2d | hd | False | | 1.0 | 5442.0 | 81.0 | 3.0 | 0 | 5.0 |


In [4]:
videos.columns

Index(['position', 'channelId', 'channelTitle', 'videoId', 'publishedAt',
       'publishedAtSQL', 'videoTitle', 'videoDescription', 'tags',
       'videoCategoryId', 'videoCategoryLabel', 'duration', 'durationSec',
       'dimension', 'definition', 'caption', 'thumbnail_maxres',
       'licensedContent', 'viewCount', 'likeCount', 'dislikeCount',
       'favoriteCount', 'commentCount'],
      dtype='object')

In [5]:
videos.dtypes

position                int64
channelId              object
channelTitle           object
videoId                object
publishedAt            object
publishedAtSQL         object
videoTitle             object
videoDescription       object
tags                   object
videoCategoryId         int64
videoCategoryLabel     object
duration               object
durationSec             int64
dimension              object
definition             object
caption                  bool
thumbnail_maxres       object
licensedContent       float64
viewCount             float64
likeCount             float64
dislikeCount          float64
favoriteCount           int64
commentCount          float64
dtype: object

In [6]:
sentiment = pd.read_pickle('JoannaStrycharz_YouTubeClimateChange_EN_completed.pkl')

In [7]:
sentiment.head()

| | videoId | negative | positive | neutral |
|---|---|---|---|---|
| 0 | xkgt1Avnvw0 | -1 | 1 | 0 |
| 1 | Sf-klthGrqA | -1 | 2 | 1 |
| 2 | 4Z0O1lJBU7g | -1 | 1 | 0 |
| 3 | KWoI9jTHHlk | -1 | 1 | 0 |
| 4 | 32y2dG2tLOg | -1 | 1 | 0 |


In [8]:
sentiment.columns

Index(['videoId', 'negative', 'positive', 'neutral'], dtype='object')

In [9]:
sentiment.dtypes

videoId     object
negative    object
positive    object
neutral     object
dtype: object

When we merge two dataframes, the key variable we merge on has to match in both dataframes. This means it needs to:
* have the same column name in both dataframes (or be named separately for each side with the left_on and right_on arguments of merge)
* have the same data type in both dataframes.

In this case we are fine: videoId is stored as object in both dataframes. In other cases you may need to change the data type first. Let's look at how to do that for a selected column.
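
The two requirements listed above can be checked in code before merging. The snippet below is a minimal sketch on made-up toy data: the dataframes `left` and `right` and their values are assumptions for illustration, not the files loaded in this notebook.

```python
import pandas as pd

# Hypothetical toy data standing in for the two dataframes (not the real files).
left = pd.DataFrame({"videoId": ["a1", "b2", "c3"], "viewCount": [796.0, 2.0, 1735.0]})
right = pd.DataFrame({"videoId": ["a1", "b2", "d4"], "positive": [1, 2, 1]})

key = "videoId"

# Requirement 1: the key column exists under the same name in both dataframes.
key_in_both = key in left.columns and key in right.columns

# Requirement 2: the key column has the same data type on both sides.
same_dtype = left[key].dtype == right[key].dtype

if key_in_both and same_dtype:
    # An inner merge keeps only the keys that occur in both dataframes.
    merged = left.merge(right, on=key, how="inner")
    print(merged)
```

If the second check fails, convert one of the key columns first, using one of the approaches described in the next section.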

### Changing data type

There are several ways to change the data type of a selected column.

First, you can use the .apply method we used in the tutorial.

In [10]:
# Run pd.to_numeric on every value in the column and store the result back
sentiment['positive'] = sentiment['positive'].apply(pd.to_numeric)

In [13]:
sentiment.dtypes

videoId     object
negative     int64
positive     int64
neutral     object
dtype: object
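
The .apply call above converts the values one at a time. As a possible alternative, pd.to_numeric can also be given the whole column at once, and its errors argument controls what happens to values that cannot be parsed. The sketch below uses a made-up Series called `scores` (an assumption for illustration, not a column from the sentiment data).

```python
import pandas as pd

# Hypothetical column of numbers stored as text, with one unparseable entry.
scores = pd.Series(["1", "2", "n/a", "0"], dtype="object")

# Passing the whole Series converts it in one step.
# errors="coerce" replaces values that cannot be parsed with NaN
# instead of raising an error.
converted = pd.to_numeric(scores, errors="coerce")

print(converted.dtype)
print(converted.isna().sum())
```

Because NaN is a floating-point value, a column with coerced entries ends up as float64 rather than int64.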

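Another way to do it is to use the .astype method. Note that negative already shows up as int64 in the In [13] output: the cell numbers show that the In [12] cell below was run before that check.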

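In [12]:
# Convert the column to 64-bit integers
sentiment['negative'] = sentiment['negative'].astype("int64")

In [14]:
# .astype accepts other target types as well, for example the pandas string type
sentiment['negative'] = sentiment['negative'].astype("string")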

In [15]:
sentiment.dtypes

videoId     object
negative    string
positive     int64
neutral     object
dtype: object
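
When several columns need converting, .astype can also be called on the whole dataframe with a dictionary that maps column names to target types. The snippet below is a minimal sketch on a made-up dataframe named `toy` (an assumption for illustration), which mimics the layout of the sentiment data.

```python
import pandas as pd

# Hypothetical stand-in for the sentiment dataframe, with scores stored as text.
toy = pd.DataFrame(
    {
        "videoId": ["a1", "b2"],
        "negative": ["-1", "-1"],
        "positive": ["1", "2"],
        "neutral": ["0", "1"],
    },
    dtype="object",
)

# The dictionary maps each column name to its target type.
# Columns that are not listed (here videoId) keep their current type.
converted = toy.astype({"negative": "int64", "positive": "int64", "neutral": "int64"})

print(converted.dtypes)
```

.astype returns a new dataframe and leaves `toy` unchanged, so the result has to be assigned to a variable to be kept.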