# Extract Data From YouTube Videos

Created on: 11/12/2021

## Objective

Given the URL of a YouTube video, extract and store that video's metadata.

## Procedure

Reads the records from `raw_search_dataset.feather`.

Uses `youtube_dl` to fetch the metadata for each URL.

Combines everything and writes the files `raw_videos_dataset_nolabel.feather` and `raw_videos_dataset_nolabel.csv`.

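## Results

Generates the video DataFrames with the following columns: `uploader`, `title`, `upload_date`, `user`, `view_count`, `like_count`, `thumbnail`, `width`, `height`, `categories`, `tags`, `channel_url`, `description`.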

## Next Steps

The data was saved as `raw_videos_dataset_nolabel`.

**THE DATASET DOES NOT HAVE ITS TARGET LABEL YET**. So far we have collected all the X-axis data (the features used for prediction), but we do not know which of these videos will or will not be classified as good.

The most basic way to get that label would be: **WATCH EACH VIDEO AND MARK WHETHER IT INTERESTS YOU OR NOT**.
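
The manual labeling step could start from a sheet that has an empty label column to fill in by hand. The following is only a minimal sketch in plain Python: the column name `label` and the helper names `add_empty_label` and `to_csv_text` are assumptions for illustration and are not part of this notebook.

```python
import csv
import io


def add_empty_label(rows, label_column="label"):
    """Return copies of the rows with an empty label column appended."""
    return [{**row, label_column: ""} for row in rows]


def to_csv_text(rows):
    """Serialize a non-empty list of dict rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Two videos taken from the search dataset preview
videos = [
    {
        "title": "How Data Engineering Works",
        "webpage_url": "https://www.youtube.com/watch?v=qWru-b6m030",
    },
    {
        "title": "What Skills Do Data Engineers Need To Know",
        "webpage_url": "https://www.youtube.com/watch?v=LgSHaOvNodA",
    },
]

labeling_sheet = add_empty_label(videos)
print(to_csv_text(labeling_sheet))
```

Each row can then be marked (for example `1` for interesting, `0` otherwise) after watching the video, and the filled column becomes the target.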

In [1]:
import pandas as pd
import numpy as np
import re
import time
import datetime
import requests as rq

import youtube_dl 
from tqdm import tqdm_notebook

pd.set_option("display.max_columns", None)

## Collection

- The data was collected by the `coleta_videos.py` script

In [3]:
df_videos = pd.read_feather("raw_search_dataset.feather")
print('Dataset:', df_videos.shape)
df_videos.head()

Dataset: (1746, 4)


Unnamed: 0,title,webpage_url,upload_date,view_count
0,How Data Engineering Works,https://www.youtube.com/watch?v=qWru-b6m030,2021-03-17,70717
1,[Parte 01] Como é o Trabalho de um Data Engine...,https://www.youtube.com/watch?v=nQu6s8FPhfA,2020-05-25,11121
2,Data Engineering Road Map - How To Learn Data ...,https://www.youtube.com/watch?v=SpaFPPByOhM,2021-07-06,48565
3,Data Scientists vs Data Engineers: Which one i...,https://www.youtube.com/watch?v=vmYaAzbv9xk,2019-12-19,246228
4,What Skills Do Data Engineers Need To Know,https://www.youtube.com/watch?v=LgSHaOvNodA,2021-04-08,33364
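
The extraction loop in this notebook iterates over `df_videos['webpage_url'].unique()`, which suggests that the same video may appear in more than one row. As a rough illustration of what that deduplication does, here is a plain-Python sketch; `unique_in_order` is a hypothetical helper, and the repeated URL is added on purpose.

```python
def unique_in_order(urls):
    """Drop repeated URLs while keeping the first occurrence of each."""
    return list(dict.fromkeys(urls))


urls = [
    "https://www.youtube.com/watch?v=qWru-b6m030",
    "https://www.youtube.com/watch?v=nQu6s8FPhfA",
    "https://www.youtube.com/watch?v=qWru-b6m030",  # repeated on purpose
]

unique_urls = unique_in_order(urls)
print(len(urls), "rows,", len(unique_urls), "unique URLs")
```

On the real DataFrame, `df_videos['webpage_url'].nunique()` gives the number of distinct URLs, which is a quick way to know how many requests the loop will make.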


### Data Verification

In [4]:
ydl = youtube_dl.YoutubeDL({"ignoreerrors": True, 'verbose':False})

In [5]:
# Takes about 50 minutes to run
video_list = []
error_videos_convert = []

for link in df_videos['webpage_url'].unique():

    # Fetch the metadata only (no download); skip the link if extraction fails
    try:
        r = ydl.extract_info(url=link, download=False)
        # upload_date comes as 'YYYYMMDD'
        year = r['upload_date'][:4]
        month = r['upload_date'][4:6]
        day = r['upload_date'][6:]
    except Exception:
        print(f"Fail - {link}")
        continue

    # Build one flat row per video; any error here (for example a missing key)
    # records the link in error_videos_convert
    try:
        video_info = {
                'uploader': r['uploader'],
                'title': r['title'],
                'upload_date': f"{year}-{month}-{day}",
                'user': r['uploader_id'],
                'view_count': r['view_count'],
                'like_count': r['like_count'],
                # 'dislike_count': r['dislike_count'], ## Dislike counts were removed in 2021
                'thumbnail': r['thumbnail'],
                'width': r['width'],
                'height': r['height'],
                # Lists are flattened into pipe-separated strings
                'categories': '|'.join(r['categories']) if r['categories'] is not None else None,
                'tags': '|'.join(r['tags']) if r['tags'] is not None else None,
                'channel_url': r['channel_url'],
                'description': r['description']
                }
        video_list.append(video_info)
    except Exception:
        print(f"Fail to CONVERT - {link}")
        error_videos_convert.append(link)
        continue
    
    

[youtube] qWru-b6m030: Downloading webpage
[youtube] nQu6s8FPhfA: Downloading webpage
[youtube] nQu6s8FPhfA: Downloading MPD manifest
[youtube] SpaFPPByOhM: Downloading webpage
[youtube] vmYaAzbv9xk: Downloading webpage
[youtube] LgSHaOvNodA: Downloading webpage
[...]
[youtube] d6KqD3ehgr0: Downloading MPD manifest
Fail to CONVERT - https://www.youtube.com/watch?v=d6KqD3ehgr0
[...]
[youtube] ASEKytFbeLw: Downloading webpage
Fail to CONVERT - https://www.youtube.com/watch?v=ASEKytFbeLw
[...]
[youtube] mqnwOvFNek4: Downloading webpage
[youtube] mqnwOvFNek4: Downloading MPD manifest
Fail to CONVERT - https://www.youtube.com/watch?v=mqnwOvFNek4
[...]

(Output truncated: the log prints one `Downloading webpage` line per video, sometimes followed by `Downloading MPD manifest`. The captured output was itself cut in several places, so only the failures visible in it are kept above.)
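
The captured log does not say why some links print `Fail to CONVERT`. One possible cause is a key missing from the info dict, which makes an `r['...']` lookup raise `KeyError`. The sketch below shows a more tolerant way to build a row, using `dict.get` so that a missing field becomes `None` instead of discarding the whole video. The helper names `format_upload_date`, `join_or_none` and `flatten_info` are hypothetical, and the sample dict is a hand-written stand-in for an extractor result, not real `youtube_dl` output.

```python
from datetime import datetime


def format_upload_date(raw):
    """Convert 'YYYYMMDD' to 'YYYY-MM-DD'; return None if missing or malformed."""
    if not raw:
        return None
    try:
        return datetime.strptime(raw, "%Y%m%d").strftime("%Y-%m-%d")
    except (ValueError, TypeError):
        return None


def join_or_none(values):
    """Flatten a list into a pipe-separated string, keeping None as None."""
    return "|".join(values) if values is not None else None


def flatten_info(info):
    """Build one flat row from an info dict, tolerating missing keys."""
    return {
        "uploader": info.get("uploader"),
        "title": info.get("title"),
        "upload_date": format_upload_date(info.get("upload_date")),
        "user": info.get("uploader_id"),
        "view_count": info.get("view_count"),
        "like_count": info.get("like_count"),
        "thumbnail": info.get("thumbnail"),
        "width": info.get("width"),
        "height": info.get("height"),
        "categories": join_or_none(info.get("categories")),
        "tags": join_or_none(info.get("tags")),
        "channel_url": info.get("channel_url"),
        "description": info.get("description"),
    }


# Hand-written sample: like_count and categories are left out on purpose
sample_info = {
    "uploader": "AltexSoft",
    "title": "How Data Engineering Works",
    "upload_date": "20210317",
    "view_count": 70764,
    "tags": ["data engineering", "data science"],
}

row = flatten_info(sample_info)
print(row["upload_date"], row["like_count"], row["tags"])
```

The trade-off is that rows with missing fields enter the dataset as nulls and have to be handled later, whereas the loop in this notebook drops such videos and records their links in `error_videos_convert`.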

In [6]:
df_video_info = pd.DataFrame(video_list)
print(df_video_info.shape)
df_video_info.head()

(1661, 13)


Unnamed: 0,uploader,title,upload_date,user,view_count,like_count,thumbnail,width,height,categories,tags,channel_url,description
0,AltexSoft,How Data Engineering Works,2021-03-17,AltexSoftChannel,70764,4014,https://i.ytimg.com/vi/qWru-b6m030/maxresdefau...,1920,1080,Science & Technology,data engineering|data science|data infrastruct...,https://www.youtube.com/channel/UCEKI_F16hUtBH...,"So, the sole purpose of data engineering is to..."
1,Seja Um Data Scientist,[Parte 01] Como é o Trabalho de um Data Engine...,2020-05-25,UCar5Cr-pVz08GY_6I3RX9bA,11125,814,https://i.ytimg.com/vi_webp/nQu6s8FPhfA/maxres...,1920,1080,Science & Technology,data engineer|engenheiro de dados o que faz|fo...,https://www.youtube.com/channel/UCar5Cr-pVz08G...,"Nesse vídeo, eu vou mostrar qual o papel de um..."
2,Seattle Data Guy,Data Engineering Road Map - How To Learn Data ...,2021-07-06,UCmLGJ3VYBcfRaWbP6JLJcpA,48598,2561,https://i.ytimg.com/vi_webp/SpaFPPByOhM/maxres...,1920,1080,Education,big data|data analytics|tableau|sql|big query|...,https://www.youtube.com/channel/UCmLGJ3VYBcfRa...,How do you go from 0 to data engineer?\n\nWhat...
3,Joma Tech,Data Scientists vs Data Engineers: Which one i...,2019-12-19,UCV0qA-eDDICsRR9rPcnG7tw,246236,6683,https://i.ytimg.com/vi/vmYaAzbv9xk/maxresdefau...,1920,1080,Education,joma|vlog|data scientist|data science|data eng...,https://www.youtube.com/channel/UCV0qA-eDDICsR...,📚 Video courses from JomaClass:\n🎓 New to prog...
4,Seattle Data Guy,What Skills Do Data Engineers Need To Know,2021-04-08,UCmLGJ3VYBcfRaWbP6JLJcpA,33370,1870,https://i.ytimg.com/vi/LgSHaOvNodA/maxresdefau...,1920,1080,Education,big data|data analytics|tableau|sql|big query|...,https://www.youtube.com/channel/UCmLGJ3VYBcfRa...,Learn More about data engineering Googles DE c...


In [7]:
df_video_info.to_csv("raw_videos_dataset_nolabel.csv")
df_video_info.to_feather("raw_videos_dataset_nolabel.feather")
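
The loop collects the links that failed conversion in `error_videos_convert`, but the notebook does not save that list. A minimal sketch of persisting it for a later retry, assuming a hypothetical file name `failed_links.txt` and hypothetical helpers `save_failed_links` and `load_failed_links` (a temporary directory is used here only to keep the example self-contained):

```python
import tempfile
from pathlib import Path


def save_failed_links(links, path):
    """Write one URL per line so the failed links survive the session."""
    Path(path).write_text("".join(f"{link}\n" for link in links), encoding="utf-8")


def load_failed_links(path):
    """Read the URLs back, ignoring blank lines."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


# The three links reported as "Fail to CONVERT" in the captured log
error_videos_convert = [
    "https://www.youtube.com/watch?v=d6KqD3ehgr0",
    "https://www.youtube.com/watch?v=ASEKytFbeLw",
    "https://www.youtube.com/watch?v=mqnwOvFNek4",
]

with tempfile.TemporaryDirectory() as tmp_dir:
    failed_path = Path(tmp_dir) / "failed_links.txt"  # hypothetical file name
    save_failed_links(error_videos_convert, failed_path)
    reloaded = load_failed_links(failed_path)

print(len(reloaded), "failed links saved and reloaded")
```

Keeping the list next to the dataset makes it possible to retry only those links instead of repeating the full 50-minute run.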