💾 Data collection

Using the youtube-scraper package, data from 1000 YouTube videos were collected at a cost of 2200 YouTube Data API v3 quota units, that is, 22% of the daily quota.
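The quota figures above can be checked with a little arithmetic. The sketch below assumes the default YouTube Data API v3 allocation of 10,000 units per day (implied by the 22% figure) and assumes that cost scales linearly with the number of videos, which may not hold for every query. The variable names are illustrative and do not come from the scraper.

```python
# Assumption: default daily allocation for the YouTube Data API v3.
DAILY_QUOTA_UNITS = 10_000

# Figures reported for this data collection run.
units_used = 2_200
videos_collected = 1_000

# Share of the daily quota consumed by the run.
share_of_daily_quota = units_used / DAILY_QUOTA_UNITS

# Average quota cost of one collected video.
units_per_video = units_used / videos_collected

# Rough upper bound on videos per day, assuming the same average cost.
max_videos_per_day = DAILY_QUOTA_UNITS * videos_collected // units_used

print(f"Share of daily quota: {share_of_daily_quota:.0%}")
print(f"Average units per video: {units_per_video}")
print(f"Estimated videos per day: {max_videos_per_day}")
```

Under these assumptions, a single day of quota would cover roughly four and a half runs of this size.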

Execution time (Obtaining 100 videos for each query):

From scraping/ run:

go mod tidy
go run main.go

📝 Data "cleansing"

Using regular expressions in Python, unwanted characters such as emojis, URLs, and line breaks were removed.
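As a rough illustration of this kind of cleansing, here is a minimal sketch. The patterns and the `clean_text` helper are assumptions made for this example; the expressions actually used in cleansing.ipynb may differ, and the emoji ranges below cover common blocks only.

```python
import re

# Hypothetical patterns, not taken from the notebook.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
EMOJI_RE = re.compile(
    "["
    "\U0001F1E6-\U0001F1FF"  # regional indicator symbols (flags)
    "\U0001F300-\U0001FAFF"  # pictographs, emoticons, transport, extended symbols
    "\u2600-\u27BF"          # miscellaneous symbols and dingbats
    "\uFE0F\u200D"           # variation selector and zero-width joiner
    "]+"
)
WHITESPACE_RE = re.compile(r"\s+")


def clean_text(text: str) -> str:
    """Remove URLs and emojis, then collapse line breaks and extra spaces."""
    text = URL_RE.sub(" ", text)
    text = EMOJI_RE.sub(" ", text)
    # \s+ matches line breaks as well, so they become single spaces here.
    return WHITESPACE_RE.sub(" ", text).strip()


if __name__ == "__main__":
    sample = "Great video 🎉\nwatch https://youtu.be/abc now"
    print(clean_text(sample))
```

URLs are removed before whitespace is collapsed so that the gaps they leave behind are folded into single spaces.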

From cleansing/ run:

(optional) virtualenv -p python3 venv
(optional) source venv/bin/activate

pip install notebook 
jupyter notebook

Then, open the cleansing.ipynb file.
