
This repository automates the scraping of every film shown on Spanish public TV, using rvest and GitHub Actions.


GuilleDiaz7/Automatic-Web-Scraping-of-Spanish-TDT-Films


Automatic Web Scraping of Spanish Public TV Films

This repository automates the collection of data on films shown on Spanish public TV, known as TDT (Televisión Digital Terrestre). It uses rvest, the R package for web scraping, together with GitHub Actions.

The data comes from the following website.

The website is updated every day and provides the film title (both the original and the Spanish version), the genre, a brief synopsis, the TV channel, and the broadcast day and time.
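As a rough illustration of how those fields could be pulled out with rvest, here is a minimal sketch. It parses an inline HTML snippet rather than the real website, and the markup, the CSS classes (`.film`, `.title-es`, and so on), the sample values, and the helper `film_field()` are all hypothetical; the selectors in the actual R script will depend on the structure of the source page.

```r
library(rvest)

# Hypothetical markup standing in for the real listing page.
page <- minimal_html('
  <div class="film">
    <span class="title-es">Pelicula de ejemplo</span>
    <span class="title-original">Example Film</span>
    <span class="genre">Drama</span>
    <p class="synopsis">A short synopsis.</p>
    <span class="channel">La 2</span>
    <span class="datetime">2024-01-15 22:00</span>
  </div>
  <div class="film">
    <span class="title-es">Otra pelicula</span>
    <span class="title-original">Another Film</span>
    <span class="genre">Comedia</span>
    <p class="synopsis">Another short synopsis.</p>
    <span class="channel">La 1</span>
    <span class="datetime">2024-01-15 23:30</span>
  </div>
')

# One node per film, so every column below has one value per film.
films <- html_elements(page, ".film")

# Hypothetical helper: text of the first match of `css` inside each film node.
film_field <- function(nodes, css) html_text2(html_element(nodes, css))

films_df <- data.frame(
  title_es       = film_field(films, ".title-es"),
  title_original = film_field(films, ".title-original"),
  genre          = film_field(films, ".genre"),
  synopsis       = film_field(films, ".synopsis"),
  channel        = film_field(films, ".channel"),
  datetime       = film_field(films, ".datetime"),
  stringsAsFactors = FALSE
)

print(films_df)
```

Selecting the film containers first and then calling `html_element()` on each one keeps the columns aligned, because it returns exactly one result per container.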

The workflows folder contains the .yaml file that tells GitHub Actions to scrape the data automatically by running an R script.
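For orientation, a scheduled workflow of this kind might look roughly like the sketch below. It is not the repository's actual file: the workflow name, the script name `scrape_films.R`, the schedule, and the commit message are assumptions chosen for illustration.

```yaml
# Hypothetical workflow sketch; names and schedule are assumptions.
name: scrape-films

on:
  schedule:
    - cron: "0 6 * * *"   # every day at 06:00 UTC
  workflow_dispatch:        # also allow manual runs from the Actions tab

permissions:
  contents: write           # needed so the job can push the updated data

jobs:
  scrape:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: r-lib/actions/setup-r@v2
      - name: Install rvest
        run: Rscript -e 'install.packages("rvest")'
      - name: Run the scraper
        run: Rscript scrape_films.R   # hypothetical script name
      - name: Commit updated data
        run: |
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"
          git add -A
          git diff --cached --quiet || git commit -m "Update film data"
          git push
```

The five cron fields are minute, hour, day of month, month, and day of week, and GitHub evaluates them in UTC. The `git diff --cached --quiet ||` guard skips the commit on days when the scrape produces no changes.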

A quick report

If you want to see a report based on this data, just click the link.

Some useful resources

  1. Automate Web Scraping with GitHub Actions: video tutorial.

  2. Link to the repository used in the video tutorial: repository.

  3. A site that provides the cron expressions required to schedule the autoscrape: web.
