bilibili-link-grabber

Overview

This is a simple script that scrapes all the video URLs from certain BiliBili pages. It currently supports search result pages, a user's 投稿 (submissions) pages, and a user's 频道 (channel) pages. Instead of right-clicking each video to copy its URL, you can use this script to collect every video URL on those pages and write them to a CSV file. The CSV file can then be used with annie or youtube-dl to download every video listed in it (youtube-dl is not recommended, as it doesn't download all parts of a video).
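As a rough illustration of how the CSV output could feed a downloader, the sketch below reads URLs from a CSV file and builds one command per URL. It assumes the file holds one URL per row in the first column, which this README does not confirm, and the helper names `read_urls` and `build_commands` are hypothetical. The commands are only built, not executed.

```python
import csv
from pathlib import Path
from typing import List


def read_urls(csv_path: Path) -> List[str]:
    """Return first-column values that look like URLs (assumed layout: one URL per row)."""
    urls = []
    with csv_path.open(newline="", encoding="utf-8") as handle:
        for row in csv.reader(handle):
            # Skip blank rows and anything that is not a link (e.g. a header row).
            if row and row[0].strip().startswith("http"):
                urls.append(row[0].strip())
    return urls


def build_commands(urls: List[str], downloader: str = "annie") -> List[List[str]]:
    """Build one argument list per URL, ready to hand to subprocess.run."""
    return [[downloader, url] for url in urls]
```

Each command could then be run with `subprocess.run(command, check=True)`, provided the downloader is installed and on PATH.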

Installation

Requires Python; version 3.7+ is recommended.

This script requires the non-standard modules requests, BeautifulSoup, and Selenium. ChromeDriver is also required to navigate through BiliBili. Therefore, the following need to be installed:

The required modules. A requirements text file is included, and the command pip3 install -r requirements.txt (or pip) can be used to install them. If errors involving pip pop up, make sure Python was added to PATH when it was installed. It is also recommended that pip is installed and up to date.
ChromeDriver, which Selenium requires. Make sure you download the version that matches your Chrome build. It is recommended to place ChromeDriver in the same folder as this script, so that the driver path does not need to be specified later on. Note that by default the CSV file is written to the same folder as this script.
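The "same folder" convention can be pictured with a small sketch: use the explicitly given driver path if there is one, otherwise look next to the script. The helper `resolve_driver_path` is hypothetical and only illustrates the idea. The script's actual lookup logic is not shown in this README.

```python
import os
from pathlib import Path
from typing import Optional


def resolve_driver_path(driver_arg: Optional[str], script_dir: Path) -> Path:
    """Prefer an explicit driver path; otherwise assume ChromeDriver sits beside the script."""
    if driver_arg:
        return Path(driver_arg)
    # ChromeDriver's executable is named chromedriver.exe on Windows, chromedriver elsewhere.
    filename = "chromedriver.exe" if os.name == "nt" else "chromedriver"
    return script_dir / filename
```

In a script, `script_dir` would typically be `Path(__file__).resolve().parent`.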

The executable file in the repository is a basic demo version of the current link scraper script. It does not take command line arguments, which makes it easier to use. The file is temporary and will probably be removed after a little more progress on the link scraper script.

Options and Usages

python blinkgrab.py [OPTIONS] -l [URL]

python blinkgrab.py -h

  -h, --help         show this help message and exit
  -n, --name         Name of the csv file
  -d, --driver       Absolute web driver path
  -p, --save         The user's chosen absolute save path for the csv file 
  -l, --link         Link to extract video urls
  -w, --wait         The number of seconds to stop and wait for the browser to load. Recommended only if the browser
                        is taking longer than 2 seconds to load. Prevents links from being extracted from the same
                        page when the next page has yet to load.
  -p, --page         Select specific page(s) to scrape
  -a, --append       Scrape links to an existing csv file
  -q, --quiet        Limit the information printed onto the console as the script executes

Example: python blinkgrab.py -l <Bilibili Link>

Note: If an argument contains '&', it is recommended to enclose the entire argument in single or double quotes.
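For readers who want to see how options like these are commonly wired up, here is a minimal argparse sketch that mirrors the help text. It is not the script's actual code. The argument types and the on/off behaviour of --append and --quiet are assumptions, and because the help text lists -p for both --save and --page, the sketch registers only their long forms. The search URL is illustrative.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (a sketch, not the script's code)."""
    parser = argparse.ArgumentParser(prog="blinkgrab.py")  # -h/--help is added automatically
    parser.add_argument("-l", "--link", required=True, help="Link to extract video urls")
    parser.add_argument("-n", "--name", help="Name of the csv file")
    parser.add_argument("-d", "--driver", help="Absolute web driver path")
    # The help text lists -p for both --save and --page; argparse rejects a
    # duplicated short flag, so this sketch registers only the long forms.
    parser.add_argument("--save", help="Absolute save path for the csv file")
    parser.add_argument("--page", help="Specific page(s) to scrape")
    # Assumption: the wait time is a number of seconds.
    parser.add_argument("-w", "--wait", type=float, help="Seconds to wait for the browser to load")
    # Assumption: --append and --quiet are plain on/off switches.
    parser.add_argument("-a", "--append", action="store_true", help="Add links to an existing csv file")
    parser.add_argument("-q", "--quiet", action="store_true", help="Limit console output")
    return parser


# Passing arguments as a list bypasses the shell, so '&' needs no quoting here.
example_url = "https://search.bilibili.com/all?keyword=test&page=2"  # illustrative URL
args = build_parser().parse_args(["-l", example_url, "-w", "5", "-q"])
print(args.link, args.wait, args.quiet)
```

On a real command line the same URL has to be quoted, because most shells treat an unquoted '&' as a command separator or background operator.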

This script is intended to be used on the following BiliBili pages:

BiliBili search pages

User's 投稿 (submissions) pages with the 全部 (all) filter

User's 频道 (channel) pages, which in most cases are single-page applications (meaning the URL doesn't change no matter which page you're on)

About Me

As a data hoarder, anime enthusiast, and self-proclaimed Japanophile, I needed a way to get all the video links from certain BiliBili pages in order to download the videos. Normally I would search for a program that would let me do just that, but as a Computer Science student (with a concentration in Software Engineering), I felt it was time to create and work on a project of my own. I chose Python because I wanted to practice and learn more about the language. I had fun learning about web scraping and enjoyed working on this script! Any tips or feedback on this script are much appreciated, as I am sure there is plenty of room for improvement!
