euberdeveloper/tum-conf-scraper

tum-conf-scraper

A scraper written in Node.js, using Puppeteer, that records the videos served by TUM Conf (Zoom) services.

Install

To install tum-conf-scraper, run:

$ npm install tum-conf-scraper

Project purpose

This module exists because videos hosted on TUM Conf are difficult to download and can be watched only in the browser. It builds on the video-scraper-core module and allows those videos to be recorded.

Project usage

To scrape a video available at "https://tum-conf.zoom.us/rec/share/myvideo" and save it to "./saved.webm":

const { TumConfVideoScraper } = require('tum-conf-scraper');

async function main() {
    // Create an instance of the scraper
    const scraper = new TumConfVideoScraper('mypasscode', {
        debug: true
    });
    // Launch the Chrome browser
    await scraper.launch();
    // Scrape and save the video
    await scraper.scrape('https://tum-conf.zoom.us/rec/share/myvideo', './saved.webm');
    // Close the browser
    await scraper.close();
}
main();

To scrape and download more than one video:

const { TumConfVideoScraper } = require('tum-conf-scraper');

async function main() {
    // Create an instance of the scraper
    const scraper = new TumConfVideoScraper('mypasscode', {
        debug: true
    });
    // Launch the Chrome browser
    await scraper.launch();
    // Scrape and save the first video
    await scraper.scrape('https://tum-conf.zoom.us/rec/share/myvideo', './saved.webm');
    // Scrape and save the second video
    await scraper.scrape('https://tum-conf.zoom.us/rec/share/myvideo-bis', './saved_bis.webm');
    // Close the browser
    await scraper.close();
}
main();

To scrape and download more than one video in parallel:

const { TumConfVideoScraper } = require('tum-conf-scraper');

async function scrape(dest, link) {
    // Create an instance of the scraper
    const scraper = new TumConfVideoScraper('mypasscode', {
        debug: true
    });
    // Launch the Chrome browser
    await scraper.launch();
    // Scrape and save the video
    await scraper.scrape(link, dest);
    // Close the browser
    await scraper.close();
}

async function main() {
    const tasks = [
        ['./saved.webm', 'https://tum-conf.zoom.us/rec/share/myvideo'],
        ['./saved_bis.webm', 'https://tum-conf.zoom.us/rec/share/myvideo-bis']
    ].map(([dest, link]) => scrape(dest, link));
    await Promise.all(tasks);
}
main();
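
Each parallel task above launches its own headful browser window, so a long list of videos may open many windows at once. As a minimal sketch (not part of tum-conf-scraper), the hypothetical helper runInBatches below runs the tasks in groups of at most limit and waits for each group to finish before starting the next. The helper name and its parameters are assumptions made for illustration.

```javascript
// Hypothetical helper, not part of tum-conf-scraper: runs `worker` on every
// task, at most `limit` tasks at a time, and returns the results in order.
async function runInBatches(tasks, limit, worker) {
    if (!Number.isInteger(limit) || limit < 1) {
        throw new RangeError('limit must be a positive integer');
    }
    const results = [];
    for (let i = 0; i < tasks.length; i += limit) {
        const batch = tasks.slice(i, i + limit);
        // All the tasks of the current batch run concurrently
        const batchResults = await Promise.all(batch.map(task => worker(task)));
        results.push(...batchResults);
    }
    return results;
}
```

With the scrape(dest, link) function from the parallel example, a call such as runInBatches(pairs, 2, ([dest, link]) => scrape(dest, link)), where pairs is the array of [dest, link] entries, would keep at most two browsers open at any time.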

With custom options:

const { TumConfVideoScraper } = require('tum-conf-scraper');

async function main() {
    // Browser options
    const scraper = new TumConfVideoScraper('mypasscode', {
        debug: true,
        debugScope: 'This will be written as scope of the euberlog debug',
        windowSize: {
            width: 1000,
            height: 800
        },
        browserExecutablePath: '/usr/bin/firefox'
    });
    await scraper.launch();

    // Scraping options
    await scraper.scrape('https://tum-conf.zoom.us/rec/share/myvideo', './saved.webm', { duration: 1000 });
    await scraper.scrape('https://tum-conf.zoom.us/rec/share/myvideo-bis', './saved_bis.webm', { 
        audio: false,
        delayAfterVideoStarted: 3000,
        delayAfterVideoFinished: 2000 
    });

    await scraper.close();
}
main();

All the options are listed in the API section and in the TypeScript definitions.

API

The documentation site is: tum-conf-scraper documentation

The development documentation site is: tum-conf-scraper dev documentation

TumConfVideoScraper

The TumConfVideoScraper class scrapes a video from TUM Conf and saves it to a file.

Syntax:

const scraper = new TumConfVideoScraper(passcode, options);

Parameters:

  • passcode: A string that specifies the passcode to access the video page.
  • options: Optional. A BrowserOptions object that specifies the options for this instance.

Methods:

  • setBrowserOptions(options: BrowserOptions): void: Changes the browser options with the ones given by the options parameter.
  • launch(): Promise: Launches the browser window.
  • close(): Promise: Closes the browser window.
  • scrape(url: string, destPath: string, options: ScrapingOptions): Promise: Scrapes the video at url and saves it to destPath. The options parameter can be omitted, as in the usage examples.
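
The methods above are meant to be called in the order launch, scrape, close. If scrape throws, the browser window would stay open unless close is still called. A minimal sketch of that lifecycle follows; withLaunchedScraper is a hypothetical helper, not part of the module, and it relies only on the launch and close methods listed above.

```javascript
// Hypothetical helper, not part of tum-conf-scraper: launches the scraper,
// runs `job` with it, and closes the browser even if `job` throws.
async function withLaunchedScraper(scraper, job) {
    await scraper.launch();
    try {
        return await job(scraper);
    } finally {
        // Runs on success and on failure, so the browser window is not left open
        await scraper.close();
    }
}
```

It could be used as withLaunchedScraper(new TumConfVideoScraper('mypasscode'), s => s.scrape(url, destPath)).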

BrowserOptions

The options given to the TumConfVideoScraper constructor. See video-scraper-core for more information.

ScrapingOptions

The options passed to the scrape method. See video-scraper-core for more information.

Errors

This module can also throw some error classes. See video-scraper-core for more information.
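
When several videos are scraped in sequence, one failing video does not have to stop the others. The sketch below is a hypothetical helper (scrapeAll is not part of the module) that catches whatever scrape throws and collects the failures. It does not name any specific error class, since those are defined in video-scraper-core.

```javascript
// Hypothetical helper, not part of tum-conf-scraper: scrapes each
// [url, destPath] pair in sequence on an already launched scraper and
// records the failures instead of stopping at the first one.
async function scrapeAll(scraper, jobs) {
    const failures = [];
    for (const [url, destPath] of jobs) {
        try {
            await scraper.scrape(url, destPath);
        } catch (thrown) {
            // Normalise non-Error values so that name and message are always set
            const error = thrown instanceof Error ? thrown : new Error(String(thrown));
            failures.push({ url, name: error.name, message: error.message });
        }
    }
    return failures;
}
```

An empty array returned by scrapeAll means that every scrape call completed without throwing.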

Tests

The package is tested with jest and ts-jest. The tests actually download some videos and check that they are saved. Because they cannot run headless, they are not run in the CI.

Notes

  • The default browser is Google Chrome at /usr/bin/google-chrome, because Chromium did not support the videos. You can always change the browser executable path with the browserExecutablePath option.
  • By default (if the duration option is null), the duration of the recording is detected automatically from the vjs player on the page, and a stopping delay of 15 seconds is added.
  • This module can be used only in headful mode.
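
Since the default executable path may not exist on every machine, it can help to check for a browser before launching. The following is a minimal sketch under stated assumptions: findBrowserExecutable is a hypothetical helper, and any candidate path other than /usr/bin/google-chrome is an assumption about the local system.

```javascript
const fs = require('fs');

// Hypothetical helper, not part of tum-conf-scraper: returns the first
// candidate path that exists, or null if none does. `exists` can be replaced
// so the function can be exercised without touching the file system.
function findBrowserExecutable(candidates, exists = fs.existsSync) {
    for (const candidate of candidates) {
        if (exists(candidate)) {
            return candidate;
        }
    }
    return null;
}
```

A non-null result could then be passed as browserExecutablePath in the constructor options.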
