Bulk-converts articles contained in a passed wiki-article URL into PDF (with Dompdf)

OlegKorn/wikipedia-to-pdf

PHP Wikipedia article downloader & PDF converter (based on Dompdf 0.8.5: https://github.com/dompdf/dompdf/releases/tag/v0.8.5)

The main purpose is to fetch the articles linked from within a certain div of a given initial wiki article, whose URL is passed via the input, then convert them and download them as PDF files.

Example:

[Screenshot from 2020-02-09 15-44-38]
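As a rough illustration of the link-collection idea described above, here is a minimal sketch using PHP's built-in DOM extension. The function name `extractWikiLinks`, the div id `content`, and the sample HTML are assumptions made for this example; the README does not say which div the tool reads or how it parses the page.

```php
<?php
// Hypothetical helper (not taken from the repository): collect the article
// links found inside one <div> of a page.
// $divId is an assumption; it must not contain double quotes.
function extractWikiLinks(string $html, string $divId, string $baseUrl): array
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query(
        '//div[@id="' . $divId . '"]//a[starts-with(@href, "/wiki/")]'
    );

    $urls = [];
    foreach ($nodes as $a) {
        $href = $a->getAttribute('href');
        // Skip namespace pages such as /wiki/File:... or /wiki/Category:...
        if (strpos($href, ':') !== false) {
            continue;
        }
        $urls[$baseUrl . $href] = true; // array keys remove duplicates
    }
    return array_keys($urls);
}

// Sample input standing in for a downloaded article page.
$html = '<html><body><div id="nav"><a href="/wiki/Main_Page">Main</a></div>'
      . '<div id="content"><a href="/wiki/PHP">PHP</a>'
      . ' <a href="/wiki/File:X.png">img</a>'
      . ' <a href="/wiki/PHP">PHP again</a>'
      . ' <a href="https://example.org/">ext</a>'
      . ' <a href="/wiki/PDF">PDF</a></div></body></html>';

$links = extractWikiLinks($html, 'content', 'https://en.wikipedia.org');
print_r($links);
```

Only links inside the chosen div that point to `/wiki/...` article pages are kept; external links, namespace pages, and duplicates are dropped.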

Modus operandi

  1. Reads the URL of the initial article, passed via the input.

 $url = urldecode($_POST["initialArticle"]);

  2. Derives a DB table name from the URL, replacing characters that are not allowed in a table name.

 // Keep only the part of the URL after "wiki/" (the article title).
 $tableName = substr($url, strpos($url, 'wiki/') + 5);
 // Replace parentheses and commas with underscores.
 $tableName = str_replace(["(", ")", ","], "_", $tableName);

  3. Checks whether the table already exists in the DB: if it exists and is not empty, prints the URLs stored in it. If the table doesn't exist, creates it and records the URLs in it.

  4. Downloads the PDFs.
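The table-name and caching steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: the README does not name the database engine, so an in-memory SQLite database accessed through PDO is used here, and the helper names `tableNameFromUrl` and `loadOrStoreUrls`, the `url` column, and the extra character filter are all hypothetical.

```php
<?php
// Derive a table name from the article URL as in the steps above, then
// additionally restrict it to [A-Za-z0-9_] (an extra precaution assumed
// by this sketch, not something the README describes).
function tableNameFromUrl(string $url): string
{
    $name = substr($url, strpos($url, 'wiki/') + 5);
    $name = str_replace(['(', ')', ','], '_', $name);
    return preg_replace('/[^A-Za-z0-9_]/', '_', $name);
}

// Return the stored URLs if the table already holds any; otherwise create
// the table (if needed), store $freshUrls in it, and return them.
function loadOrStoreUrls(PDO $pdo, string $table, array $freshUrls): array
{
    $pdo->exec("CREATE TABLE IF NOT EXISTS \"$table\" (url TEXT NOT NULL)");
    $cached = $pdo->query("SELECT url FROM \"$table\" ORDER BY rowid")
                  ->fetchAll(PDO::FETCH_COLUMN);
    if ($cached) {
        return $cached;
    }
    $insert = $pdo->prepare("INSERT INTO \"$table\" (url) VALUES (?)");
    foreach ($freshUrls as $u) {
        $insert->execute([$u]);
    }
    return $freshUrls;
}

// Assumption: in-memory SQLite stands in for the project's real database.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$table = tableNameFromUrl('https://en.wikipedia.org/wiki/Mercury_(planet)');
echo $table, PHP_EOL; // Mercury__planet_

// First call stores the URLs; the second call returns the stored ones.
$first  = loadOrStoreUrls($pdo, $table, ['https://en.wikipedia.org/wiki/Venus']);
$second = loadOrStoreUrls($pdo, $table, ['https://en.wikipedia.org/wiki/Mars']);
print_r($second);
```

Because the table name is interpolated into the SQL text (identifiers cannot be bound as parameters), the sketch filters it down to letters, digits, and underscores before use.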
