Google Summer of Code 2019 | The Linux Foundation | Improving pdftoraster filter to use stable Poppler APIs

Tanmay Anand edited this page Aug 21, 2019 · 6 revisions

Introduction

This repository is part of the project "Improve the pdftoraster filter", carried out during Google Summer of Code 2019 with The Linux Foundation as the mentoring organization.

About Me

I am Tanmay Anand, a pre-final-year student at the Indian Institute of Technology, Kanpur. This is my first time taking part in the Google Summer of Code program.

Mentors

  1. Sahil Arora: Sahil is an alumnus of IIT Mandi. He has made great contributions to The Linux Foundation, took part in GSoC 17 and 18 under The Linux Foundation, and has worked extensively on bannertopdf and PCLm support for cups-filters. He is currently an Analyst at Goldman Sachs.
  2. Till Kamppeter: Till Kamppeter was invited in 2006 to work for the Free Standards Group (now The Linux Foundation), merging http://linuxprinting.org into OpenPrinting and leading the OpenPrinting project full time. With OpenPrinting, he leads the development of new printing architectures, technologies, printing infrastructure, and interface standards for Linux and Unix-style operating systems. For this, he is in contact with the leading printer manufacturers, all relevant free software projects, and the distribution vendors. Till is also a distinguished Linux Foundation Fellow.

Acknowledgements

I would like to express my sincere thanks to my mentors, Sahil and Till, for being extremely supportive throughout the project. I would also like to express my gratitude to Aveek Basu (basu [dot] aveek [at] gmail [dot] com), who found me during this year's Google Summer of Code selections and selected me to contribute to such great projects.

About CUPS and CUPS-FILTERS

CUPS is the standards-based, open source printing system developed by Apple Inc. for macOS® and other UNIX®-like operating systems. CUPS uses the Internet Printing Protocol (IPP) to support printing to local and network printers. CUPS is the primary printing software. cups-filters is a software package shipped alongside CUPS on non-macOS operating systems. As the name suggests, it is mainly responsible for filtering the data that goes to the printer, for example converting the print file to a format supported by the printer or getting IPP attributes from the printer.

About the project

Problem

Previously, pdftoraster used Poppler's internal APIs, which are unstable: their function definitions change between releases, which caused build and installation errors. The most plausible solution is to use Poppler's stable APIs instead of the unstable internal ones, and to rewrite any functionality that the stable APIs do not provide. I created a repository specifically for testing pdftoraster, and you can find it here: @tanmayanand44/Automated_conversion_pdftoraster_tests.

pdftoraster, as the name suggests, is the filter in the cups-filters source that converts PDF to the raster data format.

When the pdftoraster filter was originally written, it used the XPDF APIs of Poppler. However, those APIs became unsupported in recent versions of Poppler and were then removed from Poppler's source code entirely. As a result, the pdftoraster filter could no longer be built or used on any operating system, and cups-filters as a whole failed to build as well. Hence, it was very important to find a solution to this problem.

Solution

The approach taken was to replace every piece of pdftoraster functionality that used Poppler's unstable APIs with an equivalent built on the stable APIs.

Impact

Since we now use the stable APIs of Poppler, there are no build errors from missing functions, changed definitions, etc. whenever Poppler is updated.

Challenges

Firstly, it took me a lot of time to understand the huge codebase. Then, when I was almost finished, I realized there were color space issues which led to segmentation faults. There were also issues relating to page size rendering and margins.

Achievements

With all of this in place, we are able to correctly replicate all the required functionality of pdftoraster while eliminating the unstable Poppler APIs from its source. The code has been merged into the master branch of cups-filters and will first be released in Ubuntu 19.10 (Eoan Ermine), resolving a long-standing problem.

Code Links

Issues resolved:

  1. pdftoraster uses non-public/internal APIs of Poppler #9
  2. pdftoraster: New version segfaults on imagetopdf output #131
  3. Graceful/Consistent handling of zero-page jobs #117

Pull requests created:

  1. Integration of Stable API of poppler #125
  2. Added handling of zero page input and minor code quality improvement #127
  3. Added antialiasing for better raster images #129
  4. Fixes offset issues leading to segmentation faults in pdftoraster #132
  5. Fixes crashing of imagetopdf when ppd is not provided #133
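
Pull request 2 above deals with zero-page input. As an illustration only, one plausible policy for a filter is to treat an empty job as a success: log a prefixed message to stderr, write nothing to the output, and return status 0. The Python sketch below shows that policy. `run_filter`, its parameters, and the message text are hypothetical and are not taken from cups-filters.

```python
import io
import sys


def run_filter(pages, out):
    """Write every page to `out` and return a process exit status.

    `pages` is a list of bytes objects standing in for rendered pages.
    This is a hypothetical helper, not the cups-filters implementation.
    """
    if not pages:
        # Assumed policy: an empty job is not an error, so log and succeed.
        print("DEBUG: Input is empty, producing no output.", file=sys.stderr)
        return 0
    for page in pages:
        out.write(page)
    return 0


if __name__ == "__main__":
    buffer = io.BytesIO()
    status = run_filter([], buffer)
    print(status, len(buffer.getvalue()))
```

The point of the design is consistency: an empty job and a normal job take the same exit path, so the rest of the filter chain does not need a special case.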

Commits:

All the commits can be found in the pull requests above. Here is the list:

  1. Integration of Stable api's of poppler
  2. Added output message prefixes
  3. Minor poppler page enum fix
  4. Fixed compile warnings
  5. Added support for cmy and rgbw pdftoraster conversions
  6. COPYING: Updated copyright information
  7. Added handling of zero page input and minor code quality improvement
  8. Added antialiasing for better raster images
  9. Fixes offset issues leading to segmentation faults
  10. Fixes crashing of imagetopdf when ppd is not provided
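
Commit 5 above adds CMY and RGBW output. The sketch below is not the code from that commit. It is a minimal Python illustration of what such per-pixel conversions can look like, assuming 8-bit RGB input. All function names are hypothetical, and the white-extraction rule used for RGBW (W = min(R, G, B)) is only one common choice, not necessarily the one pdftoraster uses. The length check guards against the kind of offset mistake that could otherwise read past the end of a line buffer.

```python
def rgb_to_cmy(r, g, b):
    """CMY is the complement of RGB for 8-bit channels."""
    return 255 - r, 255 - g, 255 - b


def rgb_to_rgbw(r, g, b):
    """Assumed rule: move the common part of R, G and B into a W channel."""
    w = min(r, g, b)
    return r - w, g - w, b - w, w


def convert_line(rgb_line, convert):
    """Apply a per-pixel `convert` function to a packed RGB line of bytes."""
    if len(rgb_line) % 3 != 0:
        # Refuse partial pixels instead of indexing past the buffer.
        raise ValueError("RGB line length must be a multiple of 3")
    out = bytearray()
    for i in range(0, len(rgb_line), 3):
        out.extend(convert(rgb_line[i], rgb_line[i + 1], rgb_line[i + 2]))
    return bytes(out)


if __name__ == "__main__":
    line = bytes([255, 0, 0, 10, 20, 30])
    print(list(convert_line(line, rgb_to_cmy)))
    print(list(convert_line(line, rgb_to_rgbw)))
```

Passing the conversion in as a function keeps the line loop identical for every target color space, so only the per-pixel rule changes.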

Additional Tasks

With great support from my mentors, I was able to finish my official project ahead of schedule and was therefore assigned additional tasks.

  1. Bug fixes in imagetopdf
  2. GTK3+ Adaptor Backend: This project is a continuation of last year's GSoC work. I am currently working on it and will be able to finish it soon.

Footnotes

cups-filters

cups-filters was split out of CUPS as of CUPS version 1.6.x. It contains the filters and backends which Apple does not need for Mac OS X and therefore did not want to maintain anymore. Till Kamppeter took over this part as an OpenPrinting project named cups-filters. He added cups-browsed, as CUPS itself no longer automatically made remote CUPS queues available locally. He also took over maintainership of all the CUPS features that Apple had given up. Over time, cups-filters was improved: it switched to a PDF-based printing workflow and gained legacy CUPS broadcasting/browsing, sophisticated filtering of remote printers, automatic setup of remote IPP printers, driverless printing, and more, all while staying compatible with new CUPS features.