Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unexpected behaviour due to fpdf being the module name #925

Closed
nemkin opened this issue Sep 14, 2023 · 12 comments · Fixed by #1042
Closed

Unexpected behaviour due to fpdf being the module name #925

nemkin opened this issue Sep 14, 2023 · 12 comments · Fixed by #1042
Labels

Comments

@nemkin
Copy link

nemkin commented Sep 14, 2023

fpdf2 is a fork of fpdf, which seems to be no longer maintained. Both projects have the fpdf module name, which is really confusing and causes issues due to API changes between the two.

Error details

Consider for example seeing this python script:

from fpdf import FPDF

pdf = FPDF()
pdf.add_page(format=(100, 100))
content.set_fill_color(0,0,0)
content.rect(0, 0, 100, 100, 'F')
content.output('result.pdf')

Based on what you see, you'd want to run pip install fpdf in order to use the script, which succeeds. However that results in the following error, as the API has been changed in fpdf2.

$ python example.py 
Traceback (most recent call last):
  File "/home/user/example.py", line 4, in <module>
    pdf.add_page(format=(100, 100))
TypeError: add_page() got an unexpected keyword argument 'format'

Looking up the error, you can see other people are having similar issues [1, 2, 3, 4].

It seems to me that the best solution would be to have pip install fpdf install fpdf2, but that requires permission from its maintainer.

If not, there should be clear distinction between the two projects, e.g. use import fpdf2, so it is clear which one of the two a Python script was based on.

I've commented this on your issue in the fpdf repository too.

@nemkin nemkin added the bug label Sep 14, 2023
@Lucas-C
Copy link
Member

Lucas-C commented Sep 14, 2023

Hi @nemkin

Thank you for taking the time to open a discussion on this subject.

I am one of fpdf2 maintainers, along with @gmischler & @andersonhc.

@reingart, the maintainer of fpdf, has not been very active on GitHub over the past years.
He reached out to me & @MartinThoma by email on March 2023.
In order to be fully transparent with fpdf2 users, I will share here the content of this quick exchange.
I haven't heard back from him since my answer below.

Hi Martin and Lucas,

Thanks for your email, and sorry if I didn't reply in the past.
I was a bit busy / lacking time for this project.
Also, I missed some notifications (and saw that anyway some were moving to a fork).

BTW, this library is still in use, it is not dead :-)

I saw the fpdf2 but honestly the change was too big to review, and there could be security and backward compatibility issues.

We had several projects in Argentina that depend on this (including localizations of Odoo/Tryton/etc.), mainly via PyAfipWs, a library to generate electronic invoices and talk with government webservices in the country.
As you can understand, this library is used to manage invoices (money), so I didn't feel comfortable switching blindly (especially when resources/time were scarce these last years).

Sadly I still have to maintain python2 support for some legacy users (specially in windows), and that doesn't seem supported by fpdf2 anymore.
There are other dependencies (mainly py2exe), that until recently didn't have good support for python3 and are important for our user base.
I hope in the near future to finish the python3 migration and drop python2.

Also, the new fpdf2 library seems to have some API changes and includes new third-party dependencies, so I don't know if this can raise conflicts.
I tried it but it seems that fpdf2 is not backwards compatible with the original version (created a ticket #111 in my project)
It seems that the project diverted from original goals of simplicity and symmetry with the upstream php lib, but also I saw that you did a great job with testing and enhancements.

Said that, I don't want to cause issues to the community, so I propose the following steps:

  • Move development to PyFPDF/fpdf2, rename it to fpdf and move it to /py-pdf/fpdf (or fork it there
  • Merge the PR 207 but marking my repo as legacy, not inactive/archived (at least until I can review, test and switch to fpdf2)
  • Rename the master branch to main, and giving you permissions so both forks are synced (this is optional, just to differentiate legacy and current branches)
  • Give me time to change dependencies to fpdf>=1.7.x <2.0.0 (legacy branch) in related projects
  • Put a template for new issues in my personal repo to point directly to /py-pdf/fpdf
  • Give you access to the pypi fpdf record to publish new releases

I think this is the best way to keep history and support a legacy 1.7.x version to help with the transition.

I would like to retain admin/maintainer permissions of the project/pypi record just in case:

  • To publish any legacy 1.7.x version update if needed
  • To fix any packaging issue that this changes could cause
  • To contribute back / help to review new changes so I can start to use fpdf2 for projects here in Argentina

Hopefully the impact could be controlled, third party libraries should point to the pinned version or the full repo url.

Sorry if I am a bit conservative, I had bad experiences with this kind of transfers in the past.
For example, for pysimplesoap library, something similar happened and in the end I had to maintain a new incompatible codebase that grew too much, as new maintainers changed their interest and left.

I see fpdf2 is keeping a fast pace of maintenance, so luckily this time the handover will be more graceful.

Just let me know what do you think about this

Thanks to you both for your time and effort in this library and python ecosystem.

My answer at the time was:

Hi Mariano!

It's nice hearing from you ☺

Without your work on PyFPDF, I wouldn't have been able to create Undying Dusk, and later on become fpdf2 maintainer.
So thank you very much!

BTW, this library is still in use, it is not dead :-)

I know that PyFPDF is still used in many codebases and by many users 😄
However, the lack of activity on the GitHub repository for 5 to 8 years (as stated in this issue),
really made all of us think the project was dead...
Which leaded David Ankin (@alexanderanki) to create a fork,
and later on he handed maintership over to me.

I saw the fpdf2 but honestly the change was too big to review, and there could be security and backward compatibility issues.

Regarding backward compatibility issues, there may be some, but I really took care to maximize fpdf2 compatibilty with PyFPDF.
Regarding security, I think we actually solved a few vulnerabilities... E.g. regarding attacks using pickle or SVG (XML) images.

We had several projects in Argentina that depend on this, mainly via PyAfipWs, [...] I didn't feel comfortable switching blindly

I totally understand that, you are right to be careful 😊
And if you don't have any bug related to PyFPDF, nor any need to use one of the extra features that we added in the past 2 years,
you may be better off sticking with PyFPDF!

Sadly I still have to maintain python2 support for some legacy users (specially in windows), and that doesn't seem supported by fpdf2 anymore.

No, fpdf2 does not support Python 2.
I'm sorry for you if you have to support code for Python 2, which is not maintained since 2020 : https://pythonclock.org/ 😥

I tried it but it seems that fpdf2 is not backwards compatible with the original version (created a ticket #111 in my project)

I answered you yesterday in this GitHub issue 😊
We deprecated several things, that have been documented there by another contributor:
#433 (reply in thread)
I'd say that 90% of the deprecated API parameters are still working, but trigger a DeprecationWarning.

It seems that the project diverted from original goals of simplicity and symmetry with the upstream php lib, but also I saw that you did a great job with testing and enhancements.

We certainly diverted from the goal of sharing the same API as the original PHP lib (I admit that I was a bit horrified initially by the php.py module in PyFPDF source code ^^), but I still have the goal of keeping the library relatively simple, or at least accessible to novice Python developpers.

Said that, I don't want to cause issues to the community, so I propose the following steps: ...

Marking the PyFPDF repository as legacy / archived seems a really good idea to me ☺

If you want to designate fpdf2 as the official "successor" library of PyFPDF on its README page, I would be honored! ❤

However I am not convinced by the idea of renaming fpdf2 into fpdf.
fpdf2 has now gathered a significant community (for example, it has almost as many GitHub stars as PyFPDF),
and I do not want to make all fpdf2 users switch to another library.

Moreover, I do not really want to maintain the original PyFPDF library:
it is now very far behind fpdf2 (in terms of commits & features)
and personnally I cannot allocate more time at the moment to FLOSS projects

Now I'd be very happy to take in consideration the opinions & suggestions of fpdf & fpdf2 users regarding this 🙂
Just keep in mind that none of us at @py-pdf have ANY control over @reingart GitHub repository & the original Pypi package.

@nemkin
Copy link
Author

nemkin commented Sep 14, 2023

Thank you for sharing these emails!

It sounds like both fpdf and fpdf2 are in use by a large community and have differing priorities, so I'd refrain from asking for either of the two taking over the other.

So personally, I feel the only problem is this repository sharing a module name with another popular repository for a similar usecase, with overlapping, but non-identical API, which causes a lot of confusion when people share scripts online.

I'd argue most people, when they install dependencies, they'll do:

  1. pip install <module name>
  2. And if that fails, they'll search for "how to install <module name>".

You can see how both of those steps result in unexpected and confusing results for this repository.

Even though I have specified in my personal case which of the two repositories needs to be installed, it did not help, as noone expects having to pay attention to something like this, thus the error was unexpected on their side as well.

Also I wanted to add here that this & the pypdf module in this organisation are really awesome, I've had quite a hard time editing pdf files before I found these, so thank you for maintaining and improving this project. 😊


Edit:

Although I'm not familiar with Python module packaging, a suggestion for "graceful renaming" would be something like:

Publish fpdf2 under package name fpdf2 and module name fpdf2. Have it depend on a package with module name fpdf and some fitting package name like fpdf2-renamed or similar, specifically from you, not the original fpdf package. So when someone installs fpdf2, that will install your fpdf as its dependency.

This way users will still have an fpdf module installed on their system, when they pip install fpdf2. Have this package behave in the following way when imported: Throw a Deprecation Warning & if it is possible import fpdf2 under the name fpdf(?).

@Lucas-C
Copy link
Member

Lucas-C commented Sep 14, 2023

Thank you for your thoughtful & kind comments @nemkin 🙂

I guess that ideally fpdf2 should have imports like:

from fpdf2 import ...

But changing that now would be a major breaking change to perform in fpdf2...
Yes, that would clarify the distinction between fpdf & fpdf2,
but it would also cause trouble to many users!

We could try to have both kind of imports to work for some time (from fpdf ... AND from fpdf2 ...),
but I'm not entirely sure if that would work (cf. setup.cfg),
and if that's really a good idea...

I'd be very happy to know the points of view of other contributors & users on the subject 🙂

@Lucas-C Lucas-C added question and removed bug labels Sep 14, 2023
@gmischler
Copy link
Collaborator

Given how much more functionality fpdf2 offers, the level of compatibility that still exists between the two is in fact truly amazing. We maintain a list of the differences. Actual porting difficulties should be few and far between.

The align="I" in pyAfipWs #111 was probably meant to be an align="J", and gets silently ignored by the legacy version. I'd consider that lack of input validation a bug and not a compatibility issue.

If I may be even more blunt than usual, I never understood why someone wanted to replicate a PHP API in Python (or any language, and it never was a particularly great API even by PHP standards). But that's what we have now, and we can only try to improve on it.

While the original goal was to create a "simplistic" library, we have now settled on trying to keep it as simple and straightforward as possible, while offering more advanced features. In fact, the text shaping implemented with 2.7.5 is absolutely unique among Python libraries, and opens it up to a truly global audience.

To @reingart s individual points:

  • I don't really see the point of any formal "reunification", or changing the name of fpdf2 back to fpdf again. Given the years of history and solid user base, that would only result in even more confusion.

  • Marking PyFPDF as "legacy" is absolutely necessary (or "inactive", not sure what the difference is). Since development has stopped there years ago, the documentation should clearly recommend new projects to rather use fpdf2.

  • Why does it matter how the main branches of the two projects are named? How does "syncing" them (not sure what exactly that means) help differentiate between the two?

  • Obviously, the legacy project still needs to be around (and separate!) as long as it is in production use.

  • A Template for new issues on the legacy side informing about the current project sounds like a very helpful idea.

  • The pypi entries must also remain seperate in order to support legacy users. Of course the documentation there should also indicate the legacy status.

On our side, we should probably document somehow that having both packages installed on the same machine can lead to problems due to the identical import name. This should really only happen to legacy users while transitioning to fpdf2. But we need to give those a fair warning, so they can take appropriate measures to mitigate the issue.

@andersonhc
Copy link
Collaborator

I'd argue fpdf2 is simpler to work with comparing to the legacy fpdf. I have programs built before table was implemented and reworking them now is making them so much cleaner and easier to maintain.

The arguments for renaming the library and unforking (#906) are valid and make complete sense, however it would cause so many other problems for current users I don't think it's worth it.

@Lucas-C
Copy link
Member

Lucas-C commented Oct 9, 2023

We recently had another occurence of a user confusion between fpdf & fpdf2:
#948
😢

I took this opportunity to make some basic tests about the situations where both fpdf & fpdf2 packages are installed on a system at the same time:

  • pip install fpdf2; pip install fpdf results in fpdf (PyFPDF) being the "active" version loaded when running import fpdf

  • both pip install fpdf; pip install fpdf2 and pip install fpdf2 fpdf (whatever the order of arguments) always result in fpdf2 being the active version

All source files from both libraries are written by pip into .../python3.X/site-packages/fpdf/.

That's not ideal, but it means that we could easily add a check to fpdf2 that would be like this:

try:
    import .ttfonts  # this module only exists in PyFPDF, it has been removed in fpdf2 since v2.5.7
    warnings.warn("You have both PyFPDF & fpdf2 installed - To only keep fpdf2 run: pip uninstall --yes fpdf && pip install --upgrade fpdf2")
except ImportError:
    pass  # no PyFPDF installation detected

Do you think such check would be a good idea @andersonhc & @gmischler?

@MartinThoma
Copy link
Member

Thank for sharing - I haven't thought about how Python handles package installations in that way. Super interesting!

@Lucas-C
Copy link
Member

Lucas-C commented Nov 20, 2023

Do you think such check would be a good idea @andersonhc & @gmischler?

Just a "bump" on this question / suggestion: unless you disagree, I'm planning on adding those 5 lines to fpdf/__init__.py

@gmischler
Copy link
Collaborator

Do you think such check would be a good idea @andersonhc & @gmischler?

Identifying a double install in __init__.py is definitively a good idea.
But why just a warning and not an error? Having both packages installed in the same PYTHONPATH (let alone the same directory) never makes sense, so it's better to be strict about it.

Those few who actually need both installed on the same system should be able to keep them apart with the help of venv.

@Lucas-C
Copy link
Member

Lucas-C commented Nov 20, 2023

But why just a warning and not an error? Having both packages installed in the same PYTHONPATH (let alone the same directory) never makes sense, so it's better to be strict about it.

A warning seemed better to me because an extra ttfonts.py file in fpdf2 source folder will not cause any trouble.
If this results from running pip install fpdf and then pip install fpdf2 (which is very likely), then the latest fpdf2 version has been correctly installed. It's more a matter of "clean up", and making clear that both libraries cannot be installed simultaneously

@andersonhc
Copy link
Collaborator

Do you think such check would be a good idea @andersonhc & @gmischler?

I wish there was a better way to break apart from fpdf but realistically it's never going to happen.
I'm OK with adding the warning - don't cause any overhead or trouble for fpdf2 and can help users confused with the libraries.

@Lucas-C
Copy link
Member

Lucas-C commented Dec 5, 2023

I opened #1042 to introduce this warning

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants