Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Code Refactoring #440

Merged
merged 16 commits into from
Aug 16, 2021
Merged

Code Refactoring #440

merged 16 commits into from
Aug 16, 2021

Conversation

jee7
Copy link
Contributor

@jee7 jee7 commented Jul 14, 2021

Removed PHPDoc where possible in favor of Type Declarations.

Refactored the loading of Encoding classes for better typing.

Other refactorings like removing unused or immediately overwritten variables etc.

Copy link
Collaborator

@k00ni k00ni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your work! I took a quick look and found only a few things here and there.

Making function parameters and return values explicit shouldn't break API compatibility IMHO. Any opinions on that? As always, non-collaborator feedback is welcomed too!

CC @j0k3r @smalot

src/Smalot/PdfParser/Element.php Show resolved Hide resolved
src/Smalot/PdfParser/Element/ElementArray.php Outdated Show resolved Hide resolved
src/Smalot/PdfParser/Element/ElementDate.php Show resolved Hide resolved
src/Smalot/PdfParser/Parser.php Show resolved Hide resolved
src/Smalot/PdfParser/RawData/RawDataParser.php Outdated Show resolved Hide resolved
src/Smalot/PdfParser/Element/ElementArray.php Outdated Show resolved Hide resolved
src/Smalot/PdfParser/Font.php Show resolved Hide resolved
src/Smalot/PdfParser/Header.php Show resolved Hide resolved
Copy link
Collaborator

@k00ni k00ni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost done, just a few things.

src/Smalot/PdfParser/Header.php Show resolved Hide resolved
src/Smalot/PdfParser/Pages.php Outdated Show resolved Hide resolved
src/Smalot/PdfParser/Pages.php Outdated Show resolved Hide resolved
src/Smalot/PdfParser/Parser.php Show resolved Hide resolved
src/Smalot/PdfParser/Element.php Show resolved Hide resolved
@k00ni
Copy link
Collaborator

k00ni commented Jul 28, 2021

After updating your code please resolve related open conversation.

@jee7
Copy link
Contributor Author

jee7 commented Jul 28, 2021

After updating your code please resolve related open conversation.

Okay. I thought you'd want to do it yourself to check if all the things were fixed accordingly.

@k00ni
Copy link
Collaborator

k00ni commented Jul 29, 2021

We had problems with Scrutinizer lately, but #443 fixes them. After it is merged, I can take care of this PR next.

@k00ni
Copy link
Collaborator

k00ni commented Aug 3, 2021

Can you merge latest master branch in to trigger CI? I wanna see if scrutinizer fails again.

@k00ni k00ni self-assigned this Aug 3, 2021
Copy link
Collaborator

@k00ni k00ni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your work and patience @jee7! Looks good, will leave it open for a few days so everyone has a chance to comment.

@jee7
Copy link
Contributor Author

jee7 commented Aug 4, 2021

Thanks for your work and patience @jee7! Looks good, will leave it open for a few days so everyone has a chance to comment.

Thank you for maintaining the repo and for the code review feedback. In the future, the logic in the lib should be also refactored at certain places if we want to go with type declared inputs and outputs. Especially at the places that currently get differently typed inputs or produce differently typed outputs.

Also, something to think about is that if we use PHPDoc for describing some (but not all) of the input parameters or output, then the PHPStorm IDE for example tells me that the PHPDoc is incomplete.

I think some guidelines for how to format the code could also be written in DEVELOPER.md. So that next PR-s know to use type declaration instead of PHPDoc, or when exactly and how PHPDoc should be used in this project etc.

@k00ni
Copy link
Collaborator

k00ni commented Aug 4, 2021

Thank you for maintaining the repo and for the code review feedback. In the future, the logic in the lib should be also refactored at certain places if we want to go with type declared inputs and outputs. Especially at the places that currently get differently typed inputs or produce differently typed outputs.

As one of the maintainers I personally focus on managing issues and pull requests, only some coding here and there. I want to enable people to contribute to the library, but it is always a trade off between refactoring and good, readable code vs. API compatibility. Going the slow, but steady way seems to be a good middle ground.

Also, something to think about is that if we use PHPDoc for describing some (but not all) of the input parameters or output, then the PHPStorm IDE for example tells me that the PHPDoc is incomplete.

If I remember correctly its because of incomplete paths to the classes. In my opinion, its sufficient for now to just outline the class name, so developers can see and look for themselves. But it should be made more reliable in the future.

I think some guidelines for how to format the code could also be written in DEVELOPER.md. So that next PR-s know to use type declaration instead of PHPDoc, or when exactly and how PHPDoc should be used in this project etc.

Good remark. Feel free to make a suggestion.

@k00ni k00ni merged commit 5667bdf into smalot:master Aug 16, 2021
* @return string
*/
public static function uchr($code)
public static function uchr(int $code): string

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change seems to have caused a problem of when the variable $code is not an int but rather a float.

 Smalot\PdfParser\Font::uchr(): Argument #1 ($code) must be of type int, float given, called in /home/jessegreathouse/dcol/src/vendor/smalot/pdfparser/src/Smalot/PdfParser/Font.php on line 223

  at vendor/smalot/pdfparser/src/Smalot/PdfParser/Font.php:138
    134▕
    135▕     /**
    136▕      * Convert unicode character code to "utf-8" encoded string.
    137▕      */
  ➜ 138▕     public static function uchr(int $code): string
    139▕     {
    140▕         if (!isset(self::$uchrCache[$code])) {
    141▕             // html_entity_decode() will not work with UTF-16 or UTF-32 char entities,
    142▕             // therefore, we use mb_convert_encoding() instead

This was most likely an unforeseen consequence of an innocent change.

Here is the pdf document for your convenience:
15-1039-1.pdf

<?php
use Smalot\PdfParser\Parser;
$parser = new Parser();
$pdf = $this->parser->parseFile('15-1039-1.pdf');
?>

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants