Improper Display of °(Degree Symbol) #61

Closed
sumit809 opened this issue Jun 26, 2020 · 9 comments

@sumit809

Hi Andre,

I need to show a temperature, say 27°C, but it comes out as 27A°C in the PDF. I opened the PDF in macOS Preview and in Adobe Reader, with the same result in both.
The problem also occurs with other special characters.

Please take a look at the main code and the output.

Thanks

output2.pdf

main.c.txt

@AndreRenaud
Owner

Hi. That's really odd - when I run your program it outputs fine (see attached).

The diff shows only a single change of interest (excluding xref/header stuff):

@@ -67,7 +67,7 @@
 endobj
 6 0 obj
 << /Length 118 >>stream
-BT /GS0 gs 50.000000 15.000000 TD /F1 10.000000 Tf 0.000000 0.000000 0.000000 rg 0.000000 Tc (Temperature: 27°C) Tj ET
+BT /GS0 gs 50.000000 15.000000 TD /F1 10.000000 Tf 0.000000 0.000000 0.000000 rg 0.000000 Tc (Temperature: 27�C) Tj ET
 endstream
 endobj
 7 0 obj
@@ -94,7 +94,7 @@

Looking at the binary output, the unknown symbol in mine is the single byte 0xb0, but in yours it is two bytes, 0xc2 then 0xb0. I don't entirely understand what is going on here. I think there is an extra byte in your file. Is it UTF-8 encoded? The degree symbol should just be 0xb0, I think (https://en.wikipedia.org/wiki/Degree_symbol).

output.pdf
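
For reference, U+00B0 (the degree sign) is encoded in UTF-8 as the two bytes 0xc2 0xb0, so decoding that pair should yield the single value 0xb0. The following is a minimal sketch of such a decode step. The function name `decode_utf8_sketch` is hypothetical and this is not pdfgen's `utf8_to_utf32`; it also does not reject overlong encodings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (NOT pdfgen's utf8_to_utf32): decodes one UTF-8
 * sequence starting at s, reading at most len bytes, into *out.
 * Returns the number of bytes consumed, or -1 on malformed input.
 * Overlong encodings are not rejected in this sketch. */
int decode_utf8_sketch(const unsigned char *s, size_t len, uint32_t *out)
{
    uint32_t ch;
    int need; /* total number of bytes in this sequence */
    int i;

    assert(s != NULL && out != NULL);
    if (len == 0)
        return -1;

    if (s[0] < 0x80) {                 /* 0xxxxxxx: plain ASCII */
        ch = s[0];
        need = 1;
    } else if ((s[0] & 0xe0) == 0xc0) { /* 110xxxxx: 2-byte sequence */
        ch = s[0] & 0x1f;
        need = 2;
    } else if ((s[0] & 0xf0) == 0xe0) { /* 1110xxxx: 3-byte sequence */
        ch = s[0] & 0x0f;
        need = 3;
    } else if ((s[0] & 0xf8) == 0xf0) { /* 11110xxx: 4-byte sequence */
        ch = s[0] & 0x07;
        need = 4;
    } else {
        return -1;                      /* stray continuation or invalid lead byte */
    }

    if ((size_t)need > len)
        return -1;                      /* sequence is truncated */

    for (i = 1; i < need; i++) {
        if ((s[i] & 0xc0) != 0x80)      /* continuation bytes are 10xxxxxx */
            return -1;
        ch = (ch << 6) | (uint32_t)(s[i] & 0x3f);
    }

    *out = ch;
    return need;
}
```

Calling it on the bytes 0xc2 0xb0 consumes 2 bytes and stores 0xb0, which matches a PDF that contains only the single byte 0xb0 for the degree sign.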

@AndreRenaud
Owner

Ok, I see what happened. The raw UTF-8 has gone through into the PDF output in your code, but in mine it has been translated down to just the single character. Are you running the most recent code, with the utf8_to_utf32 function?
Could you try patching in the following, and telling me what it outputs?

diff --git a/pdfgen.c b/pdfgen.c
index 08cba4e..2bb2546 100644
--- a/pdfgen.c
+++ b/pdfgen.c
@@ -1113,6 +1113,9 @@ static int utf8_to_utf32(const char *utf8, int len, uint32_t *utf32)
         else
             ch |= ((uint32_t)(*utf8++) & 0x3f) << shift;
     }
+    if (len > 1) {
+        printf("Consumed %d to get 0x%x\n", len, ch);
+    }

     *utf32 = ch;

@sumit809
Author

Hi!
I added the patch you suggested. It prints:
Consumed 2 to get 0xb0

I'm running the latest code on onlineGDB for now. You can see the results at the following link:
https://onlinegdb.com/B1b5nC4RL

@AndreRenaud
Owner

I can't seem to download the output files from onlinegdb.com. Can you tell whether the 0xc2 byte is still present in the output? It shouldn't be: the 'Consumed 2' line means that two UTF-8 bytes were consumed and decoded to the single value 0xb0, so it should be written into the PDF as a single character.

@AndreRenaud
Owner

I made a fork of your project, https://onlinegdb.com/B1RB27HCI, and added a hex dump of the output. In that project, the 0xc2 does not seem to be appearing in the final PDF file:
(This is a snippet from the full dump).

0x20
0x32 2
0x37 7
0xb0 �
0x43 C
0x29 )
0x20
0x54 T

Can you confirm whether that project works OK when you run it on your local machine?
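
A check like the hex dump above can also be done programmatically. The sketch below assumes the generated PDF has already been read into a memory buffer and that its content streams are uncompressed, as in the diff earlier in this thread. The function name `count_byte_pair` is hypothetical and not part of pdfgen.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of pdfgen): counts how many times the
 * byte `first` is immediately followed by the byte `second` in buf. */
size_t count_byte_pair(const unsigned char *buf, size_t len,
                       unsigned char first, unsigned char second)
{
    size_t hits = 0;
    size_t i;

    assert(buf != NULL || len == 0);

    /* Stop one byte early so that buf[i + 1] is always in range. */
    for (i = 0; i + 1 < len; i++) {
        if (buf[i] == first && buf[i + 1] == second)
            hits++;
    }
    return hits;
}
```

A non-zero result for the pair 0xc2, 0xb0 would suggest that raw UTF-8 reached the output; zero hits alongside a lone 0xb0 would match the dump shown above.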

@sumit809
Author

Hi Andre!
I ran your forked project and it produces the same output on onlineGDB.

Then I ran the same code on my local machine in Visual Studio Code, and the issue is resolved: it works fine locally.
patchTest.pdf

So is the problem that onlineGDB adds an extra character?
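
The cause was not established in this thread. One way to rule out the editor or the upload step re-encoding the source file is to spell the degree sign as explicit UTF-8 bytes in the string literal, so that the bytes the compiler sees no longer depend on how the `.c` file was saved. This is a sketch under that assumption; `DEGREE_UTF8` and `temperature_label` are hypothetical names.

```c
#include <assert.h>

/* U+00B0 (degree sign) written as explicit UTF-8 bytes, so the result
 * does not depend on the encoding the source file was saved in. */
#define DEGREE_UTF8 "\xc2\xb0"

/* Hypothetical helper: returns the label used in this issue.
 * The "C" is a separate literal on purpose: escape sequences are
 * processed before adjacent literals are concatenated, so writing
 * "\xb0" "C" prevents C from being read as another hex digit. */
const char *temperature_label(void)
{
    static const char label[] = "Temperature: 27" DEGREE_UTF8 "C";

    /* sizeof counts the terminating NUL: 18 bytes of text plus 1. */
    assert(sizeof label == 19);
    return label;
}
```

The returned string can then be passed wherever the original `"Temperature: 27°C"` literal was used.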

@AndreRenaud
Owner

Hi,
When you originally tested it, was it with Visual Studio, or the onlineGDB? I cannot fully explain why this would make a difference. I am unable to replicate the issue, either in onlineGDB or elsewhere.
The onlineGDB fork that I ran did not have the issue - the output from it skipped over the 0xc2 character (as expected).
Good to hear that it's resolved for you, but I'll leave this open in case we get more information.

@sumit809
Author

sumit809 commented Jul 1, 2020

Hi,
I have been testing your library on onlineGDB since the beginning. Only in Visual Studio Code did I get the expected result, and that was on the first attempt.

You can inspect the output file by clicking "download code" in your forked project on onlineGDB. I'm still getting the extra character in the output PDF there.

@AndreRenaud
Owner

I believe this issue is resolved with the latest code, but feel free to reopen this if that is not the case.
