CGPDFStringRef to NSString * #40

Open
Ismael-Schellemberg opened this issue Jun 26, 2012 · 2 comments

@Ismael-Schellemberg

Hi, I'm working on a project where I need to highlight parts of the text by location rather than by keyword match, so I took your project and modified it slightly so that [scanner selections] returns the frame of every single character instead of only the frames where the keyword matches.

The change was fairly simple and it works like a charm. However, I did find a "bug": some CGPDFStringRefs are converted incorrectly (this happens with the PDF downloaded from here).

When the scanner starts, it reads the first "A" (from "A cat in his...") and fails to convert it in this method:

- (NSString *)stringWithCode:(int)code
{
    static NSString *singleUnicodeCharFormat = @"%C";
    NSString *characterName = [names objectForKey:[NSNumber numberWithInt:code]];
    unichar unicodeValue = [FontFile characterByName:characterName];
    return [NSString stringWithFormat:singleUnicodeCharFormat, unicodeValue];
}

Here unicodeValue comes back as 0, so the string the method returns is wrong.

This happens with about 40% of the characters found in that PDF.

I tried using CGPDFStringCopyTextString like this:

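NSString *cidString = nil;
NSString *unicodeString = nil;
CFStringRef cfStr = CGPDFStringCopyTextString(string);
if (cfStr != NULL) { // guard against a NULL result; CFRelease(NULL) crashes
    cidString = [NSString stringWithString:(NSString *)cfStr]; // copy before releasing cfStr
    unicodeString = [cidString lowercaseString];
    CFRelease(cfStr); // we own the copy returned by a Copy function
}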

With this, all the characters are converted correctly.

Is there a reason I should be using your method? Or should I (and possibly you too) use the CGPDFStringCopyTextString function instead?

If I can get in contact with you, I could provide further details and screenshots.

Anyway, thanks for the great work you've done :)

@KurtCode
Owner

Hi, thanks for your input. Sorry it has taken me this long to answer.

Anyway, the first method finds ligatures, which CGPDFStringCopyTextString won't.
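
One way the two approaches might be combined, as a minimal sketch and not the project's actual code: look the glyph name up first (so ligatures such as "fi" still resolve), and only when that lookup yields 0 use a fallback character obtained some other way, for example from CGPDFStringCopyTextString. Everything named here (GlyphEntry, kGlyphTable, unicode_for_glyph_name, resolve_character) is hypothetical, and the three-entry table merely stands in for a full glyph list. It is written in plain C, which also compiles as Objective-C.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a full glyph-name table. */
typedef struct {
    const char *name;     /* PostScript glyph name */
    uint16_t    unicode;  /* UTF-16 code unit it maps to */
} GlyphEntry;

static const GlyphEntry kGlyphTable[] = {
    { "A",  0x0041 },
    { "fi", 0xFB01 },  /* LATIN SMALL LIGATURE FI */
    { "fl", 0xFB02 },  /* LATIN SMALL LIGATURE FL */
};

/* Returns the code unit for a glyph name, or 0 when the name is NULL
   or not in the table. 0 plays the role of "not found", matching the
   failure described in the first comment of this thread. */
uint16_t unicode_for_glyph_name(const char *name)
{
    if (name == NULL) {
        return 0;
    }
    for (size_t i = 0; i < sizeof kGlyphTable / sizeof kGlyphTable[0]; i++) {
        if (strcmp(kGlyphTable[i].name, name) == 0) {
            return kGlyphTable[i].unicode;
        }
    }
    return 0;
}

/* Prefers the glyph-name lookup; uses `fallback` only when the lookup
   fails. In the scanner, `fallback` is assumed to be a character taken
   from the text string produced by CGPDFStringCopyTextString. */
uint16_t resolve_character(const char *glyph_name, uint16_t fallback)
{
    uint16_t value = unicode_for_glyph_name(glyph_name);
    return value != 0 ? value : fallback;
}
```

With this shape, resolve_character("fi", fallback) keeps the ligature, while a code with no glyph name (the failing "A" case above) resolves to whatever fallback character the caller supplies instead of 0.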

@rayray

rayray commented Dec 13, 2012

@Ismael-Schellemberg Where in the scanner did you insert your CGPDFStringCopyTextString implementation?
