New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTML dictionary link support #3603

Merged
merged 10 commits into from Jan 15, 2018

Conversation

Projects
None yet
3 participants
@TnS-hun
Contributor

TnS-hun commented Jan 14, 2018

Closes #3588

@poire-z

readerdictionary.lua indeed looks a lot clearer now :)

h = math.abs(link.y1 - link.y0),
}
self.highlight = nil

This comment has been minimized.

@poire-z

poire-z Jan 14, 2018

Contributor

No side effect with unsetting self.highlight (like the highlight in the book not being unhighlighted when you leave all dict windows) ? (Just reading code, can't test for now.)

This comment has been minimized.

@TnS-hun

TnS-hun Jan 14, 2018

Contributor

You're right, it's better to leave the highlight alone and keep the value that was set when the dictionary got opened. I didn't see any side effects though. :)

for _, link in pairs(links) do
if pos.x >= link.x0 and pos.x < link.x1 and pos.y >= link.y0 and pos.y < link.y1 then
return link

This comment has been minimized.

@poire-z

poire-z Jan 14, 2018

Contributor

Never really had PDF files with links, so just curious: how does MuPDF do when a link spans multiple words (may not happen with dictionaries definitions if they are a single word), and the line is wrapped in the middle of these multiple words: does it return 2 links (one for the first word at the far right of 1st line, one for the second word at the far lest of 2nd line)? (If not, the above may not work in such cases)

This comment has been minimized.

@Frenzie

Frenzie Jan 14, 2018

Member

Most links I come across in regular use are footnotes (i.e., just a few digits) but should be simple to check on the vast majority of PDFs. TOCs are common.

This comment has been minimized.

@Frenzie

Frenzie Jan 14, 2018

Member

This PDF has a multiline link at the bottom of p. 471 http://www.oapen.org/search?identifier=641735 (Sémantique formelle: Volume 1 : Introduction à la grammaire de Montague)

This comment has been minimized.

@TnS-hun

TnS-hun Jan 14, 2018

Contributor

MuPDF returns one link per line if the link spans multiple lines.

@@ -313,14 +304,50 @@ If you'd like to change the order in which dictionaries are queried (and their r
end
function ReaderDictionary:onLookupWord(word, box, highlight, link)
logger.dbg("lookup word:", word, box)

This comment has been minimized.

@Frenzie

Frenzie Jan 14, 2018

Member

Since we're modifying this line anyway maybe add some prefix? dict lookup word and dict stripped word or something. (You know, for grepping.)

This comment has been minimized.

@TnS-hun

TnS-hun Jan 14, 2018

Contributor

Okay, I've added the dict prefix.

end)
return true
end
function ReaderDictionary:onHtmlDictionaryLinkTapped(dictionary, link)
if not link.uri then

This comment has been minimized.

@Frenzie

Frenzie Jan 14, 2018

Member

Presumably also or link.uri == ""?

This comment has been minimized.

@TnS-hun

TnS-hun Jan 14, 2018

Contributor

It can't hurt. :)

-- The protocol is either "bword" or there is no protocol, only the word.
-- https://github.com/koreader/koreader/issues/3588#issuecomment-357088125
local word = link.uri

This comment has been minimized.

@Frenzie

Frenzie Jan 14, 2018

Member

Would it be clearer to write it like

local word -- equivalent to local word = nil
if prefix then word = link.uri:sub
elseif :// then return
else word = link.uri end

(Pay no mind to the formatting.)

This comment has been minimized.

@TnS-hun

TnS-hun Jan 14, 2018

Contributor

To my non-Lua trained eyes those single line ifs are a bit harder to read but I've changed the code.

This comment has been minimized.

@Frenzie

Frenzie Jan 14, 2018

Member

ah no no no! D-:

I meant the if elseif else structure. Hence "don't mind the formatting" I'm just typing this up in 10 seconds. :-P

This comment has been minimized.

@Frenzie

Frenzie Jan 14, 2018

Member

Yes, exactly. 👍

end
end
return nil

This comment has been minimized.

@Frenzie

Frenzie Jan 14, 2018

Member

You can just leave this out.

This comment has been minimized.

@TnS-hun

TnS-hun Jan 14, 2018

Contributor

No problem.

@Frenzie

This comment has been minimized.

Member

Frenzie commented Jan 14, 2018

Didn't test; looks good.

@Frenzie Frenzie added the enhancement label Jan 14, 2018

TnS-hun added some commits Jan 14, 2018

@Frenzie

lgtm

@Frenzie

This comment has been minimized.

Member

Frenzie commented Jan 14, 2018

@poire-z I'll leave the honors to you (plus I don't know if you might have any further comments).

@poire-z

This comment has been minimized.

Contributor

poire-z commented Jan 14, 2018

I just tested it, and I think the self.highlight = nil I wondered about may be needed.
Without it, when you close a bword nested dict window, the highlight in the book is unhighlighted (select a word at top or bottom of page to witness that). It should stay till we close the last window (like it does when you close a nested dict window you got with Hold on a non-link word).

With self.highlight = nil, it seems to work as it does with Hold.
(I'm just not sure what part of the code does the unhighlight now :)

Note that the hold on word callback does (with lookup_target="LookupWord")

self.ui:handleEvent(
-- don't pass self.highlight to subsequent lookup, we want
-- the first to be the only one to unhighlight selection
-- when closed
Event:new(lookup_target, text)

so it calls ReaderDictionary:onLookupWord() without the 3rd highlight argument, which then sets self.highlight to nil.
So, I guess it's fine to do that also where you initially did it (sorry for the invitation to remove it :| )
It should be good to merge after that.

As a side note about MuPDF:
my PetitRobert2007 gives in its HTML: <a HREF="bword://boutique">boutique,</a>
The uppercase HREF is enough to make MuPDF not consider it a link (page:getPageLinks() returned nothing). I had to lowercase() them all in my fix_html_func to see them returned and be able to test this PR.

@Frenzie

This comment has been minimized.

Member

Frenzie commented Jan 15, 2018

Yes, in HTML both are valid; in XHTML only lowercase.

@TnS-hun

This comment has been minimized.

Contributor

TnS-hun commented Jan 15, 2018

(I'm just not sure what part of the code does the unhighlight now :)

It's removed in DictQuickLookup:onClose and in DictQuickLookup:onHoldClose.

sorry for the invitation to remove it

No problem. :) It's quite tricky that only the first dicitonary window stores the highlight.

@poire-z poire-z merged commit b40bc53 into koreader:master Jan 15, 2018

1 check passed

ci/circleci Your tests passed on CircleCI!
Details

@TnS-hun TnS-hun deleted the TnS-hun:patch-html-dictionary-link-support branch Jan 15, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment