FB3 format basic support (for QT only) #105

Closed
wants to merge 14 commits into from

Conversation

pkb
Contributor

@pkb pkb commented Aug 21, 2019

Open issues

  • Advanced FB3 features such as SVG images, the float property, extended description, etc. are not supported.
  • fb2.css is used for FB3 as is; some tweaks will probably be required for FB3.
  • Not tested much; https://github.com/gribuser/FB3/blob/master/Examples/Hardcore%20file%20structure.fb3 can't be opened.
  • Images with names like '%D0%BA%D0%BB%D0%B5%D1%82%D0%BA%D0%B0.jpg' in Anathomy tutorial example.fb3 also can't be opened: getObjectImageRefName() calls DecodeHTMLUrlString(), which converts these %* chars, and as a result the file can't be found in the zip file. I'm not sure how to handle this case properly.
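
To make the last item concrete, here is a minimal sketch of what percent-decoding does to such a name. `percentDecode` is a hypothetical stand-in written for illustration, not crengine's `DecodeHTMLUrlString()`, and it assumes the escapes encode UTF-8 bytes. After decoding, the name becomes the UTF-8 bytes of "клетка.jpg", which no longer matches an archive entry that is stored under the literal `%D0%BA...` name.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (NOT crengine's DecodeHTMLUrlString()): turns each
// well-formed %XX escape into the raw byte it encodes and copies every
// other character through unchanged.
std::string percentDecode(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        const bool isEscape =
            in[i] == '%' && i + 2 < in.size() &&
            std::isxdigit(static_cast<unsigned char>(in[i + 1])) &&
            std::isxdigit(static_cast<unsigned char>(in[i + 2]));
        if (isEscape) {
            // Two hex digits -> one byte.
            out.push_back(static_cast<char>(
                std::stoi(in.substr(i + 1, 2), nullptr, 16)));
            i += 2;
        } else {
            // Malformed or absent escape: keep the character as is.
            out.push_back(in[i]);
        }
    }
    return out;
}
```

A lookup keyed on the decoded string will therefore miss an entry whose stored name still contains the literal percent signs.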

@Frenzie

Frenzie commented Aug 21, 2019

Pinging @poire-z

@pkb Btw, we use nanosvg for some basic SVG support, but unfortunately I notice the library's added a notice that it's no longer actively maintained.

@pkb
Contributor Author

pkb commented Aug 21, 2019

@Frenzie thanks, I saw it. I assume that when issue #91 is closed we will be using it too :). I don't know how good nanosvg is, but it is nice that it is a header-only library; it is a pity that it is no longer maintained.

@poire-z
Contributor

poire-z commented Aug 21, 2019

(@Frenzie: looks easily mergeable, but let's wait a bit for follow-up stuff, if any, as there's no real demand for fb3 on our side.)

@pkb : regarding your 4th issue:

images having names like '%D0%BA%D0%BB%D0%B5%D1%82%D0%BA%D0%B0.jpg' can't be opened

We may have fixed that in koreader/crengine#143 (small commit koreader/crengine@3a74635 ). Mostly tested with links, but may work with images src.

@pkb
Contributor Author

pkb commented Aug 21, 2019

@poire-z, thanks. With this patch I can see the proper file name in the error message in the log. The image is in a zip file, so it seems I shouldn't call DecodeHTMLUrlString() to open it. I just don't know the conditions under which to call it and when not to, so I'm going to leave it as is. I don't think it is a big issue.

@virxkane
Collaborator

@pkb, just retry without DecodeHTMLUrlString() if the file can't be opened. Like this:

diff --git a/crengine/include/lvtinydom.h b/crengine/include/lvtinydom.h
index 30549a092..e5bd05813 100644
--- a/crengine/include/lvtinydom.h
+++ b/crengine/include/lvtinydom.h
@@ -851,7 +851,7 @@ public:
     /// returns object image source
     LVImageSourceRef getObjectImageSource();
     /// returns object image ref name
-    lString16 getObjectImageRefName();
+    lString16 getObjectImageRefName(bool percentDecode = true);
     /// returns object image stream
     LVStreamRef getObjectImageStream();
     /// formats final block
diff --git a/crengine/src/lvtinydom.cpp b/crengine/src/lvtinydom.cpp
index 1f7daa7c7..d96c2c113 100644
--- a/crengine/src/lvtinydom.cpp
+++ b/crengine/src/lvtinydom.cpp
@@ -11035,7 +11035,7 @@ public:
 };
 
 /// returns object image ref name
-lString16 ldomNode::getObjectImageRefName()
+lString16 ldomNode::getObjectImageRefName(bool percentDecode)
 {
     if (!isElement())
         return lString16::empty_str;
@@ -11072,7 +11072,8 @@ lString16 ldomNode::getObjectImageRefName()
     }
     if ( refName.length()<2 )
         return lString16::empty_str;
-    refName = DecodeHTMLUrlString(refName);
+    if (percentDecode)
+        refName = DecodeHTMLUrlString(refName);
     return refName;
 }
 
@@ -11090,11 +11091,18 @@ LVStreamRef ldomNode::getObjectImageStream()
 /// returns object image source
 LVImageSourceRef ldomNode::getObjectImageSource()
 {
-    lString16 refName = getObjectImageRefName();
+    lString16 refName = getObjectImageRefName(true);
     LVImageSourceRef ref;
     if ( refName.empty() )
         return ref;
     ref = getDocument()->getObjectImageSource( refName );
+    if (ref.isNull()) {
+        // try again without percent decoding (for fb3)
+        refName = getObjectImageRefName(false);
+        if ( refName.empty() )
+            return ref;
+        ref = getDocument()->getObjectImageSource( refName );
+    }
     if ( !ref.isNull() ) {
         int dx = ref->GetWidth();
         int dy = ref->GetHeight();
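
The retry idea in the patch above can be sketched in isolation as follows. This is a simplified illustration under stated assumptions, not crengine code: `Archive` is a plain `std::map` standing in for the zip container, and `Decoder` and `openImage` are hypothetical names. The decoder is passed in so the sketch does not depend on any particular decoding routine.

```cpp
#include <functional>
#include <map>
#include <string>

// Stand-in for the zip container: entry name -> entry content (assumption).
typedef std::map<std::string, std::string> Archive;
// Any function that percent-decodes a reference name (assumption).
typedef std::function<std::string(const std::string&)> Decoder;

// Looks up the decoded name first; if that misses, retries with the raw,
// still percent-encoded name. Returns true and fills `content` on success.
bool openImage(const Archive& arc, const std::string& rawRef,
               const Decoder& decode, std::string& content) {
    const std::string decoded = decode(rawRef);
    Archive::const_iterator it = arc.find(decoded);
    if (it == arc.end() && decoded != rawRef) {
        // Second attempt: the entry may be stored under the literal name.
        it = arc.find(rawRef);
    }
    if (it == arc.end())
        return false;
    content = it->second;
    return true;
}
```

Trying the decoded name first keeps the existing behaviour for formats where decoding is correct, and the raw name is only consulted after a miss.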

@virxkane
Collaborator

Why does the "spacing" tag create a new paragraph? Is that right?
[screenshot: tag-spacer]

@pkb
Contributor Author

pkb commented Aug 22, 2019

@virxkane thanks for testing. No, it is not right; I will make it inline in fb3.css and apply your patch for DecodeHTMLUrlString() today.

@virxkane
Collaborator

The "spacing" tag still creates new paragraphs.
The document's language is not loaded from the description file; maybe this patch will be useful?

diff --git a/crengine/src/fb3fmt.cpp b/crengine/src/fb3fmt.cpp
index 9d2dcd9d5..e0e3bab8c 100644
--- a/crengine/src/fb3fmt.cpp
+++ b/crengine/src/fb3fmt.cpp
@@ -4,6 +4,7 @@
 
 const lChar16 * const fb3_BodyContentType = L"application/fb3-body+xml";
 const lChar16 * const fb3_PropertiesContentType = L"application/vnd.openxmlformats-package.core-properties+xml";
+const lChar16 * const fb3_DescriptionContentType = L"application/fb3-description+xml";
 const lChar16 * const fb3_CoverRelationship = L"http://schemas.openxmlformats.org/package/2006/relationships/metadata/thumbnail";
 const lChar16 * const fb3_ImageRelationship = L"http://www.fictionbook.org/FictionBook3/relationships/image";
 
@@ -128,10 +129,24 @@ bool ImportFb3Document( LVStreamRef stream, ldomDocument * doc, LVDocViewCallbac
         return false;
     }
 
+    LVStreamRef descStream = arc->OpenStream(context.getContentPath(fb3_DescriptionContentType).c_str(), LVOM_READ );
+    if ( descStream.isNull() ) {
+        CRLog::error("Couldn't read description");
+        return false;
+    }
+
+    ldomDocument * descDoc = LVParseXMLStream( descStream );
+    if ( !descDoc ) {
+        CRLog::error("Couldn't parse description doc");
+        return false;
+    }
+
     lString16 author = propertiesDoc->textFromXPath( cs16("coreProperties/creator") );
     lString16 title = doc->textFromXPath( cs16("coreProperties/title") );
+    lString16 language = descDoc->textFromXPath( cs16("fb3-description/lang") );
     doc_props->setString(DOC_PROP_TITLE, title);
     doc_props->setString(DOC_PROP_AUTHORS, author );
+    doc_props->setString(DOC_PROP_LANGUAGE, language);
     CRLog::info("Author: %s Title: %s", author.c_str(), title.c_str());
     delete propertiesDoc;
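
One thing worth checking when applying a patch like this: the visible hunk returns early while `propertiesDoc` is still allocated, and only `delete propertiesDoc` is visible at the end; whether `descDoc` is released further down is not shown here. The sketch below illustrates one way to make such cleanup automatic with `std::unique_ptr`. `FakeDoc`, `parseDoc`, and `importProps` are hypothetical stand-ins, not crengine types or functions.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a parsed XML document. It counts live
// instances so that leaks are observable.
struct FakeDoc {
    static int live;
    std::string lang;
    explicit FakeDoc(std::string l) : lang(std::move(l)) { ++live; }
    ~FakeDoc() { --live; }
};
int FakeDoc::live = 0;

// Stand-in for a parse call: returns an empty pointer on failure.
std::unique_ptr<FakeDoc> parseDoc(bool ok, const std::string& lang) {
    return ok ? std::unique_ptr<FakeDoc>(new FakeDoc(lang))
              : std::unique_ptr<FakeDoc>();
}

// Mirrors the control flow of the patch: two documents, early returns.
// Both documents are released on every path without explicit delete.
bool importProps(bool propsOk, bool descOk, std::string& languageOut) {
    std::unique_ptr<FakeDoc> props = parseDoc(propsOk, "");
    if (!props)
        return false;
    std::unique_ptr<FakeDoc> desc = parseDoc(descOk, "ru");
    if (!desc)
        return false;  // props is destroyed here automatically
    languageOut = desc->lang;
    return true;       // props and desc are destroyed here
}
```

With raw pointers the same effect needs a `delete` before each `return`, which is easy to miss when a new early exit is added.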

@pkb
Contributor Author

pkb commented Aug 23, 2019

@virxkane thanks, now spacing should be inline, but it looks like we have a glitch with letter-spacing

@virxkane
Collaborator

@pkb Can you fix these glitches?

@pkb
Contributor Author

pkb commented Aug 23, 2019

I don't know yet. I will take a look; if it doesn't require much effort, I will :).

@virxkane
Collaborator

virxkane commented Aug 25, 2019

@pkb looks good.
Another glitch, this time with footnotes (file nightmare_example.fb3):
[screenshot: footnotes-20190825-1]
[screenshot: footnotes-20190825-2]
Why are there so many footnotes?

@virxkane
Collaborator

virxkane commented Aug 25, 2019

Lists: item numbers are wrong, starting at 2 instead of 1.
[screenshot: lists-20190825-1]
The third item has been moved to the next page, possibly due to a nested "title" tag.

P.S. In the file "nightmare_example.fb3" the tags "ol" and "ul" must be replaced, but this is creator's mistake.

@pkb pkb closed this Sep 14, 2019
poire-z added a commit to poire-z/crengine that referenced this pull request Sep 14, 2019
Skip elements among siblings that are not list items.
By @pkb from buggins/coolreader#105
poire-z added a commit to koreader/crengine that referenced this pull request Sep 15, 2019
Skip elements among siblings that are not list items.
By @pkb from buggins/coolreader#105
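
The commit message above ("Skip elements among siblings that are not list items") suggests how the numbering glitch reported earlier could arise: if every preceding sibling is counted, a non-item element such as a title shifts the first item to number 2. The following is a minimal sketch of the corrected counting, assuming siblings are represented by their tag names. `listItemNumber` and the `"li"` tag name are illustrative assumptions, not the actual crengine implementation.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns the 1-based number of the element at `index` among `siblings`,
// counting only elements whose tag is "li". Returns 0 if `index` is out
// of range or the element itself is not a list item.
int listItemNumber(const std::vector<std::string>& siblings,
                   std::size_t index) {
    if (index >= siblings.size() || siblings[index] != "li")
        return 0;
    int number = 0;
    for (std::size_t i = 0; i <= index; ++i) {
        if (siblings[i] == "li")
            ++number;  // non-item siblings (e.g. "title") are skipped
    }
    return number;
}
```

For siblings `title, li, li, li`, the first `li` gets number 1 here, whereas counting by raw sibling position would give it 2.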
@pkb pkb deleted the fb3_support branch September 22, 2019 06:36