Content offset is ignored on macOS using LibZim 9.2.0 #886

Closed
BPerlakiH opened this issue May 27, 2024 · 7 comments · Fixed by #889

BPerlakiH commented May 27, 2024

In the Kiwix macOS app, using libzim 9.2.0:

We have the following code to read content from the ZIM file using an offset and a size:

blob = item.getData(start, fmin(item.getSize() - start, end - start + 1));

If I understand correctly, this call corresponds to the following C++ function:
https://github.com/openzim/libzim/blob/main/src/item.cpp#L50C1-L50C61
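
For reference, a minimal sketch of the length arithmetic in the call above, treating `start` and `end` as an inclusive byte range. `chunkLength` is a hypothetical helper, not part of libzim or the Kiwix code. `fmin` converts its arguments to floating point, so this sketch uses an integer `std::min` instead; that is a choice made here, not something the library requires.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of libzim): number of bytes to request for the
// inclusive byte range [start, end], clamped so it never runs past the item.
std::uint64_t chunkLength(std::uint64_t itemSize, std::uint64_t start, std::uint64_t end) {
    assert(start <= end);          // an inclusive range must not be reversed
    if (start >= itemSize) {
        return 0;                  // nothing left to read at or past the end
    }
    // Same arithmetic as the original call, but on unsigned integers.
    return std::min(itemSize - start, end - start + 1);
}
```

With the 7369-byte HTML item from the logs below, the range 0 to 99 yields 100 bytes, and a range that starts at 7300 and ends at 7399 is clamped to the remaining 69 bytes.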

My aim is to read the data in chunks, but it seems there's a bug with the offset.
Reading the entire data from offset=0 to item.getSize() works as expected.
Starting with offset=0 and increasing the size also works as expected.

Changing the offset does not work: it seems to be ignored, and I keep getting back the same data.
I tried it with text/html, a WebP image, and WebM video content, which is the case I most need to work.

Here are some debug prints from webm chunks:

"getURLContent: /videos/9987/video.webm 2076400 - 2076464: 1a45dfa39f4286810142f7810142f2810442f381084282847765626d4287810242858102185380670100000000f1a98a114d9b74bb4dbb8b53ab841549a96653"

"getURLContent: /videos/9987/video.webm 2076600 - 2076664: 1a45dfa39f4286810142f7810142f2810442f381084282847765626d4287810242858102185380670100000000f1a98a114d9b74bb4dbb8b53ab841549a96653"

The same happens with HTML read in chunks of 100 bytes:

"Optional(\"text/html\") -> Optional(7369)"
"mime: text/html"

mType text/html from 0 to 99, size: 100
"getURLContent: /A/App/IntroPage 0 - 99: Optional(\"<!DOCTYPE html>\\n<html class=\\\"client-js\\\"><head>\\n  <meta charset=\\\"UTF-8\\\">\\n  <title>App/IntroPage</titl\")"
"got content @ isMain: false from: 0 - 99 with data: 100"

mType text/html from 100 to 199, size: 100
"getURLContent: /A/App/IntroPage 100 - 199: Optional(\"<!DOCTYPE html>\\n<html class=\\\"client-js\\\"><head>\\n  <meta charset=\\\"UTF-8\\\">\\n  <title>App/IntroPage</titl\")"
"got content @ isMain: false from: 100 - 199 with data: 100"

mType text/html from 200 to 299, size: 100
"getURLContent: /A/App/IntroPage 200 - 299: Optional(\"<!DOCTYPE html>\\n<html class=\\\"client-js\\\"><head>\\n  <meta charset=\\\"UTF-8\\\">\\n  <title>App/IntroPage</titl\")"
"got content @ isMain: false from: 200 - 299 with data: 100"

Note: I tested this with a couple of unrelated ZIM files, and the offset was still ignored.
I even hard-coded the offset values, and the same data was still returned.

@BPerlakiH
Author

I am not sure whether this other issue is related, but it might be: #670

@BPerlakiH changed the title from "Content offset is ignored on macOS" to "Content offset is ignored on macOS using LibZim 9.2.0" on May 27, 2024
@kelson42 added this to the 9.3.0 milestone on May 28, 2024

@mgautierfr
Collaborator

How are you compiling libzim? Are you using https://download.openzim.org/release/libzim/libzim_macos-x86_64-9.2.1.tar.gz (or the arm64 version)?

Do you have the same issue with an older version?

@rgaudin
Member

rgaudin commented May 30, 2024

I believe @BPerlakiH is using the libkiwix 13.1.0-1 release from https://download.kiwix.org/release/libkiwix/libkiwix_xcframework-13.1.0-1.tar.gz

@mgautierfr
Collaborator

@BPerlakiH Can you test this build: https://tmp.kiwix.org/ci/dev_preview/test_reader/libkiwix_xcframework-2024-05-30.tar.gz?
It was built from this branch: #889

@BPerlakiH
Author

BPerlakiH commented Jun 2, 2024

@mgautierfr I tested this build, and it seems to have the same issue.

Here is a fragment of the logs from reading a CSS file:

From 104448 to 105471 data: 2f2a207374796c652066726f6d2068747470733a2f2f66722e6d2e77696b6970 ...
"kiwix://91BB58AE-13DF-0100-9423-D2B8617607B0/-/inserted_style.css"
From 105472 to 106495 data: 2f2a207374796c652066726f6d2068747470733a2f2f66722e6d2e77696b6970 ...
"kiwix://91BB58AE-13DF-0100-9423-D2B8617607B0/-/inserted_style.css"
From 106496 to 107519 data: 2f2a207374796c652066726f6d2068747470733a2f2f66722e6d2e77696b6970 ...
"kiwix://91BB58AE-13DF-0100-9423-D2B8617607B0/-/inserted_style.css"
From 107520 to 108543 data: 2f2a207374796c652066726f6d2068747470733a2f2f66722e6d2e77696b6970 ...
"kiwix://91BB58AE-13DF-0100-9423-D2B8617607B0/-/inserted_style.css"
From 108544 to 109567 data: 2f2a207374796c652066726f6d2068747470733a2f2f66722e6d2e77696b6970 ...
"kiwix://91BB58AE-13DF-0100-9423-D2B8617607B0/-/inserted_style.css"
From 109568 to 110591 data: 2f2a207374796c652066726f6d2068747470733a2f2f66722e6d2e77696b6970 ...
"kiwix://91BB58AE-13DF-0100-9423-D2B8617607B0/-/inserted_style.css"
From 110592 to 111615 data: 2f2a207374796c652066726f6d2068747470733a2f2f66722e6d2e77696b6970 ...
"kiwix://91BB58AE-13DF-0100-9423-D2B8617607B0/-/inserted_style.css"
From 111616 to 111685 data: 2f2a207374796c652066726f6d2068747470733a2f2f66722e6d2e77696b6970 ...

It keeps repeating the same data, regardless of the offset.

The actual output from 1024-byte chunks also repeats (the offset does not advance):

/*
Problematic modules: {
    "skins.minerva.base.reset": "missing",
    "skins.minerva.content.styles": "missing",
    "ext.cite.style": "missing",
    "mobile.app.pagestyles.android": "missing"
}
*/
/*
MediaWiki:Common.css
*/
.incomplete {
	background-color:#f2edb3;
	border:2px solid #ffbd00;
	padding:10px;
	line-height:35px;
}
.incomplete p:before {
	content:"!";
	color:white;
	display:block;
	float:left;
	font-size:35px;
	line-height:35px;
	padding:1px 14px;
	background-color:#ffbd00;
	border-radius:100px;
	margin-right:10px;
}
pre,.mw-code {
	background-color:#f2f2f2;
	border:1px solid #a8a8a8;
	border-radius:2px;
	display:inline-block:overflow:scroll;
	padding:5px;
	margin-bottom:10px;
        line-height:1;
}
code {
	white-space:nowrap;
}
.page-Main_Page #firstHeading,.page-Main_Page #toc {
	display:none;
}
#tagline {
	display:none;
}
h1,h2,h3,h4,h5,h6 {
	margin-top:40px;
	margin-bottom:20px;
}
video {
	height:auto!important;
}
.thumb {
	padding:5px;
	border:1px solid #bbb;
	margin-left:10px;
}

a.exte/*
Problematic modules: {
    "skins.minerva.base.reset": "missing",
    "skins.minerva.content.styles": "missing",
    "ext.cite.style": "missing",
    "mobile.app.pagestyles.android": "missing"
}
*/
/*
MediaWiki:Common.css
*/
.incomplete {
	background-color:#f2edb3;
	border:2px solid #ffbd00;
	padding:10px;
	line-height:35px;
}
.incomplete p:before {
	content:"!";
	color:white;
	display:block;
	float:left;
	font-size:35px;
	line-height:35px;
	padding:1px 14px;
	background-color:#ffbd00;
	border-radius:100px;
	margin-right:10px;
}
pre,.mw-code {
	background-color:#f2f2f2;
	border:1px solid #a8a8a8;
	border-radius:2px;
	display:inline-block:overflow:scroll;
	padding:5px;
	margin-bottom:10px;
        line-height:1;
}
code {
	white-space:nowrap;
}
.page-Main_Page #firstHeading,.page-Main_Page #toc {
	display:none;
}
#tagline {
	display:none;
}
h1,h2,h3,h4,h5,h6 {
	margin-top:40px;
	margin-bottom:20px;
}
video {
	height:auto!important;
}
.thumb {
	padding:5px;
	border:1px solid #bbb;
	margin-left:10px;
}

a.exte/*
Problematic modules: {
    "skins.minerva.base.reset": "missing",
    "skins.minerva.content.styles": "missing",
    "ext.cite.style": "missing",
    "mobile.app.pagestyles.android": "missing"
}
*/
/*
MediaWiki:Common.css
*/

@mgautierfr
Collaborator

In this line https://github.com/kiwix/kiwix-apple/blob/main/Model/ZimFileService/ZimFileService.mm#L201, you return the data from item.getData().data(). Should it be blob.data() instead?

@BPerlakiH
Author

🤦‍♂️ Ohh, that makes sense now. It works as expected with that change.
Closing this ticket. Thank you @mgautierfr for your help!
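
For anyone landing here with the same symptom, a minimal sketch of the mistake, assuming the calling code looked roughly like the description above: the ranged blob is requested correctly, but the bytes are then copied from a fresh `item.getData()` (the whole item), so every chunk starts at byte 0. `MockItem`, `MockBlob`, `readChunkBuggy` and `readChunkFixed` are hypothetical stand-ins that only mimic the shape of the calls quoted in this thread; they are not the libzim API or the actual Kiwix source.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical stand-in for a blob: a view onto some bytes of the item.
struct MockBlob {
    std::string_view bytes;
    const char* data() const { return bytes.data(); }
    std::size_t size() const { return bytes.size(); }
};

// Hypothetical stand-in for a ZIM item holding its content in memory.
struct MockItem {
    std::string content;
    std::size_t getSize() const { return content.size(); }
    // Whole-item accessor: always starts at byte 0.
    MockBlob getData() const { return {std::string_view(content)}; }
    // Ranged accessor: starts at `offset` and spans at most `size` bytes.
    MockBlob getData(std::size_t offset, std::size_t size) const {
        assert(offset <= content.size());
        return {std::string_view(content).substr(offset, size)};
    }
};

// Buggy pattern: the length comes from the ranged blob, but the pointer comes
// from the whole item, so the offset is effectively ignored.
std::string readChunkBuggy(const MockItem& item, std::size_t start, std::size_t len) {
    MockBlob blob = item.getData(start, len);
    return std::string(item.getData().data(), blob.size());
}

// Fixed pattern: pointer and length both come from the same ranged blob.
std::string readChunkFixed(const MockItem& item, std::size_t start, std::size_t len) {
    MockBlob blob = item.getData(start, len);
    return std::string(blob.data(), blob.size());
}
```

In this sketch the buggy variant returns the first `len` bytes of the item for any `start`, which matches the repeated data in the logs above, while the fixed variant returns the requested range.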
