Unable to load the article requested when opening a ZIM #779

Closed
benoit74 opened this issue May 30, 2024 · 7 comments · Fixed by #797

Comments

@benoit74

ZIM: https://mirror.download.kiwix.org/zim/.hidden/dev/fas-military-medicine_en_2024-05.zim
Kiwix app: 3.4.0 (161) - Testflight
OS: MacOS Sonoma 14.5

When opening the ZIM, we get a weird "Unable to load the article requested" message, but after clicking "OK" the page does load and everything works fine.

[Screenshot 2024-05-30 at 17 39 25]

The ZIM is known to work in kiwix-serve and on Android, so I suspect an issue in Kiwix Apple (to be confirmed).

@benoit74
Author

Same issue with https://dev.library.kiwix.org/viewer#mes-quartiers-chinois_fr_all_2024-05 but this time the ZIM does not work at all even after clicking "OK".

@BPerlakiH
Collaborator

Thank you @benoit74 for raising this issue.
Could you give some more context, please?
Did you download the file within the app and open it there, or did you download it externally and try to open it from Finder?

@kelson42
Contributor

kelson42 commented Jun 1, 2024

> Thank you @benoit74 for raising this issue.
> Could you give some more context, please?
> Did you download the file within the app and open it there, or did you download it externally and try to open it from Finder?

@BPerlakiH The two ZIM files given as examples are not available in the official public library.kiwix.org; they have to be sideloaded. But how would that play a role? Are sideloaded ZIM files opened differently from the others?

@benoit74
Author

benoit74 commented Jun 1, 2024

Sure, the file is downloaded externally and loaded in the app with the + button in an opened tab, and the issue appears when I click "Open main page".
Were you able to reproduce the issue?

@BPerlakiH
Collaborator

@benoit74 I have tested the fas-military one and could reproduce the issue consistently.
I looked under the hood and found that it is a content issue: the page tries to load the following URL, which is missing from the ZIM:
kiwix://6E4F3D4A-2F8A-789A-3B88-212219F4FB27/www.fas.org/websiteimprovementform.html
It is referenced in index.html as follows:

<iframe frameborder="0" src="../../../www.fas.org/websiteimprovementform.html" width="750" height="300" allowtransparency="true" scrolling="no"></iframe>

The following links are also broken:

kiwix://6E4F3D4A-2F8A-789A-3B88-212219F4FB27/ssl.google-analytics.com/ga.js
kiwix://6E4F3D4A-2F8A-789A-3B88-212219F4FB27/irp.fas.org/doc_logo.gif
kiwix://6E4F3D4A-2F8A-789A-3B88-212219F4FB27/www.google-analytics.com/urchin.js
kiwix://6E4F3D4A-2F8A-789A-3B88-212219F4FB27/irp.fas.org/doc_logo.gif

I will look into the second ZIM as well.
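
For readers wondering how an `src` such as `../../../www.fas.org/websiteimprovementform.html` ends up directly under the ZIM root, here is a minimal sketch of relative-path resolution. It is only an illustration: `resolve_relative` is a hypothetical helper, not part of Kiwix or libzim, and the base path `a/b/c/index.html` is a placeholder because the thread does not state where index.html sits inside the ZIM.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not Kiwix or libzim code): resolve `ref` against the
// directory of `base_path`, collapsing "." and ".." segments.
// Both arguments are paths inside a ZIM, without scheme or host.
std::string resolve_relative(const std::string& base_path, const std::string& ref) {
    std::vector<std::string> parts;
    std::string seg;

    // Split the base path into non-empty segments.
    std::stringstream base(base_path);
    while (std::getline(base, seg, '/')) {
        if (!seg.empty()) parts.push_back(seg);
    }
    // If the base names a file rather than a directory, drop the file name.
    if (!parts.empty() && base_path.back() != '/') parts.pop_back();

    // Apply each segment of the reference in turn.
    std::stringstream rel(ref);
    while (std::getline(rel, seg, '/')) {
        if (seg.empty() || seg == ".") continue;
        if (seg == "..") {
            // ".." at the root is ignored, as browsers do.
            if (!parts.empty()) parts.pop_back();
        } else {
            parts.push_back(seg);
        }
    }

    // Join the segments back together.
    std::string out;
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (i > 0) out += '/';
        out += parts[i];
    }
    // Keep a trailing slash if the reference had one.
    if (!out.empty() && !ref.empty() && ref.back() == '/') out += '/';
    return out;
}
```

With the placeholder base `a/b/c/index.html`, the three `..` segments climb back to the root, so the result is `www.fas.org/websiteimprovementform.html`, which matches the shape of the missing `kiwix://` URL reported above.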

@BPerlakiH
Collaborator

The other one has missing content at the main URL:
Missing content: kiwix://861C031F-DAFB-9688-4DB4-8F1199FE2926/mesquartierschinois.wordpress.com/

I am not sure why or how this URL is different from those of other ZIMs.

In fact, it gets stuck while looking up the redirect URL path for the content path:
"mesquartierschinois.wordpress.com"

Kiwix`zim::parseLongPath:
    0x100a43824 <+0>:   sub    sp, sp, #0x40
    0x100a43828 <+4>:   stp    x20, x19, [sp, #0x20]
    0x100a4382c <+8>:   stp    x29, x30, [sp, #0x30]
    0x100a43830 <+12>:  add    x29, sp, #0x30
    0x100a43834 <+16>:  mov    x1, x0
    0x100a43838 <+20>:  mov    x19, x8
    0x100a4383c <+24>:  ldrsb  w8, [x0, #0x17]
    0x100a43840 <+28>:  tbnz   w8, #0x1f, 0x100a438ac    ; <+136>
    0x100a43844 <+32>:  and    x9, x8, #0xff
    0x100a43848 <+36>:  ldrb   w11, [x1]
    0x100a4384c <+40>:  cmp    w11, #0x2f
    0x100a43850 <+44>:  cset   w8, eq
    0x100a43854 <+48>:  mov    w10, #0x1
    0x100a43858 <+52>:  cinc   x10, x10, eq
    0x100a4385c <+56>:  cmp    x10, x9
    0x100a43860 <+60>:  b.hi   0x100a43978               ; <+340>
    0x100a43864 <+64>:  cbnz   w9, 0x100a43870           ; <+76>
    0x100a43868 <+68>:  cmp    w11, #0x2f
    0x100a4386c <+72>:  b.eq   0x100a43974               ; <+336>
    0x100a43870 <+76>:  cmp    w11, #0x2f
    0x100a43874 <+80>:  cset   w11, eq
    0x100a43878 <+84>:  cinc   x12, x1, eq
    0x100a4387c <+88>:  ldrb   w12, [x12]
    0x100a43880 <+92>:  cmp    w12, #0x2f
    0x100a43884 <+96>:  b.eq   0x100a43978               ; <+340>
    0x100a43888 <+100>: cmp    x10, x9
    0x100a4388c <+104>: b.hs   0x100a4389c               ; <+120>
    0x100a43890 <+108>: ldrb   w10, [x1, x10]
    0x100a43894 <+112>: cmp    w10, #0x2f
    0x100a43898 <+116>: b.ne   0x100a43978               ; <+340>
    0x100a4389c <+120>: cmp    x11, x9
    0x100a438a0 <+124>: b.hi   0x100a43974               ; <+336>
    0x100a438a4 <+128>: ldrb   w20, [x1, x11]
    0x100a438a8 <+132>: b      0x100a43910               ; <+236>
    0x100a438ac <+136>: ldp    x9, x12, [x1]
    0x100a438b0 <+140>: ldrb   w10, [x9]
    0x100a438b4 <+144>: cmp    w10, #0x2f
    0x100a438b8 <+148>: cset   w8, eq
    0x100a438bc <+152>: mov    w11, #0x1
    0x100a438c0 <+156>: cinc   x11, x11, eq
    0x100a438c4 <+160>: cmp    x11, x12
    0x100a438c8 <+164>: b.hi   0x100a43978               ; <+340>
    0x100a438cc <+168>: cmp    w10, #0x2f
    0x100a438d0 <+172>: cset   w10, eq
    0x100a438d4 <+176>: cmp    x12, x10
    0x100a438d8 <+180>: b.lo   0x100a43974               ; <+336>
    0x100a438dc <+184>: ldrb   w13, [x9, x10]
    0x100a438e0 <+188>: cmp    w13, #0x2f
    0x100a438e4 <+192>: b.eq   0x100a43978               ; <+340>
    0x100a438e8 <+196>: cmp    x11, x12
    0x100a438ec <+200>: b.hs   0x100a438fc               ; <+216>
    0x100a438f0 <+204>: ldrb   w9, [x9, x11]
    0x100a438f4 <+208>: cmp    w9, #0x2f
    0x100a438f8 <+212>: b.ne   0x100a43978               ; <+340>
    0x100a438fc <+216>: ldr    x9, [x1, #0x8]
    0x100a43900 <+220>: cmp    x9, x10
    0x100a43904 <+224>: b.lo   0x100a43974               ; <+336>
    0x100a43908 <+228>: ldr    x11, [x1]
    0x100a4390c <+232>: ldrb   w20, [x11, x10]
    0x100a43910 <+236>: cmp    w8, #0x0
    0x100a43914 <+240>: mov    w8, #0x2
    0x100a43918 <+244>: cinc   w8, w8, ne
    0x100a4391c <+248>: cmp    w8, w9
    0x100a43920 <+252>: csel   w2, w8, w9, lo
    0x100a43924 <+256>: mov    x0, sp
    0x100a43928 <+260>: add    x4, sp, #0x18
    0x100a4392c <+264>: mov    x3, #-0x1
    0x100a43930 <+268>: bl     0x100c37908               ; symbol stub for: std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>::basic_string(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, unsigned long, unsigned long, std::__1::allocator<char> const&)
    0x100a43934 <+272>: strb   w20, [x19], #0x8
    0x100a43938 <+276>: mov    x1, sp
    0x100a4393c <+280>: mov    x0, x19
    0x100a43940 <+284>: bl     0x100c375c0               ; symbol stub for: std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>::basic_string(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&)
    0x100a43944 <+288>: ldrsb  w8, [sp, #0x17]
    0x100a43948 <+292>: tbnz   w8, #0x1f, 0x100a4395c    ; <+312>
    0x100a4394c <+296>: ldp    x29, x30, [sp, #0x30]
    0x100a43950 <+300>: ldp    x20, x19, [sp, #0x20]
    0x100a43954 <+304>: add    sp, sp, #0x40
    0x100a43958 <+308>: ret    
    0x100a4395c <+312>: ldr    x0, [sp]
    0x100a43960 <+316>: bl     0x100c37bfc               ; symbol stub for: operator delete(void*)
    0x100a43964 <+320>: ldp    x29, x30, [sp, #0x30]
    0x100a43968 <+324>: ldp    x20, x19, [sp, #0x20]
    0x100a4396c <+328>: add    sp, sp, #0x40
    0x100a43970 <+332>: ret    
    0x100a43974 <+336>: bl     0x100c38100               ; symbol stub for: abort
    0x100a43978 <+340>: mov    w0, #0x10
    0x100a4397c <+344>: bl     0x100c37b90               ; symbol stub for: __cxa_allocate_exception
    0x100a43980 <+348>: mov    x20, x0
    0x100a43984 <+352>: adrp   x1, 569
    0x100a43988 <+356>: add    x1, x1, #0xc4c            ; "Cannot parse path"
    0x100a4398c <+360>: bl     0x100c37b18               ; symbol stub for: std::runtime_error::runtime_error(char const*)
    0x100a43990 <+364>: adrp   x1, 2835
    0x100a43994 <+368>: ldr    x1, [x1, #0x9e8]
    0x100a43998 <+372>: adrp   x2, 2835
    0x100a4399c <+376>: ldr    x2, [x2, #0xa20]
    0x100a439a0 <+380>: mov    x0, x20
    0x100a439a4 <+384>: bl     0x100c3762c               ; symbol stub for: __cxa_throw
->  0x100a439a8 <+388>: mov    x19, x0
    0x100a439ac <+392>: ldrsb  w8, [sp, #0x17]
    0x100a439b0 <+396>: tbz    w8, #0x1f, 0x100a439bc    ; <+408>
    0x100a439b4 <+400>: ldr    x0, [sp]
    0x100a439b8 <+404>: bl     0x100c37bfc               ; symbol stub for: operator delete(void*)
    0x100a439bc <+408>: mov    x0, x19
    0x100a439c0 <+412>: bl     0x100c386d0               ; symbol stub for: _Unwind_Resume
    0x100a439c4 <+416>: mov    x19, x0
    0x100a439c8 <+420>: mov    x0, x20
    0x100a439cc <+424>: bl     0x100c37770               ; symbol stub for: __cxa_free_exception
    0x100a439d0 <+428>: mov    x0, x19
    0x100a439d4 <+432>: bl     0x100c386d0               ; symbol stub for: _Unwind_Resume
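
For readers who do not want to trace the assembly, the following is a rough C++ reconstruction of what the listing appears to do. It is inferred only from the comparisons against `0x2f` (`'/'`), the two `std::string` constructor calls, and the "Cannot parse path" string; it is not copied from the libzim source, and the name `parse_long_path_sketch` and the helper `parses` are made up for this illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical reconstruction from the disassembly, not libzim's source.
// Splits an old-style "long path" such as "A/index.html" into its
// one-character namespace and the remaining path.
std::pair<char, std::string> parse_long_path_sketch(const std::string& long_path) {
    // Skip one leading '/' so that "/A/foo" and "A/foo" are treated alike.
    const std::size_t i = (!long_path.empty() && long_path[0] == '/') ? 1 : 0;

    const bool too_short = i + 1 > long_path.size();
    const bool empty_namespace = !too_short && long_path[i] == '/';
    const bool no_separator = i + 1 < long_path.size() && long_path[i + 1] != '/';
    if (too_short || empty_namespace || no_separator) {
        throw std::runtime_error("Cannot parse path");
    }

    const char ns = long_path[i];
    const std::size_t start = std::min(i + 2, long_path.size());
    return {ns, long_path.substr(start)};
}

// Convenience wrapper: true if the path parses, false if it throws.
bool parses(const std::string& long_path) {
    try {
        parse_long_path_sketch(long_path);
        return true;
    } catch (const std::runtime_error&) {
        return false;
    }
}
```

Under this reading, `mesquartierschinois.wordpress.com` fails because its second character is `e` rather than `/`, which would explain why execution stops right after `__cxa_throw`. A path such as `A/index.html` parses into namespace `A` and path `index.html`.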

Last call on the Objective-C side:

- (NSString *_Nullable) getRedirectedPath:(NSUUID *_Nonnull)zimFileID contentPath:(NSString *_Nonnull)contentPath {
    zim::Archive *archive = [self archiveBy: zimFileID];
    if (archive == nil) { return nil; }
    try {
        std::string contentPathC = [contentPath cStringUsingEncoding:NSUTF8StringEncoding];
        zim::Item item = archive->getEntryByPath(contentPathC).getRedirect();  // <---- contentPathC :"mesquartierschinois.wordpress.com"
        return [NSString stringWithUTF8String: item.getPath().c_str()];
    } catch (const std::exception &) {  // catch by const reference to avoid slicing the exception
        return nil;
    }
}

[Screenshot 2024-06-06 at 22 27 24] [Screenshot 2024-06-06 at 22 25 25]

@rgaudin
Member

rgaudin commented Jun 7, 2024

https://mirror.download.kiwix.org/zim/.hidden/dev/mes-quartiers-chinois_fr_all_2024-05.zim

❯ ziminfo.py mes-quartiers-chinois_fr_all_2024-05.zim
ZIM Info for mes-quartiers-chinois_fr_all_2024-05.zim
Properties
  - UUID: 861c031f-dafb-9688-4db4-8f1199fe2926
  - Main Entry: mainPage (mesquartierschinois.wordpress.com/)
  - New NS scheme: True
  - Multipart: False
  - Has Full-Text Index: True
  - Has Title Index: True v0, v1
  - Checksum: ad37462b0c6408d745b8287b77019655
  - Entry Count: 2977
  - All Entry Count: 2994
  - Article Count: 346
  - Media Count: 1741
  - Illustration sizes: {48}
Metadata:
 - Counter: application/javascript=217;application/json=618;application/json+protobuf=2;application/octet-stream=1;application/rss+xml=2;application/vnd.apple.mpegurl=3;application/x-javascript=3;font/woff=1;font/woff2=2;image/avif=2;image/jpeg=1158;image/png=110;image/svg+xml=1;image/webp=307;text/css=23;text/html=346;text/javascript=12;video/MP2T=105;video/mp4=58
 - Creator: -
 - Date: 2024-05-30
 - Description: -
 - Illustration_48x48@1: image/png binary (4172 bytes)
 - Language: fra
 - Name: mes-quartiers-chinois_fr_all
 - Publisher: openZIM
 - Scraper: warc2zim 2.0.0-dev9 + zimit 2.0.0-dev5 + Browsertrix crawler 1.1.3
 - Source: https://mesquartierschinois.wordpress.com/
 - Tags: _ftindex:yes;_category:other
 - Title: Mes quartiers chinois
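
One thing that stands out in this output is that the main entry path ends with a slash (`mesquartierschinois.wordpress.com/`), while the content path shown in the earlier debugging comment has none (`mesquartierschinois.wordpress.com`). Whether that mismatch is the actual cause is an assumption here, and the thread does not describe what #797 changed. The sketch below only illustrates how an exact-match lookup would miss such an entry and how a trailing-slash fallback would find it. `FakeArchive` and `find_entry_path` are hypothetical stand-ins, not libzim or Kiwix APIs.

```cpp
#include <set>
#include <string>

// Stand-in for a ZIM archive: just the set of entry paths it contains.
// (Hypothetical; a real archive would be queried through libzim.)
using FakeArchive = std::set<std::string>;

// Look up `path` exactly, then retry with a trailing slash appended.
// Returns the matching entry path, or an empty string if nothing matches.
std::string find_entry_path(const FakeArchive& archive, const std::string& path) {
    if (archive.count(path) > 0) {
        return path;
    }
    if (!path.empty() && path.back() != '/') {
        const std::string with_slash = path + '/';
        if (archive.count(with_slash) > 0) {
            return with_slash;
        }
    }
    return std::string();
}
```

For an archive containing only `mesquartierschinois.wordpress.com/`, an exact lookup of the slash-less path finds nothing, whereas `find_entry_path` returns the entry with the trailing slash.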
