UnicodeDecodeError on search #30

Closed
tionis opened this issue Mar 28, 2022 · 11 comments

@tionis

tionis commented Mar 28, 2022

I just tried the server within the Docker container as per the instructions and filled it with a few Markdown files to test it.
When I type in a search term and submit it, I get a UnicodeDecodeError as follows:

UnicodeDecodeError

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 2464, in __call__
    return self.wsgi_app(environ, start_response)
  File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 2450, in wsgi_app
    response = self.handle_exception(e)
  File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1867, in handle_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python3.6/dist-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 2447, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1952, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1821, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python3.6/dist-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1950, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1936, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/app/wiki.py", line 179, in file_page
    return search()
  File "/app/wiki.py", line 80, in search
    fin = f.read()
  File "/usr/lib/python3.6/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte

@Linbreux
Owner

Hi @tionis,

The Dockerfile in this repo has some issues. You can use a newer version that has been made in a fork: https://github.com/artivis/wikmd/tree/docker/docker

Let me know if you have any questions!

@tionis
Author

tionis commented Mar 28, 2022

I get the same error with this container. I just rechecked all my files: they are all either images or text encoded in UTF-8 or US-ASCII.
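For anyone hitting the same traceback: the byte 0xff never appears in valid UTF-8, so "byte 0xff in position 0" is consistent with the search reading a non-text file. JPEG images, for example, begin with the bytes FF D8, and UTF-16 little-endian text files begin with the byte order mark FF FE. The following is a minimal diagnostic sketch, not part of wikmd; the function name `find_undecodable_files` and the approach of walking the whole wiki directory are assumptions made for illustration.

```python
import os


def find_undecodable_files(root_dir):
    """Return a sorted list of (path, first_four_bytes) for files that are not valid UTF-8.

    Hypothetical helper for diagnosis only; it is not taken from wikmd.
    """
    bad = []
    for root, _dirs, files in os.walk(root_dir):
        for name in files:
            path = os.path.join(root, name)
            # Read raw bytes so that opening the file can never raise a decode error.
            with open(path, "rb") as f:
                data = f.read()
            try:
                data.decode("utf-8")
            except UnicodeDecodeError:
                # Keep the leading bytes: they often reveal the file type.
                bad.append((path, data[:4]))
    return sorted(bad)
```

Calling `find_undecodable_files` on the wiki directory lists the files that a strict UTF-8 read would choke on, together with their leading bytes, which helps tell images apart from text files saved in another encoding.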

@Linbreux
Owner

I'll have a look at it, thanks for sharing!

@Linbreux
Owner

Linbreux commented Mar 28, 2022

Hi @tionis,
I changed a line to ignore encoding errors (b1b8ff9):

with open(root + '/' + item, encoding="utf8") as f:

to

with open(root + '/' + item, encoding="utf8", errors='ignore') as f:

Could you try it again with the latest commits? Unfortunately, I can't reproduce the error myself.
What OS do you use?
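To illustrate what the `errors` argument changes, here is a small sketch using only the standard `open()` call. The helper name `read_text` and the sample bytes mentioned below are assumptions for the example, not code from wikmd.

```python
def read_text(path, errors="strict"):
    """Read a file as UTF-8 using the given error handler.

    Hypothetical helper: "strict" raises UnicodeDecodeError on invalid bytes,
    "ignore" silently drops them, and "replace" substitutes U+FFFD.
    """
    with open(path, encoding="utf8", errors=errors) as f:
        return f.read()
```

With a file containing the bytes `b"caf\xc3\xa9 \xff notes"`, the default `errors="strict"` raises `UnicodeDecodeError`, `errors="ignore"` returns the text with the stray byte dropped, and `errors="replace"` keeps a visible U+FFFD marker in its place. Note that `"ignore"` hides the problem rather than removing the cause: binary files such as images are still read and searched, just without raising.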

@tionis
Author

tionis commented Mar 28, 2022

I am running on Linux, specifically Manjaro.
The latest commit fixed the search issue for me, the search needs about half a second, and then I get a list of files in which the keyword was found.
I've got another problem, that the images in the files are not embedded correctly for me, but this doesn't seem to be related to the encoding issue.

@Linbreux
Owner

> The latest commit fixed the search issue for me, the search needs about half a second, and then I get a list of files in which the keyword was found.

Alright, nice that this is solved. How many files do you have in the wiki? If you have fewer than 50 files, the search should be pretty fast (I'm working on optimizing this).

> I've got another problem, that the images in the files are not embedded correctly for me, but this doesn't seem to be related to the encoding issue.

Could you give an example? Where are the images stored, and what path do you use in Markdown to embed them?

@tionis
Author

tionis commented Mar 28, 2022

Well, I imported my Obsidian/Vimwiki files to test wikmd: 787 notes with 193 attachments, totaling about 53 MB of files.
The images were just embedded using the ![](./local_file_here.jpg) syntax.

@Linbreux
Owner

I guess that the ./ is not necessary in the wiki, but I'll integrate this!

@tionis
Author

tionis commented Mar 28, 2022

Actually I didn't use ./ for specifying the file, I will take a look at it again later and open a new issue if I find out more.
I think we can consider the encoding issue fixed then?

tionis closed this as completed Mar 28, 2022

@tionis
Author

tionis commented Mar 29, 2022

Short Update: the Decode Error still also occurs when opening the graph view

@Linbreux
Owner

Hi,

> Actually I didn't use ./ for specifying the file, I will take a look at it again later and open a new issue if I find out more.

Alright, perfect!

> Short Update: the Decode Error still also occurs when opening the graph view

Sorry, I forgot to change it for the graph view. e6da598 fixes this.
Thanks for letting me know!
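One way to avoid fixing the same `open()` call in several places is to route every file read through a single helper. The sketch below is an assumption about how that could look, not the code in e6da598; the names `read_text_lenient`, `iter_documents`, and `search` are hypothetical.

```python
import os


def read_text_lenient(path):
    """Read a file as UTF-8, dropping any bytes that cannot be decoded."""
    with open(path, encoding="utf8", errors="ignore") as f:
        return f.read()


def iter_documents(root_dir):
    """Yield (path, text) for every file under root_dir.

    Hypothetical shared entry point: a search and a graph view could both
    consume this iterator, so the encoding policy lives in one place.
    """
    for root, _dirs, files in os.walk(root_dir):
        for item in sorted(files):
            path = os.path.join(root, item)
            yield path, read_text_lenient(path)


def search(root_dir, term):
    """Return the sorted paths of files whose text contains term (case-insensitive)."""
    needle = term.lower()
    return sorted(
        path for path, text in iter_documents(root_dir) if needle in text.lower()
    )
```

With this layout, a feature added later that reads wiki files picks up the same error handling automatically instead of needing its own fix.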
