convert_ipynb UnicodeEncodeError #1880

Closed
1 task done
albertovilla opened this issue Dec 13, 2021 · 3 comments · Fixed by #1881
Labels
type:bug Something isn't working type:documentation Improvements on the docs

Comments

@albertovilla
Contributor

Describe the bug
When executing python convert_ipynb.py to generate the Markdown documentation, the converter halts with a UnicodeEncodeError.

Error message

Processing ..\..\..\..\tutorials/Tutorial1_Basic_QA_Pipeline.ipynb
Processing ..\..\..\..\tutorials/Tutorial2_Finetune_a_model_on_your_data.ipynb
Processing ..\..\..\..\tutorials/Tutorial3_Basic_QA_Pipeline_without_Elasticsearch.ipynb
Processing ..\..\..\..\tutorials/Tutorial4_FAQ_style_QA.ipynb
Processing ..\..\..\..\tutorials/Tutorial5_Evaluation.ipynb
Traceback (most recent call last):
  File "F:\projects\contrib\haystack\docs\_src\tutorials\tutorials\convert_ipynb.py", line 31, in <module>
    f.write(body)
  File "Z:\miniconda\envs\haystack\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 8650-8652: character maps to <undefined>
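
The traceback points at the cp1252 codec, which suggests the file was opened with the Windows default encoding rather than UTF-8. Below is a minimal sketch of the same failure mode, assuming the tutorial body contains at least one character that cp1252 cannot represent. The sample string and the helper name `can_encode` are hypothetical and not taken from the script or the tutorial.

```python
import locale

# Hypothetical sample text: the arrow (U+2192) has no mapping in cp1252.
body = "Retriever \u2192 Reader"


def can_encode(text: str, encoding: str) -> bool:
    """Return True if `text` can be encoded with `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# On Python 3.9, open() without an explicit encoding falls back to the
# locale's preferred encoding, which is often cp1252 on Windows.
print("default encoding:", locale.getpreferredencoding(False))
print("cp1252 ok:", can_encode(body, "cp1252"))
print("utf-8 ok:", can_encode(body, "utf-8"))
```

On Linux or macOS the default is usually UTF-8, which would explain why the same script can succeed there and fail only on Windows.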

Expected behavior
The Markdown documentation should be generated without errors.

Additional context
Running on Windows 10 in a newly created conda environment with Python 3.9; the packages were installed with the recommended command pip install --editable .

To Reproduce

  1. Install from GitHub as per the instructions:
     git clone https://github.com/deepset-ai/haystack.git
     cd haystack
     pip install --editable .
  2. Change to the folder that contains the convert_ipynb.py file, i.e. docs\_src\tutorials\tutorials\
  3. Execute the command to update the documentation: python convert_ipynb.py

FAQ Check

System:

  • OS: Windows 10
  • GPU/CPU: GTX 1060 / Intel(R) Core(TM) i5-4670K CPU @ 3.40GHz
  • Haystack version (commit or version number): 1.0
  • DocumentStore: N/A
  • Reader: N/A
  • Retriever: N/A
@albertovilla
Contributor Author

The issue is easy to fix. In fact, I already have a solution, so I can open a PR if you would like. The fix is simply to change line 29 of the script so that it passes the encoding explicitly:

with open(str(i + 1) + ".md", "w", encoding='utf-8') as f:
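
To illustrate why the explicit encoding helps, here is a small self-contained sketch that writes and re-reads a Markdown file in a temporary directory. The helper `write_markdown`, the file name, and the sample body are assumptions made for the example; the real script's loop and variables are not reproduced here.

```python
import tempfile
from pathlib import Path


def write_markdown(path: Path, body: str) -> None:
    """Write `body` to `path` as UTF-8, independent of the platform default."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(body)


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "1.md"
    # Hypothetical body containing a character that cp1252 cannot encode.
    body = "Retriever \u2192 Reader\n"
    write_markdown(target, body)
    # Read back with the same encoding to confirm the text survived intact.
    round_trip = target.read_text(encoding="utf-8")

print(round_trip == body)
```

Passing `encoding="utf-8"` on every `open()` call keeps the output identical across Windows, Linux, and CI runners, whatever their locale settings are.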

albertovilla added a commit to albertovilla/haystack that referenced this issue Dec 13, 2021
@ZanSara
Contributor

ZanSara commented Dec 14, 2021

Hello @albertovilla, thank you! Let's make sure it works on the CI too: we use that script almost exclusively there. By the way, if I may ask, how come you found yourself building our documentation locally?

@ZanSara ZanSara added type:documentation Improvements on the docs type:bug Something isn't working labels Dec 14, 2021
@albertovilla
Contributor Author

Hi @ZanSara, I initially planned to fix an issue in the documentation, but before committing I tried to rebuild the docs, and that's when I ran into this bug.
