not sure if I can ask questions here, but I got stuck on this when I try to download a TMX from the OPUS JW300 set. It has nothing to do with the set, I think. #4

Closed
@miau1

I'm not sure whether I can ask questions here, but I got stuck on this while trying to download a TMX from the OPUS JW300 set. I don't think it has anything to do with the set itself.

Alignment file /proj/nlpl/data/OPUS/JW300/latest/xml/en-ta.xml.gz not found. The following files are available for downloading:

   8 MB https://object.pouta.csc.fi/OPUS-JW300/v1/xml/en-ta.xml.gz
 263 MB https://object.pouta.csc.fi/OPUS-JW300/v1/xml/en.zip
  94 MB https://object.pouta.csc.fi/OPUS-JW300/v1/xml/ta.zip

 365 MB Total size
Downloading 3 file(s) with the total size of 365 MB. Continue? (y/n) y
JW300_latest_xml_en-ta.xml.gz ... 100% of 8 MB
JW300_latest_xml_en.zip ... 100% of 263 MB
JW300_latest_xml_ta.zip ... 100% of 94 MB
Traceback (most recent call last):
  File "your_script.py", line 3, in <module>
    opus_reader.printPairs()
  File "C:\Users\gertv\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\opustools_pkg\opus_read.py", line 350, in printPairs
    lastline = self.readAlignment(gzipAlign)
  File "C:\Users\gertv\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\opustools_pkg\opus_read.py", line 308, in readAlignment
    lastline = self.outputPair(self.par, line)[1]
  File "C:\Users\gertv\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\opustools_pkg\opus_read.py", line 251, in outputPair
    self.sendPairOutput(wpair)
  File "C:\Users\gertv\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\opustools_pkg\opus_read.py", line 210, in sendPairOutput
    self.resultfile.write(wpair[0])
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.7_3.7.1264.0_x64__qbz5n2kfra8p0\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 83-89: character maps to <undefined>

I have no idea how to fix this. Your help is highly appreciated.

Originally posted by @gertva in #3 (comment)
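
The last two frames of the traceback show the output being encoded through `cp1252`, the default text encoding on many Windows installations, which has no mapping for Tamil characters. The sketch below reproduces that failure in isolation and shows one way around it by writing with an explicit UTF-8 encoding. It does not use opustools at all: `SAMPLE`, `can_write` and `write_utf8` are hypothetical names for illustration, and how opustools opens its result file is an assumption based only on the traceback.

```python
import io
import os
import tempfile

# Hypothetical sample: a short Tamil string standing in for a JW300 sentence.
SAMPLE = "வணக்கம்"


def can_write(text: str, encoding: str) -> bool:
    """Return True if `text` can be written to a text stream using `encoding`."""
    stream = io.TextIOWrapper(io.BytesIO(), encoding=encoding)
    try:
        stream.write(text)
        stream.flush()
        return True
    except UnicodeEncodeError:
        # Same error class as in the traceback: the codec has no
        # mapping for at least one character in `text`.
        return False


def write_utf8(path: str, text: str) -> None:
    """Write `text` to `path` with an explicit UTF-8 encoding,
    independent of the platform's default encoding."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text)


if __name__ == "__main__":
    print("cp1252 ok:", can_write(SAMPLE, "cp1252"))
    print("utf-8 ok:", can_write(SAMPLE, "utf-8"))

    # Round-trip through a temporary file to confirm nothing is lost.
    path = os.path.join(tempfile.mkdtemp(), "pairs.txt")
    write_utf8(path, SAMPLE)
    with open(path, encoding="utf-8") as handle:
        print("round trip ok:", handle.read() == SAMPLE)
```

If the file is opened inside a library and cannot be changed, setting the environment variable `PYTHONUTF8=1` before running the script (Python 3.7 and later) may also help, since it makes UTF-8 the default encoding for `open()`. Whether that is sufficient here depends on how the library creates its output stream.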

Labels: bug