Garbled characters when uploading a file with Japanese characters in its name to Discovery #571

Closed
thetime1102 opened this issue Oct 4, 2018 · 14 comments
@thetime1102
thetime1102 commented Oct 4, 2018

Expected behavior

Uploading a file with Japanese characters in its name should preserve the name.
Example:
before upload: 日本語.json
after upload: 日本語.json

Actual behavior

The uploaded file name has garbled characters.
Example:
before upload: 日本語.json
after upload: ����.json

Steps to reproduce the problem

Upload a document using the Discovery add_document method.

Code snippet (Note: Do not paste your credentials)

import os
import json
from watson_developer_cloud import DiscoveryV1

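# Placeholder credentials; replace with your own service credentials.
discovery = DiscoveryV1(
    version="2018-03-05",
    username="username",
    password="password"
)

# Upload a file whose name contains non-ASCII (Japanese) characters.
with open(os.path.join(os.getcwd(), 'path_element', '日本語.json')) as fileinfo: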
    add_doc = discovery.add_document('environment_id', 'collection_id', file=fileinfo)
print(json.dumps(add_doc, indent=2))

Python SDK version

watson-developer-cloud==1.7.0

Python version

3.6.6
@germanattanasio
Contributor

Thanks a lot for filing this issue! We'll triage and take a look at it as soon as possible! Do you have a file that we can use for testing?

@thetime1102
Author

thetime1102 commented Oct 8, 2018

> Thanks a lot for filing this issue! We'll triage and take a look at it as soon as possible! Do you have a file that we can use for testing?

Here is a file to use for testing. Thanks.
word ファイル サンプル_doc.zip

@ehdsouza added the bug label Oct 8, 2018
@germanattanasio
Contributor

@thetime1102 Can you try updating the SDK to 2.0.1?

@thetime1102
Author

thetime1102 commented Oct 9, 2018

@germanattanasio I tried that, and nothing changed.

When I debugged the request function in the SDK and edited the ../Programs/Python/Python36/Lib/site-packages/urllib3/fields.py file as follows:

line 38  (before edit): result.encode('ascii')
line 38  (after edit): result.encode('utf8')

the issue was fixed.
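
For context, here is a minimal sketch of why that one-line change matters. It assumes, based on the line quoted above, that the client first tries to write the file name as plain ASCII and only falls back to the RFC 2231 extended form when that fails. `format_filename_param` is a hypothetical helper written for illustration, not the actual urllib3 function.

```python
from email.utils import encode_rfc2231


def format_filename_param(filename: str) -> str:
    """Hypothetical helper: build the filename part of a Content-Disposition header."""
    candidate = 'filename="%s"' % filename
    try:
        # The plain quoted form is only used when the name is pure ASCII.
        candidate.encode("ascii")
        return candidate
    except UnicodeEncodeError:
        # Non-ASCII names fall back to the RFC 2231 extended form:
        # charset, empty language tag, then the percent-encoded bytes.
        return "filename*=" + encode_rfc2231(filename, "utf-8")


print(format_filename_param("report.json"))
print(format_filename_param("日本語.json"))
```

With the 'ascii' check, the Japanese name takes the `filename*=` branch. Switching the check to 'utf8' would make it take the quoted branch instead, which is presumably the form the service handled correctly at the time.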

@germanattanasio
Contributor

@ehdsouza It looks like we are encoding something with ascii rather than utf8 and that's creating problems with some symbols. Can you please take a look?

@germanattanasio
Contributor

This could be related to psf/requests#4218 (comment)

@thetime1102
Author

Thanks for the help.
I'm looking forward to the fixed version of the SDK.

@ehdsouza
Contributor

After looking into the details, it looks like the file name is encoded properly according to the RFC 2231 standard.

I created an internal issue for the Discovery team to look into the server side.
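
To illustrate what the receiving side would need to do with such a value, here is a rough sketch of decoding an RFC 2231 extended parameter using only the Python standard library. `decode_extended_filename` is a hypothetical name. This is not the Discovery service's code, which is not shown in this thread.

```python
from email.utils import decode_rfc2231
from urllib.parse import unquote


def decode_extended_filename(value: str) -> str:
    """Hypothetical helper: decode the value of a `filename*=` parameter."""
    # Split "charset'language'percent-encoded" into its three parts.
    # When no charset prefix is present, charset comes back as None.
    charset, _language, encoded = decode_rfc2231(value)
    # Percent-decode the bytes and interpret them in the declared charset.
    return unquote(encoded, encoding=charset or "ascii")


print(decode_extended_filename("utf-8''%E6%97%A5%E6%9C%AC%E8%AA%9E.json"))
```

A receiver that skips this step, or decodes the bytes with the wrong charset, would be one plausible way for a name to end up garbled.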

@germanattanasio
Contributor

@ehdsouza Can you follow up with the Discovery team to see if they fixed this issue?

@ehdsouza
Contributor

This is still not fixed on the service side.

@germanattanasio
Contributor

We are still waiting for @watson-developer-cloud/watson-discovery to fix the issue @thetime1102. Sorry for the delay.

@apaparazzi0329
Contributor

This issue may still be present; new testing is needed to confirm whether it has been resolved. I will investigate next week and report my findings here.

@apaparazzi0329
Contributor

Testing confirms that this issue is resolved in the latest version of the python-sdk (5.2.0) with the latest API version of Discovery V1 (2019-04-30). Closing as resolved.

@thetime1102
Author

> This issue has been tested to be resolved in the latest version of the python-sdk (5.2.0) and the latest api version of Discovery V1 (2019-04-30). Closing as resolved

OMG! Thank you so much!!!!
