Different result when giving the same text #31
Comments
Hi Stophface, Thanks for reporting the issue. The algorithm used by Aside, what do you expect to be the correct output for that input? Cheers, On Wed, Mar 25, 2015 at 11:04 PM, Stophface notifications@github.com
Hey,
I insert the data into the database with the The returned values when I read them out from the database are a Python tuple. I then convert them to a string, as you can see.
Given the text you showed there should be no difference between ASCII and However, looking at your code more closely, I notice that you do On Thu, Mar 26, 2015 at 8:12 AM, Stophface notifications@github.com wrote:
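A minimal sketch of why converting a whole database row to a string can change what the classifier sees, assuming the row comes back as a one-element tuple. The `row` value and the sample text are hypothetical and not taken from this issue. Calling `str()` on a tuple yields its repr, with parentheses, quotes, and a trailing comma, rather than the bare cell text:

```python
# Hypothetical row, shaped like what a DB-API cursor returns: a one-element tuple.
row = ("bonjour tout le monde",)

as_repr = str(row)  # repr of the tuple, not the text itself
as_text = row[0]    # the actual cell value

print(as_repr)  # ('bonjour tout le monde',)
print(as_text)  # bonjour tout le monde

# The two strings differ, so a language identifier fed one or the other
# is not being given "the same text".
assert as_repr != as_text
```

Indexing the tuple (`row[0]`) instead of stringifying it keeps the input identical to the text typed in by hand.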
There will be different text: Farsi, Pashto, Arabic. Basically, any of the languages spoken might be in the variable I pass to langid. My intention is to pass langid text that is as clean as possible. That's my output when looking at it byte by byte:
Thanks for providing such a great tool!
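Looking at text byte by byte, as described in this comment, might be done along these lines. This is only an illustrative sketch: the helper `show_bytes` and the sample string are hypothetical, and UTF-8 is assumed as the encoding.

```python
def show_bytes(text, encoding="utf-8"):
    """Return (character, hex-encoded bytes) pairs for each character in text."""
    return [(ch, ch.encode(encoding).hex()) for ch in text]


# Hypothetical sample: one non-ASCII character, a space, one ASCII letter.
sample = "\u00e9 a"

for ch, hexed in show_bytes(sample):
    # repr() makes whitespace and stray quote characters visible.
    print(repr(ch), hexed)
```

Stray quotes, parentheses, or escape sequences in such a dump suggest that a container's repr, rather than the text itself, is being passed along.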
Solved. Ah, and I expected the language to be identified as French :) All good now! If you're interested: I am using your library on the Flickr API :)
Closing as @Stophface indicated issue is solved.
I have a database from which I read. I want to identify the language in a specific cell, defined by column.
I read from my database like this:
Technically, it works. It identifies the languages used and later on (not displayed here) writes the identified language back into the database.
So, now comes the weird part.
I wanted to double check some results manually. So I have these words:
When I perform the language identification on the database, it writes Portuguese into the database. But performing it like this:
Well, that returns French. Apart from the fact that the text is neither French nor Portuguese, why are different results returned?!