Czech locale #39

Closed
mihudec opened this issue Apr 20, 2017 · 4 comments

Comments

@mihudec

mihudec commented Apr 20, 2017

Hi,

I really appreciate your work on this. Could you please add support for Czech time formats? I get the following error:
Unexpected time format in "13. březen 2014 v 19:42 UTC+01". If you downloaded your Facebook data in a language other than English, then it's possible support may need to be added to this tool.

The timestamp in Czech looks like this:
<span class="meta">13. březen 2014 v 19:42 UTC+01</span>
which translates to: "13. March 2014 at 19:42 UTC+01"

I've tried adding support myself, coming up with the following format:
("cs_cz", "D. MMMM YYYY [v] HH:mm")

and running python ./setup.py develop as mentioned in this comment, but I still get the error (maybe because of non-ASCII characters, such as ř).

I would really appreciate it if you could look into this, thank you.

Sidenote: I've noticed that for some period Facebook used a different form of the month names, such as 13. března instead of 13. březen, which might cause issues because the arrow library supports only one form. This affected only a few messages in my file, but it might cause trouble when adding official Czech locale support (I replaced those manually in Notepad).
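
For anyone hitting the same thing, the manual Notepad replacement could probably be scripted. This is only a rough sketch using the standard library, not part of the tool: the `GENITIVE_TO_NOMINATIVE` table and the `normalize_month` helper are hypothetical names, and the pattern assumes timestamps of the form "13. března 2014". It rewrites the genitive month form to the nominative one only when it sits between a day number and a four-digit year, so ordinary message text should be left alone.

```python
import re

# Genitive -> nominative forms of Czech month names.
# "září" is identical in both forms, so it needs no entry.
GENITIVE_TO_NOMINATIVE = {
    "ledna": "leden",
    "února": "únor",
    "března": "březen",
    "dubna": "duben",
    "května": "květen",
    "června": "červen",
    "července": "červenec",
    "srpna": "srpen",
    "října": "říjen",
    "listopadu": "listopad",
    "prosince": "prosinec",
}

# Match "<day>. <genitive month> <year>" so that only timestamps are touched.
_PATTERN = re.compile(
    r"(\d{1,2}\.\s)(" + "|".join(GENITIVE_TO_NOMINATIVE) + r")(\s\d{4})"
)


def normalize_month(text):
    """Replace genitive month names inside timestamps with nominative ones."""
    return _PATTERN.sub(
        lambda m: m.group(1) + GENITIVE_TO_NOMINATIVE[m.group(2)] + m.group(3),
        text,
    )


fixed = normalize_month("13. března 2014 v 19:42 UTC+01")
untouched = normalize_month("13. březen 2014 v 19:42 UTC+01")
```

Running this over the downloaded HTML before parsing would replace the hand edit, assuming the file is read and written as UTF-8.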

@ownaginatious
Owner

So the issue with the messed up ř is probably due to your terminal being set to some encoding other than UTF-8. That shouldn't affect the program itself though in parsing.
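
A quick way to check the terminal theory, as a sketch using only the standard library: print the encoding Python uses for the terminal, then show what the UTF-8 bytes of ř look like when misread as a different code page. Windows-1252 is just an assumed example here; the actual terminal encoding on the reporter's machine is not known.

```python
import sys

# The encoding Python uses when writing to this terminal.
print(sys.stdout.encoding)

# "ř" is two bytes in UTF-8.
raw = "ř".encode("utf-8")

# The same bytes misread as Windows-1252 (an assumed example code page)
# come out as two unrelated characters.
garbled = raw.decode("cp1252")

# ascii() keeps the output printable even on a non-UTF-8 terminal.
print(ascii(raw), ascii(garbled))
```

If the first line printed is something other than a UTF-8 variant, the garbled ř in the error message is likely a display issue only.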

As to the březen vs března issue, I've been expecting that the issue with multiple names for a given month might come up eventually :(

I'll try downloading my own data in the Czech locale and see if I can find a solution. I'll post back here when I've got it sorted out 👍

@ownaginatious
Owner

@mijujda hmm, I just tried using your code and was able to get it to work on my own data by changing the HH to just H in your parsing string (seems single-digit hours do not have a leading 0 in this locale).

Can you try the same and confirm if that works for you? Not sure the března vs březen thing is actually an issue here.
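
To illustrate why HH fails on a timestamp like 9:42, here is a simplified standard-library model. It assumes the parser turns HH into "exactly two digits" and H into "one or two digits"; that mapping is an assumption about how the format tokens behave, and `TOKEN_PATTERNS` and `hour_matches` are hypothetical names, not arrow's API.

```python
import re

# Simplified model of the two hour tokens (an assumption, not arrow's code):
# "HH" requires exactly two digits, "H" accepts one or two.
TOKEN_PATTERNS = {"HH": r"\d{2}", "H": r"\d{1,2}"}


def hour_matches(token, time_text):
    """Return True if time_text ("H:mm" or "HH:mm") fits the given hour token."""
    pattern = re.compile("(" + TOKEN_PATTERNS[token] + r"):(\d{2})")
    return pattern.fullmatch(time_text) is not None


results = {
    ("HH", "19:42"): hour_matches("HH", "19:42"),
    ("HH", "9:42"): hour_matches("HH", "9:42"),
    ("H", "19:42"): hour_matches("H", "19:42"),
    ("H", "9:42"): hour_matches("H", "9:42"),
}
```

Under this model only the HH token with a single-digit hour fails to match, which is consistent with the original example (19:42) parsing fine while other messages in the same file break.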

@mihudec
Author

mihudec commented Apr 21, 2017

Thank you for the quick reply. Unfortunately, changing the time format to H:mm didn't solve the problem at first. However, when I then tried running it in a virtualenv, it worked! Thank you very much. Could you please advise how to make it work outside the venv?

Update: I managed to get it working outside venv, so everything is fine :)

@ownaginatious
Owner

Great! I updated the tool with Czech support in version 0.9.post10, so you shouldn't need to hack around it anymore.
