Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

.po parser (yecc-based) #3

Merged
merged 8 commits into from
Mar 19, 2015
Merged

.po parser (yecc-based) #3

merged 8 commits into from
Mar 19, 2015

Conversation

whatyouhide
Copy link
Contributor

Sorry it took me so long but I've been super busy and I've also been fighting with a nasty bug (read the commit messages if you want to get as angry as I did 馃槃).

The syntax errors should be refined a little but the core functionality is there.
Let me know what you think for now!

Lists of string tokens are now concatenated into single binaries *by the .yrl
parser*. The parser now returns a list of translations as maps, and the Elixir
`Gettext.PO.Parser` only has to convert those maps to structs (by setting the
`:__struct__` field).
I added a nonterminal "grammar" symbol (in order to remain more flexible in the
future) and cleaned up the code a little bit.
I believe this failure is caused by yecc behaving weirdly when `{:string, line,
value}` tokens are used and the value is a binary. I keep getting
`FunctionClauseError`s for `:io_lib.write_string/1` which gets called with the
value of the token, failing because it expects a char list and it gets a binary
instead.
The commit title is bold, but I think I found the bug highlighted by the failing
test from the previous commit. yecc interprets `{string, Line, Contents}` tokens
differently than it handles other tokens as it can be seen at
https://github.com/erlang/otp/blob/maint/lib/parsetools/src/yecc.erl#L2496.

This caused problems when using binaries (common in Elixir) instead of char
lists since yecc tries to "print" them when a syntax error occurs, and for
printing them it uses `io_lib:write_string/1` which expects a char list and
fails with a binary.

The fix to this problem was to just use `{str, Line, Contents}` tokens in
`Gettext.PO.Tokenize.tokenize/1`.
defmodule Gettext.PO.Parser do
@moduledoc false

defmodule Translation do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would move this up to Gettext.PO.Translation and document it. That's because it is likely folks will be using them directly. On the other, you can wait until it happens.

@josevalim
Copy link
Contributor

This looks great! Feel free to merge this. <3

The next step is a function in Gettext.PO that parses a given file. If we are going with elixir naming, it should likely be called parse_file (and maybe followed by a parse_string too). However, if we are going to use gettext naming, we may want to search if there is a command name we could steal from here (chapter 9 in particular).

The `Translation` struct is now documented and ready to be used by users of the
library.
whatyouhide added a commit that referenced this pull request Mar 19, 2015
@whatyouhide whatyouhide merged commit 5e0c805 into elixir-gettext:master Mar 19, 2015
@whatyouhide whatyouhide deleted the yecc-parser branch March 19, 2015 00:15
@whatyouhide
Copy link
Contributor Author

@josevalim moved Gettext.PO.Parser.Translation to Gettext.PO.Translation (way better, thanks!) and merged :).

@whatyouhide
Copy link
Contributor Author

@josevalim with regards to the naming of the parsing functions, I think we should stick to Elixir conventions right now. Parsing a .po file isn't really something that gettext does "in public", I think it's more like an internal process and I think it should be an internal process in this library as well.

What do you think? I'm going to start working on the implementation of those function (regardless of what their name will be) in the meantime.

@josevalim
Copy link
Contributor

@whatyouhide perfect! 馃帀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants