
support English keywords, i18N generation for Jython (excluding some languages) #197

Merged
merged 8 commits into from Feb 8, 2012

2 participants

@sabrams
sabrams commented Feb 7, 2012

Hi Aslak,

This update adds support for English keywords and includes auto-generation code for other languages. Languages whose keywords cannot be normalized to ASCII are excluded from the auto-generation, because the current Jython support uses Python class names for keywords, and those names do not support Unicode.

This does not yet add support for using these languages, as I have not found a way to load these files correctly from the PythonInterpreter used in JythonBackend when they are imported from a step def file.

@sabrams

This isn't ready for a pull yet. The generated Jython is correct and the non-CLI tests pass when I manually import the EN.py file, but it's not running seamlessly with Maven yet (the build fails).

Cucumber member

Looks nice. It would be nice if each stepdef script could import the DSL like this:

import cucumber.runtime.jython.EN

@Given('I have (\d+) "(.+)" in my belly')
def something_in_the_belly(self, n, what):
  self.n = int(n)
  self.what = what
Cucumber member

Aha. The codeKeywords strings are definitely Unicode, so for Jython I think the best approach is to convert them to ASCII. (Watch out for dupes after asciification.)

I think java.text.Normalizer might do the trick - haven't tried it. It might not work for Chinese, Arabic, Hebrew etc - so maybe fall back to EN if a language's keywords can't be ASCIIfied?
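A rough sketch of that idea in Python, using the standard library's `unicodedata` module as a stand-in for `java.text.Normalizer` (untested against the real generator). The function names `asciify` and `asciify_keywords` and the sample keyword lists are assumptions made up for illustration; they are not part of the Cucumber code base.

```python
import unicodedata


def asciify(keyword):
    """Return an ASCII-only form of `keyword`, or None if there is none."""
    # NFD decomposition splits an accented letter into a base letter
    # followed by one or more combining marks.
    decomposed = unicodedata.normalize("NFD", keyword)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    try:
        stripped.encode("ascii")
    except UnicodeEncodeError:
        # Nothing ASCII is left to fall back on (e.g. Hebrew or Chinese).
        return None
    return stripped


def asciify_keywords(keywords, fallback):
    """ASCIIfy all keywords; use `fallback` if one fails or two collide."""
    result = []
    for keyword in keywords:
        ascii_form = asciify(keyword)
        if ascii_form is None or ascii_form in result:
            return list(fallback)
        result.append(ascii_form)
    return result


# Hypothetical sample data, not read from the real i18n definitions.
EN = ["Given", "When", "Then", "And", "But"]
NO = ["Gitt", "N\u00e5r", "S\u00e5", "Og", "Men"]

print(asciify_keywords(NO, EN))
```

The duplicate check is deliberately coarse here: a single collision makes the whole language fall back to EN. Skipping only the colliding keyword would be another option.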

Cucumber member

I have made some changes in my sabrahams-jython-i18n branch which improve this a little...

@aslakhellesoy
Cucumber member

This seems like a decent workaround. Locales whose keywords can be converted from Latin script to ASCII get translated. Others don't, because of a limitation in Python/Jython.

Am I right?

That is correct. To support the annotation format the Jython interpreter gives us for languages whose keywords can't be converted to ASCII, there would need to be at least some pre-processing phase on the file. One possibility for a future update: we could use a character replacement mechanism in both the I18N file generator and this pre-processing phase, creating classes whose names are ASCII. For example, if in English we suddenly started spelling Given with the yen sign (Gi¥en), and someone wrote the step def:

@Gi¥en('I have (\d+) "(.+)" in my belly')
def something_in_the_belly(self, n, what):
  self.n = int(n)
  self.what = what

we could use a pre-processing phase that parses this file and feeds the following to the Jython parser:

@GiU_00A5en('I have (\d+) "(.+)" in my belly')
def something_in_the_belly(self, n, what):
  self.n = int(n)
  self.what = what

This would run: the class name is taken from the annotation, and the class definition has already been loaded from the EN.py I18N file:

And = But = GiU_00A5en = Then = When = I18NKeywordTemplate

(U+00A5 is the Unicode code point for the yen sign.)
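A minimal sketch of what such a pre-processing step might look like, in plain Python. The names `escape_identifier` and `preprocess` are hypothetical, and the regular expression only handles decorators at the start of a line; nothing like this exists in the branch yet.

```python
import re

# A decorator name at the start of a line: "@" followed by everything up
# to the first whitespace or opening parenthesis, e.g. "@Gi\u00a5en".
DECORATOR = re.compile(r"^([ \t]*@)([^\s(]+)", re.MULTILINE)


def escape_identifier(name):
    """Replace each non-ASCII character with U_XXXX (hex code point)."""
    return "".join(c if ord(c) < 128 else "U_%04X" % ord(c) for c in name)


def preprocess(source):
    """Rewrite decorator names in `source` so they are ASCII-only."""
    def rewrite(match):
        return match.group(1) + escape_identifier(match.group(2))
    return DECORATOR.sub(rewrite, source)


step_def = (
    "@Gi\u00a5en('I have (\\d+) \"(.+)\" in my belly')\n"
    "def something_in_the_belly(self, n, what):\n"
    "  self.n = int(n)\n"
    "  self.what = what\n"
)

print(preprocess(step_def))
```

The I18N file generator would have to apply the same `escape_identifier` mapping when it emits the class aliases, so that both sides agree on the name. This escaping is not collision-free: an ASCII keyword that already contains the text `U_00A5` would clash with an escaped one.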

Anyway, support for most of the languages is almost there. We just need a way to load them and make them usable from any context (Maven, CLI, IDE).

Cucumber member

This sounds like it would allow people to write invalid Python code (non-ASCII annotations seem to be invalid Python).

Allowing people to write invalid Python doesn't sound like a good idea to me. I think people working in non-ASCII languages should be forced to use ASCII.

I'm in agreement about keeping the Python code valid.

@aslakhellesoy aslakhellesoy merged commit c0f263c into cucumber:master Feb 8, 2012