Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create package hierarchy #13

Merged
merged 3 commits into from Jun 27, 2016
Merged

Create package hierarchy #13

merged 3 commits into from Jun 27, 2016

Conversation

tomasmcz
Copy link
Member

This does not work yet. Also, Travis will not run it, because glibc.

More importantly, this fails with

2016-06-22 22:17:09: Loading INI file: 'tests/small.ini'
2016-06-22 22:17:09: INI file is parsed.
2016-06-22 22:17:09: Failed to create object 'encoder' of class 'neuralmonkey.encoders.sentence_encoder.SentenceEncoder': Value of "vocabulary" in "neuralmonkey.encoders.sentence_encoder.SentenceEncoder" should be "neuralmonkey.vocabulary.Vocabulary" but was "__builtin__.unicode"en
Traceback (most recent call last):
  File "neuralmonkey/config/config_loader.py", line 219, in get_object
    result = clazz(**args)
  File "neuralmonkey/encoders/sentence_encoder.py", line 22, in __init__
    assert_type(self, 'vocabulary', vocabulary, Vocabulary)
  File "neuralmonkey/checking.py", line 37, in assert_type
    format(name, caller_type_str, exptected_str, real_type_str, value))
Exception: Value of "vocabulary" in "neuralmonkey.encoders.sentence_encoder.SentenceEncoder" should be "neuralmonkey.vocabulary.Vocabulary" but was "__builtin__.unicode"en

I do not know what to do with this. My guess is, that the horrible monster of configuration-builder ate some useful exception somewhere.

@jlibovicky
Copy link
Contributor

This is perfectly OK. You should provide vocabulary object, but instead you have string "en" in the configuration. I am not sure where did you get this configuration file, but it seems to me it is based on one of the proposals how the configuration should like but was never implemented.

@jlibovicky jlibovicky removed the bug label Jun 22, 2016
@tomasmcz
Copy link
Member Author

It must have been implemented, I ran it quite a few times. But I see where it differs from translation-example.ini now. I will fix it.

@jlibovicky
Copy link
Contributor

I've just pushed a fixed config.

@tomasmcz
Copy link
Member Author

... and now I have to rebase this. I'm not quite sure what the correct workflow is: should you have posted this as a pull-request for my fork? Thanks anyway.

@tomasmcz
Copy link
Member Author

Now it fails on

2016-06-22 22:40:41: Loading INI file: 'tests/small.ini'
2016-06-22 22:40:41: INI file is parsed.
2016-06-22 22:40:41: Failed to create object 'encoder_vocabulary' of class 'neuralmonkey.vocabulary.from_datasets': 'Dataset' object has no attribute '_series'
Traceback (most recent call last):
  File "neuralmonkey/config/config_loader.py", line 219, in get_object
    result = clazz(**args)
  File "neuralmonkey/vocabulary.py", line 151, in from_datasets
    series = dataset.get_series(series_id, allow_none=True)
  File "neuralmonkey/dataset.py", line 111, in get_series
    return self._series.get(name)
AttributeError: 'Dataset' object has no attribute '_series'

Also, I'm not sure we will be able to run any meaningful tests on Travis, because it has lower version of glibc than TF is compiled with.

@tomasmcz
Copy link
Member Author

I've found a way to run tests on Travis. Strangely enough, it shows different error than my local computer.

Ad correct workflow: this should have been a branch here, not in my fork, then you would be able to push to it. Mea culpa.

@tomasmcz
Copy link
Member Author

tomasmcz commented Jun 22, 2016

OK, now it arrived at the same error. Can you take a look, @jlibovicky? I honestly cannot tell whether this is another error it my ini file or your hungry config-builder ate another exception.

Edit: Even though it probably only eats import errors, so this should be something else.

@jlibovicky
Copy link
Contributor

This is because of @jindrahelcl recent change in the dataset objects contructions. All series names must no begin with "s_". I always forget about this one.

Btw. @jindrahelcl, did you have a good reason to make it this way?

@jindrahelcl
Copy link
Member

I changed it when I was adding the preprocessor option to the dataset config. It seemed simpler to me just to prefix names for dataset series with s_, than to have some restricted keywords that nobody knows or not using underscores in option names. But I agree it's not the best solution either.

Perhaps we could enable the options to have dicts as parameters. Then in dataset, you would have an option series with value {source=>val.en.tok, target=>val.de.tok} or whatever the python syntax for dict literals is.

@jindrahelcl
Copy link
Member

I am aware that I changed also the usage of the preprocessor. Now, it is used on the whole dataset, which is not sufficient, we need to be able to use also preprocessors on a subset of series, jointly or independently. It should also be done in a way that user can stack multiple processors on top of each other.

@tomasmcz
Copy link
Member Author

I don't see anything beginning with "s_" in any .ini file. Is any of the example configurations supposed to work? What exactly should I change to make tests/small.ini work?

@tomasmcz
Copy link
Member Author

tomasmcz commented Jun 23, 2016

OK, I solved it. The next error message is very confusing again. First it says INI file is parsed, then Failed to load INI file: Unexpected fields: test_run, .... That's not what I imagine under "parsing a configuration". This should be addressed in #17.

@tomasmcz tomasmcz self-assigned this Jun 23, 2016
@tomasmcz
Copy link
Member Author

Also, why are the vocabularies empty?

@tomasmcz
Copy link
Member Author

Solved. Now it says that it's missing series.

@tomasmcz
Copy link
Member Author

OK, I get the joke, we write "s_" on the left side, but not on the right. Seems like I'm finally dealing with real errors.

@tomasmcz
Copy link
Member Author

This is an interesting error, seems like a bug in postprocessing.

@tomasmcz
Copy link
Member Author

tomasmcz commented Jun 24, 2016

Finally, the test configuration works and we can run a real test on Travis. This should prevent things like #21 from happening.

I have not yet corrected all the imports (I intend to do that as I work on #6), so you may still run into import errors, but fixing them is easy enough, so I consider this ready to merge (after we move away the last commit from master, as discussed in #21). What do you think, @jindrahelcl, @jlibovicky?

@jindrahelcl
Copy link
Member

One more thing - the lib directory is now regarded as a package. So either think of something how to do it properly using git submodules and then fix the imports in processors/bpe.py or put it to the neuralmonkey dir. I don't really care as long as I don't need to deal with this again.

@tomasmcz
Copy link
Member Author

What's the problem with the lib directory? After I fixed the import from neuralmonkey, everything in processors/bpe.py seems to import correctly.

I agree that the lib situation is not ideal, but we should open a separate issue for that.

I see no other reason not to merge this (other than #21, but that can be solved later). So unless someone objects, I will merge this tonight.

@jindrahelcl
Copy link
Member

Ok, it looks good now.

@tomasmcz tomasmcz merged commit a87b132 into ufal:master Jun 27, 2016
carolinlawrence pushed a commit to carolinlawrence/neuralmonkey that referenced this pull request Jan 12, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants