Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improvements in the name and map generators #661

Closed
wants to merge 3 commits into from
Closed

Improvements in the name and map generators #661

wants to merge 3 commits into from

Conversation

spixi
Copy link
Contributor

@spixi spixi commented May 24, 2016

This commit contains several improvements in the name generator and map generator:

  • factory class to simplify creation of name generators and removing dependencies from concrete name generator classes
  • the is_valid() method has been replaced by an exception, because that is more meaningful than using a shared pointer to delete the object, if it is not valid
  • the context-free grammar based name generator now supports variables
  • the map generator now uses the new features of the cfg name generator
  • the redundant wml tag [village_naming] has been removed
  • village and terrain generating rules have been ported from/data/english.cfg to /data/core/macros/names.cfg

name_generator.reset(new markov_generator(utils::split(markov_list), naming["markov_chain_size"], 12));
name_generator_factory* generator_factory = nullptr;
name_generator *base_name_generator = nullptr,
*river_name_generator = nullptr,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I would prefer if you repeat the type name on each line here, rather than declaring several pointers in a single statement. I'm not sure if Wesnoth has an official stance on this though.

@CelticMinstrel
Copy link
Member

CelticMinstrel commented May 25, 2016

Okay, so there's a lot of stuff here. I would have been happier if this had been split into several commits, honestly, but it may be too late to split it up now. (If you want to try, feel free, but keep in mind that splitting commits is kind of difficult in git.)

Some of the changes you've made are a little dubious as well. That's not to say that they're bad, strictly speaking, but...

  1. Though exceptions are nice, I don't think they're categorically better here. The old method works just as well, right? Overuse of exceptions is also problematic when debugging, though I don't think this particular use of them would be (and Wesnoth abuses exceptions far worse in other places).
  2. I don't quite understand how the name generator factory class works, but the existence of dummy_generator feels like a red flag. Why do you need a dummy generator? Surely there's a way to do that without creating a useless class that does nothing? Furthermore, even if you absolutely do need dummy_generator, I don't think there's any point in giving it a whole file-pair to itself; it could just as easily live in name_generator_factory.cpp (yes, a source-only class with no header).
  3. When I was working on the name generators awhile ago, I was under the impression that [village_naming] was used in a different manner than [naming]. Are you quite sure it's redundant? Can you explain your reasoning in detail? Furthermore, even if it's redundant, you can't just up and remove it. There's a deprecation procedure that needs to be followed.
  4. I did scan over the code, but could you explain how the CFG variables work in simple terms? If this is accepted it would need to be documented in the wiki anyway, and I'm currently on the fence about the new feature, so a clear explanation may help me better formulate an opinion on it.

All things considered, it's my opinion that this cannot be merged in its present form. The 3rd point is the main blocker here.

@spixi
Copy link
Contributor Author

spixi commented May 25, 2016

@CelticMinstrel
Thank you for your feedback.

I readded the [village_naming] tag again, removed the exception specifications and the goto statements and the typename in front of the initializer lists. I cannot remove name_generator_factory.hpp, because it is required to initialize a static private attribute. The name_generator::generate method has a parameter where you can provide a string map of variables which should be replaced. The syntax of variables is like nonterminals, but variables start with a Dollar sign, e. g. the context free grammar main={$var} Town|City of ($var}|{$var2} {$var} with the variables var="York" and var2="New" would produce York Town, City of York or New York.

@gfgtdf
Copy link
Contributor

gfgtdf commented May 25, 2016

I don't see where the pointer returned by determine_name_generator_from_config is deleted also i wonder why base_generator_factory is a pointer and just just a normal name_generator_factory object..

Also the name generator api shoudl be moved from game_lua_kernel to lua_kernel_base so that it is also avaialbe to luia random map generators.

@CelticMinstrel
Copy link
Member

CelticMinstrel commented May 25, 2016

I cannot remove name_generator_factory.hpp, because it is required to initialize a static private attribute.

I think you misunderstood my second point. The suggestion was not to remove name_generator_factory.hpp – rather, I suggested that the contents of dummy_name_generator.hpp and dummy_name_generator.cpp should be merged into name_generator_factory.cpp. (Possibly defining all the functions inline, since there's little reason to define them out of line if the class and function definitions are in the same file.)


/**
* @file
* Fallback for name generation (to guarantee, that name_generator_factory always returns a valid object reference)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This leads to the following question: Why does name_generator_factory always need to return a valid object reference? It could easily return a pointer or a boost::optional instead, for example.

@spixi
Copy link
Contributor Author

spixi commented May 25, 2016

I made some changes:

  • removed singleton and pointers
  • removed wrong throw()
  • removed deprecation message

@CelticMinstrel It is a common practise that factories produce Null objects, when they fail to produce any other object. This makes it less error prone, because you always get a valid object and you don't need to check whether it is a null pointer. It also provides a default implementation to reduce code duplication.

@spixi
Copy link
Contributor Author

spixi commented May 25, 2016

How can I fix the

undefined reference to name_generator_factory::name_generator_factory(config const&, std::vector<std::string, std::allocatorstd::string >)'`

undefined reference to name_generator_factory::get_name_generator(std::string)'`

in the Travis build?

The protoypes name_generator_factory::name_generator_factory(config const&, std::vector<std::string>) and name_generator_factory::get_name_generator(const std::string) are included in src/utils/name_generator_factory.hpp. It seems that the linker does not find name_generator_factory.o. It works on my machine, however ...

@CelticMinstrel
Copy link
Member

It also provides a default implementation to reduce code duplication.

Uhh... what are you talking about? The rest of what you said kinda makes sense (even if I'm not sure I agree with it), but I can see no way in which dummy_name_generator provides a default implementation or reduces code duplication (in fact, it kinda increases code duplication). Anyway, if you wanted a default implementation for something, you would put it in the name_generator base class.

// We set the metatable now, even if the generator is invalid, so that it
// will be properly collected if it was invalid.
luaL_getmetatable(L, "name generator");
lua_setmetatable(L, -2);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uhhh... why did you remove these two lines? They're crucial for this to work. (The comment should be retained too.)

@CelticMinstrel
Copy link
Member

I think perhaps an error should be raised in the CFG generator if you attempt to add a nonterminal beginning with $ - a simple conditional throw statement in the condition block at line 41 would suffice, I think.

@CelticMinstrel
Copy link
Member

Regarding the Travis issue, it's probably because you didn't add the new file to src/SConscript. You should also add it to src/CMakeLists.txt.

@spixi
Copy link
Contributor Author

spixi commented May 29, 2016

I added the name_generator_factory.cpp file into src/SConscript (It was already in src/CMakeLists.txt). Apparently travis uses SCons, but I used CMake for my local build.

$ sign in nonterminal definitions is now checked.

The dummy_generator class has been removed, the base class has now a default implementation.

Furthermore I removed the unused branch in game_lua_kernel.cpp and added the lua_get/setmetatable again.

I should really consider to install SCons, but it requires Python 2.7 and its kitchen sink. (At the moment I have only Python 3.4 installed.)

The coding in context_free_grammar_generator.cpp really needs a code review. I kept it as-is, because I don't know Wesnoth's development guidelines and I didn't want to break it.

if (seed_pos >= seed_size) seed_pos = 0;
if (picked == got.last_) {
picked = seed[seed_pos++] % got.possibilities_.size();
else {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...you know, this else isn't actually needed since the if block always returns. It would make the diff clearer, as well.

@CelticMinstrel
Copy link
Member

Well, I don't see any problems left, though I'd prefer someone more familiar with map generation to review that part of the PR.

@gfgtdf
Copy link
Contributor

gfgtdf commented Jun 2, 2016

Hmm what random number is used to generate the names dorign map generation? Usually the map generator has its own random generator default_map_generator_job::rng_ that could be used but note that just like with unit the number of random calls should be independent on the name generator (which might depend on trasnlations) to make random mape rproducable with a given seed.

Other than that, i doubt there there is someone really familiar with the map generator code, i recently refactoeed it a bit, but i mainly moved functions around, not changing the map generators logic. It would be nice if random map generators coudl be tested before this is merged, that is, the mainline randommp maps with both scenario_generation and map_generation.

@CelticMinstrel
Copy link
Member

Even if there isn't anyone familiar with the map generator, it would be nice to know that someone other than me has looked through this in some detail (as opposed to just glancing through it quickly). If you've done that, then great.

@sigurdfdragon
Copy link
Contributor

sigurdfdragon commented Jun 3, 2016

While I'm not qualified to review the c++ code, I could take a look and test that it doesn't obviously break anything with the mainline random maps or my own random maps add-on, but that's about it. It would be sometime in the next 7 days or so.

@gfgtdf
Copy link
Contributor

gfgtdf commented Jun 3, 2016

No i didnd't look into this in deati. It'd be nice if you'd test it in mainline and umc campaigns

@CelticMinstrel
Copy link
Member

@sigurdfdragon That would be great! Perhaps not quite as great as someone scanning the code for obvious errors, but I suppose it may have to do. It'd also be great if someone tested campaign usages of the random map generator, for example in HTTT.

@gfgtdf
Copy link
Contributor

gfgtdf commented Jun 4, 2016

httt only used teh cave map generator which doen't use the code in default_map_generator_job.cpp

@sigurdfdragon
Copy link
Contributor

@CelticMinstrel I've got an add-on where I can test campaign usage too.

@spixi
Copy link
Contributor Author

spixi commented Jun 5, 2016

I just noted that there is a small change in the interface which means that old add-ons may not be compatible with this PR.

Because I moved the village and terrain names from english.cfg to the [naming] and [village_naming] tags it is possible to have different naming schemes per map generator. As long the add-on uses the macro {VILLAGE_NAMES}, everything is okay. But If it uses the old interface to define custom names and only defines male_names= (which is now an alias for base_names=), but not bridge_names, swamp_names, village_names, village_forest_names and so on (or their corresponding *_name_generator equivalents), they won't get any names at all.

@CelticMinstrel
Copy link
Member

Hmm. I think it's actually quite likely that an add-on would use male_names= directly instead of the {VILLAGE_NAMES} macro - this is arguable the point of having the [naming] tag in the first place, really. So, that could be a problem.

I'm not sure what's the best solution, but here's one idea - add a [naming] child to the game config (in data/game_config.cfg) which includes {VILLAGE_NAMES}. Then that can be used as a fallback if the [naming] child in the generator does not include some key. I'm not sure if the {VILLAGE_NAMES} macro is visible in that file though, it should be easy to make it so if it's not already.

(If anyone has any better ideas, feel free to state them.)

@CelticMinstrel
Copy link
Member

The issue here is compatibility - as it currently stands, any older random map generators that do use names will no longer work as the creator intended (assuming that they don't use the {VILLAGE_NAMING} macro, which in my opinion is a fairly good assumption), because they don't contain the extra name specifications.

It would be helpful if we could get some kind of survey of map generator uses in addons on the 1.12 addons server, as well as mainline uses. If it turns out that everyone does use the {VILLAGE_NAMING} macro, then this might be easier to resolve acceptably.

I think that if [naming] and [village_naming] have different expected keys, then it's a little silly to use the same macro for the standard contents of both. I do think that the village_ prefix on those keys is mainly pointless, as well.

...I also suddenly had the thought that integrating variables into the CFG generator is actually kind of redundant. You could have just as easily used the existing WML variable substitution mechanism - use the CFG generator to generate a string containing WML variable substitutions, then substitute the variables into the resulting string. Maybe it's a little late to suggest making that change, though...

@spixi
Copy link
Contributor Author

spixi commented Jun 19, 2016

I am sorry to tell you that I have no time at the moment. Maybe I will have a look at it during the next two weeks.

Regards,

Spixi

@spixi
Copy link
Contributor Author

spixi commented Jun 25, 2016

My idea is the following:

I move the default implementations of the name generators back to english.cfg into the keys [default_naming] and [default_village_naming], so I can access them with

    config = game_config_manager::get()->game_config().child("language")

The user-defined name generators are merged with the system-defined ones, which means, that user-defined override system-defined name generators. If neither a user-defined, nor a system-defined name generator is found, we generate no name.

Then I make {{VILLAGE_NAMING}} an empty macro for backward-compatibility reasons.

Is that okay? If yes, I will do the change in the next weeks (maybe next Saturday).

@CelticMinstrel
Copy link
Member

That sounds okay, but I think I'd put base_names / base_name_generator in the {{VILLAGE_NAMING}} macro rather than in english.cfg.

I'm still having second thoughts on the redundancy of your variables implementation, too.

@Vultraz
Copy link
Member

Vultraz commented Jul 10, 2016

Please rebase this branch with master and resolve conflicts.

@gfgtdf
Copy link
Contributor

gfgtdf commented Jul 11, 2016

you messed up the indention in some files and added some large file 'install_manifest.txt' this must be fixed before mering.

@spixi
Copy link
Contributor Author

spixi commented Jul 16, 2016

I somehow messed up my local repository. I am working on it.

@spixi
Copy link
Contributor Author

spixi commented Jul 16, 2016

running git fsck

@spixi
Copy link
Contributor Author

spixi commented Jul 16, 2016

So, I think, it seems to be okay now?

@CelticMinstrel
Copy link
Member

CelticMinstrel commented Jul 16, 2016

Okay, I want to merge this pull request soon. However, I am against the variable support in the CFG generator. Is there any reason why you can't do something like this?

  1. Rig the CFG generator to generate strings containing WML varaible substitution syntax, eg `bridge_name_generator= _ << main=$base|’s Bridge|$base|’s Crossing >>
  2. Substitute the variables into the resulting string by calling utils::interpolate_variables_into_string(cfg_output, variables). This is defined in formula/string_utils.hpp.

Besides that, my only remaining problem is, why do you need the village_ prefix on all the keys in [village_naming]? That they are village-related is implied by the tag name and doesn't need to be repeated, as far as I can see. I'd prefer if the prefix was dropped, unless you can provide a good reason not to (maybe I've missed something).

Oh, and since I went and moved the definition of the Lua API, it would be handy if you could do what vultraz said, ie rebase this against master. It's not a blocker, but it would make it more convenient for me to merge.

@spixi
Copy link
Contributor Author

spixi commented Jul 16, 2016

Hi @CeltricMinistrel,

I care about your remarks tomorrow. A small question: Should it be possible to support dynamic variables, e. g.

my_generator = <<
main=$foo{var}
var=1,2,3
>>

which would be identical too

my_generator="main=$foo1,$foo2,$foo3"

or should the variable substitution come first, which allows things like:

my_generator = <<
main={foo$var|},{bar1},{bar2}
foo1=foo1
foo2=foo2
bar$var|=bar$var|
>>

Thank you for your feedback.

@CelticMinstrel
Copy link
Member

Okay, so I'm glad you mentioned this, because it brought forward an issue with the idea that I suggested. The issue is quite simple, really - the | character is important both in the name generator and in variable substitution, with different meanings. (It's actually optional in variable substitution, but there exist cases where it is required, eg $name|marsh.)

For your actual question, I favour the first way - that is, the CFG could include sequences that vary which construct variable names from smaller parts. Note that your example of what it would be identical to is actually incorrect, as it uses commas instead of vertical bar as the separator.

Regarding the problem of |, I can see two options:

  1. Implement an escaping mechanism in the CFG parser to allow special characters to be specified as literals instead of with their special meaning. There are two obvious ways to implement an escaping mechanism.
  2. Select a character (eg #) to replace the variable-substitution meaning of | only in generators. Then the procedure would be: generate a string from the generator; replace all # with |; and finally substitute variables.
  3. We could ignore the problem and just posit that variable substitutions requiring | are not supported, but since there are already some cases in english.cfg that would require it, I don't consider this a good option.

The two obvious ways of implementing an escaping mechanism are:

  1. Use an escape character; backslash is the most obvious choice, but percent, hash, or even at could work too. (It should not be a character that would commonly be used in natural text. This excludes ampersand, forward slash, and all sentence punctuation.) When the escape character is encountered, if it is followed by a special character (which includes itself as well as ={}|), then the escape character is ignored and the special character is inserted into the current buffer. If it is not followed by a special character, it could either be ignored or inserted into the current buffer.
  2. Use braces. This method might not allow for escaping the braces themselves (though it could allow the sequences {{} and {}} for that, but that might be confusing). The most important result of this is that the sequence {|} would expand to a vertical bar in the output. (You could also support {=}, though that would only be useful if these escapes are valid on the left hand side as = on the right hand side is already taken as a literal.)
  3. Define an implicit nonterminal in every generator that expands to a vertical bar. With this method, there's no actual escape mechanism, but the sequence {pipe} (or similar) will expand to a vertical bar in the output. This might actually be the easiest solution.

Anyone have thoughts on this?

@Vultraz
Copy link
Member

Vultraz commented Jul 16, 2016

What about a double pipe (||) for the generator, and a single pipe (|) for variable substitution?

@ln-zookeeper
Copy link
Member

If you're talking about making || have special meaning in variable substitution in general: || is very commonly used in cases like $foo_$side_number|| and those cannot be broken.

@spixi
Copy link
Contributor Author

spixi commented Jul 17, 2016

I totally forgot about this. The vertical bar was the main problem, that's why I used the comma instead in my comment above.

For escaping I have the following suggestion:

There are the predefined nonterminals:

( = "{"
) = "}"
! = "|"

This would change the syntax from

forest_name_generator= _ << main={$base} Forest|{$base}’s Forest >>

to

forest_name_generator= _ << main=$base Forest|$base{!}’s Forest >>

The variable substitution will be executed after the name generator.

Your thoughts?

@CelticMinstrel
Copy link
Member

That seems okay to me. I'd prefer {|} to {!}, but I guess either is acceptable.

(Incidentally, I don't think the vertical bar is actually required when it's followed by a single quote / apostrophe - only when followed by a letter, digit, or underscore, and possibly also when followed by a dot or square bracket.)

@ln-zookeeper That's not what we were talking about, so no worries.

@CelticMinstrel
Copy link
Member

Two issues I can still see:

  1. Why is there an error if $ is used in a nonterminal? I suppose it's not really a major issue, but it does seem weird. I could let this pass if there's a good reason.
  2. This still conflicts with some changes made in master. In particular, I recently moved the Lua API implementation for the name generators from game_lua_kernel.cpp to lua_kernel_base.cpp, so if you could update the PR for that, it'd be great. I also made a few bugfixes to the function, so please make sure that you don't accidentally revert those.

@spixi
Copy link
Contributor Author

spixi commented Jul 17, 2016

I fixed the error with the $ sign. I forgot to adjust it, when I reverted the syntax for the variables back to WML variables syntax.

@CelticMinstrel: Do you think you could do the changes with lua_kernel_base.cpp and the bugfixes mentioned in you comment?

@CelticMinstrel
Copy link
Member

I suppose I could give that a try.

@spixi
Copy link
Contributor Author

spixi commented Jul 17, 2016

Thank you

CelticMinstrel added a commit that referenced this pull request Jul 18, 2016
Fixes a few style issues introduced by #661 and also some that were already there
CelticMinstrel added a commit that referenced this pull request Jul 18, 2016
Improvements in the name and map generators
@CelticMinstrel
Copy link
Member

This has now been merged.

@sigurdfdragon
Copy link
Contributor

The crash I reported on June 7th above with this pr was NOT fixed. I have opened bug report: https://gna.org/bugs/index.php?24864
Please fix or revert!

@CelticMinstrel
Copy link
Member

It seems to be fixed by 465ca49 - road names were trying to be generated even if no rules for generating names had been given.

@sigurdfdragon
Copy link
Contributor

confirmed fixed, thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants