New Sudoku extension and GTK# sample #43

Merged
merged 18 commits into giacomelli:develop on Nov 3, 2018

2 participants
@jsboige
Contributor

jsboige commented Oct 26, 2018

I propose the addition of a Sudoku solver as a new set of classes in the Extensions project, and the corresponding sample in the GTK App.
This was made as part of a course I teach where GeneticSharp is used for GAs.
Sudoku is arguably a good example, because it is relatively hard to solve with GAs, whereas it is much more tractable with other tools. See for instance these LINQ examples with Microsoft's Z3 solver or a dedicated constraint-based solver.
The extension material includes:

  • A class for Sudoku boards, with display, parsing and file loading supporting most common formats
  • Various classes for the genetics with alternative chromosome types.

The GTK sample includes:

  • 4 examples of increasing difficulty, and the ability to load other samples from a file
  • The possibility to choose between various genetics, with the corresponding parameters to configure.
@jsboige

Contributor

jsboige commented Oct 26, 2018

As per your comment in the other current pull request, I added some unit tests that exercise the various strategies. In order to keep the tests fast, I only test for complete resolution with the fastest strategy.

@jsboige

Contributor

jsboige commented Oct 27, 2018

OK, I hope the coverage is now satisfactory without taking too much time.

Now I have remarks and questions:

Sudoku has many local maxima, which makes it hard for GAs and, I believe, a good test for your library. Looking at existing documented implementations, it seems most have gone with a simple 1 cell = 1 gene implementation, and they usually acknowledge the local maximum issue.
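To make the 1 cell = 1 gene encoding concrete, here is a minimal Python sketch of a conflict-counting fitness. It is not the code from this pull request: the names `count_conflicts` and `fitness` and the choice of scoring by duplicate count are assumptions made for illustration.

```python
def count_conflicts(cells):
    """Count duplicated digits across all rows, columns and 3x3 boxes.

    `cells` is a flat list of 81 digits (1-9), one gene per cell, row-major.
    """
    if len(cells) != 81:
        raise ValueError("expected 81 genes, one per cell")
    rows = [cells[r * 9:(r + 1) * 9] for r in range(9)]
    cols = [cells[c::9] for c in range(9)]
    boxes = [
        [cells[(br * 3 + r) * 9 + bc * 3 + c] for r in range(3) for c in range(3)]
        for br in range(3)
        for bc in range(3)
    ]
    # A group holding k distinct digits contains 9 - k duplicates.
    return sum(9 - len(set(group)) for group in rows + cols + boxes)


def fitness(cells):
    # Higher is better; 0 means every row, column and box is duplicate-free.
    return -count_conflicts(cells)
```

A single wrong cell lowers the score through its row, its column and its box at once, which is one way to see why small mutations rarely improve an almost-solved grid.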

In order to get comparison material, I included in the default GTK sample the 2 sudokus documented on this page (easy) and that one (medium), in addition to one easier and one harder than those two.

Using a relatively efficient permutation-based chromosome, I am pleased that all Sudokus seem to be tractable with the current implementation. The "easy" one requires approximately 10 seconds, which compares well with the 15 minutes mentioned in the article.
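As a rough illustration of the permutation idea, the Python sketch below fills each row with a random permutation of its missing digits, so rows are valid by construction and only columns and boxes need scoring. The function names and the use of 0 for a blank cell are assumptions, not the API added by this pull request.

```python
import random


def random_row(givens, rng):
    """Fill the blanks (0) of one row with a random permutation of the missing digits."""
    missing = [d for d in range(1, 10) if d not in givens]
    rng.shuffle(missing)
    fill = iter(missing)
    return [d if d != 0 else next(fill) for d in givens]


def random_candidate(puzzle, rng):
    # One gene per row: each row is a permutation of 1-9 that keeps the givens in place.
    return [random_row(row, rng) for row in puzzle]


def conflicts(grid):
    """Count duplicates in columns and boxes only; rows cannot contain duplicates."""
    cols = [[grid[r][c] for r in range(9)] for c in range(9)]
    boxes = [
        [grid[br + r][bc + c] for r in range(3) for c in range(3)]
        for br in (0, 3, 6)
        for bc in (0, 3, 6)
    ]
    return sum(9 - len(set(group)) for group in cols + boxes)
```

Compared with one gene per cell, this removes the nine row constraints from the search, which may be part of why the permutation chromosome converges faster.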

However, the only efficient way I found to escape local maxima was to increase the GA's population according to the Sudoku's difficulty, with both the permutation and the more naive cell-based strategies. Basically, if the Sudoku wasn't solved by the 50th generation, there was almost no chance that the GA would escape its sub-optimal solution in any number of further generations.
The following settings pretty much guarantee finding a solution:

  • Very Easy Sudoku: 250 chromosomes, <1 sec
  • Easy Sudoku: 5000 chromosomes, 10 sec
  • Medium Sudoku: 100000 chromosomes, 5-10 min
  • Hard Sudoku: 300000 chromosomes, 1-2h

One could argue that the default settings, with Elite Selection and reinsertion, are responsible for that, but I tried hard to play with all the parameters and could not find any better setup. In particular, setting the crossover probability to one to discard parents resulted in getting stuck on lower-quality local maxima.

Then I introduced 2 alternatives:

  • Multiple Chromosome: I made that one general purpose as part of the Extension project: it evolves an arbitrary number of chromosomes rather than just one by inlining the genes.
  • Random permutations: Instead of having just one gene per row permutation, I had several genes picked randomly.
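The "inlining" idea behind the multiple chromosome can be sketched in Python as follows. This is only an illustration: the helper names are hypothetical, and combining sub-scores with `max` is an assumption, since the thread does not say how the pull request aggregates fitness.

```python
def inline_genes(chromosomes):
    """Concatenate the genes of several chromosomes into one flat gene list."""
    return [gene for chromosome in chromosomes for gene in chromosome]


def split_genes(genes, count):
    """Recover `count` equally sized sub-chromosomes from a flat gene list."""
    size, remainder = divmod(len(genes), count)
    if remainder:
        raise ValueError("gene count must be a multiple of the number of sub-chromosomes")
    return [genes[i * size:(i + 1) * size] for i in range(count)]


def multi_fitness(genes, count, sub_fitness, combine=max):
    # Assumption: the combined score is the best sub-score; sum or mean would also work.
    return combine(sub_fitness(sub) for sub in split_genes(genes, count))
```

Because the GA only sees one flat gene list, the usual crossover and mutation operators apply unchanged while several candidate solutions evolve inside each individual.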

Those strategies were partially successful at trading population size for number of generations by playing with the GA settings, but I feel this is not quite satisfying, since in all the other implementations I have seen, the population size is vastly smaller than ours.

So here are my questions:

  • Can you think of any set of parameters that would make the current implementation more efficient?
  • How would you explain that other GA libraries seem able to work with smaller populations and keep making progress over a higher number of generations?
@giacomelli

Owner

giacomelli commented Oct 27, 2018

Thanks for this pull-request, it will be amazing to have a Sudoku sample using GeneticSharp.
I'm glad to hear that you use the library on your course. Is this course an online one?

Your Multiple Chromosome and Random permutations are strong candidates to be moved from Extensions to GeneticSharp core in the future.

About your questions:

  • Can you think of any set of parameters that would make the current implementation more efficient?
  • How would you explain that other GA libraries seem able to work with smaller populations and keep making progress over a higher number of generations?

I will need to take some time to study your implementation and try some configurations.

As your pull request is quite large, I will need several days of my spare time to review and test it, but don't worry, I will review it as soon as possible.

@jsboige

Contributor

jsboige commented Oct 27, 2018

Thanks for your prompt answer.

Is this course an online one?

It is a regular course in AI that I teach in several French engineering schools. An earlier version used to be online, but it is now mainly contained in the various LMSs of those schools. Hopefully I can wrap it up online again some time next year.

By the way, I have other samples I would gladly propose as pull requests, but I thought this one was the easiest pick to get started. Another one is a bit similar to that project, and I reckon it does a better job at demonstrating GAs for image processing than the existing bitmap equality sample.
The only thing is that I relied on the Accord.NET Framework's image filtering capabilities, through the corresponding NuGet package. That helped keep the chromosome and fitness very simple, yet since it's a significant additional dependency, I thought you could tell me how best to proceed.

Anyway, I suppose we can discuss that once the Sudoku's sample is properly integrated.

Your Multiple Chromosome and Random permutations are strong candidates to be moved from Extensions to GeneticSharp core in the future.

Well I'll gladly help with that. They are still pretty experimental for now, but they serve a purpose that can be extended to other cases indeed.

I will need to take some time to study your implementation and try some configurations.
As your pull request is quite large, I will need several days of my spare time to review and test it, but don't worry, I will review it as soon as possible.

Sure, thanks for having a look. Looking forward to learning how best to use your library.

@jsboige

Contributor

jsboige commented Oct 27, 2018

Hmmm, not sure what to think about the last continuous-integration test failure.

Well, it's not my code for sure, but as I realized myself, one should be careful about putting hard limits in unit tests, especially on time measurements, considering you don't really know what's going to happen on the continuous-integration VM.

Also, for precise time measurements, one should use a stopwatch, so if you don't mind, I'll commit a change to account for that.
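The same advice can be illustrated outside .NET, where `System.Diagnostics.Stopwatch` is the usual tool. The Python sketch below uses `time.perf_counter`, a monotonic high-resolution clock that plays a similar role; the `measure` helper and the 5-second bound are illustrative assumptions, not values taken from the test suite.

```python
import time


def measure(action):
    """Run `action` and return (result, elapsed_seconds) from a monotonic clock."""
    start = time.perf_counter()
    result = action()
    return result, time.perf_counter() - start


result, elapsed = measure(lambda: sum(range(100_000)))

# Assert on the result first, and keep any time bound generous:
# a shared CI machine can be many times slower than a developer laptop.
assert result == 4_999_950_000
assert elapsed < 5.0
```

A wall-clock difference can jump when the system clock is adjusted, whereas a monotonic counter cannot, which is the main reason to prefer it in timing tests.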

@giacomelli

List<T> properties

Please change all public properties that expose List<T> to use IList<T> instead.
Here are some discussions about it:

Sudoku class name

https://docs.microsoft.com/pt-br/dotnet/standard/design-guidelines/names-of-namespaces
DO NOT use the same name for a namespace and a type in that namespace.

Maybe the class could be called SudokuBoard.

Sudoku Parse and ParseFile

It would be good to have the expected string and file formats described somewhere.
Some suggestions:

  • In a help button on GTK sample
  • In the method documentation
  • In a section on the GeneticSharp wiki

Or any other idea you can think about.

Solve issues

Please, solve the issues listed on https://sonarcloud.io/project/issues?branch=MyIntelligenceAgency-develop&id=GeneticSharp&resolved=false

Next steps

Please feel free to ask about any of the sections I pointed out above.

After you finish the changes, I will perform another review.

And finally, I think we will merge this pull request, and afterwards I will start a feature branch to try to answer your questions below:

  • Can you think of any set of parameters that would make the current implementation more efficient?
  • How would you explain that other GA libraries seem able to work with smaller populations and keep making progress over a higher number of generations?

After we get some answers, we can take a look at your other proposed samples.

Let's evolve!

@jsboige

Contributor

jsboige commented Oct 31, 2018

OK I think I covered all your requests.

@giacomelli

Unit tests naming

I forgot to mention this section before, but I need you to follow the naming convention described at https://github.com/giacomelli/GeneticSharp/wiki/unit-tests#naming when naming your unit tests.

Sudoku Parse and ParseFile

I did not find any change for this one. Let me know if I missed something.

It would be good to have the expected string and file formats described somewhere.
Some suggestions:

  • In a help button on GTK sample
  • In the method documentation
  • In a section on the GeneticSharp wiki

Or any other idea you can think about.

Solve issues

There are still 5 remaining issues.

Please, solve the issues listed on https://sonarcloud.io/project/issues?branch=MyIntelligenceAgency-develop&id=GeneticSharp&resolved=false

@jsboige

Contributor

jsboige commented Nov 1, 2018

I did not find any change for this one. Let me know if I missed something.

This is the option that was implemented:

In a help button on GTK sample

OK, now it's your turn! Note that, given the level of scrutiny of your review and of your tools, my expectations are pretty high...

@giacomelli

Almost there!

There is only one issue to be solved: https://sonarcloud.io/project/issues?branch=MyIntelligenceAgency-develop&id=GeneticSharp&open=AWbMRnGTSgN4uCcl1Kdi&resolved=false

You need to cover with unit tests all the lines that are marked in red.

@jsboige

Contributor

jsboige commented Nov 1, 2018

Here you go!

@giacomelli

Owner

giacomelli commented Nov 2, 2018

You still need to improve the code coverage of SudokuBoard (mostly in the ToString method), and there is another issue in MultipleFitness.

https://sonarcloud.io/project/issues?branch=MyIntelligenceAgency-develop&id=GeneticSharp&resolved=false

@jsboige

Contributor

jsboige commented Nov 2, 2018

OK, I updated MultipleFitness, but the other issue seems like a bug in SonarCloud: it points at the exact same lines as before, although I did provide coverage for all of them (exception in the ctor, SetCell, ToString and ParseFile). Look for yourself: the file is 5 lines longer and the lines are now out of sync in SonarCloud. It even asks to cover comments for the ParseFile method, which does not make any sense.

@giacomelli

Owner

giacomelli commented Nov 3, 2018

OK, thanks. I will merge it to develop now.

I will create ASAP a release branch to release it to master.

@giacomelli giacomelli merged commit 0e5e772 into giacomelli:develop Nov 3, 2018

1 check passed

continuous-integration/appveyor/pr AppVeyor build succeeded
@jsboige

Contributor

jsboige commented Nov 3, 2018

OK Cool.

Now here are some additional thoughts to help with your investigation:

  • Because of the way Sudokus work, I expect that a 2-point crossover at the 4th and 7th rows with permutations (or at the 27th and 54th genes with 81-cell chromosomes) might help to properly isolate "boxes", that is, the 3x3 cell groups where digits must differ, in addition to rows and columns.
  • However, local maxima are such that it seems nearly impossible to escape one with random mutations once the entire population has committed to it. Thus it seems key to avoid such a global collapse and to preserve genetic diversity instead. The various texts I ended up adding to the top of the GTK drawing box were meant to assess precisely when the entire population collapses to a single local maximum. They're neither pretty nor helpful otherwise, so feel free to change that part. The collapse to a local maximum usually won't take more than 50 generations.
  • Again, the only efficient way I found to avoid collapsing into a local maximum was to increase the population size according to the search space (a row with 5 given cells in an easy Sudoku allows for 4! = 24 legal permutations, whereas a row with no given cells in a hard Sudoku allows for the complete 9! = 362880 permutations), but that is akin to simultaneously visiting and collapsing to all local maxima in the same small number of generations, which I don't consider very satisfying.
  • Now it might simply be that, most generally, crossover won't work with Sudokus, that is, any mixture of 2 partially resolved Sudokus will result in a much worse one. Then I would suppose that other libraries are better at escaping local maxima by implementing some kind of tabu search not found in GeneticSharp, where they allow some parts of the population to get away from the initial collapsing genome without repeatedly falling into its attractor.
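The cut points and permutation counts mentioned in the thoughts above can be sketched in Python as follows. This is an illustration rather than GeneticSharp code: the function names are hypothetical, and reading the cut positions as zero-based indices 27 and 54 (the first cells of rows 4 and 7) is an assumption.

```python
from math import factorial

# Assumption: zero-based gene indices where rows 4 and 7 start in an 81-cell chromosome.
BAND_CUTS = (27, 54)


def band_crossover(parent_a, parent_b, cuts=BAND_CUTS):
    """Two-point crossover that swaps the middle band (rows 4-6) between two parents."""
    first, second = cuts
    child_a = parent_a[:first] + parent_b[first:second] + parent_a[second:]
    child_b = parent_b[:first] + parent_a[first:second] + parent_b[second:]
    return child_a, child_b


def row_permutations(given_count):
    """Ways to arrange the missing digits of one row that has `given_count` givens."""
    return factorial(9 - given_count)
```

Cutting on these boundaries keeps every 3x3 box whole inside one parent's contribution, so box constraints already satisfied by a parent are not broken by the crossover itself. `row_permutations` ignores column and box constraints, so it is an upper bound on the useful candidates per row.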
@giacomelli

Owner

giacomelli commented Nov 3, 2018

Please start a new question issue referencing this pull request, adding your original questions and the additional thoughts above.

I'll give this issue some attention over the coming weekends.

Thanks!

@jsboige

Contributor

jsboige commented Nov 3, 2018

Please start a new question issue referencing this pull request, adding your original questions and the additional thoughts above.

Done