Created a test for sorting interleaved inputs. #3817

B-rando1 · 2024-06-24T14:11:50Z

I created an idealized example of how I think sorting the input statements should go for interleaved inputs. It takes (1) parsing statements, (2) constraint-checks, and (3) derived values, each with the values they define and/or the values they require (as applicable), and sorts them as described below.

The general heuristic is to greedily add lines, with constraint-checking having highest priority, then derived values, and parsing only when the first two cannot be done. I also chose to sort constraint-checks and derived values based on number of dependencies, so that a < 10 will always take place before a < b, if a and b are both defined.

You can run it by opening it in ghci and calling runTests. I'll post the current output below as well, with some added annotations.

ghci> runTests
Test:     -- First test
a = 1     -- parsing
a > 0     -- constraint-check
c = a / 2 -- derived value
c < 100   -- constraint-check
b = 2     -- parsing
a > 2 * b -- constraint-check

Test:  -- Second test
a = 1  -- parsing
b = 2  -- parsing
b < 30 -- constraint-check
a > b  -- constraint-check - goes after `b < 30` because it involves both `a` and `b`

Contributes to #3781

The general heuristic is to greedily add lines, with constraint-checking having highest priority, then derived values, and parsing only when the first two cannot be done. I also chose to sort constraint-checks and derived values based on number of dependencies, so that `a < 10` will always take place before `a < b`, if `a` and `b` are both defined.

B-rando1 · 2024-07-02T13:47:21Z

It will be a few days at least before I get back to working on Choices, but I will need a bit more direction as to what algorithm I should use when I get there.

The first file I added was my idea of how it should go, based on getExecOrder. It is perhaps a bit overcomplicated, but I personally really like the results. It uses enough heuristics that the code it generates is fairly close to what a human would write.

@balacij asked if I could try simplifying the algorithm by putting all of the expressions in one data structure. I wasn't sure how this would work, but I created the second file as a test. The second algorithm is simpler, but because it is straight topological sort based on dependencies, it is pretty haphazard with its ordering. It only cares about producing code that will run, not whether that code is interleaved.

Are the more complex data structures in the first example unwarranted? I've been trying to think how we could modify the second algorithm to produce code as idiomatic as that produced by the first, but I don't think we can without being able to distinguish between types of statements (parsing, constraint-checks, derived inputs).

JacquesCarette · 2024-07-03T12:48:15Z

By "straight topological sorting" you probably mean just using the dependency information - which is indeed sufficient to get something that will compile.

I think it is also possible to create a more subtle "less than" relation that takes into account the various kinds of things being expressed, which is what you did in a more ad hoc way in the longer code. So think of the "kind of data" you have (I think you split things into 4 kinds?) as itself a piece of information on which you want to key your ordering.

I created a new datatype to store the kind of statement, and used it and the number of dependencies to naively sort the statements by the heuristics before running topological sort. Once that is done, we can just repeatedly choose the first element that has all of its dependencies met. This should actually produce slightly better results (though they're the same for these tests), and is hopefully a bit more concise and readable.

B-rando1 · 2024-07-03T19:13:13Z

Thanks for your guidance @JacquesCarette. It took me a bit to understand what you were getting at, but I think I do now. I updated sampleInterleaved.hs, and it should be much more concise now, and likely faster as well.

The main change I made was to naively sort the inputs based on the heuristics first, and then start topological sort. This allows the topological sort to focus only on resolving dependencies rather than the double-duty it previously had of resolving dependencies and applying heuristics.

Is this more in line with how it should look? Are there other changes I should make? Let me know what you think 🙂.

JacquesCarette · 2024-07-04T21:06:55Z

I was originally thinking of doing both of those things at the same time - but there is no need to do that. Your 2-pass approach is simpler.

B-rando1 requested review from smiths and balacij June 24, 2024 14:11

B-rando1 requested a review from JacquesCarette as a code owner June 24, 2024 14:11

B-rando1 marked this pull request as draft June 24, 2024 18:38

Added topological sort version, and simplified the types a bit.

cde9c3b

JacquesCarette previously approved these changes Jun 29, 2024

View reviewed changes

B-rando1 marked this pull request as ready for review July 2, 2024 13:14

B-rando1 dismissed JacquesCarette’s stale review via 467fd1e July 3, 2024 19:08

B-rando1 mentioned this pull request Jul 5, 2024

Drasil Research Group Meeting, Monday July 8, 3pm, ITB/225 #3834

Closed

JacquesCarette approved these changes Jul 7, 2024

View reviewed changes

JacquesCarette merged commit 7370fe5 into main Jul 7, 2024

JacquesCarette deleted the SampleSorting branch July 7, 2024 06:44

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Created a test for sorting interleaved inputs. #3817

Created a test for sorting interleaved inputs. #3817

B-rando1 commented Jun 24, 2024 •

edited

Loading

B-rando1 commented Jul 2, 2024

JacquesCarette commented Jul 3, 2024

B-rando1 commented Jul 3, 2024

JacquesCarette commented Jul 4, 2024

Created a test for sorting interleaved inputs. #3817

Created a test for sorting interleaved inputs. #3817

Conversation

B-rando1 commented Jun 24, 2024 • edited Loading

B-rando1 commented Jul 2, 2024

JacquesCarette commented Jul 3, 2024

B-rando1 commented Jul 3, 2024

JacquesCarette commented Jul 4, 2024

B-rando1 commented Jun 24, 2024 •

edited

Loading