Many improvements to RIVET #115

mlesnick · 2018-05-10T15:05:41Z

This pull request encompasses my work on the code, as well as Roy Zhao's and Simon Segert's work.

Here is a summary:

-Roy's work (and my edits to it) introduce the ability to handle multicritical filtrations, including the construction of the degree-Rips bifiltration.

-To improve the structure of the code, Roy introduced the BifiltrationData and FIRep classes. He got rid of the simplex tree representation of a simplicial complex, replacing it with a more straightforward list-of-simplices representation in BifiltrationData. Note that this stores only simplices in dimension j-1, j, and j+1, where j is the homology dimension of interest.

-Roy's work also introduces the firep input file type, which allows for presentations and arbitrary boundary matrices (e.g., cellular boundary matrices) as input.

-Roy's work changes the input format for the "bifiltration" input file. In particular, files of this type now need to include simplices of lower dimension, not just maximal simplices.

-My work replaces the linked list column data structures that used to underlie RIVET with (optimized versions of) PHAT's lazy heaps.

-I added the ability to compute Betti numbers via computation of a minimal presentation of a 2-D persistence module. This replaces the earlier Koszul Homology algorithm for computing Betti numbers as the default, though the older algorithm is still available in the rivet_console via the --koszul flag. The performance of the two algorithms is comparable for computing Betti numbers, but the new approach makes barcode template computation much, much faster, since a minimal presentation (which is usually quite small) can now be given as input to the barcode template computations. The speed of minimal presentation / Betti number computation will continue to improve significantly in the future as we do more optimizations.

-The minimal presentation can be printed via the --minpres flag.

-I have fixed issue #76, which was the last remaining major bug.

-Simon made improvements to the interface which allow us to adjust the bounds of the viewable window. This allows us to compare two different RIVET visualizations on the same scale, and to zoom in on the interesting parts of the output.

-Simon also added the ability to handle descending coordinates gracefully in RIVET. One can now specify in an input file that a coordinate direction is descending by adding "[-]" in front of the axis label, e.g. "[-] density". The prefix "[+]", which specifies an ascending coordinate direction, is optional.

-I have updated the data folder in RIVET to include more examples, including lots of small toy examples that I have accumulated over time.
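
To illustrate the lazy-heap column idea mentioned in the list above, here is a minimal sketch over Z/2 coefficients. It is not RIVET's or PHAT's actual class; the name `LazyHeapColumn` and its methods are hypothetical. The point is only that adding a column is a cheap push of raw indices, and duplicate entries are cancelled in pairs when the pivot is requested.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical sketch of a "lazy heap" column over Z/2 (illustration only).
// Row indices live in a max-heap. Duplicates are allowed to pile up and are
// cancelled in pairs only when the pivot is queried.
class LazyHeapColumn {
public:
    // Add a single row index (mod 2) to this column.
    void add_index(unsigned row)
    {
        heap_.push_back(row);
        std::push_heap(heap_.begin(), heap_.end());
    }

    // Add another column to this one (mod 2) by pushing its raw entries.
    // A copy is taken first so that adding a column to itself is safe.
    void add_column(const LazyHeapColumn& other)
    {
        const std::vector<unsigned> rows = other.heap_;
        for (unsigned row : rows)
            add_index(row);
    }

    // Largest row index with a nonzero coefficient, or -1 for the zero column.
    int pivot()
    {
        while (!heap_.empty()) {
            unsigned top = pop_max();
            if (!heap_.empty() && heap_.front() == top) {
                pop_max(); // an equal pair cancels mod 2
            } else {
                add_index(top); // unmatched entry: restore it and report it
                return static_cast<int>(top);
            }
        }
        return -1;
    }

private:
    unsigned pop_max()
    {
        std::pop_heap(heap_.begin(), heap_.end());
        unsigned top = heap_.back();
        heap_.pop_back();
        return top;
    }

    std::vector<unsigned> heap_;
};
```

Compared with a linked-list column, column addition here never searches for matching entries; the cost of cancellation is deferred to `pivot()`.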

Note that the RIVET documentation is only partially caught up with these improvements and extensions, and in any case ought to be consolidated into a single file, instead of multiple web pages.
We will work on this in the coming weeks.

There is still more to do in many of the areas where this pull request provides improvements, but I believe that we are currently at a nice local optimum, and that this is a good time to incorporate these improvements into RIVET.

…tly on numerous examples. Still, more testing is needed. The edited code is consistently faster than Matthew's code on 1-critical examples, which in turn was faster than Roy's old code. Further optimizations are surely possible, but as far as I know, I've addressed the biggest issues.

A brief summary of improvements / changes since Roy's last commit:

-In Roy's old code, there was a hash table storing (grade,simplex) pairs that was being constructed but not used. I have removed this.

-The old code stored the relevant part of the bifiltration in hash tables (std::unordered_map), as member data of the bifiltration_data object. To build the FI-Reps, the code iterated through the hash table, which is inefficient. Moreover, a hash table representation of the simplices in the high dimension is never needed. In my changes, bifiltration_data stores the simplices as lists. The firep class also builds the hash tables in low and mid dimension, but not in the high dimension. The next bullet gives further motivation for this.

-For Vietoris-Rips bifiltrations, within a given grade, Matthew's code naturally orders simplices lexicographically by vertex label. Experiments have suggested that this ordering works much better for the later computations (e.g., barcode templates) than alternatives. Roy's code also naturally constructed simplices of a given dimension in the lex order on vertices, but by storing them only in a hash table, it then discarded that order. One can sort to recover the order, but this incurs a time cost. The new code directly builds lists of simplices in lex order (without sorting). A stable sort is then used to sort these by grade, so that the lex order is preserved within a grade.

-To build the boundary matrices, we traverse the columns in colex order. In the multicritical case, when building the high boundary matrix, we also look for a bigrade for a boundary simplex in colex order. Given this, it follows that if a (simplex, grade) pair is rejected as we build the boundary matrix because the grade is not compatible, then we never have to consider that pair again. This can save a bit of work.

-The earlier code of both Matthew and Roy accepts as text input for the "bifiltration" format just the maximal simplices in a simplicial complex. I have done away with this convention; I now require that all simplices are given directly. (TODO: check whether these simplices need to be given in filtration order.) We could reintroduce the earlier convention, but this seems awkward in the current framework, and will incur a cost.

-Generally, I tried to avoid copying simplices and other data. There is probably still room for improvement in this regard, and where I have a doubt about efficiency, I have left a TODO comment.
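
The "build in lex order, then stable-sort by grade" step described above can be sketched as follows. This is only an illustration under stated assumptions: the `Grade` and `GradedSimplex` structs and the function name are hypothetical, and the grade comparison (y first, then x) is chosen just to make the example concrete; RIVET's own structs and ordering may differ.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical types for illustration only.
struct Grade {
    unsigned x;
    unsigned y;
};

struct GradedSimplex {
    std::vector<unsigned> vertices; // vertex labels in increasing order
    Grade grade;
};

// Sort simplices by grade. Because std::stable_sort keeps the relative order
// of elements that compare equal, simplices sharing a grade stay in the
// (lexicographic) order in which they were originally listed.
inline void sort_by_grade_preserving_lex(std::vector<GradedSimplex>& simplices)
{
    std::stable_sort(simplices.begin(), simplices.end(),
        [](const GradedSimplex& a, const GradedSimplex& b) {
            // Assumed grade order: compare y first, then x.
            if (a.grade.y != b.grade.y)
                return a.grade.y < b.grade.y;
            return a.grade.x < b.grade.x;
        });
}
```

With a plain `std::sort`, the order within a grade would be unspecified, so the lex order would have to be encoded in the comparison itself, at extra cost per comparison.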

mlesnick · 2018-06-19T08:56:03Z

I think this is because in my fork I changed the distance cutoff to something larger than the max---with 10x10 coarsening, you miss a lot of detail with the larger cutoff.

On Tue, Jun 19, 2018 at 1:01 AM Bryn Keller ***@***.***> wrote:

> @mlesnick @SimonSegert I found something unexpected. When I compare the circle 240 sample file in master vs in this branch (or in Mike's latest before merging master into this PR, now known as 3f78b3e due to history rewriting), I get something very different. I'm not sure if this is because the data file has changed and I shouldn't expect the results to look the same, or if there's something else going on. Both are H0, 10x10 coarsening. Any ideas?
>
> From master: [image: circle_data_240pts_inv_density txt h0_10_10 line_offset_-0 199553_angle_3 814] <https://user-images.githubusercontent.com/1226204/41577634-d900339e-7342-11e8-8951-fdb721077d11.png>
>
> From this PR: [image: circle_data_240pts_codensity txt h0_10_10 line_offset_-0 746484_angle_14 4082] <https://user-images.githubusercontent.com/1226204/41577641-e6658bd8-7342-11e8-9385-e0a89b6d5df0.png>

mlwright84 · 2018-06-20T01:38:28Z

I just pushed two commits that modify VisualizationWindow. Most notably, I moved two text fields (that display the homology dimension and input file name) to below the control elements. This reduces the minimum width of the window, to allow it to fit on screens as narrow as 1000 pixels or so. (Previously, its minimum width was greater than 1280 pixels, the width of a screen I was testing it on.) I also removed the status bar, since we weren't really using it.

mlwright84 · 2018-06-21T20:06:29Z

This pull request looks good to me. I've tested RIVET on several data sets, and it works well. I like the improvements to the interface. Also, rivet_console is much faster than before -- e.g. I did an H_0 calculation on 3600 points in 90 seconds, rather than the few hours it took in the old RIVET.

xoltar

On the whole this is a big improvement! It's a little too big to actually review in any useful way, but I think we should move forward with it since the benefits far outweigh any of the possible downsides. I would appreciate some fixes to meet naming standards as I pointed out in a few places, and please run clang-format as specified in the README.

xoltar · 2018-06-21T20:33:13Z

computation.cpp

-    : params(params)
-    , progress(progress)
-    , verbosity(params.verbosity)
+Computation::Computation(int vrbsty, Progress& progress)


Please use real words, including the vowels.

Hi Bryn, broadly that seems like a good suggestion, but in this case, how do you want it implemented? The natural name for this argument is "verbosity," but that is already taken here.

Parameter names and initializer names resolve as you would hope in initializer lists, so you can actually do verbosity(verbosity) and it does the right thing.
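
A minimal sketch of the naming pattern suggested above. The class below is a hypothetical stand-in, not RIVET's actual `Computation` class (which takes further arguments); it only shows that a constructor parameter may share its name with the member it initializes.

```cpp
// Hypothetical stand-in for the class under review; only the naming pattern matters.
class Computation {
public:
    // In the member initializer list, the name before the parentheses is
    // looked up as a member, while the name inside is looked up in the
    // constructor's scope and so refers to the parameter.
    explicit Computation(int verbosity)
        : verbosity(verbosity)
    {
    }

    int get_verbosity() const { return verbosity; }

private:
    int verbosity;
};
```

Inside the constructor body the parameter shadows the member, so any assignment there would need `this->verbosity`; the initializer list avoids that.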

xoltar · 2018-06-21T20:33:59Z

computation.cpp

@@ -55,26 +56,127 @@ std::unique_ptr<ComputationResult> Computation::compute_raw(ComputationInput& in
        }
    }

-    //STAGE 3: COMPUTE MULTIGRADED BETTI NUMBERS
+    //STAGE 3: COMPUTE MINIMAL PRESENTTION AND MULTIGRADED BETTI NUMBERS


typo in PRESENTATION

xoltar · 2018-06-21T20:34:46Z

computation.cpp

+    // If the --koszul flag is not given, then we compute Betti numbers by
+    //computing a minimal presentation
+    if (!koszul)
+    {


This is enough code that it might make sense to extract a separate method for it.

xoltar · 2018-06-21T20:39:26Z

dcel/arrangement_builder.cpp

+     the MST before finding the path... By changing the output format of Krustal 
+     we should be able to eliminate the first of these.
+     */
+    std::vector<NodeAdjacencyList> adjList(arrangement.faces.size(), NodeAdjacencyList());


should be adj_list

xoltar · 2018-06-21T20:42:14Z

dcel/arrangement_builder.cpp

@@ -610,3 +638,77 @@ void ArrangementBuilder::find_subpath(Arrangement& arrangement,
    }

 } //end find_subpath()
+
+void ArrangementBuilder::treeToDirectedTree(std::vector<NodeAdjacencyList>& adjList, unsigned start, std::vector<std::vector<unsigned>>& children)


Should be tree_to_directed_tree

xoltar · 2018-06-21T20:42:43Z

dcel/arrangement_builder.cpp

+void ArrangementBuilder::treeToDirectedTree(std::vector<NodeAdjacencyList>& adjList, unsigned start, std::vector<std::vector<unsigned>>& children)
+{
+    std::vector<bool> discovered(adjList.size(),false);// c++ vector for keeping track of which nodes have been visited
+    std::vector<unsigned> branchWeight(adjList.size(),0); // this will contain the weight of the edges "hanging" from the node represented by its index in branchWeight


variables use snake_case not camelCase.
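
For illustration, here is how a function of this shape might look with snake_case names throughout. This is a simplified sketch, not the code under review: it assumes `NodeAdjacencyList` is just a vector of neighbor indices, it is written as a free function rather than a member of `ArrangementBuilder`, and it omits the branch-weight bookkeeping that the original performs.

```cpp
#include <vector>

// Assumption: each adjacency list is a plain vector of neighbor indices.
// RIVET's NodeAdjacencyList may carry more information (e.g. edge weights).
using NodeAdjacencyList = std::vector<unsigned>;

// Orient an undirected tree away from `start`: after the call, children[v]
// holds the neighbors of v that were first reached from v (iterative DFS).
inline void tree_to_directed_tree(const std::vector<NodeAdjacencyList>& adj_list,
    unsigned start,
    std::vector<std::vector<unsigned>>& children)
{
    children.assign(adj_list.size(), std::vector<unsigned>());
    if (start >= adj_list.size())
        return;

    std::vector<bool> discovered(adj_list.size(), false); // visited flags
    std::vector<unsigned> node_stack{ start };
    discovered[start] = true;

    while (!node_stack.empty()) {
        unsigned node = node_stack.back();
        node_stack.pop_back();
        for (unsigned neighbor : adj_list[node]) {
            if (!discovered[neighbor]) {
                discovered[neighbor] = true;
                children[node].push_back(neighbor);
                node_stack.push_back(neighbor);
            }
        }
    }
}
```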

xoltar · 2018-06-21T20:47:02Z

math/firep.cpp

+#include <stdexcept>
+
+//constructor; requires verbosity parameter
+FIRep::FIRep(Presentation pres, int vbsty)


Vowels again please.

mlesnick · 2018-06-22T00:48:52Z

@xoltar @mlwright84 thanks for testing and for the input. I will take care of the changes Bryn requested.

@SimonSegert is now finishing a fix to issue #120 that I would like to incorporate before merging.

Also, I discovered a bug in the visualization that arises in the current branch when "fit to window" is deselected. This occurs on my MacBook, but neither on Simon's machine nor on my Linux machine. I'm sure we could track this bug down, but I wonder if it would be better to just remove "fit to window" altogether; this feature has never really been of any use to me.

If we do this, I suppose it would suffice to move "show barcode" to halfway between where it is now and where "fit to window" is. Thoughts?

After we resolve these issues, we should be ready to merge.

mlwright84 · 2018-06-22T02:16:37Z

I'm sorry to hear about the bug involving "fit to window." I think the "fit to window" checkbox is worth preserving in RIVET -- my students and I have occasionally selected and deselected it in our use of RIVET. I think it's OK to merge the pull request, note this "fit to window" issue as a bug, and then fix it later.

mlesnick · 2018-06-22T03:32:34Z

Ok--let's keep the fit to window feature then. And I agree, the fit to window bug need not hold up the merge.

mlesnick · 2018-06-22T21:30:37Z

I did some cosmetic work on the code to address the style issues Bryn mentioned. There still exist some short variable names in the code, though I changed many of these. I suggest we take care of the others later. Most notably, some short member names that Bryn would not like are used in the structs defined in bifiltration_data.h. But I have some changes in progress to bifiltration_data.h to make the code more memory efficient. We can clean up the variable names as part of those changes.

The only two issues that remain on the table are the minor bugs in the visualization mentioned above, (apparently) related to Simon's work. But it doesn't seem to make sense to hold up the PR for these.

xoltar · 2018-06-22T21:34:09Z

Agreed, please feel free to merge when you're ready!

Roy Zhao and others added 30 commits March 29, 2017 15:50

Altered input_manager to accept new BifiltrationData class.

883adf2

Replace SimplexTree with FIRep

6758ef9

Bug fixes

c604968

Updated FIRep and Input Manager to accept new input formats.

318bf6e

Altered Bifiltration-Rips creating code to align with correct reverse…

a63fa0e

…-lexicographic ordering.

Fixed bug where distance 0 was not recorded.

3fbd00a

Updated distance indices wrt previous commit.

637bab9

Fixed Merge Multigrade Code

935adc0

Moved FIRep creation into input_manager.cpp from computation.cpp

c9915dd

Added ability to input free implicit representation directly.

a3c8ab9

Added testing output.

1028f01

Merge remote-tracking branch 'rivet-master/master' into firep

6d582e6

# Conflicts: # interface/input_manager.cpp

Fixed a memory leak

242c103

Print only if verbosity is high enough

9296dc0

Removed hom_dim from FIRep

2bc0362

Allow for user to input firep example in non-lexicographic order.

ff18312

Resolve merge conflicts

400239e

Merge remote-tracking branch 'rivet-master/master' into firep

e623f99

Pass reference, not pointer.

a8d5eab

Sort columns of simplices with equal grades by their boundary vector.

062e984

Vectors sorted by comparing lowest index, then if those are equal, the second lowest and so on.

Relabel x label because degrees are sorted in inverted order

7ac6fb8

Change bifiltration data to use std::unordered_map as opposed to boos…

112c7d6

…t's version.

Change FIRep constructor to completely use this new struct Generator.…

d3e11c9

… Moved it to private and altered the comparison method to decrease the number of row operations done in test examples.

Significantly cleaned up merging multigrades algorithm

ebe82fd

Optimized Roy’s FIRep code. Debugging code still needs to be cleaned up.

1437288

Also, code seems not to be building bifiltrations correctly.

Use the lexicographical ordering on simplices to order columns within…

7763caa

… a bigrade. Speeds up persistence computations a lot, as expected, but slows down't FIRep computation too much. Will continue to optimize.

check point work in progress. far from functional at the moment.

27fb182

Extensive changes Roy's bifiltration_data and fi_rep classes to make …

2361387

…these more efficient. Draft of changes is done and compiles, but is exhibiting some errors.

Cleaned up comments in the code. Needs to be tested a bit more; then …

023f63e

…ready for a pull request.

Make method static to avoid tickling a bug in gcc

9db8f52

mlesnick mentioned this pull request Jun 19, 2018

Make FI Rep Construction More Memory Efficent #122

Open

xoltar and others added 3 commits June 19, 2018 13:05

Small fixes

a497a88

moved text fields in VisualizationWindow to reduce minimum width of w…

cb59692

…indow

removed status bar from VisualizationWindow

d3ed283

mlwright84 and others added 8 commits June 20, 2018 06:32

further adjustments to spacing of control elements in VisualizationWi…

a95aa84

…ndow

minor tweaks to VisualizationWindow

7f06171

address compiler warnings

a37d268

address issue with number fields in gui (issue 120)

63b1dea

cherry-pick number fields fix

improvement to spin box trailing zeros fix

800a153

fix to lines near boundary not showing up in persistence dgm

b718946

Update README.md

435c08a

In readme, change mentions of Ubuntu 16.10 to Ubuntu 18.04. Change minimum boost version to 1.58.

Update README.md

5b8f0e0

xoltar requested changes Jun 21, 2018

Merge pull request #1 from mlesnick/mlesnick-patch-1

2c74a03

Update README.md

mlesnick added 3 commits June 22, 2018 11:02

Restructuring and typo fix suggested by Bryn.

2c2a791

various minor style fixes, following Bryn's suggestions.

23c7e9e

-fixes to the changes made in the last commit,

1219e01

-more style cleanup following Bryn's suggestions, -fixed a misleading indentation warning in firep.cpp.

xoltar approved these changes Jun 22, 2018

mlwright84 approved these changes Jun 22, 2018

ran clang-format script.

4fb89ba

mlesnick merged commit ed4afc9 into rivetTDA:master Jun 22, 2018
