57 add documentation #71

vanlankveldthijs · 2024-04-08T08:49:46Z

To resolve issue #57

There are a few remaining issues to resolve.

The pull request has 2 example notebooks: one with and one without storing and reloading the ordered data before performing the timing tests. To be determined which is better. The notebook we keep should be called demo_order_stm.ipynb and the other removed.
It seems like stmat.copy does not have the desired result: when I try to run STM operations on the cloned data, some parts seem to be missing (I think it was the whole .stm part). Solved by loading the whole dataset twice, but that is a bit of an ugly fix.
It seems that when using small chunk sizes, storing to zarr fails. Solved by storing with chunks of size 1000 and rechunking after loading. This is a bit ugly.
Possible issue with storing the ordered data to zarr when this needs to overwrite existing data. I think shutil.rmtree also sometimes gives issues (maybe when there is no directory).
Timing tests using the prerequisite small chunk size (500 in this case) become very slow. Don't know whether we can really fix this though.
Currently running the final timing tests on both notebooks. To be committed when done.
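
On the overwrite issue listed above: `shutil.rmtree` raises `FileNotFoundError` when the target directory does not exist, which may explain the occasional failure. A minimal sketch of a guard using only the standard library; the helper name `clear_store` and the store name `ordered.zarr` are hypothetical and not part of this PR.

```python
import shutil
import tempfile
from pathlib import Path


def clear_store(path):
    """Remove a directory-based store if present; return True if something was removed."""
    path = Path(path)
    if not path.is_dir():
        # Nothing to remove: calling rmtree here would raise FileNotFoundError.
        return False
    shutil.rmtree(path)
    return True


# Demonstration in a throwaway directory (hypothetical store name).
with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp) / "ordered.zarr"
    first = clear_store(store)            # no directory yet
    store.mkdir()
    (store / "chunk0").write_text("data")
    second = clear_store(store)           # directory existed and is removed
    still_there = store.exists()
```

If the store is written through xarray, `to_zarr(..., mode="w")` overwrites an existing store, so the manual removal might not be needed at all.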

rogerkuou · 2024-04-09T11:38:22Z

Thanks @vanlankveldthijs for the implementation. Following our discussion, the next steps can be:

Demonstrate the performance enhancement from re-ordering in a separate md page.
Elaborate on the factors affecting the re-ordering effect: 1) homogeneity of the point distribution; 2) spatial spread of the points. These two factors can influence the choice of chunk size and the spatial closeness of points in the same chunk.
Only do the performance comparison on subset, which should be enough.
Use %%timeit for robust profiling.
Still keep an example notebook to illustrate the re-order operation (you can decide to either include this in the existing notebooks or make a new one).
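
For reference, `%%timeit` is an IPython cell magic; outside a notebook the standard-library `timeit` module gives a comparable repeated measurement. A minimal sketch, where `subset_workload` is a hypothetical placeholder for the operation under test rather than the actual STM subset call:

```python
import timeit


def subset_workload():
    # Placeholder for the operation under test (e.g. a subset on the STM).
    return sum(i * i for i in range(1000))


# Run the workload 50 times per measurement and repeat the measurement 5 times.
runs = timeit.repeat(subset_workload, number=50, repeat=5)

# Per-call time in seconds; the minimum is the value least disturbed by other load.
best_per_call = min(runs) / 50
```

Taking the minimum over the repeats, rather than the mean, is the usual way to reduce noise from other processes on the machine.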

Attaching here two example notebooks I made for comparison: notebooks.zip

rogerkuou

Hi @vanlankveldthijs, I also reviewed the other changes you implemented. In general, they are good. Just some minor comments:

I suggested a change on .gitignore;
The build workflows are failing because the unit tests are failing.
The ruff workflow failed because of a linting problem. Sarah has already fixed this in Add a function to enrich STM using data from another dataset #66. You do not need to worry about this.

vanlankveldthijs · 2024-04-12T14:22:21Z

Consolidated demo notebooks into one.

Fixed stm.copy issue (should've been stm.copy())
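
A short illustration of why the missing parentheses mattered. This is a generic Python sketch in which a plain dict stands in for the dataset (an assumption made to keep the example self-contained): without parentheses the name refers to the method itself, not to a copy.

```python
data = {"space": [0, 1, 2]}

bound_method = data.copy      # no parentheses: a reference to the method itself
actual_copy = data.copy()     # parentheses: a new dict with the same content

is_callable = callable(bound_method)
has_content = isinstance(actual_copy, dict) and actual_copy == data
is_distinct = actual_copy is not data
```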

Changed demo notebook to only test subset operation and to test this on large and small chunks (relative to dataset) to see impact.
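
To make the chunk-size effect concrete, here is a toy sketch in plain Python that counts how many chunks a spatial subset has to touch, with and without a spatial ordering. The ordering used here (sorting by coarse grid cell) is a stand-in chosen for illustration and is not the ordering implemented in this repository; the point count, box, and chunk sizes are likewise arbitrary assumptions.

```python
import random


def chunks_touched(points, box, chunk_size):
    """Count chunks (consecutive runs of chunk_size points) holding a point inside box."""
    xmin, ymin, xmax, ymax = box
    touched = set()
    for index, (x, y) in enumerate(points):
        if xmin <= x <= xmax and ymin <= y <= ymax:
            touched.add(index // chunk_size)
    return len(touched)


random.seed(0)
points = [(random.random(), random.random()) for _ in range(10_000)]

# Stand-in spatial ordering: sort by coarse grid cell (20 x 20 cells).
# This is NOT the ordering used by the library, only an illustration.
ordered = sorted(points, key=lambda p: (int(p[0] * 20), int(p[1] * 20)))

box = (0.40, 0.40, 0.50, 0.50)  # spatial subset covering about 1% of the area
results = {}
for chunk_size in (100, 2500):
    before = chunks_touched(points, box, chunk_size)
    after = chunks_touched(ordered, box, chunk_size)
    results[chunk_size] = (before, after)
    print(f"chunk size {chunk_size}: {before} chunks unordered, {after} ordered")
```

In this toy setup the subset touches far fewer chunks after ordering when chunks are small relative to the dataset. With chunks that are large relative to the dataset there are only a few chunks to begin with, so the possible gain is limited.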

vanlankveldthijs · 2024-04-12T14:23:22Z

Fixed pytests by checking for existence of time in chunksizes.

Please check whether implementation could be nicer (better python).
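
One possible shape for such a check, as a sketch only: plain dicts stand in for a dataset's `chunksizes` mapping (assumed shape: dimension name to tuple of chunk lengths), the helper name `first_chunk_sizes` is hypothetical, and the `space` dimension name is an assumption. The actual implementation in this PR may differ.

```python
def first_chunk_sizes(chunksizes, dims=("space", "time")):
    """Return the first chunk length per dimension, skipping dimensions not present."""
    return {dim: chunksizes[dim][0] for dim in dims if dim in chunksizes}


# Plain dicts stand in for a dataset's chunksizes mapping.
with_time = {"space": (500, 500), "time": (10,)}
without_time = {"space": (500, 500)}

sizes_with_time = first_chunk_sizes(with_time)
sizes_without_time = first_chunk_sizes(without_time)
```

Filtering inside the comprehension keeps the membership test in one place, so callers do not need a separate branch for datasets without a time dimension.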

rogerkuou · 2024-05-06T12:42:05Z

Hi @vanlankveldthijs, thanks for the nice work and sorry for taking so long to review. I quite like the new notebooks. I think it is good for merging now.

As I have mentioned in the previous comment, the ruff workflow should have been fixed by Sarah in #66.

Two warnings in the notebook are documented in #74 and #73. They are not relevant for this PR and can be fixed later.

thijsvl added 9 commits March 26, 2024 13:52

Added a few lines to docs contributing file to describe how to set up and test mkdocs documentation. (3690fe8)

Replace double quotes by single quotes in operation documentation python code. (012a3b8)

Renamed documentation example notebook to demo_operations_stm.ipynb (70c4322)

Initial documentation for ordering STM (0f90c6c)
Note that the example notebook is still a direct copy of the operations example notebook.

Minor adjustments to operations example notebook, to get it in line with the ordering notebook to be added. (e3a63c6)

Always restore chunk sizes after reordering. (c6d5244)

Ignoring additional ordered demo data. (2e31f28)

Initial documentation and example notebooks for reordering. (7ec42bd)
There are two example notebooks: with or without storing and reloading the reordered STM before performing timing tests. To be determined which should be used.

Example notebooks timing tests updated. (e99413c)

rogerkuou requested changes Apr 9, 2024

.gitignore (outdated, resolved)

vanlankveldthijs and others added 5 commits April 11, 2024 14:48

Update .gitignore to not ignore example notebook datasets. (17f393e)
Co-authored-by: Ou Ku <o.ku@esciencecenter.nl>

Changed ordering demo notebook to only apply the subset operation and to show the effect on large and small chunks. (f1b5355)

Forgotten addition with previous commit. (cb028ad)

Merge branch '57_add_documentation' of github.com:MotionbyLearning/stmtools into 57_add_documentation (5ea911b)

Fixed unit tests by checking chunksizes for the existence of 'time' (a83004b)

Minor changes to ordering documentation page. (c2caba2)

rogerkuou approved these changes May 6, 2024

rogerkuou merged commit 72cf071 into main May 6, 2024
14 of 16 checks passed

rogerkuou mentioned this pull request Jun 5, 2024

Doc: Add documentation for get_order and re_order #57

Closed

2 tasks
