Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor komodo_state::NPOINTS #481

Merged
merged 37 commits into from
Feb 20, 2022
Merged

Conversation

jmjatlanta
Copy link

Part of #479

The NPOINTS array required linear searches and never released memory. This PR make NPOINTS a boost::multi_index collection which solves those two issues. In addition, access to the collection is done through member methods of the komodo_state object.

@jmjatlanta
Copy link
Author

Note: The method implementations should be moved to komodo_structs.cpp if #476 gets merged in before this one.

@jmjatlanta jmjatlanta marked this pull request as ready for review September 13, 2021 17:02
@ca333 ca333 requested review from DeckerSU and dimxy January 11, 2022 10:34
@dimxy
Copy link
Collaborator

dimxy commented Jan 18, 2022

now the TestEvents.komodo_faststateinit_test seems to work okay

@DeckerSU
Copy link

DeckerSU commented Jan 25, 2022

@jmjatlanta From one side, i highly appreciate your efforts and work on komodod core refactoring, also i like the idea with using boost::multi_index container for NPOINTS storage, instead of just an in-memory array. But from other side i worry every time when changes touches notary / notarisations code, bcz it's very important part and we can't allow any, even small errors in it. For review this PR i prepared a kind of separate "test environment" to ensure that new refactored code is fully correct. I transferred old code (namespace old_space) and new code (namespace new_space) in separate program and tried to ensure that both produce the same results on live data extracted directly from daemon. But in my very small series of tests it turns out that the logic is broken somewhere:

Size: 8043 (8043, 8043)
[old] komodo_npptr_for_height(2524224): (null)
[new] komodo_npptr(2524224): (null) - [ PASS ]

[old] komodo_npptr_for_height(2524223): (null)
[new] komodo_npptr(2524223): (null) - [ PASS ]

[old] komodo_npptr_for_height(2524210): {"0x0b6154c662432d4a6ba9ba9ee72388b576da4069d5e5aa0f33f0a8952395c895", "0xa5fa719ad82e881342c9bae94c85e3d85cef2808e8bb3cad4d8b37a62afc9624", "0x0", "0x0", 2524223, 2524210, 1, 0, 0, 0, 0}
[new] komodo_npptr(2524210): {"0x0b6154c662432d4a6ba9ba9ee72388b576da4069d5e5aa0f33f0a8952395c895", "0xa5fa719ad82e881342c9bae94c85e3d85cef2808e8bb3cad4d8b37a62afc9624", "0x0", "0x0", 2524223, 2524210, 1, 0, 0, 0, 0} - [ PASS ]

[old] komodo_npptr_for_height(2524209): (null)
[new] komodo_npptr(2524209): (null) - [ PASS ]

[old] komodo_npptr_for_height(2524200): (null)
[new] komodo_npptr(2524200): (null) - [ PASS ]

[old] komodo_npptr_for_height(2524190): {"0x0babd5b8bacec1776f1bf97ea2fefff96e3ca0785def3904a858f3c196569fd3", "0x98207abc680d96d791af3184111da76428cbebd54e2a02fad26f92a872b2c7f0", "0x0", "0x0", 2524204, 2524190, 20, 0, 0, 0, 0}
[new] komodo_npptr(2524190): {"0x0babd5b8bacec1776f1bf97ea2fefff96e3ca0785def3904a858f3c196569fd3", "0x98207abc680d96d791af3184111da76428cbebd54e2a02fad26f92a872b2c7f0", "0x0", "0x0", 2524204, 2524190, 20, 0, 0, 0, 0} - [ PASS ]

[old] komodo_npptr_for_height(2524171): {"0x0babd5b8bacec1776f1bf97ea2fefff96e3ca0785def3904a858f3c196569fd3", "0x98207abc680d96d791af3184111da76428cbebd54e2a02fad26f92a872b2c7f0", "0x0", "0x0", 2524204, 2524190, 20, 0, 0, 0, 0}
[new] komodo_npptr(2524171): (null) - [ FAIL ]

[old] komodo_npptr_for_height(2524170): (null)
[new] komodo_npptr(2524170): (null) - [ PASS ]

[old] komodo_notarizeddata(2524224) = 2524210
[new] komodo_notarizeddata(2524224) = 2524210 - [ PASS ]
[old] komodo_notarizeddata(2524223) = 2524200
[new] komodo_notarizeddata(2524223) = 0 - [ FAIL ]
[old] komodo_notarizeddata(2524210) = 2524190
[new] komodo_notarizeddata(2524210) = 2524190 - [ PASS ]
[old] komodo_notarizeddata(2524209) = 2524190
[new] komodo_notarizeddata(2524209) = 2524190 - [ PASS ]
[old] komodo_notarizeddata(2524200) = 2524180
[new] komodo_notarizeddata(2524200) = 2524180 - [ PASS ]
[old] komodo_notarizeddata(2524192) = 2524170
[new] komodo_notarizeddata(2524192) = 0 - [ FAIL ]
[old] komodo_notarizeddata(2524190) = 2524170
[new] komodo_notarizeddata(2524190) = 2524170 - [ PASS ]
[old] komodo_notarizeddata(2524171) = 2524150
[new] komodo_notarizeddata(2524171) = 2524150 - [ PASS ]
[old] komodo_notarizeddata(2524170) = 2524150
[new] komodo_notarizeddata(2524170) = 2524150 - [ PASS ]
[old] komodo_notarizeddata(2441342) = 2441320
[new] komodo_notarizeddata(2441342) = 0 - [ FAIL ]

I mean komodo_npptr and komodo_notarizeddata new implementation gives wrong results (marked with [ FAIL ]) in some cases, different from old, original implementation. And these are the only tests i was able to create, i will no wonder if other hidden errors which is not covered by these tests could exists.

Thinking this way in trying to save original logic seems we have 3 options:

  1. Don't touch the legacy code at all. I don't like this idea, bcz some code indeed needs refactoring, all of these arrays usage instead of STL containers, memsets and strycpys looks ugly. From other side these code is tested in production and at least it works.
  2. Completely rewrite legacy code from scratch. This approach is preferable in compare with ours "point surgeries" / "in-place changes", like changing malloc/realloc dynamically created array to STL's vector, but from other side it will require deep understanding of entire idea, kind of "global vision / plan" of all of that processes in head. I mean, whole and detail plan of how notarisations and other things works.
  3. "Compromise mode" in which we will continue "point surgeries" / "in-place changes" for the places which is not related to global / important things like notarisations, and learn the code in parallel to implement the (2) some day.

My test program for #481 is in https://github.com/DeckerSU/events-refactor-test in fixtry branch. Sorry for the possible inconveniences, i know that more convenient will be to implement it as a unit test in Google Test Framework, but from other side - compiling small separate program and test some issues in it could be bit faster. Anyway, it's separate from main codebase for now 😶

@jmjatlanta
Copy link
Author

@DeckerSU Thank you for your deep efforts in testing. And I agree that we are changing code that affects very sensitive areas. I will take what you have done and work to resolve this issue.

Here are my thoughts and opinions on how to proceed with such refactoring:

  1. Break large methods into smaller methods. It is easier to see what should go in and what should come out.
  2. Unit tests will prove it if time is taken to assure edge cases are thought of.
  3. Document when the code is unclear - Make it easy for the next guy (the reviewer as well as the guy 4 years from now) to follow.
  4. System test in testnet. Put the code through its paces. Test the whole system, but concentrating on the changed areas.

In this case I failed at many of these points. Coincidentally last week I began walking through the notary code to gain a better understanding of how it works. I deployed a testnet with the hope to system test notary changes, among other things.

I agree that not refactoring is not an option. Rewriting from scratch is an option, but often impractical. If you would like a short read, check out Migrating Legacy Systems

I'll get started figuring out what I did wrong and report back. Thanks again for your detailed efforts.

@jmjatlanta
Copy link
Author

@DeckerSU I have modified the code your tests were running to fix the indexing problems that you found. You may take a look at https://github.com/DeckerSU/events-refactor-test/compare/fixtry...jmjatlanta:fixtry?expand=1 . I also took your tests and turned them into a gtest in this PR. The new tests take time to run due to the old linear search, but it checks every entry and then some extra at the end for bounds checking. It also takes extra time to compile in debug mode due to variable following.

@jmjatlanta
Copy link
Author

jmjatlanta commented Jan 28, 2022

As per the suggestions of @dimxy and @DeckerSU boost::multi_index was removed in favor of std::vector. This may lead to longer search times compared to an indexed search, but it is thought that the increased complexity is not worth the risk of introducing bugs. In the end, the search times have probably not increased over the original code.

What has changed is that the code uses far less pointers and no malloc/calloc/realloc. In addition, methods that modify the vector have been moved to within the komodo_state object itself.

src/komodo_structs.h Outdated Show resolved Hide resolved
src/komodo_structs.cpp Outdated Show resolved Hide resolved
src/komodo_notary.cpp Outdated Show resolved Hide resolved
src/komodo_notary.cpp Outdated Show resolved Hide resolved
src/komodo_structs.cpp Outdated Show resolved Hide resolved
src/uint256.h Outdated Show resolved Hide resolved
@DeckerSU
Copy link

DeckerSU commented Feb 1, 2022

I almost finished the review of this PR, tested all the changed functions:

  • struct notarized_checkpoint *komodo_npptr_for_height(int32_t height, int *idx)
  • struct notarized_checkpoint *komodo_npptr(int32_t height)
  • int32_t komodo_prevMoMheight()
  • int32_t komodo_notarized_height(int32_t *prevMoMheightp,uint256 *hashp,uint256 *txidp)
  • int32_t komodo_dpowconfs(int32_t txheight,int32_t numconfs)
  • int32_t komodo_MoMdata(int32_t *notarized_htp,uint256 *MoMp,uint256 *kmdtxidp,int32_t height,uint256 *MoMoMp,int32_t *MoMoMoffsetp,int32_t *MoMoMdepthp,int32_t *kmdstartip,int32_t *kmdendip)
  • int32_t komodo_notarizeddata(int32_t nHeight,uint256 *notarized_hashp,uint256 *notarized_desttxidp)
  • void komodo_notarized_update(struct komodo_state *sp,int32_t nHeight,int32_t notarized_height,uint256 notarized_hash,uint256 notarized_desttxid,uint256 MoM,int32_t MoMdepth)

And corresponding changes in methods of komodo_state object. Tests included manual logic check (in compare with original code) and separate auto-tests set for returned values of functions (in compare with old / original implementation) in my own repo. Current version of PR successfully passed all tests. The only thing separating me from approval - i want to merge / transfer all of these changes to Komodo-Qt (KomodoOcean) and test few global things, such as "sync from scratch" with monitoring intermediate values of functions, etc. After this will complete i will approve the PR.

Thank you very much, @jmjatlanta for your work and prompt handling of all changes.

p.s. I have only two small questions remained which i want to discuss:

  1. I saw the removal of komodo_rwccdata functions and all of it's calls. I know that it marked as demo / example, but i don't so much familiar with CC things. So, @dimxy could you confirm that is 100% safe? I'm sure it's safe, but better to have your opinion too.
  2. We have komodo_mutex lock inside komodo_notarized_update, it come from original implementation. It was used to protect NPOINTS array in old impl. and NPOINTS vector in new impl. But new impl. also have komodo_state::AddCheckpoint called from komodo_notarized_update and komodo_state::AddCheckpoint doesn't protected itself. I mean "next guy" could call this method without any protections.

Also, inside the komodo_state::AddCheckpoint we are changing komodo_state::last. and as it called from komodo_notarized_update it also protected by komodo_mutex. But in all other places, like in SetLastNotarizedXXX methods we also changing / write the komodo_state::last members and these changes doesn't protect with mutex.

My own opinion - let's leave it as is for now (for this PR may be), but in future should we try to make these places "more thread-safe"?

DeckerSU added a commit to DeckerSU/KomodoOcean that referenced this pull request Feb 2, 2022
Backport of @jmjatlanta work from KomodoPlatform/komodo#481
excluded tests and binary data (for tests).
@dimxy
Copy link
Collaborator

dimxy commented Feb 3, 2022

saw the removal of komodo_rwccdata functions and all of it's calls. I know that it marked as demo / example, but i don't so much familiar with CC things. So, @dimxy could you confirm that is 100% safe? I'm sure it's safe, but better to have your opinion too.

We never used this function and as I can see currently it does nothing

Copy link

@DeckerSU DeckerSU left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM now, the only thing - i guess we should reduce size of test/notarizationdata.tst to leave there only few data needed for tests executions (some key points, not all values). My offer is leave only data of notarized checkpoints for the heights:

{ 2524224, 2524223, 2524210, 2524209, 2524200, 2524190, 2524171, 2524170 }

and probably +/- 1 for the each of values.

@jmjatlanta
Copy link
Author

jmjatlanta commented Feb 3, 2022

i guess we should reduce size of test/notarizationdata.tst

I was thinking of this this morning. While the size should be reduced, I am thinking of adding a few more combinations. Overlapping depths was the first thing I thought of. I also would like to work through the MoMoM stuff. I have yet to see where that is used and walk through it. IMO a future PR can reduce its size once we believe the edge cases are covered.

@ca333 ca333 merged commit b31954b into KomodoPlatform:dev Feb 20, 2022
DeckerSU added a commit to DeckerSU/KomodoOcean that referenced this pull request Apr 5, 2022
Backport of @jmjatlanta work from KomodoPlatform/komodo#481
excluded tests and binary data (for tests).

Related links:

- 4322dae
- KomodoPlatform/komodo#481
DeckerSU added a commit to DeckerSU/events-refactor-test that referenced this pull request Aug 22, 2022
who-biz pushed a commit to who-biz/komodo that referenced this pull request Mar 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants