Skip to content

Conversation

@marcin-mazurek
Copy link
Contributor

@marcin-mazurek marcin-mazurek commented Apr 21, 2022

This PR fixes the issue with no progress shown on the loading screen on Windows.

Screenshots

Screenshot 2022-04-22 190920

Testing Checklist


Review Checklist

Basics

  • PR assigned to the PR author(s)
  • input-output-hk/daedalus-dev and input-output-hk/daedalus-qa assigned as PR reviewers
  • If there are UI changes, Alexander Rukin assigned as an additional reviewer
  • All visual regression testing has been reviewed (assign run Chromatic label to PR to trigger the run)
  • PR has appropriate labels (release-vNext, feature/bug/chore, WIP)
  • PR link is added to a Jira ticket, ticket moved to In Review
  • PR is updated to the most recent version of the target branch (and there are no conflicts)
  • PR has a good description that summarizes all changes
  • PR contains screenshots (in case of UI changes)
  • CHANGELOG entry has been added to the top of the appropriate section (Features, Fixes, Chores) and is linked to the correct PR on GitHub
  • There are no missing translations (running yarn manage:translations produces no changes)
  • Text changes are proofread and approved (Jane Wild / Amy Reeve)
  • Japanese text changes are proofread and approved (Junko Oda)
  • Storybook works and no stories are broken (yarn storybook)
  • In case of dependency changes yarn.lock file is updated

Code Quality

  • Important parts of the code are properly commented and documented
  • Code is properly typed with typescript types
  • React components are split-up enough to avoid unnecessary re-renderings
  • Any code that only works in main process is neatly separated from components

Testing

  • New feature/change is covered by acceptance tests
  • New feature/change is manually tested and approved by QA team
  • All existing acceptance tests are still up-to-date
  • New feature/change is covered by Daedalus Testing scenario
  • All existing Daedalus Testing scenarios are still up-to-date

After Review

  • Update Slack QA thread by marking it with a green checkmark

@marcin-mazurek marcin-mazurek force-pushed the fix/ddw-1088-no-progress-on-win-splash-screen branch from 5ab1201 to 761d05a Compare April 21, 2022 21:49
@lucas-barros
Copy link
Contributor

@marcin-mazurek I think the MaxListenersExceededWarning might be due to the polling aspect of watchFile. It is batching many progress lines from the file and sending a lot of events at once which will set many listeners at once, I belive. I added a commit to debounce the send event, which will limit the amount of IPC listeners. Let me know what you think.

const percentage = line.match(/Progress:([\s\d.,]+)%/)?.[1];
const progressType = getProgressType(line);
if (!percentage || !progressType) {
const unparsedProgress = line.match(/Progress:([\s\d.,]+)%/)?.[1];
Copy link
Contributor Author

@marcin-mazurek marcin-mazurek Apr 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've decided to make the variable names more concise as the if in line 70 was getting a bit too big to read it easily.


const progress = Math.floor(parseFloat(unparsedProgress));

if (lastReportedProgressByType[type] !== progress) {
Copy link
Contributor Author

@marcin-mazurek marcin-mazurek Apr 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This helps us ensure we're not sending duplicate events via IPC, and mitigates the risk of getting the MaxListenersExceededWarning, which could previously be observed while trying to send multiple (10+) messages at once.

The way fs.fileWatch works is, it pools for file changes every 5 seconds (by default) and invokes given callback for each line - so if there are 100 new lines added in 5 seconds = 100 callback calls = 100 IPC messages.

// https://github.com/nodejs/node/issues/36888
// https://github.com/lucagrulla/node-tail/issues/137
// https://nodejs.org/dist/latest-v14.x/docs/api/fs.html#fs_caveats
useWatchFile: environment.isWindows,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the actual fix for Windows.

className={makePercentageCellStyles(props[type] === 100)}
>
{Math.floor(props[type])}%
{props[type]}%
Copy link
Contributor Author

@marcin-mazurek marcin-mazurek Apr 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved this to the main process so we can compare against actually displayed values (currently we're getting values as float and then dropping the fractional part in the component itself, which results in unnecessary re-renders)

@marcin-mazurek marcin-mazurek self-assigned this Apr 22, 2022
@gabriela-ponce
Copy link

@marcin-mazurek The behavior is as expected in most scenarios, but I found some inconsistencies:

  • In macOS, the first item never shows the actual progress. It's ok for the rest of the items most of the time (sometimes it also fails) and it also looks ok in production.
  • For Windows, most of the time it's working, but similarly to macOS, the progress is not displayed sometimes for any item. To provide a little more context, this issue in Windows reproduces when I deleted the logs folder to get new logs. After a few tries it would be ok, but as it doesn't reproduce in production, I'm wondering if this can be improved.

@miorsufianiohk
Copy link

Hi @marcin-mazurek ,

For build 21637, on Linux when starting the first time all 3 items were showing 0%. The percentages never went up. But the transition to main screen happened eventually. However, the subsequent attempts did show percentages going up. When retested with production, the production did show percentages going up the first time I ran Daedalus. (ps: I didn't take any video as some of the issues overlap with what @gabriela-ponce found here)

Lucas Araujo and others added 2 commits April 25, 2022 22:23
}

const progress = Math.floor(parseFloat(unparsedProgress));
// In rare cases cardano-node does not log 100%, therefore we need to manually mark the previous step as complete.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This mitigates the risk of having a race condition. In rare cases the log entries can be pushed before we set up the file watch mechanism, so this additional logic ensures we mark previous steps as complete when the next steps starts. This also improves UX as currently sometimes cardano-node doesn't report 100% progress.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the initial goal was to have the UI matching 1:1 the logs so the users could report the exactly progress in case they get stuck. @dmitrii-gaico Do you think this new behavior is fine ?

Copy link
Contributor Author

@marcin-mazurek marcin-mazurek Apr 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively we can read past entries from the file. @szymonmaslowski found out that there is a param in tail that allows us to do that.

But to be fair, I believe it just depends how you interpret logs. Eg. you can interpret the following entry:

�[34m[Marcins-:cardano.node.ChainDB:Info:19]�[0m [2022-04-27 12:30:36.94 UTC] Opened lgr db
�[34m[Marcins-:cardano.node.ChainDB:Info:19]�[0m [2022-04-27 12:30:36.94 UTC] Started initial chain selection

as "replayed block is completed". It just doesn't match the Progress: x%. But makes the UX much better and shows the truth. The steps are sequential, but previous logic would often result in showing the previous step is stuck at eg. 67% and next step is complete, while that's not true.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we 100% the steps are sequential? Often cardano-node would output in a different order depending of the ledger state.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although I observed this behaviour consistently while testing, I might be wrong. Since this implementation is controversial, I'll reimplement this with the approach @szymonmaslowski suggested - using the fromBeginning param from tail.


if (progressReport[type] !== progress) {
progressReport[type] = progress;
getBlockSyncProgressChannel.send(progressReport, mainWindow.webContents);
Copy link
Contributor Author

@marcin-mazurek marcin-mazurek Apr 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of sending report for a specific type, we will send the whole progress report. This resolves the issue where the IPC channel subscription in the renderer is registered late and messages from the main thread are sent into the void in case sync step is very quick.

Question for people more experienced with IPC: is there any way to send a 'sticky' message (messsage which can be delivered to subscriptions registered after the message was sent)? If so, we could just slightly tweak the previous implementation which sends less data... However, I tried searching for it and couldn't find anything.

Copy link
Contributor Author

@marcin-mazurek marcin-mazurek Apr 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know about the electron IPC implementation but the node IPC protocol itself does not support queueing messages. I think sending the entire progressReport object totally makes sense. It is not a concern from the performance perspective.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed sending a snapshot of progressReport , that's smart.

@marcin-mazurek
Copy link
Contributor Author

marcin-mazurek commented Apr 26, 2022

@gabriela-ponce @miorsufianiohk could you please have one more look when you get the chance? Hopefully I addressed the scenarios you've described.

Unfortunately there will be more dev effort - I found another issue which I think we should solve. Namely, we're not stopping the file watch when the sync is complete. In Mac/Linux implementation uses the performant fs.watch API it should be fine, but on Windows, where we use the less performant fs.watchFile, it may cause some performance issues. I'll be addressing that tomorrow.

@dmitrii-gaico dmitrii-gaico self-requested a review April 27, 2022 10:21
Copy link
Contributor

@szymonmaslowski szymonmaslowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job!

tail.on('line', handleNewLogLine);
};

const waitForLogFileToBeCreatedAndWatchLogFile = ({
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const waitForLogFileToBeCreatedAndWatchLogFile = ({
const waitForLogFileToBeCreatedAndWatchIt = ({


if (progressReport[type] !== progress) {
progressReport[type] = progress;
getBlockSyncProgressChannel.send(progressReport, mainWindow.webContents);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know about the electron IPC implementation but the node IPC protocol itself does not support queueing messages. I think sending the entire progressReport object totally makes sense. It is not a concern from the performance perspective.

@dmitrii-gaico
Copy link

@marcin-mazurek @szymonmaslowski Hello! Let's keep this PR as a Fix Only and gather all the "suggestions to improve" in a separate card so we can implement those in near future. Thank you in advance!

…ed previous sync steps as complete when 100% was not seen in logs
@marcin-mazurek
Copy link
Contributor Author

marcin-mazurek commented Apr 27, 2022

@gabriela-ponce @miorsufianiohk @dmitrii-gaico I just pushed another change, which is a result of a discussion we had in the code review. Please use the 21648 build for testing. Sorry for last minute change.

Copy link
Contributor

@danielmain danielmain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clean and well refactor 💪

Great job @marcin-mazurek !

@danielmain danielmain merged commit 97d7693 into develop Apr 28, 2022
@danielmain danielmain deleted the fix/ddw-1088-no-progress-on-win-splash-screen branch April 28, 2022 08:05
@marcin-mazurek
Copy link
Contributor Author

@danielmain kudos to @lucas-barros too as he contributed to this PR!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants