Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Load progress percentage sometimes exceeding 100% #8

Closed
wants to merge 16 commits into from

Conversation

Grandro
Copy link
Collaborator

@Grandro Grandro commented Oct 22, 2023

Fixes #6

I couldn't reproduce the issue of the loading progress percentage exceeding 100% after the double change, but I added assert statements to catch if it still happens somehow.
This holds true for the backend (and thus also for the frontend)

I have also declared _curRow to be std::atomic as suggested, in case a GeomCache is used by several threads.

Every thread calls petrimaps::Server::handle individually but I realized one single /loadstatus request is handled by every thread (I only sent out one /loadstatus request, placed a breakpoint where the request is handled in the server and saw every thread pause on that line) and would believe that this is not intended behaviour. In HttpServer::handle getReq(connection) returns the "same" request. What does connection refer to here?

@patrickbr
Copy link
Member

patrickbr commented Oct 24, 2023

Nice, thank you!

Every thread calls petrimaps::Server::handle individually but I realized one single /loadstatus request is handled by every thread (I only sent out one /loadstatus request, placed a breakpoint where the request is handled in the server and saw every thread pause on that line) and would believe that this is not intended behaviour. In HttpServer::handle getReq(connection) returns the "same" request. What does connection refer to here?

Hm, I wasn't able to reproduce this. A single /loadstatus request was only handled by a single thread. The handle()-Method in HttpServer is basically an endless loop and is started by each thread. Each thread then waits in line https://github.com/ad-freiburg/util/blob/af1ced1c82539675b8c9d49c19136224e4af07a9/http/Server.cpp#L109 until a Job is available. Jobs are added in the endless loop here: https://github.com/ad-freiburg/util/blob/af1ced1c82539675b8c9d49c19136224e4af07a9/http/Server.cpp#L342 As soon as a job is available, one of the threads grabs it and handles it. The other threads continue waiting.

The general process is:

  1. Threads are added which all use the handle() method to wait for new jobs on the _jobs queue.
  2. An endless loop calls _jobs.add(socket.wait()). Every time a connection is available, socket.wait() returns a socket file descriptor (this is the connection variable you asked for above), and this socket file descriptor is added to _jobs and exactly one thread waiting _jobs.get() is notified that a new job is available.
  3. For this thread,_jobs.get() now returns the file descriptor. For all other threads, _jobs.get() continues blocking.
  4. The job is immediately removed from the queue, so the same two threads cannot get the same socket file descriptor for handling a connection.
  5. The thread which received the file descriptor handles the connection.

@hannahbast
Copy link
Member

hannahbast commented Jan 12, 2024

@Grandro Thanks for working on this. While you are at it, it would also be good to clarify in the progress bar which step does what. In particular, it should be clarified that the filling of the geomcache is a one-time-thing and does not have to be done for every query. More concretely, the first step could have a message as follows (center-aligned, it doesn't matter that the message is a bit longer, the subtitle can be in a smaller font though):

Filling the geometry cache
This needs to be done only once for each new version of the dataset
and does not have to be repeated for subsequent queries

Similary, when the server has been restarted and the cached geometries are read from disk, there is currently nothing happening with the progress bar at all. There should be together with a message like this:

Reading cached geometries from disk
This needs to be done only once after the server has been started
and does not have to be repeated for subsequent queries

Concerning the 2/2 phase I have two comments:

  1. I don't think that "Parsing geometry IDs" is a good title because the typical user cannot understand what it means. How about "Fetching [number] geometries", where [number] is the number of geometries with thousand separators? Beware to use the singular in case [number] is 1 (even though the message will only be very briefly visible then, but still).

  2. When the result is large, there is still a significant wait after that where nothing happens with the progress bar. I would propose having a third progress bar with an appropriate title (so three phases overall: 1/3 filling the cache or reading it from disk, 2/3 fetching the geometries, 3/3 rendering result or whatever would be a good description for that last phase).

steps that sometimes take time, but where currently no progress bar is shown on the screen. The fist

@Grandro
Copy link
Collaborator Author

Grandro commented Feb 1, 2024

@hannahbast Thanks for your concerns and detailed propositions. I was (and still am) busy with the GeoJSON support I wanted to implement, but I will get back to this as soon as that is done.
I have seen you edited your message but I believe it is incomplete: can I just ignore the last line?

@patrickbr
Copy link
Member

This is now merged in master, I did it manually including some refactorings I have done. Thanks!

@patrickbr patrickbr closed this Jun 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Load progress percentage sometimes exceeds 100%
3 participants