Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Show pending items in queue counter on dashboard #96

Merged
merged 3 commits into from Sep 10, 2014

Conversation

mback2k
Copy link
Contributor

@mback2k mback2k commented Aug 21, 2014

@hannahwhy
Copy link
Member

hannahwhy commented Aug 21, 2014

This looks okay to me.

I'd like to hold off merging it until the hooks make it into a new wpull release. That way, we can change requirements.txt accordingly, and the usual pip install --upgrade bit will get the wpull we need.

One thing: for pipelines that don't have this code, we'll see zero for queued and downloaded items. I don't think this is a big problem: it will eventually be solved as all pipelines are updated, and we can adapt our brains in the meantime.

@mback2k
Copy link
Contributor Author

mback2k commented Aug 21, 2014

@yipdw Okay, sounds reasonable.

@mback2k
Copy link
Contributor Author

mback2k commented Aug 23, 2014

I don't understand why you think they are always being reset to 0. They are reset to 0 after their value has been added to the global counter in redis.
See the following lines:

@ivan
Copy link
Contributor

ivan commented Aug 23, 2014

I mean, why are the counters even there in the first place, instead of just

                self.redis.hincrby(ident, 'items_downloaded', count)

@mback2k
Copy link
Contributor Author

mback2k commented Aug 23, 2014

I created those internal counters just like bytes_outstanding for the global bytes_downloaded counter. Please see update_bytes_downloaded at: https://github.com/mback2k/ArchiveBot/blob/topic/queue_counter/pipeline/archivebot/control.py#L127

I guess the idea is to retry adding the value in case something goes wrong during the redis operation.

@ivan
Copy link
Contributor

ivan commented Aug 23, 2014

Ah, okay. @yipdw what's the point of bytes_outstanding?

@hannahwhy
Copy link
Member

hannahwhy commented Aug 24, 2014

Yeah, the idea behind self.bytes_outstanding was to maintain a running tally that was reset once we had confirmation that the job record had been updated.

I'm not sure it's necessary, though -- losing a record here or there might not be worth the odd code flow. It may not actually fulfill its intended purpose: part of with conn(self) involves checking scripts, which in turn requires an active connection.

@hannahwhy
Copy link
Member

hannahwhy commented Sep 10, 2014

wpull 0.1000 has been released and is being used by ArchiveBot, so I'm merging this.

@hannahwhy hannahwhy merged commit 09a527c into ArchiveTeam:master Sep 10, 2014
hannahwhy added a commit that referenced this pull request Sep 10, 2014
Also show total number of queued and downloaded items as a title
attribute.
@hannahwhy
Copy link
Member

hannahwhy commented Sep 10, 2014

This seems to work pretty well; however, I've noticed that items_downloaded tends to be substantially less (a few dozen) than the sum of all responses. I'll push the dashboard changes and start up a pipeline so we can all see what's going on.

For the sort of coarse-grained "are we there yet" information we're looking for, this isn't a problem; however, from an accounting perspective, it'd be nice for all quantities to match up 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants