This repository has been archived by the owner on Jul 2, 2021. It is now read-only.

Show dialog before download #287

Merged
merged 13 commits into from
Jun 23, 2017

Conversation

Hakuyume
Member

@Hakuyume Hakuyume commented Jun 16, 2017

Fix #286

Fetching the size of http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar ...
File will be saved to /home/admin/.chainer/dataset/_dl_cache/92029705a99338d0932803388148a725.
It will be use 1907.00 MiB of the disk space.
Proceed? (y/N): y
Downloading from http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar ...
...

@yuyu2172
Member

request.Request does not accept a method argument in Python 2.

Also, I am against asking users to confirm the download.
The user already understands that a download will happen when necessary.
If an unintended download starts, the user can just press Ctrl-C.
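
One way to work around the Python 2 limitation mentioned above is to override get_method instead of passing a method keyword. The following is a minimal sketch, not the code from this pull request; HeadRequest and the example URL are hypothetical names used only for illustration. Such a request could then be passed to urlopen to read the Content-Length header without downloading the body.

```python
try:
    from urllib.request import Request  # Python 3
except ImportError:
    from urllib2 import Request  # Python 2


class HeadRequest(Request):
    """Hypothetical helper: a request that is always sent as HEAD."""

    def get_method(self):
        # Overriding get_method works on both Python 2 and Python 3,
        # whereas the method= keyword only exists on Python 3.3+.
        return 'HEAD'


# Building the request does not touch the network.
req = HeadRequest('http://example.com/archive.tar')
print(req.get_method())
```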

@@ -51,12 +55,12 @@ def cached_download(url):
str: Path to the downloaded file.

"""
cache_root = get_dataset_directory('_dl_cache')
cache_root = os.path.join(get_dataset_root(), '_dl_cache')
Member

Why is this change necessary?

Member

Oh, the try/except that follows was not necessary.

Member Author

@Hakuyume Hakuyume Jun 17, 2017

Why is this change necessary?

I followed the discussion in chainer/chainer#2839 (comment). This change is not directly related to this issue.

Member

I see.

@Hakuyume
Member Author

Hakuyume commented Jun 17, 2017

Also, I am against asking users to confirm the download.
The user already understands that a download will happen when necessary.
If an unintended download starts, the user can just press Ctrl-C.

We can divide this problem into two parts.

  1. cached_download does not report the size of the file or the location where it will be saved. I guess this is the issue reported in "Notify user about total dataset size and download location" (#286). I see no downside to showing this information.
  2. The download process starts without confirmation. As you suggested, users can stop it with Ctrl-C. However, I feel it is more user friendly if it asks explicitly. I'm not sure everyone feels the same way.
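
The first point (reporting size and location up front) can be sketched as below. This is only an illustration under assumptions: describe_download is a hypothetical helper, the cache file name is assumed to be the MD5 hex digest of the URL (which the 32-character names in the messages in this thread suggest but do not confirm), and the size is assumed to have been obtained beforehand, for example from a Content-Length header.

```python
import hashlib
import os


def describe_download(url, cache_root, size_in_bytes):
    """Return the cache path and a message announcing the download."""
    # Assumption: the cache file name is the MD5 hex digest of the URL.
    cache_path = os.path.join(
        cache_root, hashlib.md5(url.encode('utf-8')).hexdigest())
    message = 'Downloading {} to {} ({:.2f} MiB)'.format(
        url, cache_path, size_in_bytes / float(1 << 20))
    return cache_path, message


path, message = describe_download(
    'http://example.com/archive.tar', '/tmp/_dl_cache', 1907 * (1 << 20))
print(message)
```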

@yuyu2172
Member

However, I feel it is more user friendly if it asks explicitly

I find it irritating to find my process hang because of this feature.
This is especially the case because this behavior is different from Chainer's download.

@Hakuyume
Member Author

I find it irritating to find my process hang because of this feature.
This is especially the case because this behavior is different from Chainer's download.

OK, I removed the confirmation step and changed the progress message as follows.

Downloading from http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar to ***/.chainer/dataset/_dl_cache/92029705a99338d0932803388148a725 ...
... 0 %, 0.16 MiB / 1907.00 MiB, 195.06 KiB/s, 0.8 seconds passed

@yuyu2172
Member

OK. Thanks.
Perhaps, we can add the remaining time.

@Hakuyume
Member Author

Hakuyume commented Jun 17, 2017

Perhaps, we can add the remaining time.

I added it.

Downloading from http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar to ***/.chainer/dataset/_dl_cache/92029705a99338d0932803388148a725 ...
... 9 %, 172.07 MiB / 1907.00 MiB, 1145.80 KiB/s, 153.8 seconds passed, 1550.5 seconds remain
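
For reference, a line in this format can be derived from three numbers: bytes received, total bytes, and elapsed seconds. The sketch below is an assumption about how such a line might be computed, not the code in this pull request; format_progress is a hypothetical name.

```python
def format_progress(progress_size, total_size, elapsed):
    """Format one progress line; sizes in bytes, elapsed in seconds."""
    # Average speed so far, in bytes per second.
    speed = progress_size / float(elapsed) if elapsed > 0 else float('inf')
    # Remaining time is the remaining bytes divided by the average speed.
    if speed > 0:
        remain = (total_size - progress_size) / speed
    else:
        remain = float('inf')
    percent = 100.0 * progress_size / total_size
    return (
        '... {:.0f} %, {:.2f} MiB / {:.2f} MiB, {:.2f} KiB/s, '
        '{:.1f} seconds passed, {:.1f} seconds remain'.format(
            percent,
            progress_size / float(1 << 20),
            total_size / float(1 << 20),
            speed / 1024.0,
            elapsed,
            remain))


print(format_progress(512 << 20, 1024 << 20, 64.0))
```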

@yuyu2172
Member

yuyu2172 commented Jun 22, 2017

I think the download message is too verbose.

Four suggestions:

  • Decimal places are not necessary (224.1MB/s --> 224MB/s).
  • I don't think the elapsed time is necessary.
  • How about using "eta" instead of "seconds remain"? "eta" stands for estimated time of arrival, and it is the abbreviation used by wget.
  • How about using minutes and seconds instead of only seconds for "eta"?

Here is the code I suggest using from line 32.

    eta = (total_size - progress_size) / speed
    sys.stdout.write(
        '\r... {:.0f}%  {:.0f}MiB/{:.0f}MiB '
        '{:.0f}KiB/s eta {:.0f}m {:.0f}s'
        .format(
            percent, progress_size / (1 << 20), total_size / (1 << 20),
            speed / (1 << 10), eta // 60, eta % 60))

The message looks something like below.

Downloading from https://github.com/alexgkendall/SegNet-Tutorial/archive/master.zip to /home/leus/.chainer/dataset/_dl_cache/318f5142ef9ae74b62553a3d8b8598d7 ...
... 1%  2MiB/178MiB 1818KiB/s eta 1m 39s

@yuyu2172
Member

yuyu2172 commented Jun 22, 2017

This differs from what I said in my last comment, but how about switching eta to hours:minutes:seconds? That can be done with the format string below.
Also, I didn't like how outputs length changes from frame to frame.
What do you think about the new output?

    sys.stdout.write(
        '\r... {:3.0f}%  Total {:7.0f}MiB    Current {:7.0f}MiB '
        '{:8.0f}KiB/s    eta {:5.0f}:{:02.0f}:{:02.0f}'
        .format(
            percent, total_size / (1 << 20), progress_size / (1 << 20),
            # Minutes are taken modulo one hour so they stay below 60.
            speed / (1 << 10), eta // 3600, eta % 3600 // 60, eta % 60))
...   2%  Total     178MiB    Current       3MiB      570KiB/s    eta     0:05:15
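
Splitting a duration into hours, minutes, and seconds can also be sketched with divmod, which keeps the minutes field below 60 for downloads longer than an hour. format_eta is a hypothetical helper name, not part of this pull request.

```python
def format_eta(seconds):
    """Format a duration in seconds as H:MM:SS."""
    # Truncate to whole seconds first so no field can round up to 60.
    minutes, secs = divmod(int(seconds), 60)
    hours, minutes = divmod(minutes, 60)
    return '{:d}:{:02d}:{:02d}'.format(hours, minutes, secs)


print(format_eta(315))
print(format_eta(3725))
```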

Edit:

I tweaked more, and I found that it looks nice to report only percentage, speed and eta.

from __future__ import division  # true division on Python 2 as well

import sys
import time


def _reporthook(count, block_size, total_size):
    global start_time
    if count == 0:
        # First call: remember the start time and print the total size once.
        start_time = time.time()
        print('Total size  {:.0f}MiB'.format(total_size / (1 << 20)))
        return
    duration = time.time() - start_time
    # The last block may overshoot, so cap the received size at the total.
    progress_size = min(count * block_size, total_size)
    try:
        speed = progress_size / duration
    except ZeroDivisionError:
        speed = float('inf')
    percent = progress_size / total_size * 100

    eta = int((total_size - progress_size) / speed)
    sys.stdout.write(
        '\r... {:3.0f}%  '
        '{:8.0f}KiB/s   {:5d}h {:2d}m {:2d}s left'
        .format(
            percent, speed / (1 << 10),
            # Minutes are taken modulo one hour so they stay below 60.
            eta // 3600, eta % 3600 // 60, eta % 60))
    sys.stdout.flush()


The message:

Downloading from https://github.com/alexgkendall/SegNet-Tutorial/archive/master.zip to /home/leus/.chainer/dataset/_dl_cache/318f5142ef9ae74b62553a3d8b8598d7 ...
Total size  178MiB
...   6%      3127KiB/s       0h  0m 55s left

@Hakuyume
Member Author

I tweaked more, and I found that it looks nice to report only percentage, speed and eta.

As I understand it, the original issue (#286) is that users cannot know the total dataset size. I think it is better to show both the total size and the downloaded size.

@Hakuyume
Member Author

Also, I didn't like how outputs length changes from frame to frame.

I don't like it either. I accept your modification regarding the length.

@Hakuyume
Member Author

@yuyu2172 How about this output?

Downloading ...
From: http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
To: ***/.chainer/dataset/_dl_cache/c320d1dc0cd031efc99e256bf21d57a6
  %   Total    Recv       Speed  Time left
  1  430MiB    5MiB   1634KiB/s    0:04:26

This format is inspired by the curl command.
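
A fixed-width table like the one above can be sketched with right-aligned format fields, so that each row has the same length as the header. This is a minimal illustration, not necessarily the merged implementation; HEADER and format_row are hypothetical names, and the column widths are chosen only to match the sample.

```python
# Column widths: 3 (%), 8 (Total), 8 (Recv), 12 (Speed), 11 (Time left).
HEADER = '{:>3}{:>8}{:>8}{:>12}{:>11}'.format(
    '%', 'Total', 'Recv', 'Speed', 'Time left')


def format_row(progress_size, total_size, speed, eta):
    """Format one fixed-width row; sizes in bytes, speed in bytes/s."""
    minutes, secs = divmod(int(eta), 60)
    hours, minutes = divmod(minutes, 60)
    time_left = '{:d}:{:02d}:{:02d}'.format(hours, minutes, secs)
    # Each numeric field plus its unit suffix fills one header column.
    return '{:3.0f}{:5.0f}MiB{:5.0f}MiB{:7.0f}KiB/s{:>11}'.format(
        100.0 * progress_size / total_size,
        total_size / float(1 << 20),
        progress_size / float(1 << 20),
        speed / 1024.0,
        time_left)


print(HEADER)
print(format_row(5 << 20, 430 << 20, 1634 * 1024, 266))
```

In a real progress hook, the row would typically be written with a leading '\r' and flushed so that it overwrites the previous row in place.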

@yuyu2172
Member

LGTM

@yuyu2172 yuyu2172 merged commit 1f960c6 into chainer:master Jun 23, 2017
@yuyu2172 yuyu2172 added this to the v0.6 milestone Jun 23, 2017
@Hakuyume Hakuyume deleted the show-dialog-before-download branch June 25, 2017 03:23