Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding file size to views #3539

Merged
merged 28 commits into from Apr 19, 2018

Conversation

Projects
None yet
4 participants
@ashleytqy
Copy link
Contributor

ashleytqy commented Apr 17, 2018

This PR addresses: #3521

  • Adds file size to files + notebooks, but not for directories (it's undefined for directories)
  • Displays it in the view, with sorting functionality
  • Filesize is humanized (originally in bytes)

@ckilcrease

Sort by ascending

screen shot 2018-04-18 at 1 09 37 pm

@Carreau

This comment has been minimized.

Copy link
Contributor

Carreau commented Apr 18, 2018

Thanks,

I haven't touched this part of the code base in a while so i'll defer to someone else for complete review.

You added looking at filesize in each model when doing a Get request. Was that necessary to get that in directory listing ? Or is the server doing a os.stat on every file when listing a directory ? If is is there might be a perf bottle neck we need to be careful there.

You seem to have 16 commits and a collaborator (good job to both of you for handling Git). Do you wish to reduce your commit number (rebase), to make history cleaner.

From your screenshot I can see that some files are 2B and below, and actually fraction of a Bytes.... Are you sure this is right ? Is there not a factor of 1000 somewhere ? (feel free to use an external library, assuming it is small enough, the license is right... etc). )

I'll do further inline comments.

@@ -814,6 +814,10 @@ definitions:
type: string
description: Last modified timestamp
format: dateTime
size:
type: integer

This comment has been minimized.

@Carreau

Carreau Apr 18, 2018

Contributor

Integer or None ? If size can't be found, probably say that in the description at least.

$("<span/>")
.addClass("file_size")
.addClass("pull-right")
.css("width", "65px")

This comment has been minimized.

@Carreau

Carreau Apr 18, 2018

Contributor

You likely want to avoid inline css.

This comment has been minimized.

@ashleytqy

ashleytqy Apr 18, 2018

Author Contributor

do you know where the css should go? custom.css seems like a user-specific file, and not quite right. from what i've seen, most of the CSS is defined like this:

var add_uploading_button = function (f, item) {
// change buttons, add a progress bar
var uploading_button = item.find('.upload_button').text("Uploading");
uploading_button.off('click'); // Prevent double upload
var progress_bar = $('<span/>')
.addClass('progress-bar')
.css('top', '0')
.css('left', '0')
.css('width', '0')
.css('height', '3px')
.css('border-radius', '0 0 0 0')
.css('display', 'inline-block')
.css('position', 'absolute');

This comment has been minimized.

@Carreau

Carreau Apr 18, 2018

Contributor

Likely in notebook/static/tree/less/tree.less there are case when we do inline css, there might be a reason for progress-bar, maybe not, if not that's likely technical dept we need to fix at some point.

You'll likely need to recompile thee css with python setup.py css after modification of the less file.

@@ -835,6 +866,7 @@ define([
// Add in the date that the file was last modified
item.find(".item_modified").text(utils.format_datetime(model.last_modified));
item.find(".item_modified").attr("title", moment(model.last_modified).format("YYYY-MM-DD HH:mm"));
item.find(".file_size").html(utils.format_filesize(model.size) || "&nbsp;");

This comment has been minimized.

@Carreau

Carreau Apr 18, 2018

Contributor

the use of .html() should be a red flag. From a security standpoint it would not pass an audit. We had security issues before because of HTML usage.

If somewhere where to replace you API to return cannot find <size> for file <file> the you get scrip injection in your page, and your client is compromised.

If you've seen use of .html() in other place of the javascript where .text() should/could be used, please open an issue.

This comment has been minimized.

@ashleytqy

ashleytqy Apr 18, 2018

Author Contributor

re: using .html() – we're using it so that &nbsp; renders because .text() doesn't display an empty string. do you have any suggestions to how to display a &nbsp; without using .html?

it's also used here, which i'm assuming the author also wanted to use the &nbsp; character

$('#counter-select-all').html(checked===0 ? '&nbsp;' : checked);

This comment has been minimized.

@Carreau

Carreau Apr 18, 2018

Contributor

Good question, there are several way to approach that. A simple one would be to use .html() just for &nbsp;:

if (cfoo){
	$(...).text(foo); 
} else {
    $(...).html("&nbsp");
}

That reduce the use of html so it's one less risk.

Thinking about it more, &nbsp; is also just an escape sequence for unicode character Non Breaking Space, which you can enter with your keyboard using some clever keyboard combinaison | | (nbsp), not like | |(space), check the source, on mac it's Alt Space. That's 1) annoying to type 2) annoying to see when reading code. That's not optimal. After thinking a few minutes more you can realize that hopefully most programming language allow you to have escape sequence, knowing that Non-Breaking space is unicode U+00A0 you can go with :

.text(checked===0 ? '\xA0' : checked); 

If you want to even avoid using jQuery you can use the innerText attribute and assign to it.

With your gained knowledge you can even open an issue for $('#counter-select-all').html(checked===0 ? '&nbsp;' : checked); to be fixed.

As an exercise – if you like the fun of hacking – get in a shoes of an attacker and try to modify the server side to not return the size as an integer but a string, and attempt to have a script injection that display a message in the frontend.

This comment has been minimized.

@takluyver

takluyver Apr 18, 2018

Member

Stepping back a bit from unicode discussions: do we actually need a non-breaking space at all? I think the layout should still work if the element has no text.

This comment has been minimized.

@ashleytqy

ashleytqy Apr 18, 2018

Author Contributor

I fixed it with using \xA0 (thanks @Carreau !) . Without text, the span element wrapping it doesn't seem to adhere to the width it's given.

@ashleytqy ashleytqy force-pushed the nyu-ossd-s18:filesize branch from 4c4622f to 2ef8962 Apr 18, 2018

@ashleytqy ashleytqy force-pushed the nyu-ossd-s18:filesize branch from 2ef8962 to acc2056 Apr 18, 2018

@ashleytqy

This comment has been minimized.

Copy link
Contributor Author

ashleytqy commented Apr 18, 2018

@Carreau I tried to install and use pretty-bytes with bower but the library doesn't seem compatible with bower (it threw a module is not recognized error, so i think the library is better suited for an npm install which notebook doesn't use), so i copied over the code with credits if thats okay with you

size:
type: integer
description: "The size of the file or directory."
format: bytes

This comment has been minimized.

@takluyver

takluyver Apr 18, 2018

Member

I think this should just go in the description. Swagger specifies a fixed list of formats for things like dateTime. From the spec's point of view, this is just a number - I don't think it has a way of declaring a unit.

@@ -183,6 +183,14 @@ def is_hidden(self, path):
os_path = self._get_os_path(path=path)
return is_hidden(os_path, self.root_dir)

def _get_file_size(self, path):
try:
size = os.path.getsize(path)

This comment has been minimized.

@takluyver

takluyver Apr 18, 2018

Member

This will do a new stat on the file, which is best avoided for performance reasons. We already get a stat object in _base_model() - perhaps we can use the size from that, and remove it again for directories. It's a bit of an ugly trick, but for a big directory, it could make a difference.

This comment has been minimized.

@ckilcrease

ckilcrease Apr 18, 2018

Contributor

@takluyver thank you -- I've updated so that it uses file size from st_size (retrieved from stat) in _base_model(), but let me know if this isn't what you meant

@@ -1038,7 +1038,34 @@ define([
return (order == 1) ? 1 : -1;
}
};

// source: https://github.com/sindresorhus/pretty-bytes

This comment has been minimized.

@takluyver

takluyver Apr 18, 2018

Member

I'm OK with this since it's one fairly small function, and it's unlikely to change. But we should include a copy of the license in the comment.

https://github.com/sindresorhus/pretty-bytes/blob/master/license

@takluyver

This comment has been minimized.

Copy link
Member

takluyver commented Apr 18, 2018

Looking at the screenshots, I think there should be a bit more whitespace between the last modified time and the size, so it's clear that they're separate fields. Maybe right-align the sizes?

ashleytqy and others added some commits Apr 18, 2018

@takluyver

This comment has been minimized.

Copy link
Member

takluyver commented Apr 18, 2018

Thanks, that was what I meant about using the existing stat info. :-)

I think this is looking good now, though I'm too tired to review it closely. But there's some problem with the JS tests on Travis - can you investigate?

@Carreau

This comment has been minimized.

Copy link
Contributor

Carreau commented Apr 18, 2018

Travis seem happy, Appveyor was not but this seem like an unrelated error. I restarted he build.

@Carreau
Copy link
Contributor

Carreau left a comment

Good job !

try:
# size of file
size = info.st_size
except (ValueError, OSError):

This comment has been minimized.

@takluyver

takluyver Apr 19, 2018

Member

I don't think there can be an error here - if lstat() returns, then we should have a valid stat object from which we can get st_size. But it shouldn't be a problem to be defensive in case I'm wrong.

@@ -814,6 +814,9 @@ definitions:
type: string
description: Last modified timestamp
format: dateTime
size:
type: integer
description: "The size of the file or notebook in bytes. If no size is provided, defaults to None."

This comment has been minimized.

@takluyver

takluyver Apr 19, 2018

Member

Since this is describing a JSON API, it's probably better to refer to null, which is the JSON/Javascript name, rather than None, which is from Python. This isn't worth holding the PR up for, so I'll change it myself and then merge.

@takluyver takluyver merged commit 5875235 into jupyter:master Apr 19, 2018

4 checks passed

codecov/patch 62.5% of diff hit (target 0%)
Details
codecov/project 75% (-0.02%) compared to 8034d71
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@takluyver

This comment has been minimized.

Copy link
Member

takluyver commented Apr 19, 2018

Thanks!

@takluyver takluyver added this to the 5.5 milestone Apr 24, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.