Generate thumbnails of uploaded images #432

Open
pkaske opened this Issue Jan 18, 2016 · 10 comments

Projects

None yet

3 participants

@pkaske
pkaske commented Jan 18, 2016

and load those in the message stream instead of a scaled down version of the full resolution.
If you don't have a quite fast upload speeds it takes like forever to get that stuff from the server.

@timabbott
Member

This is a good idea; if people are sending large images in messages, that can easily be a strong majority of the total data transferred by a Zulip instance.

This is probably not super easy to implement, since there are security issues around thumbnailing images that would need to be handled carefully (unfortunately, image-processing libraries often have security bugs). Unless there's a third-party thumbnailing API service that we could have do the processing to avoid that?

@pkaske
pkaske commented Jan 21, 2016

I fully understand your concerns for security, but I thing a third party thumbnail API is problematic as well. At least for companies in Germany (or EU altogether?). The moment your business data is stored on servers outside the EU you're no longer compliant with data protection laws. At least to my knowing. Thus you wouldn't be allowed to use the thumbnail feature as a German company if your image data is processed by a service outside the EU. Someone please correct me if I'm wrong.
The second thing is, would a company really want their images be send to some third party? I don't belief that.
My suggestion: If a third party API should be used to minimize security risks, then it would probably be a good idea if administrators could configure the API connection them self.

@timabbott
Member

I completely agree that a third-party thumbnailing service would have to be a configuration option so it can be disabled by folks who have more intensive data privacy requirements (whether by law or company security policy). The reason I mentioned that idea is that it could provide a quick path to offering something in this direction for those users who don't have sensitive data (and e.g. get the thumbnailing display code paths working well) without having to do all the work to build a secure thumbnailing system.

@timabbott
Member

http://charlesleifer.com/blog/nginx-a-caching-thumbnailing-reverse-proxying-image-server-/ is relevant and its comments mention some other open source options for this.

@timabbott timabbott added this to the 2016 roadmap milestone Apr 29, 2016
@TigorC
Contributor
TigorC commented Nov 4, 2016 edited

@timabbott
I'm working on first version of this issue

Image uploads, where we can likely borrow reuse our existing code for thumbnailing avatars

and I have questions:
I tried generate thumbnail inside a UploadBackend, how I can set information of thumbnail for a uploaded file? I can return thumbnail url from the upload view but if we plan to use "Nginx thumbnailing" then it won't be work.
Do we need generate Attachment for a thumbnail?

@timabbott
Member

Thinking about this problem in more detail, there's a few ways that Zulip serves image files today:

  • links to images on third-party sites, from inline message previews, which we just want to server a thumbnail to. We currently use camo (https://github.com/atmos/camo) for this purpose to generate HTTPS links. I'm increasingly thinking we should switch to something like thumbor for this purpose, since that would support the existing camo use case (it seems to support a signing key just like camo) and also enable us to do thumbnailing while we're at it. I think it'd be reasonable for you to work on replacing camo with thumbor serving 100px height images for this purpose as the next step.
  • Avatars, which we currently have image resize code for, but which probably would be better off if we converted them to use the thumbor service sitting in front of the original images in S3. There's no urgency to migrate, but it seems like something worth keeping in mind.
  • user-uploaded files, which have the interesting problem that they're not publicly available and thus need the Attachment model to control access to them for security. The way we currently have that setup is basically the URL the user receives points to a Django endpoint, which checks whether the user has a valid session cookie that can access the uploaded file, and if so, serves a redirect to a newly generated short-lived S3 URL pointing to the file.

It would certainly be nice if we can have thumbor handle these as well. The tricky issue is basically how to handle the expiring link issue; one reasonable strategy would be to serve a URL that has thumbor in front of an expiring S3 URL with a short expiry time in the thumbor configuration; I'm pretty sure that would work but am not 100% confident it wouldn't require either patching thumbor or running a second copy of thumbor (since the thumbor expiry configuration seems to not support varying the expiry time depending on the file).

If the thumbor approach doesn't work here, we could handle these the same way we currently resize avatars, and just store the thumbnails in another directory in the S3 uploads bucket from where we store the originals (currently the paths are prefixed with the realm ID, so we could just use "thumb/200x200/<realm_id>/" as the path prefix instead).

Regardless, we don't need Attachment objects for thumbnails, since we can just use the original Attachment object to control access.

Does that make sense?

@TigorC TigorC added a commit to TigorC/zulip that referenced this issue Nov 8, 2016
@TigorC TigorC Generate thumbnails for uploaded images
Fixes: #432
3a19ca1
@TigorC
Contributor
TigorC commented Nov 9, 2016 edited

@timabbott
Regarding solution with thumbor.
For external links we can just use default http loader https://github.com/thumbor/thumbor/wiki/Image-loader#http-loader
For internal links need to create custom image loader https://github.com/thumbor/thumbor/wiki/Image-loader#custom-loaders to verify signatures and url expiration.
Inside serve_file_backend view we'll check permission and generate thumbor link:
on zulip side:

def sign_request(raw, key):
    hashed = hmac.new(key, raw, sha1)
    return hashed.digest().encode("base64").rstrip('\n')

create url

host = getattr(settings, 'THUMBOR_HOST', '')
thumbor_expire = getattr(settings, 'THUMBOR_EXPIRE_DURATION', 60 * 60)
expired = int(time.time() + thumbor_expire)
raw = u'{0}_{1}'.format(file_path, expired)
sign = sign_request(raw, SECRET_KEY)
url = u'{host}/unsafe/{size}/{filters}/{path}'
url += '?expired={expired}&sign={sign}'
return url.format(host=host, path=file_path, expired=expired, sign=sign, size=size, filters=filters)

and check url on thumbor server

def check_url(url):
    data = parse_qs(urlparse(url).query)
    expired = get_param_value(data, 'expired')
    sign = get_param_value(data, 'sign')
    if not expired or not sign:
        return False
    raw = u'{0}_{1}'.format(urlparse(url).path, expired)
    if sign == sign_request(raw, SECRET_KEY):
        if int(expired) > time.time() or int(expired) == 0:
            return True
    return False

For avatars we can set "expires" == 0
What do you think about it?
Also, are you planning to run some sort of CDN in front of thumbor?

@timabbott
Member

That loader strategy sounds good to me!

Should this piece:
if sign == sign_request(raw, SECRET_KEY):
be checking the signature, not re-signing it? I guess maybe it's about the same because of the nature of the system, but that reads wrong to me :).

We're not planning to use a CDN in the near future; it seems like thumbor's own caching should be sufficient for the scale that we expect this to have in the near term.

@TigorC
Contributor
TigorC commented Nov 10, 2016

Should this piece:
if sign == sign_request(raw, SECRET_KEY):
be checking the signature, not re-signing it?

Yes, of course, this is just a check, I'll do another name for this function.
I have some questions:
Should the Thumbor be optional? Do we need a puppet script to set up the Thumbor server?

@timabbott
Member

Yes, we need puppet config to run the thumbor server. From a puppet perspective, we can assume that Zulip's installation process has setup a virtualenv with the Python packages (as long as you modify requirements/ appropriately), so the main thing I think is a supervisord config to run it (and installing any C library dependencies not already in VENV_DEPENDENCIES and scripts/setup/generate_secrets.py code to generate the secret key.

@timabbott timabbott modified the milestone: Zulip roadmap, Old roadmap Nov 18, 2016
@TigorC TigorC added a commit to TigorC/zulip that referenced this issue Nov 21, 2016
@TigorC TigorC Automatic thumbnailing of images with using the Thumbor
This closes #432
6199e60
@TigorC TigorC added a commit to TigorC/zulip that referenced this issue Nov 21, 2016
@TigorC TigorC Automatic thumbnailing of images with using the Thumbor
This closes #432
60c9cf3
@TigorC TigorC added a commit to TigorC/zulip that referenced this issue Nov 22, 2016
@TigorC TigorC Automatic thumbnailing of images with using the Thumbor
This closes #432
b6bf05d
@TigorC TigorC added a commit to TigorC/zulip that referenced this issue Nov 22, 2016
@TigorC TigorC Automatic thumbnailing of images with using the Thumbor
This closes #432
91d8832
@TigorC TigorC added a commit to TigorC/zulip that referenced this issue Nov 22, 2016
@TigorC TigorC Automatic thumbnailing of images with using the Thumbor
This closes #432
edbf8ed
@TigorC TigorC added a commit to TigorC/zulip that referenced this issue Nov 22, 2016
@TigorC TigorC Automatic thumbnailing of images with using the Thumbor
This closes #432
1cc3900
@TigorC TigorC added a commit to TigorC/zulip that referenced this issue Dec 3, 2016
@TigorC TigorC Automatic thumbnailing of images with using the Thumbor
This closes #432
2184fdd
@TigorC TigorC added a commit to TigorC/zulip that referenced this issue Dec 5, 2016
@TigorC TigorC Automatic thumbnailing of images with using the Thumbor
This closes #432
58aad57
@TigorC TigorC added a commit to TigorC/zulip that referenced this issue Dec 13, 2016
@TigorC TigorC Automatic thumbnailing of images with using the Thumbor
This closes #432
7044358
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment