Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Already on GitHub? Sign in to your account

Fix routing loop when fetching remote media #1992

Merged
merged 4 commits into from Mar 14, 2017

Conversation

Projects
None yet
3 participants
Member

richvdh commented Mar 13, 2017

When we proxy a media request to a remote server, add a query-param, which will tell the remote server to 404 if it doesn't recognise the server_name.

This should fix a routing loop where the server keeps forwarding back to itself (#1991).

Also improves the error handling on remote media fetches, so that we don't
always return a rather obscure 502. (Fixes the server side of
vector-im/riot-web#2585).

Fix routing loop when fetching remote media
When we proxy a media request to a remote server, add a query-param, which will
tell the remote server to 404 if it doesn't recognise the server_name.

This should fix a routing loop where the server keeps forwarding back to
itself.

Also improves the error handling on remote media fetches, so that we don't
always return a rather obscure 502.
synapse/api/errors.py
+ response_code_message (str): HTTP reason phrase. None for the default.
+ """
+ def __init__(self, code):
+ super(CodeMessageException, self).__init__("%d" % code)
@erikjohnston

erikjohnston Mar 13, 2017 edited

Owner

Why have you changed this?

@richvdh

richvdh Mar 13, 2017

Member

Because I wanted to be able to turn an HTTPResponseException into a SynapseError without having my head explode.

Previously, for an HTTPResponseException, msg was being used for the HTTP reason message, and response_code_message was unused. For a SynapseError, msg was used for the matrix error message, and response_code_message was used for the HTTP reason message.

image

I think. The comments were... unclear.

@erikjohnston

erikjohnston Mar 14, 2017

Owner

Right, if this is about sanitising how errors work then I suggest we get rid of response_code_message entirely, as it appears to be unused. I do think that we should be passing more than just the code to the base exception though.

Previously, for an HTTPResponseException, msg was being used for the HTTP reason message, and response_code_message was unused. For a SynapseError, msg was used for the matrix error message, and response_code_message was used for the HTTP reason message.

As far as I see in errors.py response_code_message is only ever set by LimitExceededError?

Member

richvdh commented Mar 13, 2017

retest this please

synapse/api/errors.py
+ response_code_message (str): HTTP reason phrase. None for the default.
+ """
+ def __init__(self, code):
+ super(CodeMessageException, self).__init__("%d" % code)
@erikjohnston

erikjohnston Mar 14, 2017

Owner

Right, if this is about sanitising how errors work then I suggest we get rid of response_code_message entirely, as it appears to be unused. I do think that we should be passing more than just the code to the base exception though.

Previously, for an HTTPResponseException, msg was being used for the HTTP reason message, and response_code_message was unused. For a SynapseError, msg was used for the matrix error message, and response_code_message was used for the HTTP reason message.

As far as I see in errors.py response_code_message is only ever set by LimitExceededError?

def error_dict(self):
return cs_error(
self.msg,
self.errcode,
)
+ @classmethod
+ def from_http_response_exception(cls, err):
+ """Make a SynapseError based on an HTTPResponseException
@erikjohnston

erikjohnston Mar 14, 2017

Owner

Please document how this converts between the two. It's not clear to me how this conversion should take place. Does a 400 get proxied through? A 401? Or does it just allow through well known OK error codes such as 404?

@richvdh

richvdh Mar 14, 2017

Member

Updated.

+ except twisted.internet.error.DNSLookupError as e:
+ logger.warn("HTTP error fetching remote media %s/%s: %r",
+ server_name, media_id, e)
+ raise NotFoundError()
@erikjohnston

erikjohnston Mar 14, 2017

Owner

Is this true? This might be a transient error? Would a 502 be more appropriate?

@richvdh

richvdh Mar 14, 2017

Member

well, it could be a transient error. My thinking was that it was more likely a permanent error: an invalid hostname, for example - in which case a 4xx is a better response, and NotFoundError seemed the closest fit.

Annoyingly I don't think twisted.internet.endpoints.HostnameEndpoint gives us a way of distinguishing the two cases. We're already wrapping HostnameEndpoint with our own implementation, so we could conceivably improve on this - but I thought it better to go with the common case.

+ except HttpResponseException as e:
+ logger.warn("HTTP error fetching remote media %s/%s: %s",
+ server_name, media_id, e.response)
+ raise SynapseError.from_http_response_exception(e)
@erikjohnston

erikjohnston Mar 14, 2017

Owner

I'm not convinced that we want to blindly forward the error from the other side without any sort of annotation. It makes sense to forward e.g. a 404, but what about a 400? A 401? In both cases it would probably be more appropriate to return a 500

@richvdh

richvdh Mar 14, 2017

Member

Hrm. Possibly. I feel like a 401 might well want to end up as a 401 at the client end too, but I see your point. Fixed.

@erikjohnston

erikjohnston Mar 14, 2017

Owner

I wouldn't be surprised if a 401 ended up logging you out :p

- raise SynapseError(502, "Failed to fetch remoted media")
+ logger.warn("Failed to fetch remote media %s/%s",
+ server_name, media_id,
+ exc_info=True)
@erikjohnston

erikjohnston Mar 14, 2017

Owner

Why are we including a stack trace if this is only a warn? Why is this only a warn when this will catch all sorts of python bugs?

@richvdh

richvdh Mar 14, 2017

Member

good questions. fixed.

+ if isinstance(e, SynapseError):
+ raise e
+ else:
+ raise SynapseError(502, "Failed to fetch remote media")
@erikjohnston

erikjohnston Mar 14, 2017

Owner

Why are we not using multiple except clauses here instead of isinstance?

@richvdh

richvdh Mar 14, 2017

Member

merely to avoid c&ping the logging line. fixed.

@lampholder lampholder removed the in progress label Mar 14, 2017

richvdh added some commits Mar 14, 2017

re-refactor exception heirarchy
Give CodeMessageException back its `msg` attribute, and use that to hold the
HTTP status message for HttpResponseException.
Address review comments
- don't blindly proxy all HTTPRequestExceptions
- log unexpected exceptions at error
- avoid `isinstance`
- improve docs on `from_http_response_exception`
Owner

erikjohnston commented Mar 14, 2017

lgtm

Member

richvdh commented Mar 14, 2017

retest this please

Member

richvdh commented Mar 14, 2017

All checks have passed

miracles will never cease

@richvdh richvdh merged commit f2ed64e into develop Mar 14, 2017

8 checks passed

Sytest Dendron (Commit) Build #1682 origin/rav/fix_media_loop succeeded in 13 min
Details
Sytest Dendron (Merged PR) Build finished.
Details
Sytest Postgres (Commit) Build #2499 origin/rav/fix_media_loop succeeded in 8 min 7 sec
Details
Sytest Postgres (Merged PR) Build finished.
Details
Sytest SQLite (Commit) Build #2565 origin/rav/fix_media_loop succeeded in 6 min 19 sec
Details
Sytest SQLite (Merged PR) Build finished.
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details

@richvdh richvdh deleted the rav/fix_media_loop branch Mar 15, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment