Fix routing loop when fetching remote media #1992
lampholder added the "in progress" label Mar 13, 2017
richvdh assigned erikjohnston Mar 13, 2017
richvdh referenced this pull request in vector-im/riot-web Mar 13, 2017: (Open) media download fails with cryptic error if it is larger than max_upload_size #2585
+        response_code_message (str): HTTP reason phrase. None for the default.
+    """
+    def __init__(self, code):
+        super(CodeMessageException, self).__init__("%d" % code)
richvdh (Member), Mar 13, 2017
Because I wanted to be able to turn an HTTPResponseException into a SynapseError without having my head explode.
Previously, for an HTTPResponseException, msg was being used for the HTTP reason message, and response_code_message was unused. For a SynapseError, msg was used for the matrix error message, and response_code_message was used for the HTTP reason message.
I think. The comments were... unclear.
erikjohnston (Owner), Mar 14, 2017
Right, if this is about sanitising how errors work then I suggest we get rid of response_code_message entirely, as it appears to be unused. I do think that we should be passing more than just the code to the base exception though.
> Previously, for an HTTPResponseException, msg was being used for the HTTP reason message, and response_code_message was unused. For a SynapseError, msg was used for the matrix error message, and response_code_message was used for the HTTP reason message.
As far as I see in errors.py response_code_message is only ever set by LimitExceededError?
retest this please
     def error_dict(self):
         return cs_error(
             self.msg,
             self.errcode,
         )
+    @classmethod
+    def from_http_response_exception(cls, err):
+        """Make a SynapseError based on an HTTPResponseException
erikjohnston (Owner), Mar 14, 2017
Please document how this converts between the two. It's not clear to me how this conversion should take place. Does a 400 get proxied through? A 401? Or does it just allow through well known OK error codes such as 404?
+        except twisted.internet.error.DNSLookupError as e:
+            logger.warn("HTTP error fetching remote media %s/%s: %r",
+                        server_name, media_id, e)
+            raise NotFoundError()
erikjohnston (Owner), Mar 14, 2017
Is this true? This might be a transient error? Would a 502 be more appropriate?
richvdh (Member), Mar 14, 2017
Well, it could be a transient error. My thinking was that it was more likely a permanent error (an invalid hostname, for example), in which case a 4xx is a better response, and NotFoundError seemed the closest fit.
Annoyingly I don't think twisted.internet.endpoints.HostnameEndpoint gives us a way of distinguishing the two cases. We're already wrapping HostnameEndpoint with our own implementation, so we could conceivably improve on this - but I thought it better to go with the common case.
+        except HttpResponseException as e:
+            logger.warn("HTTP error fetching remote media %s/%s: %s",
+                        server_name, media_id, e.response)
+            raise SynapseError.from_http_response_exception(e)
erikjohnston (Owner), Mar 14, 2017
I'm not convinced that we want to blindly forward the error from the other side without any sort of annotation. It makes sense to forward e.g. a 404, but what about a 400? A 401? In both cases it would probably be more appropriate to return a 500.
richvdh (Member), Mar 14, 2017
Hrm. Possibly. I feel like a 401 might well want to end up as a 401 at the client end too, but I see your point. Fixed.
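The thread does not record exactly which mapping was adopted after "Fixed", so the following is only a minimal sketch of the policy being discussed: forward a small allow-list of remote status codes and collapse everything else into a generic gateway error. The class definitions are simplified stand-ins, and `FORWARDABLE_CODES` and `map_remote_error` are hypothetical names that do not come from the pull request. The choice of 502 for the fallback is also an assumption (the suggestion above was a 500; 502 is what the pre-existing code returned).

```python
class HttpResponseException(Exception):
    """Stand-in for the error raised when a remote server replies with a failure."""

    def __init__(self, code, msg):
        super(HttpResponseException, self).__init__("%d: %s" % (code, msg))
        self.code = code
        self.msg = msg


class SynapseError(Exception):
    """Stand-in for the error that is eventually returned to the client."""

    def __init__(self, code, msg):
        super(SynapseError, self).__init__("%d: %s" % (code, msg))
        self.code = code
        self.msg = msg


# Assumption: only these remote status codes are meaningful to our own client.
FORWARDABLE_CODES = frozenset([404])


def map_remote_error(err):
    """Decide what to tell the client when a proxied request failed."""
    if err.code in FORWARDABLE_CODES:
        # The remote server's answer applies equally to our client.
        return SynapseError(err.code, err.msg)
    # A remote 400 or 401 describes *our* request to the remote server,
    # not the client's request to us, so it is not passed through.
    return SynapseError(502, "Failed to fetch remote media")
```

Keeping the allow-list in one place makes it straightforward to revisit the 401 case raised above without touching the call sites.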
-            raise SynapseError(502, "Failed to fetch remoted media")
+            logger.warn("Failed to fetch remote media %s/%s",
+                        server_name, media_id,
+                        exc_info=True)
erikjohnston (Owner), Mar 14, 2017
Why are we including a stack trace if this is only a warn? Why is this only a warn when this will catch all sorts of python bugs?
+            if isinstance(e, SynapseError):
+                raise e
+            else:
+                raise SynapseError(502, "Failed to fetch remote media")
erikjohnston (Owner), Mar 14, 2017
Why are we not using multiple except clauses here instead of isinstance?
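For illustration, here is a rough sketch of what the multiple-`except` form could look like. It is not the code that was eventually committed: `SynapseError` is a simplified stand-in, and `fetch_with_error_mapping` and its `fetch` argument are hypothetical names. It also logs the unexpected case with `logger.exception`, which is one possible answer to the earlier question about logging level, not necessarily the one chosen.

```python
import logging

logger = logging.getLogger(__name__)


class SynapseError(Exception):
    """Simplified stand-in for the client-facing error type."""

    def __init__(self, code, msg):
        super(SynapseError, self).__init__("%d: %s" % (code, msg))
        self.code = code
        self.msg = msg


def fetch_with_error_mapping(fetch):
    """Call fetch() and normalise any failure into a SynapseError."""
    try:
        return fetch()
    except SynapseError:
        # Already a client-facing error: let it propagate unchanged.
        raise
    except Exception:
        # Anything else is unexpected (possibly a bug), so log it at
        # error level with the traceback and hide the detail from the client.
        logger.exception("Failed to fetch remote media")
        raise SynapseError(502, "Failed to fetch remote media")
```

Ordering matters here: the more specific `except SynapseError` clause has to come before the catch-all, otherwise it would never be reached.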
lampholder removed the "in progress" label Mar 14, 2017
richvdh added some commits Mar 14, 2017
lgtm
retest this please
miracles will never cease

richvdh commented Mar 13, 2017
When we proxy a media request to a remote server, add a query-param, which will tell the remote server to 404 if it doesn't recognise the server_name.
This should fix a routing loop where the server keeps forwarding back to itself (#1991).
Also improves the error handling on remote media fetches, so that we don't
always return a rather obscure 502. (Fixes the server side of
vector-im/riot-web#2585).
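The loop guard described above can be sketched roughly as follows. This is a toy model, not the Synapse implementation: the parameter name `allow_remote`, the function `handle_media_request`, and the `local_store` / `fetch_remote` arguments are all hypothetical and chosen only for illustration.

```python
class NotFoundError(Exception):
    """Stand-in for a 404 response."""


def handle_media_request(our_server_name, server_name, media_id,
                         allow_remote, local_store, fetch_remote):
    """Serve local media, or proxy to the origin server at most once."""
    if server_name == our_server_name:
        try:
            return local_store[media_id]
        except KeyError:
            raise NotFoundError()

    if not allow_remote:
        # The caller was itself proxying on someone's behalf and we do not
        # recognise server_name: answer 404 instead of forwarding again.
        raise NotFoundError()

    # Proxy to the origin, telling it not to forward any further.
    return fetch_remote(server_name, media_id, allow_remote=False)
```

If a misconfiguration causes requests for some other `server_name` to be routed back to the same server, the second hop arrives with forwarding disabled and gets a 404 rather than being proxied again.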