# webserver: make VFS handler more secure #1480

These commits are meant to make the VFS handler of xbmc's webserver more secure. Currently it allows access to any file that xbmc can reach through its VFS, including files like /etc/passwd, because the only check made is whether the file exists.

With these commits every requested VFS path is checked against the list of user-specified sources. If the path does not point to a file within a (sub-)directory of one of those sources, an HTTP 401 (Unauthorized) error is returned. For this to work I had to write a method that takes care of ".." etc. in paths, because otherwise it would be possible to do something like

/path/to/source/../../../etc/passwd


and still access any file. I don't think the implementation is optimal, but I couldn't find an existing method that does the same, so I wrote URIUtils::GetRealPath() (and I also wrote a unit test for it).
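For illustration, here is a minimal standalone sketch of the kind of cleanup described above: take the path apart once, collapse "." and ".." on a vector, then join it back together. It is not the URIUtils::GetRealPath() from this pull request; `normalizePath` and `checkNormalizePath` are hypothetical names, and the sketch assumes '/' is the only separator.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not XBMC code): collapses "." and ".." in a
// '/'-separated path, keeping a leading and a trailing slash if present.
std::string normalizePath(const std::string &path)
{
  std::vector<std::string> kept;
  std::string::size_type start = 0;

  // split on '/' and process one component at a time
  while (start <= path.size())
  {
    std::string::size_type end = path.find('/', start);
    if (end == std::string::npos)
      end = path.size();

    const std::string part = path.substr(start, end - start);
    if (part == "..")
    {
      // go one level up, but never above the first component
      if (!kept.empty())
        kept.pop_back();
    }
    else if (!part.empty() && part != ".")
      kept.push_back(part);

    start = end + 1;
  }

  // put the path back together
  std::string result;
  if (!path.empty() && path[0] == '/')
    result = "/";
  for (std::size_t i = 0; i < kept.size(); ++i)
  {
    if (i > 0)
      result += "/";
    result += kept[i];
  }
  if (path.size() > 1 && path[path.size() - 1] == '/' && !kept.empty())
    result += "/";

  return result;
}

// Usage: the traversal attempt no longer looks like a path below the source.
void checkNormalizePath()
{
  assert(normalizePath("/path/to/source/../../../etc/passwd") == "/etc/passwd");
  assert(normalizePath("/path/to/some/../file") == "/path/to/file");
}
```

Once the ".." components are resolved, a comparison against the source directories sees the path the request actually points to, rather than the path it pretends to start from.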

IMO this is the best way to make the VFS handler more secure (short of ripping it out which I plan to do once there is a better way to access media files through the webserver). If someone adds the root path as a source it's his own fault. And if someone adds a symlink to the root directory (or somewhere else) in one of his source directories it's his own problem as well.
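As a rough sketch of the check itself (assuming both paths have already been cleaned of "." and ".."), the containment test can be a prefix comparison that ends on a directory boundary. `isInDirectory` and `isInAnySource` are hypothetical stand-ins, not the URIUtils::IsInPath() used in the actual commit, and the example source paths are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: true if `path` is `dir` itself or lies below it.
// Both arguments are assumed to be normalised already (no "." or "..").
bool isInDirectory(const std::string &path, const std::string &dir)
{
  if (dir.empty())
    return false;

  // compare against the directory without trailing slashes (root "/" stays)
  std::string base = dir;
  while (base.size() > 1 && base[base.size() - 1] == '/')
    base.erase(base.size() - 1);

  if (path.compare(0, base.size(), base) != 0)
    return false;

  // the match has to end on a component boundary, so that
  // "/media/videos-private" is not treated as part of "/media/videos"
  return base == "/" || path.size() == base.size() || path[base.size()] == '/';
}

// Hypothetical helper: true if `path` lies within any of the given sources.
bool isInAnySource(const std::string &path, const std::vector<std::string> &sources)
{
  for (std::size_t i = 0; i < sources.size(); ++i)
  {
    if (isInDirectory(path, sources[i]))
      return true;
  }
  return false;
}

// Usage with made-up source directories.
void checkIsInAnySource()
{
  std::vector<std::string> sources;
  sources.push_back("/media/videos/");
  sources.push_back("/media/music");

  assert(isInAnySource("/media/videos/film.avi", sources));
  assert(!isInAnySource("/etc/passwd", sources));
}
```

The sketch also shows the caveat mentioned above: with "/" configured as a source, every absolute path passes the check.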

I don't have any archives so I don't really know how these work. Do you have an example? The same problem probably applies to stacked items, but in that case the stacked files could just be accessed separately?

OK I pushed another commit which will handle rar:// and zip:// paths by using GetHostName() on the original path in the VFS handler. Furthermore I have extended URIUtils::GetRealPath() to also process the hostname and not only the file path.

I haven't tested this yet but will do later.

Confirmed working for zip and rar files.

@jmarshallnz I've taken care of all your comments i.e. added some doxy for URIUtils::GetRealPath(), removed those stupid whitespaces from the empty lines and added recursive behaviour in case of rar/zip paths within a rar/zip path (and also added a test for that in the unit test).

@jmarshallnz I've moved the CURL::Decode logic and IsInArchive check into GetRealPath as it only applies to rar/zip paths within a rar/zip path and therefore only needs to be done for filenames and not for hostnames.

Still not quite right - I think there's a misunderstanding on how rar/zip work (it may be me that's misunderstanding). I thought that a rar within a zip would look like:

rar:///path/inside/rar

i.e.

rar://encode(zip://encode(/path/to/zip.zip)/path/in/zip/to/rar.rar)/path/inside/rar

That is, the protocol is the inner-most rar/zip. The hostname is the encoded path to the innermost rar/zip, and as you recurse in, you're going back "up" the hierarchy.

Thus, you never have to decode/encode the filename portion of the URL - only the hostname. So something like:

CURL url(path);
if (IsInArchive(path))
{
  // we need to check the hostname as that's the "parent" rar/zip file
  std::string archive = CURL::Decode(url.GetHostName());
  url.SetHostName(CURL::Encode(GetRealPath(archive)));
}
// finally process the filename
url.SetFileName(GetRealPath(url.GetFileName()));
return url.Get();

@jmarshallnz The code seems almost right. But if I remember correctly you don't need to decode the return value of GetHostName(); it's already decoded properly. You do need to recurse, though.

while (IsInArchive(path))
  path = CURL(path).GetHostName();

should be enough to get to the root archive file which is what should decide access.
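To make the loop above concrete, here is a self-contained sketch that peels nested archive URLs until a plain path remains. The helpers are simplified, hypothetical stand-ins: `isArchiveUrl` only looks at the scheme (unlike URIUtils::IsInArchive()), and `archiveHost` assumes, as stated above, that the hostname comes back percent-decoded the way CURL::GetHostName() is said to return it.

```cpp
#include <cassert>
#include <cctype>
#include <cstdlib>
#include <string>

// Decodes %XX escapes once; anything that is not a valid escape is copied as-is.
std::string percentDecode(const std::string &in)
{
  std::string out;
  for (std::string::size_type i = 0; i < in.size(); ++i)
  {
    if (in[i] == '%' && i + 2 < in.size() &&
        std::isxdigit(static_cast<unsigned char>(in[i + 1])) &&
        std::isxdigit(static_cast<unsigned char>(in[i + 2])))
    {
      out += static_cast<char>(std::strtol(in.substr(i + 1, 2).c_str(), NULL, 16));
      i += 2;
    }
    else
      out += in[i];
  }
  return out;
}

// Hypothetical stand-in for an "is this inside an archive" test: scheme only.
bool isArchiveUrl(const std::string &url)
{
  return url.compare(0, 6, "rar://") == 0 || url.compare(0, 6, "zip://") == 0;
}

// Hypothetical stand-in for reading the decoded hostname of a rar:// or zip:// URL.
std::string archiveHost(const std::string &url)
{
  const std::string::size_type begin = 6; // length of "rar://" and "zip://"
  const std::string::size_type end = url.find('/', begin);
  const std::string host = (end == std::string::npos)
                               ? url.substr(begin)
                               : url.substr(begin, end - begin);
  return percentDecode(host);
}

// Follows the hostname "up" until the outermost archive file is reached.
std::string rootArchive(std::string path)
{
  while (isArchiveUrl(path))
    path = archiveHost(path);
  return path;
}

// Usage: a file inside a rar resolves to the rar file on disk.
void checkRootArchive()
{
  assert(rootArchive("rar://%2fpath%2fto%2frar/subpath/to/file") == "/path/to/rar");
}
```

Only the path returned at the end would then be compared against the user-specified sources; everything inside the archive is implicitly covered by access to the archive file.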

I thought it would be rar://< encode(/path/to/rar.rar) >/< encode(zip://< encode(/path/to/rared/zip.zip) >) > (which is why I applied it to the filename/path) but I don't really know either. I'm sure @cptspiff can shed some light on this.

@elupus The recursing happens by calling GetRealPath recursively. I first tried without the calls to Decode and Encode and it didn't work, so I added them. The problem is that you get a double encoding for an "embedded" rar/zip URL: one to encode the whole "rar://" path and one for the actual path to the rar. GetHostName() takes care of the first encoding but not of the second.

Then something is very wrong. There should be no double encoding.

Example URL that I just tested: rar://smb%3a%2f%2fmisto%2fVideos%2fmovies%2fGhost.Dog.The.Way.of.the.Samurai.1999.DVDRip.XviD.iNT-PFa%2fghostdogcd1-pfa.rar/ghostdogcd1-pfa.avi

There is no double encoding going on.

Oh, this is due to how stuff is served up by the webserver perhaps?

@elupus Ah I thought a rar file within a rar file would be represented by an embedded "rar://" URL as well. Seems like this is not the case.

And no, libmicrohttpd URL-decodes everything that comes in, so we already get the URL-decoded path.

I'll use jmarshallnz's code without the decoding/encoding and adjust the unit tests.

OK I hope I got it right now. I checked in XBMC with a test zip inside a test rar which results in a path like this:

zip://rar%3A%2F%2F%252Fpath%252Fto%252Frar%2Fpath%2Fto%2Fzip/subpath/to/file


so no extra URL decoding/encoding necessary.

Updated the VFS handler to handle archive paths recursively to get to the path to the actual archive file (thanks @elupus).

Looks good to me

Yup that IsInArchive() check could be dropped.
Concerning using GetParentPath(): I have to split the path to be able to find the ".." and "." in the first place, so I went with a vector-based implementation that re-assembles the path at the end of all processing. I could instead append every processed directory to the resulting string and then call GetParentPath() whenever I find a "..". I'm not sure it's really re-implementing, as GetParentPath() does a lot more (and includes a lot more checks and conditions) that is not needed in this case, and it also has to search the string again and remove something from the end. That approach involves a lot more string manipulation than the one I went for, which takes the string apart once, operates on a vector and then puts the string back together.

Montellese URIUtils: add GetRealPath() to handle . and .. in paths 4c882d9
Montellese [test] TestURIUtils: add test for GetRealPath() f593550
Montellese webserver: only allow access to files which are in a (sub-)directory of one of the user-specified sources through the VFS handler 71763ce

OK I removed the unnecessary check for IsInArchive() on the path's hostname in GetRealPath().

Looks fine to me.

 Montellese Merge pull request #1480 from Montellese/check_path_in_path webserver: make VFS handler more secure 8b9f91d
Oct 02, 2012
@@ -19,8 +19,12 @@
  */
 
 #include "HTTPVfsHandler.h"
-#include "network/WebServer.h"
+#include "MediaSource.h"
+#include "URL.h"
 #include "filesystem/File.h"
+#include "network/WebServer.h"
+#include "settings/Settings.h"
+#include "utils/URIUtils.h"
 
 using namespace std;
 
@@ -37,8 +41,44 @@ int CHTTPVfsHandler::HandleHTTPRequest(const HTTPRequest &request)
 
   if (XFILE::CFile::Exists(m_path))
   {
-    m_responseCode = MHD_HTTP_OK;
-    m_responseType = HTTPFileDownload;
+    string sourceTypes[] = { "video", "music", "pictures" };
+    unsigned int size = sizeof(sourceTypes) / sizeof(string);
+
+    string realPath = URIUtils::GetRealPath(m_path);
+    // for rar:// and zip:// paths we need to extract the path to the archive
+    // instead of using the VFS path
+    while (URIUtils::IsInArchive(realPath))
+      realPath = CURL(realPath).GetHostName();
+
+    VECSOURCES *sources = NULL;
+    for (unsigned int index = 0; index < size; index++)
+    {
+      sources = g_settings.GetSourcesFromType(sourceTypes[index]);
+      if (sources == NULL)
+        continue;
+
+      for (VECSOURCES::const_iterator source = sources->begin(); source != sources->end(); source++)
+      {
+        // don't allow access to locked sources
+        if (source->m_iHasLock == 2)
+          continue;
+
+        for (vector<CStdString>::const_iterator path = source->vecPaths.begin(); path != source->vecPaths.end(); path++)
+        {
+          string realSourcePath = URIUtils::GetRealPath(*path);
+          if (URIUtils::IsInPath(realPath, realSourcePath))
+          {
+            m_responseCode = MHD_HTTP_OK;
+            m_responseType = HTTPFileDownload;
+            return MHD_YES;
+          }
+        }
+      }
+    }
+
+    // the file exists but not in one of the defined sources so we deny access to it
+    m_responseCode = MHD_HTTP_UNAUTHORIZED;
+    m_responseType = HTTPError;
   }
   else
   {
@@ -1000,3 +1000,59 @@ void URIUtils::CreateArchivePath(CStdString& strUrlPath,
   strUrlPath += strBuffer;
 #endif
 }
+
+string URIUtils::GetRealPath(const string &path)
+{
+  if (path.empty())
+    return path;
+
+  CURL url(path);
+  url.SetHostName(GetRealPath(url.GetHostName()));
+  url.SetFileName(resolvePath(url.GetFileName()));
+
+  return url.Get();
+}
+
+std::string URIUtils::resolvePath(const std::string &path)
+{
+  if (path.empty())
+    return path;
+
+  size_t posSlash = path.find('/');
+  size_t posBackslash = path.find('\\');
+  string delim = posSlash < posBackslash ? "/" : "\\";
+  vector<string> parts = StringUtils::Split(path, delim);
+  vector<string> realParts;
+
+  for (vector<string>::const_iterator part = parts.begin(); part != parts.end(); part++)
+  {
+    if (part->empty() || part->compare(".") == 0)
+      continue;
+
+    // go one level back up
+    if (part->compare("..") == 0)
+    {
+      if (!realParts.empty())
+        realParts.pop_back();
+      continue;
+    }
+
+    realParts.push_back(*part);
+  }
+
+  CStdString realPath;
+  int i = 0;
+  // re-add any / or \ at the beginning
+  while (path.at(i) == delim.at(0))
+  {
+    realPath += delim;
+    i++;
+  }
+  // put together the path
+  realPath += StringUtils::Join(realParts, delim);
+  // re-add any / or \ at the end
+  if (path.at(path.size() - 1) == delim.at(0) && realPath.at(realPath.size() - 1) != delim.at(0))
+    realPath += delim;
+
+  return realPath;
+}
@@ -116,5 +116,22 @@ class URIUtils
   static bool ProtocolHasParentInHostname(const CStdString& prot);
   static bool ProtocolHasEncodedHostname(const CStdString& prot);
   static bool ProtocolHasEncodedFilename(const CStdString& prot);
+
+  /*!
+   \brief Cleans up the given path by resolving "." and ".."
+   and returns it.
+
+   This methods goes through the given path and removes any "."
+   (as it states "this directory") and resolves any ".." by
+   removing the previous directory from the path. This is done
+   for file paths and host names (in case of VFS paths).
+
+   \param path Path to be cleaned up
+   \return Actual path without any "." or ".."
+   */
+  static std::string GetRealPath(const std::string &path);
+
+private:
+  static std::string resolvePath(const std::string &path);
 };
 
@@ -491,3 +491,62 @@ class TestURIUtils : public testing::Test
   EXPECT_TRUE(URIUtils::ProtocolHasEncodedFilename("rss"));
   EXPECT_TRUE(URIUtils::ProtocolHasEncodedFilename("davs"));
 }
+
+TEST_F(TestURIUtils, GetRealPath)
+{
+  std::string ref;
+
+  ref = "/path/to/file/";
+  EXPECT_STREQ(ref.c_str(), URIUtils::GetRealPath(ref).c_str());
+
+  ref = "path/to/file";
+  EXPECT_STREQ(ref.c_str(), URIUtils::GetRealPath("../path/to/file").c_str());
+  EXPECT_STREQ(ref.c_str(), URIUtils::GetRealPath("./path/to/file").c_str());
+
+  ref = "/path/to/file";
+  EXPECT_STREQ(ref.c_str(), URIUtils::GetRealPath(ref).c_str());
+  EXPECT_STREQ(ref.c_str(), URIUtils::GetRealPath("/path/to/./file").c_str());
+  EXPECT_STREQ(ref.c_str(), URIUtils::GetRealPath("/./path/to/./file").c_str());
+  EXPECT_STREQ(ref.c_str(), URIUtils::GetRealPath("/path/to/some/../file").c_str());
+  EXPECT_STREQ(ref.c_str(), URIUtils::GetRealPath("/../path/to/some/../file").c_str());
+
+  ref = "/path/to";
+  EXPECT_STREQ(ref.c_str(), URIUtils::GetRealPath("/path/to/some/../file/..").c_str());
+
+#ifdef TARGET_WINDOWS
+  ref = "\\\\path\\to\\file\\";
+  EXPECT_STREQ(ref.c_str(), URIUtils::GetRealPath(ref).c_str());
+
+  ref = "path\\to\\file";
+  EXPECT_STREQ(ref.c_str(), URIUtils::GetRealPath("..\\path\\to\\file").c_str());
+  EXPECT_STREQ(ref.c_str(), URIUtils::GetRealPath(".\\path\\to\\file").c_str());
+
+  ref = "\\\\path\\to\\file";
+  EXPECT_STREQ(ref.c_str(), URIUtils::GetRealPath(ref).c_str());
+  EXPECT_STREQ(ref.c_str(), URIUtils::GetRealPath("\\\\path\\to\\.\\file").c_str());
+  EXPECT_STREQ(ref.c_str(), URIUtils::GetRealPath("\\\\.\\path/to\\.\\file").c_str());
+  EXPECT_STREQ(ref.c_str(), URIUtils::GetRealPath("\\\\path\\to\\some\\..\\file").c_str());
+  EXPECT_STREQ(ref.c_str(), URIUtils::GetRealPath("\\\\..\\path\\to\\some\\..\\file").c_str());
+
+  ref = "\\\\path\\to";
+  EXPECT_STREQ(ref.c_str(), URIUtils::GetRealPath("\\\\path\\to\\some\\..\\file\\..").c_str());
+#endif
+
+  // test rar/zip paths
+  ref = "rar://%2fpath%2fto%2frar/subpath/to/file";
+  EXPECT_STRCASEEQ(ref.c_str(), URIUtils::GetRealPath(ref).c_str());
+
+  // test rar/zip paths
+  ref = "rar://%2fpath%2fto%2frar/subpath/to/file";
+  EXPECT_STRCASEEQ(ref.c_str(), URIUtils::GetRealPath("rar://%2fpath%2fto%2frar/../subpath/to/file").c_str());
+  EXPECT_STRCASEEQ(ref.c_str(), URIUtils::GetRealPath("rar://%2fpath%2fto%2frar/./subpath/to/file").c_str());
+  EXPECT_STRCASEEQ(ref.c_str(), URIUtils::GetRealPath("rar://%2fpath%2fto%2frar/subpath/to/./file").c_str());
+  EXPECT_STRCASEEQ(ref.c_str(), URIUtils::GetRealPath("rar://%2fpath%2fto%2frar/subpath/to/some/../file").c_str());
+
+  EXPECT_STRCASEEQ(ref.c_str(), URIUtils::GetRealPath("rar://%2fpath%2fto%2f.%2frar/subpath/to/file").c_str());
+  EXPECT_STRCASEEQ(ref.c_str(), URIUtils::GetRealPath("rar://%2fpath%2fto%2fsome%2f..%2frar/subpath/to/file").c_str());
+
+  // test rar/zip path in rar/zip path
+  ref ="zip://rar%3A%2F%2F%252Fpath%252Fto%252Frar%2Fpath%2Fto%2Fzip/subpath/to/file";
+  EXPECT_STRCASEEQ(ref.c_str(), URIUtils::GetRealPath("zip://rar%3A%2F%2F%252Fpath%252Fto%252Fsome%252F..%252Frar%2Fpath%2Fto%2Fsome%2F..%2Fzip/subpath/to/some/../file").c_str());
+}