getAllCookies error (src/Net/Http.fs) #904
I believe I found another example of issue 904 when attempting to scrape stock dividend data from web pages using F# and the FSharp.Data library. An example page can be seen at http://www.nasdaq.com/symbol/ibm/dividend-history. To request the web page, my code is set up as a simple console app, as follows:
When run, the RequestString method errors with: "An unhandled exception of type 'System.ArgumentOutOfRangeException' occurred in FSharp.Core.dll Additional information: Length cannot be less than zero." The stack trace is as follows:
I can confirm this is a bug in our handling of cookies - thanks for reporting it. We'll fix this in the next version.
Actually, this seems to be tricky. The cookie string we get is:
... which is wrong, because cookies should not contain commas.
Hi @tpetricek, any thoughts on the above issue with the Nasdaq dividend data? Completely understand if you would not want to handle commas in cookies as they should not contain commas.
Our handling of this is pretty ad-hoc at the moment. I think there is definitely room for improvement - it should be possible to write a parsing algorithm that handles the above (e.g. by allowing commas before the first ...).
If you want to prototype a parser that would handle our current test cases + the above, that would be great - and I think we'd be very happy to merge it.
.NET itself won't handle these kinds of cookies correctly. We have our own parsing to work around a similar issue found on the Coursera website, and I can extend it to also cover this case.
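For illustration, here is a minimal sketch of one way a comma-tolerant splitter could work. It is not the parser in src/Net/Http.fs and not necessarily what was merged; `splitCookies` is a hypothetical name. It assumes the cookies arrive as one comma-joined string, and it treats a comma as a cookie separator only when the text between it and the next `;` or `,` contains an `=`.

```fsharp
/// Hypothetical helper: splits a comma-joined Set-Cookie string into individual cookies.
let splitCookies (header: string) : string list =
    // A comma starts a new cookie only if the segment that follows it
    // (up to the next ';' or ',') looks like "name=value".
    let startsNewCookie (i: int) =
        let rest = header.Substring(i + 1)
        let stop = rest.IndexOfAny([| ';'; ',' |])
        let segment = if stop < 0 then rest else rest.Substring(0, stop)
        segment.Contains("=")
    // Positions of the commas that separate cookies
    let cuts =
        [ for i in 0 .. header.Length - 1 do
            if header.[i] = ',' && startsNewCookie i then yield i ]
    let starts = 0 :: (cuts |> List.map (fun i -> i + 1))
    let ends = cuts @ [ header.Length ]
    List.zip starts ends
    |> List.map (fun (s, e) -> header.Substring(s, e - s).Trim())
    |> List.filter (fun c -> c <> "")

// The Akamai cookie from this issue stays in one piece despite the commas in its path
let akamai =
    splitCookies "alid=jB8JVpIAdoYZ6LSnfT58NA==; path=/i/wo/open/62/62b268f9ba1c8de6842dcba213678f17be286115/d254b306-d47d-4330-9a0b-a4f8ee14edeb_,64,192,.mp4.csmil/; domain=nordond18b-f.akamaihd.net"

// Two cookies joined by a comma are still separated, and the comma in the date is left alone
let two = splitCookies "a=1; path=/, b=2; expires=Tue, 05 Jan 2016 11:56:36 GMT"
```

This heuristic would still misfire on a value whose comma is followed by text containing `=`, so it is a starting point rather than a complete answer.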
Handle cookies with commas in their value correctly (closes #904)
If doing the following:
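open FSharp.Data

// Assumption: cc was an ordinary cookie container defined elsewhere in the original code
let cc = System.Net.CookieContainer()

let m3u8 =
    Http.RequestString("http://nordond18b-f.akamaihd.net/i/wo/open/62/62b268f9ba1c8de6842dcba213678f17be286115/d254b306-d47d-4330-9a0b-a4f8ee14edeb_,64,192,.mp4.csmil/master.m3u8",
                       silentHttpErrors = true,
                       cookieContainer = cc,
                       httpMethod = "GET")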
the returned results (headers):
HTTP/1.1 200 OK
Server: AkamaiGHost
Content-Length: 511
Content-Type: application/vnd.apple.mpegurl
Set-Cookie: alid=jB8JVpIAdoYZ6LSnfT58NA==; path=/i/wo/open/62/62b268f9ba1c8de6842dcba213678f17be286115/d254b306-d47d-4330-9a0b-a4f8ee14edeb_,64,192,.mp4.csmil/; domain=nordond18b-f.akamaihd.net
Mime-Version: 1.0
Access-Control-Allow-Origin: *
Expires: Tue, 05 Jan 2016 11:56:36 GMT
Cache-Control: max-age=0, no-cache
Pragma: no-cache
Date: Tue, 05 Jan 2016 11:56:36 GMT
Connection: keep-alive
Then, when getAllCookiesFromHeader tries to fix up the cookies, it fails. Epically.
System.ArgumentOutOfRangeException: Length cannot be less than zero.
Parameter name: length
at System.String.Substring(Int32 startIndex, Int32 length)
at FSharp.Data.HttpHelpers.getAllCookiesFromHeader@671.Invoke(Int32 i, String cookiePart) in C:\Git\FSharp.Data\src\Net\Http.fs:line 675
at Microsoft.FSharp.Collections.ArrayModule.IterateIndexed[T](FSharpFunc`2 action, T[] array)
The first split in getAllCookiesFromHeader apparently splits the header "wrongly", and it then fails (at about line 674) where firstEqual == -1.
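The failure mode can be reproduced in isolation. The sketch below is hypothetical and is not the actual code in src/Net/Http.fs: it assumes that splitting the Set-Cookie value above on commas leaves a fragment such as "64", which has no `=`, so `IndexOf` returns -1 and `Substring` is asked for a negative length. `tryNameValue` is an illustrative name for a guard that skips such fragments.

```fsharp
// Hypothetical reproduction, not the library's code.
// "64" is one of the fragments left after splitting the Set-Cookie value on ','.
let cookiePart = "64"
let firstEqual = cookiePart.IndexOf('=')   // -1, because the fragment has no '='

// Substring(0, -1) throws ArgumentOutOfRangeException ("Length cannot be less than zero.")
let failed =
    try
        cookiePart.Substring(0, firstEqual) |> ignore
        false
    with :? System.ArgumentOutOfRangeException -> true

// A guard that returns None for fragments without '=' avoids the exception
let tryNameValue (part: string) =
    match part.IndexOf('=') with
    | -1 -> None
    | i -> Some (part.Substring(0, i).Trim(), part.Substring(i + 1).Trim())
```

A guard like this only stops the crash. The fragments would still need to be rejoined with the cookie they belong to for the path to be parsed correctly.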
Is there any way to "override" getAllCookiesFromHeader as a workaround?
And noooooooo, I'm not (trying to) create an MP3 net-ripper! ;-)