getAllCookies error (src/Net/Http.fs) #904

Closed
zbilbo opened this issue Jan 5, 2016 · 6 comments

@zbilbo

zbilbo commented Jan 5, 2016

If doing the following:

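open FSharp.Data

// Assumed: `cc` is not defined in the original report; a fresh container is used here.
let cc = System.Net.CookieContainer()
let m3u8 =
    Http.RequestString(
        "http://nordond18b-f.akamaihd.net/i/wo/open/62/62b268f9ba1c8de6842dcba213678f17be286115/d254b306-d47d-4330-9a0b-a4f8ee14edeb_,64,192,.mp4.csmil/master.m3u8",
        silentHttpErrors = true,
        cookieContainer = cc,
        httpMethod = "GET")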

The returned response headers are:
HTTP/1.1 200 OK
Server: AkamaiGHost
Content-Length: 511
Content-Type: application/vnd.apple.mpegurl
Set-Cookie: alid=jB8JVpIAdoYZ6LSnfT58NA==; path=/i/wo/open/62/62b268f9ba1c8de6842dcba213678f17be286115/d254b306-d47d-4330-9a0b-a4f8ee14edeb_,64,192,.mp4.csmil/; domain=nordond18b-f.akamaihd.net
Mime-Version: 1.0
Access-Control-Allow-Origin: *
Expires: Tue, 05 Jan 2016 11:56:36 GMT
Cache-Control: max-age=0, no-cache
Pragma: no-cache
Date: Tue, 05 Jan 2016 11:56:36 GMT
Connection: keep-alive

Then, when it tries to process the cookies with getAllCookiesFromHeader, it fails. Epically.

System.ArgumentOutOfRangeException: Length cannot be less than zero.
Parameter name: length
   at System.String.Substring(Int32 startIndex, Int32 length)
   at FSharp.Data.HttpHelpers.getAllCookiesFromHeader@671.Invoke(Int32 i, String cookiePart) in C:\Git\FSharp.Data\src\Net\Http.fs:line 675
   at Microsoft.FSharp.Collections.ArrayModule.IterateIndexed[T](FSharpFunc`2 action, T[] array)
   at FSharp.Data.HttpHelpers.getAllCookiesFromHeader(String header, Uri responseUri, CookieContainer cookieContainer) in C:\Git\FSharp.Data\src\Net\Http.fs:line 671
   at <StartupCode$FSharp-Data>.$Http.InnerRequest@803-5.Invoke(WebResponse _arg2) in C:\Git\FSharp.Data\src\Net\Http.fs:line 803
   at Microsoft.FSharp.Control.AsyncBuilderImpl.args@835-1.Invoke(a a)

The first split in getAllCookiesFromHeader apparently splits the header "wrongly", and the code then fails at (about line 674):

                 let firstEqual = cookiePart.IndexOf "=" 
                 cookie.Name <- cookiePart.Substring(0, firstEqual) 
                 cookie.Value <- cookiePart.Substring(firstEqual + 1) 

where firstEqual == -1
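To illustrate the failure mode described above, here is a minimal sketch of a name/value split that does not assume "=" is present. The helper name trySplitNameValue is hypothetical and is not part of FSharp.Data; the fragment "64" is only a guess at what a comma split of the path in the Set-Cookie header above might produce.

```fsharp
// Hypothetical helper, not part of FSharp.Data: split one "name=value"
// cookie part without assuming that "=" is present.
let trySplitNameValue (cookiePart: string) =
    match cookiePart.IndexOf('=') with
    | -1 -> None // no "=" found; Substring(0, -1) would throw here
    | i -> Some (cookiePart.Substring(0, i).Trim(), cookiePart.Substring(i + 1).Trim())

// A well-formed part yields Some (name, value).
printfn "%A" (trySplitNameValue "alid=jB8JVpIAdoYZ6LSnfT58NA==")
// A fragment without "=" yields None instead of an exception.
printfn "%A" (trySplitNameValue "64")
```

Returning an option lets the caller decide whether to skip the fragment or glue it back onto the previous one, rather than crashing the whole request.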

Is there any way to "override" getAllCookiesFromHeader as a workaround?

And no, I'm not trying to create an MP3 net-ripper! ;-)

zbilbo changed the title from "getAllCookies error" to "getAllCookies error (src/Net/Http.fs)" on Jan 5, 2016
@AtwoodTM

I believe I found another example of issue 904 when attempting to scrape stock dividend data from web pages using F# and the FSharp.Data library. An example page can be seen at http://www.nasdaq.com/symbol/ibm/dividend-history.

To request the web page, I set up a simple console app as an example:

open FSharp.Data

[<EntryPoint>]
let main argv =
    let url = "http://www.nasdaq.com/symbol/ibm/dividend-history"
    let result = Http.RequestString(url)
    System.Console.ReadLine() |> ignore
    0 // return an integer exit code

When run, the RequestString method errors with:

"An unhandled exception of type 'System.ArgumentOutOfRangeException' occurred in FSharp.Core.dll

Additional information: Length cannot be less than zero."

The stack trace is as follows:

System.ArgumentOutOfRangeException: Length cannot be less than zero.
Parameter name: length
   at System.String.Substring(Int32 startIndex, Int32 length)
   at FSharp.Data.HttpHelpers.getAllCookiesFromHeader@671.Invoke(Int32 i, String cookiePart) in C:\Git\FSharp.Data\src\Net\Http.fs:line 675
   at Microsoft.FSharp.Collections.ArrayModule.IterateIndexed[T](FSharpFunc`2 action, T[] array)
   at FSharp.Data.HttpHelpers.getAllCookiesFromHeader(String header, Uri responseUri, CookieContainer cookieContainer) in C:\Git\FSharp.Data\src\Net\Http.fs:line 671
   at <StartupCode$FSharp-Data>.$Http.InnerRequest@803-5.Invoke(WebResponse _arg2) in C:\Git\FSharp.Data\src\Net\Http.fs:line 803
   at Microsoft.FSharp.Control.AsyncBuilderImpl.args@835-1.Invoke(a a)
--- End of stack trace from previous location where exception was thrown ---
   at Microsoft.FSharp.Control.AsyncBuilderImpl.commit[a](Result`1 res)
   at Microsoft.FSharp.Control.CancellationTokenOps.RunSynchronously[a](CancellationToken token, FSharpAsync`1 computation, FSharpOption`1 timeout)
   at Microsoft.FSharp.Control.FSharpAsync.RunSynchronously[T](FSharpAsync`1 computation, FSharpOption`1 timeout, FSharpOption`1 cancellationToken)
   at <StartupCode$FSI_0004>.$FSI_0004.main@() in C:\Users\helgeu.COMPODEAL\AppData\Local\Temp\~vs2B9.fsx:line 8
Stopped due to error

@tpetricek
Member

I can confirm this is a bug in our handling of cookies - thanks for reporting it. We'll fix this in the next version.

@tpetricek
Member

Actually, this seems to be tricky. The cookie string we get is:

selectedsymboltype=IBM,COMMON STOCK,NYSE; domain=.nasdaq.com; expires=Tue, 25-Apr-2017 17:53:10 GMT

... which is invalid, because cookie values should not contain commas. Our handling (see #659) does not quite deal with that, though.
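A small sketch of why this string is awkward, assuming the header is first split on plain commas (an assumption made for illustration; the actual logic in Http.fs may differ):

```fsharp
// The cookie string quoted above.
let header = "selectedsymboltype=IBM,COMMON STOCK,NYSE; domain=.nasdaq.com; expires=Tue, 25-Apr-2017 17:53:10 GMT"

// Naive split on every comma, trimming surrounding whitespace.
let parts = header.Split(',') |> Array.map (fun p -> p.Trim())

// Print each fragment on its own line.
parts |> Array.iter (printfn "%s")
```

Under that assumption the single cookie falls apart into four fragments, and "COMMON STOCK" contains no "=" at all, which matches the Substring failure in the stack trace.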

@AtwoodTM

Hi @tpetricek, any thoughts on the above issue with the Nasdaq dividend data? Completely understand if you would not want to handle commas in cookies as they should not contain commas.

@tpetricek
Member

Our handling of this is pretty ad hoc at the moment. I think there is definitely room for improvement: it should be possible to write a parsing algorithm that handles the above (e.g. by allowing commas before the first ;).

If you want to prototype a parser that would handle our current test cases + the above, that would be great - and I think we'd be very happy to merge it.
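One possible way to prototype a more tolerant splitter is sketched below. It is a heuristic invented for illustration, not the FSharp.Data implementation and not the fix that was later committed: a comma-separated fragment is treated as the start of a new cookie only if the text before its first ";" contains "="; otherwise it is glued back onto the previous cookie. The function name splitCookies is hypothetical.

```fsharp
// Hypothetical sketch: split a combined Set-Cookie header into
// individual cookie strings, tolerating commas inside a cookie.
let splitCookies (header: string) : string list =
    // A fragment starts a new cookie if the text before its first ';'
    // looks like "name=value".
    let startsCookie (fragment: string) =
        let first = fragment.Split(';').[0]
        first.Contains("=")
    // Either glue the fragment onto the most recent cookie or start a new one.
    let folder (acc: string list) (fragment: string) =
        match acc with
        | current :: rest when not (startsCookie fragment) ->
            (current + "," + fragment) :: rest
        | _ -> fragment :: acc
    header.Split(',')
    |> Array.fold folder []
    |> List.rev
    |> List.map (fun c -> c.Trim())

// The Nasdaq cookie from this thread stays in one piece.
let nasdaq = "selectedsymboltype=IBM,COMMON STOCK,NYSE; domain=.nasdaq.com; expires=Tue, 25-Apr-2017 17:53:10 GMT"
splitCookies nasdaq |> List.iter (printfn "%s")
```

The heuristic still misfires when a cookie value itself contains a comma followed by something like x=y, so a proper parser would be needed for full coverage.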

@ovatsus

ovatsus commented May 21, 2016

.NET itself won't handle these kinds of cookies correctly. We have our own parsing to work around a similar issue found on the Coursera website, and I can extend it to cover this case as well.
Ideally we would have a proper parser, but I don't have time to write one now, so I will wait for someone else to send a PR with that.

ovatsus pushed a commit that referenced this issue May 21, 2016
Handle cookies with commas in their value correctly (closes #904)